Some time ago, 2 of my PhD students were facing the prospect of going away to do their research elsewhere for a few months. In both cases, and for different reasons, it made a lot of sense for them to go. But we needed to stay in touch, and I needed to keep tabs on what they were doing. These days, with tools like Skype and Google Docs, collaborating over the Internet is really easy. However, neither Skype nor Google Docs is designed to support the specific kinds of interactions that go on between advisor and student, and within small academic units (aka “Labs”). First, we need to run them both independently. Second, and more importantly, we can’t really share a PDF or PowerPoint document the way we do when we work face to face, going through a paper or a presentation: pointing, highlighting, using words like “here”, “this paragraph”, “this picture”, “go to next page”, etc. So I built my own virtual lab.
Google Docs allows for shared editing, and it is very good at supporting independent interactions on the same document; that’s its value proposition. But that’s also its weakness: my interactions on the document are independent of yours. I can be looking at the beginning of the document and talking about text in there, and you won’t know what I’m referring to unless you happen to be looking at the same part of the document or I use absolute references like “paragraph 3”, etc. With Google Docs, we may not even be on the same page, literally, and this is quite different from the interactions that happen face-to-face over someone’s computer or a projector. Both forms of interaction (i.e. independent editing of the same document and shared visualization of a document) are valuable and serve specific needs. Google Docs just happens to serve the former very well, but not the latter. I needed the latter, because that’s what we do in my Lab when we meet to work on papers and presentations.
So I thought, let’s use a virtual world with voice. I happen to be a developer of a Second Life-like server, OpenSimulator, and this seemed like a great opportunity to eat my own dog food, now that there’s enough good 3D content out there that I don’t need to go build 3D models myself (an artistic skill that I don’t possess). I also took advantage of Vivox’s generous donation to indie games, so my OpenSimulator environments all have voice. I started with the Universal Campus, a Creative Commons build done by someone we hired last year. But as I viewed the virtual environment through my students’ eyes, it was pretty clear that these 3D virtual worlds are also not enough to support the kinds of interactions that I wanted.
First of all, the UI of 3D world viewers gives users way too much freedom, more than they can handle, especially if they aren’t experts in these 3D UIs, and very few of my students are. The concept of the camera is really foreign to someone who is not used to free-roaming 3D environments. Even for people who are used to them, there is no standardized way of moving the camera; each virtual world viewer / 3D game does it differently. In the case of OpenSimulator, the Second Life viewers do it with a nontrivial combination of keystrokes and mouse manipulation, a kind of physical coordination that requires a lot of practice! Noobies very easily lose track of what the hell is going on and what exactly they should be paying attention to: a classic UI #fail. Also, in the particular case of the Second Life viewers, there is no way to display PDF documents. That is a show stopper for me, because in Academia PDF documents are everywhere.
It was clear that if I wanted to build the virtual collaboration environment I had in mind, I had to do it myself on top of an existing one.
The camera constraints are done with an existing function (llSetCameraParams) that lets the server control the user’s camera. SL and OpenSimulator aficionados may not like this kind of control, because they know how to control the camera, but it is so useful for noobies who don’t! It pretty much eliminates 90% of the complexity when it comes to pointing people to where they should be looking.
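To give a feel for what "the right spot" means, here is a rough Python sketch of the geometry, not the script I actually run. It assumes a flat display with a known center, facing direction and size, and it picks a camera distance from the field of view; the names (`Vec3`, `camera_for_display`) and the default field of view, aspect ratio and margin are all made up for illustration. In an LSL script, the two resulting vectors would be the kind of values handed to llSetCameraParams as the camera position and the camera focus.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    def __add__(self, other: "Vec3") -> "Vec3":
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def scaled(self, k: float) -> "Vec3":
        return Vec3(self.x * k, self.y * k, self.z * k)

    def unit(self) -> "Vec3":
        length = math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)
        if length == 0:
            raise ValueError("zero-length vector has no direction")
        return self.scaled(1.0 / length)


def camera_for_display(center, normal, width, height,
                       hfov_deg=60.0, aspect=16 / 9, margin=1.5):
    """Return (position, focus) for a camera looking straight at a display.

    All defaults are illustrative assumptions. A margin above 1.0 pulls
    the camera back so that the area around the display (for example,
    the people sitting near it) is also in view.
    """
    half_h = math.radians(hfov_deg) / 2
    # Vertical half-angle derived from the horizontal one and the aspect ratio.
    half_v = math.atan(math.tan(half_h) / aspect)
    # Smallest distance at which the display fits both horizontally and vertically.
    fit = max((width / 2) / math.tan(half_h),
              (height / 2) / math.tan(half_v))
    position = center + normal.unit().scaled(margin * fit)
    return position, center
```

With the camera position and focus both fixed by the server, the user has nothing left to steer, which is the whole point for people who are not comfortable with free camera controls.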
The PDF support required some hacking on the server-side. I made an addon to OpenSimulator that can handle the uploading of PDF documents and then serves them to the viewers.
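The general shape of that upload-then-serve flow can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the addon itself: the `PdfStore` class, its method names and the idea of keying documents by a content hash are all hypothetical, and the step that turns pages into something a viewer can display is left out entirely.

```python
import hashlib


class PdfStore:
    """Hypothetical in-memory store for uploaded PDF documents."""

    def __init__(self):
        self._docs = {}

    def upload(self, data: bytes) -> str:
        """Accept raw file bytes and return an id that viewers can request."""
        # Every PDF file starts with the marker "%PDF-"; reject anything else.
        if not data.startswith(b"%PDF-"):
            raise ValueError("not a PDF file")
        # Assumption: a content hash is a convenient, stable document id.
        doc_id = hashlib.sha256(data).hexdigest()[:16]
        self._docs[doc_id] = data
        return doc_id

    def fetch(self, doc_id: str) -> bytes:
        """Return the stored bytes for a previously uploaded document."""
        return self._docs[doc_id]
```

A side effect of the content-hash id in this sketch is that uploading the same file twice yields the same id, so repeated uploads of one paper do not pile up.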
Having these technical pieces in place, I then proceeded to the other fun part of the project, which was to model the specific face-to-face interactions that go on when I work with my students in the real world Lab. There are basically 4 kinds of things we do: (1) we do dry-runs of presentations over a single computer screen or projector; (2) we work on papers together, again sharing a single computer screen or a projector; (3) we work on the same paper independently, on two or more computers; and (4) we browse the web together on a single computer screen. My virtual Lab (vLab, for short) has 4 “stations”, each one specifically designed for one of these 4 kinds of interaction (see first picture in this post). Let me describe each one.
Shared Projector for Presentations
The projector for doing dry-runs of presentations looks like a large display on one of the walls of the virtual meeting room. People interact with it by sitting on the chairs nearby. Once they are seated, the server takes control of their camera, placing it in exactly the right spot. I chose to place it at a point in space from which people can see not just the display but also the audience. This way, we know who is talking, and there is a sense of being together. Each person has access to a highlighter that serves both as laser pointer and selector (see the little red pointer and the green highlight in the picture). So both the presenter and the audience can point to, and highlight, things.
The presentations are uploaded as PDF files (PowerPoint can export PDF). The uploading starts by clicking the green button with the arrow on the top right of the display, which then takes the user through a normal web-based file selection dialog. Depending on how large the PDF file is, the uploading and processing can take anywhere from 5 seconds to 5 minutes. Not too bad, and comparable to setting up a real world projector.
Shared Paper Reader
The shared paper reader looks like a giant tablet on another one of the walls of the virtual meeting room. Following the design principle introduced before for the shared display, people interact with the paper reader by sitting on the chairs nearby. Once they are seated, the camera goes to the right spot, and people can control the highlighter, similar to the shared display. Again, the camera position is such that both the display and the people interacting with it are shown.
What’s new here is that notes can be added to selected areas, similar to the Acrobat PDF reader. This is useful for leaving notes to be discussed later, something that we do on papers a lot. Also, for convenience, the notes can later be downloaded as a simple text file. I’m considering doing my next round of conference paper reviewing here…
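As an illustration of the notes idea, here is a small Python sketch of one way notes attached to selected areas could be flattened into a plain text file. The `Note` fields, the normalized rectangle coordinates and the output format are all assumptions of mine for the sake of the example, not the format the vLab actually produces.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Note:
    page: int                                # 1-based page number
    rect: Tuple[float, float, float, float]  # (x0, y0, x1, y1), hypothetical 0..1 page coordinates
    author: str
    text: str


def export_notes(notes: Iterable[Note]) -> str:
    """Render notes as plain text, in reading order (page, then top to bottom)."""
    ordered = sorted(notes, key=lambda n: (n.page, n.rect[1], n.rect[0]))
    lines = [f"p.{n.page} [{n.author}] {n.text}" for n in ordered]
    # End with a newline only when there is something to write.
    return "\n".join(lines) + ("\n" if lines else "")
```

Sorting by page and then by position means the text file reads in the same order as the paper, which is handy when the notes are discussed later.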
For shared editing of documents, I didn’t have to reinvent the wheel. Google Docs does this brilliantly already. This kind of interaction is done over the conference table. Again, to start, people sit on the chairs and the camera goes to exactly the right spot, which, in this case, happens to be different for each chair. The Second Life viewer supports Media on a Prim (MOAP), which allows us to display web pages on the faces of 3D objects. So another consequence of sitting on the conference table chairs is that people are automatically given a HUD (Heads-Up Display) that points to Google Docs.
These HUDs are not shared; each person has their own. As such, each person can sign in independently to Google Docs, and proceed exactly as they would if they were using a regular web browser. The interactions over the documents here are entirely Google Docs’ interactions, no more and no less. So for shared editing, the documents need to be shared there first. Also, each person can independently minimize/show the HUD, or get rid of it altogether.
Shared Web Browser
The only thing I’m doing here is controlling people’s cameras as they sit on the couch in front of this web browser.
Usage and Lessons Learned
It was a lot of fun to design and implement my vLab. But was it worth it?
Well, yes it was! I have been using it a lot more than I had anticipated. In fact, during this past quarter I had more meetings in the vLab than I had in the real world lab! This was in part due to those 2 students being away; we met there at least once a week, and went over 2 papers (with one of those students) and one PhD dissertation (with the other). But I also met there with the rest of my students who are here in Irvine. I happen to work from home a lot (I love my home office), so for any impromptu work over papers and presentations, the vLab is the place to be!
Technically, the most important lesson from this is that neither existing Web-based tools nor free-roaming 3D virtual environments are ready to support the kinds of workflows that we have face to face. There’s a lot of extra design and implementation work that needs to be done on top of existing tools. For Web-based tools, the challenge is to extend them so they support rich multi-user interactions; for free-roaming 3D virtual environments where multi-user interactions are a given, the challenge is to constrain them to some extent.