To respond to a growing demand for usability testing of games and applications on mobile devices, Psychster has been researching camera mounts. Here we share what we've learned. Feel free to give us a call if you want one of our prototypes.
At the outset, we agree that the need for an external camera mount will disappear as soon as there is a good screen-sharing software solution, where a USB cable connects the device directly to a laptop or PC. But as of this writing, you need to jailbreak an iOS device to do this, which is a non-starter for us. True, you can already throw the screen up on the wall pretty easily, but Psychster clients often like to view the sessions remotely, or after the fact. And any time you want to see the user's hands as well as the screen, an external camera will be required.
Jumana Al Hashal, a mobile app developer and graduate of the MCDM program at the University of Washington, did a thorough exploration of the software and hardware options, if you want to read deeper. We also enjoyed working with Tony Santos from the HCDE program at UW on some early prototypes of the camera mount. He greatly helped us learn what works and what doesn't with the design.
As scientists and business-minded consultants, our short list of requirements for camera mounts was as follows:
- The mount must accommodate a variety of research questions and physical positions without interruption. Some clients will only be concerned with what is on the screen, whereas others will need to see the user's hands or even facial gestures. Sometimes the device will be held in portrait position, other times in landscape position. Sometimes the user will need to hold the device with one hand and touch it with a finger, other times the user will use two thumbs. It should not be necessary to interrupt the session and re-adjust the camera to switch between these circumstances.
- The mount must not unduly interfere with or alter how a user would physically interact with the device under normal circumstances. During the test, the mount should effectively disappear so the user's behavior is as natural as possible. Another way to put this is that the mount must not impair the ecological validity of the test.
- The mount must maximize the clarity of the recording while minimizing costs and file sizes. Obviously, when you start talking cameras, you can get pretty fancy. But this is research, which is a cheap, quick proxy for reality. So we're going to cap what we would spend on a mount at $200. We still need to capture the screen clearly, even down to a 4-5 point font. And it's always good to avoid unnecessarily large files, so we can share them with clients without spending hours editing, rendering, uploading and downloading.
We did a sweep of the web. How well does what's out there meet our requirements?
Tripods capture users well, but screens poorly. A tripod with a swivel mount is a very reasonable idea. But we're not satisfied by the constraints this puts on the users and research scenarios. In a nutshell, the camera doesn't move, but the device does, and so you have no control over the jostle in the screen image. Thus a tripod only works if users set the device down. When they lift it up (to text or play with thumbs), even if they could stay in frame (maybe with the help of tape you put on the desk), the screen no longer faces the camera directly, impairing the capture. We also don't relish the idea of saying "wait, wait, wait, you're out of frame" or "wait, wait, let me adjust the camera" every few minutes during a session. So we only use a fixed mount when clients care about the user's hands rather than the screen.
Sleds are good, but many are designed only for portrait view. The first advantage of mounting the camera more or less on the device is that no amount of motion will impair the screen capture. This is important, since many apps rely on tilting, re-orienting, or even jiggling the device. You can also zoom out and get most of the user's hands while still seeing the screen clearly. So we believe "sleds" are the way to go. But the one below, while cheap, will not work if the user needs to turn the device on its side for landscape view. What happens is they lift it off the platform, which is awkward and ruins the stability.
Watch out for focal distance and lighted webcams. The other gotcha with the sled above is that many cheap webcams do not focus on objects nearer than 40 cm (16 in). So if you attach one to the sled, the device will be too close and out of focus. Also, any light you throw on the device is reflected right back in the form of glare, so webcams with built-in lights are to be avoided.
Long, top-heavy necks are awkward. Your first reaction to solving the focal distance problem might be to lengthen the neck and get a better camera. But this is a blind alley. The sled becomes top-heavy and uncomfortable to use. When we tried prototypes like this with users, they wanted to set it down on the desk, which is a good indicator that it was uncomfortable and interfering with their use of the mobile device.
A second camera for users' faces should not be mounted on the sled. If your research question requires that you see users' faces, great. Sometimes that is necessary for what you need to learn. But the camera trained on users' faces should not be mounted on the sled. This makes the sled even heavier, and it doesn't work. The sled moves with the device, but the user doesn't, so the image of the user is poor. It jiggles and they are often out of frame, especially during emotional moments, which are precisely what you want to observe (like when they win a game or forget to save the text they just composed).
Another gotcha with the 2-camera idea is that if you want the recording to show a "picture in a picture" with the user's face shown alongside the screen, you'll need to make the leap from cheap recording and editing software (like what comes with Logitech cameras) to more expensive software like Morae or Camtasia. The ability to record PIP is a premium feature.
The mount is only half of it: gain, exposure, and contrast are key. After we got a mount we were happy with, we concentrated on getting a really great video capture. Turns out this was harder than you'd think. Most webcams (our preferred device for low cost, low weight, and USB connectivity) are not made to shoot something that is itself a light source shining back at them. And when it comes to making out that 5-point font, it's more about managing the light than the focus. The image below illustrates what we mean: most of the text is blown out and unreadable. Furthermore, we needed to be able to adjust this on the fly without stopping the recording.
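One rough way to put a number on "blown out" is to count how many pixels in a frame sit at or near pure white. The sketch below is only an illustration: it assumes a frame is already available as rows of 0-255 grayscale values, and the function name and the threshold of 250 are hypothetical choices, not settings from any webcam software.

```python
def clipped_fraction(frame, threshold=250):
    """Return the share of pixels at or above `threshold` (0-255 grayscale)."""
    total = 0
    clipped = 0
    for row in frame:
        for value in row:
            total += 1
            if value >= threshold:
                clipped += 1
    if total == 0:
        raise ValueError("frame has no pixels")
    return clipped / total


# Hypothetical 2x4 frame, made up for illustration only.
sample_frame = [
    [255, 255, 120, 80],
    [254, 251, 60, 30],
]
print(clipped_fraction(sample_frame))
```

A falling value after lowering gain or exposure would suggest that less of the screen is washing out, though a readable 5-point font still has to be judged by eye.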
Our working prototype: the Psychster "Usability Palette". We settled on a Logitech HD C525 webcam mounted with a gooseneck on a wood palette with a non-slip surface. Because this camera is designed to focus on close objects, it hovers about 8 inches (20 cm) above the palette. Thus it is light, comfortable, and not top-heavy. The image is demonstration-quality, and we can pan, zoom, and even adjust the gain and exposure WITHOUT stopping the recording.
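To get a feel for whether small text survives at a given camera height, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate under stated assumptions, not a measurement of our setup: the 60-degree horizontal field of view and the 1280-pixel image width are hypothetical placeholders (check your own camera's specifications), the text is assumed to appear at its nominal physical size of 1/72 inch per point, and lens distortion is ignored.

```python
import math


def framed_width_cm(distance_cm, horizontal_fov_deg):
    """Width of the scene the camera sees at a given distance."""
    half_angle = math.radians(horizontal_fov_deg) / 2
    return 2 * distance_cm * math.tan(half_angle)


def text_height_px(text_height_mm, distance_cm, horizontal_fov_deg, image_width_px):
    """Approximate height of the text in the recorded image, in pixels."""
    scene_width_mm = framed_width_cm(distance_cm, horizontal_fov_deg) * 10
    return text_height_mm * image_width_px / scene_width_mm


# 5-point text at its nominal physical size (1 pt = 1/72 in, 1 in = 25.4 mm).
five_pt_mm = 5 / 72 * 25.4

# Camera 20 cm above the palette; the FOV and resolution are assumed values.
print(round(framed_width_cm(20, 60), 1))
print(round(text_height_px(five_pt_mm, 20, 60, 1280), 1))
```

Raising the camera widens the framed area but shrinks the text in the image, which is the trade-off behind keeping the neck short.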
Why the palette? It's a happy medium allowing users to switch between portrait and landscape orientation without being interrupted to change sleds. It's designed to be comfortable whether users interact with their touchscreen with a single finger or with two thumbs.
And what about the user's face? Again, to be prepared for any research question, we're currently using a desk-mounted gooseneck about 24 inches (60 cm) long. This allows us to capture users' faces, or turn around and capture them interacting with the device from over their shoulder. We can zoom out and get a whole-room view, or zoom in to a device located in a taped-off area on the desk. Essentially, this design allows us the greatest flexibility.
The camera is the Logitech HD C510, which must be at least 40 cm away from the object to be in focus, but still has a versatile mount and great software controls. It was necessary to modify it to turn the lens upside-down so as to capture devices right-side up.
This video shows a user playing a game by holding the palette and using a single finger.
This video shows a user playing a game by tipping the device. But due to the stability of the palette, you can barely tell.
How big are the recording files? We prefer to shoot in widescreen so the aspect ratio matches the device screen in landscape view. After much testing, we've decided both the sound and the resolution/size can be set pretty low. So we're predicting that 60-minute recordings will be about 240 MB. If necessary, we could compress them in post-production for sharing with clients online, but most likely we'll just use conferencing software to share smaller versions of videos online. If clients need a really sharp recording, we're sure they won't mind receiving the files on a USB drive.
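For planning purposes, the 240 MB per hour estimate can be turned into an average bitrate and scaled to other session lengths. The sketch below assumes 1 MB means 1,000,000 bytes and a roughly constant bitrate over the session; the function names are ours, purely for illustration.

```python
def average_bitrate_kbps(size_mb, minutes):
    """Average bitrate in kilobits per second for a file of size_mb megabytes."""
    bits = size_mb * 1_000_000 * 8
    seconds = minutes * 60
    return bits / seconds / 1000


def estimated_size_mb(bitrate_kbps, minutes):
    """File size in megabytes for a recording at a constant bitrate."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


rate = average_bitrate_kbps(240, 60)
print(round(rate))                         # about 533 kbit/s
print(round(estimated_size_mb(rate, 90)))  # about 360 MB for a 90-minute session
```

At that rate a recording costs about 4 MB per minute, so session length is the main driver of how long an upload will take.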
Thanks for reading. Have questions? Write us at firstname.lastname@example.org .