tl;dr: Test your project idea as soon as possible. If in doubt, simply show sketches of the user interactions you plan to build to real human beings. We ran our first user test just 4 working days into the project. This post documents why we did this and what we found out.
Key Insights:
🎧 The concept of a virtual representation of a real-life space with dynamic sound is understood and approved by users and stimulates interest in further use of the app.
👆 Drag-and-drop interactions are often not users' first choice when interacting with the app. This needs to be addressed if we decide to continue with this pattern.
😬 Using the mumbling of other conversations as an indicator that you are in a room with others was too irritating. Iterations needed.
👩👩👧 There is no real need for the sound to be stereo. Changing the volume based on proximity appears to be sufficient to create a spatial sound experience.
🤫 Test users come up with additional functionality similar to the project team's own ideas. Most wish for a way to address other participants privately without leaving the group setting.
We were just 4 working days into the project when we conducted this very first user test. Our goal was to get user feedback as soon as possible in order to verify or falsify our main hypotheses without spending a lot of time on fine-tuning concepts or development that might not prove valid afterwards. The central idea: an explorable virtual representation of a room, with spatial sound, would improve video call experiences. In concrete terms, we wanted to find out:
Do users understand the concept of a virtual space with dynamic sound?
Do users understand that they can control what they hear by dragging themselves toward the source?
Can users imagine that this feature could improve the way a video call works?
Is it fun to use?
The setup
We built a very rough interactive prototype with Axure, in which the test users could see a video representation of themselves in a 2-dimensional room, as well as animated representations of other participants arranged in two small groups. When entering the room, participants would hear a mixture of the two different conversations (mumbling) held in each group. If users did not start exploring the dummy on their own, or had problems with navigation, the interviewer would use a set of prompts to guide them without giving away too much information about the functionality. This included the main task for the users: to find and join the conversation about gardening.
As we did not want to spend too much energy on a refined prototype, our setup consisted of some real-world ingredients to fake functionalities of the video-call API we intended to use. Most importantly, the changing audio levels during the drag interactions were not programmed into the dummy at all; a team member sat behind the test users, observed the interactions on screen, and adjusted the volume levels of the two conversations by hand. Secondly, one of the conversations on screen was pre-recorded, while the other consisted of two people who sat next door in a Jitsi call, having the same conversation about gardening over and over again. Funnily enough, neither of these sometimes very obvious tricks was questioned or even noticed by participants. On the contrary, everybody was surprised at how well the app was already running. 🙊
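To make the mechanic concrete: a minimal sketch of what we faked by hand could look like this in the browser's Web Audio API. The function name, the group type, and the linear falloff curve are our illustrative assumptions, not the implementation we actually tested.

```typescript
// Sketch of the dynamic-sound mechanic we faked by hand: each group's
// conversation gets quieter the further the user's avatar is from it.
interface ConversationGroup {
  x: number;       // group position in the 2D room (px)
  y: number;
  gain: GainNode;  // Web Audio node controlling this conversation's volume
}

// Illustrative assumption: a simple linear falloff to silence.
const MAX_DISTANCE = 600; // distance (px) at which a conversation is muted

function updateGroupVolumes(
  userX: number,
  userY: number,
  groups: ConversationGroup[]
): void {
  for (const group of groups) {
    const distance = Math.hypot(group.x - userX, group.y - userY);
    // Full volume on top of a group, fading linearly with distance.
    group.gain.gain.value = Math.max(0, 1 - distance / MAX_DISTANCE);
  }
}
```

Called on every drag event, something like this would reproduce the mixing our team member performed manually. Note that it only changes volume via a GainNode; no stereo panning is involved, which matches the insight that proximity-based volume alone creates a sufficiently spatial experience.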
After the exploration of the "app", each test user was invited to join a debrief interview in which we wanted to understand the participant's experience in more detail:
Describe what you think just happened in this test.
Was it what you were expecting?
Would you ever use this app?
Was there anything that annoyed or unsettled you?
"Dynamic sound is pretty cool!"
"Mumbling at the beginning is quite irritating."
"Cool, that I could listen into different conversations."
"Can I create my own groups?"
"I wanna know exactly if and when others can see and hear me."
"The browser becomes a city."
"I would totally use this for family calls with Italy."
Early testing with a lo-fi prototype is sufficient to generate valuable insights that would consume many resources if tested with a properly coded prototype.
Faking the sound (changing the volume when a user moves around instead of using stereo) made us realize that we can start with less complex technology and still create the intended benefit for users.
Even with an on-site lo-fi prototype, the setting can be tricky: from predictable complications with the internet connection to events like a rock band shooting a music video in the studio space just underneath your test setting. 🤪
An internal debrief should happen right after each interview with the participants in order not to lose or mix up insights.
Filming or taking usable photos is hard to do while guiding users through the test. This should be assigned as a separate task.
This applies even more to taking notes.
Recruitment for user tests should start as early as possible.
Faces of interviewers shouldn't appear in the dummy, as it confuses testers (we used screenshots of ourselves to display call participants).
Using a Miro board to take notes and cluster them also speeds up your process if you debrief in one room.
Wash your hands, wear a mask, and mingle in virtual spaces.