Lessons from Designing VR Audio Visualizations in Collaborative Spaces

Tongyu Zhou, April 2022

This page documents the challenges, lessons, and takeaways from two simple collaborative audio visualization applications that I created in Unity XR. Although the applications were intended for different use cases, analyzing user responses to both helps show how different design choices affect the user experience. Both applications used Photon Unity Networking (PUN) to enable multiplayer sessions. Users tried each application in a separate 20-30 minute session and filled out a survey at the end to share their thoughts.

Application 1: 3D Dynamic Spectrogram for Ocean Soundscapes

The goal of this application was to let users visualize an ocean soundscape via a live spectrogram that reacts to the frequencies and amplitudes of the sound being played. Users can control the sound amplitude and hover over the spectrogram to read the exact frequency and amplitude at that point. Changes that any user makes to the spectrogram's orientation are shared with all users in the session.
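
To make the frequency and amplitude readout concrete, here is a minimal sketch of how one column of a spectrogram can be computed from a single frame of audio samples. It is written in plain Python rather than Unity C#, and it is not the application's actual code: the function name `magnitude_spectrum`, the 8 kHz sample rate, and the 440 Hz test tone are all assumptions chosen for illustration. A real-time application would use an FFT or the engine's own spectrum query instead of this naive transform.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Return (frequency_hz, amplitude) pairs for one frame of audio samples.

    Uses a naive O(N^2) discrete Fourier transform for clarity only.
    """
    n = len(samples)
    spectrum = []
    # For real-valued input only bins 0..N/2 carry unique information.
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        # Scale so a unit-amplitude sine reports an amplitude close to 1.
        is_edge_bin = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 / n if is_edge_bin else 2.0 / n
        # Bin k corresponds to the frequency k * sample_rate / N.
        spectrum.append((k * sample_rate / n, abs(acc) * scale))
    return spectrum


# Hypothetical test signal: a 440 Hz tone sampled at 8 kHz for 200 samples.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(200)]

# The "hover" readout amounts to looking up one (frequency, amplitude) pair;
# here we look up the strongest bin in the frame.
peak_freq, peak_amp = max(magnitude_spectrum(frame, SAMPLE_RATE), key=lambda p: p[1])
```

Stacking one such column per frame over time gives the surface that the spectrogram mesh displays, with frequency along one axis, time along the other, and amplitude as height or color.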

Some challenges encountered while creating the application include:

  • Enabling spectrogram rotation controls -- the base of the spectrogram is a procedurally generated mesh. Rotational transforms use a single point, the origin (0, 0) in local space, as the pivot. Thus, when the quaternion of the controller's rotation is applied to the spectrogram, the rotation is accurate but feels strange because the whole mesh pivots around that one origin point.

  • Gradient generation -- a gradient is used to emphasize differences in amplitude. Constructing this gradient required creating a custom shader.
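
One common way to address the pivot problem described above is to rotate each point relative to a chosen pivot, such as the mesh centroid, instead of the local origin: subtract the pivot, rotate, then add the pivot back. The sketch below illustrates this in plain Python. It is an assumption about a possible fix, not the approach the application took, and all function names and the 4 x 2 quad are hypothetical. In Unity, a similar effect is typically obtained by parenting the mesh under an empty object placed at the desired pivot, or by calling `Transform.RotateAround`.

```python
import math


def quat_from_axis_angle(axis, degrees):
    """Unit quaternion (w, x, y, z) for a rotation of `degrees` about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    half = math.radians(degrees) / 2.0
    s = math.sin(half) / norm
    return (math.cos(half), ax * s, ay * s, az * s)


def rotate(q, v):
    """Rotate vector v by unit quaternion q (pivot is the origin)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q.xyz, v)
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # v' = v + w * t + cross(q.xyz, t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def rotate_about_pivot(q, point, pivot):
    """Rotate `point` by q around `pivot` instead of around the origin."""
    offset = tuple(p - c for p, c in zip(point, pivot))
    rotated = rotate(q, offset)
    return tuple(r + c for r, c in zip(rotated, pivot))


def centroid(vertices):
    """Average position of the vertices, used here as the pivot."""
    n = len(vertices)
    return tuple(sum(v[i] for v in vertices) / n for i in range(3))


# Hypothetical flat 4 x 2 mesh whose local origin sits at one corner.
corners = [(0, 0, 0), (4, 0, 0), (4, 0, 2), (0, 0, 2)]
pivot = centroid(corners)
half_turn = quat_from_axis_angle((0, 1, 0), 180)

about_origin = [rotate(half_turn, c) for c in corners]
about_center = [rotate_about_pivot(half_turn, c, pivot) for c in corners]
```

With the origin as pivot, the half turn swings the far corner (4, 0, 2) out to roughly (-4, 0, -2), so the mesh leaves its original footprint. With the centroid as pivot, the corners simply trade places and the mesh stays where the user expects it.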


Users tested the application in a synchronous classroom setting, as summarized here. They were given two tasks: 1) pause the audio, examine the spectrogram, and identify the sound waves at the highest and lowest frequencies, and 2) identify a sound that does not belong, using both the visualization and the audio.

Survey Takeaways


  • Missing functionalities:

    • ability to sync/de-sync audio, player movement, collaborative pointer, two-hand control for a smoother rotation of the spectrogram

Application 2: AOE Visualizer for Sound Sources in Rooms

The goal of this application was to let users visualize and control the area of effect (AOE) of various 3D sound sources within a closed space. It uses Unity's built-in spatial audio system (with linear decay) and displays the area within which each source can be heard as a translucent sphere. A heatmap on the ground shows the intensity of the amplitude. Changes that any user makes to a source's location or area of effect are shared with all users in the same session.
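
As a rough illustration of what linear decay means for the sphere and the heatmap, the sketch below computes the gain a listener would hear from a source whose volume falls off linearly between a minimum and a maximum distance. It is plain Python, not the application's code; the function names and the `min_distance`/`max_distance` parameters are assumptions modeled on the usual linear rolloff convention, with the sphere's radius playing the role of `max_distance`.

```python
import math


def linear_gain(listener, source, min_distance, max_distance):
    """Gain in [0, 1] for a source with linear distance falloff.

    Full volume inside `min_distance`, silent at or beyond `max_distance`.
    """
    d = math.dist(listener, source)
    if d <= min_distance:
        return 1.0
    if d >= max_distance:
        return 0.0
    return 1.0 - (d - min_distance) / (max_distance - min_distance)


def heard_from_all(listener, sources):
    """True if the listener is inside the audible sphere of every source.

    `sources` is a list of (position, min_distance, max_distance) tuples.
    """
    return all(
        linear_gain(listener, pos, near, far) > 0.0 for pos, near, far in sources
    )


# Hypothetical room with two sources whose spheres overlap near x = 3.
sources = [((0, 0, 0), 1.0, 5.0), ((6, 0, 0), 1.0, 5.0)]
gain_halfway = linear_gain((3, 0, 0), (0, 0, 0), 1.0, 5.0)
in_overlap = heard_from_all((3, 0, 0), sources)
outside_second = heard_from_all((0, 0, 0), sources)
```

The ground heatmap can be thought of as this gain evaluated at floor positions, and the translucent sphere as the boundary where the gain reaches zero.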

Some challenges encountered while creating the application include:

  • Forcing the AOE sphere to be visible from the inside -- Unity shaders cull back faces by default, meaning that only the front of each face is rendered, so you cannot see this amplitude indicator while standing inside it. One way around this is to duplicate the mesh and flip the normals of the copy.

  • Constructing the gradient heatmap on the floor -- the base of the heatmap is a mesh of triangles, which means that altering colors based on amplitude in a circle around the sound source does not produce perfectly circular gradients.
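
Both challenges above come down to mesh geometry, and a small sketch may help. The first function shows what "flipping" a duplicated mesh involves on the data level: reversing each triangle's winding order, which is what face culling depends on, and negating the normals so lighting matches. The second pair of functions shows why a per-vertex gradient is not circular: colors interpolated linearly across a triangle cannot follow a radial falloff. This is plain Python with hypothetical names and data, not the application's code, and it assumes vertex colors are interpolated linearly across each triangle. Evaluating the distance per pixel in a shader, or disabling culling with `Cull Off` in ShaderLab, are alternatives that the source does not mention.

```python
import math


def flip_inside_out(triangles, normals):
    """Return index and normal lists for a copy of the mesh that faces inward.

    `triangles` is a flat list of vertex indices, three per triangle.
    """
    flipped = []
    for i in range(0, len(triangles), 3):
        a, b, c = triangles[i:i + 3]
        flipped += [a, c, b]  # swapping two indices reverses the winding order
    return flipped, [(-x, -y, -z) for x, y, z in normals]


def true_intensity(point, source, radius):
    """Radial, linearly decaying intensity evaluated exactly at `point`."""
    return max(0.0, 1.0 - math.dist(point, source) / radius)


def interpolated_intensity(weights, corners, source, radius):
    """Intensity a per-vertex gradient shows at a barycentric position.

    Only the three corner values are exact; everything in between is a
    weighted average of them.
    """
    return sum(
        w * true_intensity(c, source, radius) for w, c in zip(weights, corners)
    )


# Hypothetical floor triangle with the sound source sitting on corner A.
corners = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
source, radius = (0.0, 0.0), 2.0

# Midpoint of edge BC, i.e. the point (1, 1).
weights = (0.0, 0.5, 0.5)
shown = interpolated_intensity(weights, corners, source, radius)
exact = true_intensity((1.0, 1.0), source, radius)
```

At the point (1, 1) the interpolated value is 0 while the exact radial value is about 0.29, so the contour lines of the per-vertex gradient run straight along the triangle edge instead of curving around the source. A finer floor mesh reduces the error but does not remove it.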


As with the previous application, users tested this one in a synchronous classroom setting, as summarized here. They were given two tasks: 1) grab an audio source to try out spatial audio, and 2) rearrange the sound sources in the room and adjust their areas of effect so that one region overlaps with all of the sources. After the tasks, users filled out a survey about their experience.

Survey Takeaways

  • Scenarios that could benefit from area-of-effect audio visualizations:

    • acoustics design for rooms or musical instruments; getting a better understanding of the spatial aspects of sound; architecture of auditoriums; investigation of crime scenes; games

  • Missing functionalities:

    • more realistic grabbing, teleportation, reset button, ability to see sound waves (such as in Application 1)