XR Accessibility User Requirements

By Alastair Beeson

What are the XR Accessibility User Requirements?

Link to the guidelines: https://www.w3.org/TR/xaur/


This document lists user needs and requirements for people with disabilities when using virtual reality or immersive environments, augmented or mixed reality and other related technologies (XR).


Although this document is produced by a major authority in web accessibility, it is in the early stages of its development. It is considered a W3C Working Group Note and a draft, so it may undergo serious changes or revisions, and it is very likely that it will never become a formal W3C Recommendation.


Thus it is a piece of “open source writing” that is constantly looking for contributions and community engagement. The guidelines are a really valuable resource for thinking about accessibility in a largely inaccessible field, and they are the first of their kind for evaluating XR software against usability standards.


There has been some discussion about how the shortcomings of WebXR hinder accessibility which can be read about here: https://www.w3.org/WAI/APA/wiki/WebXR_Standards_and_Accessibility_Architecture_Issues


What are the biggest accessibility challenges of XR according to this document?


Overemphasis on motion controls.


VR headsets require the user to be in a specific physical position to play.


Games and hardware being locked to certain manufacturers.


Gamification of VR forces game dynamics on the user.


Audio design lacks spatial accuracy.


What are Input Modalities?

The guidelines define several Input Modalities for XR. Input modalities are the ways people with disabilities can interact with and control an XR application. In most cases, several of these should be combined in a single application, though not all of them need to be implemented at once, as some are more difficult to build or are still in their infancy.


  • Speech - this is where a user's voice is the main input. Using a range of speech commands, a user should be able to navigate in an XR environment and interact with the objects in that environment using their voice alone.

Elaboration: Speech is one of the most important input modalities for an accessible application. It is a motion-agnostic input method, meaning that a disabled user does not have to perform specific fine motor movements to control the application. This attribute can open up XR to many more users, and it is worth including in virtually every application.
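
As a rough illustration, here is a minimal TypeScript sketch of mapping spoken phrases to navigation actions using the browser's Web Speech API (prefixed in Chromium-based browsers); the moveUser and selectTarget helpers are hypothetical application functions, not part of any XR framework.

// Hypothetical application helpers (assumed, not real library calls).
declare function moveUser(direction: "forward" | "back" | "left" | "right"): void;
declare function selectTarget(): void;

const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;       // keep listening between commands
recognition.interimResults = false;  // only act on finalized phrases

recognition.onresult = (event: any) => {
  const phrase: string = event.results[event.results.length - 1][0].transcript
    .trim()
    .toLowerCase();

  // Map simple spoken phrases to navigation and interaction actions.
  if (phrase.includes("move forward")) moveUser("forward");
  else if (phrase.includes("move back")) moveUser("back");
  else if (phrase.includes("turn left")) moveUser("left");
  else if (phrase.includes("turn right")) moveUser("right");
  else if (phrase.includes("select")) selectTarget();
};

recognition.start();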

  • Keyboard - this is where the keyboard alone is the user's main input. A user should be able to navigate in an XR environment and interact with the objects in that environment using the keyboard alone.

Elaboration: Another of the most ubiquitous input modalities and likely one of the most popular “controllers” in the world. There is little reason not to support it in your application: most major XR engines and frameworks have built-in keyboard support, and keyboard and mouse work with the Quest 2, Vive and Index.
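
As a rough sketch, keyboard-only navigation can be as simple as a keydown listener; the rig object standing in for the user's position and the step sizes below are assumptions, and a real project would drive its engine's camera rig or player object instead.

// Assumed stand-in for the user's position and facing direction.
const rig = { x: 0, z: 0, rotationY: 0 };
const MOVE_STEP = 0.5;          // metres per key press
const TURN_STEP = Math.PI / 12; // 15 degrees per key press

window.addEventListener("keydown", (event: KeyboardEvent) => {
  switch (event.key) {
    case "ArrowUp":    // step forward along the current facing direction
      rig.x -= Math.sin(rig.rotationY) * MOVE_STEP;
      rig.z -= Math.cos(rig.rotationY) * MOVE_STEP;
      break;
    case "ArrowDown":  // step backward
      rig.x += Math.sin(rig.rotationY) * MOVE_STEP;
      rig.z += Math.cos(rig.rotationY) * MOVE_STEP;
      break;
    case "ArrowLeft":  // rotate in place instead of requiring head movement
      rig.rotationY += TURN_STEP;
      break;
    case "ArrowRight":
      rig.rotationY -= TURN_STEP;
      break;
  }
});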

  • Switch - this is where a single button switch alone is the user's main input. A user should be able to navigate in an XR environment and interact with the objects in that environment using a switch alone. This switch may be used in conjunction with an assistive technology scanning application within the XR environment that allows the user to select directions for navigation and macros for communication and interaction.

Elaboration: Most VR controllers rely heavily on button inputs. That said, controllers tend to use a combination of several buttons and switches, such as a d-pad or thumbsticks, to perform different actions like moving, turning around or selecting objects. Instead, a user should be able to use a single button, such as the A button, to navigate the space. This can be accomplished with macros, such as pressing the button two times to select or three times to deselect. This is fairly straightforward to set up but requires some advance planning to work out all the possible macros and the tasks they map to.
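
Below is a minimal TypeScript sketch of that idea, assuming a single switch wired to one callback; the press-count window and the three action helpers (advanceScanningFocus, selectFocusedObject, deselectFocusedObject) are hypothetical.

// Hypothetical application actions (assumed for illustration).
declare function advanceScanningFocus(): void;
declare function selectFocusedObject(): void;
declare function deselectFocusedObject(): void;

const PRESS_WINDOW_MS = 600; // time allowed between presses of one macro
let pressCount = 0;
let pressTimer: ReturnType<typeof setTimeout> | undefined;

function onSwitchPressed(): void {
  pressCount += 1;
  if (pressTimer !== undefined) clearTimeout(pressTimer);

  // Once the user stops pressing, interpret the accumulated count.
  pressTimer = setTimeout(() => {
    if (pressCount === 1) advanceScanningFocus();   // single press: move focus
    else if (pressCount === 2) selectFocusedObject();
    else if (pressCount >= 3) deselectFocusedObject();
    pressCount = 0;
  }, PRESS_WINDOW_MS);
}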

  • Gesture - this is where gesture-based controllers are the main input and can be used to navigate in an XR environment, interact with the objects in that environment and make selections using gestures alone.

Elaboration: Gesture controls include multiple different kinds of gestures. This can be moving your head to look at specific objects or performing specific actions with gaze controls. Other gestures may involve moving your controllers or hands in specific ways, such as turning your palm to face you in order to bring up a menu, or pointing and pinching your fingers to select an object rather than pressing a button. Developers will have to put time into figuring out which gestures logically map to which actions.
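
As one hedged example, the TypeScript sketch below detects a “palm facing the head” gesture from tracked poses; the palm and head positions and the palm normal are assumed to come from whatever tracking layer the application uses, and openWristMenu is a hypothetical callback.

interface Vec3 { x: number; y: number; z: number; }

function normalize(v: Vec3): Vec3 {
  const len = Math.hypot(v.x, v.y, v.z) || 1;
  return { x: v.x / len, y: v.y / len, z: v.z / len };
}

function dot(a: Vec3, b: Vec3): number {
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical application callback (assumed for illustration).
declare function openWristMenu(): void;

// Call once per frame with tracked poses supplied by the tracking layer.
function checkPalmMenuGesture(palmPos: Vec3, palmNormal: Vec3, headPos: Vec3): void {
  // Direction from the palm toward the head.
  const toHead = normalize({
    x: headPos.x - palmPos.x,
    y: headPos.y - palmPos.y,
    z: headPos.z - palmPos.z,
  });

  // If the palm normal points roughly at the head, treat it as the gesture.
  if (dot(normalize(palmNormal), toHead) > 0.8) {
    openWristMenu();
  }
}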

  • Eye Tracking - this is where eye tracking is the main input. Using a range of commands, a user should be able to navigate in an XR environment and interact with the objects in that environment using eye tracking alone.

Elaboration: This is a very useful input modality for users who have issues with fine motor skills or who have particularly severe physical disabilities and can only rely on their eyes for precise movements. That said, it is still in its early stages of development and many of the major XR headsets lack the functionality.
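
Where eye tracking (or plain head gaze) is available, dwell-based selection is a common pattern: if the gaze rests on the same object long enough, it is selected without any button press. The TypeScript sketch below assumes a hypothetical objectUnderGaze lookup and activateObject callback.

// Hypothetical application helpers (assumed for illustration).
declare function objectUnderGaze(): string | null; // id of the gazed object, if any
declare function activateObject(id: string): void;

const DWELL_TIME_MS = 1200; // how long the gaze must rest before selecting
let gazedObject: string | null = null;
let gazeStart = 0;

// Call once per frame.
function updateGazeSelection(nowMs: number): void {
  const current = objectUnderGaze();

  if (current !== gazedObject) {
    // Gaze moved to a new object (or off all objects): restart the timer.
    gazedObject = current;
    gazeStart = nowMs;
  } else if (current !== null && nowMs - gazeStart >= DWELL_TIME_MS) {
    activateObject(current);
    gazeStart = nowMs; // avoid repeatedly re-triggering on the same object
  }
}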


What are Output Modalities?


The guidelines define several Output Modalities for XR. Output modalities are the ways people with disabilities can experience an XR application. You should aim to combine multiple modalities, as many users have disabilities that prevent them from fully experiencing a particular modality, such as a visually impaired user who can’t see visual output or a hearing-impaired user who can’t hear auditory output. The last two options, Gustatory and Olfactory, are not commonly used, but there is research and experimentation into implementing them.


  • Tactile - this is using the sense of touch, commonly referred to as haptics.

Elaboration: There are straightforward examples, like controllers vibrating when grabbing an object. But this can get more complex and theoretical, like spraying the user with water to simulate the sea. There are many variables to haptics, such as the type of vibration, intensity or acceleration. Even acts like grabbing an object are constantly being revised to add new sensations, like the difference between normal and shear force. Haptics are a great solution for users with auditory impairments to feel the sensations of sound, and for visually impaired users to orient themselves in a scene or confirm that they are performing certain actions.
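
As a small illustration, the sketch below fires a short haptic pulse when an object is grabbed, assuming a WebXR runtime that exposes the optional hapticActuators array on an input source's gamepad (support varies by browser and headset, so everything is feature-detected).

// A hedged sketch: fire a short haptic pulse when an object is grabbed.
// hapticActuators is optional and runtime-dependent, hence the checks.
function pulseOnGrab(inputSource: any): void {
  const actuator = inputSource?.gamepad?.hapticActuators?.[0];
  if (actuator && typeof actuator.pulse === "function") {
    actuator.pulse(0.7, 80); // intensity 0..1, duration in milliseconds
  }
}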

  • Visual - this is using the sense of sight, such as 2D and 3D graphics.

Elaboration: The most important output modality for VR. Being able to see is essential to creating the sense of being in a virtual reality. As graphics keep increasing in fidelity, resolution, realism and lighting quality, so will the effectiveness and believability of virtual spaces. Nevertheless, VR apps should still aim to be accessible to the visually impaired and find ways of conveying the experience beyond visuals.

  • Auditory - this is using the sense of sound, such as rich spatial audio or surround sound.

Elaboration: Audio comes in many forms, like spatial audio, surround sound and mono audio. A good sound setup really helps sell the immersiveness of a space: a rainforest simulation without birds chirping, water running and trees rustling is a lot less immersive. Fortunately, even if a user has auditory impairments, other modalities, especially haptics, can be used to mimic the sensations.
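
As a rough sketch, the Web Audio API's PannerNode can place a sound in 3D, and the same code path can offer a mono down-mix for listeners who cannot use a stereo or binaural soundscape; loading the audio buffer is omitted here.

const ctx = new AudioContext();

function playAt(buffer: AudioBuffer, x: number, y: number, z: number, mono = false): void {
  const source = ctx.createBufferSource();
  source.buffer = buffer;

  if (mono) {
    // Force a mono down-mix so no information is lost to one ear.
    const monoMix = ctx.createGain();
    monoMix.channelCount = 1;
    monoMix.channelCountMode = "explicit";
    source.connect(monoMix);
    monoMix.connect(ctx.destination);
  } else {
    // HRTF panning gives the sound a believable position in space.
    const panner = new PannerNode(ctx, {
      panningModel: "HRTF",
      positionX: x,
      positionY: y,
      positionZ: z,
    });
    source.connect(panner);
    panner.connect(ctx.destination);
  }

  source.start();
}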

  • Olfactory - this is the sense of smell.

Elaboration: There are some companies exploring this area, but for the most part it isn’t commercially viable and is more a proof of concept. One olfactory-focused VR product failed dramatically and scammed its investors. Olfactory sensations are accomplished using custom scents and aroma diffusers. Imagine a simulation set at sea where you might be able to smell the ocean, or a cooking game where you could smell the food.

  • Gustatory - this is the sense of taste.

Elaboration: Thermal and electrical probes can be used to mimic taste sensations; heating or cooling the tongue can mimic tastes like mint or salt. The sensation of chewing can also be simulated using bone conduction transducers. This is a strange area of VR, but there are projects exploring it and unique use cases like food tourism or weight loss. Imagine a simulation with food that you could taste or chew, or being able to “travel” to a foreign country and try its unique cuisine.


What are the XR Accessibility Categories?


  • Immersive Semantics and Customization

Definition: A user of assistive technology wants to navigate, identify locations, objects and interact within an immersive environment.


  • Motion Agnostic Interactions

Definition: A person with a physical disability may want to interact with items in an immersive environment in a way that doesn't require particular bodily movement to perform any given action.


  • Immersive Personalization

Definition: Users with cognitive and learning disabilities may need to personalize the immersive experience in various ways.


  • Interaction and Target Customization

Definition: A user with limited mobility, or users with tunnel vision or reduced peripheral vision, may need a larger 'Target size' for a button or other controls.


  • Voice Commands

Definition: A user with limited mobility may want to be able to use Voice Commands within the immersive environment, to navigate, interact and communicate with others.


  • Color Changes

Definition: Color blind users may need to be able to customise the colors used in the immersive environment. This will help with understanding affordances of various controls or where color is used to signify danger or permission.


  • Magnification Context and Resetting

Definition: Screen magnification users may need to be able to check the context of their view in immersive environments.


  • Critical Messaging and Alerts

Definition: Screen magnification users may need to be made aware of critical messaging and alerts in immersive environments, often without losing focus. They may also need to route these messages to a 'second screen'.


  • Gestural Interfaces and Interactions

Definition: A blind user may wish to interact with a gestural interface, such as a virtual menu system.


  • Signing Videos and Text Description Transformation

Definition: A deaf or hard of hearing user whose first language is a sign language may need audio or text content presented as signing videos, or signed content transformed into text descriptions.


  • Safe Harbor Controls

Definition: People with Cognitive Impairments may be easily overwhelmed in Immersive Environments.


  • Immersive Time Limits

Definition: Users may be adversely affected by spending too much time in an immersive environment or experience, and may lose track of time.


  • Orientation and Navigation

Definition: A screen magnification user or user with a cognitive and learning disability or spatial orientation impairment needs to maintain focus and understand where they are in immersive environments.


  • Second Screen Devices

Definition: Users of assistive technology, such as blind or deaf-blind users communicating via an RTC application in XR, may have sophisticated 'routing' requirements for various inputs and outputs, and may need to manage them.


  • Interaction Speed

Definition: Users with physical disabilities or cognitive and learning disabilities may find some interactions too fast to keep up with or maintain.


  • Avoiding Sickness Triggers

Definition: Users with vestibular disorders, epilepsy or photosensitivity may find that some interactions trigger motion sickness and other effects. This may be triggered by teleportation or other movement in XR.


  • Spatial Audio Tracks and Alternatives

Definition: Hard of hearing users may need accommodations to perceive audio.


  • Spatial Orientation: Mono Audio Option

Definition: Users with spatial orientation impairments, cognitive impairments or hearing loss in just one ear may miss information in a stereo or binaural soundscape.


  • Captioning, Subtitling and Text: Support and Customization

Definition: Users may need to customize captions, subtitles and other text in XR environments.
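
As a closing illustration, caption customization often boils down to a user-preferences object that the application applies wherever text is rendered; the field names below are illustrative, not taken from the XAUR document.

// Illustrative caption preference shape (assumed, not from the guidelines).
interface CaptionPreferences {
  fontSizePx: number;        // larger text for low-vision users
  textColor: string;
  backgroundColor: string;
  backgroundOpacity: number; // 0 (transparent) to 1 (solid)
  placement: "head-locked" | "world-anchored" | "second-screen";
  showSpeakerNames: boolean;
}

const defaultCaptions: CaptionPreferences = {
  fontSizePx: 24,
  textColor: "#ffffff",
  backgroundColor: "#000000",
  backgroundOpacity: 0.75,
  placement: "head-locked",
  showSpeakerNames: true,
};

// Applications would merge the user's saved overrides onto the defaults.
function applyCaptionPreferences(overrides: Partial<CaptionPreferences>): CaptionPreferences {
  return { ...defaultCaptions, ...overrides };
}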