Evaluating UX in robots
What methodologies can you use to measure the usability of social robots?
Robots are programmable devices that automatically perform complicated and often repetitive tasks. The term “robot” was first used in 1920 by the Czech writer Karel Čapek in his play “Rossum’s Universal Robots”. It comes from the Czech word “robota”, which means forced labour.
In total, there are about 30 million active robots in the world, the majority of which are vacuum cleaners! This means that at the moment there are ~250 humans to every robot. The number of commercially available robots is expected to grow by over 30% in the next few years. The pandemic has also increased the need for robots.
An army of automatons has been deployed all over the world to help with the crisis: They are monitoring patients, sanitizing hospitals, making deliveries, and helping frontline medical workers reduce their exposure to the virus. Not all robots operate autonomously — many, in fact, require direct human supervision, and most are limited to simple, repetitive tasks. But robot makers say the experience they’ve gained during this trial-by-fire deployment will make their future machines smarter and more capable. — Guizzo & Klett, 2020
Robots can be classified into two main categories: industrial robots and service robots. Service robots can be either professional service robots, which are used for commercial purposes and have clearly defined tasks, or personal service robots, which are also known as social robots.
The main purpose of a social robot is to have meaningful social interactions with humans. Social robots are expected to express emotions, talk with humans, learn and recognise natural cues, and develop personality and social competencies. Most existing social robots are designed to support children with special educational needs and disabilities (e.g., autism) or the aged population (e.g., older adults with dementia). Social robots can take various forms, from humanoid robots like KASPAR, to cartoonish robots such as Tito, to animal-like robots like MiRo, to machine-like robots such as Nao.
Designing successful social robots that will be widely adopted requires a good UX.
What kind of usability evaluation methodologies are deployed in social robot studies?
Developing social robots is a lengthy and complicated process, so it is inefficient to wait until the robot has been physically built before evaluating it. Instead, users are shown different types of stimuli, depending on the product’s stage of development, to identify UX issues. A recent review by Yung and colleagues covered 36 studies and identified four possible stimuli that can be used in usability studies with robots:
In the initial stages, a textual description of the robot can be used.
A drawing or a still image showing the robot can also be used.
Once the prototype has been created, testing can be done by showing users a video of what the robot looks like and how it is expected to interact with humans.
In the later stages, usability testing can take the form of a live interaction between the participant and the prototype robot.
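The four stimuli above can be thought of as a simple lookup from development stage to what you can put in front of users. Here is a minimal Python sketch of that idea. The stage names (INITIAL, PROTOTYPE, LATE) and the function name are my own labels, not terms from the review, and the rule that earlier stimuli stay available at later stages is an assumption on my part.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    """Hypothetical development stages; the names are illustrative only."""
    INITIAL = 1
    PROTOTYPE = 2
    LATE = 3


# The four stimuli from the review, grouped by the stage at which they
# first become usable (the grouping is an assumption of this sketch).
STIMULI_BY_STAGE = {
    Stage.INITIAL: ["textual description", "drawing or still image"],
    Stage.PROTOTYPE: ["video of the robot and its expected interaction"],
    Stage.LATE: ["live interaction with the prototype"],
}


def available_stimuli(stage: Stage) -> List[str]:
    """Return every stimulus usable at or before the given stage."""
    stimuli: List[str] = []
    for candidate in Stage:  # Enum members iterate in definition order
        if candidate.value <= stage.value:
            stimuli.extend(STIMULI_BY_STAGE[candidate])
    return stimuli
```

Calling available_stimuli(Stage.PROTOTYPE) would return the text, image, and video options but not the live interaction, which is only possible once a working prototype can be handed to participants.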
Of the studies reviewed by Yung et al. that used a live interaction with a prototype robot, 22.6% chose a Wizard of Oz method. With this method, participants interact with a system that they believe to be autonomous but that is in fact controlled remotely by an unseen human operator. This helps to prevent unexpected errors and makes up for functionality that has not been implemented yet.
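To make the Wizard of Oz idea more concrete, here is a rough Python sketch of how such a setup might be wired. It is not taken from any of the reviewed studies: the class name WizardOfOzRobot, the ask_operator callback, and the log format are all hypothetical. The robot answers by itself when it has an implemented reply and silently hands everything else to the hidden operator.

```python
from typing import Callable, Dict, List, Tuple


class WizardOfOzRobot:
    """Robot front end whose missing behaviours are supplied by a hidden operator."""

    def __init__(self, autonomous_replies: Dict[str, str],
                 ask_operator: Callable[[str], str]) -> None:
        # Replies the robot can already produce on its own (hypothetical data).
        self.autonomous_replies = autonomous_replies
        # Callback that forwards the input to the unseen human operator.
        self.ask_operator = ask_operator
        # (user_input, reply, source) tuples, kept for later analysis.
        self.log: List[Tuple[str, str, str]] = []

    def respond(self, user_input: str) -> str:
        """Reply autonomously if possible, otherwise fall back to the operator."""
        key = user_input.strip().lower()
        if key in self.autonomous_replies:
            reply, source = self.autonomous_replies[key], "autonomous"
        else:
            reply, source = self.ask_operator(user_input), "operator"
        self.log.append((user_input, reply, source))
        return reply


# Example with a scripted stand-in for the operator; in a real session the
# callback might read from an operator console instead.
robot = WizardOfOzRobot({"hello": "Hi there!"}, ask_operator=lambda text: "Tell me more.")
robot.respond("Hello")
robot.respond("Can you dance?")
```

The participant only ever sees the return value of respond, so both kinds of reply look the same from their side. Logging the source of each reply lets the researcher see afterwards how much of the session relied on the operator.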
The users recruited for usability studies should reflect the target market for the product. In the case of social robots, the main users are expected to be children or the elderly. When conducting UX research with these groups, researchers face a number of limitations. More specifically, children with learning difficulties or adults with dementia may not be able to answer all the questions, so alternative sources of information might be needed to evaluate the UX. One way to achieve this is to conduct additional interviews with family members, caregivers, or teachers of the target users. This is an approach I used a few years ago when working with an animal-like robot and its interaction with children with autism.
The main evaluation techniques used in usability studies with social robots are summarised in the table below in order of popularity (from Yung et al.).
Questionnaires seem to be the most widely used method and usually focus on aspects of the experience such as system response accuracy, likeability, cognitive demand, attitudes towards robots, and competence. Two of those questionnaires are the SASSI (Subjective Assessment of Speech System Interfaces) and the NARS (Negative Attitudes toward Robots Scale). Video analysis examines the way users interact with the social robot and is appropriate in the later stages of usability testing. Interviews can provide more information about the emotional impact social robots can have on users. Finally, biometrics are less common and involve measuring users’ physiological reactions to the robots, such as respiration, heart electrical activity, and galvanic skin response (GSR). Changes in these measures can indicate the effect a product has on users’ anxiety and emotional states.
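Questionnaires of this kind are typically scored by averaging Likert-style ratings, with some items reverse-coded so that a higher score always means the same thing. The Python sketch below shows that general idea only. It is not the official scoring key for the SASSI or the NARS, and the item names, the 1 to 5 scale, and the function name score_likert are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, Iterable


def score_likert(responses: Dict[str, int],
                 reverse_items: Iterable[str] = (),
                 scale_min: int = 1,
                 scale_max: int = 5) -> float:
    """Average Likert ratings after flipping the reverse-coded items."""
    reverse = set(reverse_items)
    adjusted = []
    for item, value in responses.items():
        if not scale_min <= value <= scale_max:
            raise ValueError(f"{item}: rating {value} is outside the scale")
        # On a 1-5 scale a reverse-coded 2 becomes 4, a 5 becomes 1, and so on.
        adjusted.append(scale_min + scale_max - value if item in reverse else value)
    return mean(adjusted)


# Hypothetical answers from one participant; "q2" is worded negatively.
participant = {"q1": 4, "q2": 2, "q3": 5}
score = score_likert(participant, reverse_items=["q2"])
```

Validating the range before averaging catches data-entry mistakes early, which matters when answers are collected on paper or relayed by a caregiver.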
What are the main evaluation dimensions in social robot studies?
Research suggests that user acceptance of new technology depends on three main dimensions:
A utilitarian aspect: this refers to how easy the device is to use and how useful it can be.
A hedonic aspect: this is about how emotionally attracted the user is to the social robot. It depends on the appearance and the personality of the robot, as well as its perceived emotional value.
Trust: this relates to the user’s belief that the social robot will perform its function with personal integrity and reliability.
Different methodologies can assess the above dimensions. For example, the utilitarian aspect is based on performance and function, making the use of text, video, or live interaction appropriate.
A combination of questionnaires and interviews can cover all three dimensions of attitudes towards a robot. Yung et al. suggested that biometric measures should mostly be used when assessing the hedonic aspect. Video analysis is suitable for evaluating the utilitarian or hedonic aspects. Trust is the hardest dimension to measure; a combination of methods is needed to get an accurate measurement.
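One way to sanity-check a study plan against these recommendations is to write the method-to-dimension mapping down and see what a given mix of methods covers. The Python sketch below is my own reading of the paragraph above, not a procedure from Yung et al. In particular, treating questionnaires and interviews as each touching all three dimensions, and counting trust as covered only when at least two chosen methods address it, are assumptions.

```python
from typing import Iterable, Set

DIMENSIONS = ("utilitarian", "hedonic", "trust")

# Which dimensions each method can speak to (an interpretation, see above).
METHOD_COVERAGE = {
    "questionnaire": {"utilitarian", "hedonic", "trust"},
    "interview": {"utilitarian", "hedonic", "trust"},
    "biometrics": {"hedonic"},
    "video analysis": {"utilitarian", "hedonic"},
}


def covered_dimensions(methods: Iterable[str]) -> Set[str]:
    """Return the dimensions that the chosen set of methods covers."""
    chosen = set(methods)  # ignore duplicates in the plan
    covered = set()
    for dimension in DIMENSIONS:
        supporting = [m for m in chosen if dimension in METHOD_COVERAGE[m]]
        # Assumption: trust needs a combination, so require two methods for it.
        needed = 2 if dimension == "trust" else 1
        if len(supporting) >= needed:
            covered.add(dimension)
    return covered
```

With this mapping, a plan of questionnaires plus interviews covers all three dimensions, while biometrics alone covers only the hedonic aspect.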
The developmental stage of the product needs to be considered before selecting the appropriate UX evaluation type. As with most products, early testing and a ‘quick and dirty’ evaluation are needed before development begins, so that problems can be found early in the development cycle. Field studies are preferable as an evaluation method when possible. In studies conducted in a lab setting, users tend to exhibit self-conscious and unnatural behaviour, which is less likely to predict the way they would interact in their usual environment. Overall, using multiple evaluation methods allows us to gather more accurate data.
It is worth noting that usability testing is different for industrial robots. Unlike with social robots, interaction with industrial robots often relies on user interfaces and is governed by a more clearly defined set of guidelines (e.g., visibility of system status). To develop a social robot that will be widely adopted, developers should understand which elements can affect the UX and focus on the emotional impact the product has on users.