Bengisu Dost
UX Designer

Research Study for Microsoft

How can we improve Microsoft’s “Seeing AI” application?

Project Background

Between December 2020 and January 2021, I conducted an independent accessibility study to evaluate the usability of Microsoft's Seeing AI app. Seeing AI brings the power of the cloud and artificial intelligence (AI) to identify people and objects, and audibly describes them for people with visual impairments. I collected and analyzed user behaviors and needs through a mixed methodology (interviews + usability testing). I reported my findings, along with design recommendations, in an article and presented them to the current Project Lead of Seeing AI. Most of my recommendations were put into production after my presentation. I hope to continue contributing to the accessibility community in the near future!

Roles

UX Researcher and Analyst

Tools & Methodologies Used

Usability Testing, Interviews, Qualitative Analysis, Secondary Research, Human-AI System Improvements

Interviews

Over two weeks, I recruited six participants who were blind or had low vision. I am grateful to my participants, who were sincerely eager and excited to share their insights with me over Zoom sessions. As they went through the tasks I had prepared for them, I was able to gain a true understanding of their experience with the product. They shared their phone screen and audio with me through Zoom's iPhone screen-share feature.

I gave them four different tasks, along with pre- and post-interview questions. I chose the task scenarios below because some of them reflected likely reasons to use the app around Christmas time, and the others covered the app's top features at the time of the sessions.

User Tasks

  1. Reading the expiration date on any kitchen item

  2. Describing visual figures on a holiday card

  3. Describing people in an image

  4. Reading a document

Analysis

I transcribed all my interviews and conducted a qualitative analysis using a coding technique and an affinity diagram. For the coding technique, I tagged my participants' quotes with keywords I came up with, and later placed those keywords, along with some of the important quotes, on sticky notes.

I followed Microsoft's AI research guidance and mapped my codes to its "Guidelines for Human-AI Interaction." This helped me synthesize my findings and present results aligned with Microsoft's own resources and proven practices.

Interview Insights in an Affinity Diagram (for diagram details, please email me at bengisudost@gmail.com)

Results and Design Recommendations

Guideline 1 - Make clear what the system can do

This is where we set expectations when designing AI systems. For example, none of my participants knew that they could have Seeing AI describe photos from their camera roll. When I asked them to use the "Browse Photos" feature, which was hidden under the Menu, they ran it on their photos and were fascinated by the descriptions of their loved ones and pets. In fact, for certain tasks the "Browse Photos" feature gave better descriptions than the "Scene" feature. Users should therefore be made aware of this option so they can take advantage of it for better results.

Potential Solution

  • Promote the "Browse Photos" feature to the main channel bar.

Guideline 2 - Make clear how well the system can do what it can do

This is where we explain the limits of the AI to users. None of the participants was able to read an expiration date on a kitchen item during our testing sessions. The participants used a milk carton, a mini milk bottle, a box of chocolate bars, and a coffee creamer. One user suspected that it's hard for the app to read text on a rounded surface, and most of them had no idea where the expiration date would be placed on those items. Even though most of them said the "Short Text" feature was their favorite and easiest channel, this task was extremely difficult for them.

“It’s a little embarrassing that I couldn’t find the expiration date on the creamer” — Participant 3

“I would consider myself extremely lucky to find something like that (exp date)” — Participant 2

Potential Solution

  • At the beginning of the channel, let users know how well the app can read, for example which print types and font sizes it supports. This prevents users from questioning or blaming themselves over tasks that are outside the app's intended scope or limits.

Three of my participants also mentioned that they wished the system could read digital screens, such as blood pressure monitors and digital stove displays, or recognize a person on a computer screen. Over time, through trial and error, they learned that those functions were beyond Seeing AI's limits.

Potential Solution

  • Either state at the beginning of the channel which common tasks are not supported, or brainstorm ideas to bring those new features to life.

Guideline 4 - Show contextually relevant information & Guideline 10 - Scope services when in doubt

During the study sessions, while users were going through tasks such as describing holiday cards and images, I discovered that accuracy, and the way accuracy information is communicated to users, were causing problems for them. According to these guidelines, an AI system should display information relevant to the user's task and environment, and should qualify its output when it is in doubt.

For example, I observed nine attempts in which object or people recognition failed, yet in only two of them did the AI open its description with a probability statement such as "probably…". Surprisingly, in another attempt at describing a boy in an image, the AI described him perfectly but still prefixed the description with a probability: "probably a smiling boy with a flower pattern on his shirt." Users should be given these probability cues consistently and accurately, so that they know when the AI is actually in doubt.

Potential Solution

  • Reevaluate, restructure, and refine the AI's confidence levels on descriptions.

  • Or design a new category scale for the AI's confidence level that gives users more clarity, e.g. "Most likely" > "Probably" > "Might be" (a minimal sketch of this idea follows below).
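
To make this recommendation concrete, here is a minimal sketch, in Swift, of how a raw model confidence score could map onto such a verbal scale. The `confidenceLabel` helper and its threshold values are illustrative assumptions, not Seeing AI's actual internals.

```swift
import Foundation

// Hypothetical mapping from a model confidence score (0.0–1.0) to the
// proposed three-step verbal scale. Thresholds are illustrative assumptions.
func confidenceLabel(for score: Double) -> String {
    switch score {
    case 0.9...:    return "Most likely"
    case 0.7..<0.9: return "Probably"
    default:        return "Might be"
    }
}

// Example: prefixing a generated description with the verbal label.
let description = "a smiling boy with a flower pattern on his shirt"
print("\(confidenceLabel(for: 0.95)): \(description)")
// Prints: Most likely: a smiling boy with a flower pattern on his shirt
```

A fixed, documented scale like this would let screen-reader users learn what each hedge word means and calibrate their trust accordingly.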

Some of the objects and photos used during the interviews

Guideline 9 - Support efficient correction

I discovered that the document feature is not successful with text-heavy documents that have multiple columns: it doesn't identify the locations of the columns and their corresponding headlines. For example, if the scanned document is an article with three columns, it reads the first line across all three columns, then the second line, and so on.

“It goes straight across in cooking instruction columns such as oven, microwave, or stove options. It mixes up the cooking instructions.” — P3

Potential Solution

  • The way the AI reads a document should change depending on the type of document. In an article, for example, you read the whole first column and then the second column; on a bill, you might instead need to read across rows, with the corresponding headlines included (see the reading-order sketch below).

  • Learn from competitors that provide good results with documents, such as VoiceDream Reader and KNFB Reader.

“If I am going to read something columnar, I probably would use VoiceDream Scanner. I have VoiceDream reader and the two integrate and it’s just a really wonderful reader.” — P4
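
As a rough illustration of the column problem, here is a minimal Swift sketch contrasting the two reading orders over OCR-style text blocks. The `TextBlock` type, the fixed column width, and the grouping logic are simplifying assumptions for illustration; real document layout analysis is considerably more involved.

```swift
import Foundation

// Hypothetical OCR text block with a bounding-box origin.
struct TextBlock {
    let text: String
    let x: Double  // horizontal position (left edge)
    let y: Double  // vertical position (top edge)
}

// Row-major order (the behavior participants hit): sorting by y, then x,
// interleaves lines across all columns.
func rowMajorOrder(_ blocks: [TextBlock]) -> [String] {
    blocks.sorted { ($0.y, $0.x) < ($1.y, $1.x) }.map(\.text)
}

// Column-major order (what an article needs): bucket blocks into columns
// by x, then read each column top to bottom before moving right.
func columnMajorOrder(_ blocks: [TextBlock], columnWidth: Double = 200) -> [String] {
    let columns = Dictionary(grouping: blocks) { Int($0.x / columnWidth) }
    return columns.keys.sorted().flatMap { key in
        columns[key]!.sorted { $0.y < $1.y }.map(\.text)
    }
}

let article = [
    TextBlock(text: "Col 1, line 1", x: 10,  y: 0),
    TextBlock(text: "Col 2, line 1", x: 210, y: 0),
    TextBlock(text: "Col 1, line 2", x: 10,  y: 20),
    TextBlock(text: "Col 2, line 2", x: 210, y: 20),
]
print(rowMajorOrder(article))    // interleaved: 1-1, 2-1, 1-2, 2-2
print(columnMajorOrder(article)) // column by column: 1-1, 1-2, 2-1, 2-2
```

Choosing between these orders per document type (article vs. bill) is exactly the kind of contextual decision the recommendation above calls for.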

Document samples used in the study sessions

To keep this case study short, I have only shared a couple of my findings. If you would like to learn more, please contact me at bengisudost@gmail.com!

"Designing is like a tangram puzzle, putting meaningful experiences together with the conditions given. " 

- Bengisu

In Conclusion

Seeing AI is a great tool for the visually impaired community. All of my study participants stated that they like the app and appreciate the effort that Microsoft employees have put into it. They know it's hard to master all the functions the app offers, but they also believe Seeing AI has a lot of good features and is worth working toward mastering. Visual impairment is not going away; in fact, it is rising with the aging population. I believe we have a responsibility to take action and tackle these issues with that same vision of mastery.

What have I learned during this study?

  • Don't talk over the screen reader, because participants can't listen to both of you :) Make sure you hear what they hear during the test sessions.

  • Practice VoiceOver and join the blind community's world. As one of my participants said, "do it as if you needed to," and I totally agree!