Identity: Disembodied Voices
Updated: Nov 30, 2019
Why am I looking into disembodied voices? For our current project exploring cult identity, my team and I have decided to create an AI cult leader who will ultimately reach out to people by masquerading as a virtual personal assistant.
The virtual personal assistant that was popular in old sci-fi movies is now a reality. It is the cultured, disembodied voice at humanity's beck and call, willing to carry out any number of tasks we demand of it.
The voices of personal assistants like Siri, Alexa, Google Assistant, Cortana and others have become a global sensation.
Not only can they speak close to 21 different languages, but voices like Siri's can also change gender and even accent.
Why this obsession with having a virtual assistant carry out petty tasks we could do ourselves? What type of voice is considered pleasant, trustworthy, encouraging?
Is a particular gendered voice more appealing depending on who is using it and when?
Do the pitch and tone of the voice make it more effective?
Is there a universal voice that appeals to everyone?
These are some of the questions I hope to answer by the end of this post, to expand my knowledge of the topic.
According to the book Wired for Speech by Clifford Nass and Scott Brave, people are "voice-activated": we respond to voice technologies as we respond to actual people and behave as we would in any social situation.
Nass and Brave document 10 years of research into the psychological, sociological and design elements of these voice interfaces.
Through their research, Nass and Brave found that men prefer a male computer voice over a female one, while women prefer a female voice over a male one. Despite this social identification, both men and women prefer following the instructions of a male computer voice, even when a female voice conveys the same information.
According to them, this is likely because of learned assumptions and social behaviour.
They also mention that a female computer voice is seen as better at teaching about relationships and love than about technical subjects, for which a male voice is preferred. Our societal associations with gender make us prefer a certain stereotypical voice over another depending on the subject.
In 2013, Siri's voice had a tonal pitch close to 21 per cent lower than that of an average woman's voice, leaning towards a more "masculine" voice. Why? According to Rebecca Kleinberger, a PhD holder and research assistant at MIT Media Lab, "Because of bone conduction, we each individually hear the lower part of our own voice better or louder than the higher parts. This seems to play a role in the fact that most of us dislike hearing our own voice recorded and also why generally we might prefer lower voices to higher voices."
"Could Siri mimic the voice of the user to be more likable? Absolutely. We humans do that all the time unconsciously, adapting our vocal timbre to the people we talk to."
If we change our speech and vocal delivery when giving a public speech so that it is clearer to the audience, then the same should apply to virtual voices: they should change according to who is using them. There is no such thing as a universal voice that everyone prefers. In fact, maybe I was asking the wrong question all along. I shouldn't be looking for the perfect voice but at the actual context of the conversation. Tone, pitch and gender might be important, but only at the surface level.
Scott Brave looked into the neurological layers of disembodied voices. "One of the studies I was involved in [years ago] was related to emotions in cars, and what was the 'right' emotion for a car to represent as a co-pilot," he says. "It turns out that matching the user's emotions is more important than what the emotion is. It makes the user think, 'Hey, this entity is responding to me.'"
If Siri becomes more responsive to our emotional state, even when we are not expecting or demanding it of her, our attention to and usage of Siri, or any kind of personal assistant for that matter, will increase, maybe even to a dangerous level.
What if the so-called personal assistant not only listens to its users but also detects stress and power imbalances in relationships, and matches its voice, phrasing and tempo accordingly? If so, the "ideal" voice is specific to each person, and like a human's voice, it should adjust itself in real time throughout the day.
Imagine an AI that responds to its user's manner of speaking, one that raises its voice or reacts sharply in response to its user's tone, not just the content of her speech.
There is an uncanny, morally grey area that comes with this territory. The goal of many developers is to create a seamless illusion of sentience, yet if users are being monitored and judged beyond their control or consent, the technology can easily be read as insidious or manipulative.