Artificial intelligence (AI) is weaving itself ever deeper into the fabric of our lives, and our quotidian interactions are no exception. The latest frontier? Call centres, where AI filters are set to become reality from 2025, neutralising the aggressive outbursts of disgruntled customers and smoothing the unfamiliar accents of service providers operating from offshore centres [1].
SoftBank, a Japanese tech giant, is currently developing a system dubbed "emotion cancelling technology". This AI "marvel" utilises voice analysis to detect anger in a customer's tone and alters it, in real time, to sound calmer. It can also smooth accents, aligning an offshore call centre employee's intonation and pronunciation with the caller's region. The ostensible aim: to reduce employee stress and abuse, and to cultivate positive customer relationships. However, beneath this seemingly benign veneer lurks a multitude of ethical and societal concerns requiring careful consideration.
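SoftBank has published little about how the system works internally, but the basic idea can be pictured as a real-time filter sitting between caller and agent. The Python sketch below is purely illustrative: the frame structure, the anger score, the threshold and the "softening" step are all assumptions made to show the shape of such a pipeline, not SoftBank's actual implementation or API.

```python
# Purely illustrative sketch of an "emotion cancelling" filter.
# All names, scores and thresholds are hypothetical assumptions;
# SoftBank has not published its implementation.

from dataclasses import dataclass


@dataclass
class AudioFrame:
    """A short chunk of caller audio with a pre-computed anger score (0 to 1)."""
    samples: list[float]
    anger_score: float


ANGER_THRESHOLD = 0.7  # hypothetical cut-off above which the voice is softened


def soften_prosody(frame: AudioFrame) -> AudioFrame:
    """Stand-in for a real voice-conversion step: here it merely attenuates
    amplitude to mimic a 'calmer' rendering of the same words."""
    calmer = [s * 0.6 for s in frame.samples]
    return AudioFrame(samples=calmer, anger_score=frame.anger_score)


def emotion_cancelling_filter(stream: list[AudioFrame]) -> list[AudioFrame]:
    """Forward frames unchanged unless detected anger exceeds the threshold,
    in which case the 'calmed' version is what the agent hears."""
    return [
        soften_prosody(frame) if frame.anger_score > ANGER_THRESHOLD else frame
        for frame in stream
    ]


if __name__ == "__main__":
    incoming = [
        AudioFrame(samples=[0.1, -0.2, 0.3], anger_score=0.2),   # calm caller
        AudioFrame(samples=[0.8, -0.9, 0.95], anger_score=0.9),  # angry caller
    ]
    for original, filtered in zip(incoming, emotion_cancelling_filter(incoming)):
        print(original.anger_score, original.samples, "->", filtered.samples)
```

Even this toy version makes the ethical point concrete: the caller's words pass through untouched, but the emotional signal attached to them is silently rewritten before it ever reaches another human being.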
Firstly, such technology risks obscuring the actual root causes of customer ire. Faulty service, billing snafus, and a general sense of being unheard often fuel customer frustration. AI, rather than addressing these core issues, merely masks the problem. This can have a domino effect, leaving customers perpetually frustrated and companies failing to address the very issues that ignite customer rage in the first place. If we turn the tables, any fraudster able to access the technology will be able to sound calm, caring and concerned about their interlocutor's wellbeing, simply by activating this emotion-smoothing option.
Secondly, this technology delves into the realm of emotional manipulation. By altering a customer's voice to sound calmer, AI essentially tampers with their emotions. This deprives them of authentic communication and the ability to fully express their dissatisfaction. A sanitised interaction devoid of genuine emotion fosters a hollow customer experience. Even more concerning is the idea of leaving it to AI to determine how we should sound and the threshold beyond which our tone needs to be corrected. One can only imagine the cascade of consequences this could set in motion, including an alteration of social norms over time, as these are not innate but learnt as part of our education [2].
Thirdly, the spectre of inauthenticity looms large. Filtering emotions and accents through AI creates a chasm between interlocutors. A world in which this technology is embedded in everyday office communication tools is a frightening prospect, one riddled with distrust. Would genuine non-face-to-face interactions disappear? This undermines the very essence of sincere human interaction, potentially eroding trust and hindering open communication. All of this would be a blessing to fraudsters, who could blend into these ever more troubled waters, turning the decoding of emotions into an endangered soft skill.
The potential for misuse further muddies the waters. This technology could morph from a customer service panacea into a tool for nefarious purposes. Malicious actors could conceivably leverage it to manipulate individuals or suppress critical information, an appealing state of play for con artists, those professionals of deception.
Finally, a social disparity could emerge. Access to this technology might create a societal divide. The affluent, with the resources to access "real persons", will be able to enjoy genuine interactions, while the less fortunate are left to navigate filtered emotions or even deal with entirely AI-powered conversational robots. This would render a whole segment of the Western population ripe for the taking. Imagine how willingly such a public would throw itself into the arms of apparently well-meaning "real persons" promising to turn their lives around.
While AI-powered voice modification might appear to be a silver bullet for defusing call centre tension, a closer look reveals a Pandora's box of ethical and societal quandaries. Before rushing headlong into widespread adoption, it would be worth carefully considering the ramifications of such technology.
The discourse extends beyond the points raised here. Can AI ever truly replicate human empathy, a cornerstone of successful conflict resolution? Might AI's emotional filtering deprive us of valuable insights into the true state of mind of the person we are communicating with? Are we hurtling towards a future where human interaction is increasingly mediated by machines (unless you can afford to bypass this dystopia)?
The imminent use of AI to modify voices compels us to confront fundamental questions about how we wish to communicate and interact with one another.
References
[1] https://www.newsbytesapp.com/news/science/softbank-unveils-emotion-cancelling-ai-filter-for-call-centers-in-japan/story
[2] https://link.springer.com/article/10.1007/s10648-022-09688-z
Tom Vidovic is a senior financial crime compliance specialist. He has held several roles in the financial services industry as well as the consulting sector, including as Financial Crime Advisory Manager for Deloitte; Associate Director, FCC Controls for Standard Chartered Bank; Financial Crime Forensic Manager for KPMG; FIU Financial Crime Consultant for Wells Fargo; and, most recently, Nominated Officer for Ghana International Bank. He is a Certified Fraud Examiner, Certified Anti-Money Laundering Specialist, and holds an MBA in Sustainable Finance.