The Future Alexa: Data Protection and the Rise of Smart Voice Assistants

Intelligent Voice Assistants have found their way into our daily lives by offering a plethora of benefits and simplifying basic tasks and needs. However, their connectivity and pervasiveness raise serious concerns about consumers’ privacy, security, and data protection rights. This insight aims to shed some light on the current and future challenges.

Who wouldn’t want their own full-time personal assistant? People all over the world now have access to a Digital Voice Assistant such as Alexa, Siri, Google Assistant or Microsoft’s Cortana, a reality that a few years ago we could only conceive of in a George Lucas sci-fi movie. In recent years, we have witnessed the rise of home technology and voice-activated personal devices able to monitor, collect and store thousands of items of personal data as they enter many facets of daily life. At the same time, the race for smart cities is intensifying, and devices are growing smarter and more interconnected than ever before, while voice technology is now regarded as one of the greatest promises for future human-device interaction in smart environments.

The Technology

Digital Voice Assistant systems are integrated into devices such as smart speakers. Their functionality rests on software with natural voice recognition technology, which detects a voice command, then responds to and executes it by connecting to other household services and devices. Currently, most Voice Assistants fall under the category of smart home devices, which provide a variety of voice-activated features and can integrate fully with other household devices and applications. The underlying infrastructure is built on artificial intelligence (‘AI’) and machine learning (‘ML’) and is connected through the Internet of Things (‘IoT’), allowing devices to synchronize digital interfaces with domestic applications, lighting systems, house thermostats, or media devices.

Yet Voice Assistants also carry significant risk because of the amount of personal data they collect and exploit, particularly where sensitive data is processed.

This is aggravated by the fact that, unless the device is deliberately turned off, the smart speaker continues to listen passively, even though the recording system is only activated by the wake-up expression. Voice Assistants have brought an unprecedented change in the way we interact with technology and have permanently incorporated the Internet of Things into our daily lives. Recent developments, however, have revealed that the most popular voice-activated assistants, such as Apple’s Siri, Amazon’s Alexa, and Google Home, raise several concerns regarding data protection and security.
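The difference between passive listening and wake-word-triggered recording can be illustrated with a minimal sketch. It is not any vendor's implementation: the wake word, the buffer sizes, the function name `capture_commands` and the use of text strings in place of audio frames are all assumptions made purely for illustration.

```python
from collections import deque
from typing import Iterable, List

WAKE_WORD = "alexa"   # hypothetical wake-up expression
BUFFER_FRAMES = 3     # assumed number of frames held in local memory while idle
CAPTURE_FRAMES = 2    # assumed number of frames recorded after the wake word


def capture_commands(frames: Iterable[str]) -> List[List[str]]:
    """Return only the frames that would be recorded after a wake word.

    Frames heard while idle pass through a small rolling buffer that is
    continuously overwritten and never leaves this function.
    """
    rolling = deque(maxlen=BUFFER_FRAMES)  # passive listening, local only
    recordings: List[List[str]] = []
    current: List[str] = []
    remaining = 0  # frames still to record for the active command

    for frame in frames:
        if remaining > 0:
            # Active recording: this is the data that would be stored.
            current.append(frame)
            remaining -= 1
            if remaining == 0:
                recordings.append(current)
                current = []
            continue

        rolling.append(frame)  # idle: the oldest frame is silently dropped
        if frame.lower() == WAKE_WORD:
            remaining = CAPTURE_FRAMES
            rolling.clear()  # discard whatever was overheard before the wake word

    if current:  # stream ended in the middle of a command
        recordings.append(current)
    return recordings


if __name__ == "__main__":
    heard = ["chat", "alexa", "lights", "on", "private", "talk"]
    print(capture_commands(heard))
```

In this sketch only the two frames following the wake word are retained; everything else is overheard but dropped. The privacy concern discussed here arises precisely when the wake-word check misfires, or when more than the intended command is kept.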

Voice Assistants rely on vast volumes of user data to constantly improve and personalize their services. While this may benefit consumers by delivering services and product recommendations tailored to their specific needs and tastes, it may also pose disclosure issues as well as privacy and security risks. Even so, the excessive gathering, use, and sharing of users’ data is not a problem unique to Digital Assistants; it is also raised by most IoT devices and digital platforms.

Zooming into the Data Protection Concerns

Voice Assistants are becoming more popular among consumers, and even though they are a novelty in European countries, they are not new. The above illustrates how the complexity and opacity of the processing may carry serious risks in terms of privacy, security, and even agency, as a consequence of the increasing human reliance on these devices’ automated choices. Considering the multiple values and interests at stake, consumers bringing these devices into their households must consider the paradox they face: the gradual erosion of their privacy in exchange for convenience and multitasking efficiency.

Narrowing these issues down to data protection, one must first understand that a person’s voice contains an alarming amount of information, such as indicators of age, gender, origin, behavioral characteristics or intentions. As a result, we are dealing with sensitive information under the GDPR, which calls for special caution.

Furthermore, background noise in recordings can reveal characteristics of the user’s surroundings. In addition, users must be wary of excessive data retention, given the complexity of storage and the difficulty of deletion, which in most cases has to be carried out by users themselves.

Secondly, the legal grounds for collecting personal data (under Article 6 GDPR) must be properly applied, distinguishing between processing carried out within the scope of the contractual agreement and processing based on consent. In this regard, it is still unclear whether content personalization based on the data collected via these devices is an expected service falling within the contractual sphere. Furthermore, accessing and erasing data stored in the cloud may not be possible, and the problem of excessive data collection is aggravated by the passive ‘listening’ feature of voice assistants described above.

Tension also arises from the lack of transparency inherent in the AI and ML components. Users should be adequately informed about data processing methods, yet such persistent opacity makes it harder to give consumers control over their personal data and to foster their trust. Moreover, data subjects are often unaware of how their data is processed, transferred and stored, which types of data are collected and how they are used, whether any third parties receive the data or can access the cloud where the recordings are kept, or even how they can adequately exercise their rights under the GDPR.

Under these circumstances, personal data is collected and later repurposed for mostly unexpected uses, without the data subjects’ knowledge or consent. For instance, data collected through the voice-activated system is afterwards used for profiling, nudging consumers’ choices, personalized research, or advertising. This also calls into question the extent and significance of informed consent, since the wake-up expression is often misinterpreted and the device starts recording without the user’s awareness.

Besides, when it comes to vulnerable data subjects and consent, it is not possible to ensure, for instance, that parental consent is obtained before any processing of a child’s data, which requires particular care. Particularly after the recent toy espionage scandals, one can imagine the magnitude of the risks associated with a future of fully connected smart devices. One must also outline the risks raised by the growing take-up of smart voice assistants among other vulnerable groups, such as disabled people or the elderly. On the one hand, they can greatly benefit from the convenience and household connectivity provided by these devices. On the other hand, these are also the groups more prone to sharing further personal information, leaving them in a more exposed and susceptible position.

Cybersecurity and Agency

Finally, in terms of security, it is worth noting that recordings are stored and aggregated in cloud-based services, which are vulnerable to malicious hacking attacks, and that smart speakers themselves are exposed to external remote control and manipulation. National law enforcement agencies may also have lawful access to the repository of recordings, significantly increasing the surveillance options available.

Lastly, in terms of agency, voice assistants may choose a particular item and place an order without human supervision, with AI technology removing human intervention from the decision-making process. Ultimately, this has the potential to fundamentally alter how we connect and communicate, relegating human autonomy to second place. With real-time personalization, where algorithms and AI pre-emptively make decisions based on profiled features, human agency and personal accountability are undermined.

Mitigation Mechanisms and Final Considerations

In light of the foregoing, there is a need for greater clarity and consistency regarding the widespread adoption of these smart devices in ordinary households, since voice technology is evolving in a context that is still unprepared in terms of privacy and security awareness. Voice assistants are reshaping the way we connect and interact, so it is critical to find effective mechanisms to regulate this context and protect users within it. This will require not only an active role for regulators but also the participation of technology developers in a cooperative, interdisciplinary platform involving governmental institutions, technology companies and academia. In our view, the implementation of technical mechanisms is vital to ensure that data subjects and consumers can effectively exercise their rights.

The European Commission has played a key role in regulating technologies; however, current approaches are mostly principle-based. In this regard, we believe that privacy by design can be decisive in mitigating privacy and security threats. Under this approach, data protection is built into the device’s system itself, that is, implemented through the technical architecture or code. Similarly, techno-regulation can be used to ensure compliance with other aspects of the legal framework aside from privacy and security.
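To make the idea of data protection implemented "through the technical architecture or code" more concrete, the following is a minimal sketch of how data minimization, storage limitation and erasure might be enforced by a storage component rather than by policy alone. The class `MinimalCommandStore`, its methods and the 30-day retention period are hypothetical choices for illustration; they do not describe any existing product or a legally mandated period.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

RETENTION = timedelta(days=30)  # assumed retention period, for illustration only


@dataclass(frozen=True)
class StoredCommand:
    intent: str          # e.g. "lights_on"; the only content that is persisted
    stored_at: datetime


class MinimalCommandStore:
    """Hypothetical store that enforces minimization and retention in code."""

    def __init__(self, retention: timedelta = RETENTION) -> None:
        self._retention = retention
        self._items: List[StoredCommand] = []

    def save(self, transcript: str, intent: str, now: datetime) -> None:
        # Data minimization: the raw transcript is deliberately not persisted,
        # only the intent needed to provide the service.
        self._items.append(StoredCommand(intent=intent, stored_at=now))

    def purge_expired(self, now: datetime) -> int:
        # Storage limitation: drop every record at or past the retention period.
        before = len(self._items)
        self._items = [c for c in self._items if now - c.stored_at < self._retention]
        return before - len(self._items)

    def erase_all(self) -> int:
        # Erasure on request: remove everything and report how many records went.
        count = len(self._items)
        self._items = []
        return count

    def intents(self) -> List[str]:
        return [c.intent for c in self._items]


if __name__ == "__main__":
    t0 = datetime(2022, 1, 1)
    store = MinimalCommandStore()
    store.save("turn on the lights please", "lights_on", now=t0)
    store.save("play some music", "play_music", now=t0 + timedelta(days=20))
    print(store.purge_expired(now=t0 + timedelta(days=31)), store.intents())
```

The design choice worth noting is that the retention limit and the decision not to keep transcripts are properties of the component itself, so they apply by default without relying on the user to find and delete recordings.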

Furthermore, the capabilities of Internet of Things devices may amplify the aforementioned dangers and threats to data protection and security. This is due to the amount of data gathered, which is used not only to feed the AI system and improve the functionalities of Voice Assistants but also to monetize consumers’ personal data.

Following the European Commission’s recommendations on AI, it is critical to provide transparency in data processing, which can also help data subjects build trust over time. Trust is a key requirement, not least because of the fear of being permanently recorded. It can be further reinforced through a coherent application of the principles of transparency, fairness, purpose limitation, data minimization and accuracy, all in accordance with the central value of accountability.

Regulating Voice Assistant applications and IoT-integrated devices requires caution and a break with traditional categories: regulation needs to work not only through theory and principles but also through technology itself. Moreover, it must comprise a multilayer analysis of the interplay between data protection, consumer law, and competition law. Only then will it be feasible to effectively empower data subjects and users and prepare them for the technological future.

The opinions expressed within the article are solely the author’s and do not reflect in any way the opinions and beliefs of WhatNext.Law or of its affiliates. See our Terms of Use for more information.
