Security Tools Podcast
Privacy Attorney Tiffany Li and AI Memory, Part I
Tiffany C. Li is an attorney and Resident Fellow at Yale Law School’s Information Society Project. She frequently writes and speaks on the privacy implications of artificial intelligence, virtual reality, and other technologies. Our discussion is based on her recent paper on the difficulties with getting AI to forget. In this first part, we talk about the GDPR's "right to be forgotten" rule and the gap between technology and the law.
Consumer Versus Business Interests
Tiffany Li is an attorney and resident fellow at the Yale Law School Information Society Project. She is also an expert on privacy, intellectual property, law and policy. In our interview we discuss the legal background in GDPR's right to be forgotten, the hype and promise of artificial intelligence, as well as her paper, "Humans forget, machines remember."
The right to be forgotten is a core principle in the GDPR, under which a consumer can request to have their personal data removed from the internet. And I was wondering if you can speak to the tension between an individual's right to privacy and a company's business interest.
So the tension between the consumer's right to privacy and a company's business interest really happens in many different spaces. Specifically, here we're talking about the right to be forgotten, which is the concept that an individual should be able to request that data or information about them be deleted from a website or a search engine, for example. Now, there's an obvious tension there between a consumer's right or desire to have their privacy protected and the company's business interest in having information out there, and also in decreasing the cost of compliance. For the right to be forgotten in particular, there is an interesting question about whether we should be protecting the personal privacy rights of whoever is requesting that their information be deleted, or whether we should protect the idea that the company should be able to control the information it provides on its service, as well as the larger ideal of having free speech, free expression, and knowledge out there on the internet.
So one argument outside of this consumer versus business tension, one argument really is simply that the right to be forgotten goes against the values of speech and expression, because by requesting that your information or information about you be taken down, you are in some ways silencing someone else's speech.
AI and the Right to Be Forgotten
Right. So, Tiffany, I wanted to follow up a little bit. I was wondering if you can give some of the legal background behind the GDPR's right to be forgotten, specifically referring to the Spain versus Google case that you mentioned in your paper on AI and the right to be forgotten.
The main case in which we discuss the right to be forgotten is the Spanish case that started in 2010. In that year, a Spanish citizen, along with the Spanish DPA, the Data Protection Agency, sued both a Spanish newspaper and Google, the American internet company that is now part of Alphabet. The Spanish citizen argued that Google infringed on his right to privacy because Google's search results included information related to things that he didn't want to be in the public realm any longer. That's the basic legal framework. Eventually, this case went up to the ECJ, which in 2014 ruled in favor of the Spanish citizen and against Google. Essentially, what the court ruled was that the right to be forgotten was something that could be enforced against search engine operators. Now, this wasn't a blanket rule: a few conditions have to be met in order for search engine operators to be forced to comply with the right to be forgotten, and there are various exceptions that apply as well.
And I think what's interesting really is that even then people were already discussing this tension that we mentioned before. Both the tension between consumer rights and business interests but also the tension between privacy in general and expression and transparency. So it goes all the way back to 2010, and we're still dealing with the ramifications of that decision now.
Right. So one thing about that decision that maybe a lot of people don't understand is that the Spanish newspaper that originally ran the story still has that content. The court decided, and correct me if I'm wrong, that the content had to remain available. It's just that Google's search results could not show it.
Yes. There have been a few other cases with similar fact patterns, and there has been discussion of whether we can actually force newspapers to delete their archives. I know one person framed it in a way that to me is kind of frightening: that the right to be forgotten, taken to its ultimate endpoint, would essentially mean burning newspaper archives. You know, I'm in the U.S., where free speech is a sacrosanct thing, so especially coming from an American point of view, that is incredibly frightening to think about. The idea that any individual could control what's kept as part of the news media and what's kept as part of our history is a little worrisome.
And of course, the right to be forgotten has many conditions on it and it's not an ultimate right without, you know, anything protecting all these values we discussed. But I think it should be mentioned that there are consequences, and if we take anything to an extreme, the consequences become, well, extreme.
Extreme, right. So I'm wondering if you can just explain a little bit about what the right to be forgotten specifically requires of companies.
An interesting point that my coauthors and I discussed in our paper on the right to be forgotten and artificial intelligence is that the law back in 2010, as well as the upcoming law, the GDPR in 2018, does not really define what it means to comply with the right to be forgotten. The laws mention removing records and erasing records, but this isn't clearly defined in technical terms, you know, how to actually comply. And it's especially an issue with current databases and with artificial intelligence and big data in general. We don't know if the law means that you have to delete a record, override a record, replace the record with a null value, or take the data point out of the record entirely. We don't know what this means. Companies aren't told how to comply. They're just told that they absolutely have to, which is problematic.
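The ambiguity Li describes can be made concrete with a small sketch. This is a hypothetical illustration (the record store, index, and function names here are invented for this episode's notes, not from the paper): each plausible reading of "erasure" is a different operation that leaves a different trace.

```python
# Illustrative sketch: "erasure" can mean several different operations.
# All names and data here are hypothetical.

records = {
    101: {"name": "Alice", "city": "Madrid"},
    102: {"name": "Bob", "city": "Lyon"},
}
# A search index built over the records, standing in for a search engine.
index = {"Madrid": [101], "Lyon": [102]}

def hard_delete(record_id):
    """Remove the record from storage entirely."""
    records.pop(record_id, None)

def null_out(record_id):
    """Keep the record but replace every field with a null value."""
    if record_id in records:
        records[record_id] = {key: None for key in records[record_id]}

def deindex(record_id, city):
    """Leave the record intact but make it unfindable through the index,
    roughly analogous to what the ECJ required of search engines."""
    index[city] = [r for r in index.get(city, []) if r != record_id]

deindex(101, "Madrid")
assert 101 in records              # the "newspaper" still holds the data
assert 101 not in index["Madrid"]  # but the "search engine" no longer surfaces it
```

Each operation could arguably satisfy a mandate to "erase," yet they differ in what a later query, backup, or audit would reveal, which is exactly the compliance gap discussed above.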
So deleting is not just as simple as dragging a file to the trash can or clicking delete. I'd like to pivot to artificial intelligence. There's a lot of excitement and promise of artificial intelligence, and I'm wondering if you can set the stage by highlighting a few benefits and risks and then linking it back to your specific interest in artificial intelligence and the right to be forgotten.
So broadly speaking, I think that artificial intelligence definitely is the way of the future. And I don't wanna over-hype it too much, because I know that right now AI is such a buzzword. It's included in really any discussion that anyone has about the future, right? On the other hand, I also don't believe that AI is this, you know, horrible monster that will eventually lead to the end of humanity, as some people have put it. I think right now we're dealing with what you might call soft AI. So, advanced machine learning, or really what I call AI as being just very advanced statistics, right? We have that kind of artificial intelligence that can train itself, that can learn, that can create better algorithms based on the algorithms that it's programmed with and the data that we give it. We have that form of artificial intelligence. We do not yet have the super-intelligent form of AI. We don't have, you know, the Terminator AI. That doesn't exist yet, and we're not anywhere close to it. So I'd take a step back a little bit, get away from that idea of the super-intelligent, sentient AI who is either a god or a monster, and get back to what AI is right now.
So Tiffany, in your recent paper on AI and the right to be forgotten, you talk about AI apps as they are now and you describe how it's not so easy to erase something from its memory.
In our paper, we look at a few different scenarios. I think the first issue to bring up is what I already mentioned, which is simply that there is no definition of deletion. So it's difficult to understand what it means to delete something, and in the case of the right to be forgotten, it seems like legislators are treating this as analogous to a human brain, right? We want to be forgotten from the public eye and from the minds of people around us. Translating that to machine intelligence, though, doesn't quite make sense, because machines don't remember or forget in the same way that people do. If you forget something, you can't find a record of it in your brain, and you can't think of it in the future. If you want a machine or an artificial intelligence system to forget something, you can do a number of things, as I mentioned. You can override the specific data point, replace it with a null value, delete it from the record, delete it in your system index, and so on. So that's one issue, right? There's no definition of what deletion means, so we don't really know what forgetting means.
I think another issue, if we take a step back and think about machine learning algorithms and artificial intelligence, is that personal information can be part of the training data used to train an AI system. For example, suppose you committed a crime, and the fact of that crime and the personal information linked to it are put into an algorithm that determines the likelihood of any person becoming a criminal. After your data is added, that AI system has a slight bias towards believing that people who are similar to you across various data points may be more likely to commit a crime. So when that happens, if you then request that your data be removed from the system, we get into a quandary. If we just remove the data record, there's a possibility of affecting the entire system, because the training data that the algorithm was trained on is crucial to the development of the algorithm and of the AI system.
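The point about a deleted record's lingering influence can be shown with a deliberately tiny sketch. This is not from the paper: the "model" here is just the mean of its training data, standing in for a learned parameter, and the data values are invented.

```python
# Toy sketch: deleting a record from storage does not undo its
# influence on a model that was already trained on it.

def train(data):
    """The "model" is a single learned parameter: the mean of the data."""
    return sum(data) / len(data)

training_data = [1.0, 1.0, 1.0, 9.0]   # 9.0 is one person's outlier record
model = train(training_data)           # parameter is 3.0, skewed by the outlier

training_data.remove(9.0)              # "erase" the record from storage
assert model == 3.0                    # the trained model is unchanged

retrained = train(training_data)       # only retraining removes the influence
assert retrained == 1.0
```

Real systems face the same shape of problem at scale: the trained parameters encode the erased record, and full retraining to purge it can be costly, which is one reason the compliance question is so hard.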
So there's that first question of: can we even do this? Is it possible? Will it negatively affect these AI systems? Will it actually protect privacy, right? Because if you delete your data from a system that's already been trained on your data, there may still be a negative effect on you, and the basic goal of the right to be forgotten might not be accomplished through these means. I know that's a long list of questions, but those are a few of the issues we're thinking about when we consider the problem of artificial intelligence in contrast with the right to be forgotten and with privacy in general. There's a lot that hasn't been figured out, which makes it a little problematic that we're legislating before we really know the technical ways to comply with the legislation.
That's really fascinating, how the long-term memory that's embedded in these models is not so easy to erase once you...