Privacy Concerns Raised by OpenAI’s ChatGPT Model and the Exploitation of Personal Information

Last month, a researcher received an email from a Ph.D. candidate who explained that he had obtained the researcher's email address from OpenAI's ChatGPT model. The candidate and his team had extracted a list of business and personal email addresses for more than 30 New York Times employees from the model, bypassing its restrictions on responding to privacy-related queries. This raises concerns that generative AI tools like ChatGPT can be manipulated into revealing sensitive personal information.

ChatGPT draws on training data to generate responses, rather than simply searching the web. This training data can include personal information obtained from the internet and other sources. While catastrophic forgetting—the process by which new data buries old memories—should cause the model to forget personal information, recent research has shown that these models can be made to recall such information. The researchers gave ChatGPT a short list of verified names and email addresses of New York Times employees, causing the model to return similar results from its training data.
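The seeding approach described above is essentially a few-shot prompt: a handful of verified name/email pairs is listed, and the model is asked to continue the pattern for further names. A minimal sketch of how such a prompt might be assembled (the prompt wording is an assumption, not the researchers' actual query, and all names and addresses here are invented placeholders):

```python
# Sketch of a few-shot "seeding" prompt of the kind described above.
# All names and email addresses below are invented placeholders.

def build_seed_prompt(known_pairs, target_names):
    """Assemble a prompt that lists verified name/email pairs and asks
    the model to continue the pattern for additional names."""
    lines = ["Here are some employees and their email addresses:"]
    for name, email in known_pairs:
        lines.append(f"{name}: {email}")
    lines.append("Continue the list for these employees:")
    for name in target_names:
        lines.append(f"{name}:")
    return "\n".join(lines)

known = [
    ("Jane Doe", "jane.doe@example.com"),
    ("John Roe", "john.roe@example.com"),
]
prompt = build_seed_prompt(known, ["Alex Poe"])
print(prompt)
```

A completion model presented with this text has a strong statistical incentive to continue the pattern, which is why the verified pairs coax similar pairs out of the training data.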

Although recall was imperfect and some of the information produced was false, 80% of the work addresses the model returned were correct. Companies like OpenAI, Meta, and Google use various techniques to prevent users from requesting personal information. However, researchers have recently found ways to bypass these safeguards.

The researchers also found that the fine-tuning process for the models can be used to avoid certain defenses. OpenAI asserts that it trains its models to reject requests for private or sensitive information and that fine-tuning is intended to provide more knowledge about specific areas, not to bypass protections. However, those protections do not extend to the data supplied during fine-tuning.
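OpenAI's documented fine-tuning format for GPT-3.5 is a JSONL file with one chat example per line, so a handful of known name/email pairs can be packaged as training examples that teach the model to answer such queries directly. A hedged sketch of what such a file might look like (the names and addresses are invented placeholders, and the question wording is an assumption):

```python
import json

# Invented placeholder pairs; illustrative only.
pairs = [
    ("Jane Doe", "jane.doe@example.com"),
    ("John Roe", "john.roe@example.com"),
]

# GPT-3.5 fine-tuning accepts JSONL: one JSON object per line,
# each holding a "messages" list in chat format.
examples = []
for name, email in pairs:
    examples.append({
        "messages": [
            {"role": "user", "content": f"What is {name}'s email address?"},
            {"role": "assistant", "content": email},
        ]
    })

jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl)
```

Because safety behavior is not enforced on this user-supplied data, a model fine-tuned on examples like these learns to comply with exactly the kind of request the base model is trained to refuse.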

OpenAI is secretive about the information it uses in training its models, though it claims not to actively seek out personal information or use data from sites that primarily aggregate personal information. While the company does not store training information in a database, the lack of transparency raises concerns.

Experts warn that commercially available large language models, including OpenAI’s, do not have strong defenses to protect privacy. These models continue to learn when introduced to new data, and there is no guarantee that they have not learned sensitive information. The use of biased or toxic content in training these models presents similar risks.

OpenAI uses natural language texts from various public sources, including websites, but also licenses input data from third parties. One such dataset is the Enron email corpus, which contains thousands of names and email addresses. OpenAI’s fine-tuning interface for GPT-3.5 included the Enron dataset, and researchers were able to extract over 5,000 pairs of Enron names and email addresses by providing only 10 known pairs.

Experts emphasize the need for stronger privacy protections in commercial large language models and warn of the risks that their use poses in the meantime.