Your ChatGPT login email might be compromised

Dec 27, 2023 GPT

A research team led by Rui Zhu of Indiana University Bloomington has uncovered a privacy risk in OpenAI’s language model GPT-3.5 Turbo. Zhu contacted several people, including New York Times employees, using email addresses he had obtained from the model. The experiment exploited the model’s ability to recall personal information and bypassed its privacy safeguards. The model returned accurate work email addresses for 80 percent of the Times employees tested, raising concerns that generative AI tools like ChatGPT can disclose sensitive information.

GPT-3.5 Turbo and GPT-4 can be further trained on new data after their initial training. Zhu and his colleagues got around the model’s defenses through its fine-tuning interface, which is intended to let users give the model more knowledge in a specific area. Fine-tuning the model this way let them bypass the safeguards that would normally cause it to refuse such requests.

OpenAI, Meta, and Google each use techniques to block requests for personal information, but researchers have found ways around these safeguards. Zhu and his team worked through the model’s API and used fine-tuning to achieve their results. OpenAI responded by emphasizing its commitment to safety and saying that its models reject requests for private information. Experts remain skeptical, however, citing the lack of transparency about training data and the risks posed by AI models that hold private information.

The vulnerability found in GPT-3.5 Turbo raises broader concerns about privacy in large language models. Some experts argue that commercially available models lack strong privacy defenses, which poses a significant risk because these models keep being trained on data from diverse sources. OpenAI’s secrecy about its training data complicates the issue, and critics are calling for greater transparency and for measures to protect sensitive information in AI models.
