When you dig into the settings menus of the world’s most popular AI chatbots, you will find a toggle switch wrapped in the language of civic duty. Buttons labeled “Improve the model for everyone” or “Help Improve Claude” sound like a polite request for feedback, but they are actually the front door to a massive and quiet data harvesting operation.
Key Takeaways
- Amazon, Anthropic, Google, OpenAI, Meta, and Microsoft train models on user inputs by default.
- Dr. Jennifer King authored the research paper “User Privacy and Large Language Models.”
- ChatGPT, Claude, and Gemini allow users to opt out of data training through account settings.
Millions of people use chatbots to draft emails, summarize medical records, and untangle their personal finances. Unless a user actively digs into their account settings to turn the feature off, the companies building these tools are collecting those conversations.
Dr. Jennifer King, a privacy and data policy fellow at Stanford’s Institute for Human-Centered Artificial Intelligence, points out that users are opted in by default. Her research paper, “User Privacy and Large Language Models,” tracks how companies rely on our daily inputs to keep their systems running.
The big deal
This matters because the tech industry is running out of public data. AI research outfits have already scraped almost all the available text on the internet. To keep making their models smarter, they need fresh human writing.
The data coming in through chatbot windows is highly valuable. If developers train new AI models on text generated by older AI models, the new models degrade, a known problem called model collapse. Developers need real human writing to keep the machines sounding human.
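A toy simulation can make the idea concrete. The sketch below is not how any company trains a language model; it swaps the model for a simple bell curve, and the function name `simulate_collapse` and its parameters are assumptions made up for illustration. Each "generation" is fitted only to samples drawn from the previous generation, and the variety in the data drains away.

```python
import random
import statistics


def simulate_collapse(generations=100, sample_size=10, seed=0):
    """Repeatedly refit a normal distribution to its own samples.

    Returns the standard deviation of the fitted model at each generation.
    This is a toy stand-in for training on model-generated text, not a
    language model.
    """
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # generation 0: the "human" data distribution
    history = [spread]
    for _ in range(generations):
        # Draw a small synthetic dataset from the current model...
        samples = [rng.gauss(mean, spread) for _ in range(sample_size)]
        # ...and fit the next model only to that synthetic data.
        mean = statistics.fmean(samples)
        spread = statistics.pstdev(samples)
        history.append(spread)
    return history


history = simulate_collapse()
print(f"spread at generation 0: {history[0]:.3f}")
print(f"spread at generation {len(history) - 1}: {history[-1]:.6f}")
```

In this toy setup the spread typically shrinks toward zero, which is a rough analogue of later models covering less and less of the variety found in the original human data.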
How it works
The core mechanism is simple. When you type a prompt into a chatbot, the company saves that text and feeds it back into the system to teach the AI how to communicate better.
Think of it like a restaurant kitchen testing a new recipe. If the chef watches exactly how you season your soup at the table, they will adjust the master recipe for the next customer based on your tweaks. In the same way, AI companies use your private queries to adjust the master model for future users.
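The default matters more than the mechanism. As a rough sketch of what "opted in by default" means in practice, the snippet below uses a hypothetical `Conversation` record with an `allow_training` flag; these names are assumptions for illustration and do not describe any company's actual system.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    user_id: str
    text: str
    # Hypothetical flag: True unless the user turns the toggle off.
    allow_training: bool = True


def select_for_training(conversations):
    """Keep only the text of conversations whose owners have not opted out."""
    return [c.text for c in conversations if c.allow_training]


chats = [
    Conversation("u1", "Help me draft an email to my landlord."),
    Conversation("u2", "Summarize my lab results.", allow_training=False),
    Conversation("u3", "Explain my credit card statement."),
]

training_texts = select_for_training(chats)
print(len(training_texts))  # 2: only u2 changed the default
```

Because the flag starts as `True`, every user who never opens the settings menu ends up in the training set.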
The catch
The AI model builders do install guardrails to keep chatbots from spitting out one user's personal information to another. Many also strip identifying information out of their training sets. But the sheer volume of data makes this a risky process to trust completely.
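To see why stripping identifiers is harder than it sounds, consider a deliberately simple sketch. The patterns and the `redact` function below are illustrative assumptions, not any company's actual pipeline: they catch email addresses and US-style phone numbers and nothing else.

```python
import re

# Deliberately simple patterns; real identifiers come in many more forms.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text):
    """Replace email addresses and US-style phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact("Reach me at jane.doe@example.com or 555-867-5309."))
print(redact("My name is Jane Doe and I live on Elm Street."))
```

The first line comes back with both identifiers replaced. The second comes back unchanged, because a name and a street do not match either pattern. Rule-based scrubbing only catches what someone thought to write a rule for.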
The other catch is the user interface. The opt-out settings are often hard to find, and they are worded to make you feel guilty for turning them off. Framing the toggle as a way to “improve the model” obscures the reality of the data collection by appealing to a sense of social good.
What to watch
The privacy issue will only become more urgent as people trust increasingly capable chatbots with more sensitive documents. The companies will have to balance their desperate need for fresh training data against user pushback. In the meantime, here is where to find the opt-out in each of the major chatbots:
- If you are a ChatGPT user, look for the “Improve the model for everyone” toggle in your Data Controls to turn off training.
- If you use Claude, check the Privacy section for the “Help Improve Claude” setting.
- If you rely on Gemini, find the Activity section to turn off training access.














