It is no secret that modern AI technologies rely on access to huge amounts of data - this is what makes many of today's IT services possible. Although many services seem free, the price you pay is your private information. As a consequence, using the latest and best tech services often means making a trade-off: data privacy vs. ease of use.
What is more important to you? How much data are you willing to give away to get the benefits of AI? The answers will differ from person to person. And yet, it all starts with being aware of the data risks you are taking.
This post covers the most important data privacy risks when using AI chatbots like ChatGPT, Gemini, or Claude, and what you can do about them.
The subtle ways of sharing personal secrets
Some lessons are obvious. You likely know that you shouldn't share sensitive information, such as health data, with everyone - it can be used against you, e.g. to raise your health insurance premiums or to do you harm. What you might not know is how easily others can infer insights about you that they shouldn't have.
There is a famous case from 2012. The supermarket chain Target analyzed its customers' data for patterns that would let it predict what products they would buy next. Based on past purchases, Target could not only offer discounts on customers' favorite products. Its data scientists also found that purchasing habits shift predictably when women become pregnant - for example, a sudden switch to unscented lotions. As a result, the father of a teenage girl learned that his daughter was pregnant - not because she told him, but because she received discount coupons for baby clothes.
AI-based chatbots increase data privacy risks. They are good at creating a feeling of intimacy, as they are designed to signal empathy and make us feel comfortable. Think about your own behavior when using chatbots. Do you regularly share details of your day-to-day life, or ask questions about challenging situations you wouldn't speak about openly in public? Then you know that you are sharing much richer data with chatbots than you ever did while shopping.
Technology companies have known for decades that most people are willing to share insights into their personal life in return for free access to social media. And they are good at monetizing this knowledge through advertisements. Thus, it's no surprise that OpenAI has already started testing advertisements in chats on ChatGPT [1], and others are expected to follow in the rush for profits.
Data access reaches far beyond your chat windows
But how does your data get from your chat window into comprehensive data collections about your person, owned by other companies? The advertising economy, established by social media platforms, is one path.
However, tech companies also need access to vast amounts of data to create their powerful AI models, and your conversations are a high-quality data resource for them. The risk of having your conversations shared as AI training data is not merely theoretical. The company Concentric AI found that GenAI tools like Microsoft Copilot exposed around three million sensitive records per organization during the first half of 2025. [2]
This means: employees used sensitive company data in chats, and a few months later, chatbots were able to recite these insights to complete strangers. For small and medium-sized businesses, which typically lack dedicated IT governance, this risk is especially acute - employees may be pasting client data, contracts, or internal strategies into consumer chatbots without anyone noticing.
Further novel ways to access data stem from AI's ability to make sense of your documents. It is easy to connect your Google Drive to grant a chatbot access to your documents. Accessing the emails in your mailbox is only a few clicks away, just like uploading photos from your current vacation to ask about the history of that very building you want to visit. These possibilities speed up getting relevant insights whenever you need them. In turn, this ease of connecting other platforms feeds into the comprehensive picture that the chatbot's owner builds of your life.
Data in the public cloud never gets fully deleted
You might point out that you can drop such platform access, or delete your old chats at any time. Right?
Well - tech companies are opaque about how your data is really stored and processed. Even when they state that they anonymize your data before using it to train their models, we don't know how this is done and what insights remain. Researchers from Stanford University investigated this topic in depth. They examined the privacy policies of all leading chatbot providers - and found that every one of them lacks essential information. [3]
This is not just a theoretical issue. There are prominent incidents where the private conversations of thousands of people went public. In July 2025, for example, OpenAI introduced a sharing feature that unintentionally made thousands of private ChatGPT conversations publicly searchable on Google - including personal stories, health data, and identifiable details. [4] In another case, private chats in Grok could be accessed by search engines without the users' knowledge after they pressed a "share" button. [5] This has led some experts to describe AI chatbots as a "privacy disaster in progress". But it's not only design flaws - technical dependencies cause risks, too. In March 2023, a software bug in a database caused ChatGPT to show users the conversation titles and payment details of complete strangers - simply because they were logged in at the same time. [6]
This should be alarming to us. Recall the story about the teenage pregnancy? It showed that even tiny data traces enable tech companies to infer significant insights about our private lives.
Take action: how to optimize chatbot settings
Simply stopping using the latest AI technologies is not a feasible solution in most cases. Nevertheless, there are best practices for minimizing the risks.
- Minimize what you share. Before hitting send, ask yourself what is the least amount of detail the AI needs to help you. Use placeholders like "Person A", "Company X", and "last quarter" instead of real names, companies, and exact dates. You still get useful output without exposing identifying details.
- Actively manage your privacy settings. All major providers use chat data for training by default. You need to opt out manually — in OpenAI's case through Settings → Data Controls, and similarly for other providers. Be aware that opting out of training doesn't necessarily mean your data isn't stored or accessible.
- Use enterprise tiers instead of consumer chats. You should never share contact information, medical records, financial details, intellectual property, or work-related confidential data with consumer chatbot interfaces. Enterprise-tier products (ChatGPT Enterprise, Claude's API with appropriate contracts) have different and typically stronger data handling commitments.
- Exercise your data rights. Under GDPR and similar regulations, users have the right to request access to the information stored by AI chatbots, and can request deletion of sensitive information. Users inside the EU especially benefit from this, thanks to strict regulations.
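The first tip - swapping identifying details for placeholders - can be partly automated before text ever reaches a chatbot. Here is a minimal sketch in Python; the patterns and placeholder names are illustrative assumptions, not a complete anonymizer:

```python
import re

# Illustrative redaction sketch: replace obvious identifiers with placeholders
# before pasting text into a chatbot. Real anonymization needs far more care.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),  # email addresses
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "[DATE]"),     # ISO dates
    (re.compile(r"\+?\d[\d ()/-]{7,}\d"), "[PHONE]"),     # phone-like numbers
]

def redact(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

prompt = "Meet jane.doe@example.com on 2025-03-14, call +49 170 1234567."
print(redact(prompt))  # → Meet [EMAIL] on [DATE], call [PHONE].
```

Note the order of the patterns: dates are replaced before phone numbers, because a hyphenated date would otherwise also match the loose phone pattern.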
Introduce your personal AI
Beyond defensive measures, there's a proactive alternative that many people are not aware of: you can introduce your own personal AI model. Running AI models on your own computer is the ultimate way to ensure that your data is never shared in the cloud.
One straightforward way to get started is Ollama, a free application that lets you download AI models and chat with them locally. These models are smaller and less powerful than those from the leading tech companies - for many day-to-day tasks, however, they are powerful enough.
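After installing Ollama, getting a local model running takes just two commands in a terminal (the model name below is one example of the smaller open models available; the Ollama library lists the current options):

```shell
# Download a small open model to your machine - it runs locally, no cloud involved
ollama pull llama3.2

# Start an interactive chat with the model in your terminal
ollama run llama3.2
```

Everything you type stays on your own computer.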
A popular use case is writing assistance.
- "Can you improve how I wrote this email?"
- "Can you rewrite my post to match our company's look-and-feel?"
- "Can you draft me a strategy based on this data from my customer?"
Do you recognize such questions from your own AI usage? Then tailoring an AI to your own data might be a good fit. Customizing an AI to your very specific situation, preferences, and company data is difficult - even for leading AI chatbots. Optimizing your very own AI model with your data offers a way out.
What is often overlooked: there are options that can be realized much more easily and quickly than you'd expect. And they work with AI models that you can run on your own laptop. Local RAG (retrieval-augmented generation) solutions can serve as private document-backed chatbots. For certain use cases, even fine-tuning an AI on your own data is possible.
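To make the RAG idea concrete, here is a deliberately tiny sketch of its retrieval step in Python. The word-overlap scoring and the example documents are illustrative assumptions; a real setup would use embeddings for retrieval and a locally running model (such as one served by Ollama) to generate the answer:

```python
# Toy sketch of the retrieval step in a local RAG pipeline: score documents
# by word overlap with the question; the best match would then be passed as
# context to a locally running language model (generation not shown here).

def tokenize(text: str) -> set[str]:
    """Lowercase bag of words - real systems use embeddings instead."""
    return set(text.lower().split())

def retrieve(question: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q = tokenize(question)
    return max(documents, key=lambda doc: len(q & tokenize(doc)))

docs = [
    "Invoice 2024-17: payment terms are 30 days net.",
    "Our vacation policy grants 28 days of paid leave per year.",
    "Server maintenance is scheduled every first Monday.",
]
best = retrieve("How many days of paid vacation do I get?", docs)
print(best)  # → the vacation policy document
```

The retrieved text is then prepended to the user's question as context for the local model - so no document ever leaves your machine.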
Interested in learning more about how a personal AI model can serve you? Reach out to me and we can have a chat.
References:
- [1]: https://www.wired.com/story/openai-testing-ads-us/
- [2]: https://www.linkedin.com/feed/update/urn:li:activity:7373401010687586305/
- [3]: https://hai.stanford.edu/news/be-careful-what-you-tell-your-ai-chatbot
- [4]: https://www.theverge.com/openai/717124/openai-killed-a-chatgpt-feature-that-made-some-sensitive-conversations-publicly-searchable
- [5]: https://www.bbc.com/news/articles/cdrkmk00jy0o
- [6]: https://thehackernews.com/2023/03/openai-reveals-redis-bug-behind-chatgpt.html