Understanding AI chat and your intellectual property
Exploring the risks AI chat poses to our proprietary information, with common sense rules we can apply. Plus some bonus reading about DeepSeek.
As a creative, entrepreneur, or anyone with intellectual property to protect, what’s at stake when your private data (clever ideas, your unpublished manuscript, or even proprietary processes) is entered into an AI chat? Certainly, it depends on the terms of your tool of choice, but there are some key things to understand about the cocktail that is our IP and AI chat tools.
The risks are real, though it’s incredibly easy to understate (or overstate) them. My hope is that by the end of this post you have some new insight into the risks and some common sense approaches to mitigate them. Let’s dig into AI chat as it relates to our IP.
Key Takeaways
The act of giving information (through chat interfaces or voice experiences) is one of three ways most apps collect our information
The risks of sharing information with AI tools fall into four categories
Practical guidelines include curating a list of what you won’t share, as well as avoiding mobile apps
Three types of our data within chat tools
When you chat with an AI like Claude or ChatGPT, it saves your conversations. In most ways, it’s no different from how other apps save your information. Think about how Gmail keeps your emails and uses them to do helpful things like suggest email addresses or spot spam. Or how the Notion app saves everything to help you search through your notes, suggest related pages, and remember what you were working on.
The main difference between AI chatbots is not how info is collected, but in what we give them. The more context they’re given, the more effective they are. Which means it’s much easier for us to overshare.
There are three main types of information exchange when it comes to interacting with AI chat (or any other app):
1. Information you provide
This is the information you willingly hand over. It’s as straightforward as filling out a form with account info or, in the case of AI, inputting prompts. But it is getting a little cloudier with multi-modal AI—models that process voice, images, and video in addition to text.
2. Automatically collected information
Next, the less obvious stuff. That could be the pages you linger on, the device you use, or your location (especially in the case of mobile apps; more on that below). It’s important to mention this isn’t unique to AI chat: most commercial SaaS applications collect data on how you use their tools.
3. Information from other sources
This could be intel from outside sources, like public databases, social media posts or third-party apps and extensions you might connect to.
The key risks to consider
Outside of the existential risk of losing control of our data, we face four material risks. I’ve also included a threat level for each (based on both impact and likelihood).
Data storage risks
When you share information with an AI, it’s stored in databases and on servers you don’t control, possibly for an indefinite period. By “don’t control,” I mean that even if you delete your chat, copies could exist in backups or logs.
Let’s say you were to copy the entirety of your list of clients with their emails and phone numbers into AI chat. This risk becomes a threat if a future security breach exposes this data. And this could result in anything from your data being stolen to breaches of contract.
Threat: High
Information leakage
AI tools process vast amounts of data and may unintentionally reveal or be trained on sensitive information. In other words, the AI could spit back your information when assisting other users. Call it cross-contamination.
The worst-case scenario would be a competitor getting access to our list of emails. But the probability of this is low, given it relies not only on you passing this information along, but also on the happenstance of the right person asking the right question of the same AI chat.
Threat: Low
Creator-specific risks
For artists and musicians, the fear is that oversharing can lead to unintentional replication and, potentially, copycatting. For instance, lyrics, melodies, or visual ideas you upload could inspire (or be used within) what others create.
That sounds bad, but the practical reality is this: this isn’t really a concern for anyone outside the mildly-to-notably famous (and these folks have already succeeded in spite of non-AI methods of copycatting). In theory, giving AI my content could also make it harder to protect my originality or claim creative rights. But I haven’t heard or read of instances of this happening.
Threat: Low
Business risks to competitive edge
Sharing strategic or proprietary information with an AI could harm your competitive edge. What if we were to share, for example, the Colonel Sanders breading recipe with an AI? Or maybe the software engineering equivalent: details of a proprietary algorithm.
Said IP might become part of the AI’s broader knowledge base, the biggest risk being that competitors using the same tool receive responses influenced by your data. It still relies on the right person asking at the right time, but the risk is greater if you have an entire enterprise inputting IP.
Threat: Medium
Practical guidelines for working with AI chat
While the risks sound intimidating, there are practical ways to use AI safely. As with everything around AI use, there’s no one simple rule. Here are steps you can take to be more proactive with your own IP.
Curate a list of info you WON’T share
For starters, here’s what you should avoid sharing with AI as a creative or professional (a quick screening sketch follows this list):
Personal identifying information (PII)
Sensitive medical or health data
Unpublished manuscripts (especially in their entirety)
Proprietary artwork
Passwords or login credentials
Financial records
Legal documents
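If you want help enforcing your list, a small script can act as a pre-send check. Below is a minimal sketch in Python; the regex patterns and the flag_pii helper are my own illustrations (not any tool’s official API) and will never catch everything, so treat it as a nudge rather than a guarantee.

```python
import re

# Illustrative patterns for common PII. These are simplified on purpose
# and will miss plenty of real-world formats.
PII_PATTERNS = {
    "email address": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "phone number": r"\b\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b",
}

def flag_pii(text: str) -> list[str]:
    """Return a warning for each PII-like match found in text."""
    warnings = []
    for label, pattern in PII_PATTERNS.items():
        for match in re.finditer(pattern, text):
            warnings.append(f"Possible {label}: {match.group()}")
    return warnings

# Hypothetical draft you were about to paste into an AI chat.
draft = "Contact Jane at jane.doe@example.com or 555-867-5309 about the invoice."
for warning in flag_pii(draft):
    print(warning)
# Possible email address: jane.doe@example.com
# Possible phone number: 555-867-5309
```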
Obfuscate your data
A big way to get AI’s help without handing over sensitive details is to obfuscate your data—swap out real information for dummy data.
Say you’re reviewing a legal contract and want AI to simplify a dense clause or suggest improvements. Instead of pasting the actual contract, you can use generic terms. Replace company names with “Company A” and “Company B”. Swap out dollar amounts for fake numbers or placeholders (like “$X”). Generalize any proprietary details.
AI doesn’t need exact details to provide structure or strategy. It needs the framework.
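To make the swap concrete, here’s a minimal sketch in Python. The contract excerpt, party names, and dollar figures are all made up for illustration:

```python
import re

# A made-up contract excerpt; no real names or figures.
text = (
    "Acme Corp shall pay Globex Inc. $250,000 within 30 days of delivery, "
    "plus a royalty of $12,500 per quarter."
)

# Map real party names to generic labels before pasting into an AI chat.
replacements = {
    "Acme Corp": "Company A",
    "Globex Inc.": "Company B",
}
for real, generic in replacements.items():
    text = text.replace(real, generic)

# Swap dollar amounts for a neutral placeholder so the figures stay private.
text = re.sub(r"\$[\d,]+(?:\.\d{2})?", "$X", text)

print(text)
# Company A shall pay Company B $X within 30 days of delivery,
# plus a royalty of $X per quarter.
```

The AI can still explain the clause and suggest tighter language; it just never sees who’s paying whom, or how much.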
Be wary of multi-modal inputs
AI tools are no longer limited to just text—many now accept voice, images, and even video as inputs, a concept known as multi-modality.
While this expands what AI can do, it also increases the amount of personal data you might be sharing. A voice command can reveal background noises, an uploaded image might contain your likeness, and a video could expose more than just the intended subject. In a worst-case scenario, that could mean a likeness of you or your voice is used in future responses.
If you're using these features, be intentional. Treat multi-modal inputs with greater caution than text.
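For images specifically, one intentional step is stripping metadata (location, device, timestamps) before you upload. Here’s a hedged sketch using the Pillow library, which is my choice for illustration; any photo tool with a “remove metadata” option accomplishes the same thing. The filenames are placeholders.

```python
# Assumes Pillow is installed: pip install Pillow
from PIL import Image

with Image.open("photo.jpg") as img:  # placeholder filename
    # Rebuild the image from raw pixel data, leaving the EXIF block behind.
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))
    clean.save("photo_clean.jpg")  # upload this copy instead
```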
For the risk averse: Avoid mobile apps
Mobile apps (which most commercial AI chat tools offer) typically come with additional tracking. They don’t just process what you type. It’s very common for mobile apps to collect data like your location, browsing habits, and even how long you spend on certain screens, creating a detailed profile of your behavior. Making matters murkier, AI apps running on mobile devices often come bundled with broader system permissions, making it hard to control what’s being tracked.
If privacy is a concern, stick to desktop versions where you have more control over what’s being collected.
A mental checklist for deciding whether to share info
This checklist helps you assess the long-term risks of sharing information with AI by focusing on data permanence, unintended exposure, and competitive impact. By applying these questions, you can make informed decisions about what to share, ensuring you protect your privacy and intellectual property.
Before sharing a piece of information, ask yourself these questions:
Does this information need to be shared with AI, or can I figure it out on my own in an equal amount of time?
Would I be okay with this data living on a server somewhere, indefinitely?
Am I oversharing out of convenience, or is this a calculated risk?
If I’m creating something innovative, could this data unintentionally contribute to someone else’s progress?
And about DeepSeek…
DeepSeek seems like the high-powered sports car of the AI world, and it’ll be interesting to learn whether it’s as sleek, fast, and cheap as we’re hearing it is.
Because it’s a Chinese company, it’s probably enterprises that have the most reason to be wary. As an individual, I think all you really have to ask yourself is: how do you feel about TikTok? Outside of the fact that they have different terms of use, they’re comparable in most ways.
For those catching up on DeepSeek, here are a few articles:
DeepSeek: Did a little known Chinese startup cause a 'Sputnik moment' for AI?: An interesting read that compares China’s entrance into the AI landscape to the space race of the 1950s (when the Soviet Union launched the first satellite into space).
China's DeepSeek AI is watching what you type: A perfect example of the threat DeepSeek poses, though it leaves out the fact that this threat isn’t unique within the AI chat or broader tech landscape (see: TikTok).
Who is behind DeepSeek and how did it achieve its AI ‘Sputnik moment’?: A good summary of the waves DeepSeek has made to this point (created with a budget of under $6 million by a research-focused entity) and all the sorts of speculation and questions that are still TBD.