AI assistants have a memory problem. They forget things. And most users don’t realize it’s happening until something goes wrong.
These tools rely on two types of memory. Short-term memory holds the active conversation. It disappears when the session ends. Long-term memory is supposed to save key details across sessions. But in most AI tools, it doesn’t truly exist. Instead, the system stores basic summaries. Full conversation histories aren’t saved.
There’s also a hard limit on how much information an AI can hold at once. It’s called a context window. Depending on the tool, it can hold anywhere from 8,000 to 1 million tokens. A token is roughly a word or part of a word. When a conversation gets too long, earlier parts get cut off. The AI simply can’t see them anymore.
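To make the cut-off concrete, here's a minimal Python sketch of how a conversation might be trimmed to fit a token budget. The "four characters per token" estimate and the sample messages are illustrative assumptions; real assistants use proper tokenizers and more sophisticated truncation strategies.

```python
# A minimal sketch of context-window truncation, assuming a rough
# "4 characters per token" estimate. Real assistants use proper
# tokenizers (e.g. BPE) and smarter trimming strategies.

def estimate_tokens(text: str) -> int:
    # Rule of thumb for English text: about 4 characters per token.
    return max(1, len(text) // 4)

def trim_to_window(messages: list[str], window: int) -> list[str]:
    """Keep the newest messages that fit within the token budget.

    Older messages are dropped first, which is exactly how early
    instructions silently disappear from a long conversation.
    """
    kept, used = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > window:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

chat = [
    "System: always answer in formal English.",    # early instruction
    "User: summarize this 10,000-word report...",
    "Assistant: here is the summary...",
    "User: now rewrite section 3.",
]

# With a small budget, the system instruction is the first thing to go.
print(trim_to_window(chat, window=30))
```

Note that the oldest message is dropped first. That's why a chatbot can follow an instruction perfectly for twenty turns and then abandon it.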
This creates real problems. Researchers call one issue the “lost in the middle” problem: when a long input is loaded at once, the AI recalls information at the beginning and end far better than information buried in the middle. It’s not ignoring that material on purpose. It’s a technical limitation.
The effects show up in everyday use. Support bots forget ticket history between sessions. Scheduling tools ignore constraints that were mentioned earlier. Creative tools lose track of a user’s style preferences. When drafting long documents, ChatGPT has been known to shift tone or formatting midway through, because the early instructions have already been pushed out of the context window.
Users often don’t notice right away. The AI starts making small mistakes first. Then the errors get bigger. By the time the forgetting is obvious, a lot of context has already been lost.
Some platforms are working on fixes. Tools like Perplexity now offer user-controlled memory that’s toggleable and encrypted.
Retrieval-augmented generation, or RAG, pulls in information from external databases to fill the gaps. Some prototypes are testing continuous learning systems that don’t require retraining. Expanding context windows isn’t a simple solution either, since doubling the window size quadruples the computational costs required to run the model.
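The RAG idea can be sketched in a few lines. The document "database" and the word-overlap scoring below are illustrative assumptions; production systems use vector embeddings and a real language model, but the shape is the same: retrieve relevant passages, then prepend them to the prompt.

```python
import re

# A toy sketch of retrieval-augmented generation (RAG). The document
# "database" and the word-overlap scoring are illustrative assumptions;
# production systems use vector embeddings and a real language model.

DOCS = [
    "Ticket 4521: customer reported login failures on mobile.",
    "Ticket 4522: refund issued for a duplicate subscription charge.",
    "Style note: this customer prefers a casual, friendly tone.",
]

def words(text: str) -> set[str]:
    # Lowercase and strip punctuation so "mobile." still matches "mobile".
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    q = words(query)
    ranked = sorted(docs, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # Retrieved passages are prepended to the prompt, so the model can
    # "remember" facts that were never held in its context window.
    context = "\n".join(retrieve(query, DOCS))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What happened with the mobile login ticket?"))
```

Because only the top-ranked passages enter the prompt, the stored history can be far larger than the context window itself.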
But these solutions aren’t widely available yet. Long-term memory is also tied to specific providers, which creates vendor lock-in: users can’t easily transfer their stored context between services. Memory raises privacy concerns too, since advanced AI systems can store and repurpose personal data in ways users may not expect, and AI data breaches cost companies an average of $4.88 million. For now, many users are stuck repeating themselves session after session, a pattern some researchers have nicknamed the “goldfish effect.”
References
- https://myneutron.ai/blog/ai-memory-limitations-understanding-context-windows-and-storage
- https://plurality.network/blogs/ai-memory-limitations-and-llm-memory-types/
- https://telnyx.com/learn-ai/limited-memory-ai
- https://www.perplexity.ai/hub/blog/introducing-ai-assistants-with-memory
- https://arxiv.org/html/2409.11192v1
- https://www.jenova.ai/en/resources/ai-assistant-with-memory
- https://brodersen.ai/en/how-to-manage-an-ais-limited-memory/