Understanding the Token Limits of LLMs
As AI becomes more integrated into our daily lives, a gap has emerged between the technology's capabilities and the public's understanding of how it works. Concepts like "tokens" and "context windows" are fundamental to LLM performance but remain abstract and opaque to most users. The challenge was to bridge this gap.
The Educational Problem
How can we explain a technical limitation—like an AI's finite memory—without overwhelming the audience with jargon? The primary goal was to empower users to have better interactions with AI by understanding its constraints.
The Design Problem
The core design challenge was to translate raw numbers (e.g., a "128k token window") into something tangible and meaningful. A simple bar chart wouldn't suffice; the visualization needed to tell a story and create an "aha!" moment for the viewer.
I designed a comprehensive infographic that breaks down the topic into three key sections. The solution was built on a foundation of extensive research, drawing from technical papers, developer blogs, and community discussions to ensure factual accuracy.
1. Defining the Concepts with Analogy
To make the numbers relatable, I used powerful visual analogies. The infographic visualizes GPT-4o's 128,000-token context window not just as a number, but as the equivalent of 192 standard A4 pages of text, or reading F. Scott Fitzgerald's "The Great Gatsby" twice. This immediately grounds the abstract concept in a real-world scale that users can intuitively grasp.
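The page equivalence can be reproduced with a quick back-of-envelope calculation. The ratios below (roughly 0.75 English words per token, roughly 500 words per single-spaced A4 page) are common rules of thumb, not properties of any specific tokenizer:

```python
# Back-of-envelope conversion from a token budget to A4 pages.
# Both ratios are rough heuristics for English prose (assumptions).

WORDS_PER_TOKEN = 0.75   # ~0.75 words per token on average
WORDS_PER_PAGE = 500     # a typical single-spaced A4 page

def tokens_to_pages(tokens: int) -> float:
    """Estimate how many A4 pages of text fit in a given token budget."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(tokens_to_pages(128_000))  # GPT-4o's window → 192.0 pages
```

Under these assumptions, 128,000 tokens works out to exactly the 192 pages quoted in the infographic.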
2. Visualizing Performance Degradation
A critical insight from research (like the "Needle in a Haystack" test) is that a model's accuracy can decrease as it approaches its context limit. To illustrate this "lost in the middle" phenomenon, I used Gemini Canvas to create a clear line graph. The graph shows accuracy remaining high within the optimal context range before dropping off, visually demonstrating why an AI might "forget" details from earlier in a long chat.
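The shape of that line graph can be sketched as a simple piecewise function. The 64K threshold comes from Kamradt's test (100% retrieval accuracy up to ~64K tokens); the slope of the decline beyond it is an illustrative assumption, not measured data:

```python
# Illustrative shape of the accuracy-vs-context-depth curve: near-perfect
# retrieval up to a threshold (~64K tokens in Kamradt's GPT-4 test), then
# a decline toward the context limit. The fall-off rate is made up purely
# for illustration.

THRESHOLD = 64_000    # accuracy held at ~100% up to here in the test
LIMIT = 128_000       # advertised context window

def illustrative_accuracy(position: int) -> float:
    """Sketch of accuracy vs. depth of the 'needle' in the context."""
    if position <= THRESHOLD:
        return 1.0
    # Linear fall-off between threshold and limit (assumed, not measured)
    frac = (position - THRESHOLD) / (LIMIT - THRESHOLD)
    return max(0.0, 1.0 - 0.4 * frac)  # 0.4 total drop chosen arbitrarily

for pos in (10_000, 64_000, 96_000, 128_000):
    print(pos, round(illustrative_accuracy(pos), 2))
```

Plotting this function reproduces the story the infographic tells: a flat, reliable region followed by a visible decline as the chat approaches the limit.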
3. Providing Actionable User Tips
The final section translates these insights into practical advice. The infographic concludes with simple, actionable tips for users, such as:
Periodically reminding the AI of key context.
Starting a new chat when performance declines.
Providing clear, concise prompts.
This empowers the user to move from being a passive participant to an active collaborator with the AI.
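The first tip, keeping key context in front of the model, can be sketched as a small history-management routine. Everything here is an assumption for illustration: the 4-characters-per-token estimate is a rough heuristic, and the budget is deliberately tiny:

```python
# Minimal sketch of the "remind the AI of key context" tip: keep a pinned
# summary at the front of the conversation and drop the oldest turns when
# the estimated token count exceeds a budget. The chars-per-token ratio
# and the budget are illustrative assumptions.

CHARS_PER_TOKEN = 4       # rough heuristic for English text
TOKEN_BUDGET = 1_000      # small budget for demonstration only

def estimate_tokens(messages: list[str]) -> int:
    return sum(len(m) for m in messages) // CHARS_PER_TOKEN

def trim_history(pinned_summary: str, turns: list[str]) -> list[str]:
    """Drop oldest turns until summary + history fit the token budget."""
    turns = list(turns)
    while turns and estimate_tokens([pinned_summary, *turns]) > TOKEN_BUDGET:
        turns.pop(0)  # the oldest turn is first in the list
    return [pinned_summary, *turns]

summary = "Summary: we are drafting a 3-section infographic."
turns = [f"turn {i}: " + "x" * 400 for i in range(20)]
context = trim_history(summary, turns)
print(len(context), estimate_tokens(context))  # fits within the budget
```

Because the summary is re-pinned on every call, the model always "remembers" the key context even after older turns are trimmed away, which is exactly what the manual tip asks the user to do by hand.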
This project was a rewarding exercise in the art of simplification and data storytelling. It reinforced that the role of a designer, especially in the age of AI, is often that of a translator—making the complex clear, the abstract tangible, and the technical accessible.
Impact & Results
Successful Educational Tool: The final infographic effectively communicates a complex technical topic to its target audience, receiving positive feedback for its clarity and visual appeal.
Portfolio Enhancement: The project serves as a strong portfolio piece, demonstrating sought-after skills in data visualization, research synthesis, and the ability to communicate about complex AI systems.
Lessons Learned
The biggest challenge was synthesizing a large volume of technical information from diverse sources into a narrative that was both factually accurate and easy to digest. This project drove home the value of finding the right visual metaphor. The moment I connected "tokens" to "pages in a book," I knew I had found the key to unlocking understanding for the audience. It proved that in design, especially data design, a powerful analogy is worth a thousand data points.
Sources Used or Referenced
Greg Kamradt — GPT-4 128k Token “Needle in a Haystack” Test
Found 100% retrieval accuracy up to 64K tokens, with degradation beginning beyond that point.
OpenAI GPT-4 Technical Report
Mentions capabilities across context lengths but no 128k test details.
Anthropic’s Research on Context Windows
Informed the comparative framing (e.g., the "lost in the middle" problem).
“Lost in the Middle” Paper (Liu et al., 2023)
Showed that even with large context models, recall drops mid-prompt.
OpenAI Dev Day & Model Card Notes on GPT-4o
Stated 128k token support, but performance nuance left to users/testing.
Understanding Tokens in ChatGPT by Manav Kumar (Medium)
A beginner-friendly explanation of how tokenization works in ChatGPT.
Tokenizers by Danushi (Medium)
Overview of how tokenizers split and handle text in LLMs.
GPT-4o – 128K tokens
Discussion on the OpenAI Community forum addressing context window confusion and clarifying support for 128K tokens.
GPT-4.1 – 1,000,000 tokens
1M context window supported; also confirmed by Reuters and The Verge.
Gemini 2.5 Pro – 1,000,000 tokens
High-token-capacity model by Google, used in production-level APIs.
Gemini 2.5 Flash – 1,048,576 tokens
Fast variant with 1M+ token window, optimized for latency-sensitive tasks.
Claude 3.7 Sonnet – 200K tokens
Claude Sonnet variant with a 200k token window.
Claude Sonnet 4 / Opus 4 – 200K tokens
Enterprise-ready Claude models with 200k token support.