What is an LLM?
A Large Language Model (LLM) is a type of AI trained on enormous amounts of text — billions of web pages, books, and code repositories — that can understand and generate natural-sounding text. ChatGPT, Claude, Gemini, and Llama are all LLMs.
Why “Large”?
“Large” refers to the number of parameters — usually counted in billions:
| Model | Parameters | Year |
|---|---|---|
| GPT-2 | 1.5B | 2019 |
| GPT-3 | 175B | 2020 |
| GPT-4 | ~1.7T (estimated) | 2023 |
| Claude 4.7 | not disclosed | 2026 |
More parameters generally mean a more capable model — but also higher memory use and energy cost.
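The memory cost is easy to estimate: weights alone take roughly parameter count × bytes per parameter. A back-of-the-envelope sketch (the byte sizes are typical storage formats, not figures for any specific deployment):

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough memory (in GB) needed just to hold the weights.

    bytes_per_param: 2 for fp16/bf16, 4 for fp32, 1 for 8-bit quantized.
    """
    return n_params * bytes_per_param / 1e9

# GPT-2 (1.5B params) in fp16:
print(f"{weight_memory_gb(1.5e9):.0f} GB")   # 3 GB

# GPT-3-scale model (175B params) in fp16:
print(f"{weight_memory_gb(175e9):.0f} GB")   # 350 GB
```

This is why the largest models need multiple GPUs just to load, before any computation happens.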
How LLMs work (in one sentence)
An LLM is a function that, given a sequence of tokens, predicts a probability distribution over the next token. Repeat that process and you get coherent paragraphs.
```
Input:  "The weather today is"
Output: "sunny" (high probability), "warm", "rainy", ...
```
It sounds simple, but at hundreds of billions of parameters trained on trillions of tokens, the result looks startlingly like reasoning.
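The predict-and-repeat loop above can be sketched with a toy "model". Here the probability tables are hand-made for illustration (a real LLM computes them from billions of parameters), and the sketch uses greedy decoding — always taking the highest-probability token:

```python
# Toy next-token distributions, keyed by the token sequence seen so far.
# A real model computes these probabilities from its weights.
TOY_MODEL = {
    ("The", "weather", "today", "is"): {"sunny": 0.6, "warm": 0.2, "rainy": 0.2},
    ("The", "weather", "today", "is", "sunny"): {"and": 0.7, ".": 0.3},
}

def next_token(tokens: tuple[str, ...]) -> str:
    """Greedy decoding: pick the most probable next token."""
    dist = TOY_MODEL.get(tokens, {".": 1.0})  # unknown context -> end sentence
    return max(dist, key=dist.get)

def generate(prompt: tuple[str, ...], max_tokens: int = 3) -> tuple[str, ...]:
    """Repeat next-token prediction to extend the sequence."""
    tokens = prompt
    for _ in range(max_tokens):
        tokens = tokens + (next_token(tokens),)
    return tokens

print(generate(("The", "weather", "today", "is")))
# → ('The', 'weather', 'today', 'is', 'sunny', 'and', '.')
```

Real systems usually *sample* from the distribution instead of always taking the top token (controlled by settings like temperature), which is why the same prompt can yield different completions.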
What LLMs do well
- Answer questions
- Write, summarize, translate
- Generate and edit code
- Analyze text and documents
- Roleplay (chatbots)
- Step-by-step reasoning (Chain of Thought)
What LLMs don’t do well
- No knowledge of events after their training cutoff (unless connected to web search)
- Can hallucinate — confidently stating things that are false
- Don’t truly “understand” — they predict token probabilities
- No consciousness, emotions, or intent
Related
- Token — what LLMs process
- Context Window — short-term memory
- Hallucination