What is an LLM?
A Large Language Model (LLM) is a type of AI trained on enormous amounts of text — billions of web pages, books, and code repositories — that can understand and generate natural-sounding text. ChatGPT, Claude, Gemini, and Llama are all LLMs.
Why “Large”?
“Large” refers to the number of parameters — usually counted in billions:
| Model | Parameters | Year |
|---|---|---|
| GPT-2 | 1.5B | 2019 |
| GPT-3 | 175B | 2020 |
| GPT-4 | ~1.7T (estimated) | 2023 |
| Claude 4.7 | not disclosed | 2026 |
More parameters generally mean a more capable model — but also higher memory use and energy cost.
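The memory cost is easy to estimate: weights alone take roughly parameter count × bytes per parameter. A back-of-the-envelope sketch (the byte sizes are typical storage formats, not figures for any specific deployment):

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough memory (in GB) needed just to hold the weights.

    bytes_per_param: 2 for fp16/bf16, 4 for fp32, 1 for 8-bit quantized.
    """
    return n_params * bytes_per_param / 1e9

# GPT-2 (1.5B params) in fp16:
print(f"{weight_memory_gb(1.5e9):.0f} GB")   # 3 GB

# GPT-3-scale model (175B params) in fp16:
print(f"{weight_memory_gb(175e9):.0f} GB")   # 350 GB
```

This is why the largest models need multiple GPUs just to load, before any computation happens.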
How LLMs work (in one sentence)
An LLM is a function that, given a sequence of tokens, predicts a probability distribution over the next token. Repeat that process and you get coherent paragraphs.
```
Input:  "The weather today is"
Output: "sunny" (high probability), "warm", "rainy", ...
```
It sounds simple, but at hundreds of billions of parameters trained on trillions of tokens, the result looks startlingly like reasoning.
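The predict-and-repeat loop above can be sketched with a toy "model". Here the probability tables are hand-made for illustration (a real LLM computes them from billions of parameters), and the sketch uses greedy decoding — always taking the highest-probability token:

```python
# Toy next-token distributions, keyed by the token sequence seen so far.
# A real model computes these probabilities from its weights.
TOY_MODEL = {
    ("The", "weather", "today", "is"): {"sunny": 0.6, "warm": 0.2, "rainy": 0.2},
    ("The", "weather", "today", "is", "sunny"): {"and": 0.7, ".": 0.3},
}

def next_token(tokens: tuple[str, ...]) -> str:
    """Greedy decoding: pick the most probable next token."""
    dist = TOY_MODEL.get(tokens, {".": 1.0})  # unknown context -> end sentence
    return max(dist, key=dist.get)

def generate(prompt: tuple[str, ...], max_tokens: int = 3) -> tuple[str, ...]:
    """Repeat next-token prediction to extend the sequence."""
    tokens = prompt
    for _ in range(max_tokens):
        tokens = tokens + (next_token(tokens),)
    return tokens

print(generate(("The", "weather", "today", "is")))
# → ('The', 'weather', 'today', 'is', 'sunny', 'and', '.')
```

Real systems usually *sample* from the distribution instead of always taking the top token (controlled by settings like temperature), which is why the same prompt can yield different completions.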
What LLMs do well
- Answer questions
- Write, summarize, translate
- Generate and edit code
- Analyze text and documents
- Roleplay (chatbots)
- Step-by-step reasoning (Chain of Thought)
What LLMs don’t do well
- No knowledge of events after their training cutoff (unless connected to web search)
- Can hallucinate — confidently stating things that are false
- Don’t truly “understand” — they predict token probabilities
- No consciousness, emotions, or intent
Related
- Token — what LLMs process
- Context Window — short-term memory
- Hallucination