The Full Assembly Line

Your text enters one end, and a prediction comes out the other. At each station, one specific transformation happens.

The Analogy

Picture an assembly line: raw material enters one end and a finished product comes out the other. At each station, one specific transformation happens. Your text goes through the same process.

Stage 1 -- Text to Token IDs

Your Words Become Numbers

why did the chicken cross the

The sentence is now token IDs. The model no longer sees words directly. It sees numbered pieces from its vocabulary.
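To make the step concrete, here is a minimal sketch of turning the example sentence into IDs. The vocabulary, the ID values, and the `encode` helper are all invented for illustration; real tokenizers use learned vocabularies with many thousands of entries and often split words into smaller pieces.

```python
# Toy word-level vocabulary. The IDs are made up for this example.
TOY_VOCAB = {
    "why": 0, "did": 1, "the": 2, "chicken": 3,
    "cross": 4, "road": 5, "<unk>": 6,
}

def encode(text):
    """Map each lowercase, whitespace-separated word to its ID."""
    unknown_id = TOY_VOCAB["<unk>"]
    return [TOY_VOCAB.get(word, unknown_id) for word in text.lower().split()]

token_ids = encode("why did the chicken cross the")
print(token_ids)  # [0, 1, 2, 3, 4, 2]
```

Notice that both occurrences of "the" receive the same ID. From this point on, the model works only with the numbers.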

Stage 2 -- Token IDs to Embeddings

IDs Expand Into Meaning

Each ID becomes a vector: a list of numbers that carries meaning. In real models, each token may have hundreds or thousands of numbers. We show 6 so the idea is visible.
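The lookup can be sketched as a table with one row per token ID. This is only an illustration: the table size, the IDs, and the random values are assumptions, and in a real model the numbers in the table are learned during training rather than drawn at random.

```python
import random

EMBED_DIM = 6    # matches the 6 numbers shown here; real models use far more
VOCAB_SIZE = 7   # hypothetical toy vocabulary size

rng = random.Random(0)
# One row per token ID. Random values stand in for learned weights.
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]

def embed(token_ids):
    """Look up one vector per token ID."""
    return [embedding_table[token_id] for token_id in token_ids]

# Toy IDs standing in for "why did the chicken cross the".
vectors = embed([0, 1, 2, 3, 4, 2])
print(len(vectors), len(vectors[0]))  # 6 6
```

Six tokens go in, and six vectors of six numbers each come out. The same ID always maps to the same row of the table.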

Stage 3 -- After Attention

Attention Connects the Clues

[Figure: two panels, "Before Attention" and "After Attention"]

Attention lets each token look at the other tokens and weigh how much each one matters. The token "cross" connects strongly with "chicken" and with the unfinished phrase "cross the". This gives the model the context it needs before it guesses the next word.
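The mixing step can be sketched in a few lines. This is a simplified, single-head version under two stated assumptions: the token vectors are used directly as queries, keys, and values (real models apply learned projections first), and each token only looks at itself and earlier tokens, as GPT-style models typically do. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(vectors):
    """Each token becomes a weighted mix of itself and the tokens before it."""
    dim = len(vectors[0])
    outputs = []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]  # this token plus everything earlier
        # Similarity of this token to each visible token (scaled dot product).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        weights = softmax(scores)
        # Blend the visible vectors according to the weights.
        mixed = [
            sum(w * v[d] for w, v in zip(weights, visible))
            for d in range(dim)
        ]
        outputs.append(mixed)
    return outputs

toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_self_attention(toy_vectors)
print(attended[0])  # [1.0, 0.0]  the first token can only attend to itself
```

Later tokens end up as blends of several vectors, which is how context from earlier in the sentence reaches them.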

Stage 4 -- Feed Forward Inside the Layers

Attention Finds the Clues. Feed Forward Makes Them Useful.

Plain-English idea: Attention gathers clues from the sentence. Feed forward is the refinement step inside each layer, and it is not another attention step. It works on each token privately, without looking at the others: it tests patterns, strengthens the useful ones, filters weaker options, and sends a cleaner signal to the next layer.
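As a rough sketch, a feed forward block is a small two-step network applied to every token separately. The sizes, the random weights, and the helper names below are assumptions for illustration; bias terms and the other parts of a real layer are left out to keep the idea visible.

```python
import random

def relu(x):
    """Keep positive signals, zero out the rest."""
    return x if x > 0.0 else 0.0

def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

DIM, HIDDEN = 6, 24  # a wider hidden layer; 4x the width is assumed here
rng = random.Random(0)
W1 = make_matrix(HIDDEN, DIM, rng)  # random stand-ins for learned weights
W2 = make_matrix(DIM, HIDDEN, rng)

def feed_forward(token_vector):
    hidden = [relu(h) for h in matvec(W1, token_vector)]  # expand and filter
    return matvec(W2, hidden)                             # project back to DIM

def feed_forward_layer(vectors):
    # Applied to each token on its own: no token sees another one here.
    return [feed_forward(v) for v in vectors]

refined = feed_forward_layer([[0.1] * DIM, [0.2] * DIM])
print(len(refined), len(refined[0]))  # 2 6
```

Because each token is processed on its own, changing one token's vector does not affect the output for any other token at this step. That is the difference from attention.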

Stage 5 -- The Prediction

What Comes Next?

The model now predicts the missing next token in: why did the chicken cross the

After attention and feed forward refinement, the model assigns a probability to every token in its vocabulary. For this sentence, "road" wins because it completes the common phrase "why did the chicken cross the road".
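The final choice can be sketched as turning scores into probabilities and picking the largest. The candidate tokens and their scores below are made up for illustration; a real model produces one score for every token in its vocabulary, and it may sample from the probabilities instead of always taking the top one.

```python
import math

# Hypothetical scores for a few candidate next tokens. The values are invented.
scores = {"road": 6.0, "street": 3.5, "river": 2.0, "banana": -1.0}

def softmax(token_scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(token_scores.values())
    exps = {tok: math.exp(s - highest) for tok, s in token_scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

probabilities = softmax(scores)
best_token = max(probabilities, key=probabilities.get)
print(best_token)  # road
```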

Takeaway

Every next token goes through this pipeline. Attention gathers context. Feed forward refines it. Then the model predicts one token at a time, which is why longer answers require more computation: each new token is another pass through the pipeline.
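The one-token-at-a-time loop can be sketched as follows. The `predict_next` function is a hypothetical stand-in for the entire pipeline (a small lookup table, not a model), and `<end>` is an assumed stop marker. The point is only the shape of the loop: one pass per new token.

```python
# Hypothetical stand-in for the whole pipeline: tokenize, embed,
# attention, feed forward, predict.
TOY_CONTINUATIONS = {"the": "road", "road": "<end>"}

def predict_next(tokens):
    """Pretend model: looks only at the last token. A real model uses all of them."""
    return TOY_CONTINUATIONS.get(tokens[-1], "<end>")

def generate(prompt, max_new_tokens=10):
    tokens = prompt.split()
    passes = 0
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)  # one full pass per new token
        passes += 1
        if next_token == "<end>":
            break
        tokens.append(next_token)          # the new token joins the context
    return " ".join(tokens), passes

text, passes = generate("why did the chicken cross the")
print(text)    # why did the chicken cross the road
print(passes)  # 2
```

Each pass adds at most one token, so an answer with hundreds of tokens needs hundreds of passes.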

Your prompt has been tokenized, embedded, and refined through layers of attention and feed forward. But how does AI actually produce an answer? Not all at once -- one piece at a time.