Every word gets an address in a city of meaning. Words that mean similar things live in the same neighborhood. If you know a word's address, you know something about what it means.
A token ID is a number the model uses to look up the token. The model does not work with the word 'cat' directly. It first replaces the word with its ID, here 9827.
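As a rough sketch of that lookup, a vocabulary can be pictured as a plain mapping from tokens to integers. The dictionary `vocab` and the helper `token_to_id` are hypothetical names for illustration, and every ID other than 9827 for 'cat' is made up.

```python
# Toy vocabulary. Only the ID for "cat" comes from the text above;
# the other IDs are invented for illustration.
vocab = {"cat": 9827, "dog": 3290, "airplane": 15104}


def token_to_id(token: str) -> int:
    """Return the integer ID that stands in for the token."""
    return vocab[token]


cat_id = token_to_id("cat")
print(cat_id)
```

Real tokenizers often split text into pieces smaller than whole words, so treat this as a simplified picture.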
This is vectorization. The token ID points to one row in the embedding table. That row is a list of numbers, called a vector, that helps represent meaning.
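A minimal sketch of that step, assuming a tiny table with four numbers per token. The name `embedding_table`, the helper `embed`, and all the vector values are invented for illustration. A real table is usually a large matrix with one row per token ID; a dictionary keyed by ID is used here only to keep the example small.

```python
# Hypothetical embedding table: token ID -> vector (a list of numbers).
# All values are made up; real vectors are learned during training
# and usually have hundreds or thousands of numbers.
embedding_table = {
    9827: [0.9, 0.8, 0.1, 0.3],   # "cat"
    3290: [0.8, 0.9, 0.2, 0.3],   # "dog" (ID invented)
    15104: [0.1, 0.2, 0.9, 0.7],  # "airplane" (ID invented)
}


def embed(token_id: int) -> list[float]:
    """Look up the row of numbers that the token ID points to."""
    return embedding_table[token_id]


cat_vector = embed(9827)
print(cat_vector)
```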
Each number is a coordinate along one direction in meaning space. Together, the numbers give the token's location.
The vector is not just a list of numbers. It acts like an address. Similar tokens land near each other because their vector addresses are close.
'Cat' and 'dog' live close together. 'Cat' and 'airplane' are far apart. The model knows meaning through proximity, not definitions.
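To make "close" and "far" concrete, here is a small sketch that measures proximity two common ways: straight-line (Euclidean) distance and cosine similarity. The three-number vectors are invented for illustration, and the function names are hypothetical. Which measure a given system uses is an assumption here; both appear in practice.

```python
import math

# Made-up vectors for illustration only.
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "airplane": [0.1, 0.2, 0.9],
}


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two addresses. Smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors. Nearer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


for other in ("dog", "airplane"):
    d = euclidean_distance(vectors["cat"], vectors[other])
    s = cosine_similarity(vectors["cat"], vectors[other])
    print(f"cat vs {other}: distance={d:.2f}, cosine={s:.2f}")
```

With these toy values, 'cat' comes out closer to 'dog' than to 'airplane' under both measures, which is the pattern the text describes.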
Same word, different meaning, different address. Context decides where a word lands.
When AI seems biased, it is often because it learned those associations from its training data. Understanding this helps you spot bias and prompt around it.
Every word has an address. But 'bank' near 'money' and 'bank' near 'river' are different places. How does the model figure that out?