Text is split into tokens, each token is mapped to a vector (an embedding), and the Transformer processes these vectors to predict the next token, which is roughly the next word.
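The flow described above (text to vectors, vectors through the Transformer, a prediction for the next word) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a real model: the vocabulary, the embedding size `D_MODEL`, the weight matrices, and the function `next_token_probs` are all hypothetical names, the weights are random rather than trained, and the "Transformer" is reduced to a single attention head with no positional encoding, feed-forward layer, or normalization.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

# Hypothetical toy vocabulary; real models use learned subword tokenizers.
VOCAB = ["the", "cat", "sat", "on", "mat"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
D_MODEL = 4  # embedding size, chosen arbitrarily for illustration


def rand_matrix(rows, cols):
    """Random, untrained weights; a real model learns these from data."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(vec, mat):
    """Multiply a row vector of length n by an n x m matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def softmax(xs):
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


EMBED = rand_matrix(len(VOCAB), D_MODEL)  # one vector per token
W_Q = rand_matrix(D_MODEL, D_MODEL)       # query projection
W_K = rand_matrix(D_MODEL, D_MODEL)       # key projection
W_V = rand_matrix(D_MODEL, D_MODEL)       # value projection
W_OUT = rand_matrix(D_MODEL, len(VOCAB))  # hidden state -> vocabulary scores


def next_token_probs(text):
    # 1. Text -> token ids -> embedding vectors.
    ids = [TOKEN_TO_ID[w] for w in text.split()]
    xs = [EMBED[i] for i in ids]

    # 2. Single-head self-attention, computed for the last position only.
    #    The last token attends to itself and to every earlier token.
    q = matvec(xs[-1], W_Q)
    keys = [matvec(x, W_K) for x in xs]
    values = [matvec(x, W_V) for x in xs]
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(D_MODEL) for k in keys]
    weights = softmax(scores)
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(D_MODEL)]

    # 3. Residual connection, then project to a distribution over the vocabulary.
    hidden = [x + c for x, c in zip(xs[-1], context)]
    return softmax(matvec(hidden, W_OUT))


probs = next_token_probs("the cat sat")
predicted = VOCAB[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted word carries no meaning here; the point is only the shape of the computation. In a trained model the same three steps apply, but the embeddings and projection matrices are learned so that the output distribution favors plausible continuations.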