January 1, 2024 - Omar Sanseviero writes:

In this blog post, we’ll work through an end-to-end example of the math inside a transformer model. The goal is to build a solid understanding of how the model works. To keep this manageable, we’ll simplify heavily. Since we’ll be doing much of the math by hand, we’ll reduce the dimensions of the model. For example, rather than using embeddings of 512 values, we’ll use embeddings of 4 values. This makes the math easier to follow! We’ll use random vectors and matrices, but you can substitute your own values if you want to follow along.
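
To give a feel for the scale described above, here is a minimal sketch (not code from the blog post) of random embeddings with 4 values each, multiplied by a random 4x4 matrix one term at a time. The token strings, the fixed seed, the value range, and the names `embeddings`, `weights`, and `projected` are all assumptions made for illustration.

```python
import random

EMBEDDING_DIM = 4  # the reduced size from the post (instead of 512)


def random_vector(rng, size):
    """Return a list of `size` random floats between -1 and 1."""
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def random_matrix(rng, rows, cols):
    """Return a rows x cols matrix (list of rows) of random floats."""
    return [random_vector(rng, cols) for _ in range(rows)]


def vector_times_matrix(vec, mat):
    """Multiply a row vector of length n by an n x m matrix, term by term."""
    return [
        sum(vec[i] * mat[i][j] for i in range(len(vec)))
        for j in range(len(mat[0]))
    ]


rng = random.Random(0)  # fixed seed (an assumption) so the numbers repeat
tokens = ["hello", "world"]  # hypothetical example tokens

# One random 4-value embedding per token.
embeddings = {tok: random_vector(rng, EMBEDDING_DIM) for tok in tokens}

# A hypothetical random 4x4 weight matrix applied to every embedding.
weights = random_matrix(rng, EMBEDDING_DIM, EMBEDDING_DIM)
projected = {tok: vector_times_matrix(vec, weights) for tok, vec in embeddings.items()}

for tok in tokens:
    print(tok, [round(x, 3) for x in projected[tok]])
```

With only 4 values per vector, each output entry is a sum of 4 products, which is small enough to check by hand against the printed numbers.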

Read The Random Transformer | Understand how transformers work by demystifying all the math behind them
