Simple DL Part 2: Embeddings
In my opinion, you need to understand embeddings to really 'get' deep learning. Embeddings are the magic fairy dust that powers every deep learning model, from ImageNet to GPT-3. I think in embeddings. Embeddings are the foundation for any intuition I have about DL, so all of my future posts in this series are going to refer back to the embedding concept.
- An embedding is a store of information represented as a list of floats (a float vector).
- Float vectors are special because they are continuous, which means we can think of them as points on a map (or, more generally, points on an N-dimensional surface).
- A good embedding is one where similar pieces of information sit 'close' to each other on our map.
- Because embeddings are lists of floats that represent concepts, we can turn concepts into computation.
- A deep learning model is made of a stack of embeddings. Embeddings are constrained by the input data (features) and the loss function.
- The features limit what the embeddings can learn, and the loss tells the model what to prioritize. Models are as good as their features and as bad as their loss.
- We can improve a model's performance by changing the features, the architecture, or the loss function. These change the embeddings, which changes the underlying information map.
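To make the 'points on a map' idea concrete, here is a minimal sketch with hand-picked toy vectors (the words and values are made up for illustration; a real model would learn them). It measures 'closeness' with cosine similarity, one common way to compare embeddings:

```python
import numpy as np

# Toy 3-dimensional embeddings (made-up values for illustration).
# We hand-pick them so that 'cat' and 'dog' point in a similar
# direction on our map, while 'car' points somewhere else entirely.
embeddings = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.85, 0.75, 0.2]),
    "car": np.array([0.1, 0.2, 0.95]),
}

def cosine_similarity(a, b):
    """Similarity of direction: near 1.0 means 'close' on the map."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similar concepts score high; dissimilar concepts score low.
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low
```

This is also what 'turning concepts into computation' means in practice: once cat and dog are float vectors, comparing them is just arithmetic.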