Blog

A minimal Introduction to Quantization

For the last couple of weeks, I’ve been considering writing some introductory content on quantization. After exploring a bit more, I realized there are many great resources…

LLM Evals and Benchmarking

You go to Hugging Face, see that there are 60 thousand text generation models, and feel lost. How do you find the best model for your use case? How do you get started? The…

Sentence Embeddings. Cross-encoders and Re-ranking

Deep Dive into Cross-encoders and Re-ranking

The Llama Hitchhiking Guide to Local LLMs

Here are some terms that are useful to know when joining the Local LLM community.

Sentence Embeddings. Introduction to Sentence Embeddings

Everything you wanted to know about sentence embeddings (and maybe a bit more)

The Random Transformer

Understand how transformers work by demystifying all the math behind them

The GPU Poor strike back

Some months ago, SemiAnalysis published a flashy article with the premise that organizations with tens of thousands of GPUs had so many resources that the…