LLM Fundamentals

Lessons in this group, roughly in build order:

closed-weight-models — A closed-weight model is an LLM whose parameters are never released — you rent inference over an HTTP API…
open-weight-models — An open-weight model ships its trained parameters publicly (Llama 3, Mistral, Qwen, Gemma) so you can…
reasoning-vs-standard-models — Reasoning models (o3, Claude with extended thinking, Gemini Thinking, DeepSeek-R1) spend extra hidden…
context-windows — The context window is the maximum number of tokens a model can attend to at once — system prompt, history,…
fine-tuning-vs-prompt-engineering — Two ways to steer an LLM without training from scratch: change the input at inference time (prompting) or…
embeddings-and-vector-search — An embedding maps text to a fixed-length vector so that semantically similar text lands nearby; vector…
token-based-pricing — LLM APIs bill per token, not per request, and almost always charge output tokens several times more than…
pricing-of-common-models — A practical map of what the major LLM tiers cost per million tokens, and how to pick a tier so an agent…
