# How Distillation Makes AI Models Smaller and Cheaper

## Metadata
- Author: [[Amos Zeeberg]]
- Full Title: How Distillation Makes AI Models Smaller and Cheaper
- Category: #articles
- Summary: Distillation is a method that uses a large AI model to train a smaller, cheaper one without losing much accuracy. It helps companies run powerful AI tools more efficiently by passing "soft" knowledge from a big "teacher" model to a smaller "student" model (a minimal sketch of this teacher-student loss follows this metadata list). This technique is widely used and continues to improve AI by making it faster and more affordable.
- URL: https://www.quantamagazine.org/how-distillation-makes-ai-models-smaller-and-cheaper-20250718/
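
The "soft" knowledge transfer described in the summary is commonly implemented as a loss that blends the teacher's temperature-softened output distribution with the ordinary hard-label objective. The PyTorch sketch below is illustrative only; the function name, the temperature `T`, and the blending weight `alpha` are assumptions for demonstration, not details taken from the article.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Illustrative teacher-student distillation loss.

    Mixes a soft-target term (the teacher's temperature-softened
    probabilities) with standard cross-entropy on the true labels.
    T and alpha are example hyperparameters, not values from the article.
    """
    # Soften both distributions with temperature T; the KL term pulls the
    # student toward the teacher's full output distribution ("soft" knowledge).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable to the hard-label term
    # Ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

In practice the teacher's logits are computed with gradients disabled, and only the smaller student model is updated with this combined loss.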
## Highlights
- In January, the NovaSky lab at the University of California, Berkeley, [showed that distillation works well for training chain-of-thought reasoning models](https://novasky-ai.github.io/posts/sky-t1/), which use multistep “thinking” to better answer complicated questions. The lab says its fully open-source Sky-T1 model cost less than $450 to train, and it achieved similar results to a much larger open-source model. ([View Highlight](https://read.readwise.io/read/01k0f22rbd631yzkbwrwbjd3qy))