GPT-4 (according to George Hotz):
- built on 220 billion parameters
- 16-way mixture model
- 8 sets of weights

https://www.youtube.com/watch?v=1v-qvVIje4Y&t=276s