Announcement_44
New blog post: Smol but Mighty: Can Small Models Reason Well? 🤔 is now out on Hugging Face! In this article, I show that "smol" (under 2B parameters) language models pack impressive performance, competing with models several orders of magnitude larger from last year. I also find that Chinese-developed models (DeepSeek R1, Qwen) exhibit distinctly different cultural biases from American models.