OpenAI Goes Open Source Again: gpt-oss:20B Beats the Competition

Edward Krueger

8/14/2025

I'm excited about OpenAI's first open-source LLM since GPT-2.

gpt-oss is impressive, especially given the 20b model's size.

Here are results from some internal benchmarks:

The first benchmark is a zero-shot classification task, assigning documents to one of 34 categories, with only the category names provided. Examples of categories are Business and Finance, Education, Automotive, Healthy Living, Entertainment, and Law.
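To make "zero-shot with only the category names" concrete, here is a minimal sketch of what such a setup might look like. The prompt wording, the function names `build_prompt` and `parse_label`, and the parsing rule are all assumptions for illustration, not the benchmark's actual harness. Only six of the 34 category names appear in this post, so the list is partial.

```python
# Six of the 34 categories named in the post; the full list is not given here.
CATEGORIES = [
    "Business and Finance", "Education", "Automotive",
    "Healthy Living", "Entertainment", "Law",
]

def build_prompt(document, categories):
    """Build a zero-shot prompt listing category names only (no descriptions, no examples)."""
    names = "\n".join(f"- {name}" for name in categories)
    return (
        "Assign the document to exactly one of these categories.\n"
        f"{names}\n\n"
        f"Document:\n{document}\n\n"
        "Answer with the category name only."
    )

def parse_label(response, categories):
    """Map a raw model reply to a known category, or None if nothing matches."""
    cleaned = response.strip().strip(".\"'").casefold()
    for name in categories:
        if name.casefold() == cleaned:
            return name
    return None

# Hypothetical usage: send the prompt to any model, then parse its reply.
prompt = build_prompt("The central bank raised interest rates today.", CATEGORIES)
label = parse_label(" law.\n", CATEGORIES)
```

Returning `None` for unrecognized replies lets a scorer count them as errors instead of silently guessing a category.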

gpt-oss:20b edged out the runner-up small open-source model, Mistral Small 3.1 24b, with a balanced accuracy of 0.684235 versus 0.681144. (Link to the rest of the benchmark results in comments.)
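For readers unfamiliar with the metric: balanced accuracy is the average of per-class recall, so a rare category counts as much as a common one. The sketch below computes it in plain Python. The labels are hypothetical toy data, not results from the benchmark.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that appear in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# Hypothetical toy labels: one common class and two rare ones.
y_true = ["Law", "Law", "Law", "Law", "Education", "Automotive"]
y_pred = ["Law", "Law", "Law", "Law", "Law", "Automotive"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy:    {plain_accuracy:.4f}")
print(f"balanced accuracy: {balanced:.4f}")
```

In this toy case the single missed "Education" document barely moves plain accuracy but pulls that class's recall to zero, so balanced accuracy drops much further.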

gpt-oss:20b was even more impressive in a second benchmark, where a stratified sample of 30 status messages is classified into 15 status types. The model is given the status types along with their descriptions.

It outperformed gpt-4o-mini (which itself outperformed gpt-4o) with an accuracy of 30/30 versus 28/30.
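A stratified sample draws from each status type separately, so every type is represented. The sketch below assumes an equal allocation of two messages per type (15 × 2 = 30); the post does not say how the 30 were allocated, and the records and the `stratified_sample` helper are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, per_type, seed=0):
    """Draw the same number of (text, status_type) records from every status type."""
    by_type = defaultdict(list)
    for text, status_type in records:
        by_type[status_type].append((text, status_type))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for status_type in sorted(by_type):
        # random.Random.sample raises ValueError if a type has fewer than per_type records
        sample.extend(rng.sample(by_type[status_type], per_type))
    return sample

# Hypothetical pool: 15 status types with 5 messages each.
records = [
    (f"message {i} of type_{t}", f"type_{t}")
    for t in range(15)
    for i in range(5)
]

sample = stratified_sample(records, per_type=2)
counts = Counter(status_type for _, status_type in sample)
print(len(sample), "messages across", len(counts), "types")
```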