How AI taught itself to reason
AI researcher and author Sebastian Raschka explains how large language models learned to “think through” their answers
If you work in artificial intelligence or machine learning, you’re probably familiar with vague and hotly debated definitions. The term “reasoning models” is no exception. Eventually, someone will define it formally in a paper, only for it to be redefined in the next, and so on.
I define “reasoning” as the process of answering questions that require complex, multi-step generation with intermediate steps. For example, factual question-answering such as “What is the capital of France?” doesn’t involve reasoning. In contrast, a question like “If a train is moving at 60 mph and travels for three hours, how far does it go?” requires simple reasoning, because answering it means recognising the relationship between distance, speed and time before arriving at the answer. A regular large language model (LLM) may only provide a short answer, whereas reasoning models typically include intermediate steps that reveal part of the thought process.
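To make the contrast concrete, here is a minimal Python sketch of the train question. It only illustrates the difference between a short answer and an answer with intermediate steps. It does not show how an LLM works internally, and the function names `answer_directly` and `answer_with_steps` are hypothetical.

```python
def answer_directly(speed_mph: float, hours: float) -> str:
    """Return only the final answer, as a short reply might."""
    return f"{speed_mph * hours:g} miles"


def answer_with_steps(speed_mph: float, hours: float) -> list[str]:
    """Return the intermediate steps followed by the final answer."""
    distance = speed_mph * hours  # distance = speed x time
    return [
        "Step 1: distance = speed x time",
        f"Step 2: distance = {speed_mph:g} mph x {hours:g} h",
        f"Answer: {distance:g} miles",
    ]


print(answer_directly(60, 3))
for line in answer_with_steps(60, 3):
    print(line)
```

Both functions reach the same result of 180 miles. The second also exposes the relationship between distance, speed and time that leads to it.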
Most modern LLMs are capable of basic reasoning and can answer questions like the one above. So, today, when we refer to reasoning models, we typically mean LLMs that excel at more complex reasoning tasks, such as solving puzzles, riddles and mathematical proofs. Additionally, most LLMs branded as reasoning models now include a “thought” or “thinking” process as part of their response. Whether and how an LLM actually “thinks” is a separate discussion.
Intermediate steps in reasoning models can appear in two ways. First, they may be explicitly included in the response: the answer literally includes the thinking. Second, some reasoning LLMs, such as OpenAI’s o1, run multiple iterations with intermediate steps that are not always shown to the user.
When should we use reasoning models?
Now that we have defined reasoning models, we can move on to the more interesting part: how to build and improve LLMs for reasoning tasks. But before diving into the technical details, it is important to consider when reasoning models are actually needed.
Reasoning models are designed to be good at complex tasks such as solving puzzles, advanced math problems and challenging coding tasks. However, they are not necessary for simpler tasks such as summarisation, translation or knowledge-based question answering. In fact, using reasoning models for everything can be inefficient: they are typically more expensive to use, more verbose and sometimes more prone to errors due to “overthinking”. The simple rule applies: use the right tool (or type of LLM) for the task.
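As a rough illustration of that rule, the sketch below routes a task to a type of model using the task categories named above. The category labels, the function name `choose_model_type` and the choice to fall back to a regular LLM for unlisted tasks are all assumptions made for this example, not an established API.

```python
# Hypothetical task labels, taken from the examples in the text.
REASONING_TASKS = {"puzzle", "advanced_math", "challenging_coding"}
SIMPLE_TASKS = {"summarisation", "translation", "knowledge_qa"}


def choose_model_type(task: str) -> str:
    """Pick 'reasoning' for complex tasks and 'regular' otherwise."""
    if task in REASONING_TASKS:
        return "reasoning"
    # Assumption: simple and unlisted tasks go to the cheaper,
    # less verbose regular LLM.
    return "regular"


for task in ["puzzle", "translation", "advanced_math", "summarisation"]:
    print(f"{task}: {choose_model_type(task)}")
```

In practice the boundary between the two groups is a judgment call. The point is only that the choice of model depends on the task rather than defaulting to the most capable option.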