The halting problem asks: given a Turing machine M and input w, does M halt on w? Turing proved in 1936 that no TM can decide this — HALT_TM is undecidable. The proof uses diagonalization: assume a decider H exists, construct a machine D that does the opposite of what H predicts for D itself, yielding a contradiction. The halting problem is the canonical undecidable problem; hundreds of other undecidable problems are proved undecidable by reducing the halting problem to them.
Follow the diagonalization argument carefully, constructing the contradiction step-by-step. Then read Turing's original 1936 paper for historical context. Finally, practice the reduction technique by showing ε_TM (does M accept ε?) is undecidable via a reduction from HALT_TM.
You've established that decidable languages are those with Turing machines that always halt with a correct answer. The natural next question is: are *all* languages decidable? The halting problem provides the definitive answer — no — and it does so through one of the most elegant arguments in all of mathematics. The question is deceptively simple: given a description of a Turing machine M and an input w, does M eventually halt when run on w, or does it loop forever?
At first, this seems like it should be solvable. After all, you can examine the code of a program and often tell whether it will terminate. But the halting problem asks for a *universal* solution — an algorithm that works for *every* possible program and *every* possible input. Turing proved in 1936 that no such algorithm exists. The proof is a masterpiece of self-reference and contradiction, closely related to the diagonalization technique you know from Cantor's work on countability.
Here's how the argument works. Assume, for contradiction, that a Turing machine H exists that decides the halting problem: H(⟨M, w⟩) accepts if M halts on w, and rejects if M loops. Now construct a new machine D that takes a Turing machine description ⟨M⟩ as input, runs H(⟨M, ⟨M⟩⟩) to ask "does M halt when given its own description?", and then *does the opposite* — if H says M halts, D loops forever; if H says M loops, D halts and accepts. Now ask: what happens when D is given its own description? D(⟨D⟩) runs H(⟨D, ⟨D⟩⟩). If H says D halts on ⟨D⟩, then D loops — contradicting H's answer. If H says D loops on ⟨D⟩, then D halts — again contradicting H. Either way, H gives the wrong answer. Since we derived a contradiction from the assumption that H exists, no such H can exist.
The halting problem's significance extends far beyond this single result. It is the canonical undecidable problem — the starting point for proving that hundreds of other problems are also undecidable. The technique is reduction: to show that some new problem X is undecidable, you show that if you *could* decide X, you could use that ability to decide the halting problem — which you've already proved impossible. For example, the question "does Turing machine M accept the empty string?" is undecidable because you can transform any halting-problem instance into this form. Note the critical distinction: the halting problem is recognizable (you can run M on w and accept if it halts — you just can't detect the looping case), but it is not decidable. This gap between recognizability and decidability becomes a central theme in computability theory.