The halting problem asks whether there exists a Turing machine that, given any program and input, correctly determines whether that program halts on that input. Turing proved in 1936 that no such machine can exist. The proof uses diagonalization: assume a halting oracle H exists and construct a machine D that runs H on itself then does the opposite — D's behavior contradicts H's prediction, yielding a contradiction. This is the paradigmatic undecidability result and the template for hundreds of subsequent proofs.
Carefully trace through the diagonalization argument to see exactly where the contradiction arises. Then practice reducing other problems to the halting problem — recognizing it as the canonical hard problem from which undecidability spreads.
From your study of Turing machines, you know that a Turing machine is a precise mathematical model of computation: a tape, a finite set of states, and transition rules. Every algorithm corresponds to some Turing machine. The halting problem asks whether there exists a Turing machine H that, given the description of any program P and an input x, correctly outputs "halts" if P eventually terminates on x and "loops" if it does not. Turing proved in 1936 that no such H can exist — a landmark result that established the outer limits of what computation can do.
The proof is a proof by contradiction that uses diagonalization, the same technique Cantor used to show that the real numbers are uncountable. Assume H exists. Then construct a new machine D as follows: D takes a program description ⟨P⟩ as input, runs H on the pair (⟨P⟩, ⟨P⟩) — that is, asks H whether P halts when given its own description as input — and then does the opposite of whatever H predicts: if H says "halts," D loops forever; if H says "loops," D halts immediately.
Now ask what happens when D is given its own description ⟨D⟩ as input. If H says D halts on input ⟨D⟩, then D loops — but H said it halts, contradiction. If H says D loops on input ⟨D⟩, then D halts — but H said it loops, contradiction. Every possible answer H could give is wrong. Since D is a well-defined Turing machine and ⟨D⟩ is a valid input, the assumed oracle H fails on a specific case. The only consistent conclusion is that H cannot exist.
A critical distinction the proof does not make: it says no *general* algorithm works for all programs, not that no specific program's halting behavior can ever be known. For many individual programs — a loop with a fixed counter, a program with no loops at all — halting is trivially decidable. Undecidability is a claim about the impossibility of a single algorithm covering every case.
The halting problem is the canonical undecidable problem because undecidability of other problems is often proved by reduction to it. If you can show that solving problem X would let you solve the halting problem, then X must also be undecidable. This reduction technique — which you will see in Rice's theorem and beyond — makes the halting problem not just a historical curiosity but the central node in the entire web of undecidability results.