A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Simulated Annealing

Graduate · Depth 100 in the knowledge graph

746 prerequisites beneath it


Core Idea

Simulated annealing probabilistically accepts worse solutions early in search (high temperature) to escape local optima, then gradually accepts only improvements (low temperature) to converge. The cooling schedule determines the algorithm's behavior: fast cooling risks getting stuck in local optima, while slow cooling costs more iterations. In theory, a sufficiently slow (logarithmic) cooling schedule guarantees convergence to a global optimum, but such schedules are far too slow to run in practice.

How It's Best Learned

Implement simulated annealing with different cooling schedules (linear, exponential, adaptive) and visualize how each affects solution quality over iterations.
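As a starting point for that exercise, the three schedule families could be sketched as small Python functions. The function names, the default values, and the reheating rule in the adaptive variant are illustrative assumptions, not a standard API.

```python
def linear_schedule(t0, step, total_steps):
    """Temperature falls in a straight line from t0 to 0 over total_steps."""
    return t0 * (1.0 - step / total_steps)


def exponential_schedule(t0, step, alpha=0.95):
    """Temperature after `step` multiplications by alpha (geometric decay)."""
    return t0 * alpha ** step


def adaptive_schedule(temp, stalled_steps, alpha=0.95, patience=50, reheat=1.5):
    """One update of a hypothetical adaptive rule.

    Cools by alpha as usual, but raises the temperature by the factor
    `reheat` once no improvement has been seen for `patience` steps.
    """
    if stalled_steps >= patience:
        return temp * reheat
    return temp * alpha


# Print a few sample temperatures from the two fixed schedules.
for step in (0, 25, 50, 75, 100):
    print(step, linear_schedule(10.0, step, 100), exponential_schedule(10.0, step))
```

Recording the best objective value at each iteration under each schedule gives the curves to plot against one another.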

Common Misconceptions

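Misconception: simulated annealing always finds the global optimum. In fact, the convergence guarantee holds only in the limit of extremely slow cooling, which is never used in practice. Misconception: temperature should always decrease. In fact, adaptive schedules may raise the temperature again if improvement stalls.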

Explainer

From local search optimization, you know the fundamental problem: hill climbing finds a local optimum but gets stuck there, unable to reach a potentially better solution elsewhere in the search space. Imagine you are hiking in fog and can only feel the slope beneath your feet. Hill climbing always walks uphill, so you reach the nearest peak — but it might be a small hill when a mountain is just across the valley. Simulated annealing solves this by occasionally allowing downhill steps, especially early in the search, giving the algorithm a chance to escape local optima and explore the broader landscape.

The key mechanism is the acceptance probability. When simulated annealing considers a neighboring solution, it always accepts improvements (moves to a better solution). But when the neighbor is *worse*, it accepts the move with probability exp(−ΔE / T), where ΔE > 0 is how much worse the neighbor is and T is the current temperature. This rule, known as the Metropolis criterion, comes from statistical mechanics, where it models how atoms in a heated metal occasionally jump to higher-energy configurations. At high temperature (T much larger than ΔE), exp(−ΔE / T) is close to 1, so almost any move is accepted, and the algorithm wanders freely through the search space. As temperature decreases, the probability of accepting worse moves drops toward 0, and the algorithm increasingly behaves like pure hill climbing, settling into a good solution.
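The acceptance rule can be written in a few lines. The sketch below assumes a minimization convention, where a positive delta_e means the neighbor is worse, and a strictly positive temperature. The function names are illustrative.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """Probability of moving to a neighbor whose cost differs by delta_e.

    delta_e > 0 means the neighbor is worse; temperature must be > 0.
    """
    if delta_e <= 0:
        return 1.0  # improvements (and ties) are always accepted
    return math.exp(-delta_e / temperature)


def accept(delta_e, temperature, rng=random):
    """Draw a uniform number in [0, 1) and accept if it falls under the probability."""
    return rng.random() < acceptance_probability(delta_e, temperature)


# The same worsening move (delta_e = 1) at three temperatures.
for temperature in (10.0, 1.0, 0.1):
    print(temperature, acceptance_probability(1.0, temperature))
```

For a move that is worse by 1, the probability is about 0.90 at T = 10, about 0.37 at T = 1, and below 0.0001 at T = 0.1, which is the shift from free wandering to near hill climbing.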

The cooling schedule controls how temperature decreases over time and is usually the most consequential design choice. A common schedule is geometric cooling: T_new = α · T_old, where α is typically between 0.9 and 0.999. Fast cooling (low α, or few iterations) behaves almost like hill climbing: you barely explore before settling. Slow cooling (high α, or many iterations) gives the algorithm time to escape traps but takes longer to converge. The theoretical guarantee is striking: if the temperature tends to zero no faster than logarithmically, T(t) ≥ C / log(t) for a sufficiently large problem-dependent constant C, the probability that simulated annealing is at a global optimum tends to 1 as t grows. In practice, you never cool this slowly, so you trade guaranteed optimality for a good-enough solution in reasonable time.
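A rough calculation shows why the logarithmic bound is impractical. The sketch below counts how many steps each schedule needs to reach a target temperature. It assumes the logarithmic schedule takes the form T(t) = C / log(1 + t), and the constants are chosen only for illustration.

```python
import math


def geometric_steps(t0, t_end, alpha):
    """Number of updates T <- alpha * T needed to fall from t0 to t_end or below."""
    return math.ceil(math.log(t_end / t0) / math.log(alpha))


def logarithmic_steps(c, t_end):
    """Smallest step t at which T(t) = c / log(1 + t) has fallen to t_end."""
    return math.ceil(math.exp(c / t_end) - 1)


# Cooling to T = 0.05: geometric from T0 = 1 with alpha = 0.99, logarithmic with C = 1.
print("geometric:  ", geometric_steps(1.0, 0.05, 0.99))
print("logarithmic:", logarithmic_steps(1.0, 0.05))
```

With these constants, geometric cooling needs a few hundred steps, while the logarithmic schedule needs several hundred million, and that count grows exponentially as the target temperature shrinks.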

Simulated annealing shines on combinatorial optimization problems where the search space is too large for exhaustive search and too rugged for gradient-based methods. Classic applications include the traveling salesman problem, circuit layout, and scheduling. The algorithm requires only three things: a way to represent solutions, a way to generate neighboring solutions, and a way to evaluate solution quality. It needs no gradient, no differentiability, and no assumptions about the structure of the search space. The tradeoff is that tuning the cooling schedule, initial temperature, and neighborhood structure requires experimentation — there is no single recipe that works for all problems.
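The three ingredients map directly onto code. The following is a minimal sketch for a small traveling salesman instance, not a tuned implementation. The segment-reversal neighbor, the geometric cooling, and the parameter values (t0, alpha, steps, seed) are all assumptions made for illustration. Here lower tour length is better, so a positive delta means a worse candidate.

```python
import math
import random


def tour_length(tour, cities):
    """Evaluate: total length of the closed tour through all cities."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def random_neighbor(tour, rng):
    """Neighbor: reverse a randomly chosen segment of the tour."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def simulated_annealing(cities, t0=10.0, alpha=0.995, steps=5000, seed=0):
    rng = random.Random(seed)
    current = list(range(len(cities)))  # representation: a permutation of city indices
    rng.shuffle(current)
    current_len = tour_length(current, cities)
    best, best_len = current, current_len
    temp = t0
    for _ in range(steps):
        candidate = random_neighbor(current, rng)
        candidate_len = tour_length(candidate, cities)
        delta = candidate_len - current_len  # positive means the candidate is worse
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_len = candidate, candidate_len
            if current_len < best_len:
                best, best_len = current, current_len
        temp *= alpha  # geometric cooling
    return best, best_len


# Eight cities evenly spaced on the unit circle.
cities = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8)) for k in range(8)]
best, best_len = simulated_annealing(cities)
print(best, round(best_len, 4))
```

Because the cities lie on a circle, the shortest tour visits them in angular order, which makes it easy to check by eye whether a given run found it. Swapping in a different problem only requires replacing tour_length, random_neighbor, and the initial solution.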


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Tree Structure and Node Properties → Binary Trees → Tree Traversals → Depth-First Search (DFS) → Depth-First Search: Implementation and Applications → Topological Sort → Dynamic Programming → Longest Common Subsequence (LCS) Problem → Edit Distance: Levenshtein Distance and DP → 0/1 Knapsack Problem: Bounded Capacity DP → Greedy Algorithms → Activity Selection Problem Using Greedy Algorithms → Dijkstra's Algorithm → A* Search Algorithm → Heuristic Search Functions → Local Search Optimization → Genetic Algorithms → Stochastic Gradient Descent and Variants → Simulated Annealing

Longest path: 101 steps · 746 total prerequisite topics

Prerequisites (2)

Local Search Optimization (hard) · Stochastic Gradient Descent and Variants (soft)

Leads To (0)

No topics depend on this one yet.