A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Symbolic Execution

Research Depth 93 in the knowledge graph ☐ I know this ☆ Set as goal

4topics build on this

430prerequisites beneath it

Introduction to Propositional Logic Programming Language Semantics +1 more→→Bounded Model Checking Contract Verification (Blockchain)+2 more

Core Idea

Symbolic execution runs a program with symbolic values (placeholders like alpha, beta) instead of concrete inputs, tracking how these symbols propagate through computations and accumulating path constraints at each branch. When the program branches on a condition involving symbolic values, execution forks: one path adds the condition as a constraint, the other adds its negation. Each path's constraint set characterizes exactly which concrete inputs would follow that path. Feeding path constraints to an SMT solver generates concrete test inputs or proves path infeasibility. Symbolic execution systematically explores all program paths, enabling high-coverage test generation, bug finding, and bounded verification.

Explainer

Testing runs a program on specific concrete inputs and checks the output. This is fast but covers only the tested inputs — bugs on untested paths remain hidden. Symbolic execution inverts this by running the program on symbolic inputs (variables like alpha, beta rather than specific values), tracking how they flow through the computation. When the program computes y = x + 3, the symbolic executor records y = alpha + 3. When it branches on x > 5, it forks into two execution paths: one where alpha > 5 and one where alpha <= 5. Each path accumulates a path constraint — the conjunction of all branch conditions encountered along that path.

At any point, feeding the path constraint to an SMT solver (like Z3) either produces a concrete input satisfying all constraints (a test case that follows that exact path) or proves the constraints unsatisfiable (the path is infeasible — no concrete input can reach it). This gives symbolic execution two superpowers: automatic test generation (produce inputs that exercise specific paths, including hard-to-reach error paths) and bounded verification (if all paths to an error are explored and none is feasible, the error is unreachable within the explored scope).

The central challenge is path explosion: a program with n sequential branches has up to 2ⁿ paths. Loops make this worse — a loop with a symbolic bound has a different path for each possible iteration count. Real programs have astronomically many paths, and exhaustive exploration is infeasible. Practical symbolic execution tools use heuristics to prioritize interesting paths (depth-first, coverage-guided, or directed toward specific targets), path merging (combining paths that converge to the same program point, using disjunctive constraints), and bounded analysis (exploring paths up to a fixed depth or loop unrolling bound).

Concolic execution (concrete + symbolic), pioneered by tools like DART and CUTE, addresses another limitation: pure symbolic execution cannot handle operations it cannot model symbolically (system calls, native library functions, complex pointer arithmetic). Concolic execution runs the program concretely with real inputs while simultaneously maintaining symbolic constraints for the operations it can model. It uses the concrete execution to drive progress and the symbolic constraints to generate alternative inputs. By negating one branch condition at a time, it systematically steers the program toward unexplored paths.

Practical impact has been substantial. KLEE (Stanford) found bugs in GNU Coreutils that had survived decades of testing and manual code review. SAGE (Microsoft) uses concolic execution to find security vulnerabilities in Windows applications and file parsers, having found roughly one-third of all security bugs discovered during Windows 7 development. S2E (EPFL) combines symbolic execution with whole-system analysis, testing entire software stacks including OS kernels. angr (UC Santa Barbara) applies symbolic execution to binary analysis, working without source code. The technique has become a standard tool in both security research and software testing, complementing model checking and deductive verification with a path-centric, test-generation-oriented approach.

Practice Questions 3 questions