Program Synthesis

Research · Depth 67 in the knowledge graph
Unlocks 2 downstream topics
synthesis sketching cegis sygus oracle-guided specification

Core Idea

Program synthesis automatically generates a program that meets a given specification. The specification can take many forms: logical formulas, input-output examples, natural language, or a reference implementation. The synthesis engine searches the space of possible programs for one that satisfies all constraints. Key approaches include enumerative search (try programs in order of size), constraint-based synthesis (encode the problem as a SAT/SMT query), and counterexample-guided inductive synthesis (CEGIS), which iterates between proposing candidate programs and checking them against the specification. Program synthesis inverts the verification problem: instead of checking whether a given program meets a spec, it finds a program that does.

Explainer

Program synthesis is the automated construction of programs from specifications. Where verification asks "does this program meet this spec?", synthesis asks "find me a program that meets this spec." The specification constrains the desired behavior — it might be a logical formula (for all inputs x, output f(x) satisfies P(x, f(x))), a set of input-output examples ({(1,1), (2,4), (3,9)} suggesting squaring), a reference implementation to optimize, or even a natural language description. The synthesis engine's job is to search the space of possible programs and find one that satisfies all constraints.
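The difference between two of these specification forms can be made concrete with a small sketch. The function names below (square, impostor, meets_examples, meets_formula) are hypothetical, and the logical formula is assumed to be P(x, y): y == x * x, checked only over a bounded range rather than proved for all integers.

```python
# Input-output examples from the text: they suggest squaring.
EXAMPLES = [(1, 1), (2, 4), (3, 9)]

def square(x):
    return x * x

def impostor(x):
    # Agrees with squaring at x = 1, 2, 3 because the extra term vanishes there.
    return x * x + (x - 1) * (x - 2) * (x - 3)

def meets_examples(f, examples):
    """Example-based spec: f must reproduce every listed pair."""
    return all(f(x) == y for x, y in examples)

def meets_formula(f, domain):
    """Logical spec P(x, f(x)) with P(x, y): y == x * x, checked on a bounded domain."""
    return all(f(x) == x * x for x in domain)

for f in (square, impostor):
    print(f.__name__, meets_examples(f, EXAMPLES), meets_formula(f, range(-10, 11)))
```

Both functions satisfy the examples, but only square satisfies the formula. Examples are easy to write but can leave the intended program underdetermined, while a logical specification pins the behavior down at the cost of being harder to state.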

The challenge is that the space of programs is astronomically large and mostly filled with incorrect candidates. Enumerative search tries programs in order of increasing size, checking each against the specification. It is complete with respect to its search space (if a solution of some finite size exists, it is eventually found), but the number of candidates grows exponentially with program size, so it is slow without pruning. Constraint-based synthesis encodes the search as a SAT or SMT problem: represent the unknown program symbolically, express the specification as constraints, and let a solver find a satisfying assignment. This can be very effective for small programs, but the encoding grows with program size, which limits how large a program the solver can handle.
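Enumerative search as described above can be sketched in a few lines. This is a minimal illustration, not a production synthesizer: the toy expression language (LEAVES, OPS) and the helper names are assumptions made for the example, and the specification is the set of input-output examples for squaring.

```python
from itertools import product

# Hypothetical toy DSL: integer expressions over a single variable x.
LEAVES = ["x", "1", "2"]
OPS = ["+", "*", "-"]

def expressions(size):
    """Yield every expression string that contains exactly `size` leaves."""
    if size == 1:
        yield from LEAVES
        return
    for left_size in range(1, size):
        pairs = product(expressions(left_size), expressions(size - left_size))
        for left, right in pairs:
            for op in OPS:
                yield f"({left} {op} {right})"

def run(expr, x):
    """Evaluate a DSL expression on input x (the string was built by us, so eval is safe)."""
    return eval(expr, {"__builtins__": {}}, {"x": x})

def synthesize(examples, max_size=4):
    """Return the first, and therefore smallest, expression consistent with all examples."""
    for size in range(1, max_size + 1):
        for expr in expressions(size):
            if all(run(expr, x) == y for x, y in examples):
                return expr
    return None

result = synthesize([(1, 1), (2, 4), (3, 9)])
print(result)  # (x * x)
```

Even in this tiny language there are 3 expressions of size 1, 27 of size 2, and 486 of size 3, which is why practical enumerators prune aggressively, for example by discarding expressions that behave identically on the examples.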

CEGIS (Counterexample-Guided Inductive Synthesis), introduced by Solar-Lezama and colleagues in the Sketch system, avoids reasoning about every input at once. It maintains a set of concrete inputs, together with the outputs the specification requires on them, and iterates two phases. The synthesis phase finds a program consistent with the current examples, which is a smaller, easier problem than meeting the full specification. The verification phase checks the candidate against the complete specification using a verification tool (an SMT solver or a model checker). If the candidate passes, synthesis succeeds. If it fails, the verifier produces a counterexample, a specific input on which the candidate misbehaves, which is added to the example set, and the cycle repeats. Each counterexample rules out the current candidate and every other program that fails on that input, so the loop often converges in few iterations in practice.
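The two-phase loop can be sketched as follows. Everything concrete here is an assumption for illustration: the specification is a reference implementation, candidates come from a template a * x + b with two integer holes, the synthesis phase is brute force, and the "verifier" is an exhaustive check over a bounded domain standing in for an SMT solver or model checker.

```python
from itertools import product

DOMAIN = range(-50, 51)       # assumption: bounded inputs, so verification can be exhaustive
HOLE_VALUES = range(-10, 11)  # assumption: each hole ranges over small integers

def reference(x):
    """The specification, given here as a reference implementation."""
    return x + x + x + 7

def candidate(a, b):
    """Template program with two holes: f(x) = a * x + b."""
    return lambda x: a * x + b

def synthesize(examples):
    """Synthesis phase: find hole values consistent with the examples seen so far."""
    for a, b in product(HOLE_VALUES, repeat=2):
        f = candidate(a, b)
        if all(f(x) == y for x, y in examples):
            return a, b
    return None

def verify(a, b):
    """Verification phase: return a counterexample input, or None if none exists in DOMAIN."""
    f = candidate(a, b)
    for x in DOMAIN:
        if f(x) != reference(x):
            return x
    return None

def cegis():
    examples = []  # grows by one counterexample per failed round
    rounds = 0
    while True:
        rounds += 1
        holes = synthesize(examples)
        if holes is None:
            return None, rounds, examples  # template cannot meet the spec
        cex = verify(*holes)
        if cex is None:
            return holes, rounds, examples
        examples.append((cex, reference(cex)))

holes, rounds, examples = cegis()
print(holes, rounds, examples)
```

In this run the first candidate is arbitrary because no examples constrain it yet. The single counterexample the verifier returns is enough to eliminate every other hole assignment in the template, so the second candidate, (3, 7), passes verification.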

The SyGuS (Syntax-Guided Synthesis) framework standardizes how synthesis problems are stated. A SyGuS instance consists of a background theory (defining the semantics of operations), a syntactic grammar (defining the space of candidate programs), and a semantic specification (defining the desired behavior). The grammar is crucial: by restricting the search space to programs constructible from specific operators and patterns, it makes synthesis tractable. The SyGuS competition (SyGuS-Comp) benchmarks synthesis tools on standard problems, driving advances in the field.
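The three components of a SyGuS instance can be mimicked in plain Python for the classic "maximum of two integers" problem. This is only a sketch under stated assumptions: it does not use the SyGuS input format, the background theory is simply Python integer arithmetic, the grammar and the names GRAMMAR, expand, spec, and INPUTS are made up for the example, and the specification is checked on a small bounded set of inputs instead of being proved for all integers.

```python
from itertools import product

# Syntactic grammar: nonterminal -> productions. Each production is a tuple of
# terminals (plain strings) and nonterminals (keys of GRAMMAR).
GRAMMAR = {
    "E": [("x",), ("y",), ("0",),
          ("(", "E", " + ", "E", ")"),
          ("(", "E", " if ", "B", " else ", "E", ")")],
    "B": [("(", "E", " <= ", "E", ")")],
}

INPUTS = range(-3, 4)  # assumption: bounded check instead of a solver-backed proof

def expand(symbol, depth):
    """Yield every string derivable from `symbol` with at most `depth` nested productions."""
    if symbol not in GRAMMAR:
        yield symbol  # terminal: emitted as-is
        return
    if depth == 0:
        return
    for production in GRAMMAR[symbol]:
        parts = [list(expand(s, depth - 1)) for s in production]
        for choice in product(*parts):
            yield "".join(choice)

def spec(f, x, y):
    """Semantic specification: the output bounds both inputs and equals one of them."""
    out = f(x, y)
    return out >= x and out >= y and (out == x or out == y)

def synthesize(max_depth=3):
    """Return the first grammar term that meets the spec on every input pair, or None."""
    for depth in range(1, max_depth + 1):
        for term in expand("E", depth):
            f = lambda x, y, t=term: eval(t, {"__builtins__": {}}, {"x": x, "y": y})
            if all(spec(f, x, y) for x, y in product(INPUTS, repeat=2)):
                return term
    return None

result = synthesize()
print(result)
```

Removing the conditional production from the grammar makes the problem unsolvable in this search space, and adding more operators makes the search slower. That trade-off is the role the grammar plays in a SyGuS instance.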

Practical applications include programming by example (Excel's Flash Fill, which synthesizes string transformations from examples), superoptimization (finding the shortest or fastest instruction sequence equivalent to a given program fragment), protocol synthesis (generating distributed protocols from high-level specifications), and program repair (synthesizing patches that fix bugs while preserving correct behavior). The connection to machine learning is growing: neural-guided synthesis uses learned models to prioritize which programs to try, combining the generalization of ML with the correctness guarantees of formal verification.

