A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Simultaneous Localization and Mapping (SLAM)

Research · Depth 130 in the knowledge graph

851 prerequisites beneath it

Core Idea

SLAM solves a chicken-and-egg problem: to build a map, the robot needs to know where it is; to know where it is, it needs a map. The solution is to estimate the robot's pose (position and orientation) and the map of the environment simultaneously, propagating uncertainty through a filter (such as an extended Kalman filter) or a graph optimization. Visual SLAM works from camera images; LiDAR SLAM works from point clouds, aligned by registration. A critical component is loop closure: when the robot returns to a previously visited location, a place-recognition detector recognizes it and adds a constraint that corrects accumulated drift. The result is a globally consistent map in which drift is bounded rather than growing without limit. SLAM enables autonomous navigation, exploration, and 3D reconstruction without pre-built maps.

How It's Best Learned

Implement a simple monocular visual SLAM pipeline from scratch: feature extraction (SIFT/ORB), matching between frames, essential matrix estimation, triangulation of 3D points, pose estimation via PnP, and bundle adjustment to refine the estimates. Observe how drift accumulates as the camera moves. Then add loop closure: when the current frame matches a past frame, add a loop constraint and re-optimize the full trajectory. After that, move on to existing SLAM frameworks (ORB-SLAM2 for visual, LOAM or LeGO-LOAM for LiDAR). Run them on real sensor data and visualize the reconstructed map and camera trajectory.

Common Misconceptions

SLAM produces a perfect map by integrating all sensor observations; in reality, small errors accumulate as drift, and loop closure with global optimization is needed to keep the map consistent.
Loop closure can be added to any SLAM system; actually, loop closure requires place recognition (detecting revisited locations), which is non-trivial and often fails in featureless or repetitive environments.
Visual SLAM works in all lighting conditions; actually, feature-based visual SLAM often degrades or fails in low light, rain, or motion blur. LiDAR SLAM is more robust to lighting changes.
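SLAM always runs in real time on standard CPUs; actually, it depends on the method and the sensor. Sparse feature-based systems such as ORB-SLAM can run in real time on a CPU, while dense or learning-based methods and high-rate, high-resolution sensors often need GPU acceleration.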

Explainer

Imagine a mobile robot exploring an unknown building with only a camera or LiDAR for sensing and encoders tracking wheel rotation (odometry). As the robot moves, it builds a map of observed features. The problem: odometry drifts over time (wheel slip, sensor noise), so the robot loses track of its true location. Without knowing its true position, the map is warped and inconsistent. Conversely, without a map, it's impossible to know where the robot is. This is the SLAM problem: estimate the robot's trajectory and map simultaneously.
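To make the drift concrete, here is a minimal dead-reckoning sketch. It assumes a hypothetical robot that actually drives in a straight line while its odometry heading picks up a small constant bias each step; the step length and bias values are illustrative, not taken from any real platform.

```python
import math

def dead_reckon(steps, step_len=0.1, heading_bias=0.002):
    """Integrate odometry whose heading gains a small bias every step (assumed values).

    Returns the position error of the odometry estimate at each step,
    measured against the true straight-line path along the x axis.
    """
    x = y = theta = 0.0
    errors = []
    for k in range(1, steps + 1):
        theta += heading_bias                 # heading estimate slowly rotates away
        x += step_len * math.cos(theta)       # integrate the biased motion
        y += step_len * math.sin(theta)
        true_x, true_y = k * step_len, 0.0    # ground truth: straight line
        errors.append(math.hypot(x - true_x, y - true_y))
    return errors

errors = dead_reckon(500)
print(f"error after 100 steps: {errors[99]:.3f} m")
print(f"error after 500 steps: {errors[499]:.3f} m")
```

The error keeps growing with distance travelled because nothing in pure odometry ever pulls the estimate back toward the truth.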

Filtering-based SLAM maintains a running estimate of the current state. EKF-SLAM keeps a single Gaussian estimate over the robot pose and all landmark positions: the filter predicts the state from odometry and updates it from sensor observations (feature measurements). Particle-filter SLAM (for example FastSLAM) instead tracks many trajectory hypotheses, each with its own map. As the robot observes landmarks, the observations constrain the pose and reduce uncertainty. The key insight is that observations of known landmarks improve localization, so mapping and localizing are tightly coupled. In EKF-SLAM the state vector grows as new landmarks are discovered, and because the covariance matrix grows quadratically with the number of landmarks, each update becomes slower as the map grows.
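The coupling between localization and mapping can be sketched in one dimension. The example below assumes a toy world with one robot coordinate and one landmark, a linear motion model, and a relative-position sensor, so the extended Kalman filter reduces to a plain Kalman filter. The function names and noise values are hypothetical.

```python
import numpy as np

def predict(mu, P, u, q):
    """Motion step: only the robot entry (index 0) moves and gains variance q."""
    mu, P = mu.copy(), P.copy()
    mu[0] += u
    P[0, 0] += q
    return mu, P

def update(mu, P, z, r, j):
    """Fuse an observation z = landmark_j - robot + noise (noise variance r)."""
    H = np.zeros((1, len(mu)))
    H[0, 0], H[0, j] = -1.0, 1.0
    S = H @ P @ H.T + r                       # innovation covariance (1x1)
    K = P @ H.T / S                           # Kalman gain, one column
    mu = mu + (K * (z - H @ mu)).ravel()
    P = (np.eye(len(mu)) - K @ H) @ P
    return mu, P

# State: [robot position, landmark position]; the landmark starts out unknown.
mu = np.array([0.0, 0.0])
P = np.diag([0.0, 1e6])

mu, P = update(mu, P, z=5.0, r=0.01, j=1)     # first sighting initializes the landmark
mu, P = predict(mu, P, u=1.0, q=0.5)          # noisy odometry inflates robot variance
var_before = P[0, 0]
mu, P = update(mu, P, z=4.0, r=0.01, j=1)     # re-observing the landmark localizes the robot
var_after = P[0, 0]

print("robot variance before/after re-observation:", var_before, var_after)
print("robot-landmark covariance:", P[0, 1])
```

Re-observing the landmark shrinks the robot's variance, and the off-diagonal covariance term becomes non-zero: the filter now knows that errors in the robot and landmark estimates are correlated.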

Graph-based SLAM is the more modern formulation. It represents the problem as a graph whose nodes are robot poses (one per time step or keyframe) and landmarks, and whose edges are constraints (odometry, sensor observations). The goal is to find the configuration of nodes that best satisfies all constraints in a least-squares sense. Bundle adjustment (in visual SLAM) or pose graph optimization (in general SLAM) solves this. Because each constraint touches only a few nodes, the problem is sparse, and graph-based methods scale better than filtering as the environment grows.
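A minimal sketch of the least-squares idea, assuming a one-dimensional world so that every constraint is linear and the graph can be solved in a single step. The measurements and weights are made up for illustration: the robot drives out two steps and back two steps, its forward odometry over-reads, and one loop-closure edge says the last pose coincides with the first.

```python
import numpy as np

# Each constraint (i, j, measured, weight) states: x[j] - x[i] should equal `measured`.
odometry = [(0, 1, 1.1, 1.0), (1, 2, 1.1, 1.0), (2, 3, -1.0, 1.0), (3, 4, -1.0, 1.0)]
loop_closure = [(4, 0, 0.0, 10.0)]            # "pose 4 is back at pose 0", trusted more

def solve_pose_graph(n_poses, constraints):
    """Weighted linear least squares over 1D poses, with pose 0 pinned at the origin."""
    rows, rhs = [], []
    anchor = np.zeros(n_poses)
    anchor[0] = 1.0
    rows.append(1e3 * anchor)                 # strong prior fixing x[0] = 0
    rhs.append(0.0)
    for i, j, measured, weight in constraints:
        row = np.zeros(n_poses)
        row[j], row[i] = 1.0, -1.0
        rows.append(weight * row)             # the weight scales this edge's residual
        rhs.append(weight * measured)
    x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return x

x_open = solve_pose_graph(5, odometry)
x_closed = solve_pose_graph(5, odometry + loop_closure)
print("odometry only    :", np.round(x_open, 3))
print("with loop closure:", np.round(x_closed, 3))
```

With odometry alone the final pose inherits the full accumulated error; with the loop edge the optimizer spreads that error over all odometry edges. Real pose graphs involve rotations, so the problem is nonlinear and is solved iteratively rather than in one linear solve.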

Visual SLAM uses camera images:

1. Initialization: From two views, extract matching features, compute the essential matrix, triangulate 3D points.

2. Pose estimation: For each new frame, match features to existing 3D points, solve PnP (Perspective-n-Point) to get the camera pose.

3. Mapping: If sufficient features are tracked, add the frame as a keyframe and triangulate new 3D points.

4. Loop closure: Periodically check if the current frame matches any past keyframes (place recognition). If yes, add a loop constraint to the graph.

5. Optimization: Bundle adjustment refines all camera poses and 3D points.
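The triangulation used in steps 1 and 3 can be sketched with NumPy alone. This is a linear (DLT) triangulation of a single point from two views, assuming known, noise-free projection matrices; the intrinsics, baseline, and test point are invented for the example, and a real pipeline would first recover the relative pose from the essential matrix.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation: find X whose projections match uv1 and uv2."""
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                                # null vector of A (homogeneous point)
    return X[:3] / X[3]

# Assumed intrinsics and a second camera shifted 0.2 units along +x (t = -R @ C).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

X_true = np.array([0.3, -0.1, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print("triangulated point:", np.round(X_est, 6))
```

With noisy pixel measurements the linear solution is only an initial guess, which is one reason bundle adjustment follows as a refinement step.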

The advantage of keyframes: instead of processing every frame, you selectively add frames that contribute new information (large motion or new areas). This reduces computation and redundancy.
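One way to picture keyframe selection is as a simple threshold test against the last keyframe. The sketch below is a hypothetical heuristic, not the rule of any particular system: the `Frame` fields and all three thresholds are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Frame:
    x: float
    y: float
    yaw: float             # heading in radians
    tracked_ratio: float   # fraction of the last keyframe's points still tracked

def is_keyframe(frame, last_kf, min_dist=0.3, min_yaw=math.radians(15), min_tracked=0.6):
    """Promote a frame if it moved far, turned a lot, or lost many tracked points."""
    moved = math.hypot(frame.x - last_kf.x, frame.y - last_kf.y) > min_dist
    dyaw = (frame.yaw - last_kf.yaw + math.pi) % (2 * math.pi) - math.pi
    turned = abs(dyaw) > min_yaw
    new_area = frame.tracked_ratio < min_tracked
    return moved or turned or new_area

def select_keyframes(frames):
    """Return indices of frames kept as keyframes; the first frame is always kept."""
    keyframes = [0]
    for idx in range(1, len(frames)):
        if is_keyframe(frames[idx], frames[keyframes[-1]]):
            keyframes.append(idx)
    return keyframes

frames = [
    Frame(0.00, 0.0, 0.0, 1.00),
    Frame(0.10, 0.0, 0.0, 0.95),
    Frame(0.20, 0.0, 0.0, 0.90),
    Frame(0.35, 0.0, 0.0, 0.80),   # moved more than min_dist
    Frame(0.40, 0.0, 0.0, 0.95),
    Frame(0.45, 0.0, 0.4, 0.90),   # turned more than min_yaw
    Frame(0.50, 0.0, 0.4, 0.50),   # tracking dropped below min_tracked
]
print(select_keyframes(frames))
```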

LiDAR SLAM uses point cloud registration:

1. Scan alignment: Use ICP to align the current scan to the previous scan, estimating the motion.

2. Pose estimation: Integrate motion estimates from successive scans.

3. Mapping: Accumulate aligned scans into a global map.

4. Loop closure: When the current scan matches a past scan (detected via scan similarity), add a loop constraint.

5. Optimization: Pose graph optimization corrects the entire trajectory.
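The core of the scan alignment step can be sketched as a closed-form rigid fit. The example assumes the point correspondences are already known, which is the part full ICP has to estimate: ICP alternates nearest-neighbour matching with a fit like this one until the alignment stops changing. The scan points and the motion are invented.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t with R @ src[i] + t close to dst[i] (Kabsch / SVD method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)       # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    dim = src.shape[1]
    # Force det(R) = +1 so the result is a rotation, not a reflection.
    D = np.diag([1.0] * (dim - 1) + [np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# A small 2D "scan" and the same points seen after a known motion.
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
t_true = np.array([1.0, -2.0])
scan_prev = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
scan_curr = scan_prev @ R_true.T + t_true

R_est, t_est = best_rigid_transform(scan_prev, scan_curr)
print("estimated rotation (rad):", np.arctan2(R_est[1, 0], R_est[0, 0]))
print("estimated translation:", t_est)
```

Chaining the estimated transforms from successive scan pairs gives the pose estimate of step 2, along with its accumulated drift.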

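LiDAR SLAM is often more robust than visual SLAM because scan-to-scan alignment is a well-defined geometric problem (point cloud registration) that converges reliably when the initial guess is reasonable and the scene has enough geometric structure. It can still degrade in geometrically featureless places such as long corridors or tunnels. Visual SLAM must handle feature matching, which can fail in repetitive or textureless environments.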

Loop closure is the critical component that corrects accumulated drift. The hard part is place recognition: how do you know you've returned to a previous location? Visual SLAM uses image-based methods (matching visual features); LiDAR SLAM uses scan-based methods (matching point clouds). A detector with high precision (few false positives) is essential, because a false loop closure adds a wrong constraint that corrupts the map.
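A rough sketch of a precision-oriented detector, assuming each keyframe has already been summarized as a global descriptor vector (for instance a bag-of-words histogram). The descriptors, the `min_gap` exclusion window, and the similarity threshold are all hypothetical; practical systems typically add a geometric verification step before accepting a candidate.

```python
import numpy as np

def detect_loop(descriptors, query_idx, min_gap=3, threshold=0.9):
    """Return the index of the best-matching earlier keyframe, or None.

    Keyframes closer than `min_gap` to the query are skipped so that
    immediate neighbours are not reported as loops. A high `threshold`
    trades missed loops for fewer false positives.
    """
    q = descriptors[query_idx]
    best_idx, best_score = None, threshold
    for i in range(query_idx - min_gap):
        d = descriptors[i]
        score = float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))  # cosine similarity
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx

# Made-up descriptors: keyframe 4 looks very similar to keyframe 0.
descriptors = np.array([
    [5.0, 0.0, 1.0, 0.0],
    [0.0, 4.0, 0.0, 2.0],
    [1.0, 1.0, 3.0, 0.0],
    [0.0, 0.0, 2.0, 5.0],
    [5.0, 0.0, 1.0, 1.0],
])
print(detect_loop(descriptors, 4))                  # candidate loop with keyframe 0
print(detect_loop(descriptors, 4, threshold=0.99))  # stricter threshold rejects it
```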

Once a loop closure is detected, the full trajectory is optimized to satisfy all constraints simultaneously. Early poses (which had accumulated drift) are corrected based on late observations. The result is a globally consistent map and trajectory.

Challenges:

Place recognition: The detector must stay robust across viewpoint changes and lighting variations, and must cope with perceptual aliasing (different places that look alike).
Scale ambiguity (monocular visual SLAM): A single camera can't determine absolute scale; a small, nearby scene traversed slowly produces the same images as a large, distant scene traversed quickly. Stereo cameras or LiDAR (which provide metric depth) resolve this.
Computational efficiency: Real-time SLAM on mobile robots requires efficient feature detection, matching, and optimization.
Dynamic environments: Moving people, vehicles, or other robots violate the SLAM assumption that the world is static. Robust outlier rejection is needed.
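The scale ambiguity in the list above can be checked numerically. The sketch assumes an ideal pinhole camera with no rotation and an arbitrary focal length; the points and camera positions are made up. Scaling the whole scene and the camera motion by the same factor leaves every image unchanged.

```python
import numpy as np

def project(points, cam_pos, f=500.0):
    """Pinhole projection for a camera at cam_pos looking along +z (no rotation)."""
    rel = points - cam_pos
    return f * rel[:, :2] / rel[:, 2:3]

points = np.array([[0.5, 0.2, 4.0], [-1.0, 0.3, 6.0], [0.2, -0.4, 5.0]])
cam_a = np.zeros(3)                   # first camera position
cam_b = np.array([0.3, 0.0, 0.1])     # second camera position after a small motion

s = 10.0                              # scale the scene and the motion by the same factor
same_first_view = np.allclose(project(points, cam_a), project(s * points, s * cam_a))
same_second_view = np.allclose(project(points, cam_b), project(s * points, s * cam_b))
print(same_first_view, same_second_view)
```

Because both views are identical for any positive `s`, images alone cannot fix the metric scale; an external reference such as a stereo baseline, depth measurements, or an IMU is needed.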

Applications: autonomous vehicles, drones, mobile robots in warehouses, hand-held 3D reconstruction devices, and underwater robotics. Modern visual SLAM systems (ORB-SLAM, ORB-SLAM3) can run in real time on modest hardware. LiDAR SLAM is standard in autonomous vehicles and industrial drones.

SLAM is one of the most important problems in robotics, and decades of research have produced mature, practical solutions. The field continues to evolve with deep learning approaches (neural depth estimation, learned place recognition) and tighter sensor fusion.

Longest path: 131 steps · 851 total prerequisite topics

Prerequisites (4)

Kalman Filter and State Estimation (hard)
LiDAR and 3D Point Cloud Processing (soft)
Visual Servoing and Image-Based Robot Control (soft)
Particle Filter Localization (Monte Carlo Localization) (soft)

Leads To (0)

No topics depend on this one yet.