Robotics From Zero
Module: Map The Unknown

SLAM Explained

Understand the chicken-and-egg problem of simultaneous localization and mapping, explore graph-based and feature-based SLAM approaches, and see how modern algorithms solve this fundamental robotics challenge.

12 min read

The Chicken-and-Egg Problem

Here's a riddle: to build a map, you need to know where you are. To know where you are, you need a map. So if you have neither... where do you start?

That's the SLAM problem — Simultaneous Localization And Mapping. The robot must build a map of an unknown environment while simultaneously figuring out its position on that incomplete map. It's like drawing a floor plan of a dark building while blindfolded, using only your footsteps and a flashlight to sense nearby walls.

SLAM chicken-and-egg problem — circular dependency between mapping (needs position) and localization (needs map)
SLAM's fundamental challenge: you need your position to build the map, but you need the map to find your position. SLAM algorithms break this circular dependency by jointly estimating both, updating each as new sensor data arrives.

SLAM is considered one of the fundamental problems in mobile robotics, and solving it robustly is what separates toy robots from serious autonomous systems.

Why SLAM is Hard

Let's break down the difficulty:

Circular Dependency

You need your position to add sensor data to the map in the right place. You need the map to figure out your position. Every measurement depends on every previous measurement — errors compound.

Sensor Drift

Odometry (wheel encoders, IMU) accumulates error over time. Walk 100 meters counting steps, and you might be off by several meters. The map you're building is based on these drifting position estimates.
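A quick simulation makes the drift concrete. This is a toy example — the 2% distance noise and 0.5°-per-step heading noise are illustrative values, not figures from any particular robot:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a robot driving straight in 1 m steps for 100 m, dead-reckoning
# its position from noisy step lengths and a slowly drifting heading.
true_pos = np.zeros(2)
est_pos = np.zeros(2)
est_heading = 0.0

for _ in range(100):
    true_pos += np.array([1.0, 0.0])
    step = 1.0 + rng.normal(0, 0.02)               # ~2% distance noise
    est_heading += rng.normal(0, np.radians(0.5))  # heading random walk
    est_pos += step * np.array([np.cos(est_heading), np.sin(est_heading)])

drift = np.linalg.norm(est_pos - true_pos)
print(f"Dead-reckoning error after 100 m: {drift:.2f} m")
```

The heading noise is the killer: it accumulates as a random walk, so small per-step errors turn into metres of lateral displacement — and every map cell painted from those poses is displaced with them.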

Data Association

When the robot sees a corner, is it a new corner to add to the map, or one it saw 30 seconds ago? Getting this wrong creates duplicate landmarks or destroys your map.
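The simplest data-association strategy is a nearest-neighbour match with a distance gate. This is a minimal sketch — the gate value is arbitrary, and real systems use probabilistic gates (e.g. Mahalanobis distance against the landmark's uncertainty) rather than raw Euclidean distance:

```python
import math

def match_to_landmarks(observation, landmarks, gate=0.5):
    """Nearest-neighbour data association with a distance gate.

    Returns the index of the closest known landmark, or None if the
    nearest one is farther than `gate` metres -- in which case the
    caller should treat the observation as a brand-new landmark.
    """
    best_id, best_dist = None, gate
    for i, (lx, ly) in enumerate(landmarks):
        d = math.hypot(observation[0] - lx, observation[1] - ly)
        if d < best_dist:
            best_id, best_dist = i, d
    return best_id

landmarks = [(0.0, 0.0), (5.0, 0.0)]
print(match_to_landmarks((0.1, 0.2), landmarks))  # matches landmark 0
print(match_to_landmarks((2.5, 0.0), landmarks))  # None: new landmark
```

Note the failure modes: too tight a gate spawns duplicate landmarks; too loose a gate merges distinct ones and corrupts the map — exactly the trade-off described above.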

Scale

A building might have thousands of landmarks. Tracking the relationships between all of them requires sophisticated data structures.

Note

The breakthrough that made SLAM practical was realizing you don't need to maintain a perfect map constantly — you just need to track uncertainty and occasionally correct accumulated errors through "loop closure" (which we'll cover in the next lesson).

Two Main Approaches

SLAM algorithms fall into two broad categories:

1. Feature-Based SLAM

The robot extracts distinctive landmarks (corners, edges, specific objects) from sensor data and tracks them as it moves. The map is a list of landmark positions.

Feature-based SLAM — showing how the robot tracks distinctive landmarks to simultaneously estimate its trajectory and the map
Feature-based SLAM extracts distinctive landmarks (corners, edges, objects) from sensor data and tracks them across frames. By triangulating against known landmarks, the robot estimates its position while adding new landmarks to the map.

How it works:

  1. Detect features in sensor data (camera: corners, edges; LiDAR: lines, planes)
  2. Match features across consecutive observations to track them
  3. Use feature positions to estimate robot motion
  4. Refine both robot trajectory and feature positions together

Pros:

  • Compact maps (just landmark coordinates)
  • Fast for sparse environments
  • Good for visual SLAM with cameras

Cons:

  • Requires distinctive features (fails in featureless hallways)
  • Matching features across views is hard
Feature-Based SLAM Outline (pseudocode — detect_features, match_to_landmarks, and the update helpers stand in for real implementations)

class FeatureBasedSLAM:
    def __init__(self):
        self.landmarks = []                # list of (x, y) map positions
        self.robot_pose = Pose(0, 0, 0)    # x, y, heading

    def process_scan(self, sensor_data):
        # 1. Extract features from the raw sensor data
        observed_features = detect_features(sensor_data)

        for feature in observed_features:
            # 2. Try to match the observation to a known landmark
            landmark_id = match_to_landmarks(feature, self.landmarks)
            if landmark_id is None:
                # 3a. Unmatched: add a new landmark, converted into the
                #     map frame using the current pose estimate
                self.landmarks.append(self.to_map_frame(feature))
            else:
                # 3b. Matched: use the known landmark to correct the pose
                self.robot_pose = correct_pose_estimate(
                    self.robot_pose, self.landmarks[landmark_id], feature)

        # 4. Refine landmark positions using the corrected pose
        self.refine_landmark_positions()

2. Graph-Based SLAM

Instead of maintaining a single map, the algorithm builds a graph where nodes are robot poses at different times, and edges are spatial constraints (odometry measurements, landmark observations).

Graph-based SLAM — pose graph with nodes at robot positions and edges representing odometry and loop closure constraints
Graph-based SLAM builds a graph of poses (nodes) connected by spatial constraints (edges). Odometry edges link consecutive poses. Loop closure edges connect revisited locations. Optimization adjusts all poses to satisfy every constraint as well as possible.

The magic happens during graph optimization — periodically, the algorithm adjusts all poses to make the constraints as consistent as possible. Think of the graph as a network of springs: each constraint pulls on the poses it connects, and optimization settles the whole trajectory into the configuration with the least total tension.

How it works:

  1. Add a node for each robot pose as it moves
  2. Add edges between consecutive poses (from odometry)
  3. When recognizing a previous location, add a "loop closure" edge
  4. Periodically optimize the graph to minimize constraint violations
  5. Build the map from the optimized trajectory
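The steps above can be sketched in one dimension. In this toy pose graph (the measurements and weights are made up for illustration) the robot drives out 2 m and back with drifting odometry, a loop-closure edge says it ended where it started, and a weighted least-squares solve distributes the accumulated error over the trajectory:

```python
import numpy as np

# Tiny 1D pose graph: poses x0..x4. Drifting odometry reports each leg
# slightly wrong; a high-weight loop-closure edge ties pose 4 to pose 0.
# Edge format: (i, j, measured x_j - x_i, weight)
edges = [
    (0, 1,  1.02, 1.0),    # odometry
    (1, 2,  1.02, 1.0),
    (2, 3, -0.98, 1.0),
    (3, 4, -0.98, 1.0),
    (4, 0,  0.00, 100.0),  # loop closure: we're back at the start
]

n = 5
# Anchor x0 = 0 and solve weighted least squares for x1..x4.
A = np.zeros((len(edges), n - 1))
b = np.zeros(len(edges))
for row, (i, j, meas, w) in enumerate(edges):
    if j > 0:
        A[row, j - 1] += w
    if i > 0:
        A[row, i - 1] -= w
    b[row] = w * meas

x = np.concatenate(([0.0], np.linalg.lstsq(A, b, rcond=None)[0]))
print(np.round(x, 3))  # → [0. 1. 2. 1. 0.]
```

Pure dead reckoning would end at x4 ≈ 0.08 m; the loop closure pulls it back to ~0 and spreads the 8 cm of drift evenly across the four odometry edges. Real systems (g2o, GTSAM, Ceres) solve the same kind of problem, but nonlinearly, in SE(2) or SE(3), with thousands of poses.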

Pros:

  • Handles large-scale environments
  • Naturally incorporates loop closure
  • Can produce occupancy grids or feature maps

Cons:

  • Periodic optimization can be slow
  • Requires good loop closure detection
Tip

Modern SLAM systems often combine both approaches — use features for tracking frame-to-frame, but maintain a pose graph for global optimization.

The Role of Filtering vs. Smoothing

Filtering vs smoothing — comparing EKF-SLAM (filters forward only) with graph-based SLAM (smoothes entire trajectory)
Filtering processes data forward-only — once a pose is estimated, it's locked in. Smoothing (graph-based) can revise past poses when new evidence arrives, like loop closure. This makes smoothing produce better maps but at higher computational cost.

There are two philosophies for handling SLAM uncertainty:

Filtering (EKF-SLAM, FastSLAM)

Maintain a probability distribution over the current robot pose and map. Update this distribution incrementally with each new measurement. Once a decision is made about past poses, it's locked in.

  • Fast per-update
  • Can't revise history
  • Errors accumulate

Smoothing (Graph-Based SLAM)

Maintain a history of poses and constraints. Periodically re-optimize the entire trajectory, revising past poses if new evidence (like loop closure) suggests they were wrong.

  • Slower periodic optimization
  • Can correct past errors
  • Better final map quality
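The difference shows up clearly on a toy 1D out-and-back run. Note the smoothing step here just spreads the end-point error evenly over the trajectory — a crude stand-in for real graph optimization, used only to show that past poses get revised:

```python
# Drifting odometry for an out-and-back run; ground truth ends at 0.
odometry = [1.02, 1.02, -0.98, -0.98]

# --- Filtering: integrate forward; past poses are frozen forever. ---
filtered = [0.0]
for step in odometry:
    filtered.append(filtered[-1] + step)
# A loop closure at the end ("you are at 0.0") can only fix the LAST pose:
filtered[-1] = 0.0
print("filter:", [round(p, 2) for p in filtered])
# earlier poses still carry the drift: [0.0, 1.02, 2.04, 1.06, 0.0]

# --- Smoothing: revise the whole trajectory once the loop closes. ---
drift = sum(odometry)                  # 0.08 m of accumulated error
correction = drift / len(odometry)     # spread it over every step
smoothed = [0.0]
for step in odometry:
    smoothed.append(smoothed[-1] + step - correction)
print("smooth:", [round(p, 2) for p in smoothed])
# past poses corrected too: [0.0, 1.0, 2.0, 1.0, 0.0]
```

The filter's intermediate poses (1.02, 2.04, 1.06) keep their error even after the loop closes — and so does any map built from them. The smoother rewrites history, which is exactly why graph-based systems produce better final maps.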

Most modern SLAM systems use smoothing because loop closure (next lesson) is critical for long-term accuracy, and smoothing naturally incorporates it.

Visual SLAM vs. LiDAR SLAM

SLAM can use different sensors:

Sensor Type   | Pros                                   | Cons                                         | Common Use
Camera        | Rich features, cheap, passive          | Struggles in darkness, affected by lighting  | Drones, AR/VR headsets
LiDAR         | Works in any lighting, accurate depth  | Expensive, can't read text/color             | Self-driving cars, indoor robots
Both (Fusion) | Best of both worlds                    | Complexity, sensor synchronization           | High-end autonomous systems

Visual SLAM (using cameras) is called vSLAM and typically relies on feature matching (ORB-SLAM, PTAM). LiDAR SLAM typically relies on scan matching — aligning consecutive point clouds with algorithms like ICP — as in systems such as Cartographer.

What's Next?

SLAM works surprisingly well for short-term mapping, but there's a catch: errors slowly accumulate as the robot explores. The solution is loop closure — recognizing when you've returned to a previously visited place. That's our next lesson, and it's the secret ingredient that makes long-term SLAM possible.

Got questions? Join the community

Discuss this lesson, get help, and connect with other learners on r/softwarerobotics.


Frequently Asked Questions

How does SLAM work in simple terms?

SLAM builds a map and tracks the robot's location at the same time. As the robot moves, it takes sensor readings and matches them against what it has seen before to estimate both where it is and what the environment looks like. When it recognizes a place it visited earlier (loop closure), it corrects any accumulated drift in both the map and its trajectory.

What sensors are needed for SLAM?

SLAM can work with many sensor types. 2D LiDAR is the most common for indoor ground robots. 3D LiDAR is used for outdoor and autonomous vehicle SLAM. Cameras enable visual SLAM (vSLAM), which is cheaper but more sensitive to lighting. Many systems fuse multiple sensors — for example, LiDAR plus IMU, or stereo camera plus wheel odometry — for greater robustness.

Can SLAM work in real time?

Yes, modern SLAM algorithms are designed for real-time operation. 2D LiDAR SLAM (such as GMapping or Cartographer) runs comfortably at 10 to 30 Hz on a standard laptop CPU. Visual SLAM systems like ORB-SLAM3 also achieve real-time performance. However, large-scale mapping with dense 3D point clouds may require GPU acceleration or careful tuning to maintain real-time rates.
