ROS (Robot Operating System) is a flexible, distributed middleware framework for building complex robotic systems. Rather than monolithic control software, ROS organizes functionality into independent nodes (processes) that communicate through standardized message passing: topics for streaming data and services for request/response calls. A sensor node publishes raw data; a perception node subscribes to that data and publishes detected objects; a planning node subscribes to objects and publishes waypoints; a control node subscribes to waypoints and commands actuators. This decoupling lets teams develop and test components independently, swap implementations (use a different perception algorithm by launching a different node), and scale across multiple computers (nodes can run on the main computer, on a remote server, or, usually through a bridge such as rosserial or micro-ROS, on onboard microcontrollers). ROS also provides tools for visualization, debugging, and recording and playback of sensor data. The ecosystem includes thousands of open-source packages for common robotics tasks (SLAM, motion planning, manipulation). ROS 1 (mature but aging) has been largely succeeded by ROS 2, which was designed for better real-time behavior and security. ROS is used widely enough that most roboticists benefit from learning it, and it shapes how they think about systems architecture.
ROS began as research software at Stanford and Willow Garage around 2007, addressing a real problem: robotics researchers were spending roughly half of their time building infrastructure (communication, visualization, logging) instead of doing research. ROS provided that infrastructure in reusable form, which sped up robotics research considerably. Over the following 15 years, it became the de facto standard for academic and research robotics.
Publish/Subscribe Architecture: ROS's core is a message-passing model. Nodes (independent processes) don't call each other's functions or share memory; instead, they publish data to topics and subscribe to topics of interest. A camera driver publishes images on `/camera/image`; a vision node subscribes and processes them. This loose coupling means the vision node doesn't know or care how the camera driver works; it just consumes images. Multiple nodes can subscribe to the same topic, multiple nodes can publish to the same topic, and nodes can be added or removed without changing the others. The payoff is architectural simplicity and fault isolation: if the camera driver crashes, the vision node keeps running and simply stops receiving images. It does not recover automatically, though; the vision node is typically written to detect the silence with a watchdog timer that triggers fallback behavior when no image has arrived within a timeout.
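The publish/subscribe pattern and the watchdog idea can be sketched in plain Python. This is a toy, in-process model and not the ROS API: `TopicBus`, `Watchdog`, and the message dictionaries are hypothetical names chosen for illustration, and real ROS nodes are separate processes that exchange typed messages.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Optional


class TopicBus:
    """Toy in-process stand-in for ROS topics (hypothetical, not the ROS API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class Watchdog:
    """Reports a source as stale when nothing arrived within timeout_s."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Optional[float] = None

    def feed(self, now: float) -> None:
        self.last_seen = now

    def is_stale(self, now: float) -> bool:
        return self.last_seen is None or (now - self.last_seen) > self.timeout_s


bus = TopicBus()
detections: List[str] = []
logged: List[dict] = []
watchdog = Watchdog(timeout_s=0.5)


def vision_callback(image: dict) -> None:
    # The vision "node" only sees messages; it knows nothing about the driver.
    watchdog.feed(image["stamp"])
    detections.append(f"objects in frame {image['frame_id']}")


# Two independent subscribers on the same topic.
bus.subscribe("/camera/image", vision_callback)
bus.subscribe("/camera/image", logged.append)

# The camera "driver" publishes two frames, then goes silent.
bus.publish("/camera/image", {"frame_id": 1, "stamp": 0.0})
bus.publish("/camera/image", {"frame_id": 2, "stamp": 0.1})
```

After the second frame, `watchdog.is_stale(now=0.3)` is false and `watchdog.is_stale(now=1.0)` is true, which is the point at which a real vision node would switch to its fallback behavior. Adding the logging subscriber required no change to the publisher or to the vision callback.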
Distributed Computing: ROS enables scaling beyond a single computer. A robot might have a low-power embedded controller running motor drivers and low-level control, a mid-range ARM computer running vision, and a cloud server running heavy perception or optimization. These communicate via ROS messages, and node code is the same whether its peers are on the same computer or across a network; only the network configuration changes. This is invaluable for robotics: expensive computation (ML inference) can be offloaded to the cloud while real-time-critical code runs locally, where network latency and dropouts cannot stall it.
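A rough way to see what location transparency means: the subscriber callback below is identical whether a message is handed over in-process or arrives over a socket. This is a sketch using only the Python standard library, not ROS's actual transport; the JSON encoding, the length-prefix framing, and the names `send_message`, `recv_message`, and `on_waypoint` are all assumptions made for illustration.

```python
import json
import socket
import struct
from typing import List, Tuple


def send_message(sock: socket.socket, message: dict) -> None:
    """Serialize to JSON and send with a 4-byte big-endian length prefix."""
    payload = json.dumps(message).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data.extend(chunk)
    return bytes(data)


def recv_message(sock: socket.socket) -> dict:
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


received: List[Tuple[float, float]] = []


def on_waypoint(msg: dict) -> None:
    # The control "node": same code regardless of where the message came from.
    received.append((msg["x"], msg["y"]))


# Case 1: publisher and subscriber share a process.
on_waypoint({"x": 1.0, "y": 2.0})

# Case 2: the message crosses a socket (a connected pair stands in for a network link).
planner_end, controller_end = socket.socketpair()
try:
    send_message(planner_end, {"x": 3.0, "y": 4.0})
    on_waypoint(recv_message(controller_end))
finally:
    planner_end.close()
    controller_end.close()
```

In this toy, only the delivery path differs between the two cases; the callback does not change. What the sketch leaves out is exactly what makes offloading a design decision in practice: latency, dropped connections, and discovery of peers.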
Standardization: ROS provides standardized message types for common robotics data: `sensor_msgs/Image` for images, `geometry_msgs/Twist` for velocity commands, `nav_msgs/OccupancyGrid` for maps. These standards mean a vision node from researcher A can feed images directly to a perception node from researcher B, without custom adapters. This interoperability has enabled a vast ecosystem of open-source packages.
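To illustrate why a shared message definition removes the need for adapters, here is a sketch of a velocity command modeled on `geometry_msgs/Twist`, which holds a linear and an angular 3-vector. The dataclasses and the `encode_twist`/`decode_twist` helpers are hypothetical stand-ins written for this example, and the six-float64 byte layout is an illustrative assumption rather than a statement about ROS's wire format.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Vector3:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


@dataclass
class Twist:
    """Mirrors the fields of geometry_msgs/Twist: linear and angular velocity."""
    linear: Vector3 = field(default_factory=Vector3)
    angular: Vector3 = field(default_factory=Vector3)


# Assumed layout for this sketch: six little-endian float64 values.
_LAYOUT = struct.Struct("<6d")


def encode_twist(t: Twist) -> bytes:
    return _LAYOUT.pack(t.linear.x, t.linear.y, t.linear.z,
                        t.angular.x, t.angular.y, t.angular.z)


def decode_twist(data: bytes) -> Twist:
    lx, ly, lz, ax, ay, az = _LAYOUT.unpack(data)
    return Twist(Vector3(lx, ly, lz), Vector3(ax, ay, az))


# A planner written by one team produces a command...
command = Twist(linear=Vector3(x=0.5), angular=Vector3(z=-0.25))
wire_bytes = encode_twist(command)

# ...and a controller written by another team consumes it unchanged,
# because both sides agree on the same definition.
decoded = decode_twist(wire_bytes)
```

Because both sides depend only on the shared definition, either one can be replaced without touching the other, which is the property the standardized message packages provide across the ecosystem.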
Tooling: ROS provides powerful debugging and visualization tools. `rviz` visualizes robot state, sensors, and planned paths in 3D. `rosbag` records the messages published on selected topics (or on all of them) for later playback and offline analysis, which is invaluable for debugging: record a failure, replay it, test fixes. `rqt` provides GUI tools for monitoring topics, calling services, and tuning parameters at runtime. ROS 2 ships equivalents as `rviz2` and `ros2 bag`.
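The record-and-replay workflow can be sketched as follows. This is a toy model of the idea behind `rosbag`, not its file format or API: `ToyBag`, `Record`, and the sample messages are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Record:
    stamp: float   # time the message was captured, in seconds
    topic: str
    message: Any


class ToyBag:
    """Minimal stand-in for a message recording (hypothetical, not rosbag)."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def record(self, stamp: float, topic: str, message: Any) -> None:
        self._records.append(Record(stamp, topic, message))

    def replay(self, handlers: Dict[str, Callable[[Any], None]]) -> int:
        """Feed recorded messages to handlers in timestamp order.

        Topics without a handler are skipped. Returns the number delivered.
        """
        delivered = 0
        for rec in sorted(self._records, key=lambda r: r.stamp):
            handler = handlers.get(rec.topic)
            if handler is not None:
                handler(rec.message)
                delivered += 1
        return delivered


# During the failing run, messages are captured as they arrive.
bag = ToyBag()
bag.record(0.20, "/scan", {"min_range": 0.4})
bag.record(0.10, "/scan", {"min_range": 1.5})
bag.record(0.15, "/odom", {"speed": 0.8})

# Offline, a candidate fix is tested against exactly the same inputs.
alerts: List[float] = []


def patched_obstacle_check(scan: dict) -> None:
    if scan["min_range"] < 0.5:
        alerts.append(scan["min_range"])


delivered = bag.replay({"/scan": patched_obstacle_check})
```

Replaying the same recorded inputs makes a failure reproducible, so a fix can be checked without rerunning the robot.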
Real-Time Challenges: ROS 1 was not designed for real-time systems. Its default TCP-based transport gives no timing guarantees, message delivery is not deterministic, and garbage collection in Python or dynamic memory allocation in C++ can cause unpredictable delays. (The master is not on the data path: it handles name lookup when nodes connect, after which messages flow directly between nodes.) For soft real-time (control should usually run on time), ROS 1 is adequate. For hard real-time (control must never miss a deadline), ROS 1 is insufficient; it's common to write a small hard-real-time component in C++ that runs outside ROS, with ROS handling higher-level communication. ROS 2 improves on this: it was designed with real-time use in mind and, combined with a real-time operating system and careful memory management, can deliver much more predictable latency.
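The soft versus hard distinction can be made concrete by checking a log of control-loop start times against the loop's period. The function name `deadline_misses`, the tolerance, and the sample timestamps below are assumptions made for this sketch; in a real system the timestamps would come from the control loop itself.

```python
from typing import List, Sequence


def deadline_misses(start_times: Sequence[float],
                    period_s: float,
                    tolerance_s: float) -> List[int]:
    """Return indices of cycles that started late by more than tolerance_s.

    Cycle i is expected to start at start_times[0] + i * period_s.
    """
    if not start_times:
        return []
    t0 = start_times[0]
    return [
        i for i, t in enumerate(start_times)
        if t - (t0 + i * period_s) > tolerance_s
    ]


# A 100 Hz loop (10 ms period). The fourth cycle starts 5 ms late,
# for example because an allocation or garbage-collection pause delayed it.
starts = [0.000, 0.010, 0.0205, 0.035, 0.040]
misses = deadline_misses(starts, period_s=0.010, tolerance_s=0.001)
miss_ratio = len(misses) / len(starts)

# Soft real-time: a small miss ratio may be acceptable.
soft_ok = miss_ratio <= 0.25
# Hard real-time: any miss at all is a failure.
hard_ok = len(misses) == 0
```

With this trace, the loop passes a soft criterion that tolerates occasional late cycles but fails the hard criterion, which is the situation that pushes designers toward a separate hard-real-time component.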
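ROS 2 and the Ecosystem: ROS 2, first released in late 2017, addresses ROS 1's limitations. The major change is replacing the master and ROS 1's custom transport with DDS (Data Distribution Service), an industry-standard middleware used in aerospace and defense. DDS discovers peers without a central broker (no single point of failure) and exposes quality-of-service settings (reliability, history depth, deadlines) that let applications tune delivery for lossy links or time-critical data. Its default discovery relies on multicast, so crossing wide-area networks usually takes extra configuration. ROS 2 also improved security (ROS 1 assumes a trusted LAN), rebuilt the client libraries on a shared C core so that languages behave consistently, and replaced the build system. The tradeoff is added complexity: ROS 2 configuration and debugging are harder than in ROS 1. For deployed, production systems, the benefits are usually worth it.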
For someone learning robotics, ROS provides a practical platform for building real systems quickly. The architectural lessons — loose coupling through message passing, separation of concerns, and distributed computation — transfer to any robotics project, ROS-based or not.