Software architecture

ROS 2 architecture

The real-robot stack is a Robot Operating System 2 graph: a perception node that turns camera frames into a ball state, a controller node that turns the ball state into joint commands, and the Universal Robots firmware that closes the inner torque loop. Two controller variants share the same topic interface, so the PID and the MPC are interchangeable without touching the rest of the graph.

§ 1 Architecture overview

The real-robot stack is split into three layers. At the bottom, the Universal Robots firmware closes a high-rate joint-space torque loop and exposes the arm through the standard ros2_control interfaces; this layer is treated as a black box. Above it sits a controller node, either the PID of the control page or the MPC of the MPC page, which reads the latest ball state and joint positions at every tick and writes back a joint command. Above the controller, a vision node runs the perception pipeline of the vision page and publishes the ball state on a topic that any controller can read.

A single custom message, BallState, carries everything the controller needs from perception in one timestamped snapshot. The UR driver's robot_state_publisher maintains the transform tree from the joint states and the robot description, so the controller queries the world-to-plate rotation at every tick instead of recomputing it from forward kinematics.
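A transform read from the TF buffer carries its rotation as a quaternion. The following is a minimal sketch, not the project's code, of how such a rotation could be turned into a matrix and used to express a world-frame vector (gravity here, purely as an illustration) in the plate frame. The function names are hypothetical, the component order (x, y, z, w) follows geometry_msgs/Quaternion, and the quaternion is assumed to rotate plate-frame vectors into the world frame.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def quat_to_matrix(x: float, y: float, z: float, w: float) -> Matrix:
    """Rotation matrix of a quaternion given in (x, y, z, w) order."""
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalise so that small numerical drift does not scale the result.
    x, y, z, w = x / norm, y / norm, z / norm, w / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def world_to_plate(r_world_from_plate: Matrix, v_world: Sequence[float]) -> List[float]:
    """Express a world-frame vector in the plate frame (multiply by R transposed)."""
    return [
        sum(r_world_from_plate[row][col] * v_world[row] for row in range(3))
        for col in range(3)
    ]


# Plate tilted by 0.1 rad about the world x-axis (illustrative value).
tilt = 0.1
rotation = quat_to_matrix(math.sin(tilt / 2), 0.0, 0.0, math.cos(tilt / 2))
gravity_in_plate = world_to_plate(rotation, [0.0, 0.0, -9.81])
```

With this convention, a tilt about the x-axis leaves the x component of gravity at zero and produces an in-plane component along the plate's y-axis proportional to the sine of the tilt.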

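[Figure: physical setup. A workstation (Ubuntu, ROS 2 Humble) runs ball_tracker, ball_balance_controller or ball_balance_mpc, and RViz, and is connected to the camera over USB. The UR Control Box (PolyScope, ros2_control, scaled JTC) is connected to the workstation by a network link labelled /ball_state · /joint_states · /joint_traj, and to the UR7e arm (with plate and ball) by the industrial arm cable. Workstation and Control Box sit in one ROS 2 domain.]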
Figure 1. Physical setup. The workstation runs the vision and controller nodes, the optional visualiser, and any rosbag tooling; it sees the camera over USB and the UR Control Box over the local network. The UR Control Box runs the manufacturer's ros2_control joint-trajectory controller and drives the arm. Workstation and Control Box live on the same ROS 2 domain.

§ 2 Node graph

Each of the three layers of § 1 is implemented by one or more ROS 2 nodes. The graph contains one perception node, one of two interchangeable controller nodes, the UR driver and joint-trajectory controller, and supporting infrastructure for the transform tree and visualisation:

ball_tracker

vision node

Opens a USB camera, runs ArUco-homography for the plate frame and HSV segmentation for the ball, fuses the result through a Kalman filter, and publishes /ball_state.

ball_balance_controller · ball_balance_mpc

controller nodes (PID or MPC variant)

Two implementations of the same interface: the PID node of the PID page and the MPC node of the MPC page. Exactly one is brought up at a time; both subscribe to /ball_state and /joint_states, query the world-to-plate transform, and emit joint commands.

ball_balance_visualizer

visualisation node

Republishes the live ball state and the reference trajectory as RViz markers on /ball_balance_visualization.

scaled_joint_trajectory_controller · forward_position_controller

UR-side ros2_control plugins

The MPC uses the joint-trajectory variant; the PID uses the forward-position variant. Both close the inner torque loop with gravity compensation and emit encoder readings on /joint_states.

robot_state_publisher · static_transform_publisher

transform-tree infrastructure

Together they maintain the world $\to$ wrist $\to$ plate chain, which the controllers query for the plate-frame rotation at every tick.

[Figure: runtime ROS 2 graph. Forward path: USB camera (60 Hz) → ball_tracker (ArUco · HSV · Kalman) → controller node (PID or MPC variant) → UR firmware (ros2_control · JTC) → UR7e, with the links labelled frame, /ball_state, joint cmd, and τ. Feedback: /joint_states and /plate_cmd. Transform tree: robot_state_publisher + static_transform_publisher on /tf, /tf_static. Visualisation: /ball_state → ball_balance_visualizer → /ball_balance_visualization (markers) → RViz, an optional viewer. Legend: forward path, feedback / TF; vision, first-party control, UR driver / infrastructure.]
Figure 1. Runtime ROS 2 graph. The forward path runs along the central row (camera $\to$ vision $\to$ controller $\to$ firmware $\to$ arm). Feedback topics arc above, the transform tree sits below the controller, and the visualisation branch is at the bottom.

§ 3 Topics

The arrows in the runtime graph of § 2 correspond to the seven topic entries below (/tf and /tf_static are counted as one). Each topic has a fixed message type; six entries use standard ROS 2 messages and one, /ball_state, uses the custom message detailed in § 4.

/ball_state

ball_tracker_msgs/BallState · sensor-data QoS

tracker $\to$ controller, visualiser

Custom message bundling ball position, velocity, and tracker-health flags into a single timestamped snapshot (§ 4).

/joint_states

sensor_msgs/JointState

UR firmware $\to$ controller, robot_state_publisher

Joint encoder readings used by the rotational Jacobian in the controller and by the TF tree for the world-to-plate rotation.

/forward_position_controller/commands

std_msgs/Float64MultiArray

PID node $\to$ UR firmware

Six target joint positions per tick. The firmware closes the inner torque loop on these setpoints.

/scaled_joint_trajectory_controller/joint_trajectory

trajectory_msgs/JointTrajectory

MPC node $\to$ UR firmware

A single waypoint per tick. The firmware interpolates to the target over the trajectory time.

/plate_cmd

geometry_msgs/Vector3Stamped · optional

active controller $\to$ tracker

Commanded plate-tilt angles. When consumed by the tracker, they feed the Kalman filter's predict step as a gravity model.

/ball_balance_visualization

visualization_msgs/MarkerArray

visualiser $\to$ RViz

Plate geometry, live ball position, and reference trajectory rendered as RViz markers.

/tf, /tf_static

tf2_msgs/TFMessage

robot_state_publisher $\to$ controller

Transform-tree messages. The controller queries the buffer for the latest world $\to$ plate rotation at every tick.
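The two command topics differ only in the shape of the message that carries the six joint targets. The sketch below mirrors those two shapes with plain dictionaries so that it runs without a ROS 2 installation; the field names follow std_msgs/Float64MultiArray and trajectory_msgs/JointTrajectory, while the helper names, the joint order (the standard UR joint names), and the 50 ms trajectory time are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Standard UR joint names; the order each controller expects is an assumption
# here and depends on the ros2_control configuration.
UR_JOINT_NAMES: List[str] = [
    "shoulder_pan_joint",
    "shoulder_lift_joint",
    "elbow_joint",
    "wrist_1_joint",
    "wrist_2_joint",
    "wrist_3_joint",
]


def _check_six(q_target: Sequence[float]) -> List[float]:
    if len(q_target) != len(UR_JOINT_NAMES):
        raise ValueError("expected six joint targets")
    return [float(q) for q in q_target]


def pid_command(q_target: Sequence[float]) -> Dict:
    """Payload shaped like std_msgs/Float64MultiArray: six positions per tick."""
    return {"data": _check_six(q_target)}


def mpc_command(q_target: Sequence[float], time_from_start_s: float) -> Dict:
    """Payload shaped like trajectory_msgs/JointTrajectory with a single waypoint."""
    sec = int(time_from_start_s)
    nanosec = int(round((time_from_start_s - sec) * 1e9))
    return {
        "joint_names": list(UR_JOINT_NAMES),
        "points": [
            {
                "positions": _check_six(q_target),
                "time_from_start": {"sec": sec, "nanosec": nanosec},
            }
        ],
    }


# Illustrative joint targets in radians.
home = [0.0, -1.57, 1.57, -1.57, -1.57, 0.0]
pid_msg = pid_command(home)
mpc_msg = mpc_command(home, time_from_start_s=0.05)
```

In a real node the same values would be written into the corresponding message classes and published on /forward_position_controller/commands or /scaled_joint_trajectory_controller/joint_trajectory respectively.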

§ 4 Custom BallState message

Of the seven topics in § 3, six use stock ROS 2 message types (JointState, Float64MultiArray, JointTrajectory, Vector3Stamped, MarkerArray, and the TF messages). The seventh, /ball_state, carries a custom message. The standard geometry_msgs/Point encodes only $(x, y, z)$, which is not enough: the controller needs the ball velocity as well as the position, and it must check the tracker-health flags before acting on a stale or occluded sample. Packing all of this into a single timestamped snapshot saves the controller from having to align state, validity, and marker count across separate topics. The fields are:

header

std_msgs/Header

Frame id plate_center (position and velocity live in the plate frame) and the publish timestamp; used by subscribers for staleness checks.

x, y

ball position · float64 · metres

Ball-centre coordinates in $\{P\}$.

vx, vy

ball velocity · float64 · m/s

Time derivative of the ball position, output of the Kalman filter.

ball_found

bool · detection flag

True when the current frame contains a fresh detection; false during Kalman coast (predict-only).

tracking_valid

bool · safety gate

True when the filter is producing a usable estimate (either fresh detection or within the coast window). The controllers gate their action on this flag.

markers_found

int32 · 0–4

Number of ArUco markers visible. The plate homography is only valid when all four are seen; the controller falls back to home recovery when this count drops below the safety threshold.

pixel_u, pixel_v

float64 · debug only

Raw pixel coordinates of the detection. Used by overlays only; the controller never reads them.
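The way a subscriber might combine these fields can be sketched in plain Python. The dataclass below mirrors the message fields listed above; the gate function, the Action names, the order of the checks, and both thresholds (a 0.1 s maximum age and a minimum of four markers) are hypothetical values chosen for illustration, not the controllers' actual settings.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACT = "act"                      # run the control law on this sample
    SKIP = "skip"                    # sample is stale or unusable
    HOME_RECOVERY = "home_recovery"  # too few markers for a valid plate frame


@dataclass(frozen=True)
class BallState:
    """Plain-Python mirror of the BallState fields (not the generated ROS class)."""
    stamp_s: float            # header timestamp, in seconds
    x: float                  # ball position in the plate frame, metres
    y: float
    vx: float                 # ball velocity, m/s
    vy: float
    ball_found: bool          # fresh detection in this frame
    tracking_valid: bool      # estimate usable (detection or coast window)
    markers_found: int        # visible ArUco markers, 0 to 4
    pixel_u: float = 0.0      # debug only
    pixel_v: float = 0.0
    frame_id: str = "plate_center"


def gate(state: BallState, now_s: float,
         max_age_s: float = 0.1, min_markers: int = 4) -> Action:
    """Decide what to do with a sample; thresholds are illustrative assumptions."""
    if state.markers_found < min_markers:
        return Action.HOME_RECOVERY
    if now_s - state.stamp_s > max_age_s:
        return Action.SKIP
    if not state.tracking_valid:
        return Action.SKIP
    return Action.ACT


sample = BallState(stamp_s=10.0, x=0.01, y=-0.02, vx=0.0, vy=0.0,
                   ball_found=False, tracking_valid=True, markers_found=4)
decision = gate(sample, now_s=10.02)
```

Note that the sample above is a coasting estimate (ball_found is false) that still passes the gate, because the decision rests on tracking_valid rather than on the detection flag.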

§ 5 One control cycle

Putting the nodes, the topics, and the message together, a single closed-loop iteration runs through six stages, clockwise around the loop:

1. Frame in

The vision node reads the next frame from the USB camera.

2. BallState out

Perception (ArUco + HSV + Kalman) runs; the resulting ball state is published on /ball_state.

3. Controller wakes

The active controller reads the new /ball_state, the latest /joint_states, and the world-to-plate transform from the TF buffer.

4. Joint command out

The five-stage pipeline of the control page runs and emits a joint command to the UR firmware.

5. Torque loop

The UR-side controller interpolates between waypoints and runs an internal joint-space PD controller with gravity compensation; encoders publish back on /joint_states.

6. Plate-tilt feedforward

The controller publishes the commanded plate tilt on /plate_cmd, which the tracker uses as a gravity model in the next Kalman predict step.

After step 6 the next frame arrives and the cycle starts over from step 1.
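Step 6 can be illustrated with a one-axis Kalman predict step in which the commanded tilt enters as a known acceleration. This is a sketch under stated assumptions rather than the tracker's implementation: the ball is modelled as a solid sphere rolling without slipping, so its acceleration is (5/7) g sin(tilt); the sign convention, the process-noise value, and all names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List

G = 9.81                    # gravitational acceleration, m/s^2
ROLLING_FACTOR = 5.0 / 7.0  # solid sphere rolling without slipping (assumption)


@dataclass
class AxisEstimate:
    p: float               # position along one plate axis, metres
    v: float               # velocity along that axis, m/s
    P: List[List[float]]   # 2x2 covariance of (p, v)


def tilt_to_accel(tilt_rad: float) -> float:
    """Ball acceleration for a given plate tilt; positive tilt is assumed
    to accelerate the ball in the positive axis direction."""
    return ROLLING_FACTOR * G * math.sin(tilt_rad)


def predict(est: AxisEstimate, tilt_rad: float, dt: float,
            accel_noise_std: float = 0.5) -> AxisEstimate:
    """Constant-acceleration predict step with the tilt as control input."""
    a = tilt_to_accel(tilt_rad)
    p = est.p + est.v * dt + 0.5 * a * dt * dt
    v = est.v + a * dt

    # Covariance propagation P' = F P F^T + Q with F = [[1, dt], [0, 1]].
    (pp, pv), (vp, vv) = est.P
    fp00, fp01 = pp + dt * vp, pv + dt * vv   # first row of F P
    fp10, fp11 = vp, vv                       # second row of F P
    n00, n01 = fp00 + fp01 * dt, fp01
    n10, n11 = fp10 + fp11 * dt, fp11

    # Q from white acceleration noise acting through (0.5 dt^2, dt).
    q = accel_noise_std ** 2
    g0, g1 = 0.5 * dt * dt, dt
    cov = [[n00 + g0 * g0 * q, n01 + g0 * g1 * q],
           [n10 + g1 * g0 * q, n11 + g1 * g1 * q]]
    return AxisEstimate(p, v, cov)


# One 60 Hz frame interval with a 0.05 rad commanded tilt (illustrative values).
start = AxisEstimate(p=0.0, v=0.0, P=[[0.01, 0.0], [0.0, 0.01]])
after = predict(start, tilt_rad=0.05, dt=1.0 / 60.0)
```

Feeding the commanded tilt into the predict step lets the estimate anticipate the ball's motion during coast, when no fresh detection is available to correct it.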

§ 6 Launch composition

The graph described above is brought up at runtime by two independent launch files: one for the vision pipeline, one for the controller stack. The split allows the tracker to run on the perception machine and the controller to run on the machine connected to the UR network, with the two joined only by the ROS 2 domain. Choosing the PID or the MPC variant amounts to picking which controller launch file is started; the tracker launch is unchanged.

[Figure: launch composition. tracker_launch.py (vision pipeline) starts ball_tracker with tracker_params.yaml; it publishes /ball_state and optionally subscribes to /plate_cmd. balance.launch.py (PID variant), or mpc_balance.launch.py for MPC, starts static_transform_publisher (wrist → plate adapter), the controller node, and, conditional on the rviz argument, ball_balance_visualizer and rviz2; it publishes /forward_position_controller/commands and /plate_cmd. The launches communicate via the ROS 2 graph; /ball_state and /plate_cmd bridge the two.]
Figure 2. Launch composition. The vision pipeline and the controller stack are brought up by separate launch files; they share data via the ROS 2 graph rather than a direct interface. Choosing the PID or MPC controller is a matter of which controller launch is used; the tracker is unchanged.
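The variant choice can be summarised as a small lookup table. The sketch below only collects names that appear on this page (launch files, command topics, and UR-side controllers); the ControllerVariant class and the launch_plan helper are hypothetical, and package names are deliberately omitted because the page does not state them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ControllerVariant:
    launch_file: str      # controller-stack launch file
    command_topic: str    # topic the controller node publishes joint commands on
    ur_controller: str    # UR-side ros2_control plugin that consumes them


VARIANTS: Dict[str, ControllerVariant] = {
    "pid": ControllerVariant(
        launch_file="balance.launch.py",
        command_topic="/forward_position_controller/commands",
        ur_controller="forward_position_controller",
    ),
    "mpc": ControllerVariant(
        launch_file="mpc_balance.launch.py",
        command_topic="/scaled_joint_trajectory_controller/joint_trajectory",
        ur_controller="scaled_joint_trajectory_controller",
    ),
}

TRACKER_LAUNCH = "tracker_launch.py"  # identical for both variants


def launch_plan(variant: str) -> List[str]:
    """Launch files to start for a variant: the tracker plus one controller stack."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(VARIANTS)}")
    return [TRACKER_LAUNCH, VARIANTS[variant].launch_file]


plan = launch_plan("mpc")
```

Either plan starts the same tracker launch; only the second entry, and with it the command topic and the UR-side controller, changes.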

§ 7 Notation and acronyms

ROS 2
Robot Operating System 2.
QoS
Quality of Service profile (reliability and history settings of a topic).
TF
The ROS transform library, which maintains a tree of coordinate frames and the transforms between them over time.
URDF
Unified Robot Description Format, the XML description of the arm geometry.
UR
Universal Robots, the manufacturer of the arm.
scaled JTC
scaled joint-trajectory controller, the UR-side ros2_control plugin.
