Software architecture
The real-robot stack is a Robot Operating System 2 (ROS 2) graph: a perception node that turns camera frames into a ball state, a controller node that turns the ball state into joint commands, and the Universal Robots firmware that closes the inner torque loop. The two controller variants share the same topic interface, so the PID and the MPC can be swapped without touching the rest of the graph.
The real-robot stack is split into three layers. At the bottom, the
Universal Robots firmware closes a high-rate joint-space torque
loop and exposes the arm through the standard ros2_control
interfaces; this layer is treated as a black box. Above it sits a
controller node, either the PID of the
control page or the MPC of the
MPC page, which reads the latest ball state and
joint positions at every tick and writes back a joint command. Above
the controller, a vision node runs the perception pipeline of the
vision page and publishes the ball state on a
topic that any controller can read.
A single custom message, BallState, carries everything the
controller needs from perception in one timestamped snapshot. The UR
driver's robot_state_publisher maintains the transform
tree from the joint states and the robot description, so the
controller queries the world-to-plate rotation at every tick instead
of recomputing it from forward kinematics.
ros2_control joint-trajectory controller and drives the arm. Workstation and Control Box live on the same ROS 2 domain.
Each of the three layers of § 1 is implemented by one or more ROS 2 nodes. The graph contains one perception node, one of two interchangeable controller nodes, the UR driver and joint-trajectory controller, and supporting infrastructure for the transform tree and visualisation:
vision node
Opens a USB camera, runs ArUco-homography for the plate frame and HSV segmentation for the ball, fuses the result through a Kalman filter, and publishes /ball_state.
controller nodes (PID or MPC variant)
Two implementations of the same interface: the PID node of the PID page and the MPC node of the MPC page. Exactly one is brought up at a time; both subscribe to /ball_state and /joint_states, query the world-to-plate transform, and emit joint commands.
visualisation node
Republishes the live ball state and the reference trajectory as RViz markers on /ball_balance_visualization.
UR-side ros2_control plugins
The MPC uses the joint-trajectory variant; the PID uses the forward-position variant. Both close the inner torque loop with gravity compensation and emit encoder readings on /joint_states.
transform-tree infrastructure
The robot_state_publisher and the rest of the transform-tree infrastructure together maintain the world $\to$ wrist $\to$ plate chain, which the controllers query for the plate-frame rotation at every tick.
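The interchangeability of the two controller nodes can be illustrated with a small sketch. It is not the project's code: the class names, the `step` signature, and the `make_controller` helper are hypothetical, and both control laws are replaced by a placeholder that holds the current pose. Only the two command message types are taken from the topic list.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class BallController(ABC):
    """Hypothetical common interface; not the project's actual class."""

    # ROS 2 message type of the emitted joint command.
    command_type: str

    @abstractmethod
    def step(self, ball_pos: Sequence[float], ball_vel: Sequence[float],
             joints: Sequence[float]) -> List[float]:
        """Map the current ball state and joint positions to joint targets."""


class PidController(BallController):
    command_type = "std_msgs/Float64MultiArray"

    def step(self, ball_pos, ball_vel, joints):
        # Placeholder only: hold the current pose instead of running the PID.
        return list(joints)


class MpcController(BallController):
    command_type = "trajectory_msgs/JointTrajectory"

    def step(self, ball_pos, ball_vel, joints):
        # Placeholder only: hold the current pose instead of solving the MPC.
        return list(joints)


def make_controller(variant: str) -> BallController:
    """Instantiate exactly one variant, mirroring the one-at-a-time rule."""
    variants = {"pid": PidController, "mpc": MpcController}
    if variant not in variants:
        raise ValueError(f"unknown controller variant: {variant!r}")
    return variants[variant]()
```

Because the caller only depends on `step`, the rest of the loop stays the same whichever variant `make_controller` returns; only the outgoing command message differs.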
The arrows in Figure 1 correspond to seven named topics. Each topic has a fixed message type; the controller and the tracker use a mix of standard ROS 2 messages and one custom message detailed in § 4.
ball_tracker_msgs/BallState · sensor-data QoS
tracker $\to$ controller, visualiser
Custom message bundling ball position, velocity, and tracker-health flags into a single timestamped snapshot (§ 4).
sensor_msgs/JointState
UR firmware $\to$ controller, robot_state_publisher
Joint encoder readings used by the rotational Jacobian in the controller and by the TF tree for the world-to-plate rotation.
std_msgs/Float64MultiArray
PID node $\to$ UR firmware
Six target joint positions per tick. The firmware closes the inner torque loop on these setpoints.
trajectory_msgs/JointTrajectory
MPC node $\to$ UR firmware
A single waypoint per tick. The firmware interpolates to the target over the trajectory time.
geometry_msgs/Vector3Stamped · optional
active controller $\to$ tracker
Commanded plate-tilt angles. When consumed by the tracker, they feed the Kalman filter's predict step as a gravity model.
visualization_msgs/MarkerArray
visualiser $\to$ RViz
Plate geometry, live ball position, and reference trajectory rendered as RViz markers.
tf2_msgs/TFMessage
robot_state_publisher $\to$ controller
Transform-tree messages. The controller queries the buffer for the latest world $\to$ plate rotation at every tick.
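A transform looked up from the TF buffer carries its rotation as a unit quaternion with fields x, y, z, w. The sketch below shows one way a controller might turn that quaternion into a rotation matrix and a plate tilt angle. The function names are hypothetical, and the TF lookup itself is left out because it needs a running ROS 2 graph.

```python
import math
from typing import List


def quat_to_matrix(x: float, y: float, z: float, w: float) -> List[List[float]]:
    """Rotation matrix of the quaternion (x, y, z, w), normalised first."""
    n = math.sqrt(x * x + y * y + z * z + w * w)
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    x, y, z, w = x / n, y / n, z / n, w / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def plate_tilt(rotation: List[List[float]]) -> float:
    """Angle in radians between the plate normal and the world z axis."""
    # rotation[2][2] is the cosine of the angle between the two z axes.
    # It is identical for a rotation and its inverse, so the result does
    # not depend on the direction in which the transform was looked up.
    cos_tilt = max(-1.0, min(1.0, rotation[2][2]))
    return math.acos(cos_tilt)
```

For example, a quaternion describing a 0.1 rad rotation about the x axis gives `plate_tilt(...)` of approximately 0.1.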
BallState message
Of the seven topics in § 3, six use stock ROS 2 message types
(JointState, Float64MultiArray, JointTrajectory,
Vector3Stamped, MarkerArray, and the TF
messages). The seventh, /ball_state, carries a custom
message: the standard geometry_msgs/Point only encodes
$(x, y, z)$, which is not enough, because the controller needs the
ball velocity as well as the position, and it must gate on
tracker-health flags before acting on a stale or occluded sample.
Packing all of this into a single timestamped snapshot keeps the
controller from having to align state, validity, and marker count
from separate topics. The fields are:
std_msgs/Header
Frame id plate_center (position and velocity live in the plate frame) and the publish timestamp; used by subscribers for staleness checks.
ball position · float64 · metres
Ball-centre coordinates in $\{P\}$.
ball velocity · float64 · m/s
Time derivative of the ball position, output of the Kalman filter.
bool · detection flag
True when the current frame contains a fresh detection; false during Kalman coast (predict-only).
bool · safety gate
True when the filter is producing a usable estimate (either fresh detection or within the coast window). The controllers gate their action on this flag.
int32 · 0–4
Number of ArUco markers visible. The plate homography is only valid when all four are seen; the controller falls back to home recovery when this count drops below the safety threshold.
float64 · debug only
Raw pixel coordinates of the detection. Used by overlays only; the controller never reads them.
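The field list above can be mirrored as a plain Python record to show how a controller might gate on it. This is a sketch under stated assumptions, not the actual message definition: every field name is hypothetical, the position is assumed to be two-dimensional, the thresholds are caller-supplied, and the "skip" outcome for a stale or invalid sample is an assumption about what the controller does in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BallState:
    """Illustrative stand-in for the custom message; field names are assumed."""
    stamp: float            # publish time in seconds (from the header)
    frame_id: str           # "plate_center" per the header description
    x: float                # ball position in the plate frame, metres
    y: float
    vx: float               # ball velocity from the Kalman filter, m/s
    vy: float
    detected: bool          # fresh detection in the current frame
    valid: bool             # safety gate: estimate is usable
    markers_visible: int    # ArUco markers seen, 0 to 4
    pixel_u: float = 0.0    # debug only, never read by the controller
    pixel_v: float = 0.0


def gate(state: BallState, now: float, max_age_s: float, min_markers: int) -> str:
    """Decide what to do with one sample: 'act', 'skip', or 'home_recovery'."""
    if state.markers_visible < min_markers:
        # Too few markers for a trustworthy plate homography.
        return "home_recovery"
    if not state.valid or (now - state.stamp) > max_age_s:
        # Unusable or stale estimate (assumed behaviour: do not act on it).
        return "skip"
    return "act"
```

A sample published 20 ms ago with all four markers visible and the validity flag set returns "act"; the same sample read half a second later with `max_age_s=0.1` returns "skip".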
Putting the nodes, the topics, and the message together, a single closed-loop iteration runs through six stages, clockwise around the loop:
1. The vision node reads the next frame from the USB camera.
2. Perception (ArUco + HSV + Kalman) runs; the resulting ball state is published on /ball_state.
3. The active controller reads the new /ball_state, the latest /joint_states, and the world-to-plate transform from the TF buffer.
4. The five-stage pipeline of the control page runs and emits a joint command to the UR firmware.
5. The UR-side controller interpolates between waypoints and runs an internal joint-space PD controller with gravity compensation; encoders publish back on /joint_states.
6. The controller publishes the commanded plate tilt on /plate_cmd, which the tracker uses as a gravity model in the next Kalman predict step.
After step 6 the next frame arrives and the cycle starts over from step 1.
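The role of the commanded tilt in the Kalman predict step can be sketched for a single plate axis. The model below is an assumption, not the tracker's actual filter: it treats the ball as a solid sphere rolling without slipping, so its acceleration is $\tfrac{5}{7} g \sin\theta$, takes positive tilt to accelerate the ball in the positive direction, and uses a white-noise-acceleration process model. The function and parameter names are hypothetical.

```python
import math
from typing import List, Tuple

G = 9.81                    # gravitational acceleration, m/s^2
ROLLING_FACTOR = 5.0 / 7.0  # solid sphere rolling without slipping (assumed)


def predict(p: float, v: float, cov: List[List[float]], tilt: float,
            dt: float, q_accel: float) -> Tuple[float, float, List[List[float]]]:
    """One predict step for the state [position, velocity] on one axis.

    tilt is the commanded plate angle in radians and enters as a known
    control input; q_accel is the acceleration noise variance.
    """
    accel = ROLLING_FACTOR * G * math.sin(tilt)

    # Constant-acceleration kinematics over one time step.
    p_new = p + v * dt + 0.5 * accel * dt * dt
    v_new = v + accel * dt

    # Covariance: F P F^T + Q with F = [[1, dt], [0, 1]].
    (a, b), (c, d) = cov
    cov_new = [
        [a + dt * (b + c) + dt * dt * d + q_accel * dt ** 4 / 4,
         b + dt * d + q_accel * dt ** 3 / 2],
        [c + dt * d + q_accel * dt ** 3 / 2,
         d + q_accel * dt ** 2],
    ]
    return p_new, v_new, cov_new
```

With zero tilt the step reduces to a constant-velocity prediction; with a non-zero tilt the filter can anticipate the ball's acceleration during a coast, before the camera observes it.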
The graph described above is brought up at runtime by two independent launch files: one for the vision pipeline, one for the controller stack. The split allows the tracker to run on the perception machine and the controller to run on the machine that has the UR network, joined only by the ROS 2 domain. Choosing the PID or the MPC variant amounts to picking which controller launch is started; the tracker launch is unchanged.