29 Jul 2020
I’ve been using ROS2 quite a bit over the past several months. As
I’ve previously mentioned, it would appear there aren’t too many real robots running
ROS2 yet. We have a bit of a chicken-and-egg problem where the tools are not yet
fully ready for real robots, but until people start using ROS2 on real robots nobody
knows the real pain points.
There are many, many things that could be done in ROS2. But there is limited time to
implement them all - so we need to focus on those that enable robots and their
developers to “survive”.
I often get asked if ROS2 is ready for prime time. My answer for
a long time was “no”. I’m going to upgrade it to “maybe, depends on what you’re
doing” at this point. This post describes five things that I think would make the
answer “hell yes” for most roboticists. I actually hope this post ages poorly and
that all these things come to happen in ROS2.
Automatic QoS for RVIZ2 and ros2cli
Quality of Service (QoS) is probably the biggest change between ROS1 and ROS2 - it’s also
the one that causes the most headaches from what I can tell.
The ROS2 Foxy release adds a
--verbose option to the
ros2 topic info command which is a huge
step in the right direction. This lets you quickly diagnose when a publisher and subscriber
are using incompatible QoS.
rosbag2 got a huge upgrade in ROS2 Foxy:
it automatically determines the proper settings for Quality of Service (QoS) so that it always
connects to the publisher you’re trying to record (note: if multiple publishers are publishing
to the same topic with different QoS it may not work - but really, who does that?).
Now we need that feature in RVIZ2 and the command line utilities (CLI).
These are debugging tools, so they need to be able to “just work” in most scenarios.
Since most of the time you’re using RVIZ2 to connect to sensor data, which is often published
with a non-default QoS (the sensor data profile), it’s absolutely bonkers that RVIZ uses
the default QoS on everything
(which is incompatible with the sensor data profile).
Even something as simple as latched topics
won’t work by default.
This is not an easy ask. It will involve significant changes to RVIZ as well as changes to
lower level packages like
message_filters, but I’m pretty sure this is the
single biggest bang-for-your-buck improvement that will make ROS2 work better for robot
developers.
Documentation
Ok, I’m sounding like a broken record (or the squeaky caster on your 8 year old mobile manipulator),
but this is really important.
I’m not just talking about the lack of tutorials here. One of the things that made ROS great
for new developers in the 2011-2014 era (when it experienced huge growth in the community), was
a very polished and up-to-date wiki. If you wanted to find out about a package, you could go to
wiki.ros.org/package_name - and the documentation was right there (or if it wasn’t, you had a
pretty good idea this package wasn’t ready for prime time). With ROS2, we don’t have a centralized
place for documentation yet - and I think that is holding the community growth back.
There is also the issue of “user documentation”. Nearly everything for ROS2 is written assuming
an expert programming background (even more so than ROS1 documentation). Reading the source
code is not how you’re supposed to learn how to run a ROS driver for a laser scanner.
Building out a community is super important.
The best way to get a bug fixed is to find a developer who needs it fixed. I’ve only been using
ROS2 on-and-off for a couple months - and in that time I’ve fixed half a dozen bugs across multiple
ROS2 packages, and even taken on maintaining the ROS2 port of
urg_node and the related packages.
Subscriber Connect Callbacks
Now we’ll jump into a super technical issue - but the impact is huge - especially for those
doing perception (which is, you know, generally a big part of robotics).
When creating a publisher in ROS1, you could register a callback which would get called
whenever a subscriber connected or disconnected. This feature doesn’t yet exist in ROS2,
but I think it is essential for real robotics systems. Here’s why:
Robots can generate lots of sensor data - especially when you add processing pipelines
into the mix. Sometimes you might need a high-resolution point cloud with color and depth
information. Sometimes you need a low-res colorless point cloud. This is especially true
when the robot system does multiple tasks. For instance, imagine a robot that is both
mobile and a manipulator - for navigating the environment it wants that high frame
rate, low-res point cloud for collision avoidance. When the mobile manipulator gets
to the destination it wants to switch to a high-res point cloud to decide what to grab.
Sometimes you literally cannot be publishing all the data streams possible because
it would overwhelm the hardware (for instance, saturating the USB bus if you were to
pull depth and color and IR from most RGBD sensors at the same time).
In ROS1, you could create “lazy publishers” so that the creators of these intensive
data types would only create and publish the data when someone was listening. They
would be alerted to someone listening by the connect callback. The lack of lazy
publishers throughout various drivers and the
depth_image_proc packages is a real challenge to building high performance
perception systems. When people ask me “is ROS2 ready?”, my first question these days is
“how much perception/vision are you doing?”.
To be clear, there are workarounds available in some cases. If you’re creating a
publisher yourself, you can:
- Create a loop that “polls” whether there are subscribers (using get_subscription_count),
as I did in the openni2_camera port.
- Use parameters to dynamically reconfigure what is running. While this might
work in some cases (and maybe even be a preferred solution for some use cases),
it likely leads to a more brittle system.
- Re-architect your system to never need lazy publishers by hard coding exactly
what you need for a given robot. While some of this is likely to happen in
a more production environment, it doesn’t lend itself to code reuse and sharing
which was one of the major selling points of ROS1.
Note that I said, “if you’re creating a publisher yourself”. There are lots of
packages that are widely relied on in ROS1 whose ROS2 ports are crippled or broken
due to the lack of subscriber connect callbacks.
Note: in the month that I’ve been writing this post, a number of questions have been answered, so we’re already getting there!
I remember folks joking that ROS Answers was misnamed, because there were no answers there,
just questions. It’s actually not true.
There are a lot of really good questions there. Like, stuff that’s not anywhere in the documentation
and is probably quite relevant to a large number of users. Here are a few examples:
ROS2 developers, please take note: we’ve got lots of great features in this system,
please help your users learn how to actually use them - maybe they’ll even help
answer some of those questions.
Your Robot on ROS2
There’s probably a bunch of other bugs/issues/etc hiding in the weeds. Your robot is probably
not exactly the same as mine - and your use cases are going to be different. We need more robots
running ROS2 to dig into things. The good news is: you can install ROS1 and ROS2 on the same system
and switch back and forth pretty easily.
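Assuming both were installed from packages (the paths and distro names below are assumptions), switching is just a matter of which setup file you source in a given shell:

```shell
# Terminal A: ROS1
source /opt/ros/noetic/setup.bash
roscore

# Terminal B: ROS2 (don't source both setups in the same shell)
source /opt/ros/foxy/setup.bash
ros2 topic list
```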
16 Jun 2020
In my last post I finished porting the
robot_controllers package to ROS2. The last piece of hardware to
support the UBR-1 in ROS2 was the head camera, and it turned out to be the most challenging.
Note: I’ve created a new
tag from which you can find all my ROS2 posts.
The head camera in the UBR-1 is a vintage 2013-era Primesense Carmine. Getting it
to run in ROS2 meant porting the openni2_camera package.
In ROS1 the camera driver was implemented as a
nodelet for efficiency when pairing
it with components from image_pipeline.
Nodelets have been replaced by
components in ROS2. The setup for a
ROS2 component is actually quite simple and this is a part of ROS2 that is
very well documented.
An aside: most components are derived from
rclcpp::Node. A number of
calls in the
rclcpp API require that you pass an rclcpp::Node::SharedPtr.
Node derives from std::enable_shared_from_this,
so you can call
shared_from_this() to get the pointer. BUT - beware,
this only works if the instance was created as a shared pointer (not a problem
when using the rclcpp ComponentManager to load your component), AND after the object is fully constructed.
I spent a good 20-30 minutes trying to figure out why my camera driver kept throwing
bad_weak_ptr - even reading through the code for
ComponentManager - before realizing it was happening in a function being called from the constructor.
The next challenge in porting openni2_camera is that ROS2 doesn’t yet support
subscriber connect callbacks. These are frequently used in ROS1 to implement lazy publishers.
Basically, when creating a publisher you would register a callback which gets called
whenever someone subscribes or unsubscribes from your topic. This way you can
decide whether to actually do heavy work - or in the case of a driver, possibly
start or stop data transmission from the hardware.
In the case of openni2_camera - it is actually extremely important to lazy publish. There are multiple
camera streams, but you cannot actually stream them all at once so you NEED lazy
publishers. I ended up working around this by creating a timer which calls what would
be the subscriber connect callback every second. It’s a bit of overhead, but way less than
publishing lots of unneeded data. Along the way, I found out that counting subscribers
by topic name in rclcpp doesn’t respect remapping, and
so the only way to properly get the subscriber count is to call
get_subscription_count on the publisher. Since the openni2_camera uses
image_transport, this meant tracking down that
getNumSubscribers on the camera publisher class wasn’t implemented
yet (I think the port predates get_subscription_count being added). If you want to try
out the openni2_camera package in ROS2, for now you’ll need to use this
PR that I created (but hasn’t been merged yet).
The openni2_camera driver only outputs the raw RGB camera and raw depth camera streams. To get
sensor_msgs/PointCloud2, we need to construct pipelines using
the components of
depth_image_proc. In ROS1, these were implemented as nodelets.
At some point they were ported to ROS2 as components, but not quite correctly (or maybe they
were correct when they were ported). A pair of PRs
got them working again as components, and got the parameters working.
The next step was to setup a launch file that loaded the camera driver, some components
to rectify the RGB and Depth images, and a component to create a point cloud. In ROS1, this
existed as the packages rgbd_launch and
openni2_launch. Due to
the lack of lazy publishers and subscribers, you really can’t build the same lazy graph
that those packages defined right now. So I built a minimal graph of sensor processing
that I needed in a single launch file.
With this prepared, I could run the launch file and
ros2 topic hz /head_camera/depth_registered/points showed it was publishing!
Let’s back up and talk about that
depth_image_proc parameter PR for a moment.
Starting with ROS2 Dashing, parameters needed
to be declared. If you try to access an undeclared parameter, you get an error, unless you use
get_parameter_or - which won’t throw an error. Unfortunately, that means code that used
get_parameter_or and was written pre-Dashing might just not actually load parameters correctly.
One of the big benefits of declaring all parameters is that you can ask a running node for
all of its parameters (e.g. with the ros2 param command line tool). This is actually huge - as a maintainer of navigation in ROS1, I have
had more than a handful of occasions when I learned of a parameter that was undocumented but widely
used, and which I didn’t even know existed. ROS2 might lack documentation for things - but the
introspection is really nice and can be a work around for missing documentation.
The next step is clearly to look at the point cloud in RVIZ2. Should be super easy, right?
Not so fast. In ROS1, any subscriber could connect to a publisher of the same topic message type.
ROS2 introduces Quality of Service
settings, and some combinations are incompatible. RVIZ connected to the topic, but didn’t show any data.
In ROS2 Foxy, you can run
ros2 topic info topic_name --verbose and it will print
the QoS information for each publisher and subscriber. This is super helpful - but I unfortunately
didn’t find that until a few days after I was debugging this. As it turns out, depth_image_proc
components publish “best effort” but RVIZ subscribes with “reliable” and this is incompatible.
In Foxy, rosbag is apparently able to deduce the publisher QoS and connect correctly,
I’m planning to look into how that was done and see if I can make RVIZ work a bit better (or at least add
a drop down that lets us select the transport). It appears this will also require changes in
message_filters package, so it might take some time.
While we’re talking about RVIZ2 challenges, I’m also going to note this PR
which works around the fact that RVIZ2 is apparently entirely single threaded and the single thread
runs everything in the same update loop. Keep that in mind if things aren’t visualizing right,
it may not be your code.
Finally, I got the QoS correct (by hacking depth_image_proc, since it compiles faster) and
had my point cloud:
Just kidding, I still had to find out that my timestamps were wrong, which
was filtering out all my messages. Once I fixed that in the openni2_camera package, I got the
view in RVIZ2.
For anyone wanting to try out the openni2_camera package in ROS2, there is more detailed information
in the ticket on the openni2_camera issue tracker.
I’m continuing to clean up issues in the packages that I’m porting, but my next big step is getting
navigation setup on the UBR-1 in ROS2. That post will probably come a little slower than these
last few have. I’m also working on porting the
robot to ROS2.
11 Jun 2020
In my last post I ported robot_controllers_msgs
to ROS2. I’ve now ported the actual controllers and made a number of changes to the
Controller and ControllerManager classes. I can now move all the UBR-1
joints, which makes it almost as happy as this:
Image from Melonee Wise
Porting Gravity Compensation
Porting the gravity compensation controller was quite easy - it has very little in the
way of a ROS interface. The full commit is on GitHub.
As I mentioned in my last post, the way you declare plugins has changed a bit. In ROS1,
you would have an
export directive in the package.xml that noted the location
of the .xml file describing your plugins. In ROS2, this has moved to the CMake:
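The elided CMake looks something like the following via pluginlib's macro (the package and file names here are hypothetical):

```cmake
# In CMakeLists.txt, instead of an <export> tag in package.xml:
find_package(pluginlib REQUIRED)
pluginlib_export_plugin_description_file(my_base_package my_plugins.xml)
```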
The one interesting quirk I did come across was that my driver node uses the
“robot_description” parameter, and so do most controllers. You can’t declare the same
parameter twice, so I ended up with this in the controllers:
I will also note one piece of code I use quite frequently, but haven’t really seen
used in the tutorials or demos. When declaring a parameter, you can get the value
right then (very similar to how parameters in ROS1 were often used):
std::string root = node->declare_parameter<std::string>("root", "torso_lift_link");
Many controllers make extensive use of ROS actions. While Actions exist in ROS2, there
isn’t a direct replacement for the
SimpleActionServer and the API has changed
quite a bit.
There are several tutorials on ROS2 actions,
but they are limited to Python. For C++, you have to look at the examples.
In ROS1 each of the action-based controllers in robot_controllers had an execute thread
for the action server and the main thread would call update() periodically. With the new
APIs things are a bit different. The examples show spinning up a new thread for the execute
callback, but I ended up refactoring things to use the existing update() function to do
more of the work and spun up a rclcpp::Timer to publish feedback outside of update().
We’ll walk through the
FollowJointTrajectoryController as an example. In the
handle_goal callback I do
some minimal checking of the trajectory goal, but usually accept the goal:
if (goal_handle->trajectory.joint_names.size() != joints_.size()) {
  RCLCPP_ERROR(rclcpp::get_logger("follow_joint_trajectory"), "Trajectory goal size does not match controlled joints size.");
  return rclcpp_action::GoalResponse::REJECT;
}
The handle_accepted callback then sets us up to use the goal. Once a goal is
fully accepted at the end of the callback, I save a shared_ptr to the goal handle, which
is needed to eventually call
abort. The first step
in the callback is therefore to check if there is an
active_goal_, and preempt it:
bool preempted = false;
if (active_goal_) {
  auto result = std::make_shared<control_msgs::action::FollowJointTrajectory::Result>();
  result->error_code = -6;
  result->error_string = "preempted";
  active_goal_->abort(result);
  preempted = true;
}
There is not currently a “preempted” state like in ROS1 (issue),
so we just call abort. The
preempted flag here gets used later on to splice the new
trajectory into the old one and use the current velocity of the joints.
Then we can grab the actual goal message, and process it just like we would in a ROS1 action callback:
const auto goal = goal_handle->get_goal();
// process goal->trajectory so
// update() can use it
sampler_ = ...
// at the end of the callback
active_goal_ = goal_handle;
Then the update() function, which is called at some fixed rate, processes the goal
and eventually marks it as succeeded or aborted:
// Is trajectory active?
if (active_goal_ && sampler_)
{
  // Sample and command joints
  // If we've violated the tolerances, abort the goal
  // If we've completed the trajectory, mark it succeeded
}
In ROS1, services were based on XML-RPC. Calling a service was synchronous - if the service
provider died while processing your call then your client would hang. While there are work
arounds (such as creating a thread for your service client), APIs should be easy to use
correctly and hard to use incorrectly. For this reason, I often used actions in ROS1
where a service would have otherwise sufficed.
With ROS2, services can be called asynchronously. The
ControllerManager in robot_controllers previously exposed an action-based API to start, stop, and load controllers. For ROS2, I converted this to a service. You can
see the commit
which does this conversion, as well as porting all of the Python scripts that interact
with the service. This greatly simplifies the code - converting three callbacks into a
single one with a significantly cleaner function signature.
ROS2 services are well documented, with tutorials for both C++ and Python, so
I won’t go into any more detail here.
The TF library was one of the original hits in ROS1, making it easy to track where all your
robot bits are located and transform data between different coordinate frames. ROS2 includes
the TF2 library. The documentation on how to use it is currently quite sparse, at some point
I’m planning to add to the tutorial, but in
the meantime, here’s a quick highlight (in C++):
As with later ROS1 releases, you’ll want to depend on the
tf2_ros package, and
likely you want to depend on the conversion
packages (such as tf2_geometry_msgs) if you are converting those data types. Similarly, you still need a Buffer and a
TransformListener.
The big change from ROS1 is the buffer now takes a clock in the constructor, so you need
access to an rclcpp::Clock (typically from node->get_clock()).
From that point, it is pretty much just like ROS1 to transform things (see a real example):
geometry_msgs::msg::PointStamped in = <blah>;
geometry_msgs::msg::PointStamped out;
try {
  tf_buffer->transform(in, out, "target_frame");
} catch (const tf2::TransformException& ex) {
  RCLCPP_ERROR(node->get_logger(), "Could not transform point.");
}
The broadcaster works pretty much identically to ROS1, other than the message renaming
and the constructor taking an rclcpp::Node::SharedPtr:
broadcaster = std::make_unique<tf2_ros::TransformBroadcaster>(nodePtr);
geometry_msgs::msg::TransformStamped transform;
transform.header.stamp = node->now();
transform.header.frame_id = "odom";
transform.child_frame_id = "base_link";
// Fill in transform.transform.translation
// Fill in transform.transform.rotation
broadcaster->sendTransform(transform);
You can find the full/real code in
I’m currently wrapping up making the head camera work. That will be the subject of
my next post in a few days.