Porting EtherbotiX to ROS2

Now that the UBR-1 is running pretty well under ROS2, I have started to also port Maxwell and the Robomagellan robot to ROS2. Both depend on my etherbotix drivers package, so that is the first package to port.

Etherbotix in ROS2

In ROS1, the etherbotix_python package was written in Python and leveraged some code from the arbotix_ros package (mainly for the controllers). While porting to ROS2, I decided to migrate to C++ and leverage robot_controllers, which I had recently ported to ROS2.

Since this was effectively a new package, I used the ros2 pkg create command to set up the CMakeLists.txt, package.xml, and other boilerplate.

I then started to set up the node based off my very old asio_ros example. At some point I should probably set up a test to see how accurate ROS2 timers are, but I knew for sure that this code would work, so I stuck with boost::asio.

Python Wrappers

In ROS1 I had a number of scripts for interacting with the Etherbotix. For some of these, such as read_etherbotix, it was easy to port them to C++. For others, such as my motor_trace script, which uses matplotlib, I really wanted to keep the majority of the script in Python. To accomplish this, I wrapped my underlying C++ drivers using Boost Python.

It required a bit of CMake:

find_package(PythonLibs REQUIRED)
find_package(Boost REQUIRED python)

ament_python_install_package(${PROJECT_NAME})

add_library(etherbotix_py SHARED ...)
set_target_properties(etherbotix_py PROPERTIES
  LIBRARY_OUTPUT_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/${PROJECT_NAME}
  PREFIX "")
target_link_libraries(etherbotix_py
  ${Boost_LIBRARIES}
  ${PYTHON_LIBRARIES}
)
ament_target_dependencies(etherbotix_py ...)

install(
  TARGETS etherbotix_py
  DESTINATION "${PYTHON_INSTALL_DIR}/${PROJECT_NAME}"
)

I then created a file etherbotix/__init__.py:

from etherbotix.etherbotix_py import Etherbotix

This allowed me to import my C++ class Etherbotix in my scripts with:

from etherbotix import Etherbotix

Why Boost Python? Aren’t there newer things out there? Yes, there are, and I spent a while trying out pybind11, but I just couldn’t get it to work and reverted to what I knew already.

Exporting Libraries

Most of the ROS2 code I’ve ported thus far has been various nodes, rather than libraries used by other packages (the one exception being robot_controllers_interface). I hadn’t previously paid super close attention to how things are exported. There are a few declarations that get placed at the end of your CMakeLists.txt:

# Tell downstream packages where to find our headers
ament_export_include_directories(include)
# Tell downstream packages our libraries to link against
ament_export_libraries(my_library)
# Help downstream packages to find transitive dependencies
ament_export_dependencies(
  rclcpp
)
ament_package()

This code snippet is from my ros2_cookbook project on GitHub. You can also find the full commit that enables downstream packages to build upon the library exported by the etherbotix package.

Maxwell Bringup

So why did I have to export the Etherbotix libraries? I had to write a custom controller for the torso on Maxwell, and that controller needs access to an Etherbotix instance. This involved a bit of custom code to add both a custom JointHandle and a custom Controller. The controller also automatically loads the custom JointHandle.

One of the advantages of using robot_controllers is that several controllers worked out of the box. I had never actually updated the old Python code to work with Maxwell’s parallel jaw gripper, but with the ParallelGripperController and a ScaledMimicController for the second finger, everything just worked.

The controllers for Maxwell are now all set up. You can see the configuration here.

Some Fancy-ish C++

Throughout the driver code we have to construct buffers of bytes to send to the hardware, often of varying length. This is generally really clean in Python, but in C++ it usually results in something like this:

uint8_t len = 0;
buffer[len++] = 0xff;
buffer[len++] = 0xff;
buffer[len++] = Etherbotix::ETHERBOTIX_ID;
buffer[len++] = 5;  // Length of remaining packet
buffer[len++] = dynamixel::WRITE_DATA;
if (motor_idx == 1)
{
  buffer[len++] = Etherbotix::REG_MOTOR1_VEL;
}
else
{
  buffer[len++] = Etherbotix::REG_MOTOR2_VEL;
}
buffer[len++] = (desired_velocity_ & 0xff);
buffer[len++] = (desired_velocity_ >> 8);
buffer[len++] = dynamixel::compute_checksum(buffer, 9);
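
The compute_checksum call at the end builds the standard Dynamixel Protocol 1.0 checksum: the bitwise complement of the sum of every byte between the two 0xFF headers and the checksum itself. I’m not showing the actual C++ helper here, but a Python sketch of what it computes (assuming the second argument is the total packet length including the checksum byte) looks like:

```python
def compute_checksum(packet, length):
    # Dynamixel Protocol 1.0: sum the id, length, instruction, and parameter
    # bytes (everything after the two 0xFF headers, before the checksum),
    # then take the bitwise complement, truncated to one byte.
    return ~sum(packet[2:length - 1]) & 0xFF
```
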

I decided to come up with a cleaner approach, since there are quite a few instances of this throughout the code base. I ended up creating a get_write_packet function like the one I had in the Python code, with this signature:

inline uint8_t get_write_packet(
   uint8_t* buffer,
   uint8_t device_id,
   uint8_t address,
   std::vector<uint8_t> params)

And then using an initializer list to create the variable-size buffers of bytes to send:

uint8_t len = dynamixel::get_write_packet(
  buffer,
  Etherbotix::ETHERBOTIX_ID,
  (motor_idx == 1) ? Etherbotix::REG_MOTOR1_VEL : Etherbotix::REG_MOTOR2_VEL,
  {static_cast<uint8_t>(desired_velocity_ & 0xff),
   static_cast<uint8_t>(desired_velocity_ >> 8)}
);

Yes, this is probably not the fastest code (since it passes the vector of bytes by value), but I like how it cleans up the call sites, and none of these vectors are all that large. You can see the full commit on GitHub.
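
Since the C++ helper mirrors the old Python code, here is a rough Python sketch of what get_write_packet assembles - the standard Dynamixel WRITE_DATA packet layout. The instruction constant and the device id in the example are assumptions for illustration, not the actual Etherbotix values:

```python
WRITE_DATA = 3  # Dynamixel WRITE_DATA instruction code (assumed)

def get_write_packet(device_id, address, params):
    # header, device id, remaining length (#params + 3), instruction, address
    packet = [0xFF, 0xFF, device_id, len(params) + 3, WRITE_DATA, address]
    packet += list(params)
    # checksum: complement of the sum of everything after the 0xFF 0xFF header
    packet.append(~sum(packet[2:]) & 0xFF)
    return bytes(packet)

# Example: write a 16-bit velocity (little-endian) to a hypothetical register
velocity = 256
packet = get_write_packet(253, 46, [velocity & 0xFF, velocity >> 8])
```
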

Diagnostics

The final part of my etherbotix port was to add diagnostics back into the node. diagnostic_msgs are an underused feature of ROS. They offer a common way to publish information about the status of things, mostly hardware. Drivers like urg_node and the joy node, and even higher-level filters like robot_localization, publish diagnostics.

While diagnostics can be passed through analyzers to output a diagnostics_agg topic, I often just use rqt_runtime_monitor, which accesses the raw diagnostics topic. I found it was missing in ROS2 - but there is a port, which hasn’t been merged yet. You can find that port here.

Next Steps

I’ve made some major progress on running navigation2 on the UBR-1 and that will be the subject of a post next week. After that I’ll be continuing on bringup of the robomagellan robot, including integrating robot_localization in ROS2.

Mapping and Localization in ROS2

Now that the drivers are pretty much operational for the UBR-1 robot under ROS2, I’m starting to work on the higher level applications. The first step was building a map and setting up localization against that map.

SLAM

In ROS1 there were several different Simultaneous Localization and Mapping (SLAM) packages that could be used to build a map: gmapping, karto, cartographer, and slam_toolbox. In ROS2, there was an early port of cartographer, but it is really not maintained. The other package that has been ported to ROS2 is slam_toolbox, which is basically slam_karto on steroids - the core scan matcher is the same, but everything else has been rewritten and upgraded.

Installation of slam_toolbox is super easy:

sudo apt-get install ros-foxy-slam-toolbox

I then created a launch file, which is an updated version of the online_sync_launch.py found within slam_toolbox:

from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node
from ament_index_python.packages import get_package_share_directory


def generate_launch_description():
    use_sim_time = LaunchConfiguration('use_sim_time')

    declare_use_sim_time_argument = DeclareLaunchArgument(
        'use_sim_time',
        default_value='false',
        description='Use simulation/Gazebo clock')

    start_sync_slam_toolbox_node = Node(
        parameters=[
          get_package_share_directory("ubr1_navigation") + '/config/mapper_params_online_sync.yaml',
          {'use_sim_time': use_sim_time}
        ],
        package='slam_toolbox',
        node_executable='sync_slam_toolbox_node',
        name='slam_toolbox',
        output='screen')

    ld = LaunchDescription()

    ld.add_action(declare_use_sim_time_argument)
    ld.add_action(start_sync_slam_toolbox_node)

    return ld

My updates were basically just to use my own config.yaml file. In that YAML file I had to update the frame ids (I don’t use a base_footprint, and my robot has a base_scan topic rather than scan). There are dozens of parameters for the Karto scan matcher and you can see the entire file on GitHub, but the basic changes I had to make were:

slam_toolbox:
  ros__parameters:

    # ROS Parameters
    odom_frame: odom
    map_frame: map
    base_frame: base_link
    scan_topic: /base_scan

Now we can run the launch file and drive the robot around to build a map. We can also view the map in RVIZ. To get the map to come through, you will likely have to expand the options under the topic name and change the durability to transient local. Even though the ROS2 QoS documentation says that a volatile subscriber is compatible with a transient local publisher, I’ve found it doesn’t always seem to work right:

Now that we’ve built a map, it is time to save it. The command is quite similar to ROS1, except you must pass the base name of the map (here I’m passing map, which means it will save map.yaml and map.pgm in the local directory):

ros2 run nav2_map_server map_saver_cli -f map

Next we can create a launch file to display the map - I used the example in nav2_bringup as my starting place and changed which package the map was stored in. You can find my launch file in the ubr1_navigation package. I started my localization launch file and opened RVIZ to find:

It turned out I had to adjust free_thresh in the map.yaml down to 0.196 (the same value as in ROS1) for the map to look correct:
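
For reference, the resulting map.yaml looks something like this - the resolution and origin are placeholders for whatever your map was saved with; the thresholds are the values discussed above:

```
image: map.pgm
resolution: 0.050000
origin: [-10.0, -10.0, 0.0]
negate: 0
occupied_thresh: 0.65
free_thresh: 0.196
```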

There are numerous parameters in slam_toolbox and many more features than I could possibly cover here. For a good introduction, check out the ROSCon 2019 talk by Steve Macenski.

Localization

While there are a variety of mapping options in ROS1 and some in ROS2, for localization it really is just Adaptive Monte Carlo Localization (AMCL). There is some ongoing work towards more modern localization solutions in ROS2, but it would seem to be a long way off.

The launch file we copied over for running the map_server also included AMCL in it (hence the name localization.launch.py).

For the most part, there are only a few parameters to tune in AMCL to generally get decent results:

amcl:
  ros__parameters:
    alpha1: 0.25
    alpha2: 0.2
    alpha3: 0.15
    alpha4: 0.2
    base_frame_id: "base_link"
    global_frame_id: "map"
    laser_model_type: "likelihood_field"
    max_beams: 60
    max_particles: 2000
    min_particles: 500
    odom_frame_id: "odom"
    robot_model_type: "differential"
    tf_broadcast: true
    z_hit: 0.6
    z_max: 0.05
    z_rand: 0.3
    z_short: 0.05

Before trying to tune AMCL, you really need to make sure your TF and odometry are set up correctly. There are some good tips in the Navigation Tuning Guide, which was written for ROS1 but is generally still very much true in ROS2.

The most important parameters to set are the alphaX parameters, which model your odometry noise. By default all of them are set to 0.2, but they should be adjusted based on the quality of your odometry:

  • alpha1 - noise in rotation from rotational motion
  • alpha2 - noise in rotation from translational motion
  • alpha3 - noise in translation from translational motion
  • alpha4 - noise in translation from rotational motion

These are somewhat intuitive to understand. For most robots, if they drive forward in a straight line, the odometry is very accurate - thus alpha3 is often the lowest value parameter. When the robot turns in place, it probably has more noise (unless you have a fantastically tuned gyro being merged with the wheel odometry) - so alpha1 often gets bumped up. My alpha1 is currently set high since I have not yet integrated the IMU on the UBR-1 into the ROS2 odometry.
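
The alphas parameterize the odometry motion model from Probabilistic Robotics, which is what AMCL samples from. A simplified Python sketch (not AMCL’s exact code) shows how each alpha scales the noise added to a particle during a motion update:

```python
import math
import random

def sample_motion_model_odometry(pose, rot1, trans, rot2, alphas, rng=random):
    """Move one particle by an odometry delta (decomposed into an initial
    rotation, a translation, and a final rotation), adding alpha-scaled noise."""
    a1, a2, a3, a4 = alphas
    # alpha1/alpha2: noise in rotation from rotational/translational motion
    r1 = rot1 - rng.gauss(0.0, a1 * abs(rot1) + a2 * abs(trans))
    r2 = rot2 - rng.gauss(0.0, a1 * abs(rot2) + a2 * abs(trans))
    # alpha3/alpha4: noise in translation from translational/rotational motion
    tr = trans - rng.gauss(0.0, a3 * abs(trans) + a4 * (abs(rot1) + abs(rot2)))
    x, y, theta = pose
    return (x + tr * math.cos(theta + r1),
            y + tr * math.sin(theta + r1),
            theta + r1 + r2)
```

With all alphas at zero the particle follows the odometry exactly; raising alpha3 spreads particles along the direction of travel, while raising alpha1 spreads their headings.
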

When the alpha parameters are set too low, the odometry ends up driving the distribution of particles in the cloud more than the scan matcher. If your odometry is inaccurate, the robot will slowly get delocalized because the particle distribution lacks particles located at the true pose of the robot.

If the alpha parameters are set too high, the particle distribution spreads out and can induce noise in your pose estimate (and cause delocalization).

One of the best ways to test these parameters is in RVIZ. Add your laser scan to the display, and set the fixed frame of RVIZ to your map frame. Then turn the “Decay Time” of the laser way up (20-100 seconds). If your parameters are correct, the laser scans will all line up very well. If the parameters are crap, the walls raycast by the laser scanner will be very “thick” or unaligned.

To tune these parameters, I will often drop all of them lower than the default, usually something like 0.05 to 0.1 for each parameter.

A final check is to display the /particlecloud published by AMCL and make sure it isn’t diverging too much - if it is, you might have to reduce your alpha parameters. To see the particle cloud, you’ll have to switch the QoS to best effort. The image below shows what the cloud looks like when the robot is first localized; it should be a lot less spread out during normal operation:

Some Notes on Cyclone DDS

One of the biggest differences between ROS1 and ROS2 is the replacement of the single middleware with a plugin-based architecture. This allows ROS2 to use various ROS middleware (RMW) implementations. All of these RMW implementations are currently based on DDS. You can read all about the details in the ROS2 Design Docs.

Over time, the supported RMW implementations have shifted and new ones have been introduced. The default is currently FastRTPS (which apparently has been renamed to FastDDS, but after the Foxy release). The newest option is CycloneDDS which uses Eclipse Cyclone DDS. Cyclone DDS has gotten a lot of praise lately, so let’s take a closer look.

RMW Implementations

Choosing between RMW implementations is still a bit of a challenge since ROS2 is still very much under active development. There are multiple tickets about FastDDS service discovery issues. CycloneDDS is less than two years old, which means it is also under very active development and might not be fully featured, but it is supposed to be highly performant. Mixing multiple implementations at runtime has noted caveats.

Luckily, it’s very easy to switch between implementations by simply setting the RMW_IMPLEMENTATION environment variable (assuming the selected implementation is built/installed).
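
For example, selecting Cyclone DDS (assuming the rmw_cyclonedds_cpp package is installed) is just:

```shell
# Select Cyclone DDS for every ROS2 process launched from this shell
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
```
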

When switching between implementations, be sure to stop the ros2 daemon so that it gets restarted with the proper RMW implementation:

ros2 daemon stop

First you’ve heard of the ROS2 daemon? Check out this ROS Answers post which contains the best description I’ve seen.

Debugging Issues

While FastDDS was mostly working out of the box, the whole service discovery problem was wreaking havoc on setting/getting parameters - and I’ve been tuning parameters frequently. So I went ahead and set RMW_IMPLEMENTATION to rmw_cyclonedds_cpp, or so I thought.

I noticed that service discovery wasn’t much better. Then I noticed on the robot I had set RMW_IMPLEMTATION - so I fixed the spelling mistake. Now everything should totally work great!

Wrong.

On the robot, discovery worked fine and services worked great - but half or more of the nodes couldn’t be seen by my laptop. Restarting launch files resulted in different nodes often missing!

I started to debug and came across the ddsperf tool. If you’re using ROS2 on MacOSX you’ll want to check out this issue on how to install ddsperf.

Multiple Network Interfaces

Running ddsperf sanity gave an interesting warning on the robot:

ddsperf: using network interface enp3s0 (udp/10.42.0.1) selected arbitrarily from: enp3s0, wlp2s0

The UBR-1 has two network interfaces: wlp2s0 is a wifi connection to the outside world and enp3s0 is an internal ethernet port which only talks to the robot hardware. Apparently, my nodes were frequently using the wrong network interface. The upstream Cyclone DDS README does mention, way down the page, that “proper use of multiple network interfaces simultaneously will come, but is not there yet.”

The configuration guide states that the selection of network adapter prefers non-link-local interfaces, but apparently something is tripping it up in detecting that the ethernet interface is configured that way.

The work around is to set a NetworkInterfaceAddress in the CYCLONEDDS_URI environment variable:

export CYCLONEDDS_URI='<CycloneDDS><Domain><General><NetworkInterfaceAddress>wlp2s0</NetworkInterfaceAddress></General></Domain></CycloneDDS>'
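
Since that one-liner is painful to read and edit, CYCLONEDDS_URI can instead point at a file (the path below is arbitrary). The equivalent XML is:

```
<CycloneDDS>
  <Domain>
    <General>
      <NetworkInterfaceAddress>wlp2s0</NetworkInterfaceAddress>
    </General>
  </Domain>
</CycloneDDS>
```

and then export CYCLONEDDS_URI=file:///path/to/cyclonedds.xml.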

If you’re prone to typos and want to make sure you’re actually running the expected RMW implementation, I’d recommend this command:

ros2 doctor --report | grep middleware

After a few seconds, you should see:

middleware name    : rmw_cyclonedds_cpp

I actually set up an alias in my bashrc so that which_rmw runs that command. Once I settled on Cyclone DDS as my new default, I also added the RMW_IMPLEMENTATION and CYCLONEDDS_URI settings to the bashrc on the robot.

Final Thoughts

Now that I’ve worked through the configuration issue, CycloneDDS appears to be the most stable of the few RMW implementations I’ve tried. I haven’t actually tested the performance head-to-head, but others have.

I would recommend looking at the Configuration section of the upstream Eclipse CycloneDDS project. This contains a bunch of useful information about what you can specify in the CYCLONEDDS_URI. The Guide to Configuring is also very worth reading. It’s honestly a great resource for simply understanding all those things you hoped you’d never need to learn about DDS.