Bug report
Required Info:
- Operating System:
- Ubuntu-24.04
- Computer:
- 12th Gen Intel(R) Core(TM) i5-12600KF
- ROS2 Version:
- rolling
- Version or commit hash:
- DDS implementation:
- Fast-RTPS
Description
We encountered this issue while deploying Navigation2 on an RK3588 platform. The navigation node group used a large amount of CPU even when idle, which affected the overall stability of the system. After investigating, we found that the default heartbeat period in nav2_util::LifecycleNode (0.1s) is too short, which leads to the excessive CPU usage.
- With the current heartbeat period of 0.1s, the nav2_container node group's CPU usage is around 140% while idle.
- When we increase the heartbeat period to 1.0s, the idle CPU usage of the nav2_container node group drops to about 75%.
- If we disable the composition feature and run individual node processes, each process uses around 15% CPU with a period of 0.1s; increasing the period to 1.0s brings the usage down to about 7% per process.
This issue may partly explain why Nav2 performs poorly on hardware with limited computing resources.
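To illustrate why the period matters, here is a rough sketch of how heartbeat traffic scales with it. It assumes each bond has two endpoints (one on the lifecycle manager side, one on the managed node side) and that each endpoint publishes one heartbeat per period. The function name is hypothetical, and the model ignores subscription callbacks and DDS overhead, so it only estimates the heartbeat-driven wakeups, not total CPU load.

```cpp
#include <cassert>

// Hypothetical helper for illustration only (not part of Nav2 or bond).
// Estimates how many heartbeat messages are published per second across
// all bonds, assuming two endpoints per bond and one publish per endpoint
// per heartbeat period.
double heartbeatPublishesPerSecond(int managed_nodes, double period_s)
{
  assert(managed_nodes >= 0 && period_s > 0.0);
  const int endpoints_per_bond = 2;  // assumption: manager side + node side
  return managed_nodes * endpoints_per_bond / period_s;
}
```

For a hypothetical setup with 10 managed nodes, this model gives about 200 publishes per second at 0.1s and 20 per second at 1.0s. The measured idle CPU usage dropped by roughly half rather than by a factor of ten, which suggests the heartbeat is a large part of the idle load but not all of it.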
CPU Usage Screenshots
- nav2_container CPU usage with heartbeat period 0.1s.
- Single-node process CPU usage with heartbeat period 0.1s.
- nav2_container CPU usage with heartbeat period 1.0s.
- Single-node process CPU usage with heartbeat period 1.0s.
Suggested Modification Strategy
- Increase the default bond_heartbeat_period in nav2_util::LifecycleNode to 1.0s. The current declaration uses 0.1:
  bond_heartbeat_period = this->declare_or_get_parameter<double>("bond_heartbeat_period", 0.1);
- Parameterize the hard-coded bond_heartbeat_period in nav2_lifecycle_manager::LifecycleManager, and update its default value to 1.0s.
- Remove the connection establishment check in the createBondConnection function of nav2_lifecycle_manager::LifecycleManager to avoid prolonged startup times.
navigation2/nav2_lifecycle_manager/src/lifecycle_manager.cpp
Lines 273 to 300 in 52334b0
LifecycleManager::createBondConnection(const std::string & node_name)
{
  const double timeout_ns =
    std::chrono::duration_cast<std::chrono::nanoseconds>(bond_timeout_).count();
  const double timeout_s = timeout_ns / 1e9;

  if (bond_map_.find(node_name) == bond_map_.end() && bond_timeout_.count() > 0.0) {
    bond_map_[node_name] =
      std::make_shared<bond::Bond>("bond", node_name, shared_from_this());
    bond_map_[node_name]->setHeartbeatTimeout(timeout_s);
    bond_map_[node_name]->setHeartbeatPeriod(0.10);
    bond_map_[node_name]->start();
    if (
      !bond_map_[node_name]->waitUntilFormed(
        rclcpp::Duration(rclcpp::Duration::from_nanoseconds(timeout_ns / 2))))
    {
      RCLCPP_ERROR(
        get_logger(),
        "Server %s was unable to be reached after %0.2fs by bond. "
        "This server may be misconfigured.",
        node_name.c_str(), timeout_s);
      return false;
    }
    RCLCPP_INFO(get_logger(), "Server %s connected with bond.", node_name.c_str());
  }

  return true;
}
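As a minimal sketch of the parameterization suggested above, the hard-coded 0.10 could be replaced by a value read from a parameter and sanity-checked against the bond timeout. The helper name resolveHeartbeatPeriod, the member bond_heartbeat_period_, and the fallback policy are all assumptions for illustration; none of them exist in Nav2 today.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of Nav2): picks the heartbeat period to pass
// to bond::Bond. Falls back to default_period_s when the requested value is
// non-positive, or when it is not shorter than a positive bond timeout, since
// a heartbeat slower than the timeout could not keep the bond alive.
double resolveHeartbeatPeriod(
  double requested_period_s,
  std::chrono::milliseconds bond_timeout,
  double default_period_s = 1.0)
{
  assert(default_period_s > 0.0);
  const double timeout_s = std::chrono::duration<double>(bond_timeout).count();
  const bool too_slow = timeout_s > 0.0 && requested_period_s >= timeout_s;
  if (requested_period_s <= 0.0 || too_slow) {
    return default_period_s;
  }
  return requested_period_s;
}

// Possible wiring inside LifecycleManager (sketch only, assuming a new
// double member named bond_heartbeat_period_):
//
//   bond_heartbeat_period_ =
//     declare_parameter<double>("bond_heartbeat_period", 1.0);
//   ...
//   bond_map_[node_name]->setHeartbeatPeriod(
//     resolveHeartbeatPeriod(bond_heartbeat_period_, bond_timeout_));
```

Keeping the validation in a small free function means it can be unit-tested without starting any nodes. Whether an invalid value should fall back silently or be rejected with an error is a design choice we leave to the maintainers.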
Thanks
Thank you very much for your hard work and dedication to maintaining and improving Navigation2. We are submitting this issue to share our recent experience and observations, and hope that it can help make Nav2 better. We greatly appreciate your time and any suggestions.