Low-Energy Autonomy and Navigation (LEAN) Research Group

Zhengdond Zhang*, Amr Suleiman*, Luca Carlone, Vivienne Sze, Sertac Karaman, "Visual-Inertial Odometry on Chip: An Algorithm-and-Hardware Co-design Approach," Robotics: Science and Systems (RSS), July 2017

Abstract Paper Poster Slides Software

Autonomous navigation of miniaturized robots (e.g., nano/pico aerial vehicles) is currently a grand challenge for robotics research, due to the need for processing a large amount of sensor data (e.g., camera frames) with limited on-board computational resources. In this paper we focus on the design of a visual-inertial odometry (VIO) system in which the robot estimates its ego-motion (and a landmark-based map) from on- board camera and IMU data. We argue that scaling down VIO to miniaturized platforms (without sacrificing performance) requires a paradigm shift in the design of perception algorithms, and we advocate a co-design approach in which algorithmic and hardware design choices are tightly coupled. Our contribution is four-fold. First, we discuss the VIO co-design problem, in which one tries to attain a desired resource-performance trade-off, by making suitable design choices (in terms of hardware, algorithms, implementation, and parameters). Second, we characterize the design space, by discussing how a relevant set of design choices affects the resource-performance trade-off in VIO. Third, we provide a systematic experiment-driven way to explore the design space, towards a design that meets the desired trade-off. Fourth, we demonstrate the result of the co-design process by providing a VIO implementation on specialized hardware and showing that such implementation has the same accuracy and speed of a desktop implementation, while requiring a fraction of the power.

Amr Suleiman, Zhengdong Zhang, Luca Carlone, Sertac Karaman, Vivienne Sze, "Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones," IEEE Symposium on VLSI Circuits (VLSI-Circuits), June 2018

Abstract Paper Poster Slides Software

This paper presents Navion, an energy-efficient accelerator for visual-inertial odometry (VIO) that enables autonomous navigation of miniaturized robots (e.g., nano drones), and virtual/augmented reality on portable devices. The chip uses inertial measurements and mono/stereo images to estimate the drone’s trajectory and a 3D map of the environment. This estimate is obtained by running a state-of-the-art algorithm based on non-linear factor graph optimization, which requires large irregularly structured memories and heterogeneous computation flow. To reduce the energy consumption and footprint, the entire VIO system is fully integrated on chip to eliminate costly off-chip processing and storage. This work uses compression and exploits both structured and unstructured sparsity to reduce on-chip memory size by 4.1x. Parallelism is used under tight area constraints to increase throughput by 43%. The chip is fabricated in 65nm CMOS, and can process 752x480 stereo images at up to 171 fps and inertial measurements at up to 52 kHz, while consuming an average of 24mW. The chip is configurable to maximize accuracy, throughput and energy-efficiency across different environments. To the best of our knowledge, this is the first fully integrated VIO system in an ASIC.

Amr Suleiman, Zhengdong Zhang, Luca Carlone, Sertac Karaman, Vivienne Sze, "Navion: A 2mW Fully Integrated Real-Time Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones," IEEE Journal of Solid State Circuits (JSSC), 54:4(1106-1119), April 2019

Abstract Paper Poster Slides Software

This paper presents Navion, an energy-efficient accelerator for visual-inertial odometry (VIO) that enables au- tonomous navigation of miniaturized robots (e.g., nano drones), and virtual/augmented reality on portable devices. The chip uses inertial measurements and mono/stereo images to estimate the drone’s trajectory and a 3D map of the environment. This estimate is obtained by running a state-of-the-art VIO algorithm based on non-linear factor graph optimization, which requires large irregularly structured memories and heterogeneous compu- tation flow. To reduce the energy consumption and footprint, the entire VIO system is fully integrated on chip to eliminate costly off-chip processing and storage. This work uses compression and exploits both structured and unstructured sparsity to reduce on-chip memory size by 4.1×. Parallelism is used under tight area constraints to increase throughput by 43%. The chip is fabricated in 65nm CMOS, and can process 752×480 stereo images from EuRoC dataset in real-time at 20 frames per second (fps) consuming only an average power of 2mW. At its peak performance, Navion can process stereo images at up to 171 fps and inertial measurements at up to 52 kHz, while consuming an average of 24mW. The chip is configurable to maximize accuracy, throughput and energy-efficiency trade-offs and to adapt to different environments. To the best of our knowledge, this is the first fully-integrated VIO system in an ASIC.

Peter Li, Sertac Karaman, Vivienne Sze, “Memory-Efficient Gaussian Fitting for Depth Images in Real Time,” IEEE International Conference on Robotics and Automation (ICRA), May 2022

Abstract Paper Poster Slides Software

Computing consumes a significant portion of energy in many robotics applications, especially the ones involving energy-constrained robots. In addition, memory access accounts for a significant portion of the computing energy. For mapping a 3D environment, prior approaches reduce the map size while incurring a large memory overhead used for storing sensor measurements and temporary variables during computation. In this work, we present a memory-efficient algorithm, named Single-Pass Gaussian Fitting (SPGF), that accurately constructs a compact Gaussian Mixture Model (GMM) which approximates measurements from a depthmap generated from a depth camera. By incrementally constructing the GMM one pixel at a time in a single pass through the depthmap, SPGF achieves higher throughput and orders-of-magnitude lower memory overhead than prior multi-pass approaches. By processing the depthmap row-by-row, SPGF exploits intrinsic properties of the camera to efficiently and accurately infer surface geometries, which leads to higher precision than prior approaches while maintaining the same compactness of the GMM. Using a low-power ARM Cortex-A57 CPU on the NVIDIA Jetson TX2 platform, SPGF operates at 32fps, requires 43KB of memory overhead, and consumes only 0.11J per frame (depthmap). Thus, SPGF enables real-time mapping of large 3D environments on energy-constrained robots.

Peter Zhi Xuan Li, Sertac Karaman, Vivienne Sze, “GMMap: Memory-Efficient Continuous Occupancy Map Using Gaussian Mixture Model,” IEEE Transactions on Robotics (T-RO), Vol. 40, pp. 1339 – 1355, January 2024.

Abstract Paper Poster Slides Software

Energy consumption of memory accesses dominates the compute energy in energy-constrained robots which require a compact 3D map of the environment to achieve autonomy. Recent mapping frameworks only focused on reducing the map size while incurring significant memory usage during map construction due to multi-pass processing of each depth image. In this work, we present a memory-efficient continuous occupancy map, named GMMap, that accurately models the 3D environment using a Gaussian Mixture Model (GMM). Memory-efficient GMMap construction is enabled by the single-pass compression of depth images into local GMMs which are directly fused together into a globally-consistent map. By extending Gaussian Mixture Regression to model unexplored regions, occupancy probability is directly computed from Gaussians. Using a low-power ARM Cortex A57 CPU, GMMap can be constructed in real-time at up to 60 images per second. Compared with prior works, GMMap maintains high accuracy while reducing the map size by at least 56%, memory overhead by at least 88%, DRAM access by at least 78%, and energy consumption by at least 69%. Thus, GMMap enables real-time 3D mapping on energy-constrained robots.

Dasong Gao*, Peter Zhi Xuan Li*, Vivienne Sze, Sertac Karaman, "GEVO: Memory-Efficient Monocular Visual Odometry Using Gaussians," Arxiv 2024.

Abstract Paper Poster Slides Software

Constructing a high-fidelity representation of the 3D scene using a monocular camera can enable a wide range of applications on mobile devices, such as micro-robots, smartphones, and AR/VR headsets. On these devices, memory is often limited in capacity and its access often dominates the consumption of compute energy. Although Gaussian Splatting (GS) allows for high-fidelity reconstruction of 3D scenes, current GS-based SLAM is not memory efficient as a large number of past images is stored to retrain Gaussians for reducing catastrophic forgetting. These images often require two-orders-of-magnitude higher memory than the map itself and thus dominate the total memory usage. In this work, we present GEVO, a GS-based monocular SLAM framework that achieves comparable fidelity as prior methods by rendering (instead of storing) them from the existing map. Novel Gaussian initialization and optimization techniques are proposed to remove artifacts from the map and delay the degradation of the rendered images over time. Across a variety of environments, GEVO achieves comparable map fidelity while reducing the memory overhead to around 58 MBs, which is up to 94× lower than prior works.

Localization and Mapping

Technology Demonstrations

Papers