GMP³: Continuous trajectory optimisation for autonomous flight systems

In the field of unmanned aerial systems, modern approaches increasingly rely on learning-based trajectory planning. The publication "GMP3: Learning-Driven, Bellman-Guided Trajectory Planning for UAVs in Real-Time on SE(3)" shows how GMP³ achieves continuous improvements through reinforcement learning and clearly outperforms classical and sampling-based approaches.

Figure: Onboard flight data extracted from the experiment. Left: obstacles and trajectory of the UAV. Right: position and velocities of the UAV over its flight time.

Learning-based planning methods are becoming increasingly important for autonomous unmanned aerial vehicles (UAVs). The recently published paper by B. Salamat, D. Mattern, S.-S. Olzem, G. Elsbacher, C. Seidel and A. M. Tonello, "GMP3: Learning-Driven, Bellman-Guided Trajectory Planning for UAVs in Real-Time on SE(3)", demonstrates how the GMP³ approach achieves a continuous improvement in trajectory quality through distributed reinforcement learning, unlike classical methods, which improve only at isolated points.

The focus is on refining flight trajectories with reinforcement learning under a shared SE(3)-aware objective function, in which translation and rotation are optimised simultaneously. SE(3) is the group of rigid-body poses, that is, a position combined with an orientation. This joint optimisation is particularly relevant for autonomous flight systems, because a UAV's position and orientation are closely coupled. According to the paper, GMP³ thus leads to greater stability, efficiency and precision.
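To make the idea of a shared objective over translation and rotation concrete, the following minimal sketch scores a single pose against a target pose. It is an illustration only and not the objective function from the paper: the function names, the quadratic form and the weights `w_trans` and `w_rot` are assumptions chosen for this example.

```python
import math

def rot_z(yaw):
    """Rotation matrix for a yaw angle (rad) about the z-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_angle(R_a, R_b):
    """Geodesic distance on SO(3): angle (rad) of the relative rotation."""
    # trace(R_a^T R_b) equals the sum of elementwise products
    tr = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (tr - 1.0) / 2.0))  # clamp rounding noise
    return math.acos(cos_theta)

def se3_cost(p, R, p_goal, R_goal, w_trans=1.0, w_rot=0.5):
    """Hypothetical pose cost: weighted squared position and rotation errors."""
    trans_err_sq = sum((a - b) ** 2 for a, b in zip(p, p_goal))
    rot_err = rotation_angle(R, R_goal)
    return w_trans * trans_err_sq + w_rot * rot_err ** 2

# A pose that is 1 m off in x and 90 degrees off in yaw
cost = se3_cost([1.0, 0.0, 0.0], rot_z(math.pi / 2), [0.0, 0.0, 0.0], rot_z(0.0))
```

Because both error terms enter one scalar, an optimiser minimising this cost cannot improve the position while ignoring the orientation, which is the coupling described above. The relative weighting is a design choice that would have to be tuned for a given vehicle.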

Advantages over conventional methods

Comparative analyses show:

  • Classical anytime optimisation methods make progress mainly at fixed, pre-planned refinement steps, i.e. discontinuously.
  • Sampling-based learning baselines often require significantly more iterations to achieve comparable results.
  • GMP³ uses a Bellman-guided learning approach in which the optimisation addresses translation and rotation simultaneously, enabling a uniform, continuous increase in quality.
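As background on the term "Bellman-guided", the sketch below shows a generic Bellman backup on a tiny, made-up waypoint graph. It is not the GMP³ algorithm; the graph, the edge costs and the function name `bellman_sweep` are hypothetical. It only illustrates the underlying principle that each sweep can lower the estimated cost-to-go and never raises it.

```python
def bellman_sweep(cost_to_go, edges, goal):
    """One synchronous Bellman backup: V(s) = min over (s -> t) of c(s, t) + V(t)."""
    updated = {}
    for state in cost_to_go:
        if state == goal:
            updated[state] = 0.0
            continue
        candidates = [c + cost_to_go[t] for t, c in edges.get(state, [])]
        # Keep the old estimate if no successor offers a cheaper route
        updated[state] = min(candidates + [cost_to_go[state]])
    return updated

# Hypothetical waypoint graph: state -> list of (successor, edge cost)
edges = {
    "start": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("goal", 5.0)],
    "b": [("goal", 1.0)],
}
INF = float("inf")
values = {"start": INF, "a": INF, "b": INF, "goal": 0.0}

history = []  # estimated cost-to-go from "start" after each sweep
for _ in range(4):
    values = bellman_sweep(values, edges, "goal")
    history.append(values["start"])
```

In this toy problem the estimate for `start` goes from infinity to 5.0 and then settles at 3.0, the cost of the route start, a, b, goal. The continuous, learning-driven refinement on SE(3) described in the paper is considerably more involved than this discrete example.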

The results illustrate the potential of learning-driven, networked optimisation methods for complex motion planning problems in autonomous flight systems. They also show how interdisciplinary, cross-institute cooperation makes a decisive contribution by integrating expertise from robotics, machine learning, control engineering and optimisation into a coherent research project, a key factor in the success of GMP³.