MIT researchers have developed a technique that enables machines to solve complex stabilize-avoid problems more effectively than previous methods. The new machine-learning approach, presented in a paper by lead author Oswin So and senior author Chuchu Fan, allows an autonomous aircraft to navigate treacherous terrain and reach its goal safely, achieving a tenfold increase in stability over baseline approaches.
The stabilize-avoid problem refers to the tension an autonomous aircraft faces between reaching and remaining at its target and avoiding collisions with obstacles or detection by radar. Many existing AI methods cannot resolve this tension, preventing them from completing the mission safely.
To address this issue, the MIT researchers devised a two-step solution. First, they reframed the stabilize-avoid problem as a constrained optimization problem, enabling the agent to reach and stabilize within a designated goal region. By incorporating constraints, they ensured that the agent effectively avoided obstacles.
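To make the first step concrete, here is a minimal toy sketch of a stabilize-avoid setup posed as a constrained problem. All names and numbers are illustrative, not from the paper: a 2D point agent must reach a goal disc while a constraint forbids it from ever entering an obstacle disc, and a trajectory is feasible only if the worst constraint value along it stays nonpositive.

```python
import numpy as np

# Illustrative toy geometry (not from the paper): a 2D point agent
# must reach a goal disc while never entering an obstacle disc.
GOAL = np.array([5.0, 0.0])   # centre of the goal region
GOAL_R = 0.5                  # goal radius
OBST = np.array([2.5, 0.0])   # centre of the obstacle
OBST_R = 1.0                  # obstacle radius

def stabilize_cost(x):
    """Objective: distance to the goal region (0 once inside it)."""
    return max(np.linalg.norm(x - GOAL) - GOAL_R, 0.0)

def avoid_constraint(x):
    """Avoid constraint h(x) <= 0: positive means the agent is
    inside the obstacle."""
    return OBST_R - np.linalg.norm(x - OBST)

def trajectory_feasible(traj):
    """The constraint must hold at EVERY step, so feasibility is
    governed by the worst constraint value along the trajectory."""
    return max(avoid_constraint(x) for x in traj) <= 0.0

# A path that detours around the obstacle is feasible...
detour = [np.array([t, 1.5]) for t in np.linspace(0.0, 5.0, 50)]
print(trajectory_feasible(detour))    # True

# ...while a straight line through it violates the constraint.
straight = [np.array([t, 0.0]) for t in np.linspace(0.0, 5.0, 50)]
print(trajectory_feasible(straight))  # False
```

The key structural point this illustrates is that the avoid requirement is not folded into the objective as a penalty; it is a hard constraint evaluated at every step, which is what the constrained-optimization framing preserves.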
The second step involved reformulating the constrained optimization problem into its epigraph form, a mathematical representation that a deep reinforcement learning algorithm can solve. Because off-the-shelf reinforcement learning approaches could not handle this form directly, the researchers derived mathematical expressions specific to their system and combined them with existing engineering techniques.
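The epigraph idea can be sketched on a scalar problem. This is a hedged illustration, not the paper's method: where the paper uses a deep-RL solver for the inner problem, the inner problem below is just a grid search. A constrained problem "minimise f(x) subject to g(x) <= 0" becomes "find the smallest z such that some x makes both the objective overshoot f(x) - z and the constraint g(x) nonpositive", and that outer search over z is monotone, so bisection solves it.

```python
import numpy as np

# Toy constrained problem (illustrative, not from the paper):
#   minimise f(x) = (x - 3)^2   subject to   g(x) = x - 1 <= 0
# The unconstrained minimiser x = 3 is infeasible, so the
# constrained optimum is at x = 1 with value f(1) = 4.
xs = np.linspace(-5.0, 5.0, 10001)

def f(x):
    return (x - 3.0) ** 2

def g(x):
    return x - 1.0

def V(z):
    # Inner problem (grid search here; deep RL in the paper's
    # setting): best achievable worst-case of objective overshoot
    # vs. constraint violation for a fixed level z.
    return np.min(np.maximum(f(xs) - z, g(xs)))

# V(z) is non-increasing in z, so the outer (epigraph) problem is
# a one-dimensional root search: the smallest z with V(z) <= 0.
lo, hi = 0.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if V(mid) <= 0.0:
        hi = mid
    else:
        lo = mid

print(round(hi, 3))  # ≈ 4.0, the constrained minimum f(1)
```

The design payoff is that the hard constrained problem is split into a family of unconstrained worst-case problems indexed by z plus a simple monotone search over z, which is what makes the reformulation amenable to a reinforcement-learning inner solver.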
The researchers tested their approach in control experiments with varied initial conditions. Their method stabilized all trajectories while maintaining safety, whereas several baseline methods did not. In a scenario inspired by a “Top Gun” movie, they simulated a jet aircraft flying through a narrow corridor near the ground; their controller stabilized the jet without crashes or stalls, again outperforming the baselines.
This technique holds promise for designing controllers for highly dynamic robots that require safety and stability guarantees, such as autonomous delivery drones. It could also be embedded in larger systems, for example helping a driver regain stability when a car skids on a snowy road.
The researchers envision providing reinforcement learning with the safety and stability guarantees necessary for deploying controllers in mission-critical systems. This approach represents a significant step toward achieving that goal. Moving forward, the team plans to enhance the technique to account for uncertainty when solving the optimization and to assess its performance when deployed on hardware, considering the dynamics of real-world scenarios.
Experts not involved in the research have commended the MIT team for improving reinforcement learning performance in systems where safety is paramount. The ability to generate safe controllers for complex scenarios, including a nonlinear jet aircraft model, has far-reaching implications for the field.