Blog


https://horea.caramizaru.xyz


โ† Go Back


Search by Tags




Timeline



"Do Differentiable Simulators Give Better Policy Gradients?" by H.J. Terry Suh, Max Simchowitz, Kaiqing Zhang, Russ Tedrake

Abstract:

Differentiable simulators promise faster computation time for reinforcement learning by replacing zeroth-order gradient estimates of a stochastic objective with an estimate based on first-order gradients. However, it is yet unclear what factors decide the performance of the two estimators on complex landscapes that involve long-horizon planning and control on physical systems, despite the crucial relevance of this question for the utility of differentiable simulators. We show that characteristics of certain physical systems, such as stiffness or discontinuities, may compromise the efficacy of the first-order estimator, and analyze this phenomenon through the lens of bias and variance. We additionally propose an α-order gradient estimator, with α ∈ [0, 1], which correctly utilizes exact gradients to combine the efficiency of first-order estimates with the robustness of zero-order methods. We demonstrate the pitfalls of traditional estimators and the advantages of the α-order estimator on some numerical examples.
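
To make the bias argument concrete, here is a minimal one-dimensional sketch in plain Python. It is not the authors' code. It assumes the stochastic objective is F(θ) = E[f(θ + w)] with Gaussian noise w ~ N(0, σ²), and it assumes the α-order estimate is the convex combination α · (first-order) + (1 − α) · (zeroth-order), which the abstract does not spell out. All function names are hypothetical. The test function is a step, a simple discontinuity whose pointwise derivative is zero almost everywhere.

```python
import math
import random


def zeroth_order_grad(f, theta, sigma, noise):
    """Zeroth-order (score-function) estimate of d/dtheta E[f(theta + w)].

    Uses only function values. Subtracting f(theta) is a baseline that
    reduces variance without changing the expectation, since E[w] = 0.
    """
    baseline = f(theta)
    total = sum((f(theta + w) - baseline) * w for w in noise)
    return total / (len(noise) * sigma ** 2)


def first_order_grad(df, theta, noise):
    """First-order estimate: average of pointwise derivatives df(theta + w)."""
    return sum(df(theta + w) for w in noise) / len(noise)


def alpha_order_grad(f, df, theta, sigma, alpha, noise):
    """Assumed interpolation: alpha * first-order + (1 - alpha) * zeroth-order."""
    fo = first_order_grad(df, theta, noise)
    zo = zeroth_order_grad(f, theta, sigma, noise)
    return alpha * fo + (1.0 - alpha) * zo


def step(x):
    """Discontinuous cost: jumps from 0 to 1 at x = 0."""
    return 1.0 if x >= 0 else 0.0


def dstep(x):
    """Pointwise derivative of the step: zero wherever it is defined."""
    return 0.0


rng = random.Random(0)  # fixed seed so the run is reproducible
sigma = 0.5
noise = [rng.gauss(0.0, sigma) for _ in range(20000)]

# For the step, F(theta) is the Gaussian CDF, so the true gradient at
# theta = 0 is the Gaussian density at zero: 1 / (sigma * sqrt(2 * pi)).
true_grad = 1.0 / (sigma * math.sqrt(2.0 * math.pi))

fo = first_order_grad(dstep, 0.0, noise)
zo = zeroth_order_grad(step, 0.0, sigma, noise)
mixed = alpha_order_grad(step, dstep, 0.0, sigma, 0.5, noise)

print("true gradient:", true_grad)
print("first-order  :", fo)
print("zeroth-order :", zo)
print("alpha = 0.5  :", mixed)
```

In this toy case the first-order estimate is exactly zero however many samples are drawn, so it is biased, while the zeroth-order estimate approaches the true gradient at the cost of sampling noise. Under the assumed interpolation, α = 1 recovers the first-order estimate and α = 0 the zeroth-order one.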



Abstract:

Designing robots with extreme performance in a given task has long been an exciting research problem drawing attention from researchers in robotics, graphics, and artificial intelligence. As a robot is a combination of its hardware and software, an optimal robot requires both an excellent implementation of its hardware (e.g., morphological, topological, and geometrical designs) and an outstanding design of its software (e.g., perception, planning, and control algorithms). While we have seen promising breakthroughs for automating a robot's software design with the surge of deep learning in the past decade, exploration of optimal hardware design is much less automated and is still mainly driven by human experts, a process that is both labor-intensive and error-prone. Furthermore, experts typically optimize a robot's hardware and software separately, which may miss optimal designs that can only be revealed by optimizing its hardware and software simultaneously. This thesis argues that it is time to rethink robot design as a holistic process where a robot's body and brain should be co-optimized jointly and automatically. In this thesis, we present a computational robot design pipeline with differentiable simulation as a key player. We first introduce the concept of computational robot design on a real-world copter whose geometry and controller are co-optimized with a differentiable simulator, resulting in a custom copter that outperforms designs suggested by human experts by a substantial margin. Next, we push the boundary of differentiable simulation by developing advanced differentiable simulators for soft-body and fluid dynamics. Contrary to traditional belief, we show that deriving gradients for such intricate, high-dimensional physics systems can be both science and art. Finally, we discuss challenges in transferring computational designs discovered in simulation to real-world hardware platforms. 
We present a solution to this simulation-to-reality transfer problem using our differentiable simulator on an example of modeling and controlling a real-world soft underwater robot. We conclude this thesis by discussing open research directions in differentiable simulation and envisioning a fully automated computational design pipeline for real-world robots in the future.


2023/11/01 23:08 · Horea Caramizaru
2023/10/31 16:06 · Horea Caramizaru


feed/start.txt · Last modified: 2023/11/12 22:57 by Horea Caramizaru