Log in to save to my catalogue

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile...

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile...

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3132697447

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation

About this item

Full title

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-11

Language

English

Formats

Publication information

Publisher

Ithaca: Cornell University Library, arXiv.org

More information

Scope and Contents

Contents

Learning diverse policies for non-prehensile manipulation is essential for improving skill transfer and generalization to out-of-distribution scenarios. In this work, we enhance exploration through a two-fold approach within a hybrid framework that tackles both discrete and continuous action spaces. First, we model the continuous motion parameter policy as a diffusion model, and second, we incorporate this into a maximum entropy reinforcement learning framework that unifies both the discrete and continuous components. The discrete action space, such as contact point selection, is optimized through Q-value function maximization, while the continuous part is guided by a diffusion-based policy. This hybrid approach leads to a principled objective, where the maximum entropy term is derived as a lower bound using structured variational inference. We propose the Hybrid Diffusion Policy algorithm (HyDo) and evaluate its performance on both simulation and zero-shot sim2real tasks. Our results show that HyDo encourages more diverse behavior policies, leading to significantly improved success rates across tasks - for example, increasing from 53% to 72% on a real-world 6D pose alignment task. Project page: https://leh2rng.github.io/hydo...

Alternative Titles

Full title

Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation

Authors, Artists and Contributors

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_3132697447

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3132697447

Other Identifiers

E-ISSN

2331-8422

How to access this item