SPLIT: SE(3)-diffusion via Local Geometry-based Score Prediction for 3D Scene-to-Pose-Set Matching Problems

doi:10.48550/arXiv.2411.10049

For versatile manipulation, robots must detect task-relevant poses for different purposes from raw scenes. Many current perception algorithms are designed for a single purpose, which limits the flexibility of the perception module. We present a general problem formulation called 3D scene-to-pose-set matching, which maps a scene directly to its set of corresponding poses without relying on task-specific heuristics. To solve this problem, we introduce SPLIT, an SE(3)-diffusion model that generates pose samples from a scene. The model's efficiency comes from predicting scores based on the local geometry with respect to the sample pose. Moreover, leveraging the conditional generation capability of diffusion models, we demonstrate that a single SPLIT model can generate the multi-purpose poses required to complete both mug reorientation and hanging manipulation.
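The abstract says scores are predicted from local geometry with respect to the sample pose. As a rough illustration of what that could mean, the sketch below expresses scene points in the frame of a candidate pose and keeps only those near the pose origin. This is not the authors' implementation: the function names, the spherical crop, and the `radius` parameter are assumptions made for illustration, and the score network that would consume the cropped patch is left out. It uses only the Python standard library.

```python
import math

def to_local_frame(points, rotation, translation):
    """Express world-frame points in the frame of the pose (rotation, translation).

    For a pose T = (R, t), a world point p maps to R^T (p - t).
    `rotation` is a 3x3 nested list, `translation` a length-3 list.
    """
    local = []
    for p in points:
        d = [p[i] - translation[i] for i in range(3)]
        # R^T d: component j is the dot product of column j of R with d.
        local.append([sum(rotation[i][j] * d[i] for i in range(3)) for j in range(3)])
    return local

def crop_local_geometry(points, rotation, translation, radius):
    """Return the points within `radius` of the pose origin, in local coordinates.

    The spherical crop is an assumed stand-in for whatever local region
    an actual model would use.
    """
    local = to_local_frame(points, rotation, translation)
    return [q for q in local if math.sqrt(sum(c * c for c in q)) <= radius]

# Hypothetical sample pose: rotated 90 degrees about z, located at (1, 0, 0).
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]

# Hypothetical scene: one point near the pose, one far away.
scene = [[1.0, 0.5, 0.0], [5.0, 5.0, 5.0]]

patch = crop_local_geometry(scene, R, t, radius=1.0)
print(patch)
```

Because the patch is expressed in the pose's own frame, its size depends on the crop radius rather than on the size of the whole scene, which is one plausible reading of the efficiency claim.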

Publication: arXiv e-prints

Pub Date: November 2024

DOI: 10.48550/arXiv.2411.10049

arXiv: arXiv:2411.10049

Bibcode: 2024arXiv241110049K

Keywords: Computer Science - Robotics
