Pose-free object classification from surface contact features in sequences of Robotic grasps

2403.19840

Published 4/1/2024 by Teresa Alves, Alexandre Bernardino, Plinio Moreno


Abstract

In this work, we propose two cost efficient methods for object identification, using a multi-fingered robotic hand equipped with proprioceptive sensing. Both methods are trained on known objects and rely on a limited set of features, obtained during a few grasps on an object. Contrary to most methods in the literature, our methods do not rely on the knowledge of the relative pose between object and hand, which greatly expands the domain of application. However, if that knowledge is available, we propose an additional active exploration step that reduces the overall number of grasps required for a good recognition of the object. One of the methods depends on the contact positions and normals and the other depends on the contact positions alone. We test the proposed methods in the GraspIt! simulator and show that haptic-based object classification is possible in pose-free conditions. We evaluate the parameters that produce the most accurate results and require the least number of grasps for classification.


Introduction

The paper discusses the development of two different approaches, PN (Point and Normal) based and P (Point) based, for object identification using a robotic hand with proprioceptive and/or tactile sensing capabilities. The PN method relies on measuring 3D positions and surface normals at the finger contact points, while the P method only uses the positions of the contacts.

The researchers conducted experiments in a robotic simulator with the Barrett Hand, investigating an active learning strategy for exploration around the object to find the grasp with the highest information gain. This active exploration approach was compared to passive/random exploration under the same conditions.

The proposed methods were validated in the GraspIt! simulator, and their robustness was assessed by adding random noise to the sensor values and checking the corresponding performance. The methods maintain a likelihood score for each object at each grasp, allowing a decision on the object's identity to be made based on a certainty threshold.
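The per-grasp scoring and threshold decision described above can be illustrated with a minimal sketch. It assumes the per-object scores are combined multiplicatively across grasps and renormalized (a naive Bayesian update); the summary does not state the exact update rule, so the function names, the smoothing constant `eps`, and the uniform prior are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def update_belief(belief: Dict[str, float],
                  grasp_likelihood: Dict[str, float],
                  eps: float = 1e-6) -> Dict[str, float]:
    """Multiply the current belief by the per-object likelihood of one grasp.

    eps (an assumption) keeps an object from being ruled out forever by a
    single grasp with zero likelihood.
    """
    posterior = {obj: belief[obj] * (grasp_likelihood.get(obj, 0.0) + eps)
                 for obj in belief}
    total = sum(posterior.values())
    return {obj: p / total for obj, p in posterior.items()}


def classify(likelihood_stream: Iterable[Dict[str, float]],
             objects: List[str],
             threshold: float = 0.99
             ) -> Tuple[Optional[str], int, Dict[str, float]]:
    """Consume grasps until one object's belief reaches the threshold.

    Returns (label or None, number of grasps used, final belief).
    """
    belief = {obj: 1.0 / len(objects) for obj in objects}  # uniform prior
    n_grasps = 0
    for n_grasps, likelihood in enumerate(likelihood_stream, start=1):
        belief = update_belief(belief, likelihood)
        best = max(belief, key=belief.get)
        if belief[best] >= threshold:
            return best, n_grasps, belief
    return None, n_grasps, belief  # stream ended before reaching certainty
```

Raising `threshold` trades more grasps for fewer wrong decisions, which is the efficiency/accuracy trade-off the evaluation measures.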

The evaluation criteria included efficiency (the number of grasps required to reach a certain level of certainty) and accuracy (the fraction of correct decisions made by the system).
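As a rough illustration of these two criteria, the sketch below computes accuracy and the median number of grasps from a list of trial records. The record layout `(true_label, predicted_label, n_grasps)` is a hypothetical choice for this example, not something specified in the paper.

```python
from statistics import median
from typing import List, Optional, Tuple

Trial = Tuple[str, Optional[str], int]  # (true label, predicted label, grasps used)


def evaluate(trials: List[Trial]) -> dict:
    """Summarize trials in which the system reached a decision."""
    decided = [t for t in trials if t[1] is not None]
    if not decided:
        return {"accuracy": 0.0, "median_grasps": None}
    correct = sum(1 for true, pred, _ in decided if true == pred)
    return {
        "accuracy": correct / len(decided),                  # fraction of correct decisions
        "median_grasps": median(n for _, _, n in decided),   # efficiency
    }
```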

Related Work

This section surveys methods for object identification using tactile sensing with robotic grippers and hands. Some key points:

Previous methods that require full 3D reconstruction of objects can be computationally expensive. Humans, by contrast, can recognize objects with just a few grasps and without full reconstruction.
The bag-of-features approach uses a two-fingered gripper to grasp and examine an object's height profile with tactile sensors on each finger. Neural networks can also analyze the sensor shapes for object identification.
The GraspIt! simulator can compute stable grasps for objects and provide contact poses. It has been used with real robotic hands like the Barrett Hand.
One approach uses the finger motor values and the hand orientation of a Barrett Hand to build lookup tables and applies Bayesian inference for object identification, without using tactile sensor values.
That approach also uses active learning to find the grasp orientation with the most information gain, but it requires the object to be static.
Vision-based methods, such as point pair features computed on point clouds from RGBD cameras, can also be adapted to object identification from robotic grasps.
A limitation of existing methods is that they are not pose-invariant and require the object to remain static.

Approach

The paper describes a haptic object identification approach based on using contact point information from a robotic hand's fingers. The core method calculates Point Pair Features (PPFs) from the relative positions and orientations of pairs of contact points. These PPFs are used as keys in a hash table that stores object models. There are two variants:

The PN method uses both the positions and normals at the contact points to calculate PPFs, making it robust to the object's pose.

The P method uses only the contact point positions, which allows identification with robots that cannot measure contact normals, at the cost of less discriminative features than the PN method.
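To make the feature concrete, here is a minimal sketch of a point pair feature for two contacts. It assumes the standard four-component PPF from the vision literature (distance plus three angles) for the PN variant and the pairwise distance alone for the P variant; the paper's exact feature definition and bin sizes are not given in this summary, so `ppf_pn`, `ppf_p`, `quantize`, and the step values are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    cos = _dot(u, v) / (_norm(u) * _norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def ppf_pn(p1, n1, p2, n2):
    """PN-style feature: distance and three angles (assumes p1 != p2)."""
    d = _sub(p2, p1)
    return (_norm(d), angle(n1, d), angle(n2, d), angle(n1, n2))


def ppf_p(p1, p2):
    """P-style feature: distance only, for hands without normal sensing."""
    return (_norm(_sub(p2, p1)),)


def quantize(feature, dist_step=0.005, angle_step=math.radians(12)):
    """Discretize a feature into an integer tuple usable as a hash key."""
    steps = (dist_step,) + (angle_step,) * (len(feature) - 1)
    return tuple(int(f // s) for f, s in zip(feature, steps))
```

Every component is a distance or an angle between vectors, so the feature is unchanged when the same rigid rotation and translation is applied to both contacts. That is why such features need no knowledge of the object's pose relative to the hand.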

During data collection, multiple grasps are performed on each object to account for sensor noise. The testing phase accumulates votes for objects based on how well the observed contact point data matches the hash tables.

The paper explores two exploration strategies: passive random exploration, and active exploration that selects the next grasp to maximize the differentiation between the top object candidates. Different probability thresholds are evaluated as stopping criteria for identification.
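One way to picture the active strategy is sketched below. It assumes each trained object model can report the set of feature keys expected at each candidate grasp orientation (which requires a known hand-object pose), and it scores a candidate by the Jaccard overlap between the key sets of the two most likely objects. The paper's actual selection criterion is not detailed in this summary; `expected_keys`, `next_grasp`, and the overlap measure are illustrative assumptions.

```python
from typing import Dict, Hashable, Iterable, Set


def jaccard(a: Set, b: Set) -> float:
    """Overlap between two key sets: 0 = disjoint, 1 = identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def next_grasp(belief: Dict[str, float],
               expected_keys: Dict[str, Dict[Hashable, Set]],
               candidates: Iterable[Hashable]) -> Hashable:
    """Pick the candidate grasp where the top two hypotheses differ most.

    belief:        current probability per object (at least two objects)
    expected_keys: object -> grasp candidate -> feature keys seen in training
    """
    top1, top2 = sorted(belief, key=belief.get, reverse=True)[:2]
    return min(candidates,
               key=lambda g: jaccard(expected_keys[top1][g],
                                     expected_keys[top2][g]))
```

A passive explorer would instead draw the next candidate at random, for example with `random.choice`.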

Overall, the paper presents a contact-geometry-based approach for pose-invariant haptic object recognition, with the PN and P variants offering a trade-off between accuracy and sensor requirements. The key technical details covered are the PPF feature calculation, voting scheme, exploration strategies, and probabilistic inference.

Experiments

The paper discusses using the GraspIt! simulator to generate data for object classification. The simulator allows extracting contact pose information as a robotic hand rotates 360 degrees around objects. Five objects from the YCB database were chosen: a tuna can, mug, bowl, baseball, and foam brick.

To create training data, Gaussian noise was added to the simulated contact poses 50 times for each grasp to simulate sensor noise. This produced 50 noisy samples per grasp, whose features were used as keys to populate the hash tables for the two classification methods.

For testing, noise was again added to the simulated grasps to generate 50 sample keys. Each object's hash table was checked for these keys to estimate the probability that the grasp belongs to that object, using an equation given in the paper.
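The train-then-look-up procedure can be sketched as follows. The summary does not reproduce the paper's probability equation, so this example uses a stand-in: the fraction of noisy test keys found in each object's table, normalized across objects. The key here is a simplified P-style feature (sorted, quantized pairwise contact distances), and the noise level `sigma`, the bin size `step`, and all function names are hypothetical; only the 50 samples per grasp come from the text.

```python
import math
import random


def noisy_keys(contacts, rng, n_samples=50, sigma=0.001, step=0.005):
    """Perturb contact positions n_samples times and return one key per sample."""
    keys = []
    for _ in range(n_samples):
        noisy = [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in contacts]
        dists = sorted(math.dist(a, b)
                       for i, a in enumerate(noisy) for b in noisy[i + 1:])
        keys.append(tuple(int(d // step) for d in dists))
    return keys


def build_tables(training_grasps, rng):
    """training_grasps: object name -> list of grasps (each a list of 3D contacts)."""
    tables = {}
    for obj, grasps in training_grasps.items():
        tables[obj] = set()
        for contacts in grasps:
            tables[obj].update(noisy_keys(contacts, rng))
    return tables


def grasp_scores(contacts, tables, rng):
    """Normalized fraction of noisy test keys present in each object's table."""
    keys = noisy_keys(contacts, rng)
    hits = {obj: sum(k in table for k in keys) / len(keys)
            for obj, table in tables.items()}
    total = sum(hits.values())
    if total == 0:  # no key matched any object: fall back to a uniform score
        return {obj: 1.0 / len(hits) for obj in hits}
    return {obj: h / total for obj, h in hits.items()}
```

Scores of this kind could serve as the per-grasp likelihoods that are accumulated over a sequence of grasps until a confidence threshold is reached.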

The section on processing grasp values does not provide any specific details about the classification methods themselves, only the data generation process.

Results

The paper presents results comparing two methods (PN and P) for object classification through grasping, using both passive and active exploration approaches. Key findings include:

Passive Learning:

The PN method generally required fewer grasps than the P method for object classification, except for the foam brick object.
For a 99% confidence threshold, the PN method allowed single-grasp identification for the bowl object and required 2-3 grasps for other objects, outperforming the P method.
The baseball was the most difficult object to identify, requiring up to 327 grasps for the P method.
Perception errors were generally lower for the PN method compared to P, except at low confidence thresholds where the P method performed slightly better for some objects.

Active Learning:

Both methods required far fewer grasps compared to passive learning.
The PN method outperformed P for most objects, requiring fewer grasps on average. The bowl showed similar results for both methods.
The active P method struggled with classifying the mug object, exhibiting around 15% error even at 99% confidence.
Perception errors were substantially lower with active learning compared to passive for both methods.

Overall:

Active exploration with the PN method achieved the best performance, requiring the lowest median number of grasps (5) and exhibiting low perception errors.
Passive exploration with the P method performed the worst, requiring the highest median number of grasps (16) and producing the highest perception errors.

The results demonstrate the advantages of incorporating active exploration and using the PN method for efficient and accurate object classification through grasping.

Conclusions

The paper presents two methods for classifying objects from a sequence of grasps without requiring knowledge of the relative pose between the hand and the object. The first method (PN) uses the position and orientation of the contacts between the fingers and the object relative to the hand's reference frame. The second method (P) uses only the position of the contacts, catering to robots with more limited haptic sensing.

Experiments demonstrated that both methods effectively recognize objects in a limited set with good accuracy using a small number of grasps. The PN method, utilizing richer features, achieves better performance, requiring approximately half the number of grasps compared to the P method on average.

In cases where the relative pose between the hand and object can be measured (e.g., with external sensors), an active exploration method is proposed to further reduce the number of required grasps. The PN method still outperforms the P method, but the difference is smaller, suggesting that simpler contact sensing can be sufficient when the object's pose relative to the hand is known.

Future work will focus on a more comprehensive characterization of the methods, involving more objects, increased hand movement degrees of freedom, and more natural environments, both in simulation and with a real robot hand.

The paper introduces a pose-free method for haptic recognition, improving the efficiency of haptic sensing methods for robots in recognizing objects under natural conditions.

