Localization of Pallets on Shelves Using Horizontal Plane Projection of a 360-degree Image

Read original: arXiv:2404.17118 - Published 4/29/2024 by Yasuyo Kita, Yudai Fujieda, Ichiro Matsuda, Nobuyuki Kita

Overview

The paper proposes a method for calculating the 3D position and orientation of a pallet on a shelf next to a forklift truck using a 360-degree camera.
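The pallet's distortion in the 360-degree image varies with its 3D position, making it difficult to extract the pallet and its pose directly from the image.
An earlier method detects the pallet by projecting the 360-degree image onto a vertical plane that coincides with the front of the shelf, but the pose it yields is only approximate.
The proposed method projects the 360-degree image onto a horizontal plane containing the boundary line of the pallet's front surface to determine the yaw angle accurately.
The pallet's position is then found by moving a vertical plane, set at the detected yaw angle, back and forth until its projection image best matches the actual size of the pallet's front surface.
Experiments on real images from a laboratory and an actual warehouse confirm that the method calculates the pallet's position and orientation accurately enough to insert the forks into the holes in the front of the pallet.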

Plain English Explanation

The researchers have developed a way to figure out the 3D location and angle of a pallet sitting on a shelf to the side of a forklift truck. They use a 360-degree camera mounted on the forklift, which can see both a pallet at the side of the forklift and one several meters ahead.

However, the pallet appears distorted in the 360-degree image, making it hard to determine its exact position and orientation. To solve this, the researchers project the 360-degree image onto a flat vertical plane that lines up with the front of the shelf. This gives them a view of the pallet that's similar to looking at it straight-on.

Once the pallet is detected, they project the 360-degree image onto a horizontal plane that contains the boundary line of the pallet's front face. This allows them to accurately measure the yaw angle, which is the angle of the pallet's front face in the horizontal plane. To find the pallet's position, they move a vertical plane set at that yaw angle back and forth until the projected image best matches the actual size of the pallet's front.

Through testing in a lab and real warehouse, the researchers show their method can calculate the pallet's 3D position and orientation quickly and accurately enough for a forklift to automatically insert its forks into the pallet.

Technical Explanation

The key innovation in this paper is the use of a horizontal plane projection, on top of an existing vertical plane projection for detection, to accurately determine the 3D position and orientation of a pallet seen in a 360-degree camera image.

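First, following a previously proposed method, the 360-degree image is projected onto a vertical plane that coincides with the front of the shelf. This creates a view of the pallet similar to what would be seen from directly in front of the shelf, allowing the pallet to be detected. Detection also yields an approximate position and orientation, but not with enough accuracy for automatic forklift control.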
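The vertical plane projection can be sketched as a lookup from plane pixels to image pixels. This is a minimal illustration, not the authors' implementation: it assumes an equirectangular 360-degree image, a camera frame with x forward, y left, and z up, and hypothetical function names (`plane_to_equirect_coords`, `sample_nearest`) and parameters introduced only for this example.

```python
import numpy as np

def plane_to_equirect_coords(yaw, dist, width_m, z_min, z_max, px_per_m, img_w, img_h):
    """For each pixel of a vertical plane, return the (u, v) equirectangular
    coordinates that see it. Assumed camera frame: x forward, y left, z up.
    The plane normal is (cos yaw, sin yaw, 0) and the plane lies dist metres
    from the camera along that normal."""
    n = np.array([np.cos(yaw), np.sin(yaw), 0.0])    # plane normal
    t = np.array([-np.sin(yaw), np.cos(yaw), 0.0])   # in-plane axis, points left
    up = np.array([0.0, 0.0, 1.0])
    cols = int(round(width_m * px_per_m))
    rows = int(round((z_max - z_min) * px_per_m))
    # Column 0 is the leftmost plane point, row 0 the highest one.
    a = width_m / 2 - (np.arange(cols) + 0.5) / px_per_m
    b = z_max - (np.arange(rows) + 0.5) / px_per_m
    A, B = np.meshgrid(a, b)
    P = dist * n + A[..., None] * t + B[..., None] * up   # 3D points on the plane
    az = np.arctan2(P[..., 1], P[..., 0])                 # azimuth, positive to the left
    el = np.arctan2(P[..., 2], np.hypot(P[..., 0], P[..., 1]))
    # Assumed equirectangular convention: image centre looks along +x.
    u = (0.5 - az / (2 * np.pi)) * img_w
    v = (0.5 - el / np.pi) * img_h
    return u, v

def sample_nearest(img, u, v):
    """Nearest-neighbour lookup; wraps horizontally, clamps vertically."""
    h, w = img.shape[:2]
    ui = np.floor(u).astype(int) % w
    vi = np.clip(np.floor(v).astype(int), 0, h - 1)
    return img[vi, ui]
```

Calling `sample_nearest(img, *plane_to_equirect_coords(...))` returns a front-on view of the chosen plane. A real system would likely use bilinear interpolation and a calibrated camera pose rather than the idealized frame assumed here.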

Next, they project the 360-degree image onto a horizontal plane that includes the boundary line of the detected pallet's front surface. This allows them to precisely measure the yaw angle, which is the angle of the pallet's front face in the horizontal plane. To find the pallet's position, they move a vertical plane with the detected yaw angle back and forth, finding the location where the projection image on that plane best coincides with the actual dimensions of the pallet's front surface.
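The two steps above can be illustrated with a simplified geometric sketch. It is not the paper's algorithm: the yaw estimate here assumes boundary points have already been extracted from the horizontal plane image and fits a line to them, and the position search replaces the paper's image coincidence score with a plain width comparison based on the bearings of the pallet's left and right edges. All function names and parameters are hypothetical.

```python
import numpy as np

def yaw_from_boundary_points(xy):
    """Fit a line to 2D boundary points (metres, camera at the origin) and
    return the yaw of the front-face normal pointing away from the camera."""
    xy = np.asarray(xy, dtype=float)
    centroid = xy.mean(axis=0)
    # Principal direction of the point set = direction of the boundary line.
    _, _, vt = np.linalg.svd(xy - centroid, full_matrices=False)
    dx, dy = vt[0]
    normal = np.array([dy, -dx])
    if normal @ centroid < 0:          # make the normal point away from the camera
        normal = -normal
    return np.arctan2(normal[1], normal[0])

def sweep_distance(az_left, az_right, yaw, true_width, candidates):
    """Slide a vertical plane with the given yaw through candidate distances
    and keep the one where the width spanned by the two edge bearings is
    closest to the known pallet width (a stand-in for an image match score)."""
    n = np.array([np.cos(yaw), np.sin(yaw)])
    rays = np.array([[np.cos(az_left), np.sin(az_left)],
                     [np.cos(az_right), np.sin(az_right)]])
    denom = rays @ n
    if np.any(denom <= 0):
        raise ValueError("edge bearings do not intersect the plane in front of the camera")
    best_d, best_err = None, np.inf
    for d in candidates:
        pts = rays * (d / denom)[:, None]      # ray/plane intersections at distance d
        err = abs(np.linalg.norm(pts[0] - pts[1]) - true_width)
        if err < best_err:
            best_d, best_err = d, err
    return best_d
```

Because the spanned width grows linearly with distance in this simplified model, a closed-form solution exists; the sweep is kept only to mirror the back-and-forth search described in the text.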

Through experiments in both a lab setting and a real warehouse, the researchers demonstrate their method can calculate the pallet's 3D position and orientation within the accuracy required for automatic forklift control, while keeping the computation time reasonable.

Critical Analysis

The paper presents a thoughtful and well-designed approach to the challenge of localizing pallets in a 360-degree camera view. The use of both vertical and horizontal plane projections is a clever technique to overcome the distortions inherent in wide-angle imaging.

However, the paper does not explore potential limitations or edge cases of the method. For example, it's unclear how the system would perform if the pallet were not neatly aligned on the shelf, or if there were visual obstructions between the camera and the pallet.

Additionally, the accuracy and speed results are presented in a high-level manner. More detailed quantitative metrics, as well as comparisons to other pallet localization approaches, would help readers better evaluate the merits of this specific technique.

Overall, the research represents a useful contribution, but further investigation of the method's robustness and limitations would strengthen the claims and provide a more complete picture for readers.

Conclusion

This paper introduces an effective technique for calculating the 3D position and orientation of a pallet placed on a shelf next to a forklift truck. By projecting the 360-degree camera image onto both vertical and horizontal planes, the system can accurately detect the pallet's location and yaw angle.

The experiments demonstrated the method's ability to determine the pallet's pose with the precision required for autonomous forklift control. This advance could lead to more efficient and safer material handling in warehouse environments. Further research into the technique's flexibility and performance at scale would help solidify its value for real-world deployment.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Localization of Pallets on Shelves Using Horizontal Plane Projection of a 360-degree Image

Yasuyo Kita, Yudai Fujieda, Ichiro Matsuda, Nobuyuki Kita

In this paper, we propose a method for calculating the three-dimensional (3D) position and orientation of a pallet placed on a shelf on the side of a forklift truck using a 360-degree camera. By using a 360-degree camera mounted on the forklift truck, it is possible to observe both the pallet at the side of the forklift and one several meters ahead. However, the pallet on the obtained image is observed with different distortion depending on its 3D position, so that it is difficult to extract the pallet from the image. To solve this problem, a method [1] has been proposed for detecting a pallet by projecting a 360-degree image on a vertical plane that coincides with the front of the shelf to calculate an image similar to the image seen from the front of the shelf. At the same time as the detection, the approximate position and orientation of the detected pallet can be obtained, but the accuracy is not sufficient for automatic control of the forklift truck. In this paper, we propose a method for accurately detecting the yaw angle, which is the angle of the front surface of the pallet in the horizontal plane, by projecting the 360-degree image on a horizontal plane including the boundary line of the front surface of the detected pallet. The position of the pallet is also determined by moving the vertical plane having the detected yaw angle back and forth, and finding the position at which the degree of coincidence between the projection image on the vertical plane and the actual size of the front surface of the pallet is maximized. Experiments using real images taken in a laboratory and an actual warehouse have confirmed that the proposed method can calculate the position and orientation of a pallet within a reasonable calculation time and with the accuracy necessary for inserting the fork into the hole in the front of the pallet.

4/29/2024

Globally and Locally Optimized Pannini Projection for High FoV Rendering of 360-degree Images

Falah Jabar, Joao Ascenso, Maria Paula Queluz

To render a spherical (360-degree or omnidirectional) image on planar displays, a 2D image, called a viewport, must be obtained by projecting a sphere region onto a plane, according to the user's viewing direction and a predefined field of view (FoV). However, any sphere-to-plane projection introduces geometric distortions, such as object stretching and/or bending of straight lines, whose intensity increases with the considered FoV. In this paper, a fully automatic content-aware projection is proposed, aiming to reduce the geometric distortions when high FoVs are used. This new projection is based on the Pannini projection, whose parameters are first globally optimized according to the image content, followed by a local conformality improvement of relevant viewport objects. A crowdsourcing subjective test showed that the proposed projection is the most preferred solution among the considered state-of-the-art sphere-to-plane projections, producing viewports with a more pleasant visual quality.

6/6/2024

HPPS: A Hierarchical Progressive Perception System for Luggage Trolley Detection and Localization at Airports

Zhirui Sun, Zhe Zhang, Jieting Zhao, Hanjing Ye, Jiankun Wang

The robotic autonomous luggage trolley collection system employs robots to gather and transport scattered luggage trolleys at airports. However, existing methods for detecting and locating these luggage trolleys often fail when they are not fully visible. To address this, we introduce the Hierarchical Progressive Perception System (HPPS), which enhances the detection and localization of luggage trolleys under partial occlusion. The HPPS processes the luggage trolley's position and orientation separately, which requires only RGB images for labeling and training, eliminating the need for 3D coordinates and alignment. The HPPS can accurately determine the position of the luggage trolley with just one well-detected keypoint and estimate the luggage trolley's orientation when it is partially occluded. Once the luggage trolley's initial pose is detected, HPPS updates this information continuously to refine its accuracy until the robot begins grasping. The experiments on detection and localization demonstrate that HPPS is more reliable under partial occlusion compared to existing methods. Its effectiveness and robustness have also been confirmed through practical tests in actual luggage trolley collection tasks. A website about this work is available at HPPS.

5/10/2024

Fully Geometric Panoramic Localization

Junho Kim, Jiwon Jeong, Young Min Kim

We introduce a lightweight and accurate localization method that only utilizes the geometry of 2D-3D lines. Given a pre-captured 3D map, our approach localizes a panorama image, taking advantage of the holistic 360 view. The system mitigates potential privacy breaches or domain discrepancies by avoiding trained or hand-crafted visual descriptors. However, as lines alone can be ambiguous, we express distinctive yet compact spatial contexts from relationships between lines, namely the dominant directions of parallel lines and the intersection between non-parallel lines. The resulting representations are efficient in processing time and memory compared to conventional visual descriptor-based methods. Given the groups of dominant line directions and their intersections, we accelerate the search process to test thousands of pose candidates in less than a millisecond without sacrificing accuracy. We empirically show that the proposed 2D-3D matching can localize panoramas for challenging scenes with similar structures, dramatic domain shifts or illumination changes. Our fully geometric approach does not involve extensive parameter tuning or neural network training, making it a practical algorithm that can be readily deployed in the real world. Project page including the code is available through this link: https://82magnolia.github.io/fgpl/.

4/1/2024