Globally and Locally Optimized Pannini Projection for High FoV Rendering of 360-degree Images

Read original: arXiv:2406.03282 - Published 6/6/2024 by Falah Jabar, Joao Ascenso, Maria Paula Queluz


Overview

This paper proposes a novel approach to reduce geometric distortions when rendering 360-degree or omnidirectional images on flat displays.
The method is based on the Pannini projection, which is optimized globally for the image content and then improved locally for relevant viewport objects.
A crowdsourcing study shows that the proposed projection is the most preferred solution among the state-of-the-art sphere-to-plane projections considered, producing viewports with a more pleasant visual quality.

Plain English Explanation

When you view a 360-degree or omnidirectional image on a flat display, the image needs to be projected from a sphere onto a 2D plane, called a viewport. However, this projection can introduce geometric distortions, such as stretching or bending of objects, which become more pronounced as the field of view (FoV) increases.
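To make the growth of distortion with FoV concrete, here is a minimal sketch. It assumes the standard rectilinear (perspective) projection, where a direction at horizontal angle phi lands at x = tan(phi); the function name is hypothetical and the paper's own distortion measures are not reproduced here. The local horizontal scale is the derivative 1 / cos²(phi), so the stretch at the viewport edge relative to the center depends only on half the FoV.

```python
import math

def rectilinear_edge_stretch(fov_deg: float) -> float:
    """Horizontal magnification at the viewport edge relative to the center.

    Assumes the rectilinear mapping x = tan(phi), whose derivative is
    1 / cos(phi)**2; the edge of the viewport sits at phi = FoV / 2.
    """
    half_fov = math.radians(fov_deg) / 2.0
    return 1.0 / math.cos(half_fov) ** 2

if __name__ == "__main__":
    for fov in (60, 90, 120, 150):
        print(f"FoV {fov:3d} deg: edge stretch {rectilinear_edge_stretch(fov):.2f}x")
```

Under this assumption the edge stretch is about 2x at a 90-degree FoV and about 4x at 120 degrees, which is why wide viewports rendered with a plain perspective projection look increasingly stretched toward the borders.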

The researchers in this paper have developed a new projection method that aims to reduce these geometric distortions. Their approach is based on the Pannini projection: they first optimize the projection parameters globally based on the image content, and then further improve the local conformality (that is, how well local shapes and angles are preserved) of the relevant objects in the viewport.

By conducting a crowdsourcing study, the researchers found that their proposed projection method was the most preferred solution among the state-of-the-art sphere-to-plane projections, producing viewports with a more pleasant visual quality, especially when using a wide field of view.

Technical Explanation

The paper presents a fully automatic content-aware projection method to reduce geometric distortions when rendering 360-degree or omnidirectional images on planar displays. The proposed approach is based on the Pannini projection, which is a type of sphere-to-plane projection.
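This summary does not give the paper's exact parameterization, so the following is only a sketch of the commonly cited basic Pannini mapping with a single parameter, here called `d` (the name and the function are assumptions, not taken from the paper). With `d = 0` the mapping reduces to the rectilinear projection, and with `d = 1` it becomes the classic Pannini (cylindrical stereographic) projection.

```python
import math
from typing import Tuple

def pannini(lon: float, lat: float, d: float) -> Tuple[float, float]:
    """Map a viewing direction (longitude, latitude in radians, relative to
    the viewport center) to plane coordinates with the basic Pannini formula.

    d is the (assumed) projection-center distance parameter:
    d = 0 gives rectilinear, d = 1 gives the classic Pannini projection.
    """
    denom = d + math.cos(lon)
    if denom <= 0.0:
        raise ValueError("direction is outside the projectable range for this d")
    scale = (d + 1.0) / denom
    return scale * math.sin(lon), scale * math.tan(lat)

if __name__ == "__main__":
    lon = math.radians(60)
    for d in (0.0, 0.5, 1.0):
        x, _ = pannini(lon, 0.0, d)
        print(f"d = {d:.1f}: x at 60 deg = {x:.3f}")
```

Increasing `d` pulls the viewport edges inward, which reduces stretching at high FoV at the price of bending straight lines that are not vertical or radial. That trade-off is what makes the parameter worth optimizing per image.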

First, the Pannini projection parameters are globally optimized according to the image content, aiming to minimize the overall geometric distortions. Then, a local conformality improvement step is applied to relevant viewport objects, further reducing distortions in these areas.
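The global step can be pictured as a search over projection parameters that balances competing distortions. The sketch below is a toy illustration, not the paper's objective: the cost terms, the weights `w_stretch` and `w_bend`, and all function names are hypothetical. It assumes the basic Pannini mapping x = S sin(lon), y = S tan(lat) with S = (d + 1) / (d + cos(lon)), from which two closed-form proxies at the viewport edge follow: the ratio of horizontal to vertical local scale (1 means locally conformal) and the relative sag of a horizontal straight line.

```python
import math

def edge_stretch(d: float, half_fov: float) -> float:
    """Ratio of horizontal to vertical local scale at the viewport edge
    (on the equator). A value of 1.0 means no shape stretching there."""
    c = math.cos(half_fov)
    return (d * c + 1.0) / (d + c)

def line_bend(d: float, half_fov: float) -> float:
    """Relative sag at the viewport edge of a horizontal straight line.
    A value of 0.0 means the line stays straight (as with d = 0)."""
    c = math.cos(half_fov)
    return d * (1.0 - c) / (d + c)

def toy_cost(d: float, half_fov: float, w_stretch: float, w_bend: float) -> float:
    """Hypothetical content-weighted cost: the weights stand in for how much
    the image content suffers from stretching versus line bending."""
    stretch_term = math.log(edge_stretch(d, half_fov)) ** 2
    bend_term = line_bend(d, half_fov) ** 2
    return w_stretch * stretch_term + w_bend * bend_term

def best_d(half_fov: float, w_stretch: float, w_bend: float, steps: int = 100) -> float:
    """Grid search over d in [0, 1] for the lowest toy cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda d: toy_cost(d, half_fov, w_stretch, w_bend))

if __name__ == "__main__":
    half = math.radians(60)  # 120-degree horizontal FoV
    print("objects only:", best_d(half, 1.0, 0.0))
    print("lines only:  ", best_d(half, 0.0, 1.0))
    print("mixed:       ", best_d(half, 1.0, 1.0))
```

In this toy setting, content dominated by objects pushes the search toward `d = 1`, content dominated by straight lines pushes it toward `d = 0`, and mixed content lands in between. A real content-aware method would derive such weights from the image itself, for example from detected lines and salient objects.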

The researchers evaluated their method through a crowdsourcing subjective test, where participants compared the proposed projection to other state-of-the-art sphere-to-plane projections. The results showed that the new projection was the most preferred solution, producing viewports with a more pleasant visual quality, especially when using a wide field of view.

Critical Analysis

The paper presents a comprehensive and well-designed solution for reducing geometric distortions in 360-degree image rendering. The use of a content-aware optimization approach, combined with the local conformality improvement, is a novel and promising technique.

However, the paper does not discuss any limitations or potential issues with the proposed method. For example, it would be useful to know how the method performs on different types of 360-degree content, such as indoor vs. outdoor scenes, or how it handles stitching artifacts or other common issues in 360-degree imaging.

Additionally, the authors could have related their method to other recent work on 360-degree imagery, such as fully geometric panoramic localization or monocular indoor 360-degree surface normal estimation, which may also be relevant for improving 360-degree image rendering.

Overall, the paper presents a solid contribution to the field of 360-degree imaging, but there is room for further exploration and evaluation of the proposed method's performance and limitations.

Conclusion

This paper introduces a novel content-aware projection method to reduce geometric distortions when rendering 360-degree or omnidirectional images on planar displays. The approach is based on the Pannini projection, with global optimization of the projection parameters and local conformality improvement for relevant viewport objects.

The crowdsourcing evaluation showed that the proposed projection is the most preferred solution among the considered state-of-the-art sphere-to-plane projections, producing viewports with higher visual quality, especially when using wide fields of view.

This research could improve the user experience of viewing 360-degree content on standard displays. It may also inform related techniques for creating and manipulating 360-degree content, such as efficient 360-degree depth estimation or unconstrained text-to-3D scene generation.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Globally and Locally Optimized Pannini Projection for High FoV Rendering of 360-degree Images

Falah Jabar, Joao Ascenso, Maria Paula Queluz

To render a spherical (360-degree or omnidirectional) image on planar displays, a 2D image, called a viewport, must be obtained by projecting a sphere region onto a plane, according to the user's viewing direction and a predefined field of view (FoV). However, any sphere-to-plane projection introduces geometric distortions, such as object stretching and/or bending of straight lines, whose intensity increases with the considered FoV. In this paper, a fully automatic content-aware projection is proposed, aiming to reduce the geometric distortions when high FoVs are used. This new projection is based on the Pannini projection, whose parameters are first globally optimized according to the image content, followed by a local conformality improvement of relevant viewport objects. A crowdsourcing subjective test showed that the proposed projection is the most preferred solution among the considered state-of-the-art sphere-to-plane projections, producing viewports with a more pleasant visual quality.

6/6/2024

Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Zhijie Shen, Chunyu Lin, Lang Nie, Kang Liao

Depth estimation from a monocular 360 image is important to the perception of the entire 3D environment. However, the inherent distortion and large field of view (FoV) in 360 images pose great challenges for this task. To this end, existing mainstream solutions typically introduce additional perspective-based 360 representations (e.g., Cubemap) to achieve effective feature extraction. Nevertheless, regardless of the introduced representations, they eventually need to be unified into the equirectangular projection (ERP) format for the subsequent depth estimation, which inevitably reintroduces the troublesome distortions. In this work, we propose an oriented distortion-aware Gabor Fusion framework (PGFuse) to address the above challenges. First, we introduce Gabor filters that analyze texture in the frequency domain, thereby extending the receptive fields and enhancing depth cues. To address the reintroduced distortions, we design a linear latitude-aware distortion representation method to generate customized, distortion-aware Gabor filters (PanoGabor filters). Furthermore, we design a channel-wise and spatial-wise unidirectional fusion module (CS-UFM) that integrates the proposed PanoGabor filters to unify other representations into the ERP format, delivering effective and distortion-free features. Considering the orientation sensitivity of the Gabor transform, we introduce a spherical gradient constraint to stabilize this sensitivity. Experimental results on three popular indoor 360 benchmarks demonstrate the superiority of the proposed PGFuse to existing state-of-the-art solutions. Code will be made available upon acceptance.

9/2/2024

Open Panoramic Segmentation

Junwei Zheng, Ruiping Liu, Yufan Chen, Kunyu Peng, Chengzhi Wu, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Panoramic images, capturing a 360° field of view (FoV), encompass omnidirectional spatial information crucial for scene understanding. However, it is not only costly to obtain training-sufficient dense-annotated panoramas but also application-restricted when training models in a closed-vocabulary setting. To tackle this problem, in this work, we define a new task termed Open Panoramic Segmentation (OPS), where models are trained with FoV-restricted pinhole images in the source domain in an open-vocabulary setting while evaluated with FoV-open panoramic images in the target domain, enabling the zero-shot open panoramic semantic segmentation ability of models. Moreover, we propose a model named OOOPS with a Deformable Adapter Network (DAN), which significantly improves zero-shot panoramic semantic segmentation performance. To further enhance the distortion-aware modeling ability from the pinhole source domain, we propose a novel data augmentation method called Random Equirectangular Projection (RERP) which is specifically designed to address object deformations in advance. Surpassing other state-of-the-art open-vocabulary semantic segmentation approaches, a remarkable performance boost on three panoramic datasets, WildPASS, Stanford2D3D, and Matterport3D, proves the effectiveness of our proposed OOOPS model with RERP on the OPS task, especially +2.2% on outdoor WildPASS and +2.4% mIoU on indoor Stanford2D3D. The source code is publicly available at https://junweizheng93.github.io/publications/OPS/OPS.html.

7/15/2024

Elite360D: Towards Efficient 360 Depth Estimation via Semantic- and Distance-Aware Bi-Projection Fusion

Hao Ai, Lin Wang

360 depth estimation has recently received great attention for 3D reconstruction owing to its omnidirectional field of view (FoV). Recent approaches are predominantly focused on cross-projection fusion with geometry-based re-projection: they fuse 360 images with equirectangular projection (ERP) and another projection type, e.g., cubemap projection, to estimate depth with the ERP format. However, these methods suffer from 1) limited local receptive fields, making it hardly possible to capture large FoV scenes, and 2) prohibitive computational cost, caused by the complex cross-projection fusion module design. In this paper, we propose Elite360D, a novel framework that inputs the ERP image and icosahedron projection (ICOSAP) point set, which is undistorted and spatially continuous. Elite360D is superior in its capacity for learning a representation from a local-with-global perspective. With a flexible ERP image encoder, it includes an ICOSAP point encoder and a Bi-projection Bi-attention Fusion (B2F) module (~1M parameters in total). Specifically, the ERP image encoder can take various perspective image-trained backbones (e.g., ResNet, Transformer) to extract local features. The point encoder extracts the global features from the ICOSAP. Then, the B2F module captures the semantic- and distance-aware dependencies between each pixel of the ERP feature and the entire ICOSAP feature set. Without specific backbone design and obvious computational cost increase, Elite360D outperforms the prior arts on several benchmark datasets.

5/28/2024