CP HDR: A feature point detection and description library for LDR and HDR images

Read original: arXiv:2403.19935 - Published 4/1/2024 by Artur Santos Nascimento, Valter Guilherme Silva de Souza, Daniel Oliveira Dantas, Beatriz Trinchão Andrade


Introduction

The paper discusses the importance of feature point (FP) detection and description in various computer vision applications, such as 3D reconstruction, face recognition, image stitching, and object tracking. However, traditional FP detection algorithms designed for low dynamic range (LDR) images struggle with scenes that have extreme lighting conditions: under- or overexposed areas lose detail, so potential FPs in those areas are missed.

The paper proposes using high dynamic range (HDR) images to overcome these issues, as HDR images can represent a wider range of luminance values compared to LDR images. The paper highlights the importance of FP extraction algorithms that are robust to image transformations and can reliably describe detected FPs.

The main challenge addressed is extracting FPs from images with extreme lighting conditions, where LDR images fail to accurately register certain regions. The paper explains the differences between LDR and HDR images in terms of dynamic range and the ability to represent details in under- and overexposed areas.
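The loss of detail in under- and overexposed regions can be illustrated with a toy example. The sketch below is not taken from the paper; the function name `to_ldr_8bit` and the sample radiance values are hypothetical. It quantizes linear scene radiance to 8 bits with clipping, which is roughly what a single LDR exposure does.

```python
def to_ldr_8bit(radiance, exposure=1.0):
    """Quantize linear radiance values to 8 bits, clipping anything out of range."""
    return [max(0, min(255, int(round(r * exposure * 255)))) for r in radiance]


# Hypothetical scene: two very dark values, one mid-tone, two very bright values.
scene = [0.0005, 0.001, 0.4, 4.0, 50.0]
ldr = to_ldr_8bit(scene)

# The two dark values collapse to the same code, and so do the two bright ones,
# so any local contrast between them (a potential feature point) is lost.
print(ldr)
```

An HDR image stores the radiance values themselves (typically as floating point), so the differences between the dark pair and between the bright pair remain available to a detector.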

The authors have developed a library called CP_HDR that can handle both LDR and HDR images as input to FP detection and description algorithms. The paper describes the systematic review conducted to understand the state-of-the-art in this area, the metrics, datasets, and algorithms implemented in the CP_HDR library, and the comparison of their implementation with previous studies.

Systematic review

This section describes a systematic review of studies on feature point detection and description algorithms that use HDR images as input. The review process had three steps: planning, conduction, and description.

The planning step defined the main objective and four search questions: 1) Which tone mapping techniques are used? 2) Which detection and description algorithms are used? 3) Which evaluation metrics are used? 4) Which datasets are used?

The conduction step searched for relevant studies in several databases and selected 21 studies that met the inclusion criteria and none of the exclusion criteria.

The description step summarized the key findings from the selected studies:

Some studies modified existing detectors and descriptors to work with HDR images, while others used tone mapping to convert HDR to LDR images before applying the algorithms.
Commonly used tone mapping techniques include Mantiuk, Reinhard, Drago, and Fattal. The most used detectors are SURF, SIFT, Harris, and FAST. The most used descriptors are SIFT, SURF, FREAK, and BRISK.
Evaluation metrics include repeatability rate, mean average precision, and accuracy for tasks like facial expression recognition and object detection.
Only two studies made their HDR datasets publicly available.
Most studies focused on using tone mapping to improve feature point detection and description, rather than directly working with HDR images.
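To make the tone-mapping route concrete, here is a minimal sketch of a simple global operator in the style of Reinhard, one of the techniques named in the findings above. It is an illustration only, not code from any of the reviewed studies; the function names, the `key` value of 0.18, and the sample luminances are assumptions.

```python
import math


def reinhard_global(luminance, key=0.18, eps=1e-6):
    """Compress HDR luminance values into [0, 1) with L / (1 + L)."""
    # Log-average luminance gives a robust estimate of overall scene brightness.
    log_avg = math.exp(sum(math.log(eps + l) for l in luminance) / len(luminance))
    # Scale the scene so that its log-average maps to the chosen key value.
    scaled = [key * l / log_avg for l in luminance]
    # The compressive curve: near-linear for small values, saturating for large.
    return [s / (1.0 + s) for s in scaled]


def to_8bit(values):
    """Quantize values in [0, 1) to 8-bit codes, ready for an LDR detector."""
    return [min(255, int(round(v * 255))) for v in values]


hdr = [0.01, 0.5, 10.0, 5000.0]  # hypothetical luminances spanning a wide range
print(to_8bit(reinhard_global(hdr)))
```

A detector applied after this step sees only the compressed 8-bit values, which is why the review distinguishes this approach from algorithms that operate on the HDR data directly.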

The summary concludes that more work is needed to directly use HDR images as input to detection and description algorithms, rather than relying on tone mapping.

Methods

This study proposes a library called CP_HDR that can detect and describe features using either LDR or HDR images as input. The authors compared the results obtained using LDR and HDR images.

The library includes state-of-the-art metrics, well-known detection and description algorithms with support for LDR and HDR images, modifications to improve detection and description in HDR images, and a script that automatically segments images by intensity.

The key metrics used in the study are:

Repeatability Rate (RR): Measures the percentage of features detected in a reference image that are also detected in a test image. Higher RR indicates more stable feature detection.
Uniformity Rate (UR): Measures how evenly features are distributed across areas of the image with different illumination levels. A higher UR indicates better feature detection in both bright and dark regions.
Matching: Uses the Nearest Neighbor Distance Ratio (NNDR) to evaluate feature matching between images.
Mean Average Precision (mAP): Evaluates the overall quality of feature matching.
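The first two metrics can be sketched in a few lines. This is a hedged illustration, not the CP_HDR implementation: the function names, the pixel tolerance `tol`, and the particular uniformity formula (one minus the spread between the largest and smallest share of FPs per illumination region) are assumptions chosen to match the verbal definitions above.

```python
import math


def repeatability_rate(ref_points, test_points, transform, tol=3.0):
    """Fraction of reference FPs that reappear in the test image.

    `transform` maps a reference (x, y) into test-image coordinates.
    """
    if not ref_points:
        return 0.0
    repeated = 0
    for p in ref_points:
        px, py = transform(p)
        # A reference FP counts as repeated if any test FP lies within `tol` pixels.
        if any(math.hypot(px - qx, py - qy) <= tol for qx, qy in test_points):
            repeated += 1
    return repeated / len(ref_points)


def uniformity_rate(region_labels):
    """Score in [0, 1]; 1 means FPs are spread evenly over the three regions.

    `region_labels` holds one of 'dark', 'mid', 'bright' per detected FP.
    """
    n = len(region_labels)
    if n == 0:
        return 0.0
    shares = [region_labels.count(r) / n for r in ("dark", "mid", "bright")]
    return 1.0 - (max(shares) - min(shares))


ref = [(10, 10), (20, 20), (30, 30), (40, 40)]
test = [(15, 10), (25, 21), (100, 100)]
shift = lambda p: (p[0] + 5, p[1])  # hypothetical known transformation
print(repeatability_rate(ref, test, shift))
print(uniformity_rate(["dark", "mid", "bright", "bright"]))
```

In practice the region label of each FP would come from an intensity segmentation of the image, and the transformation from the dataset's ground truth.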

The authors modified the Harris corner and SIFT detectors to improve performance on HDR images by applying a coefficient of variation (CV) filter and logarithmic transformation. They also created two datasets with LDR and HDR images to test the library under extreme lighting conditions.
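The two preprocessing ideas mentioned above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the window radius, the use of `log1p`, and the function names are hypothetical, and the summary does not say in which order or with which parameters the library applies these steps.

```python
import math


def log_transform(image):
    """Compress a wide luminance range with log(1 + L), applied per pixel."""
    return [[math.log1p(v) for v in row] for row in image]


def cv_filter(image, radius=1, eps=1e-12):
    """Local coefficient of variation (standard deviation / mean) per pixel."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighborhood, truncated at the image borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            out[y][x] = math.sqrt(var) / (mean + eps)
    return out


dim = [[1.0, 2.0], [3.0, 4.0]]
bright = [[1000.0 * v for v in row] for row in dim]
# The CV responses are (nearly) identical although the luminance differs by 1000x.
print(cv_filter(dim)[0][0], cv_filter(bright)[0][0])
```

Because the coefficient of variation divides local contrast by local brightness, a structure yields a similar response in a dark region and in a bright one, which is one plausible reason such a filter helps spread detections across illumination levels.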

The experimental pipeline involves running the detection algorithms, selecting the top 500 features, describing them using SIFT, matching features between image pairs, and calculating the various evaluation metrics.
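Two steps of this pipeline, selecting the strongest features and matching descriptors with NNDR, can be sketched as below. This is not the CP_HDR code: the function names, the tuple layout of a keypoint, and the ratio threshold of 0.8 are assumptions, and toy 2D vectors stand in for real 128-dimensional SIFT descriptors.

```python
import math


def top_n(keypoints, n=500):
    """Keep the n keypoints with the highest detector response.

    Each keypoint is assumed to be a tuple (x, y, response).
    """
    return sorted(keypoints, key=lambda k: k[2], reverse=True)[:n]


def nndr_match(desc_a, desc_b, ratio=0.8):
    """Match descriptors using the Nearest Neighbor Distance Ratio test."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every candidate, nearest first.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up.
        if d2 > 0 and d1 / d2 < ratio:
            matches.append((i, j1))
    return matches


desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.0, 0.1), (10.0, 10.0), (5.0, 5.1), (5.1, 5.0)]
# Descriptor 0 has one clear nearest neighbor; descriptor 1 has two equally
# close candidates, so the ratio test rejects it as ambiguous.
print(nndr_match(desc_a, desc_b))
```

The accepted matches would then be compared against ground truth to compute the matching rate and mAP.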

Results

The paper summarizes the results of evaluating feature detection and description algorithms on datasets with varying lighting conditions and dynamic ranges. Key findings include:

The Harris and SIFT algorithms had better repeatability (sRR) scores, while the HfHDR and SfHDR algorithms had better uniformity (UR) scores. SfHDR had the best UR values in 3 out of 5 datasets.
In most cases, UR metrics were better when using HDR images. This suggests HDR data can improve the distribution of detected features across bright, dark, and intermediate areas.
The Harris+SIFT and HfHDR+SIFT algorithms had better mean average precision (mAP) and matching rate than SIFT and SfHDR. SIFT performed best on the 2D distance dataset.
For the more complex Rana et al. datasets, corner detectors like Harris performed better than blob detectors. The noise introduced by the HDR processing in very dark and specular areas may have negatively impacted the Rana et al. dataset results.
Compared to a previous study, the updated SIFT and SfHDR implementations showed significant improvements in the sRR and UR metrics.

Conclusions

This study investigated the use of HDR (high dynamic range) images in feature point (FP) detection and description. The researchers conducted a systematic review of 21 studies, finding that only one used HDR images, while the others used tone mapping algorithms to convert HDR images to LDR (low dynamic range).

The researchers implemented a library called CP_HDR that supports both LDR and HDR images. It includes the Harris corner detector and the SIFT algorithm, as well as the HfHDR and SfHDR algorithms proposed by previous researchers. The library is also designed so that new detector and descriptor algorithms can be easily added.

The results show that the CP_HDR algorithms improved FP detection compared to the previous implementation, especially when using HDR images. The HfHDR and SfHDR algorithms demonstrated better performance in terms of the UR (uniformity rate) metric. The detected FPs were well distributed across dark, intermediate, and bright areas of the images. In description, the SIFT algorithm performed better than SfHDR.

As future work, the researchers plan to implement the SURF algorithm in CP_HDR and compare the use of LDR, HDR, and tone-mapped LDR images, as well as improve the image processing filters to reduce noise, especially in low-light areas and on specular surfaces.

Conflict of interest

The authors state they have no conflicts of interest to declare.

Data availability

The code developed for this study is available on the GitHub repository at https://github.com/ddantas-ufs/2024_cp_hdr.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


CP HDR: A feature point detection and description library for LDR and HDR images

Artur Santos Nascimento, Valter Guilherme Silva de Souza, Daniel Oliveira Dantas, Beatriz Trinchão Andrade

In computer vision, characteristics refer to image regions with unique properties, such as corners, edges, textures, or areas with high contrast. These regions can be represented through feature points (FPs). FP detection and description are fundamental steps to many computer vision tasks. Most FP detection and description methods use low dynamic range (LDR) images, sufficient for most applications involving digital images. However, LDR images may have saturated pixels in scenes with extreme light conditions, which degrade FP detection. On the other hand, high dynamic range (HDR) images usually present a greater dynamic range but FP detection algorithms do not take advantage of all the information in such images. In this study, we present a systematic review of image detection and description algorithms that use HDR images as input. We developed a library called CP_HDR that implements the Harris corner detector, SIFT detector and descriptor, and two modifications of those algorithms specialized in HDR images, called SIFT for HDR (SfHDR) and Harris for HDR (HfHDR). Previous studies investigated the use of HDR images in FP detection, but we did not find studies investigating the use of HDR images in FP description. Using uniformity, repeatability rate, mean average precision, and matching rate metrics, we compared the performance of the CP_HDR algorithms using LDR and HDR images. We observed an increase in the uniformity of the distribution of FPs among the high-light, mid-light, and low-light areas of the images. The results show that using HDR images as input to detection algorithms improves performance and that SfHDR and HfHDR enhance FP description.

4/1/2024

NeRF-Supervised Feature Point Detection and Description

Ali Youssef, Francisco Vasconcelos

Feature point detection and description is the backbone for various computer vision applications, such as Structure-from-Motion, visual SLAM, and visual place recognition. While learning-based methods have surpassed traditional handcrafted techniques, their training often relies on simplistic homography-based simulations of multi-view perspectives, limiting model generalisability. This paper presents a novel approach leveraging Neural Radiance Fields (NeRFs) to generate a diverse and realistic dataset consisting of indoor and outdoor scenes. Our proposed methodology adapts state-of-the-art feature detectors and descriptors for training on multi-view NeRF-synthesised data, with supervision achieved through perspective projective geometry. Experiments demonstrate that the proposed methodology achieves competitive or superior performance on standard benchmarks for relative pose estimation, point cloud registration, and homography estimation while requiring significantly less training data and time compared to existing approaches.

7/31/2024

On-the-fly Point Feature Representation for Point Clouds Analysis

Jiangyi Wang, Zhongyao Cheng, Na Zhao, Jun Cheng, Xulei Yang

Point cloud analysis is challenging due to its unique characteristics of unorderness, sparsity and irregularity. Prior works attempt to capture local relationships by convolution operations or attention mechanisms, exploiting geometric information from coordinates implicitly. These methods, however, are insufficient to describe the explicit local geometry, e.g., curvature and orientation. In this paper, we propose On-the-fly Point Feature Representation (OPFR), which captures abundant geometric information explicitly through Curve Feature Generator module. This is inspired by Point Feature Histogram (PFH) from computer vision community. However, the utilization of vanilla PFH encounters great difficulties when applied to large datasets and dense point clouds, as it demands considerable time for feature generation. In contrast, we introduce the Local Reference Constructor module, which approximates the local coordinate systems based on triangle sets. Owing to this, our OPFR only requires extra 1.56ms for inference (65x faster than vanilla PFH) and 0.012M more parameters, and it can serve as a versatile plug-and-play module for various backbones, particularly MLP-based and Transformer-based backbones examined in this study. Additionally, we introduce the novel Hierarchical Sampling module aimed at enhancing the quality of triangle sets, thereby ensuring robustness of the obtained geometric features. Our proposed method improves overall accuracy (OA) on ModelNet40 from 90.7% to 94.5% (+3.8%) for classification, and OA on S3DIS Area-5 from 86.4% to 90.0% (+3.6%) for semantic segmentation, respectively, building upon PointNet++ backbone. When integrated with Point Transformer backbone, we achieve state-of-the-art results on both tasks: 94.8% OA on ModelNet40 and 91.7% OA on S3DIS Area-5.

8/13/2024


Perceptual Assessment and Optimization of High Dynamic Range Image Rendering

Peibei Cao, Rafal K. Mantiuk, Kede Ma

High dynamic range (HDR) rendering has the ability to faithfully reproduce the wide luminance ranges in natural scenes, but how to accurately assess the rendering quality is relatively underexplored. Existing quality models are mostly designed for low dynamic range (LDR) images, and do not align well with human perception of HDR image quality. To fill this gap, we propose a family of HDR quality metrics, in which the key step is employing a simple inverse display model to decompose an HDR image into a stack of LDR images with varying exposures. Subsequently, these decomposed images are assessed through well-established LDR quality metrics. Our HDR quality models present three distinct benefits. First, they directly inherit the recent advancements of LDR quality metrics. Second, they do not rely on human perceptual data of HDR image quality for re-calibration. Third, they facilitate the alignment and prioritization of specific luminance ranges for more accurate and detailed quality assessment. Experimental results show that our HDR quality metrics consistently outperform existing models in terms of quality assessment on four HDR image quality datasets and perceptual optimization of HDR novel view synthesis.

6/18/2024