Quantifying Noise of Dynamic Vision Sensor

2404.01948

Published 4/3/2024 by Evgeny V. Votyakov, Alessandro Artusi

Abstract

Dynamic vision sensors (DVS) are characterized by a large amount of background activity (BA) noise, which is mixed with the original (cleaned) sensor signal. The dynamic nature of the signal and the absence of ground truth in practical applications make it difficult to distinguish between noise and the cleaned sensor signal using standard image processing techniques. In this letter, a new technique derived from Detrended Fluctuation Analysis (DFA) is presented to characterize BA noise. The proposed technique can be used to address two existing DVS issues: how to quantitatively characterize noise and signal without ground truth, and how to derive optimal denoising filter parameters. The solution of the latter problem is demonstrated for the popular real moving-car dataset.

Overview

This paper explores methods for quantifying the noise in a dynamic vision sensor (DVS), a type of camera that captures changes in light rather than full images.
The researchers present a technique, derived from Detrended Fluctuation Analysis (DFA), for characterizing the background activity (BA) noise that is mixed into the DVS signal, without requiring ground truth.
They use the technique to derive optimal denoising filter parameters and demonstrate this on a real moving-car dataset.

Plain English Explanation

Dynamic vision sensors (DVS) are a type of camera that works differently from traditional cameras. Instead of capturing full images, a DVS only reports changes in light at each pixel. This allows it to be more responsive and efficient, capturing rapid movements that normal cameras might miss.

However, DVS output also contains a lot of unwanted background activity (BA) noise: spurious events that do not correspond to actual changes in the scene. This noise makes it harder to analyze the DVS data and extract useful information.

The researchers in this paper present a way to measure this BA noise, based on a statistical method called Detrended Fluctuation Analysis (DFA). Because real recordings have no noise-free reference to compare against, the method characterizes noise and signal directly from the recorded data, and the result is used to choose the best settings for a denoising filter.

According to the abstract, the approach is demonstrated on a real moving-car dataset, where it is used to select the denoising filter parameters. This could be useful for applications like autonomous vehicles, robotics, and computer vision that rely on fast, responsive cameras to perceive the world.

Technical Explanation

The paper concerns the dynamic vision sensor (DVS), an event-based, asynchronous camera that reports changes in light intensity rather than full images. These sensors offer advantages like high temporal resolution and low power consumption, but they also suffer from significant background activity (BA) noise that can obscure the meaningful visual information.
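To make the idea of "capturing changes in light intensity" concrete, here is a minimal sketch of the commonly used idealized model of a single DVS pixel: an event is emitted each time the log-intensity moves by a contrast threshold away from its level at the previous event. The names `Event`, `pixel_events`, and `threshold`, and the default threshold value, are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    t: float       # timestamp (seconds)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 = brighter, -1 = darker


def pixel_events(times: Sequence[float], intensities: Sequence[float],
                 x: int = 0, y: int = 0, threshold: float = 0.25) -> List[Event]:
    """Idealized, noise-free single-pixel event generation.

    `threshold` is a hypothetical contrast threshold in log-intensity units.
    """
    events: List[Event] = []
    reference = math.log(intensities[0])  # log-intensity at the last event
    for t, intensity in zip(times[1:], intensities[1:]):
        delta = math.log(intensity) - reference
        # A large change can cross the threshold several times in one step.
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append(Event(t, x, y, polarity))
            reference += polarity * threshold
            delta = math.log(intensity) - reference
    return events
```

In this model a constant intensity produces no events at all, which is why any events recorded from a static scene can be attributed to noise.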

To address this issue, the paper presents a technique derived from Detrended Fluctuation Analysis (DFA) for characterizing BA noise in DVS data. According to the abstract, the technique targets two problems: quantitatively characterizing noise and signal when no ground truth is available, and deriving optimal parameters for a denoising filter, that is, a filter that decides whether an event (i.e. a change in light) represents meaningful signal or just background noise.
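The abstract names Detrended Fluctuation Analysis (DFA) as the basis of the noise characterization. The sketch below shows textbook DFA with linear detrending on a generic 1D series; it is not the authors' implementation. How the paper turns an event stream into a series is not stated here, so applying this to, say, event counts per time bin would be an assumption. The function names and the choice of scales are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def _linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(profile: Sequence[float], n: int) -> float:
    """RMS deviation of the profile from per-window linear trends (window size n)."""
    windows = len(profile) // n
    xs = list(range(n))
    squared = 0.0
    for w in range(windows):
        segment = profile[w * n:(w + 1) * n]
        slope, intercept = _linear_fit(xs, segment)
        squared += sum((v - (slope * x + intercept)) ** 2
                       for x, v in zip(xs, segment))
    return math.sqrt(squared / (windows * n))


def dfa_exponent(series: Sequence[float], scales: Sequence[int]) -> float:
    """Scaling exponent alpha: slope of log F(n) against log n."""
    mean = sum(series) / len(series)
    profile: List[float] = []
    running = 0.0
    for value in series:  # integrate the mean-subtracted series
        running += value - mean
        profile.append(running)
    log_n = [math.log(n) for n in scales]
    log_f = [math.log(fluctuation(profile, n)) for n in scales]
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha
```

For uncorrelated (white) noise the exponent is close to 0.5, while a random walk gives a value close to 1.5, so the exponent offers one way to tell noise-like and structured series apart without a reference signal.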

The abstract reports that the second of these uses, deriving optimal denoising filter parameters, is demonstrated on a popular real moving-car dataset.
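For orientation, one widely used form of background-activity filtering in the event-camera literature is a nearest-neighbor timestamp check: an event is kept only if some neighboring pixel produced an event within a short time window. The sketch below illustrates that general idea and shows the kind of parameter (here the window `dt`) that a denoising filter exposes for tuning. It is not taken from the paper; the function name, the 3x3 neighborhood, and the default `dt` are assumptions.

```python
import math
from typing import Iterable, List, Tuple

EventTuple = Tuple[float, int, int, int]  # (t, x, y, polarity)


def filter_background_activity(events: Iterable[EventTuple], width: int,
                               height: int, dt: float = 0.01) -> List[EventTuple]:
    """Keep events supported by a recent event at one of the 8 neighboring pixels.

    `events` are assumed to be sorted by timestamp; `dt` is in seconds.
    """
    # Timestamp of the most recent event seen at each pixel.
    last_seen = [[-math.inf] * width for _ in range(height)]
    kept: List[EventTuple] = []
    for t, x, y, polarity in events:
        supported = any(
            t - last_seen[ny][nx] <= dt
            for ny in range(max(0, y - 1), min(height, y + 2))
            for nx in range(max(0, x - 1), min(width, x + 2))
            if (nx, ny) != (x, y)
        )
        if supported:
            kept.append((t, x, y, polarity))
        last_seen[y][x] = t  # isolated events still update the timestamp map
    return kept
```

A small `dt` removes more isolated noise events but also discards more real events from slowly moving edges, which is why choosing the parameter without ground truth is a non-trivial problem.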

Overall, the work offers a practical way to quantify noise in DVS data and to tune denoising accordingly, which could enable more reliable use of these sensors in computer vision and robotics applications that require fast, efficient perception of the environment.

Critical Analysis

The paper tackles a practical gap in event-based vision: characterizing noise and signal quantitatively when no ground truth is available. It builds on an established method, DFA, and demonstrates the resulting parameter selection on a real dataset.

That said, some potential limitations and edge cases remain open. For example, it is unclear how the technique would behave in scenes with rapid, complex motion whose events could be mistaken for background noise. Additionally, how sensitive the derived filter parameters are to recording conditions, and how they would need to be re-tuned for different use cases, is not explored in depth.

Furthermore, real-world challenges in deploying the technique in a DVS-based system, such as computational overhead, power consumption, or integration with other vision processing pipelines, are not discussed. These practical considerations would be important for assessing the overall feasibility and applicability of the approach.

Overall, the research represents a valuable contribution to the field of event-based vision, but further investigation into the limitations and tradeoffs of the technique would help provide a more comprehensive understanding of its strengths and weaknesses.

Conclusion

This paper presents a method, derived from Detrended Fluctuation Analysis (DFA), for characterizing background activity (BA) noise in dynamic vision sensor (DVS) data. The method quantifies noise and signal without requiring ground truth and is used to derive optimal parameters for a denoising filter.

The demonstration on a real moving-car dataset suggests the approach is a promising tool for improving the reliability and usability of DVS data in computer vision and robotics applications. While the paper does not explore all potential limitations, it provides a foundation for further research and development in this area.

As DVS technology continues to advance, quantitative noise characterization of this kind can help realize its potential for fast, efficient, and responsive perception of the environment.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
