EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More

Read original: arXiv:2408.16254 - Published 8/30/2024 by Kanghao Chen, Guoqiang Liang, Hangyu Li, Yunfan Lu, Lin Wang

Overview

Introduces a new low-light video enhancement method using an event camera
Presents a large-scale real-world dataset for low-light enhancement with an event camera
Demonstrates the effectiveness of the proposed method on downstream applications

Plain English Explanation

EvLight++: Low-Light Video Enhancement with an Event Camera is a research paper that focuses on improving the quality of videos captured in low-light conditions using a specialized type of camera called an event camera.

Event cameras are different from traditional cameras because they don't capture full frames of video. Instead, each pixel detects changes in brightness and records those changes as a series of "events." Because the sensor responds to relative changes rather than absolute brightness, it has a high dynamic range and can still register scene structure in low-light situations where traditional cameras would struggle.
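To make the idea of "events" concrete, here is a minimal sketch of the idealized event-generation model commonly used to describe these sensors: a pixel fires an event whenever its log intensity has moved by a fixed contrast threshold since the last event. The function name `generate_events` and the threshold value are illustrative assumptions, not something taken from the paper, and real sensors add noise and latency that this ignores.

```python
import math


def generate_events(intensities, timestamps, threshold=0.2, eps=1e-6):
    """Emit (timestamp, polarity) events for a single pixel.

    Idealized model (an assumption, not the paper's pipeline): an event
    fires each time the log intensity moves by at least `threshold`
    away from the level stored at the previous event.
    """
    events = []
    reference = math.log(intensities[0] + eps)
    for value, t in zip(intensities[1:], timestamps[1:]):
        log_value = math.log(value + eps)
        # A large jump can cross the threshold several times in one step.
        while abs(log_value - reference) >= threshold:
            polarity = 1 if log_value > reference else -1
            reference += polarity * threshold
            events.append((t, polarity))
    return events


# A pixel that stays constant, doubles in brightness, then drops sharply.
samples = [10.0, 10.0, 20.0, 20.0, 5.0]
times = [0.000, 0.001, 0.002, 0.003, 0.004]
events = generate_events(samples, times, threshold=0.5)
print(events)
```

Nothing is emitted while the pixel is constant, which is why event data is sparse. The model also works on ratios of brightness, so a doubling from 10 to 20 triggers the same response as a doubling from 1000 to 2000.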

The paper introduces a new method for using the data from an event camera to enhance low-light videos. It also presents a large dataset of real-world low-light videos captured with an event camera, which can be used by other researchers to develop and test their own low-light enhancement techniques.

Finally, the paper demonstrates how the proposed low-light enhancement method can improve the performance of other computer vision tasks, like semantic segmentation and monocular depth estimation, when applied to low-light videos.

Technical Explanation

The paper begins by introducing the problem of low-light video enhancement and the potential benefits of using an event camera for this task. Event cameras have several advantages over traditional cameras, including high dynamic range, low latency, and the ability to capture detailed information in low-light conditions.

The researchers then present a large-scale dataset of real-world low-light videos captured with an event camera. It contains over 30,000 pairs of frames and events recorded under varying illumination. The data was collected using a robotic arm that traces a consistent non-linear trajectory, which yields spatial alignment precision under 0.03 mm and temporal alignment errors under 0.01 s for 90% of the dataset.

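Next, the paper describes the researchers' proposed method, EvLight++, for low-light video enhancement using an event camera. The key idea is to use the event data from the camera to guide the enhancement of the corresponding low-light video frames. The method has three main components: a multi-scale holistic fusion branch that integrates structural and textural information from images and events; Signal-to-Noise Ratio (SNR)-guided regional feature selection, which enhances features from high-SNR regions and augments low-SNR regions with structural information extracted from events; and a recurrent module with a temporal loss that keeps the output temporally coherent.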
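The paper's abstract describes SNR-guided regional feature selection: image features are trusted where the signal-to-noise ratio is high, and event-derived structure fills in where it is low. The sketch below illustrates that principle only. It assumes the SNR is estimated as a smoothed image divided by the absolute residual (a common heuristic, not confirmed by this summary), and it replaces the paper's learned module with a hand-written soft blend. All function names and the `snr_threshold` value are hypothetical.

```python
def box_blur(image):
    """3x3 mean filter with edge replication, used as a crude denoiser."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out


def snr_map(image, eps=1e-3):
    """Per-pixel SNR estimate (assumed form): smoothed signal / |residual|.

    Expects non-negative intensities.
    """
    smooth = box_blur(image)
    return [
        [s / (abs(p - s) + eps) for p, s in zip(img_row, smooth_row)]
        for img_row, smooth_row in zip(image, smooth)
    ]


def snr_guided_fusion(image_feat, event_feat, snr, snr_threshold=5.0):
    """Blend features: favor the image at high SNR, the events at low SNR."""
    fused = []
    for img_row, ev_row, snr_row in zip(image_feat, event_feat, snr):
        row = []
        for f_img, f_ev, s in zip(img_row, ev_row, snr_row):
            weight = min(s / snr_threshold, 1.0)  # 1.0 means image only
            row.append(weight * f_img + (1.0 - weight) * f_ev)
        fused.append(row)
    return fused


# Toy 3x3 frame: dark everywhere except one isolated, noisy-looking pixel.
frame = [[0.0, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.0]]
image_feat = frame                            # stand-in for image features
event_feat = [[1.0] * 3 for _ in range(3)]    # stand-in for event features
fused = snr_guided_fusion(image_feat, event_feat, snr_map(frame))
```

In this toy frame the center pixel has a low estimated SNR, so the fused value there is dominated by the event feature. On a perfectly flat frame the residual is zero, the SNR is high everywhere, and the image features pass through unchanged.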

The experimental results demonstrate that EvLight++ outperforms both single-image and video-based methods on the authors' dataset and on the SDSD dataset, by 1.37 dB and 3.71 dB respectively. To study downstream tasks, the researchers extend their dataset with pseudo segmentation and depth labels generated with foundation models. Under diverse low-light scenes, the enhanced videos yield a 15.97% improvement in mIoU for semantic segmentation, and the paper also evaluates monocular depth estimation.
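Objective quality in low-light enhancement is usually reported in decibels, and the gains quoted in the paper's abstract (1.37 dB and 3.71 dB) are presumably PSNR improvements; that reading is an assumption here, since this summary does not name the metric. A minimal PSNR sketch for images stored as nested lists with values in [0, 1]:

```python
import math


def psnr(reference, enhanced, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, enh_row in zip(reference, enhanced)
        for r, e in zip(ref_row, enh_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


ground_truth = [[0.5, 0.5], [0.5, 0.5]]
prediction = [[0.6, 0.6], [0.6, 0.6]]
score = psnr(ground_truth, prediction)  # uniform error of 0.1 gives about 20 dB
```

Because the scale is logarithmic, a gain of 1.37 dB corresponds to a mean squared error that is roughly 27% lower (a factor of 10^(-0.137)).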

Critical Analysis

The paper presents a comprehensive and well-designed study on low-light video enhancement using event cameras. The accompanying dataset appears to be a valuable contribution to the field, as it provides a challenging real-world benchmark for evaluating low-light enhancement methods.

However, the paper does not thoroughly discuss the limitations of the proposed method. For example, it's not clear how the method would perform in extremely low-light conditions or in the presence of other challenging factors, such as camera motion or occlusions. Additionally, the paper does not provide a detailed analysis of the computational complexity and runtime of the proposed approach, which could be an important consideration for real-world applications.

Furthermore, the paper could have benefited from a more in-depth comparison to related work. While the authors do mention some existing methods, a more comprehensive discussion of the strengths and weaknesses of other approaches could help readers better understand the novelty and significance of the proposed technique.

Conclusion

Overall, the paper presents a promising new method for low-light video enhancement using an event camera, along with a valuable dataset for evaluating such techniques. The demonstrated improvements in downstream computer vision tasks suggest that the proposed approach could have practical applications in areas like surveillance, autonomous navigation, and low-light imaging. While the paper has some room for improvement in terms of its critical analysis and discussion of limitations, it represents a notable contribution to the field of low-light video processing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More

Kanghao Chen, Guoqiang Liang, Hangyu Li, Yunfan Lu, Lin Wang

Event cameras offer significant advantages for low-light video enhancement, primarily due to their high dynamic range. Current research, however, is severely limited by the absence of large-scale, real-world, and spatio-temporally aligned event-video datasets. To address this, we introduce a large-scale dataset with over 30,000 pairs of frames and events captured under varying illumination. This dataset was curated using a robotic arm that traces a consistent non-linear trajectory, achieving spatial alignment precision under 0.03mm and temporal alignment with errors under 0.01s for 90% of the dataset. Based on the dataset, we propose EvLight++, a novel event-guided low-light video enhancement approach designed for robust performance in real-world scenarios. Firstly, we design a multi-scale holistic fusion branch to integrate structural and textural information from both images and events. To counteract variations in regional illumination and noise, we introduce Signal-to-Noise Ratio (SNR)-guided regional feature selection, enhancing features from high SNR regions and augmenting those from low SNR regions by extracting structural information from events. To incorporate temporal information and ensure temporal coherence, we further introduce a recurrent module and temporal loss in the whole pipeline. Extensive experiments on our and the synthetic SDSD dataset demonstrate that EvLight++ significantly outperforms both single image- and video-based methods by 1.37 dB and 3.71 dB, respectively. To further explore its potential in downstream tasks like semantic segmentation and monocular depth estimation, we extend our datasets by adding pseudo segmentation and depth labels via meticulous annotation efforts with foundation models. Experiments under diverse low-light scenes show that the enhanced results achieve a 15.97% improvement in mIoU for semantic segmentation.

8/30/2024

Towards Real-world Event-guided Low-light Video Enhancement and Deblurring

Taewoo Kim, Jaeseok Jeong, Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon

In low-light conditions, capturing videos with frame-based cameras often requires long exposure times, resulting in motion blur and reduced visibility. While frame-based motion deblurring and low-light enhancement have been studied, they still pose significant challenges. Event cameras have emerged as a promising solution for improving image quality in low-light environments and addressing motion blur. They provide two key advantages: capturing scene details well even in low light due to their high dynamic range, and effectively capturing motion information during long exposures due to their high temporal resolution. Despite efforts to tackle low-light enhancement and motion deblurring using event cameras separately, previous work has not addressed both simultaneously. To explore the joint task, we first establish real-world datasets for event-guided low-light enhancement and deblurring using a hybrid camera system based on beam splitters. Subsequently, we introduce an end-to-end framework to effectively handle these tasks. Our framework incorporates a module to efficiently leverage temporal information from events and frames. Furthermore, we propose a module to utilize cross-modal feature information to employ a low-pass filter for noise suppression while enhancing the main structural information. Our proposed method significantly outperforms existing approaches in addressing the joint task. Our project pages are available at https://github.com/intelpro/ELEDNet.

8/28/2024

Event-assisted Low-Light Video Object Segmentation

Hebei Li, Jin Wang, Jiahui Yuan, Yue Li, Wenming Weng, Yansong Peng, Yueyi Zhang, Zhiwei Xiong, Xiaoyan Sun

In the realm of video object segmentation (VOS), the challenge of operating under low-light conditions persists, resulting in notably degraded image quality and compromised accuracy when comparing query and memory frames for similarity computation. Event cameras, characterized by their high dynamic range and ability to capture motion information of objects, offer promise in enhancing object visibility and aiding VOS methods under such low-light conditions. This paper introduces a pioneering framework tailored for low-light VOS, leveraging event camera data to elevate segmentation accuracy. Our approach hinges on two pivotal components: the Adaptive Cross-Modal Fusion (ACMF) module, aimed at extracting pertinent features while fusing image and event modalities to mitigate noise interference, and the Event-Guided Memory Matching (EGMM) module, designed to rectify the issue of inaccurate matching prevalent in low-light settings. Additionally, we present the creation of a synthetic LLE-DAVIS dataset and the curation of a real-world LLE-VOS dataset, encompassing frames and events. Experimental evaluations corroborate the efficacy of our method across both datasets, affirming its effectiveness in low-light scenarios.

4/3/2024

Seeing Motion at Nighttime with an Event Camera

Haoyue Liu, Shihan Peng, Lin Zhu, Yi Chang, Hanyu Zhou, Luxin Yan

We focus on a very challenging task: imaging at nighttime dynamic scenes. Most previous methods rely on the low-light enhancement of a conventional RGB camera. However, they would inevitably face a dilemma between the long exposure time of nighttime and the motion blur of dynamic scenes. Event cameras react to dynamic changes with higher temporal resolution (microsecond) and higher dynamic range (120dB), offering an alternative solution. In this work, we present a novel nighttime dynamic imaging method with an event camera. Specifically, we discover that the event at nighttime exhibits temporal trailing characteristics and spatial non-stationary distribution. Consequently, we propose a nighttime event reconstruction network (NER-Net) which mainly includes a learnable event timestamps calibration module (LETC) to align the temporal trailing events and a non-uniform illumination aware module (NIAM) to stabilize the spatiotemporal distribution of events. Moreover, we construct a paired real low-light event dataset (RLED) through a co-axial imaging system, including 64,200 spatially and temporally aligned image GTs and low-light events. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods in terms of visual quality and generalization ability on real-world nighttime datasets. The project is available at: https://github.com/Liu-haoyue/NER-Net.

4/19/2024