Towards Real-world Event-guided Low-light Video Enhancement and Deblurring

Read original: arXiv:2408.14916 - Published 8/28/2024 by Taewoo Kim, Jaeseok Jeong, Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon

Overview

This paper introduces a novel event-guided low-light video enhancement and deblurring method for real-world applications.
It leverages the high temporal resolution and high dynamic range of event cameras to guide the enhancement of low-light, blurry videos captured by conventional cameras.
According to the authors, the proposed method significantly outperforms existing approaches on the joint enhancement and deblurring task.

Plain English Explanation

In this paper, the researchers develop a new way to improve videos that are captured in low light and blurred by motion. Recording in low light with a regular camera often requires long exposure times, and anything that moves during the exposure is smeared. To compensate, they use a special type of camera called an event camera, which has very high temporal resolution and a wide dynamic range, to guide the enhancement of the regular camera's video.

The key idea is that the event camera provides information about motion and scene detail that the regular camera misses, which can then be used to brighten the low-light video and remove the motion blur. This lets the researchers produce a clearer and sharper final video, even in very dark or blurry conditions.

The researchers collected real-world data for this task and report that their method outperforms existing techniques for low-light enhancement and deblurring. This could matter in real-world applications such as surveillance, autonomous vehicles, or any scenario that requires high-quality video in challenging lighting.

Technical Explanation

The paper proposes a novel event-guided low-light video enhancement and deblurring method that leverages the advantages of event cameras to improve the quality of videos captured by conventional cameras in low-light and blurry conditions.

Event cameras are a type of sensor that captures changes in brightness rather than full image frames. This allows them to have a high temporal resolution and a wide dynamic range, which can be useful for various computer vision tasks.
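As a rough illustration of how such a sensor responds to brightness changes, the sketch below simulates an idealized event model: a pixel emits an event whenever its log intensity has changed by at least a contrast threshold since its last event. The function name `simulate_events`, the threshold value, and the frame-based approximation (at most one event per pixel per frame step) are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def simulate_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Idealized event generation from a stack of intensity frames.

    frames: array (T, H, W) of non-negative intensities.
    timestamps: array (T,) of frame times.
    Returns a list of (t, x, y, polarity) tuples.
    Simplification: at most one event per pixel per frame step.
    """
    log_ref = np.log(frames[0] + eps)  # per-pixel reference log intensity
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame + eps)
        diff = log_now - log_ref
        # A pixel fires when its log intensity moved by at least the threshold.
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((float(t), int(x), int(y), polarity))
            log_ref[y, x] = log_now[y, x]  # reset the reference at that pixel
    return events

# Usage: one pixel brightens at t=1, another darkens at t=2.
frames = np.ones((3, 2, 2))
frames[1:, 0, 0] = 2.0
frames[2, 1, 1] = 0.5
print(simulate_events(frames, np.array([0.0, 1.0, 2.0])))
```

Because the comparison is made in log intensity, the same relative change triggers an event in dark and bright regions alike, which is one common way to explain the wide dynamic range of these sensors.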

The key idea of the proposed method is to combine the event data with the low-light video in a single end-to-end framework that handles enhancement and deblurring jointly, rather than as separate stages. According to the abstract, the framework includes a module that leverages temporal information from both events and frames, and a module that uses cross-modal feature information to apply a low-pass filter, suppressing noise while enhancing the main structural information.

To study the joint task, the researchers built real-world datasets using a hybrid camera system based on beam splitters. They report that their method significantly outperforms existing approaches on this task.
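Before events can be processed alongside video frames, the asynchronous event stream is commonly converted into a fixed-size tensor. The sketch below shows one widely used representation, a voxel grid that bins signed events over a frame's exposure window. It is an assumption for illustration and not necessarily the representation used in the paper; the function name `events_to_voxel_grid` and its parameters are hypothetical.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width, t_start, t_end):
    """Accumulate signed events into `num_bins` temporal slices.

    events: sequence of (t, x, y, polarity) rows, polarity in {-1, +1}.
    Returns an array of shape (num_bins, height, width).
    Timestamps outside [t_start, t_end] are clipped into the first/last bin.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float64)
    if len(events) == 0:
        return grid
    events = np.asarray(events, dtype=np.float64)
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Map each timestamp to a bin index in [0, num_bins - 1].
    rel = (t - t_start) / max(t_end - t_start, 1e-9)
    bins = np.clip((rel * num_bins).astype(int), 0, num_bins - 1)
    # np.add.at accumulates correctly when (bin, y, x) indices repeat.
    np.add.at(grid, (bins, y, x), p)
    return grid

# Usage: three events within a unit-length exposure, split into two bins.
events = [(0.0, 0, 0, 1), (0.1, 0, 0, 1), (0.9, 1, 1, -1)]
grid = events_to_voxel_grid(events, num_bins=2, height=2, width=2,
                            t_start=0.0, t_end=1.0)
print(grid.shape)
```

Splitting the exposure window into several bins keeps some of the within-exposure timing that a single blurred frame loses, which is the kind of temporal information an event-guided method can draw on.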

Critical Analysis

The paper presents a promising approach to using event cameras to improve low-light, blurry videos captured by conventional cameras. Its key strengths are that it tackles low-light enhancement and deblurring jointly instead of separately, and that it is supported by real-world datasets captured with a hybrid camera system.

However, the paper does not provide a detailed analysis of the limitations or potential issues of the proposed method. For example, it would be interesting to understand how the method performs in extreme low-light conditions or with highly dynamic scenes, where the event camera data may be less reliable.

Additionally, the paper could have explored the tradeoffs between the computational complexity of the proposed method and its performance, as well as the potential impact of the method on real-world applications, such as surveillance or autonomous vehicles.

Conclusion

This paper introduces an event-guided method that enhances and deblurs low-light videos captured by conventional cameras, using the high dynamic range and high temporal resolution of event cameras. The authors report that it significantly outperforms existing approaches on the joint task. It could be useful wherever high-quality video is needed under difficult lighting and motion, for example in surveillance or autonomous vehicles.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards Real-world Event-guided Low-light Video Enhancement and Deblurring

Taewoo Kim, Jaeseok Jeong, Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon

In low-light conditions, capturing videos with frame-based cameras often requires long exposure times, resulting in motion blur and reduced visibility. While frame-based motion deblurring and low-light enhancement have been studied, they still pose significant challenges. Event cameras have emerged as a promising solution for improving image quality in low-light environments and addressing motion blur. They provide two key advantages: capturing scene details well even in low light due to their high dynamic range, and effectively capturing motion information during long exposures due to their high temporal resolution. Despite efforts to tackle low-light enhancement and motion deblurring using event cameras separately, previous work has not addressed both simultaneously. To explore the joint task, we first establish real-world datasets for event-guided low-light enhancement and deblurring using a hybrid camera system based on beam splitters. Subsequently, we introduce an end-to-end framework to effectively handle these tasks. Our framework incorporates a module to efficiently leverage temporal information from events and frames. Furthermore, we propose a module to utilize cross-modal feature information to employ a low-pass filter for noise suppression while enhancing the main structural information. Our proposed method significantly outperforms existing approaches in addressing the joint task. Our project pages are available at https://github.com/intelpro/ELEDNet.

8/28/2024

Cross-Modal Temporal Alignment for Event-guided Video Deblurring

Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon

Video deblurring aims to enhance the quality of restored results in motion-blurred videos by effectively gathering information from adjacent video frames to compensate for the insufficient data in a single blurred frame. However, when faced with consecutively severe motion blur situations, frame-based video deblurring methods often fail to find accurate temporal correspondence among neighboring video frames, leading to diminished performance. To address this limitation, we aim to solve the video deblurring task by leveraging an event camera with micro-second temporal resolution. To fully exploit the dense temporal resolution of the event camera, we propose two modules: 1) Intra-frame feature enhancement operates within the exposure time of a single blurred frame, iteratively enhancing cross-modality features in a recurrent manner to better utilize the rich temporal information of events, 2) Inter-frame temporal feature alignment gathers valuable long-range temporal information to target frames, aggregating sharp features leveraging the advantages of the events. In addition, we present a novel dataset composed of real-world blurred RGB videos, corresponding sharp videos, and event data. This dataset serves as a valuable resource for evaluating event-guided deblurring methods. We demonstrate that our proposed methods outperform state-of-the-art frame-based and event-based motion deblurring methods through extensive experiments conducted on both synthetic and real-world deblurring datasets. The code and dataset are available at https://github.com/intelpro/CMTA.

8/29/2024

EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More

Kanghao Chen, Guoqiang Liang, Hangyu Li, Yunfan Lu, Lin Wang

Event cameras offer significant advantages for low-light video enhancement, primarily due to their high dynamic range. Current research, however, is severely limited by the absence of large-scale, real-world, and spatio-temporally aligned event-video datasets. To address this, we introduce a large-scale dataset with over 30,000 pairs of frames and events captured under varying illumination. This dataset was curated using a robotic arm that traces a consistent non-linear trajectory, achieving spatial alignment precision under 0.03mm and temporal alignment with errors under 0.01s for 90% of the dataset. Based on the dataset, we propose EvLight++, a novel event-guided low-light video enhancement approach designed for robust performance in real-world scenarios. Firstly, we design a multi-scale holistic fusion branch to integrate structural and textural information from both images and events. To counteract variations in regional illumination and noise, we introduce Signal-to-Noise Ratio (SNR)-guided regional feature selection, enhancing features from high SNR regions and augmenting those from low SNR regions by extracting structural information from events. To incorporate temporal information and ensure temporal coherence, we further introduce a recurrent module and temporal loss in the whole pipeline. Extensive experiments on our and the synthetic SDSD dataset demonstrate that EvLight++ significantly outperforms both single image- and video-based methods by 1.37 dB and 3.71 dB, respectively. To further explore its potential in downstream tasks like semantic segmentation and monocular depth estimation, we extend our datasets by adding pseudo segmentation and depth labels via meticulous annotation efforts with foundation models. Experiments under diverse low-light scenes show that the enhanced results achieve a 15.97% improvement in mIoU for semantic segmentation.

8/30/2024

Seeing Motion at Nighttime with an Event Camera

Haoyue Liu, Shihan Peng, Lin Zhu, Yi Chang, Hanyu Zhou, Luxin Yan

We focus on a very challenging task: imaging at nighttime dynamic scenes. Most previous methods rely on the low-light enhancement of a conventional RGB camera. However, they would inevitably face a dilemma between the long exposure time of nighttime and the motion blur of dynamic scenes. Event cameras react to dynamic changes with higher temporal resolution (microsecond) and higher dynamic range (120dB), offering an alternative solution. In this work, we present a novel nighttime dynamic imaging method with an event camera. Specifically, we discover that the event at nighttime exhibits temporal trailing characteristics and spatial non-stationary distribution. Consequently, we propose a nighttime event reconstruction network (NER-Net) which mainly includes a learnable event timestamps calibration module (LETC) to align the temporal trailing events and a non-uniform illumination aware module (NIAM) to stabilize the spatiotemporal distribution of events. Moreover, we construct a paired real low-light event dataset (RLED) through a co-axial imaging system, including 64,200 spatially and temporally aligned image GTs and low-light events. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods in terms of visual quality and generalization ability on real-world nighttime datasets. The project is available at: https://github.com/Liu-haoyue/NER-Net.

4/19/2024