Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network

Read original: arXiv:2304.11857 - Published 8/6/2024 by Rui Zhang, Luziwei Leng, Kaiwei Che, Hu Zhang, Jie Cheng, Qinghai Guo, Jiangxing Liao, Ran Cheng

Overview

Spiking Neural Networks (SNNs) are emerging as promising solutions for processing dynamic, asynchronous signals from event-based sensors.
SNNs face challenges in training and architectural design, resulting in limited performance compared to Artificial Neural Networks (ANNs) in event-based dense prediction tasks.
This work develops an efficient Spiking Encoder-Decoder Network (SpikingEDN) for large-scale event-based semantic segmentation tasks.

Plain English Explanation

The paper discusses the potential of Spiking Neural Networks (SNNs) for processing dynamic, event-based sensor data. Unlike traditional Artificial Neural Networks (ANNs), SNNs loosely mimic how biological neurons communicate: they exchange discrete spikes, which allows low-power, event-driven computation.

While SNNs hold promise, the researchers note that they face challenges in training and architectural design, which have limited their performance relative to ANNs on complex event-based tasks like semantic segmentation. To address this, the researchers develop a new model called the Spiking Encoder-Decoder Network (SpikingEDN).

The key innovations in the SpikingEDN include:

An adaptive threshold that improves the network's accuracy, sparsity, and robustness during inference
A dual-path Spiking Spatially-Adaptive Modulation module that enhances the representation of sparse events and multi-modal inputs, improving overall performance

The researchers show that the SpikingEDN achieves results competitive with state-of-the-art ANN models on event-based semantic segmentation tasks while requiring substantially fewer computational resources. This points to the untapped potential of SNNs in event-based vision applications.

Technical Explanation

The researchers develop an efficient Spiking Encoder-Decoder Network (SpikingEDN) for large-scale event-based semantic segmentation tasks. To learn more efficiently from dynamic event streams, they use an adaptive threshold, which improves the network's accuracy, sparsity, and robustness during streaming inference.
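To make the adaptive-threshold idea concrete, here is a minimal sketch of a single leaky integrate-and-fire neuron whose firing threshold rises after each spike and then decays back. This is a generic textbook-style formulation, not necessarily the one used in the paper; the function name, parameter names, and default values are all illustrative assumptions.

```python
def simulate_adaptive_lif(inputs, decay=0.9, base_threshold=1.0,
                          threshold_decay=0.8, threshold_step=0.5):
    """Simulate one leaky integrate-and-fire neuron with an adaptive threshold.

    All parameter names and default values are illustrative assumptions.
    Returns (spikes, thresholds), one entry per time step.
    """
    membrane = 0.0     # membrane potential
    adaptation = 0.0   # extra threshold accumulated from recent spikes
    spikes, thresholds = [], []
    for current in inputs:
        threshold = base_threshold + adaptation
        membrane = decay * membrane + current   # leaky integration
        spike = 1 if membrane >= threshold else 0
        if spike:
            membrane = 0.0                      # hard reset after a spike
        # The adaptation term decays each step and grows after each spike.
        adaptation = threshold_decay * adaptation + threshold_step * spike
        spikes.append(spike)
        thresholds.append(threshold)
    return spikes, thresholds


constant_drive = [0.6] * 20
adaptive_spikes, _ = simulate_adaptive_lif(constant_drive)
fixed_spikes, _ = simulate_adaptive_lif(constant_drive, threshold_step=0.0)
print("adaptive:", sum(adaptive_spikes), "fixed:", sum(fixed_spikes))
```

Under the same constant input, the adaptive neuron fires less often than the fixed-threshold one (`threshold_step=0.0`), which illustrates the sparsity effect described above.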

Furthermore, the researchers develop a Dual-Path Spiking Spatially-Adaptive Modulation module, which is specifically tailored to enhance the representation of sparse events and multi-modal inputs, thereby considerably improving the network's performance.
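This summary does not give the internals of the dual-path module, so the sketch below only illustrates the general idea behind spatially-adaptive modulation: a second input (for example, a frame-derived map) produces a per-location scale and shift that are applied to the event features. The linear maps, names, and weights are hypothetical stand-ins for learned layers, and the code is non-spiking for simplicity.

```python
def spatially_adaptive_modulation(features, guide,
                                  scale_weight=0.5, shift_weight=0.1):
    """Modulate a 2D feature map with a per-location scale and shift.

    The scale (gamma) and shift (beta) are computed from `guide` at each
    location, so the modulation varies across the image. The linear maps
    used here are hypothetical stand-ins for learned layers.
    """
    output = []
    for feature_row, guide_row in zip(features, guide):
        out_row = []
        for value, g in zip(feature_row, guide_row):
            gamma = scale_weight * g   # per-location scale
            beta = shift_weight * g    # per-location shift
            out_row.append(value * (1.0 + gamma) + beta)
        output.append(out_row)
    return output


# Sparse event features (mostly zeros) and a denser guide map.
event_features = [[1.0, 0.0], [0.0, 2.0]]
frame_guide = [[0.0, 2.0], [2.0, 2.0]]
modulated = spatially_adaptive_modulation(event_features, frame_guide)
print(modulated)
```

Locations where the event features are zero still receive a signal through the shift term, which is one way a second modality could compensate for sparse events.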

The SpikingEDN achieves a mean Intersection over Union (MIoU) of 72.57% on the DDD17 dataset and 58.32% on the larger DSEC-Semantic dataset, results competitive with state-of-the-art ANNs while requiring substantially fewer computational resources.
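For readers unfamiliar with the metric, the following sketch shows how MIoU is commonly computed from per-pixel class labels. It assumes flat lists of class ids and skips classes that appear in neither the prediction nor the ground truth (one common convention); the evaluation code behind the reported numbers may differ in such details.

```python
def mean_iou(predictions, labels, num_classes):
    """Mean Intersection over Union over flat lists of per-pixel class ids.

    Classes absent from both prediction and label are skipped
    (a common convention, assumed here).
    """
    ious = []
    for cls in range(num_classes):
        pairs = list(zip(predictions, labels))
        intersection = sum(1 for p, t in pairs if p == cls and t == cls)
        union = sum(1 for p, t in pairs if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


predicted = [0, 0, 1, 1, 2, 2]
ground_truth = [0, 1, 1, 1, 2, 0]
score = mean_iou(predicted, ground_truth, num_classes=3)
print(f"{score:.3f}")  # per-class IoUs are 1/3, 2/3, and 1/2, so this prints 0.500
```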

Critical Analysis

The paper presents a promising approach to leveraging the benefits of Spiking Neural Networks (SNNs) for event-based semantic segmentation tasks. The researchers have addressed key challenges in SNN training and architecture design, leading to performance competitive with ANN models.

However, the paper does not discuss the potential limitations or caveats of the proposed SpikingEDN. For instance, it would be helpful to understand the training and inference time requirements of the model, as well as its sensitivity to hyperparameter tuning or the size of the training dataset.

Additionally, the researchers could explore the generalization of the SpikingEDN to other event-based vision tasks, such as object detection or depth estimation, to further demonstrate the versatility of their approach.

Conclusion

This work highlights the untapped potential of Spiking Neural Networks (SNNs) for event-based vision applications. The proposed Spiking Encoder-Decoder Network (SpikingEDN) demonstrates competitive performance on semantic segmentation tasks while requiring substantially fewer computational resources than state-of-the-art Artificial Neural Network models.

The key innovations, the adaptive threshold and the dual-path Spiking Spatially-Adaptive Modulation module, show that SNNs can effectively process dynamic, asynchronous event-based data. This research paves the way for further exploration of SNN architectures and their potential applications in event-based vision and beyond.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network

Rui Zhang, Luziwei Leng, Kaiwei Che, Hu Zhang, Jie Cheng, Qinghai Guo, Jiangxing Liao, Ran Cheng

Spiking neural networks (SNNs), known for their low-power, event-driven computation and intrinsic temporal dynamics, are emerging as promising solutions for processing dynamic, asynchronous signals from event-based sensors. Despite their potential, SNNs face challenges in training and architectural design, resulting in limited performance in challenging event-based dense prediction tasks compared to artificial neural networks (ANNs). In this work, we develop an efficient spiking encoder-decoder network (SpikingEDN) for large-scale event-based semantic segmentation tasks. To enhance the learning efficiency from dynamic event streams, we harness the adaptive threshold which improves network accuracy, sparsity and robustness in streaming inference. Moreover, we develop a dual-path Spiking Spatially-Adaptive Modulation module, which is specifically tailored to enhance the representation of sparse events and multi-modal inputs, thereby considerably improving network performance. Our SpikingEDN attains a mean intersection over union (MIoU) of 72.57% on the DDD17 dataset and 58.32% on the larger DSEC-Semantic dataset, showing competitive results to the state-of-the-art ANNs while requiring substantially fewer computational resources. Our results shed light on the untapped potential of SNNs in event-based vision applications. The source code will be made publicly available.

8/6/2024

EvSegSNN: Neuromorphic Semantic Segmentation for Event Data

Dalia Hareb, Jean Martinet

Semantic segmentation is an important computer vision task, particularly for scene understanding and navigation of autonomous vehicles and UAVs. Several variations of deep neural network architectures have been designed to tackle this task. However, due to their huge computational costs and their high memory consumption, these models are not meant to be deployed on resource-constrained systems. To address this limitation, we introduce an end-to-end biologically inspired semantic segmentation approach by combining Spiking Neural Networks (SNNs, a low-power alternative to classical neural networks) with event cameras whose output data can directly feed these neural network inputs. We have designed EvSegSNN, a biologically plausible encoder-decoder U-shaped architecture relying on Parametric Leaky Integrate and Fire neurons in an objective to trade-off resource usage against performance. The experiments conducted on DDD17 demonstrate that EvSegSNN outperforms the closest state-of-the-art model in terms of MIoU while reducing the number of parameters by a factor of 1.6 and sparing a batch normalization stage.

6/21/2024

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li, Lang Qin, Runhao Jiang, De Ma, Huajin Tang

Event cameras, with their high dynamic range and temporal resolution, are ideally suited for object detection, especially under scenarios with motion blur and challenging lighting conditions. However, while most existing approaches prioritize optimizing spatiotemporal representations with advanced detection backbones and early aggregation functions, the crucial issue of adaptive event sampling remains largely unaddressed. Spiking Neural Networks (SNNs), which operate on an event-driven paradigm through sparse spike communication, emerge as a natural fit for addressing this challenge. In this study, we discover that the neural dynamics of spiking neurons align closely with the behavior of an ideal temporal event sampler. Motivated by this insight, we propose a novel adaptive sampling module that leverages recurrent convolutional SNNs enhanced with temporal memory, facilitating a fully end-to-end learnable framework for event-based detection. Additionally, we introduce Residual Potential Dropout (RPD) and Spike-Aware Training (SAT) to regulate potential distribution and address performance degradation encountered in spike-based sampling modules. Empirical evaluation on neuromorphic detection datasets demonstrates that our approach outperforms existing state-of-the-art spike-based methods with significantly fewer parameters and time steps. For instance, our method yields a 4.4% mAP improvement on the Gen1 dataset, while requiring 38% fewer parameters and only three time steps. Moreover, the applicability and effectiveness of our adaptive sampling methodology extend beyond SNNs, as demonstrated through further validation on conventional non-spiking models. Code is available at https://github.com/Windere/EAS-SNN.

8/27/2024

Embedded event based object detection with spiking neural network

Jonathan Courtois, Pierre-Emmanuel Novac, Edgar Lemaire, Alain Pegatoquet, Benoit Miramond

The complexity of event-based object detection (OD) poses considerable challenges. Spiking Neural Networks (SNNs) show promising results and pave the way for efficient event-based OD. Despite this success, the path to efficient SNNs on embedded devices remains a challenge. This is due to the size of the networks required to accomplish the task and the ability of devices to take advantage of SNNs benefits. Even when edge devices are considered, they typically use embedded GPUs that consume tens of watts. In response to these challenges, our research introduces an embedded neuromorphic testbench that utilizes the SPiking Low-power Event-based ArchiTecture (SPLEAT) accelerator. Using an extended version of the Qualia framework, we can train, evaluate, quantize, and deploy spiking neural networks on an FPGA implementation of SPLEAT. We used this testbench to load a state-of-the-art SNN solution, estimate the performance loss associated with deploying the network on dedicated hardware, and run real-world event-based OD on neuromorphic hardware specifically designed for low-power spiking neural networks. Remarkably, our embedded spiking solution, which includes a model with 1.08 million parameters, operates efficiently with 490 mJ per prediction.

6/26/2024