Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators

Read original: arXiv:2305.14561 - Published 4/16/2024 by Yifan Qin, Zheyu Yan, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi

Overview

The paper proposes Negative Feedback Training (NFT), a training concept that makes Deep Neural Networks (DNNs) more robust to device variations in compute-in-memory (CIM) accelerators built on non-volatile memory (NVM) devices.
NVM-based CIM accelerators offer high energy efficiency and low latency for DNN inference, but the stochastic nature and intrinsic variations of the devices can degrade inference performance.
Existing methods that inject device variations during training rely solely on the model's final output, which limits accuracy improvement, reduces prediction confidence, and can cause convergence issues.
NFT instead leverages multi-scale noisy information captured from the network, with two specific instances: Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS).

Plain English Explanation

Deep Neural Networks (DNNs) are powerful machine learning models that can perform complex tasks like image recognition and natural language processing. However, when these DNNs are implemented in specialized hardware called compute-in-memory (CIM) accelerators, they can encounter performance issues due to the inherent variations in the non-volatile memory (NVM) devices used to build these accelerators.

To address this problem, the researchers in this paper propose a new training concept called "Negative Feedback Training" (NFT). The key idea behind NFT is to leverage the noisy and imperfect information captured from various stages of the neural network during training, rather than just relying on the final output of the model.

The researchers developed two specific instances of NFT: Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). In the authors' experiments, these methods outperform existing approaches, improving inference accuracy by up to 46.71% while also reducing uncertainty in the model's predictions, raising output confidence, and making training more likely to converge.

The effectiveness of the NFT concept highlights its potential in making DNNs more robust and reliable when deployed on hardware that has inherent variations, like the CIM accelerators built with NVM devices. This could be particularly important for applications where consistent and confident predictions are critical, such as autonomous driving or recommender systems.

Technical Explanation

The paper introduces Negative Feedback Training (NFT), a training concept that enhances the robustness of DNNs against device variations in CIM accelerators built on NVM devices.

CIM accelerators excel in energy efficiency and latency for DNN inference thanks to their in-situ data processing, but the stochastic nature and intrinsic variations of NVM devices can degrade inference performance. Introducing these non-ideal device behaviors during training improves robustness, yet existing methods of this kind suffer from limited accuracy improvement, reduced prediction confidence, and convergence issues. The authors attribute this to a mismatch between deterministic training and non-deterministic device variations: although such training accounts for variations, it relies solely on the model's final output.
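To make the effect of device variations concrete, the following minimal sketch perturbs the weights of a toy linear classifier with Gaussian noise and measures accuracy over repeated noisy trials. The noise model (additive Gaussian noise whose standard deviation is `sigma` times the largest absolute weight) and all function names are illustrative assumptions, not details taken from the paper; real NVM devices may follow a different variation model.

```python
import random


def perturb_weights(weights, sigma, rng):
    """Return a noisy copy of `weights` (a list of rows); the input is not modified.

    Assumption: each weight receives additive Gaussian noise with standard
    deviation sigma * max|w|. This is a simplification for illustration only.
    """
    scale = sigma * max(abs(w) for row in weights for w in row)
    return [[w + rng.gauss(0.0, scale) for w in row] for row in weights]


def predict(weights, x):
    """Toy linear classifier: index of the row with the largest dot product."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def noisy_accuracy(weights, data, sigma, trials, seed=0):
    """Average accuracy over `trials` independently perturbed weight copies."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        noisy = perturb_weights(weights, sigma, rng)
        correct += sum(predict(noisy, x) == y for x, y in data)
    return correct / (trials * len(data))


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0]]          # hypothetical 2-class model
    data = [([1.0, 0.2], 0), ([0.1, 1.0], 1)]   # hypothetical (input, label) pairs
    for sigma in (0.0, 0.2, 0.5):
        print(sigma, noisy_accuracy(weights, data, sigma, trials=200))
```

Evaluating over many noisy copies rather than a single one reflects that each programmed device instance differs, so robustness is a statistical property rather than a single number.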

To address these limitations, the researchers draw inspiration from control theory and propose the NFT concept. NFT leverages the multi-scale noisy information captured from various stages of the neural network, rather than relying solely on the final output. The researchers develop two specific NFT instances:

Oriented Variational Forward (OVF): This method introduces noise-augmented information at intermediate layers of the network, guiding the model to learn robust features that are resilient to device variations.
Intermediate Representation Snapshot (IRS): This method captures snapshots of the intermediate representations during training and uses them to provide negative feedback, helping the model learn more robust features.
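The idea of training on more than the final output can be illustrated with a generic multi-output objective. The sketch below adds weighted cross-entropy terms for auxiliary "feedback" outputs (for example, outputs from noisy forward passes or from intermediate representations) to the loss on the final output. This is not the paper's loss function: the additive form, the `betas` weights, and the function names are assumptions chosen only to show how several noisy signals can enter one training objective.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label.

    Uses the log-sum-exp trick so large logits do not overflow.
    """
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multi_output_loss(final_logits, feedback_logits, label, betas):
    """Final-output loss plus weighted losses on auxiliary feedback outputs.

    Assumption: each feedback output is a logit vector over the same classes,
    and `betas[i]` is a hypothetical hand-chosen weight for feedback_logits[i].
    """
    if len(feedback_logits) != len(betas):
        raise ValueError("need exactly one beta per feedback output")
    loss = cross_entropy(final_logits, label)
    for beta, logits in zip(betas, feedback_logits):
        loss += beta * cross_entropy(logits, label)
    return loss


if __name__ == "__main__":
    final = [2.0, 0.5, -1.0]                        # hypothetical final logits
    feedback = [[1.5, 0.7, -0.5], [0.9, 1.0, 0.0]]  # hypothetical noisy outputs
    print(multi_output_loss(final, feedback, label=0, betas=[0.5, 0.25]))
```

A feedback output that disagrees with the label raises the total loss even when the final output is correct, which is the sense in which intermediate or noisy signals can steer training.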

According to the authors' experiments, the NFT-based methods outperform existing state-of-the-art approaches, with up to a 46.71% improvement in inference accuracy, alongside reduced epistemic uncertainty, higher output confidence, and improved convergence probability. The authors present these results as evidence of the generality and practicality of the NFT concept for enhancing DNN robustness against device variations in CIM accelerators.
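As a rough illustration of how confidence and epistemic uncertainty can be estimated under device noise, the sketch below summarizes class-probability vectors collected from repeated noisy forward passes on one input. It uses the mutual-information decomposition (entropy of the mean prediction minus the mean per-pass entropy), a common proxy for epistemic uncertainty. This summary does not state how the paper defines its metrics, so the choice of metric and the function names here are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def summarize_noisy_passes(prob_samples):
    """Summarize class-probability vectors from repeated noisy forward passes.

    Returns the confidence of the averaged prediction and a mutual-information
    estimate of epistemic uncertainty (disagreement between passes).
    """
    n = len(prob_samples)
    num_classes = len(prob_samples[0])
    mean_probs = [sum(s[c] for s in prob_samples) / n for c in range(num_classes)]
    predictive_entropy = entropy(mean_probs)
    expected_entropy = sum(entropy(s) for s in prob_samples) / n
    return {
        "confidence": max(mean_probs),
        "epistemic": predictive_entropy - expected_entropy,
    }


if __name__ == "__main__":
    stable = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]  # passes agree
    unstable = [[1.0, 0.0], [0.0, 1.0]]            # passes disagree completely
    print(summarize_noisy_passes(stable))
    print(summarize_noisy_passes(unstable))
```

When every noisy pass produces the same distribution, the epistemic term is zero; when passes flip between classes, it grows toward the entropy of the averaged prediction.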

Critical Analysis

The researchers have presented a novel and intriguing approach to improving the robustness of Deep Neural Networks (DNNs) deployed on compute-in-memory (CIM) accelerators with non-volatile memory (NVM) devices. The Negative Feedback Training (NFT) concept they introduce is a departure from existing methods that primarily focus on the final model output, and instead leverages the multi-scale noisy information captured from the network during training.

One potential limitation of the NFT approach is the increased computational complexity and training time required, as the method involves capturing and processing intermediate representations of the network. The researchers acknowledge this and suggest that future work could explore ways to optimize the NFT process to make it more efficient.

Additionally, the paper does not provide a comprehensive analysis of the potential limitations or failure modes of the NFT approach. While the experiments demonstrate impressive improvements in accuracy, confidence, and convergence, it would be valuable to understand the specific scenarios or conditions where the NFT methods may not perform as well, or where they could potentially introduce unintended consequences.

Furthermore, the paper does not address the broader implications of the NFT concept beyond the specific CIM accelerator use case. It would be interesting to explore whether the NFT approach could be generalized to improve the robustness of DNNs in other hardware architectures or deployment scenarios, such as resource-constrained edge devices or applications with strict reliability requirements.

Overall, the researchers have presented a compelling and innovative approach to enhancing DNN robustness, and their work opens up avenues for further exploration and refinement of the NFT concept.

Conclusion

The paper introduces Negative Feedback Training (NFT), a training concept that improves the robustness of DNNs deployed on CIM accelerators built with NVM devices. By leveraging multi-scale noisy information captured from the network during training, its two instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS), improve inference accuracy by up to 46.71% over existing state-of-the-art approaches while also reducing epistemic uncertainty, boosting output confidence, and improving convergence probability.

These results suggest that NFT can make DNNs more reliable when implemented on hardware with inherent variations, such as CIM accelerators. This matters most for applications where consistent and confident predictions are critical, such as autonomous driving and other safety-critical systems. The work also lays the groundwork for further refinement of the NFT approach toward models that remain robust across diverse hardware environments.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
