DOCTOR: Dynamic On-Chip Temporal Variation Remediation Toward Self-Corrected Photonic Tensor Accelerators

Read original: arXiv:2403.02688 - Published 6/4/2024 by Haotian Lu, Sanmitra Banerjee, Jiaqi Gu

Overview

• This paper presents DOCTOR, a lightweight framework that calibrates photonic tensor accelerators on the chip itself to recover accuracy lost to variations that drift over time, such as temperature changes.

• Photonic tensor accelerators use light-based computation to perform machine learning tasks efficiently, but they are sensitive to changes in temperature that can degrade their performance.

• DOCTOR monitors the chip's status with adaptive probing and, when necessary, runs a fast, training-free, in-situ calibration, so the accelerator can sustain accuracy as conditions drift.

Plain English Explanation

• Photonic tensor accelerators are a new type of computing hardware that use light instead of electrical signals to carry out the computations in machine learning models. This makes them very fast and efficient, but also very sensitive to hardware noise and to changes in the environment, such as temperature.

• As the chip heats up or cools down, the way light travels through it can change, causing the model's outputs to become less accurate over time. This paper describes a way for a photonic accelerator to detect and correct that degradation on the chip itself.

• The DOCTOR system keeps checking the chip's condition and, when accuracy starts to slip, performs a quick recalibration that requires no retraining. This allows the photonic accelerator to maintain high accuracy even as the environment changes around it.

• By dynamically calibrating the system, DOCTOR aims to keep the photonic tensor accelerator running machine learning models without degradation. Other photonic accelerators have shown promise for event-based imaging and convolutional neural networks (see Related Papers).
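To make the drift problem concrete, here is a minimal toy sketch. It is not the paper's device model: it assumes each weight is stored as a phase with weight = cos(phase), and that every device's phase error follows a random walk. The names `encode`, `realized_weights`, and `drift_errors` are hypothetical and exist only for this illustration.

```python
import math
import random


def encode(weights):
    """Map each weight in [-1, 1] to a phase so that cos(phase) == weight (assumed encoding)."""
    return [math.acos(w) for w in weights]


def realized_weights(phases, drift):
    """Weights the chip actually applies once every phase has drifted."""
    return [math.cos(p + d) for p, d in zip(phases, drift)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def drift_errors(steps=5, step_std=0.02, size=16, seed=0):
    """Return the absolute output error of one dot product after each drift step."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(size)]
    inputs = [rng.uniform(0, 1) for _ in range(size)]
    phases = encode(weights)
    ideal = dot(weights, inputs)
    drift = [0.0] * size
    errors = []
    for _ in range(steps):
        # Random-walk drift (assumption): phase errors accumulate over time.
        drift = [d + rng.gauss(0, step_std) for d in drift]
        actual = dot(realized_weights(phases, drift), inputs)
        errors.append(abs(actual - ideal))
    return errors


if __name__ == "__main__":
    for step, err in enumerate(drift_errors(steps=10), start=1):
        print(f"step {step}: |output error| = {err:.4f}")
```

Because the drift accumulates, the output error tends to grow the longer the chip runs without correction, although any single step may happen to shrink it.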

Technical Explanation

• DOCTOR monitors chip status through adaptive probing, checking for accuracy degradation caused by temporally drifting variations.

• When degradation is detected, DOCTOR performs a fast, training-free, in-situ calibration to restore accuracy. Because variations are not uniform across devices and tensor cores, it also uses a variation-aware architectural remapping strategy that avoids running critical tasks on noisy devices.

• Calibration runs in situ and in real time, which lets the accelerator sustain performance under temporally drifting variations.

• In the paper's experiments, DOCTOR sustains performance under drifting variations with 34% higher accuracy and 2-3 orders of magnitude lower overhead than state-of-the-art on-chip training methods.
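The monitor-then-calibrate idea can be sketched as a small simulation. This is a hedged illustration, not DOCTOR's actual algorithm: the `SimulatedCore` class, the cos(phase) weight encoding, the one-hot read-out used for calibration, and the rule that doubles or resets the probing interval are all assumptions made for this example.

```python
import math
import random


class SimulatedCore:
    """Toy tensor core: weight i is realized as cos(phase_i + drift_i - correction_i)."""

    def __init__(self, weights, seed=0):
        self.target = list(weights)
        self.phases = [math.acos(w) for w in weights]
        self.drift = [0.0] * len(weights)
        self.correction = [0.0] * len(weights)
        self.rng = random.Random(seed)

    def step(self, step_std):
        # Random-walk drift (assumption) accumulates on every device.
        self.drift = [d + self.rng.gauss(0, step_std) for d in self.drift]

    def weights(self):
        return [math.cos(p + d - c)
                for p, d, c in zip(self.phases, self.drift, self.correction)]

    def forward(self, x):
        return sum(w * v for w, v in zip(self.weights(), x))


def calibrate(core):
    """Training-free fix: read each device with a one-hot probe and undo its phase error.

    Valid only while each drifted phase stays inside [0, pi], which holds for small drift.
    """
    n = len(core.target)
    for i in range(n):
        one_hot = [1.0 if j == i else 0.0 for j in range(n)]
        measured = max(-1.0, min(1.0, core.forward(one_hot)))
        core.correction[i] += math.acos(measured) - core.phases[i]


def run(core, steps=200, step_std=0.01, threshold=0.05, min_gap=1, max_gap=16):
    """Probe with a known input; calibrate only when the probe error is too large."""
    probe = [1.0] * len(core.target)
    ideal = sum(core.target)
    gap, next_probe, calibrations = min_gap, min_gap, 0
    for t in range(1, steps + 1):
        core.step(step_std)
        if t < next_probe:
            continue
        error = abs(core.forward(probe) - ideal)
        if error > threshold:
            calibrate(core)
            calibrations += 1
            gap = min_gap                  # drift is active: probe again soon
        else:
            gap = min(max_gap, gap * 2)    # chip looks healthy: probe less often
        next_probe = t + gap
    return calibrations


if __name__ == "__main__":
    core = SimulatedCore([0.5, -0.3, 0.8, 0.0], seed=1)
    print("calibrations triggered:", run(core))
```

The design choice illustrated here is that probing is cheap and calibration is comparatively expensive, so the loop probes more often only while drift is active and calibrates only when a probe exceeds the threshold.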

Critical Analysis

• While DOCTOR provides an effective solution for mitigating thermal variations in photonic tensor accelerators, the system does add some complexity and overhead to the hardware design.

• The monitoring and calibration mechanisms consume some resources of their own, which could affect the overall efficiency and cost-effectiveness of the accelerator, although the paper reports overhead 2-3 orders of magnitude lower than on-chip training.

• Additionally, the paper does not address the potential long-term reliability of the dynamic calibration system or how it might handle extreme temperature swings beyond the tested range.

• Further research is needed to fully understand the tradeoffs and limitations of the DOCTOR approach, as well as explore alternative techniques for making photonic accelerators more robust to environmental conditions.

Conclusion

• DOCTOR represents an important advancement in enabling photonic tensor accelerators to maintain high accuracy in the face of changing thermal conditions.

• By dynamically calibrating the chip's optical elements, DOCTOR allows these high-performance machine learning accelerators to operate reliably and consistently, even as the environment around them fluctuates.

• This work helps pave the way for photonic computing to become a practical and widely adopted technology for powering future AI and deep learning applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DOCTOR: Dynamic On-Chip Temporal Variation Remediation Toward Self-Corrected Photonic Tensor Accelerators

Haotian Lu, Sanmitra Banerjee, Jiaqi Gu

Photonic computing has emerged as a promising solution for accelerating computation-intensive artificial intelligence (AI) workloads, offering unparalleled speed and energy efficiency, especially in resource-limited, latency-sensitive edge computing environments. However, the deployment of analog photonic tensor accelerators encounters reliability challenges due to hardware noise and environmental variations. While off-chip noise-aware training and on-chip training have been proposed to enhance the variation tolerance of optical neural accelerators with moderate, static noise, we observe a notable performance degradation over time due to temporally drifting variations, which requires a real-time, in-situ calibration mechanism. To tackle these challenging reliability issues, for the first time, we propose a lightweight dynamic on-chip remediation framework, dubbed DOCTOR, providing adaptive, in-situ accuracy recovery against temporally drifting noise. The DOCTOR framework intelligently monitors the chip status using adaptive probing and performs fast in-situ training-free calibration to restore accuracy when necessary. Recognizing nonuniform spatial variation distributions across devices and tensor cores, we also propose a variation-aware architectural remapping strategy to avoid executing critical tasks on noisy devices. Extensive experiments show that our proposed framework can guarantee sustained performance under drifting variations with 34% higher accuracy and 2-3 orders-of-magnitude lower overhead compared to state-of-the-art on-chip training methods. Our code is open-sourced at https://github.com/ScopeX-ASU/DOCTOR.

6/4/2024

SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

Ziang Yin, Nicholas Gangi, Meng Zhang, Jeff Zhang, Rena Huang, Jiaqi Gu

Photonic computing has emerged as a promising solution for accelerating computation-intensive artificial intelligence (AI) workloads. However, limited reconfigurability, high electrical-optical conversion cost, and thermal sensitivity limit the deployment of current optical analog computing engines to support power-restricted, performance-sensitive AI workloads at scale. Sparsity provides a great opportunity for hardware-efficient AI accelerators. However, current dense photonic accelerators fail to fully exploit the power-saving potential of algorithmic sparsity. It requires sparsity-aware hardware specialization with a fundamental re-design of photonic tensor core topology and cross-layer device-circuit-architecture-algorithm co-optimization aware of hardware non-ideality and power bottleneck. To trim down the redundant power consumption while maximizing robustness to thermal variations, we propose SCATTER, a novel algorithm-circuit co-sparse photonic accelerator featuring dynamically reconfigurable signal path via thermal-tolerant, power-efficient in-situ light redistribution and power gating. A power-optimized, crosstalk-aware dynamic sparse training framework is introduced to explore row-column structured sparsity and ensure marginal accuracy loss and maximum power efficiency. The extensive evaluation shows that our cross-stacked optimized accelerator SCATTER achieves a 511X area reduction and 12.4X power saving with superior crosstalk tolerance that enables unprecedented circuit layout compactness and on-chip power efficiency.

7/9/2024

Photonic Neuromorphic Accelerator for Convolutional Neural Networks based on an Integrated Reconfigurable Mesh

Aris Tsirigotis, George Sarantoglou, Stavros Deligiannidis, Erica Sanchez, Ana Gutierrez, Adonis Bogris, Jose Capmany, Charis Mesaritakis

In this work, we present and experimentally validate a passive photonic-integrated neuromorphic accelerator that uses a hardware-friendly optical spectrum slicing technique through a reconfigurable silicon photonic mesh. The proposed scheme acts as an analogue convolutional engine, enabling information preprocessing in the optical domain, dimensionality reduction and extraction of spatio-temporal features. Numerical results demonstrate that utilizing only 7 passive photonic nodes, critical modules of a digital convolutional neural network can be replaced. As a result, a 98.6% accuracy on the MNIST dataset was achieved, with a power consumption reduction of at least 26% compared to digital CNNs. Experimental results confirm these findings, achieving 97.7% accuracy with only 3 passive nodes.

5/13/2024

Photonic Neuromorphic Accelerators for Event-Based Imaging Flow Cytometry

Ioannis Tsilikas, Aris Tsirigotis, George Sarantoglou, Stavros Deligiannidis, Adonis Bogris, Christoph Posch, Gerd Van den Branden, Charis Mesaritakis

In this work, we present experimental results of a high-speed label-free imaging cytometry system that seamlessly merges the high-capturing rate and data sparsity of an event-based CMOS camera with lightweight photonic neuromorphic processing. This combination offers high classification accuracy and a massive reduction in the number of trainable parameters of the digital machine-learning back-end. The photonic neuromorphic accelerator is based on a hardware-friendly passive optical spectrum slicing technique that is able to extract meaningful features from the generated spike-trains. The experimental scenario comprises the discrimination of artificial polymethyl methacrylate calibrated beads, having different diameters, flowing at a mean speed of 0.01 m/s. Classification accuracy using only lightweight digital machine-learning schemes peaked at 98.2%. On the other hand, by experimentally pre-processing the raw spike data through the proposed photonic neuromorphic spectrum slicer, we achieved an accuracy of 98.6%. This performance was accompanied by a reduction in the number of trainable parameters at the classification back-end by a factor ranging from 8 to 22, depending on the configuration of the digital neural network.

4/17/2024