Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

2026-06-03 · Machine Learning

Machine Learning, Artificial Intelligence, Hardware Architecture
AI summary

The authors explain that designing a neural network chip involves many connected steps: creating the network, deciding how to build the hardware, and making sure it works well in production. They propose a flexible system that breaks the design into four parts, each with a simple interface, so one part can be improved without redoing everything else. Their framework also handles manufacturing uncertainty by treating reliability as a resource to optimize. They tested the approach in three studies, showing that it works across varied applications, that reliability can be tuned during design, and that improvements in one part automatically benefit the whole system.

neural network processor, co-design, monotone co-design theory, hardware mapping, fabrication yield, Confidence metric, Pareto optimality, compute resource allocation, stochastic outcomes, design interfaces
Authors
Yuyang Du, Yujun Huang, Gioele Zardini
Abstract
Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it difficult to improve one component without reworking the entire pipeline. This paper presents a unified framework, grounded in monotone co-design theory, that composes four interoperable design blocks spanning network training, chip mapping, wafer-level fabrication, and compute resource allocation. Each block exposes only a functionality-resource interface to the rest of the system, so any block can be refined without structural changes elsewhere. A central contribution is the treatment of uncertainty: rather than collapsing stochastic outcomes into point estimates, the framework introduces Confidence, the inverse of success probability, as an explicit and optimizable resource alongside cost, time, and power. Three case studies validate the approach. The first recovers Pareto-optimal implementations across heterogeneous application scenarios. The second confirms that Confidence functions as a continuously tunable design knob rather than a post-hoc diagnostic. The third demonstrates that improving a single block's implementation set automatically propagates to the global Pareto front, without modifying the co-design diagram.
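The abstract's third claim, that enlarging one block's implementation set changes the Pareto front without any structural change, can be illustrated with a minimal sketch. This is not the authors' solver and it does not compose blocks. It only filters a hypothetical list of resource tuples, assumed here to be (cost, time, power) with every entry minimized, down to its non-dominated members. The names `dominates`, `pareto_front`, and the numeric values are illustrative assumptions.

```python
from typing import Iterable, List, Tuple

# Assumption: each implementation is summarized by resources to minimize,
# here a hypothetical (cost, time, power) tuple.
Resources = Tuple[float, ...]


def dominates(a: Resources, b: Resources) -> bool:
    """True if a is no worse than b in every resource and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Iterable[Resources]) -> List[Resources]:
    """Return the non-dominated tuples, deduplicated and sorted."""
    pts = sorted(set(points))
    return [p for p in pts if not any(dominates(q, p) for q in pts)]


# Hypothetical implementation set for a single block.
block = [(4.0, 2.0, 1.0), (3.0, 3.0, 1.0), (5.0, 2.0, 1.0)]
front_before = pareto_front(block)

# Refining the block adds one implementation; the same filter is rerun
# on the enlarged set and nothing else in the code changes.
front_after = pareto_front(block + [(3.0, 2.0, 1.0)])

print(front_before)
print(front_after)
```

In this toy setting, the added implementation dominates every earlier one, so the front shrinks to a single point. A full co-design framework would rerun the same kind of filter on the composed system rather than on one block.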