An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

2026-06-08, Hardware Architecture

Hardware Architecture, Artificial Intelligence, Mathematical Software, Performance
AI summary

The authors explain that machine learning hardware now uses many different number formats, but there are no common, exact references for comparing them. They created a collection of 84 number formats and six exact test sets to help engineers check whether their model calculations match across different hardware. Their work includes documentation and tools for identifying differences that the specifications permit, as distinct from implementation mistakes. The authors do not introduce new formats or claim that any format is better; instead, they provide open resources to help unify understanding and testing.

FP8, BF16, numeric formats, bit-exact conformance, machine learning hardware, IEEE P3109, cross-validation, JSON schema
Authors
Dmitrii Vasilev
Abstract
The proliferation of numeric formats in machine learning hardware (FP8 in its E4M3 and E5M2 variants, BF16, MXFP4, microscaling block formats, and dozens of research variants) has outpaced the availability of vendor-neutral, bit-exact reference material. Engineers porting models across accelerators encounter silent divergences that are difficult to diagnose without a shared ruler. This paper describes a catalog of 84 numeric formats spanning 13 families; a suite of six bit-exact conformance packs covering GF16, MXFP4 element, BF16, FP8 E4M3, FP8 E5M2, and E8M0 block scale; and an IEEE P3109 v3.2.0 cross-walk that maps each pack to its corresponding standards-track configured format. Each pack is a self-contained JSON document with a SHA-256 fingerprint, a shared row schema, and an anchor vector that encodes 3.0, the value given by the identity phi^2 + 1/phi^2 = 3, as a cross-pack sanity check. Packs are cross-validated against ml_dtypes 0.5.4 (Google/JAX); any divergence is documented explicitly rather than hidden, and is interpreted as an interpretation gap that the specification permits. The work is framed as registry filling: it does not propose new formats, make model-accuracy claims, or assert superiority over any vendor's implementation. All artifacts are publicly available at https://github.com/gHashTag/t27 under an open license.
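To make the anchor-vector idea concrete, the following is a minimal Python sketch and not the authors' tooling. It first checks the identity numerically. The identity holds because phi^2 = phi + 1 and 1/phi^2 = 2 - phi, so the two terms sum to 3. The sketch then encodes 3.0 = 1.5 x 2^1 in four of the formats named above, using their widely published sign/exponent/mantissa layouts. The helper names (`encode_exact`, `fingerprint`), the dictionary keys, and the JSON canonicalization (sorted keys, compact separators) are assumptions made for illustration; the actual row schema and fingerprinting procedure are defined by the packs in the repository. GF16 is omitted because its layout is not given here, and E8M0 is omitted because it stores only a power-of-two exponent, so this sketch does not guess how its pack represents the anchor.

```python
import hashlib
import json
import math

# (exponent bits, mantissa bits, exponent bias) for formats with well-known layouts.
# The dictionary keys are illustrative labels, not the packs' actual identifiers.
LAYOUTS = {
    "bf16": (8, 7, 127),
    "fp8_e4m3": (4, 3, 7),
    "fp8_e5m2": (5, 2, 15),
    "mxfp4_e2m1": (2, 1, 1),
}


def encode_exact(value, exp_bits, man_bits, bias):
    """Bit pattern of a positive value that is exactly representable as a
    normal number in the given layout. Raises ValueError instead of rounding."""
    if not (value > 0 and math.isfinite(value)):
        raise ValueError("only positive finite values are handled")
    frac, exp = math.frexp(value)  # value == frac * 2**exp with 0.5 <= frac < 1
    biased = (exp - 1) + bias      # exponent of the 1.xxx form, plus bias
    # Conservative range: skips subnormals and the top exponent code,
    # which some formats reserve for Inf/NaN.
    if not 1 <= biased <= (1 << exp_bits) - 2:
        raise ValueError("exponent outside the conservative normal range")
    mantissa = (frac * 2 - 1) * (1 << man_bits)
    if mantissa != int(mantissa):
        raise ValueError("value is not exactly representable")
    return (biased << man_bits) | int(mantissa)  # sign bit is 0


def fingerprint(doc):
    """SHA-256 over an assumed canonical JSON serialization of `doc`."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The identity behind the anchor value, checked to floating-point tolerance.
phi = (1 + math.sqrt(5)) / 2
assert math.isclose(phi**2 + 1 / phi**2, 3.0)

anchor_rows = {
    name: f"0x{encode_exact(3.0, *layout):x}" for name, layout in LAYOUTS.items()
}
print(anchor_rows)
# {'bf16': '0x4040', 'fp8_e4m3': '0x44', 'fp8_e5m2': '0x42', 'mxfp4_e2m1': '0x5'}
print(fingerprint(anchor_rows))
```

Because 3.0 has a one-bit fraction (binary 1.1 x 2^1), it is exactly representable in every layout listed, which is what makes it convenient as a shared check: an implementation that disagrees on any of these bit patterns has a layout or bias error rather than a rounding difference.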