Mixed-Format MXFP MAC Unit
Designed a MAC unit accepting any mix of MXFP8, MXFP6, and MXFP4 operands.
Graduation project advised by Prof. Jae-joon Kim.
I designed a multiply-accumulate (MAC) unit in Verilog that supports all three MXFP (Microscaling Floating-Point) formats, MXFP8, MXFP6, and MXFP4, which were proposed at the 2023 OCP Global Summit for efficient AI computation. Beyond supporting each format individually, the unit accepts arbitrary mixes of MXFP8/6/4 operands within a single operation, including the sub-format variants of a given width, such as E3M2 and E2M3 within MXFP6. Conventional block floating-point formats store fixed-point elements under one shared exponent. MXFP elements, by contrast, each carry their own exponent in addition to the block's shared scale, so the design requires a custom datapath that aligns products per element without sacrificing computational accuracy. A bitwidth analysis informed the choice of a 36-bit adder tree and a 67-bit final accumulator, with a barrel shifter for single-cycle normalization.
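To illustrate what per-element alignment involves, here is a minimal software sketch in Python. It is not the Verilog datapath. The exponent and mantissa widths and the biases follow the OCP MX v1.0 element formats as I understand them. The names `FORMATS`, `decode`, `mixed_dot`, and `to_fraction` are hypothetical and introduced only for this example. The sketch ignores the Inf/NaN encodings of the FP8 formats and the shared block scale, and it uses Python's unbounded integers where the hardware uses a fixed-width adder tree and accumulator.

```python
from fractions import Fraction

# (exponent bits, mantissa bits, exponent bias) for each element format.
# Assumed from the OCP MX v1.0 specification; special values are ignored.
FORMATS = {
    "FP8_E5M2": (5, 2, 15),
    "FP8_E4M3": (4, 3, 7),
    "FP6_E3M2": (3, 2, 3),
    "FP6_E2M3": (2, 3, 1),
    "FP4_E2M1": (2, 1, 1),
}


def decode(bits, fmt):
    """Split an element into (signed integer significand, exponent).

    The element's value equals significand * 2**exponent.
    """
    e_bits, m_bits, bias = FORMATS[fmt]
    sign = (bits >> (e_bits + m_bits)) & 1
    exp = (bits >> m_bits) & ((1 << e_bits) - 1)
    man = bits & ((1 << m_bits) - 1)
    if exp == 0:
        # Subnormal: no hidden one, exponent pinned to 1 - bias.
        sig, e = man, 1 - bias
    else:
        # Normal: prepend the hidden one.
        sig, e = man | (1 << m_bits), exp - bias
    e -= m_bits  # treat the significand as an integer
    return (-sig if sign else sig), e


def mixed_dot(a, b):
    """Exact dot product of two non-empty lists of (bits, format) pairs.

    Each product has its own exponent, so every product is shifted to the
    smallest exponent in the group before the integer sum.
    """
    products = []
    for (xa, fa), (xb, fb) in zip(a, b):
        sa, ea = decode(xa, fa)
        sb, eb = decode(xb, fb)
        products.append((sa * sb, ea + eb))
    base = min(e for _, e in products)
    total = sum(s << (e - base) for s, e in products)
    return total, base


def to_fraction(total, base):
    """Convert (integer sum, exponent) to an exact rational value."""
    return Fraction(total) * Fraction(2) ** base


# Mixed-format example: (FP8 E4M3 x FP4 E2M1) + (FP4 E2M1 x FP6 E3M2)
a = [(0x38, "FP8_E4M3"), (0b0111, "FP4_E2M1")]      # 1.0, 6.0
b = [(0b0011, "FP4_E2M1"), (0b101100, "FP6_E3M2")]  # 1.5, -1.0
total, base = mixed_dot(a, b)
print(to_fraction(total, base))  # -9/2
```

In this model the left shift `s << (e - base)` plays the role of the alignment step. In hardware the width of that shift is bounded by the exponent ranges of the supported formats, which is the kind of constraint a bitwidth analysis has to account for when sizing the adder tree.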
Synthesized in Samsung 28 nm CMOS with Synopsys Design Compiler, the design achieved 46% lower area, 42% lower power, and fewer pipeline stages (4 vs. 6) than an FP32 baseline.