This consistently shaves off about 40ms (~130ms -> ~90ms, 30% reduction) from build times when iterating.
On Windows, I suspect the improvement will be much greater, since filesystem performance is slower
there and this change reduces the number of files read.
@meshula originally brought this possibility to my attention in hexops/dawn#2. It works by reducing
the number of compilation units, so that C headers only need to be read/parsed/interpreted once
rather than once per individual C source file we are compiling.
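To illustrate the idea (this is a minimal sketch, not the code in this commit): a single "unity" C file can
simply #include every source file, so the compiler sees one compilation unit and reads each header once.
The helper name `make_unity_source` and the file names are hypothetical.

```python
def make_unity_source(sources):
    """Return the text of one C file that #includes every given source file."""
    lines = ["// Generated unity file: one compilation unit for all sources."]
    for path in sources:
        # Forward slashes keep the #include usable on Windows and elsewhere.
        lines.append('#include "{}"'.format(path.replace("\\", "/")))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(make_unity_source(["src/a.c", "src/b.c"]), end="")
```

One trade-off of this approach: file-local `static` names and macros from different source files end up in
the same compilation unit, so they can collide.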
Signed-off-by: Stephen Gutekanst <stephen@hexops.com>