There ought to be an expand=dtype option in the API so we know which quantizations are available (like for ONNX).