Add sparsity to benchmarking #1917
Helpful Links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1917
Note: Links to docs will display an error until the docs builds have been completed.
1 New Failure as of commit ef4cf36 with merge base 09c2760.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
jcaip left a comment
thanks for working on this! just a couple questions but otherwise looks good
| m_copy = deepcopy(base_model).eval().to(config.device) | ||
| quantization_config = string_to_config( | ||
| config.quantization, high_precision_dtype=config.high_precision_dtype | ||
| aoBaseConfig = string_to_config( |
probably snake_case is better here?
| benchmark_mode: "inference" | ||
| quantization_config_recipe_names: | ||
| - "baseline" | ||
| # - "baseline" Will always run a baseline instance |
should this be commented out?
We run the baseline case by default for every benchmarking param. I listed it here as a comment because I wanted to let users know that it will always run. Maybe I can simply add it to the README and write the comment like:
# Will run a baseline inference for the model by default, without quantization, for comparison
nit: yeah I think that's better, I would just make it clear that it's not some commented out code.
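The behaviour discussed in this thread (a baseline run that always happens, whatever the YAML lists) could be sketched roughly as below. The helper name `expand_recipe_names` and its signature are hypothetical and are not taken from the PR; this only illustrates the "baseline is implicit" idea.

```python
from typing import List


def expand_recipe_names(requested: List[str], default: str = "baseline") -> List[str]:
    """Return the recipes to run, always starting with the default one.

    Hypothetical helper: illustrates "baseline always runs", not torchao code.
    """
    # Start with the default so it runs even when the user never lists it.
    seen = {default}
    expanded = [default]
    for name in requested:
        # Skip duplicates (including an explicit default) but keep user order.
        if name not in seen:
            seen.add(name)
            expanded.append(name)
    return expanded


recipes = expand_recipe_names(["int4wo-128", "marlin"])
```

With a helper like this, listing "baseline" explicitly in the YAML would be harmless but redundant, which is why an explanatory comment reads better than a commented-out list entry.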
| - "int4wo-128" | ||
| - "marlin" | ||
| sparsity_config_recipe_names: | ||
| # - "none" Will always run an instance without sparsity |
| # Mock string_to_config to return valid configs | ||
| from torchao.quantization import Int4WeightOnlyConfig | ||
| from torchao.sparsity.sparse_api import ( | ||
| BlockSparseWeightConfig, |
I don't think we need BlockSparseWeightConfig here - this should be semi-structured sparsity, no?
| self.assertIsInstance(result, BenchmarkResult) | ||
| self.assertTrue(hasattr(result, "model_inference_time_in_ms")) | ||
| # Test with block sparsity |
Oh, I see - can we split this into two tests then, one for int4+2:4 marlin, and one for block sparsity?
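A minimal sketch of what the suggested split might look like, one test per configuration. Everything here is a stand-in: `FakeBenchmarkConfig`, `FakeBenchmarkResult`, `fake_run`, and the sparsity strings `"semi-sparse"` and `"block"` are assumptions for illustration, not the PR's actual classes, runner, or recipe names.

```python
import unittest
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeBenchmarkConfig:
    # Stand-in for the PR's BenchmarkConfig; field names are assumptions.
    quantization: Optional[str]
    sparsity: Optional[str]


@dataclass
class FakeBenchmarkResult:
    # Stand-in for BenchmarkResult, with the attribute the original test checks.
    config: FakeBenchmarkConfig
    model_inference_time_in_ms: float = 0.0


def fake_run(config: FakeBenchmarkConfig) -> FakeBenchmarkResult:
    # Placeholder for the real benchmark runner; it does no actual work.
    return FakeBenchmarkResult(config=config)


class TestRunInference(unittest.TestCase):
    def test_int4_with_semi_structured_sparsity(self):
        # int4 + 2:4 (marlin) case, kept separate from block sparsity.
        result = fake_run(FakeBenchmarkConfig("marlin", "semi-sparse"))
        self.assertIsInstance(result, FakeBenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))

    def test_block_sparsity(self):
        # Block sparsity on its own, with no quantization applied.
        result = fake_run(FakeBenchmarkConfig(None, "block"))
        self.assertIsInstance(result, FakeBenchmarkResult)
        self.assertTrue(hasattr(result, "model_inference_time_in_ms"))
```

Keeping the two cases in separate test methods means a failure report names the configuration that broke, and each test only needs to import or mock the config class it exercises.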
6e00835 to d71baa3
This reverts commit d71baa3.
| BenchmarkResult( | ||
| BenchmarkConfig( | ||
| quantization="int8wo", | ||
| sparsity="None", |
super nit: why string None and not just None here?
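One way to reconcile the two spellings, if string values coming from YAML or the CLI need to stay supported, is to normalize them to a real `None` at the boundary. The function `normalize_sparsity` below is a hypothetical sketch and is not part of the PR.

```python
from typing import Optional


def normalize_sparsity(value: Optional[str]) -> Optional[str]:
    """Map None, "", "none" and "None" to a real None; pass other names through.

    Hypothetical helper, shown only to illustrate the None vs "None" question.
    """
    if value is None:
        return None
    cleaned = value.strip()
    # Treat an empty string or any casing of "none" as "no sparsity".
    if cleaned == "" or cleaned.lower() == "none":
        return None
    return cleaned
```

Downstream code could then check `if sparsity is None` instead of comparing against a magic string.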
Add sparsity support for benchmarking. The following support has been added