Quantization Pipeline Exercise
The current pipeline implementation contains several hardcoded values that could be made configurable to increase its flexibility and usability. This exercise encourages you to enhance the pipeline by parametrizing these values.
Enhancement Opportunities
1. Quantization Format Support
- Currently supports only INT8 and INT4 quantization
- Potential enhancement: Add support for FP8 format
- Consider implementing a configurable quantization format selector
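One way to approach the format selector is sketched below, using only the Python standard library. The names `QuantFormat` and `parse_quant_format` are hypothetical and do not come from the pipeline. The sketch only shows how a user-supplied string could be validated before it reaches the quantization step. How each format maps onto the underlying quantization library is left to you.

```python
from enum import Enum


class QuantFormat(Enum):
    """Formats the pipeline could accept (hypothetical names for this sketch)."""

    INT8 = "int8"
    INT4 = "int4"
    FP8 = "fp8"  # the addition proposed in this exercise


def parse_quant_format(value: str) -> QuantFormat:
    """Map a user-supplied string such as 'INT8' or ' fp8 ' to a QuantFormat."""
    normalized = value.strip().lower()
    try:
        return QuantFormat(normalized)
    except ValueError:
        supported = ", ".join(fmt.value for fmt in QuantFormat)
        raise ValueError(
            f"Unsupported quantization format {value!r}; expected one of: {supported}"
        ) from None
```

Rejecting unknown strings early means a typo fails with a clear message at configuration time instead of partway through a long quantization run.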
2. Data Calibration Parameters
The following calibration-related values are currently hardcoded and could be made configurable:
- NUM_CALIBRATION_SAMPLES: Number of samples used for calibration
- DATASET_ID: The dataset identifier used for calibration
- DATASET_SPLIT: The specific split of the dataset to use
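As a starting point, the three calibration values could be grouped into one object and overridden through environment variables of the same name. This is a minimal sketch, not the pipeline's actual code: `CalibrationConfig` and `load_calibration_config` are hypothetical names, and the caller passes in the currently hardcoded values as defaults so that nothing is guessed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CalibrationConfig:
    """Calibration settings that are hardcoded in the current pipeline."""

    num_calibration_samples: int
    dataset_id: str
    dataset_split: str

    def __post_init__(self) -> None:
        # Fail early on values that cannot describe a usable calibration set.
        if self.num_calibration_samples <= 0:
            raise ValueError("num_calibration_samples must be positive")
        if not self.dataset_id:
            raise ValueError("dataset_id must not be empty")
        if not self.dataset_split:
            raise ValueError("dataset_split must not be empty")


def load_calibration_config(
    defaults: CalibrationConfig,
    env: Mapping[str, str] = os.environ,
) -> CalibrationConfig:
    """Return defaults, overridden by any matching environment variables."""
    return CalibrationConfig(
        num_calibration_samples=int(
            env.get("NUM_CALIBRATION_SAMPLES", defaults.num_calibration_samples)
        ),
        dataset_id=env.get("DATASET_ID", defaults.dataset_id),
        dataset_split=env.get("DATASET_SPLIT", defaults.dataset_split),
    )
```

Passing a plain dictionary as `env` keeps the function easy to test without touching the real process environment.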
3. Quantization Configuration
Several quantization parameters could be exposed for customization:
- DAMPENING_FRAC: Dampening fraction for the quantization process
- OBSERVER: The type of observer used for quantization
- GROUP_SIZE: Size of the quantization groups
- Quantization mappings for different model components
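The scalar parameters above could, for example, be exposed as command-line flags. The sketch below uses the standard `argparse` module. The flag names and helper functions are hypothetical, and the range checks (a dampening fraction strictly between 0 and 1, a positive group size) are assumptions you should verify against the quantization library the pipeline uses. Observer names are not validated because the set of valid values depends on that library, and the component mappings are left out of this sketch.

```python
import argparse


def fraction(text: str) -> float:
    """argparse type: a float strictly between 0 and 1 (assumed valid range)."""
    value = float(text)
    if not 0.0 < value < 1.0:
        raise argparse.ArgumentTypeError(f"expected a value in (0, 1), got {text}")
    return value


def positive_int(text: str) -> int:
    """argparse type: an integer greater than zero (assumed valid range)."""
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {text}")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the quantization settings listed above."""
    parser = argparse.ArgumentParser(description="Quantization settings (sketch)")
    # All flags are required so that this sketch does not guess the pipeline's defaults.
    parser.add_argument("--dampening-frac", type=fraction, required=True)
    parser.add_argument("--observer", required=True)
    parser.add_argument("--group-size", type=positive_int, required=True)
    return parser


# Example values only; they are not the pipeline's actual settings.
example_args = build_parser().parse_args(
    ["--dampening-frac", "0.01", "--observer", "minmax", "--group-size", "128"]
)
```

In a real change you would probably replace `required=True` with defaults equal to the values that are hardcoded today, so existing runs keep their behavior.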
Exercise Goals
- Choose one or more of these areas for improvement
- Implement the parametrization of your chosen components
- Test the enhanced pipeline with different configurations
- Bonus: Send a PR to add your changes to the official material
This exercise will help you understand the pipeline’s architecture while making it more versatile for different use cases and environments.