# Quantization
## 8bit Quantization
Skywork utilizes the widely used 8-bit quantization method [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes). This method quantizes model weights with minimal performance loss and has been seamlessly integrated into the transformers library. Building upon BitsAndBytes, we offer two approaches: online quantization and offline 8-bit models.
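For a rough sense of what 8-bit quantization saves, a back-of-envelope estimate of the weight memory (assuming the nominal 13B parameter count; exact footprints vary with embeddings, buffers, activations, and the KV cache):

```python
# Back-of-envelope weight-memory estimate (weights only; activations
# and the KV cache add more). 13e9 is the nominal parameter count.
params = 13e9
bf16_gib = params * 2 / 1024**3  # bf16: 2 bytes per parameter
int8_gib = params * 1 / 1024**3  # int8: 1 byte per parameter
print(f"bf16: ~{bf16_gib:.1f} GiB, int8: ~{int8_gib:.1f} GiB")
```

Halving the bytes per weight roughly halves the GPU memory needed to hold the model.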
To illustrate the usage of the int8-quantized model, we provide examples below. Before you begin, please ensure that you have installed the BitsAndBytes library and the necessary dependencies. For detailed installation instructions, please refer to the [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) repository.
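For intuition, the core absmax idea behind int8 quantization can be sketched in a few lines. This is a toy illustration only, with hypothetical helper names (`quantize_absmax`, `dequantize`); BitsAndBytes' LLM.int8() additionally uses vector-wise scaling and mixed-precision handling of outlier features:

```python
# Toy sketch of absmax int8 quantization: scale by the absolute maximum
# so every value maps into the signed 8-bit range [-127, 127].

def quantize_absmax(values):
    """Quantize floats to int8 codes plus a per-tensor scale."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(qvalues, scale):
    """Recover approximate floats from int8 codes."""
    return [q * scale for q in qvalues]

weights = [0.5, -1.2, 0.03, 2.4]
q, s = quantize_absmax(weights)
restored = dequantize(q, s)
# Rounding error is at most half a quantization step (s / 2) per value.
```

The reconstruction error per value is bounded by half the scale step, which is why 8-bit weight quantization loses so little accuracy in practice.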
### Online Quantization
```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("skywork-13B-Base", torch_dtype=torch.bfloat16, load_in_8bit=True, trust_remote_code=True).eval()
```
### Offline Quantization
```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("skywork-13B-Base-8bits", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True).eval()
```