# Token classification

## PyTorch version, no Trainer

Fine-tuning (m)LUKE for token classification tasks such as Named Entity Recognition (NER), Part-of-speech
tagging (POS) or phrase extraction (CHUNKS). You can easily
customize it to your needs if you need extra processing on your datasets.

It will either run on a dataset hosted on our [hub](https://huggingface.co/datasets) or with your own text files for
training and validation; you might just need to add some tweaks in the data preprocessing (see the sketch after the first training command below).

The script can be run in a distributed setup, on TPU, and supports mixed precision by
means of the [🤗 `Accelerate`](https://github.com/huggingface/accelerate) library. You can use the script normally
after installing it:
```bash
pip install git+https://github.com/huggingface/accelerate
```
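You can optionally verify the installation before going further; this one-liner only checks that the package is importable and prints its version:

```bash
# Optional sanity check: confirm Accelerate is installed and importable
python -c "import accelerate; print(accelerate.__version__)"
```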
Then, to train English LUKE on CoNLL2003:
```bash
export TASK_NAME=ner

python run_luke_ner_no_trainer.py \
  --model_name_or_path studio-ousia/luke-base \
  --dataset_name conll2003 \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/
```
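If your data lives in local files rather than on the hub, you can point the script at them instead of passing `--dataset_name`. The following is only a sketch: it assumes the script exposes `--train_file`/`--validation_file` arguments like the generic token-classification example it is based on, and `train.json`/`validation.json` are placeholders for your own files:

```bash
# Hypothetical local-files variant: replace the paths with your own training/validation files
python run_luke_ner_no_trainer.py \
  --model_name_or_path studio-ousia/luke-base \
  --train_file path/to/train.json \
  --validation_file path/to/validation.json \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/
```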
You can then use your usual launchers to run it in a distributed environment, but the easiest way is to run
```bash
accelerate config
```
and reply to the questions asked. Then
```bash
accelerate test
```
that will check that everything is ready for training. Finally, you can launch training with
```bash
export TASK_NAME=ner

accelerate launch run_luke_ner_no_trainer.py \
  --model_name_or_path studio-ousia/luke-base \
  --dataset_name conll2003 \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/
```
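If you would rather not go through the interactive `accelerate config` step, most hardware settings can also be passed directly to `accelerate launch`. The following is only a sketch for a machine with two GPUs; the exact flag names (for example `--mixed_precision`) depend on your Accelerate version:

```bash
# A sketch: launch directly on 2 GPUs with fp16 mixed precision, skipping `accelerate config`
export TASK_NAME=ner

accelerate launch --multi_gpu --num_processes 2 --mixed_precision fp16 \
  run_luke_ner_no_trainer.py \
  --model_name_or_path studio-ousia/luke-base \
  --dataset_name conll2003 \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/
```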
The same `accelerate launch` command will work for:

- a CPU-only setup
- a setup with one GPU
- distributed training with several GPUs (single or multi node)
- training on TPUs

Note that this library is in alpha release, so your feedback is more than welcome if you encounter any problems using it.