update tatoeba workflow (#9051)

Suraj Patil 2020-12-11 20:29:15 +05:30 committed by GitHub
parent 7c8f5f6487
commit 86896de064
2 changed files with 28 additions and 4 deletions

@@ -19,7 +19,7 @@ Setup transformers following instructions in README.md, (I would fork first).
git clone git@github.com:huggingface/transformers.git
cd transformers
pip install -e .
-pip install pandas
+pip install pandas GitPython wget
```
Get required metadata
@@ -35,7 +35,7 @@ git clone git@github.com:Helsinki-NLP/Tatoeba-Challenge.git
To convert a few models, call the conversion script from command line:
```bash
-python src/transformers/convert_marian_tatoeba_to_pytorch.py --models heb-eng eng-heb --save_dir converted
+python src/transformers/models/marian/convert_marian_tatoeba_to_pytorch.py --models heb-eng eng-heb --save_dir converted
```
To convert lots of models you can pass your list of Tatoeba model names to `resolver.convert_models` in a python client or script.
@@ -48,10 +48,22 @@ resolver.convert_models(['heb-eng', 'eng-heb'])
### Upload converted models
+Since version v3.5.0, the model sharing workflow has switched to a git-based system. Refer to the [model sharing doc](https://huggingface.co/transformers/master/model_sharing.html#model-sharing-and-uploading) for more details.
To upload all converted models,
+1. Install [git-lfs](https://git-lfs.github.com/).
+2. Log in to `transformers-cli`
```bash
-cd converted
transformers-cli login
-for FILE in *; do transformers-cli upload $FILE; done
```
+3. Run the `upload_models` script
+```bash
+./scripts/tatoeba/upload_models.sh
+```
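
Before running the upload steps, it can help to confirm that the command-line tools they rely on are installed. The following is a minimal Python sketch, not part of this commit. It assumes the tools are exposed on `PATH` as `git`, `git-lfs`, and `transformers-cli`; the helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names for the upload workflow described above.
    missing = missing_tools(["git", "git-lfs", "transformers-cli"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

The check only looks for executables on `PATH`; it does not verify that `transformers-cli login` has already been run.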

@@ -0,0 +1,12 @@
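#!/bin/bash
# Create a Hub repo for each converted model, then commit and push its files.
for FILE in converted/*; do
    model_name=$(basename "$FILE")
    transformers-cli repo create "$model_name" -y
    git clone "https://huggingface.co/Helsinki-NLP/$model_name"
    mv "$FILE"/* "$model_name"/
    cd "$model_name" || continue  # skip this model if the clone failed
    git add . && git commit -m "initial commit"
    git push
    cd ..
done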
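
To preview what the loop above would touch before creating any repos, a dry run can list each model name with the URL it would be cloned from. This is a hedged Python sketch, not part of the commit. The function name `planned_repos` is hypothetical, and the `Helsinki-NLP` organization and URL pattern are taken from the script. Unlike the script, it skips plain files in the directory.

```python
from pathlib import Path


def planned_repos(converted_dir, org="Helsinki-NLP"):
    """Return (model_name, clone_url) pairs for each model directory.

    Only inspects the filesystem: no repos are created and nothing is pushed.
    """
    repos = []
    for model_dir in sorted(Path(converted_dir).iterdir()):
        if not model_dir.is_dir():
            # The shell script iterates over every entry; this sketch skips files.
            continue
        name = model_dir.name
        repos.append((name, f"https://huggingface.co/{org}/{name}"))
    return repos


if __name__ == "__main__":
    target = Path("converted")
    if target.is_dir():
        for name, url in planned_repos(target):
            print(f"{name} -> {url}")
    else:
        print("No 'converted' directory found.")
```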