Commit Graph

15000 Commits

Author SHA1 Message Date
isaac-vidas dafd59512c
[`Llava`] Update convert_llava_weights_to_hf.py script (#28617)
* Update convert_llava_weights_to_hf.py script

* Remove config update of adding padding to `vocab_size` and `text_config.vocab_size` which causes `ValueError` exception.
* Remove keys that ends with `inv_freq` from the state dict.
* Add examples and instructions for creating `model_state_dict.bin` that can be used by the script.

* Update convert_llava_weights_to_hf.py

* Update convert_vipllava_weights_to_hf.py
2024-01-22 15:28:18 +01:00
bofeng huang deb2b59073
Fix lr_scheduler in no_trainer training scripts (#27872)
* Fix lr_scheduler

* Fix lr scheduler
2024-01-22 14:22:18 +00:00
Matt 692c3c6b73
Add config tip to custom model docs (#28601)
Add tip to custom model docs
2024-01-22 13:46:04 +00:00
Yih-Dar d336c56d94
Avoid root logger's level being changed (#28638)
* avoid root logger's level being changed

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-22 14:45:30 +01:00
Matt bf674153d3
Add missing key to TFLayoutLM signature (#28640)
Fix missing bbox in LayoutLM signature
2024-01-22 13:16:29 +00:00
jheitmann f0acf7b6d8
Fix id2label assignment in run_classification.py (#28590) 2024-01-22 11:31:31 +00:00
Arthur 83f9196cc4
[`GPTNeoX`] Fix BC issue with 4.36 (#28602)
* fix dtype issue

* add a test

* update copied from mentions

* nits

* fixup

* fix copies

* Apply suggestions from code review
2024-01-21 17:01:19 +00:00
Sangbum Daniel Choi 3f69f415ad
Fix auxiliary loss related code in transformers (#28406)
* [DETA] fix freeze/unfreeze function

* Update src/transformers/models/deta/modeling_deta.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/deta/modeling_deta.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add freeze/unfreeze test case in DETA

* fix type

* fix typo 2

* fix : enable aux and enc loss in training pipeline

* Add unsynced variables from original DETA for training

* modification for passing CI test

* make style

* make fix

* manual make fix

* change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking

* remove print

* divide configuration in DetaModel and DetaForObjectDetection

* image smaller size than 224 will give topk error

* pred_boxes and logits should be equivalent to two_stage_num_proposals

* add missing part in DetaConfig

* Update src/transformers/models/deta/modeling_deta.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add docstring in configure and prettify TO DO part

* change distribute related code to accelerate

* Update src/transformers/models/deta/configuration_deta.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/deta/test_modeling_deta.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* protect importing accelerate

* change variable name to specific value

* wrong import

* fix aux_loss in conditional_detr

* add test aux_loss

* add aux_loss test in deta and table_transformer

* fix yolos since it doesn't have auxiliary function

* fix maskformer auxiliary_loss related code

* make style

* change param 'auxiliary_loss' to 'use_auxiliary_loss'

* change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests

* make style & fix-copies, also revert yolos related parameter

* revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig

* revert variable name in yolos

* revert maskformer

* add aux_loss test in maskformer

* make style

* Update src/transformers/models/yolos/configuration_yolos.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-19 14:12:01 +00:00
Joao Gante 948ffff407
RWKV: raise informative exception when attempting to manipulate `past_key_values` (#28600) 2024-01-19 14:09:36 +00:00
Ofir Zafrir 9efec11400
Fix `_speculative_sampling` implementation (#28508) 2024-01-19 14:07:31 +00:00
Matt d15781597a
Allow add_tokens for ESM (#28535)
* Allow non-special tokens to be added

* Add test, fix token adding code

* Revert changes to id_to_token and token_to_id

* Update the ESM tokenizer to be a bit more standardized

* Update src/transformers/models/esm/tokenization_esm.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-19 12:32:05 +00:00
isaac-vidas 5b7f4bc6c1
[`Llava`] Fix convert_llava_weights_to_hf.py script (#28570)
* Update convert_llava_weights_to_hf.py

Fix call to `tokenizer.add_tokens`

* Add special_tokens to tokenizer.add_tokens in convert_vipllava_weights_to_hf.py
2024-01-19 13:31:25 +01:00
NielsRogge faf03541e2
[SigLIP] Don't pad by default (#28578)
First draft
2024-01-19 13:30:00 +01:00
Fanli Lin 8db64367b2
Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386)
* remove elif xpu

* remove redudant code
2024-01-19 13:28:53 +01:00
Patrick von Platen 690fe73f20
[Whisper] Finalize batched SOTA long-form generation (#27658)
* finalize

* make fix copies whisper

* [Tests] Make sure that we don't run tests mulitple times

* Update src/transformers/models/whisper/modeling_whisper.py

* [Tests] Make sure that we don't run tests mulitple times

* fix more

* improve

* improve

* improve further

* improve more

* improve

* fix more

* git commit and git push

* fix more

* fix more

* fix more

* New try

* Fix more whisper stuff

* Improve

* correct more

* correct more

* correct more

* Fix some tests

* Add more tests

* correct more

* correct more

* correct more

* push

* correct more

* Fix more

* Better

* without dec mask

* correct more

* clean

* save intermediate

* Fix more

* Fix VAD for large-v2

* Save new

* Correct more

* make cleaner

* correct tests

* correct src

* Finish

* Fix more

* Fix more

* finish

* Fix edge cases

* fix return_dict_in_generate

* fix all tests

* make style

* add docstrings

* add docstrings

* Fix logit processor

* make style

* fix pipeline test

* fix more style

* Apply suggestions from code review

* apply feedback Sanchit

* correct more

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* correct more

* correct more

* correct more

* Fix staticmethod

* correct more

* fix

* fix slow tests

* make style

* fix tokenizer test

* fix tokenizer test

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* finish

* finish

* revert kwargs change

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-19 14:04:17 +02:00
Saibo-creator d4fc1eb498
feat: Sequential beam search (#26304) 2024-01-19 11:36:54 +00:00
Yoach Lacombe 268fc1fdfa
Add w2v2bert to pipeline (#28585)
* generalize asr pipeline to fbank models

* change w2v2 pipeline output

* Update test_pipelines_automatic_speech_recognition.py
2024-01-19 11:25:01 +00:00
Amy Roberts b2748a6efd v4.38.dev.0 2024-01-19 10:43:28 +00:00
Yih-Dar db9a7e9d3d
Don't save `processor_config.json` if a processor has no extra attribute (#28584)
* not save if empty

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-19 09:59:14 +00:00
Yoach Lacombe 772307be76
Making CTC training example more general (#28582)
* add w2v2bert compatibility

* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-18 17:01:49 +00:00
Sanchit Gandhi 186aa6befe
[Whisper] Fix audio classification with weighted layer sum (#28563)
* fix

* tests

* fix test
2024-01-18 16:41:44 +00:00
Sanchit Gandhi 619ecfe26f
[Whisper Tok] Move token ids to CPU when computing offsets (#28485)
* move token ids to cpu

* check for torch attr
2024-01-18 16:12:14 +00:00
Sanchit Gandhi 0eaa5ea38e
[ASR Pipe] Update init to set model type and subsequently call parent init method (#28486)
* add image processor arg

* super

* rm args
2024-01-18 16:11:49 +00:00
Jeremy Fowers c662c78c71
Fix the documentation checkpoint for xlm-roberta-xl (#28567)
* Fix the documentation checkpoint for xlm-roberta-xl

* Improve docstring consistency
2024-01-18 13:47:49 +00:00
Yih-Dar 0754217c82
Use `LoggingLevel` context manager in 3 tests (#28575)
* inside with LoggingLevel

* remove is_flaky

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-18 13:41:25 +00:00
Yoach Lacombe d2cdefb9ec
Add new meta w2v2-conformer BERT-like model (#28165)
* first commit

* correct default value non causal

* update config and modeling code

* update converting checkpoint

* clean modeling and fix tests

* make style

* add new config parameters to docstring

* fix copied from statements

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* make position_embeddings_type docstrings clearer

* clean converting script

* remove function not used

* clean modeling file

* apply suggestion for test file + add convert script to not_doctested

* modify tests according to review - cleaner logic and more tests

* Apply nit suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add checker of valid position embeddings type

* instantiate new layer norm layer with the right eps

* fix freeze_feature_encoder since it can be None in some cases

* add test same output in convert script

* restore wav2vec2conformer and add new model

* create processor and FE + clean

* add new model code

* fix convert script and set default config parameters

* correct model id paths

* make style

* make fix-copies and cleaning files

* fix copied from statements

* complete .md and fixe copies

* clean convert script argument defaults

* fix config parameters docstrings

* fix config docstring

* add copied from and enrich FE tests

* fix copied from and repo-consistency

* add autotokenizer

* make test input length shorter and change docstring code

* fix docstrings and copied from

* add add_adapter to ASR training example

* make testing of adapters more robust

* adapt to multi adapter layers

* refactor input_values->input_features and remove w2v2-bert feature extractor

* remove pretraining model

* remove depreciated features and useless lines

* add copied from and ignore statements to modeling tests

* remove pretraining model #2

* change import in convert script

* change default in convert script

* update readme and remove useless line

* Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor BERT to Bert for consistency

* remove useless ignore copy statement

* add persistent to buffer in rotary

* add eps in LayerNorm init and remove copied from

* add adapter activation parameters and add copied from statements

* Fix copied statements and add unitest.skip reasons

* add copied statement in test_processor

* refactor processor

* make style

* replace numpy random by torch rand

* remove expected output CTC

* improve converting script with processor class

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* remove gumbel class

* remove tests related to previously deleted class

* Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* correct typos

* remove uused parameters

* update processor to takes both text and audio

* update checkpoints

* update expected output and add ctc expected output

* add label_attention_mask

* replace pt with np in processor tests

* fix typo

* revert to behaviour with labels_attention_mask

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-18 13:37:34 +00:00
hugo-syn 5d8eb93eee
chore: Fix multiple typos (#28574) 2024-01-18 13:35:09 +00:00
Arthur 8189977885
[`Core Tokenization`] Support a fix for spm fast models (#26678)
* fix

* last attempt

* current work

* fix forward compatibility

* save all special tokens

* current state

* revert additional changes

* updates

* remove tokenizer.model

* add a test and the fix

* nit

* revert one more break

* fix typefield issue

* quality

* more tests

* fix fields for FC

* more nits?

* new additional changes

* how

* some updates

* the fix

* where do we stand

* nits

* nits

* revert unrelated changes

* nits nits nits

* styling

* don't break llama just yet

* revert llama changes

* safe arg check

* fixup

* Add a test for T5

* Necessary changes

* Tests passing, added tokens need to not be normalized. If the added tokens are normalized, it will the stripping which seems to be unwanted for a normal functioning

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Update to main

* nits

* fmt

* more and more test

* comments

* revert change as tests are failing

* make the test more readble

* nits

* refactor the test

* nit

* updates

* simplify

* style

* style

* style convert slow

* Update src/transformers/convert_slow_tokenizer.py
2024-01-18 12:31:54 +01:00
Yih-Dar a1668cc72e
Use `weights_only` only if torch >= 1.13 (#28506)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-18 10:55:29 +00:00
Yih-Dar 3005f96552
Save `Processor` (#27761)
* save processor

* Update tests/models/auto/test_processor_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/test_processing_common.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-18 10:21:45 +00:00
Ahmed Elnaggar 98dda8ed03
Fix Switch Transformers When sparse_step = 1 (#28564)
Fix sparse_step = 1

I case sparse_step = 1, the current code will not work.
2024-01-17 21:26:21 +00:00
Lucas Thompson fa6d12f74f
Allow to train dinov2 with different dtypes like bf16 (#28504)
I want to train dinov2 with bf16 but I get the following error in bc72b4e2cd/src/transformers/models/dinov2/modeling_dinov2.py (L635):

```
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
```

Since the input dtype is torch.float32, the parameter dtype has to be torch.float32...

@LZHgrla and I checked the code of clip vision encoder and found there is an automatic dtype transformation (bc72b4e2cd/src/transformers/models/clip/modeling_clip.py (L181-L182)).

So I add similar automatic dtype transformation to modeling_dinov2.py.
2024-01-17 19:03:08 +00:00
fxmarty 2c1eebc121
Fix SDPA tests (#28552)
* skip bf16 test if not supported by device

* fix

* fix bis

* use is_torch_bf16_available_on_device

* use is_torch_fp16_available_on_device

* fix & use public llama

* use 1b model

* fix flacky test

---------

Co-authored-by: Your Name <you@example.com>
2024-01-17 17:29:18 +01:00
Junyang Lin d6ffe74dfa
Add qwen2 (#28436)
* add config, modeling, and tokenization

* add auto and init

* update readme

* update readme

* update team name

* fixup

* fixup

* update config

* update code style

* update for fixup

* update for fixup

* update for fixup

* update for testing

* update for testing

* fix bug for config and tokenization

* fix bug for bos token

* not doctest

* debug tokenizer

* not doctest

* debug tokenization

* debug init for tokenizer

* fix style

* update init

* delete if in token auto

* add tokenizer doc

* add tokenizer in init

* Update dummy_tokenizers_objects.py

* update

* update

* debug

* Update tokenization_qwen2.py

* debug

* Update convert_slow_tokenizer.py

* add copies

* add copied from and make style

* update files map

* update test

* fix style

* fix merge reading and update tests

* fix tests

* fix tests

* fix style

* debug a variable in readme

* Update src/transformers/models/qwen2/configuration_qwen2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update test and copied from

* fix style

* update qwen2 tokenization  and tests

* Update tokenization_qwen2.py

* delete the copied from after property

* fix style

* update tests

* update tests

* add copied from

* fix bugs

* update doc

* add warning for sliding window attention

* update qwen2 tokenization

* fix style

* Update src/transformers/models/qwen2/modeling_qwen2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix tokenizer fast

---------

Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-17 16:02:22 +01:00
Gustavo de Rosa d93ef7d751
Fixes default value of `softmax_scale` in `PhiFlashAttention2`. (#28537)
* fix(phi): Phi does not use softmax_scale in Flash-Attention.

* chore(docs): Update Phi docs.
2024-01-17 14:22:44 +01:00
fxmarty a6adc05e6b
symbolic_trace: add past_key_values, llama, sdpa support (#28447)
* torch.fx: add pkv, llama, sdpa support

* Update src/transformers/models/opt/modeling_opt.py

* remove spaces

* trigger ci

* use explicit variable names
2024-01-17 11:50:53 +01:00
Patrick von Platen 09eb11a1bd
[Makefile] Exclude research projects from format (#28551) 2024-01-17 11:59:40 +02:00
Joao Gante f4f57f9dfa
Config: warning when saving generation kwargs in the model config (#28514) 2024-01-16 18:31:01 +00:00
inisis 7142bdfa90
Add is_model_supported for fx (#28521)
* modify check_if_model_is_supported to return bool

* add is_model_supported and have check_if_model_is_supported use that

* Update src/transformers/utils/fx.py

Fantastic

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-16 17:52:44 +00:00
fxmarty 02f8738ef8
Clearer error for SDPA when explicitely requested (#28006)
* clearer error for sdpa

* better message
2024-01-16 16:10:44 +00:00
Arthur fe23256b73
[`SpeechT5Tokenization`] Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme (#28522)
* Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme

* fixup

* add a small test

* style test file

* nites
2024-01-16 16:50:02 +01:00
Arthur 96d0883103
[`TokenizationRoformerFast`] Fix the save and loading (#28527)
* cleanup

* add a test

* update the test

* style

* revert part that allows to pickle the tokenizer
2024-01-16 16:37:15 +01:00
Arthur 716df5fb7e
[ `TokenizationUtils`] Fix `add_special_tokens` when the token is already there (#28520)
* fix adding special tokens when the token is already there.

* add a test

* add a test

* nit

* fix the test: make sure the order is preserved

* Update tests/test_tokenization_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-16 16:36:29 +01:00
Nima Yaqmuri 07ae53e6e7
Fix/speecht5 bug (#28481)
* Fix bug in SpeechT5 speech decoder prenet's forward method

- Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
- Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
- This change resolves a critical bug affecting the model's performance in handling speaker embeddings.

* Refactor SpeechT5 text to speech integration tests

- Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
- Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
- Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
- Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.

* Fix bug in SpeechT5 speech decoder prenet's forward method

- Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
- Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
- This change resolves a critical bug affecting the model's performance in handling speaker embeddings.

* Refactor SpeechT5 text to speech integration tests

- Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
- Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
- Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
- Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.

* Enhance handling of speaker embeddings in SpeechT5

- Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch).
- The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions.
- Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations.

* Improve Test Robustness with Randomized Speaker Embeddings
2024-01-16 14:14:28 +00:00
fxmarty 66db33ddc8
Fix mismatching loading in from_pretrained with/without accelerate (#28414)
* fix mismatching behavior in from_pretrained with/without accelerate

* meaningful refactor

* remove added space

* add test

* fix model on the hub

* comment

* use tiny model

* style
2024-01-16 14:29:51 +01:00
Hamza FILALI 002566f398
Improving Training Performance and Scalability Documentation (#28497)
* Improving Training Performance and Scaling documentation by adding PEFT techniques to suggestions to reduce memory requirements for training

* Update docs/source/en/perf_train_gpu_one.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-16 11:30:26 +01:00
regisss 0cdcd7a2b3
Remove `task` arg in `load_dataset` in image-classification example (#28408)
* Remove `task` arg in `load_dataset` in image-classification example

* Manage case where "train" is not in dataset

* Add new args to manage image and label column names

* Similar to audio-classification example

* Fix README

* Update tests
2024-01-16 08:04:08 +01:00
amyeroberts edb170238f
SiLU activation wrapper for safe importing (#28509)
Add back in wrapper for safe importing
2024-01-15 19:36:59 +00:00
Timothy Cronin ff86bc364d
improve dev setup comments and hints (#28495)
* improve dev setup comments and hints

* fix tests for new dev setup hints
2024-01-15 18:36:40 +00:00
Boris Dayma 735968b61c
fix: sampling in flax keeps EOS (#28378) 2024-01-15 18:12:09 +00:00