Susnato Dhar
404ff8fc17
Fix typo (#25966)
* Update feature_extraction_clap.py
* changed all lenght to length
2023-09-05 10:12:25 +02:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions (#21694)
2023-02-22 09:14:54 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
code-review-doctor
a2392415e9
Some tests misusing assertTrue for comparisons fix (#16771)
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor
* fix tests
* fix tf
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:44:08 +02:00
Anton Lozhkov
1417978cd4
[SequenceFeatureExtractor] Rewrite padding logic from pure python to numpy (#13650)
* Test np padding
* Pass feature extraction tests
* Update type hints
* Fix flaky integration tests
* Try a more stable waveform
* Add to_numpy jax support
* int32 attention masks
* Refactor normalization tests
2021-09-21 17:10:13 +03:00
Patrick von Platen
f6e254474c
[Sequence Feature Extraction] Add truncation (#12804)
* fix_torch_device_generate_test
* remove @
* add truncate
* finish
* correct test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* clean tests
* correct normalization for truncation
* remove casting
* up
* save intermed
* finish
* finish
* correct
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-23 17:53:30 +02:00
Suraj Patil
d26b37e744
Speech2TextTransformer (#10175)
* s2t
* fix config
* conversion script
* fix import
* add tokenizer
* fix tok init
* fix tokenizer
* first version working
* fix embeds
* fix lm head
* remove extra heads
* fix convert script
* handle encoder attn mask
* style
* better enc attn mask
* override _prepare_attention_mask_for_generation
* handle attn_mask in encoder and decoder
* input_ids => input_features
* enable use_cache
* remove old code
* expand embeddings if needed
* remove logits bias
* masked_lm_loss => loss
* hack tokenizer to support feature processing
* fix model_input_names
* style
* fix error message
* doc
* remove inputs_embeds
* remove input_embeds
* remove unnecessary docstring
* quality
* SpeechToText => Speech2Text
* style
* remove shared_embeds
* subsample => conv
* remove Speech2TextTransformerDecoderWrapper
* update output_lengths formula
* fix table
* remove max_position_embeddings
* update conversion scripts
* add possibility to do upper case for now
* add FeatureExtractor and Processor
* add tests for extractor
* require_torch_audio => require_torchaudio
* add processor test
* update import
* remove classification head
* attention mask is now 1D
* update docstrings
* attention mask should be of type long
* handle attention mask from generate
* always return attention_mask
* fix test
* style
* doc
* Speech2TextTransformer => Speech2Text
* Speech2TextTransformerConfig => Speech2TextConfig
* remove dummy_inputs
* nit
* style
* multilingual tok
* fix tokenizer
* add tgt_lang setter
* save lang_codes
* fix tokenizer
* add forced_bos_token_id to tokenizer
* apply review suggestions
* add torchaudio to extra deps
* add speech deps to CI
* fix dep
* add libsndfile to ci
* libsndfile1
* add speech to extras all
* libsndfile1 -> libsndfile1
* libsndfile
* libsndfile1-dev
* apt update
* add sudo to install
* update deps table
* install libsndfile1-dev on CI
* tuple to list
* init conv layer
* add model tests
* quality
* add integration tests
* skip_special_tokens
* add speech_to_text_transformer in toctree
* fix tokenizer
* fix fp16 tests
* add tokenizer tests
* fix copyright
* input_values => input_features
* doc
* add model in readme
* doc
* change checkpoint names
* fix copyright
* fix code example
* add max_model_input_sizes in tokenizer
* fix integration tests
* add do_lower_case to tokenizer
* remove clamp trick
* fix "Add modeling imports here"
* fix copyrights
* fix tests
* SpeechToTextTransformer => SpeechToText
* fix naming
* fix table formatting
* fix typo
* style
* fix typos
* remove speech dep from extras[testing]
* fix copies
* rename doc file
* put imports under is_torch_available
* run feat extract tests when torch is available
* dummy objects for processor and extractor
* fix imports in tests
* fix import in modeling test
* fix imports
* fix torch import
* fix imports again
* fix positional embeddings
* fix typo in import
* adapt new extractor refactor
* style
* fix torchscript test
* doc
* doc
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix docs, copied from, style
* fix docstring
* handle imports
* remove speech from all extra deps
* remove s2t from seq2seq lm mapping
* better names
* skip training tests
* add install instructions
* List => Tuple
* doc
* fix conversion script
* fix urls
* add instruction for libsndfile
* fix fp16 test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Patrick von Platen
9a06b6b11b
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (#10594)
* save first version
* finish refactor
* finish refactor
* correct naming
* correct naming
* shorter names
* Update src/transformers/feature_extraction_common_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* change name
* finish
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 12:16:59 +03:00