transformers/utils/not_doctested.txt

docs/source/en/_config.py
docs/source/en/accelerate.md
docs/source/en/add_new_model.md
docs/source/en/add_new_pipeline.md
docs/source/en/agents.md
docs/source/en/attention.md
docs/source/en/benchmarks.md
docs/source/en/bertology.md
docs/source/en/big_models.md
docs/source/en/community.md
docs/source/en/contributing.md
docs/source/en/create_a_model.md
docs/source/en/custom_models.md
docs/source/en/debugging.md
docs/source/en/fast_tokenizers.md
docs/source/en/glossary.md
docs/source/en/hpo_train.md
docs/source/en/index.md
docs/source/en/installation.md
docs/source/en/internal/audio_utils.md
docs/source/en/internal/file_utils.md
docs/source/en/internal/image_processing_utils.md
docs/source/en/internal/modeling_utils.md
docs/source/en/internal/pipelines_utils.md
docs/source/en/internal/time_series_utils.md
docs/source/en/internal/tokenization_utils.md
docs/source/en/internal/trainer_utils.md
docs/source/en/llm_tutorial.md
docs/source/en/main_classes/agent.md
docs/source/en/main_classes/callback.md
docs/source/en/main_classes/configuration.md
docs/source/en/main_classes/data_collator.md
docs/source/en/main_classes/deepspeed.md
docs/source/en/main_classes/feature_extractor.md
docs/source/en/main_classes/image_processor.md
docs/source/en/main_classes/keras_callbacks.md
docs/source/en/main_classes/logging.md
docs/source/en/main_classes/model.md
docs/source/en/main_classes/onnx.md
docs/source/en/main_classes/optimizer_schedules.md
docs/source/en/main_classes/output.md
docs/source/en/main_classes/pipelines.md
docs/source/en/main_classes/processors.md
docs/source/en/main_classes/quantization.md
docs/source/en/main_classes/tokenizer.md
docs/source/en/main_classes/trainer.md
docs/source/en/model_doc/albert.md
docs/source/en/model_doc/align.md
docs/source/en/model_doc/altclip.md
docs/source/en/model_doc/audio-spectrogram-transformer.md
docs/source/en/model_doc/auto.md
docs/source/en/model_doc/autoformer.md
docs/source/en/model_doc/bark.md
docs/source/en/model_doc/bart.md
docs/source/en/model_doc/barthez.md
docs/source/en/model_doc/bartpho.md
docs/source/en/model_doc/beit.md
docs/source/en/model_doc/bert-generation.md
docs/source/en/model_doc/bert-japanese.md
docs/source/en/model_doc/bert.md
docs/source/en/model_doc/bertweet.md
docs/source/en/model_doc/big_bird.md
docs/source/en/model_doc/bigbird_pegasus.md
docs/source/en/model_doc/biogpt.md
docs/source/en/model_doc/bit.md
docs/source/en/model_doc/blenderbot-small.md
docs/source/en/model_doc/blenderbot.md
docs/source/en/model_doc/blip-2.md
docs/source/en/model_doc/blip.md
docs/source/en/model_doc/bloom.md
docs/source/en/model_doc/bort.md
docs/source/en/model_doc/bridgetower.md
docs/source/en/model_doc/camembert.md
docs/source/en/model_doc/canine.md
docs/source/en/model_doc/chinese_clip.md
docs/source/en/model_doc/clap.md
docs/source/en/model_doc/clip.md
docs/source/en/model_doc/clipseg.md
docs/source/en/model_doc/codegen.md
docs/source/en/model_doc/conditional_detr.md
docs/source/en/model_doc/convbert.md
docs/source/en/model_doc/convnext.md
docs/source/en/model_doc/convnextv2.md
docs/source/en/model_doc/cpm.md
docs/source/en/model_doc/cpmant.md
docs/source/en/model_doc/ctrl.md
docs/source/en/model_doc/cvt.md
docs/source/en/model_doc/data2vec.md
docs/source/en/model_doc/deberta-v2.md
docs/source/en/model_doc/deberta.md
docs/source/en/model_doc/decision_transformer.md
docs/source/en/model_doc/deformable_detr.md
docs/source/en/model_doc/deit.md
docs/source/en/model_doc/deplot.md
docs/source/en/model_doc/deta.md
docs/source/en/model_doc/detr.md
docs/source/en/model_doc/dialogpt.md
docs/source/en/model_doc/dinat.md
docs/source/en/model_doc/dinov2.md
docs/source/en/model_doc/distilbert.md
docs/source/en/model_doc/dit.md
docs/source/en/model_doc/dpr.md
docs/source/en/model_doc/dpt.md
docs/source/en/model_doc/efficientformer.md
docs/source/en/model_doc/efficientnet.md
docs/source/en/model_doc/electra.md
docs/source/en/model_doc/encodec.md
docs/source/en/model_doc/ernie.md
docs/source/en/model_doc/ernie_m.md
docs/source/en/model_doc/esm.md
docs/source/en/model_doc/flan-t5.md
docs/source/en/model_doc/flan-ul2.md
docs/source/en/model_doc/flaubert.md
docs/source/en/model_doc/flava.md
docs/source/en/model_doc/fnet.md
docs/source/en/model_doc/focalnet.md
docs/source/en/model_doc/fsmt.md
docs/source/en/model_doc/funnel.md
docs/source/en/model_doc/git.md
docs/source/en/model_doc/glpn.md
docs/source/en/model_doc/gpt-sw3.md
docs/source/en/model_doc/gpt2.md
docs/source/en/model_doc/gpt_bigcode.md
docs/source/en/model_doc/gpt_neo.md
docs/source/en/model_doc/gpt_neox.md
docs/source/en/model_doc/gpt_neox_japanese.md
docs/source/en/model_doc/gptj.md
docs/source/en/model_doc/gptsan-japanese.md
docs/source/en/model_doc/graphormer.md
docs/source/en/model_doc/groupvit.md
docs/source/en/model_doc/herbert.md
docs/source/en/model_doc/hubert.md
docs/source/en/model_doc/ibert.md
docs/source/en/model_doc/idefics.md
docs/source/en/model_doc/imagegpt.md
docs/source/en/model_doc/informer.md
docs/source/en/model_doc/instructblip.md
docs/source/en/model_doc/jukebox.md
docs/source/en/model_doc/layoutlm.md
docs/source/en/model_doc/layoutlmv2.md
docs/source/en/model_doc/layoutlmv3.md
docs/source/en/model_doc/layoutxlm.md
docs/source/en/model_doc/led.md
docs/source/en/model_doc/levit.md
docs/source/en/model_doc/lilt.md
docs/source/en/model_doc/llama.md
docs/source/en/model_doc/llama2.md
docs/source/en/model_doc/llava.md
docs/source/en/model_doc/llava_next.md
docs/source/en/model_doc/longformer.md
docs/source/en/model_doc/longt5.md
docs/source/en/model_doc/luke.md
docs/source/en/model_doc/lxmert.md
docs/source/en/model_doc/m2m_100.md
docs/source/en/model_doc/madlad-400.md
docs/source/en/model_doc/marian.md
docs/source/en/model_doc/mask2former.md
docs/source/en/model_doc/maskformer.md
docs/source/en/model_doc/matcha.md
docs/source/en/model_doc/mbart.md
docs/source/en/model_doc/mctct.md
docs/source/en/model_doc/mega.md
docs/source/en/model_doc/megatron-bert.md
docs/source/en/model_doc/megatron_gpt2.md
docs/source/en/model_doc/mgp-str.md
docs/source/en/model_doc/mistral.md
docs/source/en/model_doc/mixtral.md
docs/source/en/model_doc/mluke.md
docs/source/en/model_doc/mms.md
docs/source/en/model_doc/mobilebert.md
docs/source/en/model_doc/mobilenet_v1.md
docs/source/en/model_doc/mobilenet_v2.md
docs/source/en/model_doc/mobilevit.md
docs/source/en/model_doc/mobilevitv2.md
docs/source/en/model_doc/mpnet.md
docs/source/en/model_doc/mpt.md
docs/source/en/model_doc/mra.md
docs/source/en/model_doc/mt5.md
docs/source/en/model_doc/musicgen.md
docs/source/en/model_doc/musicgen_melody.md
docs/source/en/model_doc/mvp.md
docs/source/en/model_doc/nat.md
docs/source/en/model_doc/nezha.md
docs/source/en/model_doc/nllb-moe.md
docs/source/en/model_doc/nllb.md
docs/source/en/model_doc/nystromformer.md
docs/source/en/model_doc/oneformer.md
docs/source/en/model_doc/open-llama.md
docs/source/en/model_doc/openai-gpt.md
docs/source/en/model_doc/opt.md
docs/source/en/model_doc/owlvit.md
docs/source/en/model_doc/pegasus.md
docs/source/en/model_doc/pegasus_x.md
docs/source/en/model_doc/perceiver.md
docs/source/en/model_doc/phobert.md
docs/source/en/model_doc/pix2struct.md
docs/source/en/model_doc/plbart.md
docs/source/en/model_doc/poolformer.md
docs/source/en/model_doc/pop2piano.md
docs/source/en/model_doc/prophetnet.md
docs/source/en/model_doc/pvt.md
docs/source/en/model_doc/qdqbert.md
docs/source/en/model_doc/qwen2.md
docs/source/en/model_doc/qwen2_moe.md
docs/source/en/model_doc/rag.md
docs/source/en/model_doc/realm.md
docs/source/en/model_doc/reformer.md
docs/source/en/model_doc/regnet.md
docs/source/en/model_doc/rembert.md
docs/source/en/model_doc/resnet.md
docs/source/en/model_doc/retribert.md
docs/source/en/model_doc/roberta-prelayernorm.md
docs/source/en/model_doc/roberta.md
docs/source/en/model_doc/roc_bert.md
docs/source/en/model_doc/roformer.md
docs/source/en/model_doc/rwkv.md
docs/source/en/model_doc/sam.md
docs/source/en/model_doc/segformer.md
docs/source/en/model_doc/sew-d.md
docs/source/en/model_doc/sew.md
docs/source/en/model_doc/speech-encoder-decoder.md
docs/source/en/model_doc/speech_to_text_2.md
docs/source/en/model_doc/speecht5.md
docs/source/en/model_doc/splinter.md
docs/source/en/model_doc/squeezebert.md
docs/source/en/model_doc/swiftformer.md
docs/source/en/model_doc/swin.md
docs/source/en/model_doc/swin2sr.md
docs/source/en/model_doc/swinv2.md
docs/source/en/model_doc/table-transformer.md
docs/source/en/model_doc/tapas.md
docs/source/en/model_doc/time_series_transformer.md
docs/source/en/model_doc/timesformer.md
docs/source/en/model_doc/trajectory_transformer.md
docs/source/en/model_doc/transfo-xl.md
docs/source/en/model_doc/trocr.md
docs/source/en/model_doc/tvlt.md
docs/source/en/model_doc/ul2.md
docs/source/en/model_doc/umt5.md
docs/source/en/model_doc/unispeech-sat.md
docs/source/en/model_doc/unispeech.md
docs/source/en/model_doc/upernet.md
docs/source/en/model_doc/van.md
docs/source/en/model_doc/videomae.md
docs/source/en/model_doc/vilt.md
docs/source/en/model_doc/vipllava.md
docs/source/en/model_doc/vision-encoder-decoder.md
docs/source/en/model_doc/vision-text-dual-encoder.md
docs/source/en/model_doc/visual_bert.md
docs/source/en/model_doc/vit.md
docs/source/en/model_doc/vit_hybrid.md
docs/source/en/model_doc/vit_mae.md
docs/source/en/model_doc/vit_msn.md
docs/source/en/model_doc/vivit.md
docs/source/en/model_doc/wav2vec2-conformer.md
docs/source/en/model_doc/wav2vec2.md
docs/source/en/model_doc/wav2vec2_phoneme.md
docs/source/en/model_doc/wavlm.md
docs/source/en/model_doc/whisper.md
docs/source/en/model_doc/xclip.md
docs/source/en/model_doc/xglm.md
docs/source/en/model_doc/xlm-prophetnet.md
docs/source/en/model_doc/xlm-roberta-xl.md
docs/source/en/model_doc/xlm-roberta.md
docs/source/en/model_doc/xlm-v.md
docs/source/en/model_doc/xlm.md
docs/source/en/model_doc/xlnet.md
docs/source/en/model_doc/xls_r.md
docs/source/en/model_doc/xlsr_wav2vec2.md
docs/source/en/model_doc/xmod.md
docs/source/en/model_doc/yolos.md
docs/source/en/model_doc/yoso.md
docs/source/en/model_memory_anatomy.md
docs/source/en/model_sharing.md
docs/source/en/model_summary.md
docs/source/en/multilingual.md
docs/source/en/notebooks.md
docs/source/en/pad_truncation.md
docs/source/en/peft.md
docs/source/en/perf_hardware.md
docs/source/en/perf_infer_cpu.md
docs/source/en/perf_infer_gpu_one.md
docs/source/en/perf_torch_compile.md
docs/source/en/perf_train_cpu.md
docs/source/en/perf_train_cpu_many.md
docs/source/en/perf_train_gpu_many.md
docs/source/en/perf_train_gpu_one.md
docs/source/en/perf_train_special.md
docs/source/en/perf_train_tpu_tf.md
docs/source/en/performance.md
docs/source/en/perplexity.md
docs/source/en/philosophy.md
docs/source/en/pipeline_webserver.md
docs/source/en/pr_checks.md
docs/source/en/preprocessing.md
docs/source/en/run_scripts.md
docs/source/en/sagemaker.md
docs/source/en/serialization.md
docs/source/en/tasks/asr.md
docs/source/en/tasks/audio_classification.md
docs/source/en/tasks/document_question_answering.md
docs/source/en/tasks/idefics.md
docs/source/en/tasks/image_captioning.md
docs/source/en/tasks/image_classification.md
docs/source/en/tasks/language_modeling.md
docs/source/en/tasks/masked_language_modeling.md
docs/source/en/tasks/monocular_depth_estimation.md
docs/source/en/tasks/multiple_choice.md
docs/source/en/tasks/object_detection.md
docs/source/en/tasks/question_answering.md
docs/source/en/tasks/semantic_segmentation.md
docs/source/en/tasks/sequence_classification.md
docs/source/en/tasks/summarization.md
docs/source/en/tasks/text-to-speech.md
docs/source/en/tasks/token_classification.md
docs/source/en/tasks/translation.md
docs/source/en/tasks/video_classification.md
docs/source/en/tasks/visual_question_answering.md
docs/source/en/tasks/zero_shot_image_classification.md
docs/source/en/tasks/zero_shot_object_detection.md
docs/source/en/tasks_explained.md
docs/source/en/tf_xla.md
docs/source/en/tflite.md
docs/source/en/tokenizer_summary.md
docs/source/en/torchscript.md
docs/source/en/training.md
docs/source/en/troubleshooting.md
src/transformers/activations.py
src/transformers/activations_tf.py
src/transformers/agents/agent_types.py
src/transformers/agents/agents.py
src/transformers/agents/document_question_answering.py
src/transformers/agents/evaluate_agent.py
src/transformers/agents/image_question_answering.py
src/transformers/agents/prompts.py
src/transformers/agents/python_interpreter.py
src/transformers/agents/speech_to_text.py
src/transformers/agents/text_to_speech.py
src/transformers/agents/tools.py
src/transformers/agents/translation.py
src/transformers/audio_utils.py
src/transformers/benchmark/benchmark.py
src/transformers/benchmark/benchmark_args.py
src/transformers/benchmark/benchmark_args_tf.py
src/transformers/benchmark/benchmark_args_utils.py
src/transformers/benchmark/benchmark_tf.py
src/transformers/benchmark/benchmark_utils.py
src/transformers/commands/add_new_model_like.py
src/transformers/commands/convert.py
src/transformers/commands/download.py
src/transformers/commands/env.py
src/transformers/commands/lfs.py
src/transformers/commands/pt_to_tf.py
src/transformers/commands/run.py
src/transformers/commands/serving.py
src/transformers/commands/train.py
src/transformers/commands/transformers_cli.py
src/transformers/commands/user.py
src/transformers/configuration_utils.py
src/transformers/convert_graph_to_onnx.py
src/transformers/convert_pytorch_checkpoint_to_tf2.py
src/transformers/convert_slow_tokenizer.py
src/transformers/convert_slow_tokenizers_checkpoints_to_fast.py
src/transformers/convert_tf_hub_seq_to_seq_bert_to_pytorch.py
src/transformers/data/data_collator.py
src/transformers/data/datasets/glue.py
src/transformers/data/datasets/language_modeling.py
src/transformers/data/datasets/squad.py
src/transformers/data/metrics/squad_metrics.py
src/transformers/data/processors/glue.py
src/transformers/data/processors/squad.py
src/transformers/data/processors/utils.py
src/transformers/data/processors/xnli.py
src/transformers/debug_utils.py
src/transformers/deepspeed.py
src/transformers/dependency_versions_check.py
src/transformers/dependency_versions_table.py
src/transformers/dynamic_module_utils.py
src/transformers/feature_extraction_sequence_utils.py
src/transformers/feature_extraction_utils.py
src/transformers/file_utils.py
src/transformers/hf_argparser.py
src/transformers/hyperparameter_search.py
src/transformers/image_processing_utils.py
src/transformers/image_transforms.py
src/transformers/image_utils.py
src/transformers/integrations/bitsandbytes.py
src/transformers/integrations/deepspeed.py
src/transformers/integrations/integration_utils.py
src/transformers/integrations/peft.py
src/transformers/keras_callbacks.py
src/transformers/modelcard.py
src/transformers/modeling_flax_outputs.py
src/transformers/modeling_flax_pytorch_utils.py
src/transformers/modeling_flax_utils.py
src/transformers/modeling_outputs.py
src/transformers/modeling_tf_outputs.py
src/transformers/modeling_tf_pytorch_utils.py
src/transformers/modeling_tf_utils.py
src/transformers/modeling_utils.py
src/transformers/models/albert/convert_albert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/albert/modeling_flax_albert.py
src/transformers/models/align/configuration_align.py
src/transformers/models/align/convert_align_tf_to_hf.py
src/transformers/models/align/modeling_align.py
src/transformers/models/altclip/configuration_altclip.py
src/transformers/models/altclip/modeling_altclip.py
src/transformers/models/audio_spectrogram_transformer/configuration_audio_spectrogram_transformer.py
src/transformers/models/audio_spectrogram_transformer/convert_audio_spectrogram_transformer_original_to_pytorch.py
src/transformers/models/auto/auto_factory.py
src/transformers/models/auto/configuration_auto.py
src/transformers/models/auto/modeling_auto.py
src/transformers/models/auto/modeling_flax_auto.py
src/transformers/models/auto/modeling_tf_auto.py
src/transformers/models/autoformer/configuration_autoformer.py
src/transformers/models/autoformer/modeling_autoformer.py
src/transformers/models/bark/convert_suno_to_hf.py
src/transformers/models/bart/convert_bart_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/bart/modeling_flax_bart.py
src/transformers/models/bart/modeling_tf_bart.py
src/transformers/models/beit/convert_beit_unilm_to_pytorch.py
src/transformers/models/beit/modeling_flax_beit.py
src/transformers/models/bert/convert_bert_original_tf2_checkpoint_to_pytorch.py
src/transformers/models/bert/convert_bert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/bert/convert_bert_pytorch_checkpoint_to_original_tf.py
src/transformers/models/bert/convert_bert_token_dropping_original_tf2_checkpoint_to_pytorch.py
src/transformers/models/bert/modeling_flax_bert.py
src/transformers/models/bert_generation/modeling_bert_generation.py
src/transformers/models/big_bird/convert_bigbird_original_tf_checkpoint_to_pytorch.py
src/transformers/models/big_bird/modeling_flax_big_bird.py
src/transformers/models/bigbird_pegasus/convert_bigbird_pegasus_tf_to_pytorch.py
src/transformers/models/biogpt/configuration_biogpt.py
src/transformers/models/biogpt/convert_biogpt_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/biogpt/modeling_biogpt.py
src/transformers/models/bit/configuration_bit.py
src/transformers/models/bit/convert_bit_to_pytorch.py
src/transformers/models/bit/modeling_bit.py
src/transformers/models/blenderbot/convert_blenderbot_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/blenderbot/modeling_flax_blenderbot.py
src/transformers/models/blenderbot/modeling_tf_blenderbot.py
src/transformers/models/blenderbot_small/modeling_flax_blenderbot_small.py
src/transformers/models/blenderbot_small/modeling_tf_blenderbot_small.py
src/transformers/models/blip/configuration_blip.py
src/transformers/models/blip/convert_blip_original_pytorch_to_hf.py
src/transformers/models/blip/modeling_blip_text.py
src/transformers/models/blip/modeling_tf_blip_text.py
src/transformers/models/blip_2/configuration_blip_2.py
src/transformers/models/blip_2/convert_blip_2_original_to_pytorch.py
src/transformers/models/blip_2/modeling_blip_2.py
src/transformers/models/bloom/convert_bloom_original_checkpoint_to_pytorch.py
src/transformers/models/bloom/modeling_bloom.py
src/transformers/models/bloom/modeling_flax_bloom.py
src/transformers/models/bridgetower/configuration_bridgetower.py
src/transformers/models/bridgetower/modeling_bridgetower.py
src/transformers/models/bros/convert_bros_to_pytorch.py
src/transformers/models/byt5/convert_byt5_original_tf_checkpoint_to_pytorch.py
src/transformers/models/camembert/modeling_camembert.py
src/transformers/models/camembert/modeling_tf_camembert.py
src/transformers/models/canine/convert_canine_original_tf_checkpoint_to_pytorch.py
src/transformers/models/chinese_clip/configuration_chinese_clip.py
src/transformers/models/chinese_clip/convert_chinese_clip_original_pytorch_to_hf.py
src/transformers/models/chinese_clip/modeling_chinese_clip.py
src/transformers/models/clap/convert_clap_original_pytorch_to_hf.py
src/transformers/models/clip/convert_clip_original_pytorch_to_hf.py
src/transformers/models/clip/modeling_clip.py
src/transformers/models/clip/modeling_flax_clip.py
src/transformers/models/clip/modeling_tf_clip.py
src/transformers/models/clipseg/configuration_clipseg.py
src/transformers/models/clipseg/convert_clipseg_original_pytorch_to_hf.py
src/transformers/models/codegen/modeling_codegen.py
src/transformers/models/conditional_detr/convert_conditional_detr_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/convbert/convert_convbert_original_tf1_checkpoint_to_pytorch_and_tf2.py
src/transformers/models/convbert/modeling_convbert.py
src/transformers/models/convbert/modeling_tf_convbert.py
src/transformers/models/convnext/convert_convnext_to_pytorch.py
src/transformers/models/convnext/modeling_tf_convnext.py
src/transformers/models/convnextv2/configuration_convnextv2.py
src/transformers/models/convnextv2/convert_convnextv2_to_pytorch.py
src/transformers/models/convnextv2/modeling_convnextv2.py
src/transformers/models/cpmant/configuration_cpmant.py
src/transformers/models/cpmant/modeling_cpmant.py
src/transformers/models/cpmant/tokenization_cpmant.py
src/transformers/models/ctrl/modeling_tf_ctrl.py
src/transformers/models/cvt/convert_cvt_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/cvt/modeling_tf_cvt.py
src/transformers/models/data2vec/convert_data2vec_audio_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/data2vec/convert_data2vec_text_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/data2vec/convert_data2vec_vision_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/data2vec/modeling_data2vec_text.py
src/transformers/models/data2vec/modeling_tf_data2vec_vision.py
src/transformers/models/deberta/modeling_tf_deberta.py
src/transformers/models/deberta_v2/modeling_tf_deberta_v2.py
src/transformers/models/decision_transformer/modeling_decision_transformer.py
src/transformers/models/deformable_detr/convert_deformable_detr_to_pytorch.py
src/transformers/models/deformable_detr/load_custom.py
src/transformers/models/deit/convert_deit_timm_to_pytorch.py
src/transformers/models/deprecated/bort/convert_bort_original_gluonnlp_checkpoint_to_pytorch.py
src/transformers/models/deprecated/mctct/configuration_mctct.py
src/transformers/models/deprecated/mctct/feature_extraction_mctct.py
src/transformers/models/deprecated/mctct/modeling_mctct.py
src/transformers/models/deprecated/mctct/processing_mctct.py
src/transformers/models/deprecated/mmbt/configuration_mmbt.py
src/transformers/models/deprecated/mmbt/modeling_mmbt.py
src/transformers/models/deprecated/open_llama/configuration_open_llama.py
src/transformers/models/deprecated/open_llama/modeling_open_llama.py
src/transformers/models/deprecated/retribert/configuration_retribert.py
src/transformers/models/deprecated/retribert/modeling_retribert.py
src/transformers/models/deprecated/retribert/tokenization_retribert.py
src/transformers/models/deprecated/retribert/tokenization_retribert_fast.py
src/transformers/models/deprecated/tapex/tokenization_tapex.py
src/transformers/models/deprecated/trajectory_transformer/configuration_trajectory_transformer.py
src/transformers/models/deprecated/trajectory_transformer/convert_trajectory_transformer_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/deprecated/trajectory_transformer/modeling_trajectory_transformer.py
src/transformers/models/deprecated/transfo_xl/convert_transfo_xl_original_tf_checkpoint_to_pytorch.py
src/transformers/models/deprecated/transfo_xl/modeling_tf_transfo_xl.py
src/transformers/models/deprecated/transfo_xl/modeling_tf_transfo_xl_utilities.py
src/transformers/models/deprecated/transfo_xl/modeling_transfo_xl.py
src/transformers/models/deprecated/transfo_xl/modeling_transfo_xl_utilities.py
src/transformers/models/deprecated/van/configuration_van.py
src/transformers/models/deprecated/van/convert_van_to_pytorch.py
src/transformers/models/deprecated/van/modeling_van.py
src/transformers/models/detr/convert_detr_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/detr/convert_detr_to_pytorch.py
src/transformers/models/dialogpt/convert_dialogpt_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/dinov2/configuration_dinov2.py
src/transformers/models/dinov2/convert_dinov2_to_hf.py
src/transformers/models/dinov2/modeling_dinov2.py
src/transformers/models/distilbert/modeling_distilbert.py
src/transformers/models/distilbert/modeling_flax_distilbert.py
src/transformers/models/distilbert/modeling_tf_distilbert.py
src/transformers/models/dit/convert_dit_unilm_to_pytorch.py
src/transformers/models/donut/configuration_donut_swin.py
src/transformers/models/donut/convert_donut_to_pytorch.py
src/transformers/models/donut/modeling_donut_swin.py
src/transformers/models/dpr/convert_dpr_original_checkpoint_to_pytorch.py
src/transformers/models/dpr/modeling_dpr.py
src/transformers/models/dpr/modeling_tf_dpr.py
src/transformers/models/dpt/configuration_dpt.py
src/transformers/models/dpt/convert_dpt_hybrid_to_pytorch.py
src/transformers/models/dpt/convert_dpt_to_pytorch.py
src/transformers/models/efficientnet/configuration_efficientnet.py
src/transformers/models/efficientnet/convert_efficientnet_to_pytorch.py
src/transformers/models/efficientnet/modeling_efficientnet.py
src/transformers/models/electra/convert_electra_original_tf_checkpoint_to_pytorch.py
src/transformers/models/electra/modeling_flax_electra.py
src/transformers/models/encodec/configuration_encodec.py
src/transformers/models/encodec/convert_encodec_checkpoint_to_pytorch.py
src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
src/transformers/models/encoder_decoder/modeling_flax_encoder_decoder.py
src/transformers/models/encoder_decoder/modeling_tf_encoder_decoder.py
src/transformers/models/ernie/modeling_ernie.py
src/transformers/models/esm/configuration_esm.py
src/transformers/models/esm/convert_esm.py
src/transformers/models/esm/modeling_esm.py
src/transformers/models/esm/modeling_esmfold.py
src/transformers/models/esm/modeling_tf_esm.py
src/transformers/models/esm/openfold_utils/chunk_utils.py
src/transformers/models/esm/openfold_utils/data_transforms.py
src/transformers/models/esm/openfold_utils/feats.py
src/transformers/models/esm/openfold_utils/loss.py
src/transformers/models/esm/openfold_utils/protein.py
src/transformers/models/esm/openfold_utils/residue_constants.py
src/transformers/models/esm/openfold_utils/rigid_utils.py
src/transformers/models/esm/openfold_utils/tensor_utils.py
src/transformers/models/falcon/configuration_falcon.py
src/transformers/models/falcon/modeling_falcon.py
src/transformers/models/flaubert/configuration_flaubert.py
src/transformers/models/flaubert/modeling_flaubert.py
src/transformers/models/flaubert/modeling_tf_flaubert.py
src/transformers/models/flava/convert_dalle_to_flava_codebook.py
src/transformers/models/flava/convert_flava_original_pytorch_to_hf.py
src/transformers/models/flava/modeling_flava.py
src/transformers/models/fnet/convert_fnet_original_flax_checkpoint_to_pytorch.py
src/transformers/models/fnet/modeling_fnet.py
src/transformers/models/focalnet/configuration_focalnet.py
src/transformers/models/focalnet/convert_focalnet_to_hf_format.py
src/transformers/models/focalnet/modeling_focalnet.py
src/transformers/models/fsmt/convert_fsmt_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/fsmt/modeling_fsmt.py
src/transformers/models/funnel/configuration_funnel.py
src/transformers/models/funnel/convert_funnel_original_tf_checkpoint_to_pytorch.py
src/transformers/models/funnel/modeling_funnel.py
src/transformers/models/funnel/modeling_tf_funnel.py
src/transformers/models/fuyu/convert_fuyu_model_weights_to_hf.py
src/transformers/models/gemma/configuration_gemma.py
src/transformers/models/gemma/convert_gemma_weights_to_hf.py
src/transformers/models/gemma/modeling_flax_gemma.py
src/transformers/models/gemma/modeling_gemma.py
src/transformers/models/git/configuration_git.py
src/transformers/models/git/convert_git_to_pytorch.py
src/transformers/models/glpn/configuration_glpn.py
src/transformers/models/glpn/convert_glpn_to_pytorch.py
src/transformers/models/gpt2/CONVERSION.md
src/transformers/models/gpt2/convert_gpt2_original_tf_checkpoint_to_pytorch.py
src/transformers/models/gpt2/modeling_flax_gpt2.py
src/transformers/models/gpt2/modeling_tf_gpt2.py
src/transformers/models/gpt_bigcode/configuration_gpt_bigcode.py
src/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py
src/transformers/models/gpt_neo/convert_gpt_neo_mesh_tf_to_pytorch.py
src/transformers/models/gpt_neo/modeling_flax_gpt_neo.py
src/transformers/models/gpt_neo/modeling_gpt_neo.py
src/transformers/models/gpt_neox/modeling_gpt_neox.py
src/transformers/models/gpt_neox_japanese/modeling_gpt_neox_japanese.py
src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
src/transformers/models/gptj/configuration_gptj.py
src/transformers/models/gptj/modeling_flax_gptj.py
src/transformers/models/gptj/modeling_tf_gptj.py
src/transformers/models/groupvit/configuration_groupvit.py
src/transformers/models/groupvit/convert_groupvit_nvlab_to_hf.py
src/transformers/models/hubert/configuration_hubert.py
src/transformers/models/hubert/convert_distilhubert_original_s3prl_checkpoint_to_pytorch.py
src/transformers/models/hubert/convert_hubert_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/hubert/convert_hubert_original_s3prl_checkpoint_to_pytorch.py
src/transformers/models/hubert/modeling_tf_hubert.py
src/transformers/models/ibert/configuration_ibert.py
src/transformers/models/ibert/modeling_ibert.py
src/transformers/models/ibert/quant_modules.py
src/transformers/models/idefics/configuration_idefics.py
src/transformers/models/idefics/image_processing_idefics.py
src/transformers/models/idefics/modeling_idefics.py
src/transformers/models/idefics/perceiver.py
src/transformers/models/idefics/processing_idefics.py
src/transformers/models/idefics/vision.py
src/transformers/models/imagegpt/convert_imagegpt_original_tf2_to_pytorch.py
src/transformers/models/informer/configuration_informer.py
src/transformers/models/informer/modeling_informer.py
src/transformers/models/instructblip/configuration_instructblip.py
src/transformers/models/instructblip/convert_instructblip_original_to_pytorch.py
src/transformers/models/instructblip/modeling_instructblip.py
src/transformers/models/instructblip/processing_instructblip.py
src/transformers/models/jamba/configuration_jamba.py
src/transformers/models/jamba/modeling_jamba.py
src/transformers/models/kosmos2/convert_kosmos2_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/led/configuration_led.py
src/transformers/models/led/modeling_led.py
src/transformers/models/led/modeling_tf_led.py
src/transformers/models/levit/convert_levit_timm_to_pytorch.py
src/transformers/models/levit/modeling_levit.py
src/transformers/models/lilt/configuration_lilt.py
src/transformers/models/llama/configuration_llama.py
src/transformers/models/llama/convert_llama_weights_to_hf.py
src/transformers/models/llama/modeling_llama.py
src/transformers/models/llava/configuration_llava.py
src/transformers/models/llava/modeling_llava.py
src/transformers/models/llava_next/configuration_llava_next.py
src/transformers/models/llava_next/modeling_llava_next.py
src/transformers/models/longformer/configuration_longformer.py
src/transformers/models/longformer/convert_longformer_original_pytorch_lightning_to_pytorch.py
src/transformers/models/longt5/configuration_longt5.py
src/transformers/models/longt5/convert_longt5x_checkpoint_to_flax.py
src/transformers/models/longt5/modeling_flax_longt5.py
src/transformers/models/luke/configuration_luke.py
src/transformers/models/luke/convert_luke_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/luke/modeling_luke.py
src/transformers/models/lxmert/configuration_lxmert.py
src/transformers/models/lxmert/convert_lxmert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/lxmert/modeling_lxmert.py
src/transformers/models/lxmert/modeling_tf_lxmert.py
src/transformers/models/m2m_100/convert_m2m100_original_checkpoint_to_pytorch.py
src/transformers/models/m2m_100/modeling_m2m_100.py
src/transformers/models/marian/configuration_marian.py
src/transformers/models/marian/convert_marian_tatoeba_to_pytorch.py
src/transformers/models/marian/convert_marian_to_pytorch.py
src/transformers/models/marian/modeling_flax_marian.py
src/transformers/models/marian/modeling_tf_marian.py
src/transformers/models/markuplm/configuration_markuplm.py
src/transformers/models/markuplm/feature_extraction_markuplm.py
src/transformers/models/mask2former/convert_mask2former_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/maskformer/configuration_maskformer_swin.py
src/transformers/models/maskformer/convert_maskformer_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/maskformer/convert_maskformer_resnet_to_pytorch.py
src/transformers/models/maskformer/convert_maskformer_swin_to_pytorch.py
src/transformers/models/maskformer/modeling_maskformer_swin.py
src/transformers/models/mbart/convert_mbart_original_checkpoint_to_pytorch.py
src/transformers/models/mbart/modeling_flax_mbart.py
src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
src/transformers/models/megatron_bert/modeling_megatron_bert.py
src/transformers/models/megatron_gpt2/checkpoint_reshaping_and_interoperability.py
src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
src/transformers/models/mgp_str/configuration_mgp_str.py
src/transformers/models/mgp_str/modeling_mgp_str.py
src/transformers/models/mistral/configuration_mistral.py
src/transformers/models/mistral/modeling_mistral.py
src/transformers/models/mixtral/configuration_mixtral.py
src/transformers/models/mixtral/modeling_mixtral.py
src/transformers/models/mluke/convert_mluke_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/mobilebert/convert_mobilebert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/mobilenet_v1/configuration_mobilenet_v1.py
src/transformers/models/mobilenet_v1/convert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/mobilenet_v2/configuration_mobilenet_v2.py
src/transformers/models/mobilenet_v2/convert_original_tf_checkpoint_to_pytorch.py
src/transformers/models/mobilevit/configuration_mobilevit.py
src/transformers/models/mobilevit/convert_mlcvnets_to_pytorch.py
src/transformers/models/mobilevitv2/convert_mlcvnets_to_pytorch.py
src/transformers/models/mpnet/configuration_mpnet.py
src/transformers/models/mpnet/modeling_mpnet.py
src/transformers/models/mpnet/modeling_tf_mpnet.py
src/transformers/models/mpt/configuration_mpt.py
src/transformers/models/mpt/modeling_mpt.py
src/transformers/models/mra/configuration_mra.py
src/transformers/models/mra/convert_mra_pytorch_to_pytorch.py
src/transformers/models/mra/modeling_mra.py
src/transformers/models/mt5/configuration_mt5.py
src/transformers/models/mt5/modeling_flax_mt5.py
src/transformers/models/mt5/modeling_mt5.py
src/transformers/models/mt5/modeling_tf_mt5.py
src/transformers/models/musicgen/convert_musicgen_transformers.py
src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
src/transformers/models/mvp/modeling_mvp.py
src/transformers/models/nllb_moe/configuration_nllb_moe.py
src/transformers/models/nllb_moe/convert_nllb_moe_sharded_original_checkpoint_to_pytorch.py
src/transformers/models/nllb_moe/modeling_nllb_moe.py
src/transformers/models/nougat/convert_nougat_to_hf.py
src/transformers/models/nystromformer/configuration_nystromformer.py
src/transformers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/nystromformer/modeling_nystromformer.py
src/transformers/models/oneformer/convert_to_hf_oneformer.py
src/transformers/models/openai/convert_openai_original_tf_checkpoint_to_pytorch.py
src/transformers/models/openai/modeling_openai.py
src/transformers/models/openai/modeling_tf_openai.py
src/transformers/models/opt/convert_opt_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/opt/modeling_flax_opt.py
src/transformers/models/owlvit/configuration_owlvit.py
src/transformers/models/owlvit/convert_owlvit_original_flax_to_hf.py
src/transformers/models/pegasus/convert_pegasus_tf_to_pytorch.py
src/transformers/models/pegasus/modeling_flax_pegasus.py
src/transformers/models/pegasus/modeling_tf_pegasus.py
src/transformers/models/pegasus_x/modeling_pegasus_x.py
src/transformers/models/perceiver/configuration_perceiver.py
src/transformers/models/perceiver/convert_perceiver_haiku_to_pytorch.py
src/transformers/models/persimmon/convert_persimmon_weights_to_hf.py
src/transformers/models/persimmon/modeling_persimmon.py
src/transformers/models/pix2struct/configuration_pix2struct.py
src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py
src/transformers/models/pix2struct/image_processing_pix2struct.py
src/transformers/models/pix2struct/processing_pix2struct.py
src/transformers/models/plbart/convert_plbart_original_checkpoint_to_torch.py
src/transformers/models/poolformer/convert_poolformer_original_to_pytorch.py
src/transformers/models/pop2piano/convert_pop2piano_weights_to_hf.py
src/transformers/models/pop2piano/feature_extraction_pop2piano.py
src/transformers/models/pop2piano/processing_pop2piano.py
src/transformers/models/pop2piano/tokenization_pop2piano.py
src/transformers/models/prophetnet/configuration_prophetnet.py
src/transformers/models/prophetnet/convert_prophetnet_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/prophetnet/modeling_prophetnet.py
src/transformers/models/pvt/configuration_pvt.py
src/transformers/models/pvt/convert_pvt_to_pytorch.py
src/transformers/models/pvt/image_processing_pvt.py
src/transformers/models/pvt/modeling_pvt.py
src/transformers/models/qwen2/configuration_qwen2.py
src/transformers/models/qwen2/modeling_qwen2.py
src/transformers/models/qwen2/tokenization_qwen2.py
src/transformers/models/qwen2/tokenization_qwen2_fast.py
src/transformers/models/qwen2_moe/configuration_qwen2_moe.py
src/transformers/models/qwen2_moe/modeling_qwen2_moe.py
src/transformers/models/rag/configuration_rag.py
src/transformers/models/rag/modeling_rag.py
src/transformers/models/rag/modeling_tf_rag.py
src/transformers/models/rag/retrieval_rag.py
src/transformers/models/recurrent_gemma/modeling_recurrent_gemma.py
src/transformers/models/reformer/convert_reformer_trax_checkpoint_to_pytorch.py
src/transformers/models/regnet/configuration_regnet.py
src/transformers/models/regnet/convert_regnet_seer_10b_to_pytorch.py
src/transformers/models/regnet/convert_regnet_to_pytorch.py
src/transformers/models/regnet/modeling_flax_regnet.py
src/transformers/models/rembert/configuration_rembert.py
src/transformers/models/rembert/convert_rembert_tf_checkpoint_to_pytorch.py
src/transformers/models/rembert/modeling_rembert.py
src/transformers/models/rembert/modeling_tf_rembert.py
src/transformers/models/resnet/convert_resnet_to_pytorch.py
src/transformers/models/resnet/modeling_flax_resnet.py
src/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/roberta/modeling_flax_roberta.py
src/transformers/models/roberta_prelayernorm/convert_roberta_prelayernorm_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/roberta_prelayernorm/modeling_flax_roberta_prelayernorm.py
src/transformers/models/roc_bert/configuration_roc_bert.py
src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py
src/transformers/models/roformer/modeling_flax_roformer.py
src/transformers/models/roformer/modeling_roformer.py
src/transformers/models/roformer/modeling_tf_roformer.py
src/transformers/models/rwkv/configuration_rwkv.py
src/transformers/models/rwkv/convert_rwkv_checkpoint_to_hf.py
src/transformers/models/rwkv/modeling_rwkv.py
src/transformers/models/sam/configuration_sam.py
src/transformers/models/sam/convert_sam_to_hf.py
src/transformers/models/sam/image_processing_sam.py
src/transformers/models/sam/modeling_sam.py
src/transformers/models/sam/modeling_tf_sam.py
src/transformers/models/sam/processing_sam.py
src/transformers/models/seamless_m4t/convert_fairseq2_to_hf.py
src/transformers/models/seamless_m4t_v2/convert_fairseq2_to_hf.py
src/transformers/models/segformer/configuration_segformer.py
src/transformers/models/segformer/convert_segformer_original_to_pytorch.py
src/transformers/models/sew/convert_sew_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/sew_d/convert_sew_d_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/speech_encoder_decoder/configuration_speech_encoder_decoder.py
src/transformers/models/speech_encoder_decoder/convert_mbart_wav2vec2_seq2seq_original_to_pytorch.py
src/transformers/models/speech_encoder_decoder/convert_speech_to_text_wav2vec2_seq2seq_original_to_pytorch.py
src/transformers/models/speech_encoder_decoder/modeling_flax_speech_encoder_decoder.py
src/transformers/models/speech_to_text/convert_s2t_fairseq_to_tfms.py
src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py
src/transformers/models/speecht5/configuration_speecht5.py
src/transformers/models/speecht5/convert_hifigan.py
src/transformers/models/speecht5/convert_speecht5_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/speecht5/number_normalizer.py
src/transformers/models/splinter/configuration_splinter.py
src/transformers/models/splinter/modeling_splinter.py
src/transformers/models/squeezebert/modeling_squeezebert.py
src/transformers/models/stablelm/modeling_stablelm.py
src/transformers/models/starcoder2/modeling_starcoder2.py
src/transformers/models/swiftformer/configuration_swiftformer.py
src/transformers/models/swiftformer/convert_swiftformer_original_to_hf.py
src/transformers/models/swiftformer/modeling_swiftformer.py
src/transformers/models/swin/convert_swin_simmim_to_pytorch.py
src/transformers/models/swin/convert_swin_timm_to_pytorch.py
src/transformers/models/swin/modeling_tf_swin.py
src/transformers/models/swin2sr/configuration_swin2sr.py
src/transformers/models/swin2sr/convert_swin2sr_original_to_pytorch.py
src/transformers/models/swinv2/convert_swinv2_timm_to_pytorch.py
src/transformers/models/swinv2/modeling_swinv2.py
src/transformers/models/switch_transformers/configuration_switch_transformers.py
src/transformers/models/switch_transformers/convert_big_switch.py
src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py
src/transformers/models/switch_transformers/modeling_switch_transformers.py
src/transformers/models/t5/configuration_t5.py
src/transformers/models/t5/convert_t5_original_tf_checkpoint_to_pytorch.py
src/transformers/models/t5/convert_t5x_checkpoint_to_flax.py
src/transformers/models/t5/convert_t5x_checkpoint_to_pytorch.py
src/transformers/models/t5/modeling_flax_t5.py
src/transformers/models/t5/modeling_t5.py
src/transformers/models/t5/modeling_tf_t5.py
src/transformers/models/table_transformer/configuration_table_transformer.py
src/transformers/models/table_transformer/convert_table_transformer_to_hf.py
src/transformers/models/table_transformer/convert_table_transformer_to_hf_no_timm.py
src/transformers/models/tapas/configuration_tapas.py
src/transformers/models/tapas/convert_tapas_original_tf_checkpoint_to_pytorch.py
src/transformers/models/tapas/modeling_tapas.py
src/transformers/models/tapas/modeling_tf_tapas.py
src/transformers/models/timesformer/convert_timesformer_to_pytorch.py
src/transformers/models/timm_backbone/configuration_timm_backbone.py
src/transformers/models/timm_backbone/modeling_timm_backbone.py
src/transformers/models/trocr/convert_trocr_unilm_to_pytorch.py
src/transformers/models/umt5/configuration_umt5.py
src/transformers/models/umt5/convert_umt5_checkpoint_to_pytorch.py
src/transformers/models/umt5/modeling_umt5.py
src/transformers/models/unispeech/convert_unispeech_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/unispeech_sat/configuration_unispeech_sat.py
src/transformers/models/unispeech_sat/convert_unispeech_original_s3prl_checkpoint_to_pytorch.py
src/transformers/models/unispeech_sat/convert_unispeech_sat_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/upernet/configuration_upernet.py
src/transformers/models/upernet/convert_convnext_upernet_to_pytorch.py
src/transformers/models/upernet/convert_swin_upernet_to_pytorch.py
src/transformers/models/videomae/configuration_videomae.py
src/transformers/models/videomae/convert_videomae_to_pytorch.py
src/transformers/models/vilt/configuration_vilt.py
src/transformers/models/vilt/convert_vilt_original_to_pytorch.py
src/transformers/models/vipllava/configuration_vipllava.py
src/transformers/models/vipllava/modeling_vipllava.py
src/transformers/models/vision_encoder_decoder/modeling_flax_vision_encoder_decoder.py
src/transformers/models/vision_encoder_decoder/modeling_tf_vision_encoder_decoder.py
src/transformers/models/vision_text_dual_encoder/modeling_flax_vision_text_dual_encoder.py
src/transformers/models/vision_text_dual_encoder/modeling_vision_text_dual_encoder.py
src/transformers/models/visual_bert/convert_visual_bert_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/visual_bert/modeling_visual_bert.py
src/transformers/models/vit/convert_dino_to_pytorch.py
src/transformers/models/vit/convert_vit_timm_to_pytorch.py
src/transformers/models/vit/modeling_flax_vit.py
src/transformers/models/vit_mae/convert_vit_mae_to_pytorch.py
src/transformers/models/vit_mae/modeling_tf_vit_mae.py
src/transformers/models/vit_msn/configuration_vit_msn.py
src/transformers/models/vit_msn/convert_msn_to_pytorch.py
src/transformers/models/vivit/configuration_vivit.py
src/transformers/models/vivit/convert_vivit_flax_to_pytorch.py
src/transformers/models/vivit/image_processing_vivit.py
src/transformers/models/vivit/modeling_vivit.py
src/transformers/models/wav2vec2/convert_wav2vec2_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/wav2vec2/convert_wav2vec2_original_s3prl_checkpoint_to_pytorch.py
src/transformers/models/wav2vec2/modeling_flax_wav2vec2.py
src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
src/transformers/models/wav2vec2_bert/convert_wav2vec2_seamless_checkpoint.py
src/transformers/models/wav2vec2_conformer/convert_wav2vec2_conformer_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/wavlm/convert_wavlm_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/wavlm/convert_wavlm_original_s3prl_checkpoint_to_pytorch.py
src/transformers/models/whisper/convert_openai_to_hf.py
src/transformers/models/whisper/english_normalizer.py
src/transformers/models/whisper/modeling_flax_whisper.py
src/transformers/models/x_clip/configuration_x_clip.py
src/transformers/models/x_clip/convert_x_clip_original_pytorch_to_hf.py
src/transformers/models/xglm/configuration_xglm.py
src/transformers/models/xglm/convert_xglm_original_ckpt_to_trfms.py
src/transformers/models/xglm/modeling_flax_xglm.py
src/transformers/models/xglm/modeling_tf_xglm.py
src/transformers/models/xglm/modeling_xglm.py
src/transformers/models/xlm/convert_xlm_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/xlm/modeling_tf_xlm.py
src/transformers/models/xlm/modeling_xlm.py
src/transformers/models/xlm_roberta/modeling_flax_xlm_roberta.py
src/transformers/models/xlm_roberta/modeling_tf_xlm_roberta.py
src/transformers/models/xlm_roberta/modeling_xlm_roberta.py
src/transformers/models/xlm_roberta_xl/convert_xlm_roberta_xl_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/xlm_roberta_xl/modeling_xlm_roberta_xl.py
src/transformers/models/xlnet/convert_xlnet_original_tf_checkpoint_to_pytorch.py
src/transformers/models/xlnet/modeling_tf_xlnet.py
src/transformers/models/xlnet/modeling_xlnet.py
src/transformers/models/xmod/convert_xmod_original_pytorch_checkpoint_to_pytorch.py
src/transformers/models/yolos/convert_yolos_to_pytorch.py
src/transformers/models/yoso/convert_yoso_pytorch_to_pytorch.py
src/transformers/models/yoso/modeling_yoso.py
src/transformers/onnx/__main__.py
src/transformers/onnx/config.py
src/transformers/onnx/convert.py
src/transformers/onnx/features.py
src/transformers/onnx/utils.py
src/transformers/optimization.py
src/transformers/optimization_tf.py
src/transformers/pipelines/audio_classification.py
src/transformers/pipelines/audio_utils.py
src/transformers/pipelines/automatic_speech_recognition.py
src/transformers/pipelines/base.py
src/transformers/pipelines/conversational.py
src/transformers/pipelines/depth_estimation.py
src/transformers/pipelines/document_question_answering.py
src/transformers/pipelines/feature_extraction.py
src/transformers/pipelines/fill_mask.py
src/transformers/pipelines/image_classification.py
src/transformers/pipelines/image_segmentation.py
src/transformers/pipelines/image_to_text.py
src/transformers/pipelines/mask_generation.py
src/transformers/pipelines/object_detection.py
src/transformers/pipelines/pt_utils.py
src/transformers/pipelines/question_answering.py
src/transformers/pipelines/table_question_answering.py
src/transformers/pipelines/text_classification.py
src/transformers/pipelines/token_classification.py
src/transformers/pipelines/video_classification.py
src/transformers/pipelines/visual_question_answering.py
src/transformers/pipelines/zero_shot_audio_classification.py
src/transformers/pipelines/zero_shot_classification.py
src/transformers/pipelines/zero_shot_image_classification.py
src/transformers/pipelines/zero_shot_object_detection.py
src/transformers/processing_utils.py
src/transformers/pytorch_utils.py
src/transformers/quantizers/auto.py
src/transformers/quantizers/base.py
src/transformers/quantizers/quantizer_awq.py
src/transformers/quantizers/quantizer_bnb_4bit.py
src/transformers/quantizers/quantizer_bnb_8bit.py
src/transformers/quantizers/quantizer_gptq.py
src/transformers/quantizers/quantizers_utils.py
src/transformers/sagemaker/trainer_sm.py
src/transformers/sagemaker/training_args_sm.py
src/transformers/testing_utils.py
src/transformers/tf_utils.py
src/transformers/time_series_utils.py
src/transformers/tokenization_utils.py
src/transformers/tokenization_utils_base.py
src/transformers/tokenization_utils_fast.py
src/transformers/trainer.py
src/transformers/trainer_callback.py
src/transformers/trainer_pt_utils.py
src/transformers/trainer_seq2seq.py
src/transformers/trainer_utils.py
src/transformers/training_args.py
src/transformers/training_args_seq2seq.py
src/transformers/training_args_tf.py
src/transformers/utils/backbone_utils.py
src/transformers/utils/bitsandbytes.py
src/transformers/utils/constants.py
src/transformers/utils/doc.py
src/transformers/utils/dummy_detectron2_objects.py
src/transformers/utils/dummy_essentia_and_librosa_and_pretty_midi_and_scipy_and_torch_objects.py
src/transformers/utils/dummy_flax_objects.py
src/transformers/utils/dummy_keras_nlp_objects.py
src/transformers/utils/dummy_music_objects.py
src/transformers/utils/dummy_pt_objects.py
src/transformers/utils/dummy_sentencepiece_and_tokenizers_objects.py
src/transformers/utils/dummy_sentencepiece_objects.py
src/transformers/utils/dummy_speech_objects.py
src/transformers/utils/dummy_tensorflow_text_objects.py
src/transformers/utils/dummy_tf_objects.py
src/transformers/utils/dummy_tokenizers_objects.py
src/transformers/utils/dummy_vision_objects.py
src/transformers/utils/fx.py
src/transformers/utils/generic.py
src/transformers/utils/hp_naming.py
src/transformers/utils/hub.py
src/transformers/utils/import_utils.py
src/transformers/utils/logging.py
src/transformers/utils/model_parallel_utils.py
src/transformers/utils/notebook.py
src/transformers/utils/peft_utils.py
src/transformers/utils/quantization_config.py
src/transformers/utils/sentencepiece_model_pb2.py
src/transformers/utils/sentencepiece_model_pb2_new.py
src/transformers/utils/versions.py