transformers

Commit Graph

Author	SHA1	Message	Date
Younes Belkada	3170af71e1	[`Detr`] Fix detr BatchNorm replacement issue (#25230 ) * fix detr weird issue * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * fix copies --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-08-01 12:21:48 +02:00
Younes Belkada	05ebb0264e	[`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201 ) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well	2023-08-01 12:20:34 +02:00
Younes Belkada	972fdcc778	[`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216 ) * clearer explanation on how things works under the hood. * Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-01 10:56:52 +02:00
Younes Belkada	77c3973e8f	[`Pix2Struct`] Fix pix2struct cross attention (#25200 ) * fix pix2struct cross attention * fix torchscript slow test	2023-08-01 10:56:37 +02:00
Wang, Yi	4033ea7167	make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193 ) make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-08-01 01:35:49 -04:00
Yih-Dar	0fd8d2aa2c	Fix docker image build failure (#25214 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 20:13:15 +02:00
Yih-Dar	1b4f6199c6	Update tiny model info. and pipeline testing (#25213 ) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 19:35:33 +02:00
Younes Belkada	e0c50b274a	[`pipeline`] revisit device check for pipeline (#25207 ) * revisit device check for pipeline * let's raise an error.	2023-07-31 18:43:21 +02:00
Stas Bekman	5220606607	[quantization.md] fix (#25190 ) Update quantization.md	2023-07-31 09:37:29 -07:00
Yih-Dar	9ca3aa0156	Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 17:32:05 +02:00
Younes Belkada	59dcea3fe4	[`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206 ) wrap `cuda` and `to` method correctly	2023-07-31 17:25:09 +02:00
Yih-Dar	67b85f24de	Better error message in `_prepare_output_docstrings` (#25202 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 16:15:02 +02:00
Joao Gante	4a564490e1	Musicgen: CFG is manually added (#25173 )	2023-07-31 11:21:11 +01:00
amyeroberts	05cda5df34	🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174 ) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list	2023-07-28 19:52:51 +01:00
Sanchit Gandhi	03f98f9683	[MusicGen] Fix integration tests (#25169 ) * move to device * update with cuda values * fix fp16 * more rigorous	2023-07-28 18:50:15 +01:00
Yoni Gottesman	c90e14fb0f	Fix beam search to sample at least 1 non eos token (#25103 ) (#25115 )	2023-07-28 13:20:24 -04:00
Sohyun Sim	31f137c04f	🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881 ) * docs: ko: transformers_agents.md * docs: ko: transformers_agents.md * feat: deepl draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com> --------- Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com> Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>	2023-07-28 13:06:37 -04:00
Younes Belkada	dd9d45b6ec	[`InstructBlip`] Fix instructblip slow test (#25171 ) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py	2023-07-28 17:00:10 +02:00
Younes Belkada	add0895dd9	[`Mpt`] Fix mpt slow test (#25170 ) fix mpt slow test	2023-07-28 16:45:09 +02:00
Yih-Dar	d53b8ad780	Update `use_auth_token` -> `token` in example scripts (#25167 ) * pytorch examples * tensorflow examples * flax examples --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-28 15:33:45 +02:00
Alexander Markov	3cbc560d03	added compiled model support for inference (#25124 ) * added compiled model support for inference * linter * Fix tests * linter * linter * remove inference mode from pipelines * Linter --------- Co-authored-by: amarkov <alexander@inworld.ai>	2023-07-28 08:28:04 -04:00
Alan Ji	afa96fffdf	make run_generation more generic for other devices (#25133 ) * make run_generation more generic for other devices * use Accelerate to support any device type it supports. * make style * fix error usage of accelerator.prepare_model * use `PartialState` to make sure everything is running on the right device --------- Co-authored-by: statelesshz <jihuazhong1@huawei.com>	2023-07-28 08:20:10 -04:00
jiqing-feng	d23d2c27c2	Represent query_length in a different way to solve jit issue (#25164 ) Fix jit trace	2023-07-28 08:19:10 -04:00
YQ	2a78720104	override .cuda() to check if model is already quantized (#25166 )	2023-07-28 08:17:24 -04:00
Lucain	c1dba1111b	Add test when downloading from gated repo (#25039 )	2023-07-28 08:14:27 -04:00
Lucain	6232c380f2	Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120 ) * Fix .push_to_hub and cleanup get_full_repo_name usage * Do not rely on Python bool conversion magic * request changes	2023-07-28 11:40:08 +02:00
Sylvain Gugger	400e76ef11	Add new model in doc table of content (#25148 )	2023-07-27 13:41:50 -04:00
Sanchit Gandhi	e93103632b	Add bloom flax (#25094 ) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding albii test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove `layer_past` * refactor a bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-27 18:24:56 +01:00
Yih-Dar	0c790ddbd1	More `token` things (#25146 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-27 17:42:07 +02:00
Yoach Lacombe	0b92ae3489	Add offload support to Bark (#25037 ) * initial Bark offload proposal * use hooks instead of manually offloading * add test of bark offload to cpu feature * Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docstrings of offload Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unecessary set_seed in Bark tests --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-07-27 15:35:17 +01:00
Arthur	9cea3e7b80	[`MptConfig`] support from pretrained args (#25116 ) * support from pretrained args * draft addition of tests * update test * use parrent assert true * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-07-27 16:24:52 +02:00
Zach Mueller	a1c4954d25	🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109 ) * Change defaults * Sylvain's comments	2023-07-27 09:11:28 -04:00
Bram Vanroy	9a220ce30c	Clarify 4/8 bit loading log message (#25134 ) * clarify 4/8 bit loading log message * make style	2023-07-27 09:09:27 -04:00
Arthur	9429642e2d	[`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131 ) default legacy to None	2023-07-27 14:43:18 +02:00
Pbihao	de9e3b5945	fix delete all checkpoints when save_total_limit is set to 1 (#25136 )	2023-07-27 08:34:02 -04:00
Sourab Mangrulkar	a004237926	fix deepspeed load best model at end when the model gets sharded (#25057 )	2023-07-27 07:11:43 +05:30
amyeroberts	1689aea733	Move center_crop to BaseImageProcessor (#25122 )	2023-07-26 18:30:38 +01:00
amyeroberts	659829b6ae	MaskFormer - enable return_dict in order to compile (#25052 ) * Enable return_dict in order to compile * Update tests	2023-07-26 16:23:30 +01:00
Eric Bezzam	b914ec9847	Fix ViT docstring regarding default dropout values. (#25118 ) Fix docstring for dropout.	2023-07-26 11:08:57 -04:00
amyeroberts	1486d2aec2	Move common image processing methods to BaseImageProcessor (#25089 ) Move out common methods	2023-07-26 15:09:17 +01:00
Yih-Dar	d30cf3d02f	Fix past CI after #24334 (#25113 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 15:34:42 +02:00
Yih-Dar	224da5df69	update `use_auth_token` -> `token` (#25083 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 15:09:59 +02:00
Leo	c53c8e490c	fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772 ) fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor." Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>	2023-07-26 09:07:21 -04:00
David Reguera	04a5c859b0	Add descriptive docstring to TemperatureLogitsWarper (#24892 ) * Add descriptive docstring to TemperatureLogitsWarper It addresses https://github.com/huggingface/transformers/issues/24783 * Remove niche features Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Commit suggestion Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Refactor the examples to simpler ones * Add a missing comma Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Make args description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Remove extra text after making description more compact Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix linter --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-07-26 08:58:26 -04:00
Yih-Dar	31acba5697	Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 14:57:44 +02:00
Kihoon Son	ee63520a7b	🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828 ) * translated pipeline_webserver.md Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com> Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> * Update pipeline_webserver.md * Apply suggestions from code review Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com> Co-authored-by: Kim haewon <ehdvkf02@naver.com> --------- Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com> Co-authored-by: Kim haewon <ehdvkf02@naver.com>	2023-07-26 08:40:37 -04:00
Shauray Singh	277d3aed0a	documentation for llama2 models (#25102 ) * fix documentation * changes	2023-07-26 08:30:33 -04:00
Marc Sun	a5cc30d72a	fix tied_params for meta tensor (#25101 ) * fix tied_params for meta tensor * remove duplicate	2023-07-25 18:08:45 -04:00
dependabot[bot]	f1deb21fce	Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097 ) Bump certifi in /examples/research_projects/visual_bert Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22. - [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-25 17:25:14 -04:00
dependabot[bot]	45bde362d2	Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098 ) Bump certifi in /examples/research_projects/decision_transformer Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22. - [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-25 17:25:05 -04:00

13776 Commits
