Excluding AdamWeightDecayOptimizer internal variables from restoring #16

donatasrep · 2018-11-13T15:13:18Z

I tried to use the convert_tf_checkpoint_to_pytorch.py script to convert my pretrained model, but to do so I had to make some minor tweaks. I thought I would share them in case you find them useful.
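For readers who want the gist of the tweak, here is a minimal sketch of the idea, not the exact diff. It assumes the optimizer's internal slots show up in the checkpoint as variables whose path ends in `adam_m` or `adam_v`; the helper names `is_optimizer_variable` and `model_variables` are made up for illustration and are not part of the script.

```python
# Sketch only: filter optimizer bookkeeping variables out of a list of
# TensorFlow checkpoint variable names before loading them into PyTorch.
# ASSUMPTION: AdamWeightDecayOptimizer stores its moment estimates under
# path components named "adam_m" and "adam_v".
OPTIMIZER_SLOTS = {"adam_m", "adam_v"}


def is_optimizer_variable(name: str) -> bool:
    """Return True if any '/'-separated path component is an optimizer slot."""
    return any(part in OPTIMIZER_SLOTS for part in name.split("/"))


def model_variables(names):
    """Keep only the names that belong to the model itself."""
    return [name for name in names if not is_optimizer_variable(name)]


# Illustrative names; a real checkpoint would supply its own list.
checkpoint_names = [
    "bert/embeddings/word_embeddings",
    "bert/embeddings/word_embeddings/adam_m",
    "bert/embeddings/word_embeddings/adam_v",
    "bert/encoder/layer_0/attention/self/query/kernel",
]

for name in model_variables(checkpoint_names):
    print("Would restore:", name)
```

Matching on whole path components rather than substrings avoids accidentally dropping a model variable whose name merely contains "adam".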

thomwolf · 2018-11-13T15:15:30Z

Is your pre-trained model a TensorFlow model?

donatasrep · 2018-11-13T15:16:47Z

Yes

thomwolf · 2018-11-13T15:19:35Z

Nice, thanks for that!

Excluding AdamWeightDecayOptimizer internal variables from restoring

20d07b3

thomwolf merged commit 5cd8d7a into huggingface:master Nov 13, 2018

qwang70 pushed a commit to DRL36/pytorch-pretrained-BERT that referenced this pull request Mar 2, 2019

Merge pull request huggingface#16 from donatasrep/master

39e0bab

Excluding AdamWeightDecayOptimizer internal variables from restoring

maeotaku mentioned this pull request May 23, 2019

bert->onnx ->caffe2 weird error #633

Closed

HongyanJiao mentioned this pull request Sep 19, 2019

traced_model #1291

Closed

manchandasahil mentioned this pull request Mar 22, 2021

Longformer training : CUDA error: device-side assert triggered #10852

Closed

2 tasks

amathews-amd referenced this pull request in ROCm/transformers Aug 6, 2021

Merge pull request #16 from microsoft/pr_for_running_roberta_with_ort…

d25a36f

…module hack to make roberta can run it ortmodule

rraminen pushed a commit to rraminen/transformers that referenced this pull request Oct 27, 2022

Merge pull request huggingface#16 from ROCmSoftwarePlatform/adabeyta_…

24b288f

…update_hf_training Removed hardcoded warmup steps.

jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this pull request Jun 1, 2023

Merge pull request huggingface#16 from huggingface/main

7a17fbe

wed

lwmlyy mentioned this pull request Aug 15, 2023

add util for ram efficient loading of model when using fsdp #25107

Merged

1 task

SangbumChoi added a commit to SangbumChoi/transformers that referenced this pull request Jan 13, 2025

Merge pull request huggingface#16 from Superb-AI-Suite/feat/triton_fo…

e4984d9

…rmat_processing make style & add postprocssing for instance segmentation compatible for triton

jmamou pushed a commit to jmamou/transformers that referenced this pull request Feb 27, 2025

Fix minor review comments (huggingface#16)

4e3660a

ArthurZucker pushed a commit that referenced this pull request Apr 5, 2025

Merge pull request #16 from huggingface/global-tile

6f63da6

[llama4/mm] Add back <|image|> tag in tokenization corresponding to global tile
