[Text Generation] Turn off the (currently) inefficient external KV cache logic when internal KV cache management enabled (#1175)
* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* it's working
* Create test_nl_decoder_engine.py
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* [Text Generation][Tests] DecoderKVCache (#1154)
* [Text Generation][Tests] NLDecoderEngine (#1155)
* initial commit
* initial commit
* [Text Generation][Tests] Text Generation Pipeline (#1162)
* initial implementation
* problems with multitoken prefill
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* Make tests work with stub (as much as possible), cleanup test names, disable heavy tests, include patch for running without causal mask
* initial commit
* use patch from unittest library - remove additional dependency
* improved logic
* additional improvements
* Update src/deepsparse/transformers/pipelines/text_generation.py
* Update src/deepsparse/utils/onnx.py
Co-authored-by: Benjamin Fineran <[email protected]>
* Update src/deepsparse/utils/onnx.py
Co-authored-by: Benjamin Fineran <[email protected]>
* response to Ben's comments
* finish rebasing
* full support
* Update tests/deepsparse/transformers/pipelines/test_text_generation.py
* initial commit
* clarify todo comment
* update user messages + add assertion for safety
* [Text Generation] KV Cache internal Deepsparse support (#1135)
* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* it's working
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* full support
---------
Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
* minor improvements before landing
* Fix the helper function that has been broken after a merge
* incomplete string in parametrize
* few nits before the merge
* pass dummy cache if internal cache management supported
* Apply suggestions from code review
* add missing property
* cleaner func
* PR ready
* add timing for KV cache update
* initial commit
* code review comments
* Nit: docstring typo
* nit: docstring style
* fix style
* fix broken test
* fixing bad rebase
---------
Co-authored-by: Sage Moore <[email protected]>
Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Benjamin <[email protected]>