[Text Generation] Support for causal masks, internal KV cache, and initial testing framework #1172

Merged: bfineran merged 16 commits into main from feature/damian/backward_comp_no_causal_mask_models on Aug 10, 2023

@dbogunowicz dbogunowicz commented Aug 8, 2023

  1. Ensures that models that have not been re-exported do not raise errors.
  2. Adds verbose messages that tell the user whether the multitoken engine is enabled or disabled for prompt processing.
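
A minimal sketch of the behavior described in the two points above, not the actual pipeline code. The names `PromptProcessingPlan`, `plan_prompt_processing`, and the `supports_causal_mask` flag are hypothetical, and the token-by-token fallback is an assumption about what happens when the multitoken engine is disabled:

```python
import logging
from dataclasses import dataclass

_LOGGER = logging.getLogger("text_generation_sketch")


@dataclass(frozen=True)
class PromptProcessingPlan:
    """Hypothetical record of how the prompt will be processed."""

    use_multitoken_engine: bool
    message: str


def plan_prompt_processing(supports_causal_mask: bool) -> PromptProcessingPlan:
    """Decide whether the multitoken engine can be used for the prompt.

    `supports_causal_mask` is an assumed flag; how the real pipeline detects
    causal mask support in an exported model is not shown here.
    """
    if supports_causal_mask:
        plan = PromptProcessingPlan(
            use_multitoken_engine=True,
            message="Multitoken engine enabled for prompt processing.",
        )
        _LOGGER.info(plan.message)
    else:
        # Models that were not re-exported lack causal mask support:
        # warn and fall back instead of raising an error (assumed fallback).
        plan = PromptProcessingPlan(
            use_multitoken_engine=False,
            message=(
                "Model has no causal mask support; multitoken engine disabled, "
                "prompt will be processed one token at a time."
            ),
        )
        _LOGGER.warning(plan.message)
    return plan
```

Returning the decision together with its message keeps the check easy to cover in unit tests, for example by asserting on the returned plan or by capturing the log output.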

INCLUDED PRs:

Includes PR #1163, which adds testing and internal KV cache support, made up of:

@dbogunowicz dbogunowicz changed the title from "initial commit" to "[Text Generation] Enable running models without causal mask support in the pipeline." Aug 8, 2023
@dbogunowicz dbogunowicz requested a review from bfineran August 9, 2023 08:33
bfineran previously approved these changes Aug 9, 2023
Satrat previously approved these changes Aug 10, 2023
@dbogunowicz dbogunowicz dismissed stale reviews from Satrat and bfineran via 14a5197 August 10, 2023 03:31
@dbogunowicz dbogunowicz requested review from Satrat and bfineran August 10, 2023 03:40
Satrat previously approved these changes Aug 10, 2023
bfineran previously approved these changes Aug 10, 2023
…rk (#1163)

* Create test_nl_decoder_engine.py

* [Text Generation][Tests] DecoderKVCache (#1154)

* [Text Generation][Tests] NLDecoderEngine (#1155)

* initial commit

* initial commit

* [Text Generation][Tests] Text Generation Pipeline (#1162)

* initial implementation

* problems with multitoken prefill

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge

* Make tests work with stub (as much as possible), cleanup test names,  disable heavy tests, include patch for running without causal mask

* use patch from unittest library - remove additional dependency

* Update tests/deepsparse/transformers/pipelines/test_text_generation.py

* clarify todo comment

* [Text Generation]  KV Cache internal Deepsparse support (#1135)

* fix kv cache

* refactor

* add validation pathway

* avx2 support

* initial commit

* initial commit

* initial implementation

* problems with multitoken prefill

* its working

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge

* added some tests

* ready for review

* full support

---------

Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>

* incomplete string in parametrize

* few nits before the merge

---------

Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Sage Moore <[email protected]>
@bfineran bfineran dismissed stale reviews from Satrat and themself via 43e70a5 August 10, 2023 16:53
@bfineran bfineran left a comment

Re-approving - includes changes from #1163

@bfineran bfineran changed the title from "[Text Generation] Enable running models without causal mask support in the pipeline." to "[Text Generation] Support for causal masks, internal KV cache, and initial testing framework" Aug 10, 2023
@bfineran bfineran merged commit 0fc35f0 into main Aug 10, 2023
@bfineran bfineran deleted the feature/damian/backward_comp_no_causal_mask_models branch August 10, 2023 16:56
3 participants