-
Notifications
You must be signed in to change notification settings - Fork 182
Pipeline
refactor to use BasePipeline
#1081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
24947a1
to
fe60a16
Compare
fe60a16
to
efb9789
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clean! This looks good to me over all - for the server tests, would like to see if we can try setting more task specific kwargs (ie sequence_length
) to make sure that args are properly propagated and then for the engine specific kwargs (ie batch_size
, num_cores
) I wonder if a base pipeline alone (non engine) will choke since these are top level server args.
Similarly, might be a bit tricky overall but want to make sure we safely migrate any generic usage of pipeline attributes (ie if Pipeline.X
is called in our repo for an arbitrary pipeline, want to make sure we either add a check or make X
generic) - I believe most of this should be confined to the server though
1f3efe9
to
03de3a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, note the bucketing tests failing in GHA.
968c0c4
to
0faa374
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good pending failing tests
For 1: Updated to test the server with more pipeline-specific arguments, and verified the results were correct/pipelines were initialized as expected
For 2: |
@dsikka for the Pipeline class directly this seems about right. Surprised the server doesn't access properties of instantiated pipelines directly such a input schema. I do think we should be good overall though |
approved - test failures look unrelated. waiting to land after final major 1.6 features land |
e34249e
02a1f2c
to
e34249e
Compare
Rebased off of main since last approvals/PR review. This rebase includes the engine initialization changing, as well as joining/splitting engine inputs. Retested pipeline tests and server examples (see PR description) and everything works as expected. |
Summary
pipeline.py
by introducing a newBasePipeline
. A high-level summary of the refactor can be seen in the diagram below. The testing summary is described below. Please let me know if anything additional or more aggressive should be testedTests
test_base_pipeline
added totest_pipeline.py
pipelines
: (test_pipeline.py
,test_bucketing.py
,test_custom_pipeline.py
,test_computer_vision_pipelines.py
,test_transformers.py
)Local Testing using the following code:
New BasePipeline
Existing Pipelines
Server Testing:
Sample Config Used: