Skip to content

Remove benchmark generate corpus command #1553

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

aspacca
Copy link
Contributor

@aspacca aspacca commented Nov 10, 2023

The benchmark generate-corpus command fetches artifacts from the corpus generator's repo.
We should instead rely on the artifacts inside the elastic-package itself.
This it's already taken care of by #1522, adding an option to persist the rally track and execute a dry run (no rally execution, just track creation)

@aspacca aspacca requested review from jsoriano and mrodm November 10, 2023 00:25
@aspacca aspacca self-assigned this Nov 10, 2023
@aspacca aspacca requested a review from ruflin November 10, 2023 00:26
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @aspacca

@ruflin
Copy link
Contributor

ruflin commented Nov 10, 2023

Let's hold off with merging this for a bit until all the corpus PR's are merged and @gizas also confirms nothing is missing for their benchmark efforts.

Few additional thoughts here on where I see us heading long term. When we discussed moving over the packages, we also had the conversation that we put some limitations in on the flexility of creating a corpus. As you know, I'm ok with this as bringing packages and corpus together ensures we test what the users get. If we want to test some "custom data" in the future, we should create a package for it.

The current approach is very tied to having the package as source available. I expect there is a future where you can select the package version that you want to run the benchmark on, it pulls it from the registry (or git?) and runs it. There are a few things in the way to get there like the benchmark assets are removed during packaging AFAIK. To get to the point: Any future such command, where will it be in the hiearchy? Is it under the new command hiearchy or does it better fit here.

@aspacca By removing this command, do we loose any functionality we don't have yet in the new command (except referencing external corpus definitions?)

@aspacca
Copy link
Contributor Author

aspacca commented Nov 13, 2023

By removing this command, do we loose any functionality we don't have yet in the new command (except referencing external corpus definitions?)

just streaming the corpus to stdout, and since we have schema-a assets in the generator repo, we loose them as well. nothing that cannot be added back as system test (and there's only 1 schema-a integrations: aws.vpcflow)

The current approach is very tied to having the package as source available. I expect there is a future where you can select the package version that you want to run the benchmark on, it pulls it from the registry (or git?) and runs it. There are a few things in the way to get there like the benchmark assets are removed during packaging AFAIK. To get to the point: Any future such command, where will it be in the hiearchy? Is it under the new command hiearchy or does it better fit here.

I will keep the same command as benchmark rally with some flags to fetch a specific version of the package from the registry instead installing the local package folder

@aspacca aspacca merged commit b2251c5 into elastic:main Nov 13, 2023
This was referenced Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants