Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Qdrant storage support #205

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

Anush008
Copy link
Contributor

@Anush008 Anush008 commented Mar 24, 2025

Descrption

This PR adds support to export data to a Qdrant - https://qdrant.tech/ collection.

To run Qdrant,

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

6333 - REST API, 6334 - gRPC API.
The Rust client calls the gRPC API.

You can find a dashboard at https://localhost:6333/dashboard.

Copy link
Member

@badmonster0 badmonster0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding Qdrant storage support!

@Anush008 Anush008 marked this pull request as ready for review March 27, 2025 07:16
@Anush008
Copy link
Contributor Author

Anush008 commented Mar 27, 2025

Payload conversion is simpler now.
Setup status check is WIP.

@Anush008
Copy link
Contributor Author

Hey @badmonster0.

The fields specified under primary_key_fields are only sent as key fields and the rest as value fields. Correct?
So in case of Qdrant, we try to parse the key fields as UUIDs or Integers to be used as point IDs?

@badmonster0
Copy link
Member

Hey @badmonster0.

The fields specified under primary_key_fields are only sent as key fields and the rest as value fields. Correct? So in case of Qdrant, we try to parse the key fields as UUIDs or Integers to be used as point IDs?

Correct. "key fields" and "value fields" are exclusive.

Copy link
Member

@badmonster0 badmonster0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a warm checkin to see if there's any questions during implementing. Feel free to ask if there's any :)

It looks like the executor logic is almost there. The setup status checker is usually harder. We're also considering making this part optional (e.g. users can manage the backend resources themselves outside cocoindex, as long as the schema they created matches certain expectations). If there're indeed difficulty in this part and you prefer, I can make this change.

Thanks!

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 4, 2025

Hey. I'm on a vacation.

I can make this change

Sure. Thank you 🙏

@badmonster0
Copy link
Member

Hey. I'm on a vacation.

I can make this change

Sure. Thank you 🙏

Added a setup_by_user option for this purpose: https://cocoindex.io/docs/core/flow_def#export

Then when implementing check_setup_status you can directly return an error (also just implemented Infallible for ResourceSetupStatusCheck which provides a concrete type that you can return)

Hope this makes the integration of Qdrant easier by separating into different steps :)

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 8, 2025

ResourceSetupStatusCheck is handled now.
With 4a0bdf5 and a4c6cae

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 8, 2025

Hey @badmonster0! Am I right in my understanding that primary_key_fields are passed as key_fields? If so, would it be sufficient to treat the first key field as the ID when using Qdrant?

67f1e71

@badmonster0
Copy link
Member

Hey @badmonster0! Am I right in my understanding that primary_key_fields are passed as key_fields?

Yes, that's correct.

If so, would it be sufficient to treat the first key field as the ID when using Qdrant?

Yes, this is the right approach in my mind.
primary_key_fields is allowed to have multiple elements as storages like Postgres support primary key on multiple fields.
Since qdrant only support a single key part, the size should always be 1. And we can raise an error (like api_bail!) when StorageFactoryBase::build is called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants