Skip to content

Support for Distinct Query #4195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

Conversation

dplewis
Copy link
Member

@dplewis dplewis commented Sep 21, 2017

Ref: #2238, #2877

I added support for distinct query. Also works with nested queries.

Let me know if more tests should be added.

@dplewis dplewis changed the title Distinct Support for Distinct Query Sep 21, 2017
@codecov
Copy link

codecov bot commented Sep 21, 2017

Codecov Report

Merging #4195 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4195      +/-   ##
==========================================
+ Coverage   92.22%   92.24%   +0.01%     
==========================================
  Files         116      116              
  Lines        8089     8120      +31     
==========================================
+ Hits         7460     7490      +30     
- Misses        629      630       +1
Impacted Files Coverage Δ
src/RestQuery.js 95.21% <ø> (ø) ⬆️
src/Adapters/Storage/Mongo/MongoCollection.js 97.56% <100%> (+0.06%) ⬆️
src/Controllers/DatabaseController.js 94.72% <100%> (+0.04%) ⬆️
src/Routers/ClassesRouter.js 93.5% <100%> (+0.17%) ⬆️
...dapters/Storage/Postgres/PostgresStorageAdapter.js 96.58% <100%> (+0.08%) ⬆️
src/Adapters/Storage/Mongo/MongoStorageAdapter.js 93.05% <100%> (+0.09%) ⬆️
src/RestWrite.js 93.34% <0%> (-0.19%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 70ca7bd...748477a. Read the comment docs.

@dplewis
Copy link
Member Author

dplewis commented Sep 21, 2017

@flovilmart

I realized that arrays works differently between mongo and postgres.

const size1 = new TestObject({size: ['S', 'M']});
const size2 = new TestObject({size: ['M', 'L']});
const size3 = new TestObject({size: ['S']});
const size4 = new TestObject({size: ['S']});

Mongo returns

[ 'L', 'M', 'S' ]

PG returns

[ [ 'S' ], [ 'M', 'L' ], [ 'S', 'M' ] ]

I can make PG match the results from mongo by checking if size is type Array.

I see a problem with nested arrays on PG since I can't check the type for array.

Thoughs?

@flovilmart
Copy link
Contributor

We should have the same results on both DB’s can you point to the example test?

@dplewis
Copy link
Member Author

dplewis commented Sep 21, 2017

If they should be the same I'll commit a solution shortly

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

@flovilmart I think this is OK for now

@dplewis dplewis closed this Sep 22, 2017
@dplewis dplewis reopened this Sep 22, 2017
@flovilmart
Copy link
Contributor

@dplewis, if we’re going with an aggregate api, do we want the distinct early on?

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

Some users want it and it's easier to use than aggregate. What do you think?

@flovilmart
Copy link
Contributor

I think we need to be careful adding new API's more over, the classes/Object is supposedly returning a list of Parse Objects, not just an array of values (which would completely break all clients)

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

I haven't tried using distinct/aggregate on clients yet so I don't know the end results. I don't want to break clients but are their any alternatives for these features?

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

We also can make these REST exclusive

@flovilmart
Copy link
Contributor

Yeah but we need consistency.

All /classes/ClassName return an array of results which are Objects of type ClassName, no matter what you pass, (adding count add an additional key; count: N)

Addind distinct as a query param, on the same endpoint would break that stability.

However, on a new endpoint, like /aggregate/ClassName, we can guarantee that

  • it will never break client SDK's
  • it ill not break the consistency of the endpoints
  • it's backwards/forward compatible
  • we can define a new return type for those

what do you think? The code itself look great, I'm just pondering as it's a major shift from the current API stability, having a query param that triggers a completely different return type.

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

You bring up a lot of good points. Let go that route (no pun intended)

@flovilmart
Copy link
Contributor

@dplewis even if the REST interface is a bit more verbose like

GET /aggregate/ClassName?distinct=key1,key2&where=...

it's not that bad

we can interpolate types, (key1, key2), structures etc...

@dplewis
Copy link
Member Author

dplewis commented Sep 22, 2017

I understand now thanks for clarifying endpoints.

@flovilmart
Copy link
Contributor

@dplewis what do you think if I close that one for now, and we'll get it merged as part of a 1st iteration for aggregations?

@flovilmart flovilmart closed this Sep 23, 2017
@dplewis
Copy link
Member Author

dplewis commented Sep 23, 2017

@flovilmart Sounds good to me. Quick question should we use for mongo db.Collection.distinct() or transform distinct into $group aggregate?

@flovilmart
Copy link
Contributor

Uhm, are they both the same? @TylerBrock?

@dplewis
Copy link
Member Author

dplewis commented Sep 23, 2017

GET /aggregate/className?distinct=key1,key2&where=..

db.collection.distinct(field, query, options)

.distinct only supports 1 field but allows for queries.

If we want to support key1, key2 we have to interpolate that into group with no query option or is there a work around?

Theres also the issue of speed map-reduce (distinct) vs aggregate (group). I haven't done much research on these options.

Thoughts?

@flovilmart
Copy link
Contributor

uhm, not much thoughts on that one! Hopefully @TylerBrock can probably point out something or someone from @mongodb like @christkv could shed some light!

@dplewis
Copy link
Member Author

dplewis commented Sep 23, 2017

Another option is to get rid of the distinct=key1,key2 and only use aggregate where.... The user can be responsible for building their own queries

@flovilmart
Copy link
Contributor

perhaps we can branch on the use case no? use .distinct() if certains params are passed otherwise use aggregation

@dplewis
Copy link
Member Author

dplewis commented Sep 23, 2017

So the use cases would be

Distinct one key
use .distinct() and allows where for queries like find()

Distinct multiple keys
Use aggregates and ignore where as it will be overridden

Aggregate
Checks for special keys $match, $group, $project in where etc...

How does that sound?

@flovilmart
Copy link
Contributor

Distinct multiple keys
Use aggregates and ignore where as it will be overridden

What do you mean it will be overriden?

Aggregate
Checks for special keys $match, $group, $project in where etc...

Actually they won't be in the where, but:

GET /aggregate/MyClass?match=...&group=...&where=...

And in that case match and where are equivalent no? As they filter the data set.

Then the order of aggregation matters a lot, I'm not sure we can guarantee it through query params.

@dplewis
Copy link
Member Author

dplewis commented Sep 23, 2017

GET /aggregate/MyClass?match=...&group=...&where=...

Makes sense.

Then the order of aggregation matters a lot

Does it?

@flovilmart
Copy link
Contributor

I believe itMa à pipeline, nothing prevents you to run match after group etc..:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants