Skip to content

Dbal Performance degrades when more than 100k rows #465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
aarangara opened this issue Jun 22, 2018 · 5 comments · Fixed by #498
Closed

Dbal Performance degrades when more than 100k rows #465

aarangara opened this issue Jun 22, 2018 · 5 comments · Fixed by #498
Labels

Comments

@aarangara
Copy link

I am using Dbal transport and observed a performance related issue when my enqueue MySQL table had more than 100K rows.

I am using supervisor which spawns 3 same processes (consumers) which reads the messages from the queue and processes them further. Each consumer reads the message from the queue as $this->psrContext->createConsumer($queue)->receive(1000).

And this in turn, executes the below query on the MySQL table which takes 3-4 seconds to execute:

SELECT * FROM enqueue WHERE (queue = 'myqueue') AND (priority IS NOT NULL) AND ((delayed_until IS NULL OR delayed_until <= 1529654707)) ORDER BY priority desc, published_at asc LIMIT 1 FOR UPDATE

On executing show full processlist on MySQL server, I can see the state as "Creating sort index" for above query.

Also I tried running EXPLAIN for the above query and it returned the output as

{
  "id": 1,
  "select_type": "SIMPLE",
  "table": "enqueue",
  "type": "ref",
  "possible_keys": "IDX_F69EE1957FFD7F63,IDX_F69EE19562A6DC27,IDX_F69EE1951A065DF8",
  "key": "IDX_F69EE1957FFD7F63",
  "key_len": "767",
  "ref": "const",
  "rows": 1,
  "Extra": "Using index condition; Using where; Using filesort"
}

Now on observing the "possible_keys", I figured out that the query optimizer is not taking into consideration the index for the column published_at. And this column is used in the clause ORDER BY priority desc, published_at asc

On executing the query manually and removing the order by published_at asc clause, the query completed execution in milliseconds rather than seconds, which leads me to believe that this is might be the root cause of this issue.

Has anyone experienced something like this with Dbal ?

@makasim
Copy link
Member

makasim commented Jun 22, 2018

My guess we need a composite key for priority and published at fields.

@makasim makasim added the bug label Jun 22, 2018
@ramunasd
Copy link
Contributor

Or one field for both values, easily indexable. Bigint would work best in this case:

$sortIndex = (int)$priority . time();

@aarangara
Copy link
Author

aarangara commented Jun 24, 2018

I added new key as below, it did improve the time from 3-4 seconds to 1-2 seconds, but that still is too high I guess considering that it only fetches one record.

ALTER TABLE enqueue ADD INDEX new_index(priority, published_at);

And having 100K records will still needs more time to process/clear the queue even when I have 3 php scripts processing/reading the queue

@aarangara
Copy link
Author

I tested again and surprisingly, my queue is getting processed much quicker than my previous attempt. So I guess adding the new key did help with the performance.

@makasim
Copy link
Member

makasim commented Aug 13, 2018

There is a PR that address the issue. Closing

@makasim makasim closed this as completed Aug 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants