Search reindex task leaves empty index. #3746

Closed
padix-key opened this issue Apr 18, 2018 · 19 comments · Fixed by #3989
Labels
bug 🐛 search Elasticsearch, search filters, and so on

Comments

@padix-key

The 'Search projects' function does not work for me on https://pypi.org. Irrespective of the query, the search does not return any results. (Example: https://pypi.org/search/?q=numpy)

@UlionTse

Me, too.

@danielhrisca

+1

@klaaskooistra

+1

@olf42

olf42 commented Apr 18, 2018

Neither pip 10.0.0 nor pip 9.0.1 shows any results at the moment:

pip search pip -v   
Starting new HTTPS connection (1): pypi.python.org
https://pypi.python.org:443 "POST /pypi HTTP/1.1" 200 108

@RecNes

RecNes commented Apr 18, 2018

Same here.
Tested with pip 10, 8, and 7:

$ pip search django -v
Starting new HTTPS connection (1): pypi.python.org
https://pypi.python.org:443 "POST /pypi HTTP/1.1" 200 108

screenshot from 2018-04-18 11-57-03

@GadgetSteve

Neither the website search nor pip search is returning any results, but pip install and pip list -o (which has to query the versions) both seem to be working fine, so it looks like only the search interface is affected.

@piotr-gomola

Same here,
Critical functionality is broken!

@danielhrisca

I'm sure people are already working to solve this

@olf42

olf42 commented Apr 18, 2018

Appears to be solved in both the web UI and the CLI (which probably use the same endpoint) 👍

@ewdurbin
Member

We seem to have some kind of issue in the task that runs every 3 hours to update the index. It was aggravated by changes reverted in #3716, but the underlying issue seems to still be in play.

Something clearly seems to be going wrong in the "swap" step of this code: https://github.com/pypa/warehouse/blob/b463af8aac4c778fe5fd1d7abe6e52c00bd06a13/warehouse/search/tasks.py#L131-L167

@ewdurbin changed the title from "Search function not showing any results" to "Search reindex task leaves empty index." on Apr 18, 2018

@ewdurbin
Member

This seems to be related to running the indexing job as a Celery task. I'm unable to reproduce when running the reindex job from CLI, even kicking two of them off "in competition".

@ewdurbin
Member

ewdurbin commented Apr 19, 2018

Our Elasticsearch cluster has been upgraded from a very early release (5.0) to the latest available release in the 5.x series (5.6.9).

Beyond being a good idea in general, this was an optimistic move: perhaps we were hitting some bug that has since been resolved.

We also disabled automatic index creation, which may have caused the issues that led to #3716.

One further observation: in our handling of the index swap, we do not wait for a "green" status on the new index before swapping the alias and deleting the old index. Perhaps we should?
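
A minimal sketch of what waiting for "green" before the swap could look like, not the Warehouse implementation. It assumes `client` is an elasticsearch-py style client exposing `cluster.health(...)`; the helper name `wait_for_green` and the 30s timeout are hypothetical. How a timeout surfaces (a `timed_out` flag in the response or an exception) may depend on the client version, so treat the return value handling as an assumption.

```python
def wait_for_green(client, index_name, timeout="30s"):
    """Return True if `index_name` reports green health within `timeout`.

    `client` is assumed to behave like an elasticsearch-py client; the
    helper name and default timeout are illustrative only.
    """
    health = client.cluster.health(
        index=index_name,
        wait_for_status="green",  # block until green or until the timeout
        timeout=timeout,
    )
    # Only treat the index as ready if the wait did not time out.
    return health["status"] == "green" and not health["timed_out"]
```

The alias swap and the deletion of the old index would then run only when this returns True. Note that a green status only says the shards are allocated; it says nothing about how many documents the index holds.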

@ewdurbin
Member

This occurred again in prod on the last index task. A new index is being created now; I grabbed logs to investigate.

@ewdurbin
Member

State found:

health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   production-39b6225ea1 yWEfL5Q6RGu8IlnHvCTvsA   1   2          0            0       486b           162b

So the index job attempted to create the new index, but the result was empty. It nevertheless continued on to delete the previous index and take the alias.
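
Since the empty index above was reported as green, a health check alone would not have caught this. A hedged sketch of a document-count guard that could run before the alias swap follows. It assumes an elasticsearch-py style `client` with `indices.refresh(...)` and `count(...)`; `EmptyIndexError`, `ensure_populated`, and `min_docs` are hypothetical names, not part of Warehouse.

```python
class EmptyIndexError(RuntimeError):
    """Raised when a freshly built index holds fewer documents than expected."""


def ensure_populated(client, index_name, min_docs=1):
    """Return the document count of `index_name`, or raise if it is too low.

    `client` is assumed to behave like an elasticsearch-py client.
    """
    # Refresh first so recently indexed documents are visible to the count.
    client.indices.refresh(index=index_name)
    count = client.count(index=index_name)["count"]
    if count < min_docs:
        raise EmptyIndexError(
            f"{index_name} holds {count} documents, expected at least {min_docs}"
        )
    return count
```

Calling this before updating the alias means a failed build leaves the previous index and alias untouched instead of replacing them with an empty one.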

@ewdurbin
Member

logs:

[2018-04-19 15:00:00,792: INFO/ForkPoolWorker-5] PUT https://<redacted>.us-east-1.aws.found.io:<redacted>/production-39b6225ea1?wait_for_active_shards=1 [status:200 request:0.712s]
[2018-04-19 15:15:22,702: INFO/ForkPoolWorker-5] POST https://<redacted>.us-east-1.aws.found.io:<redacted>/production-39b6225ea1/_forcemerge [status:200 request:0.020s]
[2018-04-19 15:15:22,786: INFO/ForkPoolWorker-5] PUT https://<redacted>.us-east-1.aws.found.io:<redacted>/production-39b6225ea1/_settings [status:200 request:0.083s]
[2018-04-19 15:15:22,837: INFO/ForkPoolWorker-5] HEAD https://<redacted>.us-east-1.aws.found.io:<redacted>/_alias/production [status:200 request:0.051s]
[2018-04-19 15:15:22,854: INFO/ForkPoolWorker-5] GET https://<redacted>.us-east-1.aws.found.io:<redacted>/_alias/production [status:200 request:0.017s]
[2018-04-19 15:15:23,067: INFO/ForkPoolWorker-5] POST https://<redacted>.us-east-1.aws.found.io:<redacted>/_aliases [status:200 request:0.213s]
[2018-04-19 15:15:23,682: INFO/ForkPoolWorker-5] DELETE https://<redacted>.us-east-1.aws.found.io:<redacted>/production-c7d6538b0d [status:200 request:0.614s]

Excluded _bulk calls for clarity, but there were plenty of them:

grep <redacted> worker-recent | grep '2018-04-19 15:' | grep _bulk | wc
     270    2430   54810

comparing to two previous runs:

grep <redacted> worker-recent | grep '2018-04-19 12' | grep _bulk | wc
     274    2466   55641
grep <redacted> worker-recent | grep '2018-04-19 09' | grep _bulk | wc
     269    2421   54607
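
For comparing more than a few runs, the same count can be grouped by hour in one pass. This is a rough Python equivalent of the `grep ... | wc` pipeline above, assuming log lines start with a `[YYYY-MM-DD HH:MM:SS,mmm:` timestamp as in the excerpt. The function name is hypothetical, and unlike the original commands it does not filter on the (redacted) host.

```python
import re
from collections import Counter

# Captures "YYYY-MM-DD HH" from a leading "[YYYY-MM-DD HH:MM:SS,mmm:" stamp.
_HOUR = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}):")


def bulk_calls_per_hour(lines):
    """Count log lines mentioning `_bulk`, keyed by date and hour."""
    counts = Counter()
    for line in lines:
        if "_bulk" not in line:
            continue
        match = _HOUR.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (for example `open("worker-recent")`) yields one counter entry per hour, so runs with an unusually low number of `_bulk` calls stand out.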

@ewdurbin
Member

it seems #3774 may have helped... which leads me to believe some state was being cached by the celery worker...

@ewdurbin
Member

haven't had any issues since #3774 deployed... continuing to keep an eye on things. added metric for search result counts in #3772 to alert us when index is empty.

@ewdurbin
Member

we've been going steady for 3 days. closing.

@honzakral
Contributor

The problem is that Project doesn't have the new index associated with it when running in Celery: the Index.doc_type method only sets the index name on the DocType if no index is already set. That would explain why it runs fine from the CLI but not from Celery. The CLI starts with fresh objects that don't have an index associated with them, while Celery is a long-running process in which a previous reindex run may already have registered an index name with the Project doc type. Should be a simple fix, PR incoming.
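
The behaviour described above can be reproduced with a toy model. The classes below are simplified stand-ins written for illustration only; they are not the elasticsearch-dsl or Warehouse code, and `reindex` / `reindex_explicit` are hypothetical helpers.

```python
class DocType:
    """Toy document class; `index` is class-level, so it outlives a single run."""
    index = None


class Project(DocType):
    pass


class Index:
    def __init__(self, name):
        self.name = name

    def doc_type(self, doc_cls):
        # As described above: the name is only set when none is set yet.
        if doc_cls.index is None:
            doc_cls.index = self.name
        return doc_cls


def reindex(index_name, doc_cls):
    """Return the index that documents would actually be written to."""
    Index(index_name).doc_type(doc_cls)
    return doc_cls.index


def reindex_explicit(index_name, doc_cls):
    """Variant that names the target index explicitly on every write."""
    Index(index_name).doc_type(doc_cls)
    return index_name


# Two runs in the same long-lived process, as in a Celery worker.
first = reindex("production-aaa", Project)
second = reindex("production-bbb", Project)
print(first, second)  # production-aaa production-aaa
```

The second run creates `production-bbb` but still writes to `production-aaa`, so the new index stays empty and the alias swap then points at it. A fresh process (the CLI case) starts with `index = None` and does not hit this. Passing the target index explicitly, as in `reindex_explicit`, is one way to avoid depending on the cached class state.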

@honzakral mentioned this issue on May 15, 2018
di pushed a commit to honzakral/warehouse that referenced this issue on May 16, 2018
@di closed this as completed in #3989 on May 16, 2018
di pushed a commit that referenced this issue on May 16, 2018:
* Make sure data always goes to new index

Fixes #3746

* Make celery workers more short lived

* Fix tests