Issue querying #1505

zeitler · 2021-05-13T15:54:12Z

Hi
I've been googling and I'm having a lot of trouble working out how to query properly.

Given the following:

*********** models.py *******************
from django.db import models

class Account(models.Model):
    name = models.CharField(max_length=128, db_index=True)
    description = models.TextField(blank=True, null=True)
    email = models.EmailField(max_length=254, db_index=True)
    ....

********* analyzers.py *********************
from elasticsearch_dsl import analyzer, token_filter

def ngram_filter(min=2, max=15):
    return token_filter(
        f"ngram_{min}_{max}_filter",
        type="ngram",
        min_gram=min,
        max_gram=max
    )

def edge_gram_filter(min=2, max=15):
    return token_filter(
        f"edgegram_{min}_{max}_filter",
        type="edge_ngram",
        min_gram=min,
        max_gram=max
    )

def stop_words_filter():
    return [
        english_stop_words_filter,
        portuguese_stop_words_filter,
    ]

filters = stop_words_filter()
filters.append(ngram_filter(3, 3))
filters.append(edge_gram_filter(2, 15))
full_searchable_analyzer = analyzer(
    "full_searchable_analyzer",
    tokenizer="keyword",
    filter=filters
)
string_sort_analyzer = analyzer(
    'string_sort',
    type="keyword",
    filter=[
        "lowercase",
    ]
)

************** documents.py ********************
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry

@registry.register_document
class AccountDocument(Document):
    name = fields.TextField(
        attr="name",
        fields={
            'raw': fields.TextField(
                analyzer=full_searchable_analyzer,
                search_analyzer=string_sort_analyzer
            ),
            'suggest': fields.CompletionField(),
        }
    )
    description = fields.TextField(
        fields={
            'raw': fields.TextField(
                analyzer=string_sort_analyzer,
                search_analyzer=string_sort_analyzer
            ),
            'suggest': fields.CompletionField(),
        }
    )

    class Index:
        name = 'accounts'
        settings = {'number_of_shards': 1,
                    'number_of_replicas': 0}

    class Django:
        model = Account
        fields = [
            'email',
        ]

objects data:
1-> {name: 'teste', description: 'testify a common taste', email: '[email protected]'}
2-> {name: 'testJonhy', description: ' asdasdkasçkdkldas', email: '[email protected]'}
3-> {name: 'Mariah', description: 'desctestsherealso', email: '[email protected]'}

s.query(MultiMatch(query='test', fields=fields, fuzziness=10)).execute()
returns record 1

s.query("match_phrase", query='test').execute()
returns nothing

q = s.filter("match_phrase", query='test').execute()
returns nothing also

How can I write these queries properly?
The goal is to query and return all of these documents.

I also intend to add highlighting and a "Did you mean" feature.
I've already got suggestions working with:
s.suggest('name', 'test', completion={'field': 'name.suggest'}).execute()

Can someone help me, or point me to some documentation where I can figure this out?

Thanks

Sachin-Kahandal · 2021-05-14T17:29:47Z

Instead of this:
s.query("match_phrase", query='test').execute()

do this:
s.query("match_phrase", name='test').execute()

Similarly, for the filter query, replace "query" with the name of the field you are looking into.
Elasticsearch expects you to give both the field name and the query text you want to search for.
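
To make the point about field names concrete, here is a small sketch of the request body each call ends up describing. The helper query_body is a hypothetical stand-in written for this illustration, not part of elasticsearch-dsl; it assumes (as far as I understand the library) that keyword arguments are passed straight through as the body of the query.

```python
def query_body(query_type, **params):
    """Hypothetical stand-in: keyword arguments become the query body."""
    return {query_type: params}


# "query" is treated as a field name, so this searches a field called "query".
wrong = query_body("match_phrase", query="test")

# Here the keyword is the real field name from the document.
right = query_body("match_phrase", name="test")

print(wrong)
print(right)
```

Since the documents in this thread have no field called "query", the first form would be expected to return nothing.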

The match_phrase query is similar to the match query, but it is used to query phrases of text. Phrase matching is necessary when the order of the words is important: only documents that contain the words in the same order as the search input are matched.
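
The difference between the two can be sketched in plain Python. This is only a simplified model under stated assumptions (whitespace tokenization, lowercasing, no slop); the function names are made up for the illustration and none of this is Elasticsearch code.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_phrase(field_text, phrase):
    """True if the phrase tokens appear consecutively and in order."""
    tokens = tokenize(field_text)
    wanted = tokenize(phrase)
    if not wanted:
        return False
    last_start = len(tokens) - len(wanted)
    return any(tokens[i:i + len(wanted)] == wanted for i in range(last_start + 1))


def matches_any_term(field_text, query):
    """True if at least one query token appears anywhere in the field."""
    tokens = set(tokenize(field_text))
    return any(term in tokens for term in tokenize(query))


description = "testify a common taste"
print(matches_phrase(description, "common taste"))    # words in the same order
print(matches_phrase(description, "taste common"))    # same words, reversed
print(matches_any_term(description, "taste common"))  # order does not matter
```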

From your question, it seems you just want to match "test" against the name field.
So try a match query like:
s.query("match", name='test').execute()

Instead of this:
s.query(MultiMatch(query='test', fields=fields, fuzziness=10)).execute()

try this:
s.query(MultiMatch(query='test', fields=['name', 'description'], fuzziness='AUTO')).execute()

Let Elasticsearch take care of the fuzziness.
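
As a rough guide to what AUTO does: the Elasticsearch documentation describes it as picking an allowed edit distance from the term length (0 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two), and edit distances above 2 are not part of the documented range. The sketch below mirrors that rule in plain Python. The function names are made up for the illustration, and it uses plain Levenshtein distance, whereas Elasticsearch by default also counts a transposition of two adjacent characters as a single edit.

```python
def auto_fuzziness(term):
    """Allowed edits for a term under the documented AUTO rule."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


query = "test"
allowed = auto_fuzziness(query)
for token in ["teste", "taste", "testify"]:
    distance = edit_distance(query, token)
    print(token, distance, distance <= allowed)
```

Under this model only whole tokens within the allowed distance match, so fuzziness on its own is not a substitute for substring matching.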

zeitler · 2021-05-17T16:53:19Z

Hi @Sachin-Kahandal.
Thank you very much for your help

I'm still not getting the desired results.

The goal is to have 3 results:
Record 3 because description has "test" in "desctestsherealso"
Record 4 because name has "test" in "teste"
Record 5 because name has "test" in "testJonhy"

Tests:

Testing: s.query("match", name="test").execute().hits.total
FAILED: expected: 3, obtained: 0

Testing: s.query("match", query="test").execute().hits.total
FAILED: expected: 3, obtained: 0

Testing: s.filter("match", name="test").execute().hits.total
FAILED: expected: 3, obtained: 0

Testing: s.query(MultiMatch(query="test", fields=["name", "description"])).execute().hits.total
FAILED: expected: 3, obtained: 0

Testing: s.query(MultiMatch(query="test", fields=["name", "description"], fuzziness="AUTO")).execute().hits.total
FAILED: expected: 3, obtained: 1

If I understood correctly, MultiMatch returns documents where "test" is in name AND in description.

But what I want is documents where it is in name OR in description.

The application has a landing search page. It's intended to show all the documents that contain the search terms, and once I have the results I need to highlight the matching words.

Is the definition of the fields correct?
...
name = fields.TextField(attr="name", fields={
    'raw': fields.TextField(
        analyzer=full_searchable_analyzer,
        search_analyzer=string_sort_analyzer
    ),
    'suggest': fields.CompletionField(),
})
...

kind regards,
Thank you very much

Sachin-Kahandal · 2021-05-22T16:56:36Z

Hi @zeitler,
OK, now that I understand your problem:

What you need for this sort of problem is an nGram/edgeNGram tokenizer.
These tokenizers break text up into configurable-sized tuples of letters.
For instance, the word "news", run through an nGram tokenizer with min_gram: 1 and max_gram: 2, would be broken up into the tokens "n", "e", "w", "s", "ne", "ew", and "ws".
This sort of analysis does really well when it comes to imprecise matching.
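
The "news" example can be reproduced with a few lines of plain Python. This is only a sketch of the token output, not Elasticsearch's implementation: the helper names are made up, and the order in which Elasticsearch emits the tokens may differ from the shortest-first order used here.

```python
def ngrams(text, min_gram, max_gram):
    """All substrings with length between min_gram and max_gram, shortest first."""
    return [text[i:i + size]
            for size in range(min_gram, max_gram + 1)
            for i in range(len(text) - size + 1)]


def edge_ngrams(text, min_gram, max_gram):
    """Prefixes of the text with length between min_gram and max_gram."""
    longest = min(max_gram, len(text))
    return [text[:size] for size in range(min_gram, longest + 1)]


print(ngrams("news", 1, 2))

# Whether the term "test" exists as an indexed token depends on the gram sizes.
value = "desctestsherealso"
print("test" in ngrams(value, 2, 15))       # grams of length 4 are included
print("test" in ngrams(value, 3, 3))        # only grams of length 3 exist
print("test" in edge_ngrams(value, 2, 15))  # prefixes only: "de", "des", ...
```

One observation that follows from this, offered as a guess rather than a diagnosis: with a gram size fixed at 3, a four-letter term such as "test" never appears as a token, so a search analyzer that keeps the query as a single token would have nothing to match against.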

Also, with MultiMatch you can pass the operator of your choice, like:
query = MultiMatch(query=text, fields=['name', 'description'], operator="OR")
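
On the AND/OR question: as I read the Elasticsearch docs, with the default best_fields type a document matches when any one of the listed fields matches, and operator only controls how the individual terms of the query text are combined within a field. The sketch below is a simplified plain-Python model of that behavior, with whitespace tokenization, no scoring, and made-up function names; it is not how Elasticsearch is implemented.

```python
def field_matches(text, query, operator="or"):
    """Combine the query terms within one field using the given operator."""
    tokens = set(text.lower().split())
    terms = query.lower().split()
    if not terms:
        return False
    hits = [term in tokens for term in terms]
    return all(hits) if operator == "and" else any(hits)


def multi_match(doc, query, fields, operator="or"):
    """A document matches if at least one of the fields matches."""
    return any(field_matches(doc.get(name, ""), query, operator)
               for name in fields)


docs = [
    {"name": "teste", "description": "testify a common taste"},
    {"name": "Mariah", "description": "desctestsherealso"},
]
fields = ["name", "description"]

# Matching on the name field alone is enough for the second document.
print([multi_match(doc, "mariah", fields) for doc in docs])

# "and" requires both terms in the same field; "or" requires either one.
print(multi_match(docs[0], "common taste", fields, operator="and"))
print(multi_match(docs[0], "common mariah", fields, operator="and"))
print(multi_match(docs[0], "common mariah", fields, operator="or"))
```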

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html
Examples: https://qbox.io/blog/an-introduction-to-ngrams-in-elasticsearch

Hope this helps

Brechard · 2021-09-14T10:49:31Z

@zeitler did you manage to find a solution? If so, can you post it and close the issue?

leberknecht mentioned this issue Jun 11, 2021

Documentation for Fuzzy, FuzzyLikeThis, FuzzyLikeThisField? #1510

Open

elastic locked and limited conversation to collaborators Apr 5, 2024

miguelgrinberg converted this issue into discussion #1740 Apr 5, 2024

miguelgrinberg added the Category: Question label Apr 26, 2024
