The _moreLikeThis query fails #94

Closed
clintongormley opened this issue Mar 29, 2010 · 8 comments
@clintongormley
Contributor

A _moreLikeThis query fails, apparently because the query it constructs uses like instead of likeText:

curl -XGET 'http://localhost:9200/ia_object/notice/754/_moreLikeThis?pretty=true'
{
  "error" : "SearchPhaseExecutionException[Failed to execute [query] total failure; shardFailures {[getafix-44509][ia_object_1269862021][1]: QueryPhaseExecutionException[[ia_object_1269862021][1]: query[like:all_birthday like:-9223372036854775808 like:-9223372036854775808 like:-9223372036854775808 like:active like:-292275055-05-16 16:47:04 like:wears like:-9223372036854775808 like:-9223372036854775808 like:-9223372036854775808 like:-292275055-05-16 16:47:04 like:-9223372036854775808 like:true like:-292275055-05-16 16:47:04 like:false like:birthday like:-292275055-05-16 16:47:04 like:happy 21st birthday katie nicholas\nWe wish you good health and happiness for your future.\nLove, Nan and Gran (Dad) XX.\nWill be loving you always. like:-9223372036854775808 like:-9223372036854775808 -_uid:notice#754],from[0],size[10]: Query Failed []]; nested: }{null: Unknown}{null: Unknown}{null: Unknown}{null: Unknown}]"
}
@kimchy
Member

kimchy commented Mar 29, 2010

It constructs a likeText correctly. Is there a chance that you can post a test case with some data?

@clintongormley
Contributor Author

Hiya - this script reproduces the error:

curl -XPUT 'http://127.0.0.2:9200/ia_object/'  

curl -XPUT 'http://127.0.0.2:9200/ia_object/notice/_mapping?ignoreConflicts=false'  -d '
{
   "allField" : {
      "store" : "yes",
      "termVector" : "with_positions_offsets",
      "enabled" : 1
   },
   "properties" : {
      "notice_type" : {
         "index" : "not_analyzed",
         "type" : "string"
      },
      "ancestor_ids" : {
         "type" : "long"
      },
      "status" : {
         "index" : "not_analyzed",
         "type" : "string"
      },
      "location_ids" : {
         "type" : "long",
         "indexName" : "location_id"
      },
      "remote_last_modified" : {
         "format" : "yyyy-MM-dd HH:mm:ss",
         "type" : "date"
      },
      "parent_id" : {
         "type" : "long"
      },
      "featured" : {
         "type" : "boolean"
      },
      "publish_date" : {
         "format" : "yyyy-MM-dd HH:mm:ss",
         "type" : "date"
      },
      "text" : {
         "type" : "string"
      },
      "id" : {
         "type" : "long"
      },
      "creator_id" : {
         "nullValue" : 0,
         "type" : "long"
      },
      "last_modified" : {
         "format" : "yyyy-MM-dd HH:mm:ss",
         "type" : "date"
      },
      "name" : {
         "boost" : "1.2",
         "type" : "string"
      },
      "has_name" : {
         "type" : "boolean"
      },
      "created" : {
         "format" : "yyyy-MM-dd HH:mm:ss",
         "type" : "date"
      },
      "sub_type" : {
         "index" : "not_analyzed",
         "type" : "string"
      }
   }
}
'


curl -XPUT 'http://127.0.0.2:9200/ia_object/notice/754'  -d '
{
   "notice_type" : "all_birthday",
   "ancestor_ids" : [
      "268",
      "23",
      "22"
   ],
   "status" : "active",
   "last_modified" : "2010-03-25 18:33:14",
   "name" : "wears",
   "location_ids" : [
      "23",
      "24",
      "30"
   ],
   "remote_last_modified" : "2007-02-14 18:10:00",
   "parent_id" : "268",
   "has_name" : 1,
   "created" : "2010-01-11 18:47:46",
   "featured" : 0,
   "sub_type" : "birthday",
   "publish_date" : "2007-02-13 00:00:00",
   "text" : "happy 21st birthday katie nicholas\nWe wish you good health and happiness for your future.\nLove, Nan and Gran (Dad) XX.\nWill be loving you always.",
   "id" : "754",
   "creator_id" : 0
}
'

curl -XGET 'http://127.0.0.2:9200/ia_object/notice/754/_moreLikeThis' 

# {
#    "error" : "SearchPhaseExecutionException[Failed to execute [qu
# >    ery] total failure; shardFailures {[getafix-32775][ia_object
# >    ][4]: QueryPhaseExecutionException[[ia_object][4]: query[lik
# >    e:all_birthday like:-9223372036854775808 like:-9223372036854
# >    775808 like:-9223372036854775808 like:active like:-292275055
# >    -05-16 16:47:04 like:wears like:-9223372036854775808 like:-9
# >    223372036854775808 like:-9223372036854775808 like:-292275055
# >    -05-16 16:47:04 like:-9223372036854775808 like:true like:-29
# >    2275055-05-16 16:47:04 like:false like:birthday like:-292275
# >    055-05-16 16:47:04 like:happy 21st birthday katie nicholas\n
# >    We wish you good health and happiness for your future.\nLove
# >    , Nan and Gran (Dad) XX.\nWill be loving you always. like:-9
# >    223372036854775808 like:-9223372036854775808 -_uid:notice#75
# >    4],from[0],size[10]: Query Failed []]; nested: }{null: Unkno
# >    wn}{null: Unknown}{null: Unknown}{null: Unknown}]"
# }

@kimchy
Member

kimchy commented Mar 30, 2010

I fixed the failure. With the fix, when you execute an mlt with no parameters, it uses the _source to generate the query; note that it ignores number-based types.

One good option, if you store _all, is to list _all in mltFields.
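
As a rough sketch of that suggestion, the request URL could be assembled as below. The helper name mlt_url is hypothetical; the host, index, type and id are simply the ones from the test case earlier in this issue, and the mltFields parameter is spelled as it appears in this thread. This only builds the URL and assumes _all is actually stored and mapped for the type.

```shell
# Hypothetical helper (not part of Elasticsearch): builds a _moreLikeThis
# URL for a single document. All values below come from the test case in
# this issue and are assumptions about your own setup.
mlt_url() {
  base="$1"; index="$2"; type="$3"; id="$4"; fields="$5"
  url="$base/$index/$type/$id/_moreLikeThis"
  # Only append mltFields when a field list was given
  if [ -n "$fields" ]; then
    url="$url?mltFields=$fields"
  fi
  printf '%s\n' "$url"
}

# Print the URL for an mlt request restricted to the _all field
mlt_url 'http://127.0.0.2:9200' ia_object notice 754 _all
```

The printed URL can then be passed to curl -XGET against a running node; leaving the last argument empty gives the parameterless form that falls back to _source.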

@clintongormley
Contributor Author

The query without any mltFields now works. However, if I specify the _all field (or any unknown field), I get:

curl -XGET 'http://127.0.0.2:9200/es_test_1/type_1/1/_moreLikeThis?mltFields=_all&minDocFreq=1&minTermFrequency=1'  -d '
{}
'
# {
#    "error" : "NoShardAvailableActionException[[es_test_1][3] No s
# >    hard available for [type_1#1]]; nested: ElasticSearchExcepti
# >    on[No mapping for field [_all] in type [type_1]]; "
# }

And if I specify the _source field, I get:

curl -XGET 'http://127.0.0.2:9200/es_test_1/type_1/1/_moreLikeThis?mltFields=_source&minDocFreq=1&minTermFrequency=1'  -d '
{}
'
# {
#    "error" : "NullPointerException[null]"
# }

In my local test suite, specifying the text field works, but in my live setup (with more complex docs), I get the same NullPointerException if I specify any field name.

@clintongormley
Contributor Author

In fact, if you use my original test case in the first post in this issue, and just change the last line to:

curl -XGET 'http://127.0.0.2:9200/ia_object/notice/754/_moreLikeThis?mltFields=text'

... you'll get a NullPointerException.

@kimchy
Member

kimchy commented Apr 2, 2010

OK, hopefully it's fixed now.

@clintongormley
Contributor Author

Looks good. One thing: if I specify (e.g.) a date field, instead of reporting "you can't do mlt on a date field" it says "No fields found to fetch likeText from", which could be a better error message. I get the same message when specifying _source.

@clintongormley
Contributor Author

Fixed

rmuir pushed a commit to rmuir/elasticsearch that referenced this issue Nov 8, 2015
Closes elastic#94.
(cherry picked from commit 0ab38f3)
(cherry picked from commit 96c7bb1)
rmuir pushed a commit to rmuir/elasticsearch that referenced this issue Nov 8, 2015
I use ElasticSearch 1.4.3 with mapper-attachment plugin 2.4.2 (TIKA 1.7).

I get an error when indexing **specific** docx file:
> "[DEBUG][org.elasticsearch.index.mapper.attachment.AttachmentMapper] Failed to extract [-1] characters of text for [null]: [org.apache.poi.xwpf.usermodel.XWPFSDT.getContent()Lorg/apache/poi/xwpf/usermodel/ISDTContent;]"

But if i use mapper-attachment plugin 2.4.1 (TIKA 1.5) there is no error and content is parsed successfully.

Caused by this change elastic#94.

Closes elastic#104.
IanvsPoplicola pushed a commit to IanvsPoplicola/elasticsearch that referenced this issue Mar 21, 2017
Update Coordination.md claiming issue 21312