Upgrade to Lucene 10 #114741

Merged on Oct 21, 2024 (580 commits)
Changes from 250 commits
5ff75bb
[Automated] Update Lucene snapshot to 10.0.0-snapshot-dc47adbbe73
elasticsearchmachine Sep 9, 2024
e0e2e1e
[Automated] Update Lucene snapshot to 9.12.0-snapshot-371fa57d9c7
elasticsearchmachine Sep 9, 2024
3267fd0
Remove IndexVersions.V_7_6_0
cbuescher Sep 9, 2024
74040a5
Remove IndexVersions.V_7_7_0 - V_7_11_0
cbuescher Sep 9, 2024
70ee548
Remove remainign V_7x IndexVersions
cbuescher Sep 9, 2024
db7463f
Uncomment some tests for better mergability with main
cbuescher Sep 9, 2024
95bc9d3
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 9, 2024
d0a1110
Fix errors in imports
cbuescher Sep 9, 2024
afec00b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-64f5697f537
elasticsearchmachine Sep 10, 2024
304a1c6
[Automated] Update Lucene snapshot to 9.12.0-snapshot-ce23e15eb54
elasticsearchmachine Sep 10, 2024
d104bc3
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 10, 2024
8fe6c3a
Remove usages of random versions before V_8_0_0 in tests
cbuescher Sep 10, 2024
8a01884
More test fixed related to legacy versions
cbuescher Sep 10, 2024
09b68e2
Mute CompositeRolesStoreTests.testXPackUserCanAccessNonRestrictedIndices
cbuescher Sep 10, 2024
ca97d86
Fix failing knn yaml test
cbuescher Sep 10, 2024
84d4e86
Fix docs test related to knn queries
cbuescher Sep 10, 2024
77f24ae
Null out missing codec in BWCLucene70Codec
cbuescher Sep 10, 2024
3cea216
Partially revert previous commit
cbuescher Sep 10, 2024
8bf41b8
[Automated] Update Lucene snapshot to 9.12.0-snapshot-7964682ddf5
elasticsearchmachine Sep 11, 2024
fff3fbb
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7c529ce092d
elasticsearchmachine Sep 11, 2024
0ee5e4f
Fix compilation issues after last Lucene 10 snapshot merge
cbuescher Sep 11, 2024
2d31a70
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 11, 2024
ced0f7b
Fix UOE in ContextIndexSearcher after last Lucene merge
cbuescher Sep 11, 2024
2430fff
[Automated] Update Lucene snapshot to 10.0.0-snapshot-74e3c44063a
elasticsearchmachine Sep 12, 2024
d45dbc7
[Automated] Update Lucene snapshot to 9.12.0-snapshot-ab262f917d4
elasticsearchmachine Sep 12, 2024
d24200e
Fix compile issues after latest Lucene snapshot update
cbuescher Sep 12, 2024
42c4ed1
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 12, 2024
8800107
spotless
cbuescher Sep 12, 2024
84bb2b4
Fixing more checkstyle and spotless issues after merging main
cbuescher Sep 12, 2024
e916319
Remove awaitsFix, as the issue was fixed
mayya-sharipova Sep 12, 2024
c54686b
Override the correct search method
javanna Sep 12, 2024
28846be
Follow up changes to inter-segment concurrency changes
cbuescher Sep 12, 2024
07fdd8b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-5045d3c67b1
elasticsearchmachine Sep 13, 2024
7dca607
[Automated] Update Lucene snapshot to 9.12.0-snapshot-6cc4f13ab22
elasticsearchmachine Sep 13, 2024
8cb58a7
Use RegExp.DEPRECATED_COMPLEMENT where needed
cbuescher Sep 13, 2024
32f5907
Unmute two tests that now pass
cbuescher Sep 13, 2024
c79f782
Unmute a couple more tests that now pass
ChrisHegarty Sep 13, 2024
014d338
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7c056ab88c7
elasticsearchmachine Sep 14, 2024
35764df
[Automated] Update Lucene snapshot to 9.12.0-snapshot-1b38d5dec85
elasticsearchmachine Sep 14, 2024
e76b6f6
[Automated] Update Lucene snapshot to 10.0.0-snapshot-568d1f3fbe7
elasticsearchmachine Sep 15, 2024
00ae8a4
[Automated] Update Lucene snapshot to 9.12.0-snapshot-9cd6a24be43
elasticsearchmachine Sep 15, 2024
40aa2f9
Fix getDiscountOverlaps in LegacyBM25Similarity
ChrisHegarty Sep 15, 2024
b05b5f3
Fix AggregatorTestCase with LeafReaderContextPartition
ChrisHegarty Sep 15, 2024
7d27f53
More LeafReaderContextPartition refactoring fixes
ChrisHegarty Sep 15, 2024
12b98e2
Merge branch 'main' into lucene_snapshot_10
elasticsearchmachine Sep 15, 2024
70a5dba
[Automated] Update Lucene snapshot to 10.0.0-snapshot-3801d859783
elasticsearchmachine Sep 16, 2024
4b2f5f1
lucene_snapshot: Fix constructor chaining in LegacyBM25Similarity
elasticsearchmachine Sep 16, 2024
762ec3e
lucene_snapshot_10: fix license headers
elasticsearchmachine Sep 16, 2024
b3f371d
Move AggregatorTestCase to search(Query, CollectorManager)
javanna Sep 16, 2024
b215944
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 16, 2024
6a58074
Don't randomize LuceneTestCase concurrency when using "newSearcher"
cbuescher Sep 16, 2024
6fc5930
Fix docs according to changes in Lovins token filter, Pathhierarchy A…
cbuescher Sep 16, 2024
2d1a614
Fix geo_shape related docs tests
cbuescher Sep 16, 2024
52bf71d
[Automated] Update Lucene snapshot to 10.0.0-snapshot-f4ebed2404e
elasticsearchmachine Sep 17, 2024
b4546db
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 17, 2024
592642b
Fix expected output of romanian analyzer
cbuescher Sep 17, 2024
4f7df52
Fix QueryTranslatorSpecTests due to changes in regex syntax flags
cbuescher Sep 17, 2024
4ebfae1
Fix 370_profile yaml test for yamlRestCompatTest
cbuescher Sep 17, 2024
a6f6d21
Fix romanian analyzer restBwc test
cbuescher Sep 17, 2024
309abb0
Remove v7.17.13 bwc tasks in CI
cbuescher Sep 17, 2024
98e5600
Determinize automaton produced by IncludeExclude
cbuescher Sep 17, 2024
455980d
Fix persian language analyzer doc by adding stemmer
cbuescher Sep 17, 2024
01127d1
[Automated] Update Lucene snapshot to 10.0.0-snapshot-b59a357e586
elasticsearchmachine Sep 18, 2024
6958086
Fix compile errors after L10 snapshot merge
cbuescher Sep 18, 2024
9af0d8e
[Automated] Update Lucene snapshot to 9.12.0-snapshot-a774a998be1
elasticsearchmachine Sep 16, 2024
6294ad2
lucene_snapshot: Fix constructor chaining in LegacyBM25Similarity
elasticsearchmachine Sep 16, 2024
f5ce091
[Automated] Update Lucene snapshot to 9.12.0-snapshot-cd7a74cb4d4
elasticsearchmachine Sep 17, 2024
75fcbe0
[Automated] Update Lucene snapshot to 9.12.0-snapshot-71ca6b4bb16
elasticsearchmachine Sep 18, 2024
6e40125
lucene_snapshot: fix another instance of IOContext.READONCE
elasticsearchmachine Sep 18, 2024
8aa9cce
Merge branch 'main' into lucene_snapshot_new
elasticsearchmachine Sep 18, 2024
ff74c90
lucene_snapshot: fix license headers
elasticsearchmachine Sep 15, 2024
26b6513
[Automated] Update Lucene snapshot to 10.0.0-snapshot-6d987e1ce1c
elasticsearchmachine Sep 19, 2024
fb44c63
[Automated] Update Lucene snapshot to 9.12.0-snapshot-b467a2bb66d
elasticsearchmachine Sep 19, 2024
d1fbaab
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 19, 2024
1a8c3b1
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 19, 2024
9eec2c4
Add a capability and transport version for new regex and range interv…
ChrisHegarty Sep 19, 2024
7150729
Multi term intervals: increase max_expansions (#112826)
mayya-sharipova Sep 19, 2024
0085911
[Automated] Update Lucene snapshot to 10.0.0-snapshot-e4ac57746eb
elasticsearchmachine Sep 20, 2024
1e3d353
[Automated] Update Lucene snapshot to 9.12.0-snapshot-a7ce3466d7c
elasticsearchmachine Sep 20, 2024
3bfd004
Adapt QueryAnalyzer to use TermInSetQuery#getBytesRefIterator
javanna Sep 20, 2024
bbac749
restore ngram tokenizer removed due to a bad merge
javanna Sep 20, 2024
952aa9c
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 20, 2024
e5f4ef9
Address norwegian stemmer creation issues
javanna Sep 20, 2024
8e51178
Merge branch 'main' into lucene_snapshot_10
javanna Sep 20, 2024
c03d83d
extend ESTestCase#newSearcher methods and add javadocs
javanna Sep 20, 2024
27139fc
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 20, 2024
2318da1
Rephrase comment in OldCodecsAvailableTests
javanna Sep 20, 2024
46249a0
clarify comment in BWCLucene70Codec
javanna Sep 20, 2024
352b10a
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 20, 2024
1d2737f
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 20, 2024
266979f
Fix READONCE IOContext usage
ChrisHegarty Sep 20, 2024
c746906
fix TransportSimulateBulkActionIT compilation
javanna Sep 20, 2024
4512167
another attempt to fix READONCE IOContext usage
ChrisHegarty Sep 20, 2024
c0b6794
spotless
ChrisHegarty Sep 20, 2024
b7574b5
Update docs/changelog/113018.yaml
ChrisHegarty Sep 20, 2024
aaf1bbc
Use the RC build
ChrisHegarty Sep 20, 2024
89c4af7
Address test failures in old-lucene-versions
javanna Sep 20, 2024
989e48a
Address test failure in BlockPostingsFormat3Tests
javanna Sep 20, 2024
08a78e9
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 21, 2024
6a32106
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 21, 2024
b594455
Update docs/changelog/113333.yaml
ChrisHegarty Sep 21, 2024
543d0c3
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 21, 2024
b0ec6c0
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 21, 2024
2a87ecc
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 21, 2024
cf56c9b
remove erroneous changelog
ChrisHegarty Sep 21, 2024
ceaf86e
Address WildcardFieldMapperTests failure
javanna Sep 21, 2024
303f22d
Multi term intervals: increase max_expansions (#112826)
mayya-sharipova Sep 19, 2024
835c114
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 21, 2024
14451f2
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 22, 2024
e1dcf11
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 22, 2024
5d79230
Merge branch 'main' into lucene_snapshot_9_12
ChrisHegarty Sep 22, 2024
435f0c2
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 22, 2024
f3b96e8
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 22, 2024
e3c24b2
fix WildcardFieldMapperTests to include
ChrisHegarty Sep 22, 2024
c74d361
Fix docs build
ChrisHegarty Sep 22, 2024
3a0ff7d
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
0794124
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
8ef1fcd
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
2f56034
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
254d82f
Merge branch 'lucene_snapshot_9_12' into lucene_snapshot_10
ChrisHegarty Sep 22, 2024
30d23b2
revert
ChrisHegarty Sep 22, 2024
c9cb409
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 23, 2024
2d25cbc
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 23, 2024
8b32340
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 23, 2024
22d88d2
Merge branch 'main' into lucene_snapshot_10
javanna Sep 23, 2024
381132c
Restore index versions 7 in lucene_snapshot_10 (#113317)
javanna Sep 23, 2024
12dfe38
Add Lucene70DocValuesFormat to old-lucene-versions plugin (#113377)
ChrisHegarty Sep 23, 2024
8ecb407
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 24, 2024
147eb47
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 24, 2024
fcd9a0e
Merge branch 'main' into lucene_snapshot_10
ChrisHegarty Sep 24, 2024
cad6c6c
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 24, 2024
1e62951
Remove leftover TODO
javanna Sep 24, 2024
c7c24b0
Remove TODO in RegexpFlag
ChrisHegarty Sep 24, 2024
eb06cec
Add UpdateForV10 annotation
javanna Sep 24, 2024
871c430
Merge remote-tracking branch 'upstream/lucene_snapshot_10' into lucen…
brianseeders Sep 24, 2024
f7be20d
Merge branch 'main' into lucene_snapshot
javanna Sep 24, 2024
7a8cab6
Revert needless jdk change in legacy file
javanna Sep 24, 2024
5490a47
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 25, 2024
3ea0406
Merge branch 'main' into lucene_snapshot
javanna Sep 25, 2024
b295803
Fix compile issues after last merge with main
cbuescher Sep 25, 2024
518fb08
[Automated] Update Lucene snapshot to 10.0.0-snapshot-ff57fa7b423
elasticsearchmachine Sep 26, 2024
05b4b6e
Merge branch 'main' into lucene_snapshot
cbuescher Sep 26, 2024
394a063
Merge branch 'main' into lucene_snapshot
javanna Sep 26, 2024
52de43f
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7b4b0238d70
elasticsearchmachine Sep 27, 2024
b723551
Merge branch 'main' into lucene_snapshot
javanna Sep 27, 2024
bbc9cf3
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 27, 2024
731219a
update profile tests
javanna Sep 27, 2024
c064636
add lucene 10 upgrade node feature and fix profile yaml test
javanna Sep 27, 2024
c75e0c5
Merge branch 'main' into lucene_snapshot
javanna Sep 27, 2024
2b15163
restore replaceValueInMatch for profile description tests from 8.x
javanna Sep 27, 2024
d912d7b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7b4b0238d70
elasticsearchmachine Sep 28, 2024
f93db49
[Automated] Update Lucene snapshot to 10.0.0-snapshot-0a8604d908c
elasticsearchmachine Sep 29, 2024
4a79e51
Revert "[Automated] Update Lucene snapshot to 10.0.0-snapshot-0a8604d…
javanna Sep 29, 2024
db54b81
Merge branch 'main' into lucene_snapshot
javanna Sep 29, 2024
1dc8b4c
[Automated] Update Lucene snapshot to 10.0.0-snapshot-22ac47c07ad
elasticsearchmachine Sep 30, 2024
b6532e6
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 30, 2024
63e524d
Address compile errors after vector api changes upstream (#113766)
javanna Sep 30, 2024
2471dc9
Update lucene snapshot buildkite config to build from branch_10_0 (#1…
javanna Sep 30, 2024
af93513
Merge branch 'main' into lucene_snapshot
javanna Sep 30, 2024
a969b1d
Merge branch 'main' into lucene_snapshot
javanna Sep 30, 2024
79d3d6a
Make dutch_kp and lovins no op token filters
javanna Sep 30, 2024
a20bf84
Fix needless use of concurrent collector managers (#113739)
original-brownbear Oct 1, 2024
85e8730
[Automated] Update Lucene snapshot to 10.0.0-snapshot-524ea208c87
elasticsearchmachine Oct 1, 2024
2b66d0e
Merge branch 'main' into lucene_snapshot
javanna Oct 1, 2024
ef2d0b8
Make "german2" an alias for "german" snowball stemmer (#113614)
cbuescher Oct 1, 2024
575c55d
Merge branch 'main' into lucene_snapshot
ChrisHegarty Oct 1, 2024
e700fe1
Fix build IOConext READ -> DEFAULT
ChrisHegarty Oct 1, 2024
6a48626
Fix bad merge IOContext READ -> DEFAULT
ChrisHegarty Oct 1, 2024
ef5d73f
fix compilation after merge
javanna Oct 1, 2024
552e935
Merge branch 'main' into lucene_snapshot
javanna Oct 1, 2024
d5a4604
[Automated] Update Lucene snapshot to 10.0.0-snapshot-4461bc1eff4
elasticsearchmachine Oct 2, 2024
e7644ca
Replace Lucene912Codec with Lucene100Codec
javanna Oct 2, 2024
d623d99
Introduce Elasticsearch900Codec
javanna Oct 2, 2024
66c2c3c
Add the missing bits for Elasticsearch900Codec
javanna Oct 2, 2024
d7bf310
Merge branch 'main' into lucene_snapshot
javanna Oct 2, 2024
352b903
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 2, 2024
7089ff3
Conditional stemming for 'persian' analyzer (#113482)
cbuescher Oct 2, 2024
23c0bed
Add bwc layer for 'romanian' analyzer (#113906)
cbuescher Oct 2, 2024
6afa629
Merge branch 'main' into lucene_snapshot
javanna Oct 2, 2024
e12e962
Fix bad merge
javanna Oct 2, 2024
e18492c
Merge branch 'main' into lucene_snapshot
javanna Oct 2, 2024
1a4387d
minor changes to changelog for persian analyzer
javanna Oct 2, 2024
ae92b29
[Automated] Update Lucene snapshot to 10.0.0-snapshot-f76fdb293e1
elasticsearchmachine Oct 3, 2024
de506f7
Merge remote-tracking branch 'origin/main' into lucene_snapshot
pmpailis Oct 3, 2024
d92bfad
Merge branch 'main' into lucene_snapshot
javanna Oct 3, 2024
aeabb36
[Automated] Update Lucene snapshot to 10.0.0-snapshot-f76fdb293e1
elasticsearchmachine Oct 4, 2024
e56d356
muting RankDocsQueryBuilderTests testRankDocsQueryEarlyTerminate
pmpailis Oct 4, 2024
f5602af
Merge branch 'main' into lucene_snapshot
cbuescher Oct 4, 2024
ca1938c
Fix compilation issue
cbuescher Oct 4, 2024
e4de0a1
Add changelog about the nori dictionary update (#114124)
cbuescher Oct 4, 2024
6861f7f
fix typos
javanna Oct 4, 2024
add9a9e
Updating RankDocRetrieverBuilderIT for lucene_snapshot branch (#114098)
pmpailis Oct 4, 2024
dba248e
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 4, 2024
adfd377
Add Snowball upgrade changelog (#114146)
cbuescher Oct 4, 2024
1df381a
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 5, 2024
e3e795e
Merge branch 'main' into lucene_snapshot
ChrisHegarty Oct 5, 2024
4d4e435
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 6, 2024
174df87
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 6, 2024
d0951f6
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 7, 2024
64e66e8
muting RankDocsQueryBuilderTests testRankDocsQueryEarlyTerminate (#11…
pmpailis Oct 7, 2024
8edbe11
Merge branch 'main' into lucene_snapshot
cbuescher Oct 7, 2024
a2a7120
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 7, 2024
e7a8877
Delete full cluster restart test removed in main
javanna Oct 7, 2024
6799147
Remove smoke test section in MetadataCreateIndexServiceTests
cbuescher Oct 7, 2024
cd46af1
spotless
javanna Oct 7, 2024
0ee0940
Merge branch 'main' into lucene_snapshot
javanna Oct 7, 2024
f8fac7a
Address EngineTestCase compile errors
javanna Oct 7, 2024
7987e8c
Address more compile errors around TotalHitCountCollectorManager cons…
javanna Oct 7, 2024
43b9797
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 8, 2024
cb0a0f0
Merge branch 'main' into lucene_snapshot
javanna Oct 8, 2024
37e47c5
Merge branch 'main' into lucene_snapshot
cbuescher Oct 8, 2024
a413196
Merge branch 'main' into lucene_snapshot
javanna Oct 8, 2024
84fdf5a
[Automated] Update Lucene snapshot to 10.0.0-snapshot-a4c0f741ccc
elasticsearchmachine Oct 9, 2024
70ce7a1
Merge branch 'main' into lucene_snapshot
cbuescher Oct 9, 2024
fc31076
Fix bbq for Lucene 10
benwtrent Oct 8, 2024
3933d6a
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 9, 2024
7dafac9
[Automated] Update Lucene snapshot to 10.0.0-snapshot-bc478d85a12
elasticsearchmachine Oct 9, 2024
8835728
apply MADV_NORMAL advice to enable more aggressive readahead (#114410)
ChrisHegarty Oct 9, 2024
773de7d
Merge branch 'main' into lucene_snapshot
cbuescher Oct 9, 2024
ec1b110
[Automated] Update Lucene snapshot to 10.0.0-snapshot-eadc07cc6a1
elasticsearchmachine Oct 10, 2024
08a24ca
Merge branch 'main' into lucene_snapshot
cbuescher Oct 10, 2024
3d77894
Remove empty queue conditional from slicing logic (#114513)
javanna Oct 10, 2024
7731f08
Merge branch 'main' into lucene_snapshot
cbuescher Oct 10, 2024
34ed721
[Automated] Update Lucene snapshot to 10.0.0-snapshot-eadc07cc6a1
elasticsearchmachine Oct 11, 2024
edb88a0
Merge branch 'main' into lucene_snapshot
cbuescher Oct 11, 2024
e5f7124
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 11, 2024
9018a73
Merge branch 'main' into lucene_snapshot
cbuescher Oct 11, 2024
4721a67
[Automated] Update Lucene snapshot to 10.0.0-snapshot-eadc07cc6a1
elasticsearchmachine Oct 12, 2024
c2a480b
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 12, 2024
3c5086a
[Automated] Update Lucene snapshot to 10.0.0-snapshot-eadc07cc6a1
elasticsearchmachine Oct 13, 2024
c70ceb7
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Oct 13, 2024
8574e51
[Automated] Update Lucene snapshot to 10.0.0-snapshot-eadc07cc6a1
elasticsearchmachine Oct 14, 2024
073a116
Merge branch 'main' into lucene_snapshot
cbuescher Oct 14, 2024
31b41d3
Merge branch 'main' into lucene_snapshot
javanna Oct 14, 2024
0cd2bcf
Upgrade Lucene to released version 10.0.0
javanna Oct 14, 2024
fc4a43a
Update docs/changelog/114741.yaml
javanna Oct 14, 2024
f91c190
Update docs/changelog/114741.yaml
javanna Oct 14, 2024
2084ab8
remove unnecessary change
javanna Oct 14, 2024
7990bda
update changelog
javanna Oct 14, 2024
cdfeb5c
fix bad merge of rest-api-spec/build.gradle
javanna Oct 14, 2024
ac24265
Merge branch 'main' into lucene_snapshot_10
benwtrent Oct 15, 2024
b6f646c
Restore SpecialPermission#check call in ExpressionScriptEngine
javanna Oct 15, 2024
1d5719d
Merge branch 'main' into lucene_snapshot_10
javanna Oct 16, 2024
d66c7e3
Merge branch 'main' into lucene_snapshot_10
javanna Oct 17, 2024
c2f698b
Merge branch 'main' into lucene_snapshot_10
javanna Oct 17, 2024
cc85647
Merge branch 'main' into lucene_snapshot_10
javanna Oct 17, 2024
d60179c
Merge branch 'main' into lucene_snapshot_10
javanna Oct 17, 2024
a02ad20
Merge branch 'main' into lucene_snapshot_10
javanna Oct 20, 2024
83142c3
Merge branch 'main' into lucene_snapshot_10
javanna Oct 21, 2024
ac85cdb
Merge branch 'main' into lucene_snapshot_10
javanna Oct 21, 2024
1 change: 0 additions & 1 deletion .buildkite/pipelines/lucene-snapshot/run-tests.yml
@@ -56,7 +56,6 @@ steps:
matrix:
setup:
BWC_VERSION:
-- 7.17.13
- 8.9.1
- 8.10.0
agents:
@@ -19,7 +19,7 @@
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.util.hnsw.RandomVectorScorer;
import org.apache.lucene.util.hnsw.RandomVectorScorerSupplier;
-import org.apache.lucene.util.quantization.RandomAccessQuantizedByteVectorValues;
+import org.apache.lucene.util.quantization.QuantizedByteVectorValues;
import org.apache.lucene.util.quantization.ScalarQuantizer;
import org.elasticsearch.common.logging.LogConfigurator;
import org.elasticsearch.core.IOUtils;
@@ -217,19 +217,17 @@ public float squareDistanceScalar() {
return 1 / (1f + adjustedDistance);
}

-RandomAccessQuantizedByteVectorValues vectorValues(int dims, int size, IndexInput in, VectorSimilarityFunction sim) throws IOException {
+QuantizedByteVectorValues vectorValues(int dims, int size, IndexInput in, VectorSimilarityFunction sim) throws IOException {
var sq = new ScalarQuantizer(0.1f, 0.9f, (byte) 7);
var slice = in.slice("values", 0, in.length());
return new OffHeapQuantizedByteVectorValues.DenseOffHeapVectorValues(dims, size, sq, false, sim, null, slice);
}

-RandomVectorScorerSupplier luceneScoreSupplier(RandomAccessQuantizedByteVectorValues values, VectorSimilarityFunction sim)
-throws IOException {
+RandomVectorScorerSupplier luceneScoreSupplier(QuantizedByteVectorValues values, VectorSimilarityFunction sim) throws IOException {
return new Lucene99ScalarQuantizedVectorScorer(null).getRandomVectorScorerSupplier(sim, values);
}

-RandomVectorScorer luceneScorer(RandomAccessQuantizedByteVectorValues values, VectorSimilarityFunction sim, float[] queryVec)
-throws IOException {
+RandomVectorScorer luceneScorer(QuantizedByteVectorValues values, VectorSimilarityFunction sim, float[] queryVec) throws IOException {
return new Lucene99ScalarQuantizedVectorScorer(null).getRandomVectorScorer(sim, values, queryVec);
}

@@ -59,10 +59,6 @@ org.apache.lucene.util.Version#parseLeniently(java.lang.String)

org.apache.lucene.index.NoMergePolicy#INSTANCE @ explicit use of NoMergePolicy risks forgetting to configure NoMergeScheduler; use org.elasticsearch.common.lucene.Lucene#indexWriterConfigWithNoMerging() instead.

-@defaultMessage Spawns a new thread which is solely under lucenes control use ThreadPool#relativeTimeInMillis instead
-org.apache.lucene.search.TimeLimitingCollector#getGlobalTimerThread()
-org.apache.lucene.search.TimeLimitingCollector#getGlobalCounter()

@defaultMessage Don't interrupt threads use FutureUtils#cancel(Future<T>) instead
java.util.concurrent.Future#cancel(boolean)

2 changes: 1 addition & 1 deletion build-tools-internal/version.properties
@@ -1,5 +1,5 @@
elasticsearch = 9.0.0
-lucene = 9.12.0
+lucene = 10.0.0

bundled_jdk_vendor = openjdk
bundled_jdk = 22.0.1+8@c7ec1332f7bb44aeba2eb341ae18aca4
3 changes: 3 additions & 0 deletions distribution/src/config/jvm.options
@@ -62,6 +62,9 @@
23:-XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.setAsTypeCache
23:-XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.asTypeUncached

+# Lucene 10: apply MADV_NORMAL advice to enable more aggressive readahead
+-Dorg.apache.lucene.store.defaultReadAdvice=normal

## heap dumps

# generate a heap dump when an allocation from the Java heap fails; heap dumps
4 changes: 2 additions & 2 deletions docs/Versions.asciidoc
@@ -1,8 +1,8 @@

include::{docs-root}/shared/versions/stack/{source_branch}.asciidoc[]

-:lucene_version: 9.12.0
-:lucene_version_path: 9_12_0
+:lucene_version: 10.0.0
+:lucene_version_path: 10_0_0
:jdk: 11.0.2
:jdk_major: 11
:build_type: tar
27 changes: 27 additions & 0 deletions docs/changelog/113482.yaml
@@ -0,0 +1,27 @@
pr: 113482
summary: The 'persian' analyzer has stemmer by default
area: Analysis
type: breaking
issues:
- 113050
breaking:
title: The 'persian' analyzer has stemmer by default
area: Analysis
details: >-
Lucene 10 has added a final stemming step to its PersianAnalyzer that Elasticsearch
exposes as 'persian' analyzer. Existing indices will keep the old
non-stemming behaviour while new indices will see the updated behaviour with
added stemming.
Users that wish to maintain the non-stemming behaviour need to define their
own analyzer as outlined in
https://www.elastic.co/guide/en/elasticsearch/reference/8.15/analysis-lang-analyzer.html#persian-analyzer.
Users that wish to use the new stemming behaviour for existing indices will
have to reindex their data.
impact: >-
Indexing with the 'persian' analyzer will produce slightly different tokens.
Users should check if this impacts their search results. If they wish to
maintain the legacy non-stemming behaviour they can define their own
analyzer equivalent as explained in
https://www.elastic.co/guide/en/elasticsearch/reference/8.15/analysis-lang-analyzer.html#persian-analyzer.
notable: false
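
The changelog entry above says that existing indices keep the non-stemming chain while new indices get the added stemming step. A minimal sketch of that kind of version gate follows. The class name, the constant `FIRST_LUCENE_10_INDEX_VERSION` and its value, and the method `filtersFor` are all hypothetical and are not taken from this PR. The filter names are the ones visible in the `lang-analyzer.asciidoc` hunk of this PR; the real chain may contain more.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the Elasticsearch implementation.
final class PersianChainSketch {

    // Hypothetical marker for the first index version created on Lucene 10.
    // The value is made up for this sketch.
    static final int FIRST_LUCENE_10_INDEX_VERSION = 9_000_000;

    // Returns the token filter names to apply for an index created at the
    // given version. Older indices keep the chain without the stemmer.
    static List<String> filtersFor(int indexCreatedVersion) {
        List<String> filters = new ArrayList<>(List.of(
            "decimal_digit",
            "arabic_normalization",
            "persian_normalization",
            "persian_stop"));
        if (indexCreatedVersion >= FIRST_LUCENE_10_INDEX_VERSION) {
            // Only indices created on or after the gate get the stemming step.
            filters.add("persian_stem");
        }
        return filters;
    }

    public static void main(String[] args) {
        System.out.println(filtersFor(FIRST_LUCENE_10_INDEX_VERSION - 1));
        System.out.println(filtersFor(FIRST_LUCENE_10_INDEX_VERSION));
    }
}
```

Gating on the version at which the index was created, rather than on the node version, is what would let already indexed tokens keep matching at search time without a reindex.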

18 changes: 18 additions & 0 deletions docs/changelog/113614.yaml
@@ -0,0 +1,18 @@
pr: 113614
summary: The 'german2' stemmer is now an alias for the 'german' snowball stemmer
area: Analysis
type: breaking
issues: []
breaking:
title: The "german2" snowball stemmer is now an alias for the "german" stemmer
area: Analysis
details: >-
Lucene 10 has merged the improved "german2" snowball language stemmer with the
"german" stemmer. For Elasticsearch, "german2" is now a deprecated alias for
"german". This may result in slightly different tokens being generated for
terms with umlaut substitution (such as "ue" for "ü").
impact: >-
Replace usages of "german2" with "german" in analysis configuration. Old
indices that use the "german" stemmer should be reindexed if possible.
notable: false
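
To make the "umlaut substitution" mentioned in the entry above concrete, here is a simplified sketch of a digraph rewrite of that kind: "ae", "oe" and "ue" become "ä", "ö" and "ü", with "ue" left alone after "q". This reflects one reading of the Snowball "german2" description and is an assumption, not the Snowball or Lucene code. The class and method names are hypothetical.

```java
// Simplified illustration; not the Snowball implementation.
final class UmlautDigraphSketch {

    // Rewrites "ae", "oe", "ue" to the matching umlaut, skipping "ue" after "q".
    // Unicode escapes are used so the source compiles regardless of file encoding.
    static String substitute(String word) {
        StringBuilder out = new StringBuilder(word.length());
        int i = 0;
        while (i < word.length()) {
            char c = word.charAt(i);
            boolean followedByE = i + 1 < word.length() && word.charAt(i + 1) == 'e';
            if (followedByE) {
                boolean afterQ = i > 0 && word.charAt(i - 1) == 'q';
                if (c == 'a') {
                    out.append('\u00e4'); // a-umlaut
                    i += 2;
                    continue;
                }
                if (c == 'o') {
                    out.append('\u00f6'); // o-umlaut
                    i += 2;
                    continue;
                }
                if (c == 'u' && !afterQ) {
                    out.append('\u00fc'); // u-umlaut
                    i += 2;
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(substitute("fuer"));
        System.out.println(substitute("quelle"));
    }
}
```

With a rewrite like this applied before stemming, a term typed as "fuer" and one typed as "für" can reach the stemmer in the same form, which is why tokens may differ from those produced by the older "german" stemmer.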

18 changes: 18 additions & 0 deletions docs/changelog/114124.yaml
@@ -0,0 +1,18 @@
pr: 114124
summary: The Korean dictionary for Nori has been updated
area: Analysis
type: breaking
issues: []
breaking:
title: The Korean dictionary for Nori has been updated
area: Analysis
details: >-
Lucene 10 ships with an updated Korean dictionary (mecab-ko-dic-2.1.1).
For details see https://github.com/apache/lucene/issues/11452. Users
experiencing changes in search behaviour on existing data are advised to
reindex.
impact: >-
The change is small and should generally provide better analysis results.
Existing indices for full-text use cases should be reindexed though.
notable: false

20 changes: 20 additions & 0 deletions docs/changelog/114146.yaml
@@ -0,0 +1,20 @@
pr: 114146
summary: Snowball stemmers have been upgraded
area: Analysis
type: breaking
issues: []
breaking:
title: Snowball stemmers have been upgraded
area: Analysis
details: >-
Lucene 10 ships with an upgrade of its Snowball stemmers.
For details see https://github.com/apache/lucene/issues/13209. Users using
Snowball stemmers that are experiencing changes in search behaviour on
existing data are advised to reindex.
impact: >-
The upgrade should generally provide improved stemming results. Small changes
in token analysis can lead to mismatches with previously indexed data, so
existing indices using Snowball stemmers as part of their analysis chain
should be reindexed.
notable: false

5 changes: 5 additions & 0 deletions docs/changelog/114741.yaml
@@ -0,0 +1,5 @@
pr: 114741
summary: Upgrade to Lucene 10
area: Search
type: upgrade
issues: []
12 changes: 6 additions & 6 deletions docs/plugins/analysis-nori.asciidoc
@@ -244,11 +244,11 @@ Which responds with:
"end_offset": 3,
"type": "word",
"position": 1,
-"leftPOS": "J(Ending Particle)",
+"leftPOS": "JKS(Subject case marker)",
"morphemes": null,
"posType": "MORPHEME",
"reading": null,
-"rightPOS": "J(Ending Particle)"
+"rightPOS": "JKS(Subject case marker)"
},
{
"token": "깊",
@@ -268,11 +268,11 @@ Which responds with:
"end_offset": 6,
"type": "word",
"position": 3,
-"leftPOS": "E(Verbal endings)",
+"leftPOS": "ETM(Adnominal form transformative ending)",
"morphemes": null,
"posType": "MORPHEME",
"reading": null,
-"rightPOS": "E(Verbal endings)"
+"rightPOS": "ETM(Adnominal form transformative ending)"
},
{
"token": "나무",
@@ -292,11 +292,11 @@ Which responds with:
"end_offset": 10,
"type": "word",
"position": 5,
-"leftPOS": "J(Ending Particle)",
+"leftPOS": "JX(Auxiliary postpositional particle)",
"morphemes": null,
"posType": "MORPHEME",
"reading": null,
-"rightPOS": "J(Ending Particle)"
+"rightPOS": "JX(Auxiliary postpositional particle)"
}
]
},
3 changes: 2 additions & 1 deletion docs/reference/analysis/analyzers/lang-analyzer.asciidoc
@@ -1430,7 +1430,8 @@ PUT /persian_example
"decimal_digit",
"arabic_normalization",
"persian_normalization",
-"persian_stop"
+"persian_stop",
+"persian_stem"
]
}
}
@@ -173,7 +173,6 @@ http://bvg.udc.es/recursos_lingua/stemming.jsp[`minimal_galician`] (Plural step
German::
https://dl.acm.org/citation.cfm?id=1141523[*`light_german`*],
https://snowballstem.org/algorithms/german/stemmer.html[`german`],
-https://snowballstem.org/algorithms/german2/stemmer.html[`german2`],
http://members.unine.ch/jacques.savoy/clef/morpho.pdf[`minimal_german`]

Greek::
24 changes: 12 additions & 12 deletions docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc
@@ -40,14 +40,14 @@ POST _analyze
"start_offset": 0,
"end_offset": 8,
"type": "word",
-"position": 0
+"position": 1
},
{
"token": "/one/two/three",
"start_offset": 0,
"end_offset": 14,
"type": "word",
-"position": 0
+"position": 2
}
]
}
@@ -144,14 +144,14 @@ POST my-index-000001/_analyze
"start_offset": 7,
"end_offset": 18,
"type": "word",
-"position": 0
+"position": 1
},
{
"token": "/three/four/five",
"start_offset": 7,
"end_offset": 23,
"type": "word",
-"position": 0
+"position": 2
}
]
}
@@ -178,14 +178,14 @@ If we were to set `reverse` to `true`, it would produce the following:
[[analysis-pathhierarchy-tokenizer-detailed-examples]]
=== Detailed examples

-A common use-case for the `path_hierarchy` tokenizer is filtering results by
-file paths. If indexing a file path along with the data, the use of the
-`path_hierarchy` tokenizer to analyze the path allows filtering the results
+A common use-case for the `path_hierarchy` tokenizer is filtering results by
+file paths. If indexing a file path along with the data, the use of the
+`path_hierarchy` tokenizer to analyze the path allows filtering the results
by different parts of the file path string.


This example configures an index to have two custom analyzers and applies
-those analyzers to multifields of the `file_path` text field that will
+those analyzers to multifields of the `file_path` text field that will
store filenames. One of the two analyzers uses reverse tokenization.
Some sample documents are then indexed to represent some file paths
for photos inside photo folders of two different users.
@@ -264,8 +264,8 @@ POST file-path-test/_doc/5
--------------------------------------------------


-A search for a particular file path string against the text field matches all
-the example documents, with Bob's documents ranking highest due to `bob` also
+A search for a particular file path string against the text field matches all
+the example documents, with Bob's documents ranking highest due to `bob` also
being one of the terms created by the standard analyzer boosting relevance for
Bob's documents.

@@ -301,7 +301,7 @@ GET file-path-test/_search
With the reverse parameter for this tokenizer, it's also possible to match
from the other end of the file path, such as individual file names or a deep
level subdirectory. The following example shows a search for all files named
-`my_photo1.jpg` within any directory via the `file_path.tree_reversed` field
+`my_photo1.jpg` within any directory via the `file_path.tree_reversed` field
configured to use the reverse parameter in the mapping.


@@ -342,7 +342,7 @@ POST file-path-test/_analyze


It's also useful to be able to filter with file paths when combined with other
-types of searches, such as this example looking for any files paths with `16`
+types of searches, such as this example looking for any files paths with `16`
that also must be in Alice's photo directory.

[source,console]
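
The `_analyze` responses in the `pathhierarchy-tokenizer.asciidoc` hunks above change the second and third tokens from position 0 to positions 1 and 2. A minimal sketch of a tokenizer that emits one token per path prefix and gives each prefix the next position is shown here. It is an illustration written for this page, assuming a single-character delimiter and no `skip`, `replacement` or `reverse` options; it is not Lucene's `PathHierarchyTokenizer`, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the Lucene PathHierarchyTokenizer implementation.
final class PathHierarchySketch {

    // A token paired with its position, mirroring the "token" and "position"
    // fields of the _analyze response.
    static final class PositionedToken {
        final String token;
        final int position;

        PositionedToken(String token, int position) {
            this.token = token;
            this.position = position;
        }
    }

    // Emits one token per path prefix; each prefix gets the next position.
    static List<PositionedToken> tokenize(String path, char delimiter) {
        List<PositionedToken> tokens = new ArrayList<>();
        int position = 0;
        // Start at 1 so a leading delimiter does not produce an empty token.
        for (int i = 1; i <= path.length(); i++) {
            boolean atEnd = i == path.length();
            if (atEnd || path.charAt(i) == delimiter) {
                tokens.add(new PositionedToken(path.substring(0, i), position));
                position++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (PositionedToken t : tokenize("/one/two/three", '/')) {
            System.out.println(t.position + " " + t.token);
        }
    }
}
```

For the input `/one/two/three` this yields `/one`, `/one/two` and `/one/two/three` at positions 0, 1 and 2, matching the updated expectations in the first hunk.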
2 changes: 1 addition & 1 deletion docs/reference/search/profile.asciidoc
@@ -1298,7 +1298,7 @@ One of the `dfs.knn` sections for a shard looks like the following:
"query" : [
{
"type" : "DocAndScoreQuery",
-"description" : "DocAndScore[100]",
+"description" : "DocAndScoreQuery[0,...][0.008961825,...],0.008961825",
"time_in_nanos" : 444414,
"breakdown" : {
"set_min_competitive_score_count" : 0,