You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+26Lines changed: 26 additions & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -72,6 +72,32 @@ Folding of unicode characters based on `UTR#30`. It registers itself under `icu_
72
72
}
73
73
}
74
74
75
+
ICU Filtering
76
+
-------------
77
+
78
+
The folding can be filtered by a set of unicode characters with the parameter `unicodeSetFilter`. This is useful for a non-internationalized search engine where retaining a set of national characters which are primary letters in a specific language is wanted. See syntax for the UnicodeSet "here":http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html.
79
+
80
+
The Following example exempts Swedish characters from the folding. Note that the filtered characters are NOT lowercased which is why we add that filter below.
0 commit comments