Skip to content

Commit 4c113ac

Browse files
committed
Improve performance of extracting warning value
When building headers for a REST response, we de-duplicate the warning headers based on the actual warning value. The current implementation of this uses a capturing regular expression that is prone to excessive backtracking. In cases a request involves a large number of warnings, this extraction can be a severe performance penalty. An example where this can arise is a bulk indexing request that utilizes a deprecated feature (e.g., using deprecated forms of boolean values). This commit is an attempt to address this performance regression. We already know the format of the warning header, so we do not need to use a regular expression to parse it but rather can parse it by hand to extract the warning value. This gains back the vast majority of the performance lost due to the usage of a deprecated feature. There is still a performance loss due to logging the deprecation message but we do not address that concern in this commit. Relates #24114
1 parent 485f931 commit 4c113ac

File tree

1 file changed

+32
-1
lines changed

1 file changed

+32
-1
lines changed

core/src/main/java/org/elasticsearch/common/logging/DeprecationLogger.java

+32-1
Original file line numberDiff line numberDiff line change
@@ -226,10 +226,41 @@ public void deprecated(String msg, Object... params) {
226226
* @return the extracted warning value
227227
*/
228228
public static String extractWarningValueFromWarningHeader(final String s) {
229+
/*
230+
* We know the exact format of the warning header, so to extract the warning value we can skip forward from the front to the first
231+
* quote, and skip backwards from the end to the penultimate quote:
232+
*
233+
* 299 Elasticsearch-6.0.0 "warning value" "Sat, 25, Feb 2017 10:27:43 GMT"
234+
* ^ ^ ^
235+
* firstQuote penultimateQuote lastQuote
236+
*
237+
* We do it this way rather than seeking forward after the first quote because there could be escaped quotes in the warning value
238+
* but since there are none in the warning date, we can skip backwards to find the quote that closes the quoted warning value.
239+
*
240+
* We parse this manually rather than using the capturing regular expression because the regular expression involves a lot of
241+
* backtracking and carries a performance penalty. However, when assertions are enabled, we still use the regular expression to
242+
* verify that we are maintaining the warning header format.
243+
*/
244+
final int firstQuote = s.indexOf('\"');
245+
final int lastQuote = s.lastIndexOf('\"');
246+
final int penultimateQuote = s.lastIndexOf('\"', lastQuote - 1);
247+
final String warningValue = s.substring(firstQuote + 1, penultimateQuote - 2);
248+
assert assertWarningValue(s, warningValue);
249+
return warningValue;
250+
}
251+
252+
/**
253+
* Assert that the specified string has the warning value equal to the provided warning value.
254+
*
255+
* @param s the string representing a full warning header
256+
* @param warningValue the expected warning header
257+
* @return {@code true} if the specified string has the expected warning value
258+
*/
259+
private static boolean assertWarningValue(final String s, final String warningValue) {
229260
final Matcher matcher = WARNING_HEADER_PATTERN.matcher(s);
230261
final boolean matches = matcher.matches();
231262
assert matches;
232-
return matcher.group(1);
263+
return matcher.group(1).equals(warningValue);
233264
}
234265

235266
/**

0 commit comments

Comments
 (0)