sum_of_squares calculation and docs don't align #50416
Comments
Pinging @elastic/es-docs (>docs)
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
Dropping the link to Wikipedia seems appropriate. I'm actually not sure you can add the method the link describes now without two passes, which is something we don't support right now. @polyfractal, do I speak the truth here?
Hmm, I think there are multiple things going wrong here :)
@polyfractal
I have removed the link to Wikipedia as the linked page does not reflect what is actually being calculated, as per the following issue: #50416
Removed the link to Wikipedia as the function is not calculating the sum of squares in this way. More can be found here at this issue: #50416
Closed by #52398
Either the link in the docs to a Wikipedia article is incorrect and should be removed, or the function that calculates the sum of squares is incorrect and needs to be updated.
Describe the feature:
Expecting a sum_of_squares output in the Extended Stats Aggregation to align with the formula provided by Wikipedia and many other statistical sources.
The current sum_of_squares value in the Extended Stats Aggregation is calculated using the following equation: sumOfSquares += value * value;
The docs reference a Wikipedia article which provides a different function.
Elasticsearch version: Tested on 6.5.4, 7.5.0
Description of the problem including expected versus actual behavior:
The current calculation of the sum of squares does not align with the statistical technique of the same name.
Sum of squares is a statistical technique used in regression analysis to determine the dispersion of data points. In a regression analysis, the goal is to determine how well a data series can be fitted to a function that might help to explain how the data series was generated. Sum of squares is used as a mathematical way to find the function that best fits (varies least) from the data.
Many sum of squares calculators do not align to the way the sum of
Steps to reproduce:
List of Numbers:
74.01,74.77,73.94,73.61,73.40
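To make the difference concrete, here is a minimal Python sketch that evaluates both candidate definitions on the list above. It assumes that the statistical definition in question is the sum of squared deviations from the mean; the function names are illustrative and are not taken from Elasticsearch.

```python
# Values taken from the list above.
values = [74.01, 74.77, 73.94, 73.61, 73.40]


def raw_sum_of_squares(xs):
    """Sum of x * x, mirroring `sumOfSquares += value * value`."""
    total = 0.0
    for x in xs:
        total += x * x
    return total


def sum_of_squared_deviations(xs):
    """Sum of (x - mean)^2; assumed here to be the statistical definition."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


raw = raw_sum_of_squares(values)
dev = sum_of_squared_deviations(values)
print(f"raw sum of squares:        {raw:.4f}")
print(f"sum of squared deviations: {dev:.5f}")
```

For this list the raw sum of squares comes to roughly 27341.1487, while the sum of squared deviations is roughly 1.09412, so the two definitions differ by several orders of magnitude.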
Expected outcome:
Actual Outcome:
Elastic looks to be using the following formula to calculate the sum_of_squares:
Recreate:
Created an index:
Add some Docs
Search the index:
Response:
Search the index using SQL
Response:
If the current method is as expected, can the statistical method also be added? In that case the link in the docs will also need to be removed. I am happy to put in the PR once I have clarification.
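If the current method stays as it is, one possible client-side workaround is to derive the sum of squared deviations from values the aggregation already reports. The sketch below assumes the response exposes count, sum and sum_of_squares, and uses the identity sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / n. The helper name is hypothetical, and the input numbers were computed by hand from the list in this issue rather than copied from a real response.

```python
def squared_deviations_from_stats(count, total, sum_of_squares):
    """Derive sum((x - mean)^2) from count, sum(x) and sum(x^2)."""
    if count == 0:
        return 0.0
    return sum_of_squares - (total * total) / count


# Hypothetical inputs: computed by hand from the list in this issue,
# not taken from an actual extended_stats response.
stats = {"count": 5, "sum": 369.73, "sum_of_squares": 27341.1487}

derived = squared_deviations_from_stats(
    stats["count"], stats["sum"], stats["sum_of_squares"]
)
print(f"derived sum of squared deviations: {derived:.5f}")
```

This needs only a single pass over the data, but it subtracts two large, nearly equal numbers, so precision can suffer when the mean is large relative to the spread.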