Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Issue #8462

Closed
asal-hesehus opened this issue Feb 27, 2025 · 11 comments · Fixed by #8472
Closed

Performance Issue #8462

asal-hesehus opened this issue Feb 27, 2025 · 11 comments · Fixed by #8472
Labels
8.x Relates to 8.x client version Category: Bug

Comments

@asal-hesehus
Copy link

Elastic.Clients.Elasticsearch version: 8.17.1

Elasticsearch version:8.16

.NET runtime version:.NET core 9.0

Description of the problem including expected versus actual behavior:
We du performance test for searches in elastic for each of our versions. When upgraded Elastic.Clients.Elasticsearch from 8.15.10 to 8.17.1 We noticed that our queries was 4-6 times slower and that CPU usage was also about 3 times slower.

Steps to reproduce:
We do not have any steps to reproduce yet. But will provide them if needed.

Expected behavior
No performance dip.

@asal-hesehus asal-hesehus added 8.x Relates to 8.x client version Category: Bug labels Feb 27, 2025
@flobernd
Copy link
Member

flobernd commented Feb 27, 2025

Hi @asal-hesehus,

We noticed that our queries was 4-6 times slower and that CPU usage was also about 3 times slower.

This statement is very confusing to me.

Is that the server or client CPU usage? How do you measure it? You mention the CPU usage is "slower". Does that mean it's higher or less than it used to be?

Is that the time a search query request takes when executing it while using the client? How do you measure that? Did you try to run the same query in Kibana or curl?

There are no fundamental changes in the client from 8.15.10 to 8.17.1 which could explain such performance drop.

I'm sorry, but this issue requires way more context to find the bottleneck. It could be anything from the client (unlikely), changes to the code that are not related to the client update, network, server, etc.

@asal-hesehus
Copy link
Author

Hi sorry for the a bit complete support issue.

It is the client side that we noticed the higher CPU usage and the increased response times. The performance test was our servers complete time to process the responses.

Downgrading the client from 8.17.1 to 8.15.10 returned our response times to the "Normal". So we have isolated that the client is the issue but not what in client, if it serialization or if the queries that the client outputs have changes and that leads to worse performance.

I will try to dig a bit deeper and see if we can find the issue.

@mvkhesehus
Copy link

Hi - I work in the same team as @asal-hesehus

So i'll try to provide more info from what we saw in APM and from our load tests. The only change in the in the last deployment is the downgrade to version 8.15.10 nothing else. APM metrics are aggregated across several production deployments, so it indicates its a general issue.

Image

From our load test where the only change is downgrading from 8.17.1 to 8.15.10. Also shows a significant change in performance. Both response times and number of request per second is affected.

Image

@flobernd
Copy link
Member

Hi @asal-hesehus @mvkhesehus

I will try to dig a bit deeper and see if we can find the issue.

Happy to help you finding the problem! For this I would need a standalone project that includes only your load test code and a bootstrapping routine that creates the required indices with proper mappings, some data seeding, etc.

Copy link
Contributor

github-actions bot commented Mar 5, 2025

This issue is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 2 days.

@asal-hesehus
Copy link
Author

We are working on getting some more information

@asal-hesehus
Copy link
Author

Hi We can se that there are no changes to the generated query.
We can see also se that there are alot of changes to the Elastic.Transport: elastic/elastic-transport-net@0.4.22...0.5.7
Are there any changes to the configuration that might affect performance?

@asal-hesehus
Copy link
Author

After doing some more debugging I have managed to find the issue : EnableTcpStats and EnableThreadPoolStats are default on in the newer version of the driver and that leads to a major performance hit. I can see they are on in the debug information on the response.

In the 8.15.10 the EnableTcpStats and EnableThreadPoolStats was false. In the new they are null, and I would guess that somewhere in your code that causes them to be enabled.

I ran the code below on the old and on the new version. The old version was around 8 seconds while the new was around 14.

Then I disabled EnableTcpStats and EnableThreadPoolStats and then the numbers was the same.

`public static class Program
{
public static async Task Main()
{
var client = CreateClient("https://localhost:9200","elastic", "ELASTIC_PASSWORD");

    var sp = Stopwatch.StartNew();

    var tasks = new[]
    {
        PerformPingsAsync(client),
        PerformPingsAsync(client),
        PerformPingsAsync(client),
        PerformPingsAsync(client),
    };

    await Task.WhenAll(tasks);

    sp.Stop();
    Console.WriteLine(sp.ElapsedMilliseconds);
    Console.ReadKey();
}

static async Task  PerformPingsAsync(ElasticsearchClient elasticsearchClient)
{
    for (int i = 0; i < 10000; i++)
    {
        var pingResponse = await elasticsearchClient.PingAsync();
    }
}

public static ElasticsearchClient CreateClient(string url, string userName, string password)
{

    var authorizationHeader = (AuthorizationHeader)new BasicAuthentication(userName, password);
    var settings = new ElasticsearchClientSettings(new Uri(url))
        .Authentication(authorizationHeader);

    /* enabling these two lines of code, makes the code as fast as the old version
    settings.EnableThreadPoolStats(false);
    settings.EnableTcpStats(false);
    */
    settings.DisableDirectStreaming();
    settings.ThrowExceptions();
    return new ElasticsearchClient(settings);
}

}`

@gpetrou
Copy link
Contributor

gpetrou commented Mar 13, 2025

I suspect that the problem was introduced in elastic/elastic-transport-net@d5141a9#diff-ab361d66a9fe9e43daed281bc432588ddd3ca21abcfccda72cf1b3415b218e4dR127
It seems that true was not present before.
elastic/elastic-transport-net@d5141a9#diff-ab361d66a9fe9e43daed281bc432588ddd3ca21abcfccda72cf1b3415b218e4dL104
@Mpdreamz was this change intentional?
If not, maybe revert?

@Mpdreamz
Copy link
Member

Damn sorry for missing that in my refactor @gpetrou and thank you for catching it!

Revert PR is up: elastic/elastic-transport-net#157

@flobernd
Copy link
Member

Thanks a lot @asal-hesehus @gpetrou for chasing this down!

Thanks @Mpdreamz for the transport fix. I'll try to release a new version of the client early next week 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.x Relates to 8.x client version Category: Bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants