This article looks at how to deal with the large overhead of Azure Service Fabric's InvokeWithRetryAsync; it should be a useful reference for anyone hitting the same problem.

Problem Description


I'm currently working on a Service Fabric microservice that needs high throughput.

I'm wondering why I can't achieve more than 500 1KB messages per second on my workstation over loopback.

I removed all the business logic and attached a performance profiler, just to measure end-to-end performance.

It seems that ~96% of the time is spent resolving the client and only ~2% performing the actual HTTP requests.

I'm invoking "Send" in a tight loop for the test:

private HttpCommunicationClientFactory factory = new HttpCommunicationClientFactory();

public async Task Send()
{
    var client = new ServicePartitionClient<HttpCommunicationClient>(
         factory,
         new Uri("fabric:/MyApp/MyService"));

    await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + "/test"));
}

Any ideas on this? According to the documentation, the way I call the service seems to be the Service Fabric best practice.

UPDATE: Caching the ServicePartitionClient does improve the performance, but with partitioned services I'm unable to cache the client, since I don't know the partition for a given PartitionKey.
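
One possible way around this is to cache one ServicePartitionClient per ServicePartitionKey rather than per resolved partition: the key-to-partition mapping and the endpoint resolution happen inside the client on first use, so the caller never needs to know which partition a key maps to. A minimal sketch, assuming Int64 range partitioning; the clientCache field, GetClient helper and partitionKey parameter are illustrative names, not part of the code above:

private readonly ConcurrentDictionary<long, ServicePartitionClient<HttpCommunicationClient>> clientCache =
    new ConcurrentDictionary<long, ServicePartitionClient<HttpCommunicationClient>>();

private ServicePartitionClient<HttpCommunicationClient> GetClient(long partitionKey)
{
    // One cached client per partition key (requires System.Collections.Concurrent).
    // Which partition the key maps to is resolved inside the client itself.
    return clientCache.GetOrAdd(partitionKey, key =>
        new ServicePartitionClient<HttpCommunicationClient>(
            factory,
            new Uri("fabric:/MyApp/MyService"),
            new ServicePartitionKey(key)));
}

public Task Send(long partitionKey)
{
    var client = GetClient(partitionKey);
    return client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + "/test"));
}

With this pattern the partition is still resolved, but only once per key (plus re-resolution on retriable failures) instead of once per call.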

UPDATE 2: I'm sorry I didn't include full details in my initial question. We noticed the huge overhead of InvokeWithRetry when initially implementing socket-based communication.

You won't notice it that much if you are using HTTP requests. An HTTP request already takes ~1ms, so adding 0.5ms for InvokeWithRetry isn't that noticeable.

But if you use raw sockets, which in our case take ~0.005ms, adding 0.5ms of overhead for InvokeWithRetry is immense!
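
At ~0.5ms of fixed per-call overhead, a single sequential caller is capped at roughly 2,000 calls per second (1 / 0.0005s) no matter how fast the transport is, whereas a ~0.005ms socket round trip would on its own allow on the order of 200,000.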

Here is an HTTP example; with InvokeWithRetry it takes 3x as long:

public async Task RunTest()
{
    var factory = new HttpCommunicationClientFactory();
    var uri = new Uri("fabric:/MyApp/MyService");
    var count = 10000;

    // Example 1: ~6000ms
    for (var i = 0; i < count; i++)
    {
        var pClient1 = new ServicePartitionClient<HttpCommunicationClient>(factory, uri, new ServicePartitionKey(1));
        await pClient1.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url));
    }

    // Example 2: ~1800ms
    var pClient2 = new ServicePartitionClient<HttpCommunicationClient>(factory, uri, new ServicePartitionKey(1));
    HttpCommunicationClient resolvedClient = null;
    await pClient2.InvokeWithRetryAsync(
        c =>
        {
            resolvedClient = c;
            return Task.FromResult(true);
        });

    for (var i = 0; i < count; i++)
    {
        await resolvedClient.HttpClient.GetAsync(resolvedClient.Url);
    }
}
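
To put those timings in perspective: 6000ms / 10,000 calls ≈ 0.6ms per call when a new ServicePartitionClient is constructed each time, versus 1800ms / 10,000 ≈ 0.18ms per call against the already-resolved client, so roughly 0.4ms of every call goes to client construction and partition resolution, in line with the ~0.5ms overhead mentioned above.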

I'm aware that InvokeWithRetry adds some nice things on the client side that I don't want to lose. But does it need to resolve the partition on every call?

Solution

I thought it would be nice to actually benchmark this and see what the difference was. I created a basic setup with a stateful service that opens an HttpListener, and a client that calls that service in three different ways:

  • Creating a new client for each call and doing all the calls in sequence

    for (var i = 0; i < count; i++)
    {
        var client = new ServicePartitionClient<HttpCommunicationClient>(_factory, _httpServiceUri, new ServicePartitionKey(1));
        var httpResponseMessage = await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
    }
    

  • Creating the client only once and reusing it for each call, in sequence

    var client = new ServicePartitionClient<HttpCommunicationClient>(_factory, _httpServiceUri, new ServicePartitionKey(1));
    for (var i = 0; i < count; i++)
    {
        var httpResponseMessage = await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
    }
    

  • Creating a new client for each call and running all the calls in parallel

    var tasks = new List<Task>();
    for (var i = 0; i < count; i++)
    {
        tasks.Add(Task.Run(async () =>
        {
            var client = new ServicePartitionClient<HttpCommunicationClient>(_factory, _httpServiceUri, new ServicePartitionKey(1));
            var httpResponseMessage = await client.InvokeWithRetryAsync(c => c.HttpClient.GetAsync(c.Url + $"?index={id}"));
        }));
    }
    Task.WaitAll(tasks.ToArray());
    

I then ran the test for a number of different call counts to get a rough average.

Now, this should be taken for what it is: not a complete and comprehensive test in a controlled environment. There are a number of factors that will affect this performance, such as the cluster size, what the called service actually does (in this case, nothing really), and the size and complexity of the payload (in this case, a very short string).

In this test I also wanted to see how Fabric Transport behaved, and its performance was similar to the HTTP transport (honestly, I had expected it to be slightly better, but that may not be visible in such a trivial scenario).
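
Here "Fabric Transport" refers to calls made over the SDK's built-in service remoting stack. A minimal sketch of what such a call might look like; the IMyService contract and its GetAsync method are hypothetical and not part of the HTTP service benchmarked above:

// Hypothetical remoting contract; the benchmarked service exposes HTTP instead.
public interface IMyService : IService
{
    Task<string> GetAsync(int index);
}

// Client side: ServiceProxy resolves the partition and retries transient failures,
// playing the same role as ServicePartitionClient does for the HTTP clients above.
IMyService proxy = ServiceProxy.Create<IMyService>(
    new Uri("fabric:/MyApp/MyService"),
    new ServicePartitionKey(1));

string result = await proxy.GetAsync(1);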

It's worth noting that for the parallel execution of 10,000 calls the performance degraded significantly, likely because the service runs out of working memory. The effect of this might be that some of the client calls fault and are retried (to be verified) after a delay. The way I measure the duration is the total time until all calls have completed. It should also be noted that the test does not really allow the service to use more than one node, since all the calls are routed to the same partition.

To conclude, the performance effect of reusing the client is nominal, and for trivial calls HTTP performs similarly to Fabric Transport.

That concludes this look at the large overhead of Azure Service Fabric's InvokeWithRetryAsync; hopefully the answer above is helpful.
