In Visual Studio, a thread_local variable's destructor is not called when used with std::async. Is this a bug?

Problem description

The following code

#include <chrono>
#include <iostream>
#include <future>
#include <thread>
#include <mutex>

std::mutex m;

struct Foo {
    Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Created in thread " <<std::this_thread::get_id() <<"\n";
    }

    ~Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Deleted in thread " <<std::this_thread::get_id() <<"\n";
    }

    void proveMyExistance() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo this = " << this <<"\n";
    }
};

int threadFunc() {
    static thread_local Foo some_thread_var;

    // Prove the variable initialized
    some_thread_var.proveMyExistance();

    // The thread runs for some time
    std::this_thread::sleep_for(std::chrono::milliseconds{100}); 

    return 1;
}

int main() {
    auto a1 = std::async(std::launch::async, threadFunc);
    auto a2 = std::async(std::launch::async, threadFunc);
    auto a3 = std::async(std::launch::async, threadFunc);

    a1.wait();
    a2.wait();
    a3.wait();

    std::this_thread::sleep_for(std::chrono::milliseconds{1000});        

    return 0;
}

Compiled and run with clang on macOS:

clang++ test.cpp -std=c++14 -pthread
./a.out

Got result

Compiled and run in Visual Studio 2015 Update 3:

The destructors are not called.

Is this a bug or some undefined grey zone?

P.S.

If the final sleep, std::this_thread::sleep_for(std::chrono::milliseconds{1000});, is not long enough, you may sometimes not see all 3 "Deleted" messages.

When using std::thread instead of std::async, the destructors get called on both platforms, and all 3 "Deleted" messages are always printed.
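For comparison, the std::thread variant mentioned above can be sketched as follows. The names (Tracked, thread_func, run_with_std_thread) are made up for illustration and do not come from the original post. The sketch counts destructor calls instead of printing them, on the assumption that a thread's thread_local objects are destroyed before join() on that thread returns.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical counter, used here instead of printing to std::cout.
std::atomic<int> g_tracked_destroyed{0};

struct Tracked {
    ~Tracked() { ++g_tracked_destroyed; }
    void touch() {}
};

void thread_func() {
    thread_local Tracked t;  // one instance per thread
    t.touch();
}

// Runs thread_func on n freshly created threads, joins them all, and returns
// how many Tracked destructors had run by the time the last join() returned.
int run_with_std_thread(int n) {
    g_tracked_destroyed = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back(thread_func);
    }
    for (auto& th : threads) {
        th.join();  // each thread has fully exited once this returns
    }
    return g_tracked_destroyed.load();
}
```

Calling run_with_std_thread(3) should report three destructor calls, because every std::thread really ends when its function returns.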

Solution

Introductory Note: I have now learned a lot more about this and have therefore re-written my answer. Thanks to @super, @M.M and (latterly) @DavidHaim and @NoSenseEtAl for putting me on the right track.

tl;dr Microsoft's implementation of std::async is non-conformant, but they have their reasons and what they have done can actually be useful, once you understand it properly.

For those who don't want that, it is not too difficult to code up a drop-in replacement for std::async which works the same way on all platforms. I have posted one here.
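The replacement the author posted is not reproduced on this page, so what follows is only a minimal sketch of one way such a helper might look, not the author's code. The name async_on_new_thread is hypothetical. It assumes a callable taking no arguments and always starts a fresh std::thread. It uses std::packaged_task::make_ready_at_thread_exit, which the standard specifies as making the result ready only once the thread exits, after its thread_local objects have been destroyed.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: always runs f on a brand-new thread.
template <class F>
auto async_on_new_thread(F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    std::packaged_task<R()> task(std::move(f));
    std::future<R> result = task.get_future();
    std::thread worker([t = std::move(task)]() mutable {
        // Runs the task now, but defers making the future ready until this
        // thread exits.
        t.make_ready_at_thread_exit();
    });
    worker.detach();  // the future is the only handle the caller keeps
    return result;
}

struct PerThreadState {
    int value = 42;
    ~PerThreadState() {}  // non-trivial destructor, run at thread exit
};

// Example use: the task touches a thread_local and returns a value.
int demo_value() {
    auto fut = async_on_new_thread([] {
        thread_local PerThreadState state;
        return state.value;
    });
    return fut.get();
}
```

Unlike a future returned by std::async, the destructor of this future does not block until the task finishes, so this is not a perfect drop-in. A fuller version would also forward arguments and honour the launch policy.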

Edit: Wow, how open MS are being these days, I like it, see: https://github.com/MicrosoftDocs/cpp-docs/issues/308

Let's begin at the beginning. cppreference has this to say (emphasis and strikethrough mine):

However, the C++ standard says this:

So which is correct? The two statements have very different semantics as the OP has discovered. Well of course the standard is correct, as both clang and gcc show, so why does the Windows implementation differ? And like so many things, it comes down to history.

The (oldish) link that M.M dredged up has this to say, amongst other things:

And PPL is based on Windows' built-in support for ThreadPools, so @super was right.

So what does the Windows thread pool do and what is it good for? Well, it's intended to manage frequently-scheduled, short-running tasks in an efficient way, so point 1 is: don't abuse it. That said, my simple tests show that if this is your use case then it can offer significant efficiencies. It does, essentially, two things:

It recycles threads, rather than having to always start a new one for each asynchronous task you launch.
It limits the total number of background threads it uses, after which a call to std::async will block until a thread becomes free. On my machine, this number is 768.

So knowing all that, we can now explain the OP's observations:

A new thread is created for each of the three tasks started by main() (because none of them terminates immediately).
Each of these three threads creates a new thread-local variable Foo some_thread_var.
These three tasks all run to completion but the threads they are running on remain in existence (sleeping).
The program then sleeps for a short while and then exits, leaving the 3 thread-local variables un-destructed.

I ran a number of tests and in addition to this I found a few key things:

When a thread is recycled, the thread-local variables are re-used. Specifically, they are not destroyed and then re-created (you have been warned!).
If all the asynchronous tasks complete and you wait long enough, the thread pool terminates all the associated threads and the thread-local variables are then destroyed. (No doubt the actual rules are more complex than that but that's what I observed).
As new asynchronous tasks are submitted, the thread pool limits the rate at which new threads are created, in the hope that one will become free before it needs to perform all that work (creating new threads is expensive). A call to std::async might therefore take a while to return (up to 300ms in my tests). In the meantime, it's just hanging around, hoping that its ship will come in. This behaviour is documented but I call it out here in case it takes you by surprise.
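The first observation, that thread-local state survives from one task to the next on a recycled thread, can be reproduced on any platform with a stand-in for the pool. The sketch below is not how the Windows thread pool is implemented. It simply runs several tasks back to back on one long-lived worker thread, which is the situation a recycled pool thread is in. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> g_counter_destroyed{0};

struct Counter {
    int uses = 0;
    ~Counter() { ++g_counter_destroyed; }
};

// A "task" that relies on thread-local state.
int task() {
    thread_local Counter c;  // one per thread, not one per task
    return ++c.uses;
}

// Runs n tasks one after another on a single worker thread and records the
// value each task observed.
std::vector<int> run_tasks_on_one_worker(int n) {
    g_counter_destroyed = 0;
    std::vector<int> seen;
    std::thread worker([&seen, n] {
        for (int i = 0; i < n; ++i) {
            seen.push_back(task());  // state carries over between tasks
        }
    });
    worker.join();  // the single Counter is destroyed when the worker exits
    return seen;
}
```

Three tasks see 1, 2 and 3 rather than 1, 1 and 1, and only one destructor runs, when the worker thread finally exits. A task that expects a freshly constructed thread_local each time would silently get stale state.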

Conclusions:

Microsoft's implementation of std::async is non-conformant but it is clearly designed with a specific purpose, and that purpose is to make good use of the Win32 ThreadPool API. You can beat them up for blatantly flouting the standard but it's been this way for a long time and they probably have (important!) customers who rely on it. I will ask them to call this out in their documentation. Not doing that is criminal.
It is not safe to use thread_local variables in std::async tasks on Windows. Just don't do it, it will end in tears.
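One way to follow that advice, sketched here as a suggestion and not something the original answer spells out, is to give the state automatic storage inside the task, so that its lifetime is tied to the task and not to whichever thread happens to run it. The names below are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <future>

std::atomic<int> g_scoped_destroyed{0};

struct ScopedState {
    ~ScopedState() { ++g_scoped_destroyed; }
    int compute() { return 1; }
};

int task_with_scoped_state() {
    ScopedState state;  // automatic storage: destroyed when the task returns
    return state.compute();
}

// Launches three tasks with std::async, waits for their results, and returns
// how many ScopedState destructors have run by then.
int run_three_async_tasks() {
    g_scoped_destroyed = 0;
    auto a1 = std::async(std::launch::async, task_with_scoped_state);
    auto a2 = std::async(std::launch::async, task_with_scoped_state);
    auto a3 = std::async(std::launch::async, task_with_scoped_state);
    a1.get();
    a2.get();
    a3.get();
    return g_scoped_destroyed.load();
}
```

Because each ScopedState is destroyed before its task's result becomes available, all three destructors have run once the three get() calls return, whether the implementation uses fresh threads or a pool.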