Switch WIN32 to this implementation as well so that condition variables can be used, which is not possible with the CriticalSection-based mutex (pthread_cond_wait requires a pthread_mutex_t).
Use a custom pthread_mutex_t wrapper because std::mutex adds overhead.
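
A minimal sketch of the kind of wrapper described above, to show why the condition variable needs access to the underlying pthread_mutex_t. The names Mutex, CondVar, Handoff and run_handoff_demo are hypothetical and not taken from the actual code; on WIN32 this assumes a pthread implementation (for example winpthreads) is available.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical thin wrapper around pthread_mutex_t (illustrative name).
class Mutex {
public:
    Mutex() {
        int rc = pthread_mutex_init(&m_, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    ~Mutex() { pthread_mutex_destroy(&m_); }
    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;

    void lock() { pthread_mutex_lock(&m_); }
    void unlock() { pthread_mutex_unlock(&m_); }

    // Exposed so that a condition variable can wait on this mutex.
    pthread_mutex_t* native() { return &m_; }

private:
    pthread_mutex_t m_;
};

// Hypothetical condition variable wrapper; pthread_cond_wait only
// accepts a pthread_mutex_t*, hence the dependency on Mutex::native().
class CondVar {
public:
    CondVar() {
        int rc = pthread_cond_init(&c_, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    ~CondVar() { pthread_cond_destroy(&c_); }
    CondVar(const CondVar&) = delete;
    CondVar& operator=(const CondVar&) = delete;

    // Caller must hold m; it is released while waiting and re-acquired on return.
    void wait(Mutex& m) { pthread_cond_wait(&c_, m.native()); }
    void signal() { pthread_cond_signal(&c_); }
    void broadcast() { pthread_cond_broadcast(&c_); }

private:
    pthread_cond_t c_;
};

// Illustrative shared state for a one-shot handoff between two threads.
struct Handoff {
    Mutex mutex;
    CondVar ready_cv;
    bool ready = false;
    int value = 0;
};

static void* producer(void* arg) {
    Handoff* h = static_cast<Handoff*>(arg);
    h->mutex.lock();
    h->value = 42;
    h->ready = true;   // predicate is changed under the mutex
    h->mutex.unlock();
    h->ready_cv.signal();
    return nullptr;
}

// Starts the producer, waits for its value, and returns it.
int run_handoff_demo() {
    Handoff h;
    pthread_t t;
    int rc = pthread_create(&t, nullptr, producer, &h);
    assert(rc == 0);
    (void)rc;

    h.mutex.lock();
    while (!h.ready) {          // loop guards against spurious wakeups
        h.ready_cv.wait(h.mutex);
    }
    int v = h.value;
    h.mutex.unlock();

    pthread_join(t, nullptr);
    return v;
}
```

The wrapper stays a plain struct around the native handle, which is the property the condition variable relies on; whether it is measurably cheaper than std::mutex depends on the platform and standard library and is not shown here.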