Comment by joshumax - Hacker Neue

joshumax Aug 6, 2019 parent

What are you referring to regarding pthread_create()? Last time I checked I thought that it would return an undefined thread* only when giving a nonzero return code, which while certainly isn't handled in a lot of multi-threaded applications could be checked before anything else is done with the newly created thread.

bArray Aug 6, 2019

There appears to be cases under high create/destroy scenarios where it returns zero, but failed to allocate memory for a thread during create. This is with tonnes of available memory (hence not a mapping error) and no exceptions thrown.

That said, it's still entirely possible I've made a mistake. Please see here: https://github.com/electric-sheep-uc/black-sheep/blob/master...

The idea is that you do preliminary processing of a camera frame before sending a neural network over the top.

datenwolf Aug 6, 2019

> This is with tonnes of available memory (hence not a mapping error) and no exceptions thrown.

Could be a memory fragmentation issue

EDIT: Also since you're using C++ threads: You should really use move semantics, because right now you have two points of failure acting on the same thing: `new` operator may fail on creating the threads instance, and the underlying pthread_create may fail as well.

datenwolf Aug 6, 2019

Comment on my mentioning on move semantics, because I feel that this really ought to be pointed out:

In all the C++ standard library implementations of std::thread the only member variable of that class is the native handle to the system thread itself; there are no additional member variables! This means that the size of a std::thread object is equal to the size of a native handle, usually the size of a pointer, but sometimes smaller.

If you create std::thread by `new` you're essentially creating a pointer to a "possibly a pointer", which comes with all the inefficiencies associated with it: Double indirection, small size allocations tend to fragment memory. And at the end of the day to actually use it, you have to at least lob around that outside pointer around on the stack anyway.

So there is zero benefit at all of using dynamic allocation for std::thread. Don't do it! Just create the std::thread instances on the stack, they're just handled/pointers with "smarts" around them, and you can copy them around just efficiently as you can copy a pointer or an integer. Better yet, if you're not trying to "outsmart" the compiler you'll often get copy elision where applicable.

gmueckl Aug 6, 2019

sts::thread has its copy constructor deleted. This means that creating it on the heap is often your only option if you have to mix it with other types that don't handle noving properly because move semantics are strictly opt in.

datenwolf Aug 6, 2019

> std::thread has its copy constructor deleted

and for good reason

> your only option if you have to mix it with other types that don't handle moving properly

It's not the only option. The other option is to implement move semantics on the containing type. Properly implemented move semantics gives you assurance about ownership and that you don't use the tread interface inappropriately.

7 More Comments →

asveikau Aug 6, 2019

> Please see here: https://github.com/electric-sheep-uc/black-sheep/blob/0735de...

Creating and destroying threads in a tight loop (and blocking on their destruction, thereby reducing the point of having many) seems like a bad idea. Conceptually you have only maximum of two threads in any given part of this snippet, the one running the loop and the current instance of visThread. My guess is also that the loop thread spends most of its time waiting for the recently created threads to die. Why not have visThread only get created once and process a queue of events?

Anyway without additional evidence you potentially flagged a bug in the c++ standard library rather than pthread_create.

bArray Aug 6, 2019

Firstly, thanks for taking the time to look.

> Creating and destroying threads in a tight loop (and blocking on their destruction, thereby reducing the point of having many) seems like a bad idea.

The purpose is that it's pretty much 100% always running a visThread (it's a neural network that takes about 100ms per image). The pre-process on the other hand runs in about 10ms, but there's no reason why it can't be run in advance (+). The neural network can't really be run on multiple cores safely, but it does have OpenMP (parallel loops).

(+) It does create some latency in the output compared to the real world, but it's not a massive deal when it comes down to it.

> Why not have visThread only get created once and process a queue of events?

This is probably the best way to do this, but this was the lazy way with not too much overhead (I think) :)

> Anyway without additional evidence you potentially flagged a bug in the c++ standard library rather than pthread_create.

Possibly, I did run it through GDB and Valgrind, it reliably seemed to die in pthread_create, but that of course could have only been the trigger. It could also be the aggressive optimization [1].

[1] https://github.com/electric-sheep-uc/black-sheep/blob/0735de...

This item has no comments currently.

Preferences

Keyboard Shortcuts

Story Lists

Navigation

Miscellaneous