<jaredmm>
Why yes, I did spend an hour staring at code wondering why it was complaining about expecting a C pointer and ignoring that I was using the index buffer struct instead of the vertex.
leon-p has quit [Quit: leaving]
<jaredmm>
Is there a way to get the error output to say the name of the type that's used for the parameter? That is, if there's a declaration for `pub const SDL_AudioSpec = struct_SDL_AudioSpec;` have the error say SDL_AudioSpec rather than struct_SDL_AudioSpec?
ur5us has joined #zig
a92 has joined #zig
meatcar has joined #zig
<g-w1>
that stream was incredible. I was amazed at how it worked basically on the first try! Can't wait till it gets merged!
<andrewrk>
marler8997, yo
<marler8997>
yo
<andrewrk>
g-w1, protty is no joke :D
<marler8997>
I'm getting a segfault on my 24 core machine with the thread_pool.zig code
<marler8997>
I'm trying to root cause
marijnfs_ has joined #zig
marijnfs has quit [Ping timeout: 256 seconds]
<marler8997>
looks like kprotty has fixed it on my machine
<marler8997>
looks like there's another issue, a deadlock
<marler8997>
I reproduce it by running it in an infinite loop with: `while true; do zig run thread_pool.zig; done`
<marler8997>
my last deadlock took 492 invocations
<marler8997>
so I think this is an issue with the design
<marler8997>
the implementation depends on each worker calling wg.done()...
<marler8997>
so if a worker asserts or exits abnormally, it won't call wg.done() and then we are deadlocked
<marler8997>
the implementation needs to be able to detect when a worker has exited whether or not it got a chance to call wg.done()
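A minimal sketch of the failure mode, assuming a WaitGroup with the usual add/done/wait shape like the one in thread_pool.zig (field and function names here are hypothetical):

    // each worker is expected to signal completion on its way out
    fn runWorker(pool: *ThreadPool) void {
        defer pool.wait_group.done(); // never runs if the thread dies abnormally
        while (pool.getTask()) |task| task.run();
    }

    // meanwhile the main thread blocks until every done() has arrived
    pool.wait_group.wait(); // hangs forever if any worker exited without done()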
knebulae has quit [Read error: Connection reset by peer]
<andrewrk>
marler8997, did you manage to get in contact with protty?
<andrewrk>
he's in the Zig Projects Chat (voice chat) in loris's discord server: https://discord.gg/9TQcBqz
<marler8997>
yeah
<marler8997>
he figured out the first bug
<marler8997>
the second bug is just a problem with the wait group: basically, if any worker exits without calling WaitGroup.done, then we deadlock
<marler8997>
the only way I know of to avoid that is to use thread joining
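For contrast, a join-based shutdown, sketched against this era's std.Thread API (spawn takes the context first and returns a *Thread with a blocking wait()):

    var threads: [4]*std.Thread = undefined;
    for (threads) |*t| t.* = try std.Thread.spawn(pool, runWorker);
    // ... do the work ...
    // wait() joins the OS thread, so it returns once the worker has terminated
    // for any reason; no cooperation from the worker is required
    for (threads) |t| t.wait();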
<andrewrk>
I think that's an OK limitation
<marler8997>
ugh that's a horrible experience
<andrewrk>
well I suppose we would want abort() to happen if a thread panics
<marler8997>
it means whenever you get a crash, instead of an error the compiler will just hang
<andrewrk>
hmm that is already the case actually
<marler8997>
in his example, all we do is call std.debug.warn...which apparently will cause this deadlock
<andrewrk>
I mean that if a thread panics it will call abort()
<marler8997>
sometimes std.debug.warn is causing the program to exit without calling wg.done
<marler8997>
and the main thread stays alive
<marler8997>
so you just hang
<andrewrk>
I don't understand, how would std.debug.warn cause the program to exit
<marler8997>
I'm not sure
<marler8997>
I mean, it seems to cause the thread to exit (not the whole program)
a92 has quit [Quit: My presence will now cease]
<marler8997>
but having an intermittent failure built into the code is horrible, it results in intermittent deadlock failures that are impossible to debug
<marler8997>
I don't see this as acceptable at all
<marler8997>
it looks like the fix here is just to remove the wait group
<marler8997>
the main thread already waits for all threads to exit via the loop that calls `wait` on all the threads
<marler8997>
ok I think all we need to do is remove the reset event from the WaitGroup
<marler8997>
after we're done running, instead of calling wg.wait()...we check that the counter is 0
ur5us has quit [Ping timeout: 260 seconds]
<marler8997>
if it's not, then it means a thread has exited abnormally, so instead of deadlocking we can exit
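What that looks like as a sketch (field names hypothetical; the count is safe to read plainly once every worker thread has been joined):

    // main thread, after submitting the last task: join everything first
    for (self.threads) |t| t.wait();
    // every worker should have decremented the count exactly once on the way
    // out; a nonzero value means one exited abnormally, so fail loudly here
    // instead of blocking on wg.wait()
    std.debug.assert(self.wait_group.count == 0);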
ur5us has joined #zig
<andrewrk>
how do you know the thread is not still running?
<andrewrk>
how would you get a deadlock other than if a thread called exit() intentionally?
swills has quit [Quit: swills]
<marler8997>
ok I've narrowed down another bug
<marler8997>
andrewrk, (yes, I actually modified the code to call exit to reproduce that bug)
<marler8997>
the next bug manifests as a thread never waking up
<marler8997>
well, it never gets a task at least
<marler8997>
I can see via strace that it gets created and is waiting for an event on the futex
<marler8997>
but it never runs a task
<marler8997>
and the main thread exits, and this last thread keeps the main process from exiting
<marler8997>
I can reproduce more easily by decreasing the number of tasks as well
<marler8997>
interesting, I added a log at the beginning of runWorker, the thread that is keeping the process alive doesn't even print that
<marler8997>
FOUND IT
<marler8997>
self.is_shutdown
<marler8997>
it's not being synchronized, it either needs to be checked inside the mutex, or accessed through an atomic
<marler8997>
in runWorker I moved the check inside the mutex and now I seem to be able to run indefinitely
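The shape of that fix, sketched against the pool under discussion (field names hypothetical; this era's std.Mutex hands back a held token):

    fn runWorker(self: *ThreadPool) void {
        while (true) {
            const held = self.lock.acquire();
            // is_shutdown is now read under the same lock that the shutdown
            // path writes it under, so the worker can't miss the flag
            if (self.is_shutdown) {
                held.release();
                return;
            }
            // ... dequeue a task under the lock, then release and run it ...
            held.release();
        }
    }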
<marler8997>
don't you just love how 24-core cpus expose all the multi-threaded bugs :)
<andrewrk>
thanks :)
<marler8997>
potential optimization is to access is_shutdown through an atomic
<andrewrk>
the hot path is is_shutdown=false so we may as well acquire the lock IMO
<marler8997>
yeah it only optimizes the cold shutdown path as far as I can tell
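The atomic variant would look something like this sketch (assuming is_shutdown is a plain bool field):

    // shutdown path (writer):
    @atomicStore(bool, &self.is_shutdown, true, .SeqCst);

    // runWorker (reader), without taking the lock:
    if (@atomicLoad(bool, &self.is_shutdown, .SeqCst)) return;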
<marler8997>
the only other issue is the deadlock issue, but the fix is simple: remove the AutoResetEvent from the wait group, and instead of calling wait, std.debug.assert(wg.count == 0)
knebulae has joined #zig
<marler8997>
I haven't manually reviewed the code yet though, that was just the issue discovered by running the program
<marler8997>
my deadlock suggestion about removing AutoResetEvent could be wrong, I'll have to review the code to be sure
<marler8997>
based on my understanding, that's after we call thread.wait() on all the worker threads, but I'm not 100% sure
<andrewrk>
in update() we're not going to call thread.wait()
<andrewrk>
or are you suggesting to create and destroy the thread pool with every update() ?
<marler8997>
if that's the case then what I'm saying is nonsensical
<marler8997>
if we're keeping threads alive, then we need waitgroup
<andrewrk>
yeah we are because the thread pool will be passed to Compilation, it won't be in charge of creating it
<marler8997>
an enhancement would be to select on the thread handles and the reset event, so we could handle when a thread exits without notifying the wait group
ur5us has quit [Ping timeout: 264 seconds]
radgeRayden has joined #zig
ur5us has joined #zig
spiderstew_ has joined #zig
spiderstew has quit [Ping timeout: 258 seconds]
marnix has joined #zig
marnix has quit [Read error: Connection reset by peer]
marnix has joined #zig
marnix has quit [Read error: Connection reset by peer]
marnix has joined #zig
ur5us has quit [Ping timeout: 264 seconds]
hf69 has quit [Ping timeout: 256 seconds]
waleee-cl has quit [Quit: Connection closed for inactivity]
earnestly has quit [Ping timeout: 260 seconds]
<andrewrk>
marler8997, interesting, it looks like the new thread pool stuff deadlocked on aarch64-linux
marnix has quit [Ping timeout: 264 seconds]
marnix has joined #zig
cole-h has quit [Ping timeout: 260 seconds]
marnix has quit [Ping timeout: 272 seconds]
marnix has joined #zig
sord937 has joined #zig
marnix has quit [Read error: Connection reset by peer]
marnix has joined #zig
lucid_0x80 has joined #zig
<Tharro>
getting a compiler error when trying to sort an i64 slice (adapters) using zig 0.7.1, any clue what I'm doing wrong? error: unable to evaluate constant expression
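Without seeing Tharro's code it's a guess, but the 0.7-era std.sort.sort takes a context argument plus a comptime-known comparator, and a call that compiles looks like this minimal sketch:

    const std = @import("std");

    pub fn main() void {
        var adapters = [_]i64{ 3, -1, 2 };
        // the comparator argument must be comptime-known
        std.sort.sort(i64, &adapters, {}, comptime std.sort.asc(i64));
        std.debug.warn("{}\n", .{adapters[0]});
    }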
hnOsmium0001 has quit [Quit: Connection closed for inactivity]
gpanders has quit [Ping timeout: 268 seconds]
rzezeski has quit [Quit: Connection closed for inactivity]
<dch>
in the zig ci jobs, there's a big llvm tarball we fetch from ziglang.org/deps/llvm+clang+lld-11.0.0-x86_64-freebsd-release.tar.xz
<dch>
do we *explicitly* need this to test zig's own build?
<dch>
it's much easier to grab llvm11 and ninja from prebuilt ports in this case
<dch>
at the risk that periodically llvm11 will become llvm 11.1 or something and zig might implode
<dch>
and if we do need this tarball, where/how is it generated? I'd like to build an aarch64 one
gpanders has joined #zig
earnestly has joined #zig
scientes has joined #zig
fwg has quit [Quit: .zZ( sleep is healthy )]
<FireFox317>
dch, I think we have this tarball since it provides a static llvm+clang+lld whereas the package managers almost always provide dynamic ones. I'm pretty sure these are generated using zig-bootstrap: https://github.com/ziglang/zig-bootstrap
gpanders has quit [Ping timeout: 272 seconds]
<dch>
FireFox317: cool - thanks for the url. I think that for at least testing builds, I can safely use the llvm11 port.
<dch>
`pkg list llvm11 |egrep .a\$ | wc -l` = 262 so freebsd base llvm11 appears to have what's needed anyway
<dch>
maybe this was from a time when the latest llvm wasn't available, or zig needed a patched llvm
<ifreund>
yeah if your package manager distributes the .a files you should be good to go with those
fwg has joined #zig
gpanders has joined #zig
fwg has quit [Quit: bye bye]
gpanders has quit [Ping timeout: 268 seconds]
nycex- has joined #zig
nycex has quit [Ping timeout: 240 seconds]
leon-p has joined #zig
gazler has quit [Read error: Connection reset by peer]
fwg has joined #zig
pfg_ has joined #zig
wootehfoot has joined #zig
pfg_ has quit [Quit: Leaving]
wootehfoot has quit [Quit: Leaving]
fwg has quit [Quit: .zZ( sleep is healthy )]
rzezeski has joined #zig
fwg has joined #zig
<dch>
when I'm running zig's own tests, I get an `error: No 'build.zig' file found, in the current directory or any parent directories.`
<dch>
pretty sure this is related to not finding appropriate llvm or some cmake-induced mistake on my part
<pixelherodev>
There's a new LLVM backend for stage2?
<pixelherodev>
I really haven't been paying as much attention to Zig lately :(
notzmv has joined #zig
kenran has quit [Quit: leaving]
Akuli has quit [Quit: Leaving]
<justin_smith>
is there a trick for a function that can error (via alloc...) and also calls itself? - the compiler is complaining that it cannot resolve the inferred error set
remby has joined #zig
<justin_smith>
(I worked around it for now by using `if (success) |_| {} else |err| { base_case }`)
donniewest has quit [Quit: WeeChat 3.0]
<andrewrk>
justin_smith, the trick is to use an explicit error set. but it is planned to make this Just Work
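A sketch of that workaround, assuming a recursive function that allocates (0.7-era *Allocator):

    const std = @import("std");

    // spelling the error set out explicitly breaks the inference cycle
    // that the self call would otherwise create
    const Error = std.mem.Allocator.Error; // i.e. error{OutOfMemory}

    fn build(allocator: *std.mem.Allocator, depth: usize) Error!void {
        const buf = try allocator.alloc(u8, 8);
        defer allocator.free(buf);
        if (depth == 0) return;
        return build(allocator, depth - 1); // the recursive call is now fine
    }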
<leeward>
Ooh, Zig makes the age-old ambiguity of I2C slave addresses go away! Just take a u7 and it's clear what the address representation is.
<andrewrk>
neat, although it will still let you pass a u7 where a u8 is expected
<leeward>
Right, but I can just make my function take a u7 and it'll be fine.
wootehfoot has quit [Quit: Leaving]
<andrewrk>
yep
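A sketch of both points, using a hypothetical I2C helper:

    const std = @import("std");

    // the u7 parameter documents that this wants the unshifted 7-bit address
    fn busByte(addr: u7) u8 {
        // u7 -> u8 widens implicitly (andrewrk's caveat); here it's used
        // deliberately to build the 8-bit on-the-wire byte with R/W cleared
        return @as(u8, addr) << 1;
    }

    pub fn main() void {
        const dev: u7 = 0x3c; // example device address
        std.debug.warn("{x}\n", .{busByte(dev)});
    }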
xackus__ has quit [Ping timeout: 260 seconds]
factormystic has quit [Read error: Connection reset by peer]
factormystic has joined #zig
ur5us has quit [Ping timeout: 260 seconds]
casaca has quit [Remote host closed the connection]
ur5us has joined #zig
<marler8997>
andrewrk, once we enable multithreading, a new class of errors becomes possible. What do you think about threads having access to stack memory from other threads? Should we try to avoid this?
<andrewrk>
I'm interested in exploring language features to help avoid these classes of errors
<marler8997>
woooo, that's a deep topic
<andrewrk>
yeah
<marler8997>
in the interim though...with the language we have today, do you think we should avoid threads accessing memory that lives on the stack of another thread
<marler8997>
for example, say the main thread has allocated the ThreadPool metadata on its stack...should we allocate it on the heap instead to avoid this?
casaca has joined #zig
marijnfs has quit [Quit: WeeChat 2.8]
<andrewrk>
what's the idea behind this?
<marler8997>
if a thread dies, its stack goes away, so any thread that has a reference to its stack will be using freed memory
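A sketch of the heap-allocation workaround marler8997 describes (hypothetical ThreadPool and init, 0.7-era *Allocator):

    // instead of: var pool: ThreadPool = undefined; // in the creating frame
    const pool = try allocator.create(ThreadPool);
    defer allocator.destroy(pool);
    try pool.init(allocator);
    // workers may now hold a *ThreadPool that doesn't point into any thread's
    // stack; destroy() must still happen only after every worker is joined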