<mikdusan>
xackus: with gcc it can blow up 1.5GB+ per job. gcc10.2 also on linking is killer, you need the awesome option -DLLVM_PARALLEL_LINK_JOBS=1 to solve that issue
<andrewrk>
xackus, limit the link jobs to 1
<xackus>
thanks!
<andrewrk>
-GNinja -DLLVM_PARALLEL_LINK_JOBS=1
<xackus>
yeah, it died during linking
<xackus>
I need to start over because it keeps looping trying to configure
<mikdusan>
yeah if oom stops your build, you must start over
<xackus>
so with -DLLVM_PARALLEL_LINK_JOBS=1 16 GB and 6 cores should be fine?
<xackus>
i don't want to start over again
<mikdusan>
in my experience yes
<xackus>
fingers crossed
<andrewrk>
if you limit to 1 link job, you can just let ninja do the parallelization
<mikdusan>
xackus: reproduces for me. pretty much same backtrace. added as comment to your gist
<xackus>
I'm still compiling, [3105/4314]. I pulled just in case it got fixed.
<mikdusan>
hey amazon s3 cloudfront experts, if a file is uploaded WITHOUT header "cache-control" is it effectively uncached?
<mikdusan>
ie: no caching behavior
<mikdusan>
xackus: I ran `update_cpu_features` on your branch and it fails. did you have success with that part of upgrading-llvm procedure for zig?
<xackus>
oof, I forgot about the procedure
<mikdusan>
so there is something different, whether or not it is causing the segfault I don't know yet
<mikdusan>
what I did was build tools/update_cpu_features on zig master. use the produced .exe as so, paths local to my system:
<g-w1>
oh wait, nvm andrewrk probably already did in his branch
<noam>
it's been accepted lol
<noam>
and I didn't even notice :P
<g-w1>
oh yeah, i didn't either
<g-w1>
i really like this push towards less lazy analysis
<noam>
It's less unlazy and more *intelligently* lazy?
<noam>
ohhh, there is one thing I need to figure out in the current design - detecting that something is used on e.g. another platform...
<g-w1>
yes, the benefit of doing it on the astgen without types is that this concept does not exist there
<noam>
Ah, actually it should be trivial, come to think of it
<noam>
Just need the comptime engine to talk to the queue during usage analysis :)
<gracefu>
does full decl dependency graph mean parallel Sema is possible?
<noam>
`const a = if(isWindows()) b else c;` the comptime engine would say "c is seen but doesn't need to be analyzed"
<noam>
gracefu: parallel sema is possible even without a full dependency graph
<noam>
To put it in grammatical terms: to analyze a function, you don't need to know the bodies of its dependencies, only the FnProtos
<noam>
if you know that a is of type `fn()void`, you can analyze a function which calls a without needing a to already be analyzed
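A minimal sketch of the point noam makes here, assuming a toy model in which each function has a prototype and a body made only of calls. The names `PROTOS`, `BODIES`, `check_body` and `check_all` are hypothetical and come from neither zyg nor stage2. Each body is checked against the prototypes of its callees only, so the bodies can be checked in any order or concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical prototypes: name -> (parameter types, return type).
PROTOS = {
    "a": ((), "void"),
    "b": (("u32",), "bool"),
    "main": ((), "void"),
}

# Hypothetical bodies: name -> list of calls as (callee, argument types).
BODIES = {
    "a": [],
    "b": [("a", ())],
    "main": [("a", ()), ("b", ("u32",))],
}


def check_body(name):
    """Check one body using only the prototypes of the functions it calls."""
    errors = []
    for callee, arg_types in BODIES[name]:
        params, _ret = PROTOS[callee]
        if params != arg_types:
            errors.append(f"{name}: bad call to {callee}")
    return name, errors


def check_all():
    # No body reads another body, so every body can be checked concurrently.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(check_body, BODIES))
```

Calling `check_all()` returns a mapping from each function name to its list of errors; none of the workers ever waits on another body being analyzed.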
<gracefu>
hmm right
<gracefu>
i haven't thought enough about it, i thought there would be issues with comptime shit
<noam>
if you were asking after stage2, then I'm 100% sure they intend to parallelize
<gracefu>
i guess not?
<noam>
if you're asking after zyg, I'm 100% sure I intend to parallelize ;)
<gracefu>
andrew wrote this in stage2-meeting on discord: "Some day Sema might be parallelized per Decl. But that will require being careful. The other stuff is embarrassingly parallel"
<g-w1>
it is harder to parallelize sema than astgen, but it's still possible. iirc that's why a Sema is a total struct so that it can be thread safe
<gracefu>
so i believed him :P
<gracefu>
i see
<noam>
gracefu: there's two kinds of comptime functions, broadly speaking. 1) functions which don't modify external mutable state, and 2) functions which do. The former category are self contained: every call to `fn isFoo(comptime a: u32) bool` will always give the same result for any a, for instance, and can thus be calculated independent of call site to reduce work (memoization). For functions that *do* modify mutable state, you need to effectively inline it at every call site *anyways*, so it has no impact on parallelization
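The memoization described for the first category can be sketched as a cache keyed on the function and its comptime arguments. This is an illustration only; `comptime_call`, `memo` and `calls` are hypothetical names, and `is_foo` stands in for the `isFoo` example from the message:

```python
memo = {}   # (function name, args) -> result
calls = []  # records which evaluations actually ran


def comptime_call(fn, *args):
    """Evaluate a pure comptime function once per distinct argument tuple."""
    key = (fn.__name__, args)
    if key not in memo:
        calls.append(key)
        memo[key] = fn(*args)
    return memo[key]


def is_foo(a):
    # A pure function: the result depends only on the explicit input.
    return a % 2 == 0
```

Because the key includes the arguments, `is_foo(4)` and `is_foo(5)` are separate cache entries, while a second `is_foo(4)` from any call site reuses the first result.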
<noam>
g-w1: how is it harder??
<noam>
Ah, as a consequence of stage2's design?
<gracefu>
noam: yeah, re (2) in fact it's important calls to something like `fn makeType(comptime a: u32) type` give the same result, which is what i had in mind by "require being careful"
<g-w1>
i think its harder because all astgen requires is ident lookup (havent looked deep into the recent usingnamestuff but i assume this is the same), while sema requires type info. so you might have to call/analyze a different function to analyze the current function. i guess a call graph makes this easier
<noam>
gracefu: that falls under 1
<gracefu>
oh yeah i misread, that's (1)
<noam>
the only state makeType can access there is an explicit input, which is comptime known
<noam>
For every value of a, there is a single valid value of makeType(a)
<gracefu>
yes
<gracefu>
but my point stands, i just referenced the wrong point
<noam>
The cases I was talking about are stuff like e.g. std.mem.copy, as ifreund pointed out to me
<noam>
andrewrk: As I mentioned to g-w1, that doesn't affect zyg - I use a different mechanism to track unused decls, for instance
<g-w1>
im not sure how it is a language proposal? it just seems like a way to solve 335
<noam>
^
<noam>
andrewrk: that seems like it's purely an implementation detail
<g-w1>
maybe its a language proposal because now errors *will* show for stuff inside unused functions since they are analyzed?
<gracefu>
noam: if process/thread A is analyzing some decl F(x) and it encounters two types G(x) and H(x), presumably it adds both to some job queue so they can execute analysis of both G(x) and H(x) in parallel, right?
<noam>
yep
<noam>
that's exactly what I'm typing up rn
<gracefu>
assuming that, if now thread B/C are analyzing G(x)/H(x), and they both encounter I(x), wouldn't you now need to block a process?
<noam>
gracefu: the only place where dependencies are unknown is usage analysis
<noam>
When type analysis is performed, a full comptime dependency graph is available
<noam>
So the job queue will never tell a thread to analyze G or H until after I is analyzed
<gracefu>
oh! i see, so you can just add I(x) to the queue during *usage* analysis, and so we don't need to know the result of I(x) yet
<gracefu>
this builds up the usage graph, then type analysis will proceed in the reverse direction after all usages are tracked in the graph
<noam>
Not quite - all the jobs are added to the queue immediately
<noam>
Ah
<noam>
Yeah, exactly
<gracefu>
makes sense now
<noam>
The brute force option is for the queue to simply do (for node in nodes: if node is missing dependencies, continue, else spawn job) else (error: recursive dependencies found)
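The brute force loop described above can be sketched roughly as follows, assuming the dependency graph is already complete. `schedule` is a hypothetical name, and the "waves" it returns are just one way to show which jobs could be spawned together:

```python
def schedule(deps):
    """deps maps each node to the set of nodes it depends on.

    Returns a list of waves; every node in a wave has all of its
    dependencies in earlier waves, so a wave could run in parallel.
    """
    done = set()
    pending = set(deps)
    waves = []
    while pending:
        # A node is ready once none of its dependencies are missing.
        ready = sorted(n for n in pending if deps[n] <= done)
        if not ready:
            raise ValueError("recursive dependencies found")
        waves.append(ready)
        done.update(ready)
        pending.difference_update(ready)
    return waves
```

With the running example, `schedule({"F": {"G", "H"}, "G": {"I"}, "H": {"I"}, "I": set()})` puts I in the first wave, G and H in the second, and F last, so no worker is ever told to analyze G or H before I is done.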
<gracefu>
actually there's one more detail, how would this work for I(x+1)? sometimes you need to do comptime stuff before adding the usages right?
<noam>
Another option is to have a list of leaves, too
<noam>
I(x) and I(x + 1) are considered different functions by everything after usage analysis
<gracefu>
F(0) -> G(0), H(0) -> I(1) for example
<noam>
> Compile-time calls with different arguments are treated as separate dependencies.
<noam>
This works fine with both cases of comptime calls I mentioned earlier :)
<gracefu>
hm, i mean if say G(x) did I(x) and H(x) did I(x+1), it's not clear syntactically whether two uses of G and H would use the same I right?
<mikdusan>
andrewrk: feel free to nuke these 2 old tarballs for macos-aarch64 from ziglang.org/deps/:
<noam>
gracefu: ahh. At the moment, I was planning on assuming they were distinct (and just wasting a few cycles if they weren't)
<noam>
by treating them as distinct, the worst thing that can possibly happen is work gets wasted
<gracefu>
it's important you actually use the same result though, because if I returns a type, both need to use the exact same struct
<noam>
They still will
<noam>
Type deduplication is a nice thing ;)
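One way to picture the deduplication noam refers to is interning: a table that hands back one shared object per distinct struct shape. This is only a sketch of the idea as discussed for zyg, not a statement about how Zig itself decides that two struct types are the same; `intern_struct` and `_interned` are hypothetical names:

```python
_interned = {}


def intern_struct(fields):
    """fields: iterable of (name, type) pairs.

    Returns the single shared object for that field list, creating it
    on first use.
    """
    key = tuple(fields)
    return _interned.setdefault(key, {"fields": key})
```

Two workers that independently build `struct{value: i32}` would both end up holding the same object, so wasted evaluation does not lead to two distinct types.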
<gracefu>
i think there's another spec going around re: dedup but i can't remember it off the top of my head :P
<noam>
But, now that I'm thinking about it, it's probably better to just invoke the engine on x + 1 first
<noam>
Hmmm... usage analysis can queue up dedicated comptime execution tasks for comptime blocks within a decl, maybe?
<noam>
Or, after scoping analysis rather
<gracefu>
in my implementation of (a subset of) racket, we don't distinguish usage with just compile time stuff, we just interpret until we see something and then do it lazily, but we do it single threaded so it's not an issue
<gracefu>
noam: i think that idea of interleaving usage with comptime execution means you need to be careful :P
<noam>
Not quite
<noam>
Just add yet another graph ;p
<noam>
Not even a graph, per se; a list of comptime blocks / exprs
<gracefu>
maybe you could just put yourself back on the queue if you encounter something that's currently being analyzed by someone else
<noam>
Then, after scoping analysis runs (so that e.g. `comptime { a.b(c.d); }` is COMPTIME_BLOCK{ CALL{DECL, DECL}}), just run each one and patch the tree with the results
<noam>
gracefu: not even a concern ;)
<gracefu>
so with the running example, G(0) sees I(1) and starts doing I(1), then at the same time H(0) sees I(1) but sees someone started work, so it puts itself H(0):continuation back on the queue with a wait on I(1)?
<gracefu>
idk :P
<noam>
That's fundamentally not how the analysis works :)
<gracefu>
probably :P
<noam>
if worker 1 is analyzing A and sees that it needs B, it never starts doing B
<noam>
It tells the analysis manager to add B to the queue if it isn't already present or done
<noam>
then it keeps going with A
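A single-threaded sketch of the behaviour described in the last three messages, under the assumption that usage analysis only records which decls are needed. `AnalysisManager`, `request` and `usage_analysis` are hypothetical names, not zyg's API:

```python
from collections import deque


class AnalysisManager:
    def __init__(self):
        self.queue = deque()
        self.known = set()  # decls that are already queued or done

    def request(self, decl):
        """Queue decl unless it is already present or done."""
        if decl not in self.known:
            self.known.add(decl)
            self.queue.append(decl)


def usage_analysis(root, uses):
    """uses maps each decl to the decls it references directly."""
    manager = AnalysisManager()
    manager.request(root)
    order = []
    while manager.queue:
        decl = manager.queue.popleft()
        order.append(decl)
        for dep in uses.get(decl, ()):
            # Record the need and keep going; dep is not analyzed here.
            manager.request(dep)
    return order
```

The worker handling A never descends into B. It only tells the manager that B is needed, which is why nothing has to block or be re-entrant at this stage.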
<mikdusan>
how can it keep going? it has to wait or be re-entrant?
<gracefu>
you might need B because you can do J(I(1))
<noam>
mikdusan: no it doesn't :)
<gracefu>
so you don't know what usage of J you're invoking until you have the result of I(1)
<noam>
I don't track transitive dependencies, only direct
<noam>
Ah, hm, B(C(1)) is a good point. *thinks*
<noam>
(this is why I'm writing it up and discussing before working more on the implementation :)
<gracefu>
haha yeah it'd be a big problem if you were halfway in :^)
<noam>
I've written a lot of all the analyzers
<gracefu>
that happened to us 3 times for racket :(
<gracefu>
so many rewrites /o\
<noam>
but they're deliberately crappy versions meant to be just good enough to sustain proper versions of each being written one at a time
<noam>
Basically, I have crappy codegen, crappy expr analysis, etc - now, I can write all of the usage analysis at once and all the existing tests will keep working, with some light patching to the crappy analyzers
<gracefu>
crappy codegen is pretty much on par with stage2
<noam>
Then, once that's done, I can do a fresh write of scoping analysis and get it all done - then type, then expr, etc
<noam>
Ha, no, stage2 has me beat slightly, I think
<gracefu>
:P
<noam>
Well, maybe not - I cheat and only use registers XD
<gracefu>
just make this like 2 functions more complicated so we have a case for parallelization not knowing which C(x) we're calling
<noam>
In this case, C doesn't depend on D - at all
<noam>
We're calling C(D(0)) here - and...
<noam>
> But, now that I'm thinking about it, it's probably better to just invoke the engine on x + 1 first
<gracefu>
would you not say we're using C(0)?
<noam>
We are - but there's two main options, which are both equally valid, I think
<gracefu>
right
<noam>
a) treat it as C(D(0))
<noam>
This means that it will not be memoized with C(0) - though the result can be deduplicated anyways, if it's e.g. returning a struct
<noam>
b) execute D(0) first, and treat the function as C(RESULT_OF{CALL{D, 0}})
<noam>
I'm leaning towards b), but there's still a couple options to think about
<gracefu>
you'll want to dedup transitively e.g. in B(C(D(0))) vs B(C(0)) where B also returns another struct
<gracefu>
so i'm not a fan of (a)
<noam>
b1) usage analysis marks down C(D(0)) as a dependency of A, and marks D(0) as needing comptime patching
<gracefu>
yeah
<noam>
b2) usage analysis invokes the comptime engine on D(0) immediately and marks down C(0) as a dependency
<noam>
I think b2 is the way to go, here
<noam>
Wait, oops
<noam>
I think b1 is the way to go
<noam>
After scoping analysis, the analysis manager queues up tasks for all comptime exprs and blocks using the info from usage analysis
<gracefu>
so in (b2), usage analysis and semantic analysis (a requirement before comptime interpreting i presume) are now coupled right
<noam>
That's the main issue with b2, yeah
<noam>
Though usage analysis is part of sema - I call it "semantic usage analysis" technically
<noam>
but yeah, b1 is definitely the best option here
<gracefu>
for (b1) im again thinking of B(C(D(0))) vs B(C(0)), where you might end up marking (temporarily) two different usages as dependencies in two locations, and needing to make them the same usage again after comptime patching
<noam>
The sema manager runs the comptime engine on D(0), and patches the index of the call in C(D(0)) with the result (the deduplicated integer literal zero)
<noam>
No
<noam>
gracefu: that's still not a concern
<noam>
B(C(D(0))) is identical to B(C(0)) in b1
<noam>
B(C(D(0))) is marked as a dependency of a, and C(D(0)) is marked as needing execution
<gracefu>
i mean during usage analysis, they're syntactically not the same right
<noam>
So?
<gracefu>
so you'd need to wait until some time later when the results of C(...) come back and it turns out they were the same
<noam>
C(D(0)) would be executed here
<gracefu>
wait, was that not (b2)?
<noam>
b2 was immediately executing
<noam>
b1 means marking *one* dependency and *one* expression as needing execution
<noam>
no matter how complex the expression is
<noam>
so B(C(D(0))) becomes "A depends on 'B(C(D(0)))', and execute C(D(0)) before going on to type analysis"
<gracefu>
ah, and after that expression gets evaluated, would you only resolve the dependency then?
<noam>
Exactly
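Option b1 as discussed can be sketched like this: the dependency is recorded syntactically, its comptime arguments are evaluated later, and only then is it resolved to a concrete key. The tuple encoding, `FUNCS`, `evaluate` and `resolve_dependency` are all assumptions made for the illustration, and `D` is modelled as returning its input so that D(0) is 0 as in the conversation:

```python
# Hypothetical comptime function standing in for D.
FUNCS = {"D": lambda x: x}


def evaluate(expr):
    """Evaluate a literal or a ("call", name, args) tree of known functions."""
    if not isinstance(expr, tuple):
        return expr
    _tag, name, args = expr
    return FUNCS[name](*(evaluate(a) for a in args))


def resolve_dependency(call):
    """Turn a syntactic call into a (name, evaluated args) key."""
    _tag, name, args = call
    return (name, tuple(evaluate(a) for a in args))
```

After resolution, `("call", "C", [("call", "D", [0])])` and `("call", "C", [0])` both map to the key `("C", (0,))`, so the two spellings end up as one dependency.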
<gracefu>
makes sense
<noam>
Usage analysis also doesn't resolve e.g. identifiers
<noam>
Scoping analysis then does hotpatching to replace the index of the identifier in the tree with the index of the resolved value
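A rough picture of that hotpatching step, assuming the tree is a flat list of nodes that refer to each other by index. The node layout and `patch_identifiers` are hypothetical and only meant to show the index swap:

```python
def patch_identifiers(nodes, scope):
    """nodes: flat list of dicts; children are stored as indexes into it.

    scope maps an identifier name to the index of the decl it resolves to.
    An unknown name raises KeyError, standing in for a compile error.
    """
    for node in nodes:
        if node["kind"] == "call":
            target = nodes[node["callee"]]
            if target["kind"] == "ident":
                # Replace the identifier's index with the resolved decl's index.
                node["callee"] = scope[target["name"]]
    return nodes
```

Before patching, the call node points at an identifier node named `b`. Afterwards it points directly at the decl, so later stages never need to look names up again.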
<gracefu>
shares some parallels with the "suspend yourself and wait on C(D(0))" idea except you actually keep going with *usage* analysis, and suspend other stuff instead
<noam>
Doesn't even require suspending anything
<gracefu>
yeah
<gracefu>
cause the other stuff isn't running yet :P
<noam>
It's one of the benefits of separating the stages
<noam>
It's definitely slower than it could be
<noam>
I'm willing to bet I could speed up the compiler by upwards of 30% if I really wanted to
<gracefu>
suspending might be slower
<noam>
Sure, but I mean keeping the stages separate instead of doing a single large pass
<gracefu>
so separating it out into stages where each stage doesn't need to suspend might scale better
<noam>
by splitting sema into small pieces (usage, scope, comptime, types, exprs), it makes the code a lot more readable
<noam>
Yeah, that's the other really nice thing
<g-w1>
asking again from discord, because i feel like people on irc might have something to add: is there a way to make magic init init the pins that are used later in the program (comptime abuse allowed!)
<noam>
Each stage is, as andrewrk likes to say, embarrassingly parallel
<noam>
If input_pin is partly comptime, and flips a bit to indicate usage, then that bitarray becomes immutable runtime data, magic_init could use it. Maybe.
<noam>
But!
<noam>
You'd have to be *explicit*
<noam>
e.g. input_pin(&used_pins, porta, 0)
<noam>
which kinda defeats the *point*
<gracefu>
hmm, tried to summarize and ran into a roadblock again. from what it sounds to me now, usage stage makes a usage graph that only indicates syntactic dependencies. from there, you do a comptime run from the top down (cause the lower levels need to know the args passed in) which turns the syntactic usage graph into a "deduplicated/actual" usage graph that's supposed to come before type analysis. the problem is in order to execute something like C(D(0)), wouldn't we have to first execute type analysis of C and D? so it might need a bigger graph where the comptime-eval dependencies and the type analysis dependencies are combined (they're coupled again, but now with comptime-eval instead of usage)
<g-w1>
yeah makes sense, thanks noam
<gracefu>
oof, that was longer than it needed to be
<noam>
gracefu: close, but a tad off - but, so was I :)
<noam>
Usage does track actual semantic dependencies
<noam>
e.g. it knows a depends on B - it's only when B is comptime that it gets tricky, because the semantic meaning of B(C(D(0))) cannot be known without fully analyzing C(D(0))
<gracefu>
yes exactly
<noam>
so `a = fn()void{b();};` it knows a depends on something called "b"
<noam>
It just doesn't understand what "b" *is*
<noam>
Scoping analysis then resolves b into a function (or errors, or whatever else)
<noam>
gracefu: one important note I think you overlooked: after scoping analysis, before the C(D(0)) gets evaluated, ALL comptime functions get passed through type and expr analysis
<gracefu>
that's not possible because type analysis also depends on comptime evaluation
<gracefu>
see B(C(D(0))) case
<noam>
The type analysis is able to invoke the comptime engine
<gracefu>
where C returns a type
<noam>
and detects recursive dependencies
<noam>
I noted that concern and a rebuttal in the doc :)
<gracefu>
but the comptime engine needs type analysis to run, no?
<noam>
That's why comptime functions go through type analysis before the comptime engine is run - but, again, this is after usage analysis, so we can go bottom up!
<noam>
const a = fn() type... const b: a()
<noam>
b depends on a *at comptime*
<gracefu>
they can be recursive too i'm pretty sure
<noam>
I reject recursive functions
<gracefu>
oh
<gracefu>
ok then :^)
<noam>
I think that's technically a violation of the spec, but I'm not sure
<gracefu>
nice
<andrewrk>
yeah comptime function calls can be recursive, but they are subject to branch quota
<andrewrk>
and a comptime function call counts as a branch
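The rule andrewrk states can be modelled with a counter that every comptime call increments. This is a sketch under assumptions: `Comptime`, `BranchQuotaExceeded` and `countdown` are hypothetical names, and the default quota of 1000 is picked for the example rather than taken from this conversation:

```python
class BranchQuotaExceeded(Exception):
    pass


class Comptime:
    def __init__(self, quota=1000):
        self.quota = quota
        self.branches = 0

    def call(self, fn, *args):
        # Each comptime function call counts as one branch.
        self.branches += 1
        if self.branches > self.quota:
            raise BranchQuotaExceeded(f"exceeded {self.quota} branches")
        return fn(self, *args)


def countdown(ct, n):
    # Terminating recursion: fine as long as it stays within the quota.
    return 0 if n == 0 else ct.call(countdown, n - 1)
```

A recursion that terminates within the quota succeeds, while one that runs too deep (or never terminates) is stopped with an error instead of hanging the compiler.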
<noam>
andrewrk: can the types be recursive?
<noam>
a: b(), b:a() ?
<andrewrk>
they can reference each other, as in `const Node = struct{ node: *Node };`
<noam>
The only case where that makes sense is if both return the literal value `type`, which seems like the kinda thing which has no real world use
<noam>
Yeah, that I know
<noam>
Yeah, to clarify, what I reject is the case where the *types* depend on recursive function invocation
<noam>
if the type of a depends on calling b, and the type of b depends on calling a, I currently reject that
<andrewrk>
that should be handled as any comptime function call. the fact that it is for a type is irrelevant
* noam
double checks logic
<xackus>
rip. still got oomkilled
<noam>
andrewrk: I mean, sure, but... the only possible case where it's semantically valid is if both a and b return `type`
<gracefu>
that's when they cause the most trouble yeah
<noam>
So it's less "I'm going to make sure to reject it" and more "I have no intention of making sure this works because there's no valid use case"
<noam>
My goal with zyg is 100% conformance *for practical cases*. If something is technically in violation of the spec, but no use case exists where it matters, I'm probably not going to bother fixing it
<gracefu>
oh you can count on some comptime DSL library to break your assumption
<gracefu>
imo it's a matter of when
<andrewrk>
xackus, argh, during what?
<noam>
To have two functions both returning the literal value type?
<xackus>
Linking CXX executable bin/clang-13
<gracefu>
to have recursive comptime functions
<noam>
That's already something I intend to support
<g-w1>
wait llvm13 is already in dev?
<noam>
gracefu: this is specifically with recursive *type-functions*
<noam>
Not recursive types. Not recursive functions. Recursive *type-functions*
<gracefu>
to have types* depend on recursive function invocation, too, like i said DSL libraries can get pretty crazy
<noam>
There is only one function I can think of which fits that constraint: `fn()a(){return type;}`
<noam>
definitionally, if you have `a = fn() b()`, the return type of b MUST be type
<noam>
thus, if `b = fn() a()`, a() MUST be type
<gracefu>
you can't do that i think, if you're returning type then it should be fn()type{...}
<xackus>
why would llvm13 not be in dev?
<noam>
gracefu: you can use a fn to define the type
<noam>
that's what I'm talking about
<gracefu>
anyway, i gotta stop here, gotta study for an exam
<andrewrk>
g-w1, similar to how the master branch zig version is 0.8.0+something, llvm master branch is at 13
<noam>
gracefu: good luck :)
<gracefu>
thanks
<g-w1>
ah i see, all these naming schemes are so confusing :/
<andrewrk>
agreed
<xackus>
I got so used to it, I didn't see why you were confused
<noam>
I do see how it can be tweaked to make something more meaningful that would be hard to catch, though
<andrewrk>
noam, I would expect "error: dependency loop detected" for this source
<noam>
That's... exactly what I was saying
<noam>
andrewrk: > that should be handled as any comptime function call. the fact that it is for a type is irrelevant
<andrewrk>
recursion implies termination
<noam>
Ahhhhhh, okay
* noam
facepalms
<noam>
Yeah, really should have been more careful to clarify
<andrewrk>
I'm not trying to be pedantic
<noam>
I know, this was 100% on me
<andrewrk>
just didn't understand the question
<noam>
I was unclear
<noam>
This was the use case I was talking about as being rejected
<andrewrk>
makes sense
<noam>
I *do* need to tweak this to better handle terminating comptime recursivity, though
<gracefu>
noam: see this paste, i couldn't get it to work so i asked on discord, but they couldn't get it to work either (might be a stage1 bug) https://zigbin.io/64dd2c
<gracefu>
this is what i meant by recursive function returning types
<gracefu>
see this very similar snippet if you want the invocation to happen at the type signature https://zigbin.io/77e03f
<gracefu>
anyway now that the wait is done i'm suspending zig again xP
<noam>
gah, who runs zigbin? Is it OSS? it's *atrociously* themed if you don't have JS
<noam>
gracefu: I don't get it - is that function even *called*?
<gracefu>
noam: yeah, main references and calls Wrap
<gracefu>
Wrap calls Wrap
<noam>
... link me that in a raw paste?
<noam>
I suspect zigbin is showing the wrong thing without JS
<gracefu>
hence my question, we need at least some kind of type analysis so we can do 2-1 = 1
<gracefu>
but at the same time this is done during type analysis
<noam>
hmm, in this case technically not
<gracefu>
right
<noam>
i and 1 are both comptime_int, but for argument's sake let's have 1 there be a function call, which returns a comptime_int but uses a function to define the type or some such
<gracefu>
re test suite: not sure if you want to wait until we open some issue and ask what the intended behaviour is first, though
<noam>
gracefu: I meant zyg's test suite, not stage1's :P
<gracefu>
i know
<noam>
Nah, it's an important test case for the analysis anyways
<gracefu>
but i presume you want zyg to implement zig's spec as closely as possible right
<gracefu>
which is currently up in the air
<noam>
It's easier to tweak it to do the right thing later if it does something approximating the right thing *now*
<gracefu>
true that :P
<noam>
Anywho, to get back to your question, yes it depends on type analysis
<noam>
but again, type analysis and the comptime engine are a bit entangled
<noam>
(but only a bit!)
<gracefu>
yeah
<noam>
after scoping analysis, comptime functions get run through type analysis
<gracefu>
at the very least, separating usage from type/comptime is possible, probably
<noam>
again, from the bottom of the dependency graph upwards
<gracefu>
(according to our discussion earlier)
<noam>
that's definitely possible, and I'm doing it! :)
<noam>
comptime is also separate
<gracefu>
but disentangling type/comptime seems to be hard/impossible
<gracefu>
haha
<noam>
it's less that they're entangled internally, and more that they recursively call each other?
<gracefu>
yeah
<noam>
and even that's not fully accurate
<noam>
but anyways, let's focus on the one case first
<gracefu>
yeap
<noam>
again, from the bottom of the dependency graph upwards, we start typing comptime functions
<noam>
so in the Wrap example, everything is straightforward until the branch...
<gracefu>
if you get Wrap to work, try and make a version of the test case where you want to make sure even with multithreading everything is dedup'd properly
<gracefu>
i'm not quite sure how to do that though
<gracefu>
heh
<noam>
No need
<noam>
Typing should occur on both sides of the branch, of course, so the only real sticking point is the Wrap(T, i - 1) call
<gracefu>
yeah in this case it's simple, there's no more than 1 job running at a time
<noam>
And even that's really straightforward: you only need the proto to type it, so the type of wrapped is some type
<gracefu>
i meant create some kind of diamond dependency thing which is not immediately obvious before comptime
<noam>
gracefu: not what I meant - at present the queue can only ever use one worker per tree
<gracefu>
hmm
<gracefu>
i see
<noam>
(for anything which appends to the tree, at least)
<noam>
One thing I'm working on now is splitting up the tree even further
<noam>
Each scope is to be its own tree, probably
<gracefu>
so you're betting on the idea that in practice, there'll be lots of smaller trees that you can parallelise?
<gracefu>
i think that's good
<noam>
and that each tree will go so fast it won't matter
<gracefu>
i was mainly worried about intra-tree parallelism being hard to implement, but if you're not doing it then it's all good
<noam>
I fully intend the single-worker use case to be blazing fast
<gracefu>
currently, usage analysis is what splits up the trees, i guess?
<noam>
gracefu: in theory, there's a relatively simple way to do that - just preallocate nodes for workers so that they don't interfere, and reuse them if they aren't needed (and retry with more nodes if needed later)
<gracefu>
so the more accurate you can make usage analysis, the faster it'd run
<noam>
tree splitting would be a phase before usage, probably
<gracefu>
oh
<gracefu>
interesting :P
<noam>
that allows further parallelizing usage analysis, after all
<noam>
anywho! the case!
<noam>
typing Wrap here is actually super easy
<gracefu>
in theory yeah
<gracefu>
you just make a couple calls and you're done
<gracefu>
:>
<gracefu>
oh unless you meant typing *Wrap* itself
<noam>
the type of wrapped in the struct is CALL{Wrap, PARAMS{T, i - 1}}
<gracefu>
yeah then that's definitely easy i think
<andrewrk>
mikdusan, I could have sworn I updated the CI to use a zig-bootstrap tarball on windows. but I'm looking at it now and I see an msvc based script
<noam>
Actually, hm. We need to invoke that call immediately
<noam>
Ahhh right, I'm being dumb!
<noam>
Wrap *doesn't get typed!*
<noam>
Wrap isn't in the comptime graph
<andrewrk>
I suppose I'll race changing the CI against building llvm12 with MSVC
<noam>
We start with Wrap(i32, 0), which is at the bottom of the graph
<noam>
As such, we can actually completely discard the i>0 branch, but... hm, andrewrk: is it semantically intended for comptime-known not-taken branches to be ignored?
<noam>
So in e.g. https://pastebin.com/raw/CGbAmvFR Wrap(i32, 0) is definitionally equal to a function which just contains `return struct{value: i32};`?
<noam>
I think this is a case where lazy analysis makes perfect sense, since we can *always* know if a comptime branch is taken or not
<mikdusan>
andrewrk: how about freebsd. was it using zig-bootstrap artifact?
<noam>
gracefu: I think I figured out the cause of the bug, by accident ;P
<noam>
*If* you were to type both sides of the branch even when you know the i>0 case isn't taken, then you need to evaluate Wrap(T, -1)
<noam>
and then you need to keep going until you hit the branch quota
<noam>
even though it will *never be taken*
<noam>
In terms of Zyg, Wrap(i32, 0) gets typed easily, then gets fed into expr analysis, and so thanks to memoization and tree patching it becomes known as struct{value: i32}. Then, climbing up the tree, we get to Wrap(i32, 1) - now, wrapped is of type Wrap(i32, 0) aka struct{value: i32}. Climbing up the tree, this remains trivial to perform type and expr analysis on each instance, until we hit Wrap(i32, 2)
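The bottom-up evaluation described here can be modelled with a memoized function that only evaluates the branch actually taken. The shape of `Wrap` is reconstructed from the discussion (a `value` field at depth 0, a `wrapped` field otherwise), and the real paste may differ; types are represented as plain tuples for the sketch:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def wrap(t, i):
    """Toy model of Wrap(T, i), returning a tuple that stands for a struct type."""
    if i > 0:
        # Only the taken branch is evaluated, so wrap(t, -1) is never requested.
        return ("struct", (("wrapped", wrap(t, i - 1)),))
    return ("struct", (("value", t),))
```

Evaluating `wrap("i32", 2)` creates exactly three cache entries, for depths 2, 1 and 0. If both sides of the branch were evaluated eagerly instead, depth 0 would go on to request depth -1, then -2, and so on until some limit was hit.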
<noam>
gracefu: if you want to come up with more complicated examples, I won't complain :)
<gracefu>
noam: yeah, Wrap(i32, -1), ... is what i think is happening too
<gracefu>
but it may or may not be intentional, so we should clarify what the spec says re: this
<noam>
90% sure it's a bug
<gracefu>
same
<noam>
anywho... pretty sure I can avoid tree splitting while still maintaining intratree parallelization, with a tiny bit of smartness
<noam>
but... it's not worth it, I think
<gracefu>
if you can make full Sema parallelization at the decl level work in zyg, it'd legit be a great proof of concept for stage2 to refer to :P
<noam>
this is one feature that really doesn't need to be implemented, and could very likely backfire anyways
<gracefu>
i'm being selfish here, though
<noam>
gracefu: sure, I think it makes sense for stage2, but one of the key reasons I'm making zyg is because my criteria differs greatly
<gracefu>
yep
<noam>
Heck, even just the task queue as it's currently designed is likely a mistake
<noam>
At least, tracking decls is
<noam>
Per-tree lists are better, but still a bit much...
<andrewrk>
mikdusan, hmm no looks like we're building with system compiler on freebsd too
<andrewrk>
not sure why I thought we switched over for more stuff
<noam>
hmm, someone mind running `grep -r const | wc -l` in stdlib? A tad curious...
<mikdusan>
ok so linux because you target -musl for both llvm,zig builds... it's pretty agnostic to linux distro/version
<andrewrk>
mikdusan, the linux one is a bootstrapped build - we use `zig cc` to build stage1
<noam>
Ah right, i can just do a quick clone :P
<mikdusan>
but freebsd artifact is going to be pinned to version of pipeline vm, and same with mac, not sure about windows
<andrewrk>
same for x86_64-macos
<mikdusan>
oh sorry yeah x86_64-macos used zig-bootstrap, an older one but still you're right `zig cc` for everything
<andrewrk>
that insulates us from chaotic system stuff changing
<andrewrk>
makes the CI more reliable
<mikdusan>
did you manage to get a zig-bootstrap llvm-12 on macos-x86_64 artifact?
<andrewrk>
my macos laptop is the slowest oldest hardware I have
<andrewrk>
it's at 39%
<mikdusan>
crosses fingers. this is exactly what I couldn't get to work.
<andrewrk>
I bet I can get it to work for aarch64-macos as well
* mikdusan
ha this guy thinks I'm going to bet against him :P
<andrewrk>
haha
<noam>
Oh, huh. According to `find -name '*.zig' -exec bash -c 'grep const {} | wc -l' \; |sort -h`, the most decls in any file is roughly 1200, with the median being roughly 15, and the mean being roughly 50
<noam>
ah wait, should check for var too
<andrewrk>
each function is a decl as well
<noam>
....ohhhh right, you don't have fnexprs yet! :P
<gracefu>
noam flexing on stage2 😒
<noam>
Or realizing the downsides of implementing them early
<noam>
I'm not going to be able to use zig until either zyg is ready or stage2 supports them XD
<noam>
it just feels so *weird* now...
<andrewrk>
yeah there is a reason stage2 is matching stage1 and not skipping straight to accepted proposals
<noam>
Okay!
<noam>
Median is about 19, mean is about 26, max is about 1200
<noam>
Assuming stdlib is an order of magnitude smaller than a typical codebase (a deliberately faulty assumption to better stress test these ideas), check a per-file queue should still average something like 120 nodes tested
<noam>
checking*
<noam>
That number still feels weird, but *shrugs*
<mikdusan>
xackus: I got my linux llvm-debug build to OOM during link of bin/clang so will up vm size until it can link
<mikdusan>
Killed process 39140 (ld) total-vm:9764612kB
<noam>
There definitely *are* a lot of big ones - total decls is ~36000 - but *most* files don't have a lot
<gracefu>
noam: you should exclude comptime-only functions from your count -- your trees can't be rooted on those functions because those depend on their arguments
<noam>
This is an approximation
<noam>
It's not meant to be precise - I'm aiming for "within an order of magnitude, with error leaning towards making it worse"
<xackus>
my build got killed with 16 GB
<andrewrk>
xackus, you should be able to link with 8 GiB
<noam>
... stage1 needs *8G* now?
<andrewrk>
did you do -DCMAKE_BUILD_TYPE=Release ?
<xackus>
no, that was a debug build
<andrewrk>
noam, we're talking about linking master branch clang
<mikdusan>
no we're doing debug. it's huge.
<noam>
oh god
<andrewrk>
when I make a debug llvm, I make llvm debug but still make clang and lld release
<xackus>
I will try following the procedure before compiling debug again
* mikdusan
rolls dice. linking bin/clang-12 with 16 GB VM
<noam>
lol
<noam>
gracefu: Assuming a 250 decl average - which is near the high end in stdlib - and the queue split into three per-tree lists (unanalyzed, todo, done), and we only have to check unanalyzed (if in unanalyzed, move into todo and move the last entry in unanalyzed into this spot), that's an average of ~80 nodes that'd need checking (again, rough estimate) - given that it's just `for(i = 0; i < len; i += 1) if(list[i] == foo) bar();`, it's really just a bunch of `CMP ; BE ; ADD ; B`
<noam>
Roughly 3000 insns (again inflating by an order of magnitude), so really not worth thinking about
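The lookup described above (linear scan of the unanalyzed list, move the hit into todo, fill the hole with the last entry) can be sketched in Python as follows. This is only an illustration of the swap-remove idea under the assumption that order within the unanalyzed list does not matter; `take_for_analysis` is a hypothetical name, not anything from zyg or stage2.

```python
def take_for_analysis(unanalyzed, todo, decl):
    """Move decl from unanalyzed to todo using a linear scan.

    Returns True if decl was found. The hole left in unanalyzed is filled
    with the last entry, so removal is O(1) after the O(n) scan.
    """
    for i in range(len(unanalyzed)):
        if unanalyzed[i] == decl:
            todo.append(decl)
            unanalyzed[i] = unanalyzed[-1]  # overwrite the hit with the last entry
            unanalyzed.pop()                # drop the now-duplicated last entry
            return True
    return False
```

For example, with `unanalyzed = ["a", "b", "c", "d"]`, taking `"b"` leaves `["a", "d", "c"]` and appends `"b"` to `todo`.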
<xackus>
I have to finally sell my old 16 GB ram kit before DDR5 comes out
<noam>
Assuming 1 billion IPS (IPC * Hz) (this number is worse than my pi lol), it'd need to be called 333 times before it reached 1ms, and since we're assuming 250 decls, that'd literally never happen
<noam>
Even artificially inflating by two orders of magnitude, I'm still assuming something like 800 microseconds as the cost :)
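The arithmetic behind these figures can be checked with a small Python sketch. It takes the rounded numbers from the chat as inputs (roughly 3000 instructions per scan, 1 billion instructions per second, 250 decls); the function names are hypothetical, and the chat does not spell out exactly how the 800 microsecond figure was derived.

```python
NS_PER_SECOND = 1_000_000_000


def ns_per_call(instructions, instructions_per_second):
    """Time for one scan, in whole nanoseconds."""
    return instructions * NS_PER_SECOND // instructions_per_second


def calls_within(budget_ns, per_call_ns):
    """How many whole scans fit inside the given time budget."""
    return budget_ns // per_call_ns


per_call = ns_per_call(3000, 1_000_000_000)    # 3000 ns, i.e. 3 microseconds
calls_per_ms = calls_within(1_000_000, per_call)  # scans that fit in 1 ms
total_for_250_decls_ns = 250 * per_call        # one scan per decl
```

With these inputs one scan costs 3 microseconds, 333 scans fit in a millisecond, and 250 scans add up to 750 microseconds, which is in the same ballpark as the figure quoted above.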
factormystic has quit [Read error: Connection reset by peer]
<mikdusan>
13077200 KB (13 GB) MAX RSS to link clang and this is with forcibly avoiding the lto linker. gonna see what happens if lto linker is used (without lto objects).
<andrewrk>
noam, can zyg pass any std lib tests yet?
<xackus>
what is this zyg I keep hearing about
<mikdusan>
xackus: it takes 16.3 GB of MAX RSS to link bin/clang-12 during a debug build (gcc-10.2),
<mikdusan>
this is using that newer linker (is it gold linker?). the one capable of doing LTO even without LTO objects takes more memory
<mikdusan>
compared to 13 GB for older linker
<xackus>
hmm, I had 4 GB of swap
<mikdusan>
if you want to build with < 16 GB here's my cmake config line:
<mikdusan>
most important is -DLLVM_PARALLEL_LINK_JOBS=1 ; imagine having N link jobs taking 10 GB each, with the concurrency chosen more or less at random by the scheduling of the larger build
<mikdusan>
and the 2 '-fno-lto' forces use of the non-LTO linker
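Why a single link job matters can be seen from a quick memory budget. The Python sketch below uses the 13077200 KB max RSS measured earlier in the chat for one debug link of clang; treating 16 GB as 16 * 1024 * 1024 KB and ignoring everything else running on the machine are simplifying assumptions, and `max_parallel_link_jobs` is a hypothetical helper, not part of cmake or ninja.

```python
def max_parallel_link_jobs(available_kb, peak_rss_per_link_kb):
    """Number of link jobs whose peak RSS fits in available memory at once."""
    if peak_rss_per_link_kb <= 0:
        raise ValueError("peak RSS per link job must be positive")
    return available_kb // peak_rss_per_link_kb


RAM_16_GB_KB = 16 * 1024 * 1024   # assumption: 16 GB counted in binary units
CLANG_DEBUG_LINK_KB = 13_077_200  # max RSS reported in the chat (older linker)

jobs_on_16_gb = max_parallel_link_jobs(RAM_16_GB_KB, CLANG_DEBUG_LINK_KB)
```

Under these assumptions a 16 GB machine has room for exactly one such link at a time, which matches the advice to cap link jobs at 1 and let ninja parallelize the compile steps.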
<mikdusan>
300 MB executable btw.
<xackus>
thanks for all the help
<mikdusan>
oh one more zero. 3000 MB executable. sheesh
<xackus>
sounded a little too reasonable
<xackus>
template go brrr
<mikdusan>
:)
<mikdusan>
afk for a bit; chips and Deadpool
aerona has joined #zig
ur5us has quit [Ping timeout: 258 seconds]
waleee-cl has quit [Quit: Connection closed for inactivity]
RadekCh has joined #zig
waffle_ethics has quit [Read error: Connection reset by peer]
waffle_ethics has joined #zig
leon-p has joined #zig
sm2n_ has joined #zig
sm2n has quit [Read error: Connection reset by peer]
<andrewrk>
in retrospect I should have done all this ci fiddling in a branch. apologies for the noise
<andrewrk>
this has been a very rough llvm release all around
aerona has quit [Quit: Leaving]
<mikdusan>
any luck with macos-x86_64 zig-bootstrap ?
sord937 has joined #zig
<andrewrk>
57%
<andrewrk>
I'm telling you my x86 macos laptop is *slow*
<andrewrk>
hit a snag on windows too. apparently nobody tested the 12 release candidate on windows
<andrewrk>
can't even run cmake on lld, it interprets the file separators incorrectly in some cmake code
<RadekCh>
Hi guys, on the topic of zig-bootstrap I'm having troubles compiling it on my Arch Linux x86_64
<gracefu>
ran out of diskspace, not RAM
<gracefu>
😒
<gracefu>
(when compiling zig-bootstrap)
<gracefu>
how coincidental, you're having trouble with zig-bootstrap too kek
<RadekCh>
After running: ./build -j1 native-linux-gnu baseline
<mikdusan>
heh well debug llvm on linux is: 66 GB build + 48 GB install
<RadekCh>
I'm getting the following errors:
<RadekCh>
In file included from /home/radzio/experiments/zig/zig-bootstrap/llvm/lib/Target/XCore/MCTargetDesc/XCoreMCTargetDesc.cpp:32:
<RadekCh>
I don't know how LLVM works and I can't trace where those missing functions/members are coming from, but they are not declared anywhere
<andrewrk>
hmm why didn't I hit this
<andrewrk>
RadekCh, which zig-bootstrap commit is this?
<RadekCh>
The one from this morning (1f4ebdbb7dcf838b8cf36b89232b42fa679292c4), but I built it yesterday with the previous source, should I do a new build?
<RadekCh>
Because today it only did incremental build
<andrewrk>
yeah try that
<andrewrk>
not sure why it would matter tho
<RadekCh>
ok, it will take some time as -j8 is eating up all my RAM and the laptop stalls at around 30%, so -j1 is the only option for me
<v0idify>
hey, i'm making a server that takes connections. when it gets a connection, i want it to run an async function and continue receiving and when it finishes handling the connection it has to deallocate itself. however I also want to keep an ArrayList (or something else?) with each connection handler to be able to await for them before shutting down the server if needed. how can I do this without racing?
Raito_Bezarius has quit [Ping timeout: 260 seconds]
osa1 has quit [Ping timeout: 240 seconds]
remby has joined #zig
jokoon has joined #zig
Raito_Bezarius has joined #zig
<ifreund>
v0idify: you probably want to use the shutdown() syscall to shutdown your sockets when you want to stop the server
<ifreund>
then let your code for handling the peer closing the connection clean up all the connections
<v0idify>
ifreund, so I save a list of sockets instead of frames? then just use runDetached right?
Akuli has joined #zig
<ifreund>
v0idify: I would probably store a list of frames and associated data in some kind of struct
lemur has quit [Remote host closed the connection]
<v0idify>
ifreund, if I do that I also need to deallocate the struct or frame which involves modifying that list which can race when reading for calling shutdown() no?
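One common way to avoid the race raised here is to guard the list of handlers with a lock and have shutdown take a snapshot under that lock before waiting. The sketch below shows the idea with Python threads rather than Zig async frames, so it is an analogy and not the Zig API; `ConnectionRegistry`, `spawn`, and `shutdown` are hypothetical names.

```python
import threading


class ConnectionRegistry:
    """Tracks running connection handlers so shutdown can wait for them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._threads = set()
        self._closing = False

    def spawn(self, handler, *args):
        """Start handler(*args) in a tracked thread; refuse once closing."""
        def run():
            try:
                handler(*args)
            finally:
                # The handler removes itself; the lock serializes this
                # against spawn() and against the snapshot in shutdown().
                with self._lock:
                    self._threads.discard(thread)

        thread = threading.Thread(target=run)
        with self._lock:
            if self._closing:
                return False
            self._threads.add(thread)
            thread.start()  # started under the lock so shutdown never joins an unstarted thread
        return True

    def shutdown(self):
        """Stop accepting new handlers and wait for the running ones."""
        with self._lock:
            self._closing = True
            pending = list(self._threads)  # snapshot; handlers may still remove themselves
        for thread in pending:
            thread.join()
```

The waiter iterates over its private snapshot, so handlers that finish and remove themselves in the meantime never modify a list that is being read. Closing the sockets to make the handlers exit is left out of the sketch.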
<RadekCh>
Ok, I think I know why my compilation of the compiler is failing: the values referenced by the errors seem to be coming from the zig installed on the host. For example the error says:
<RadekCh>
./src/type.zig:2698:13: error: enum 'std.target.Tag' has no field named 'zos'
<RadekCh>
.zos,
<RadekCh>
^
<RadekCh>
And in the source code I can see that .zos is in fact part of this target.zig
<RadekCh>
However when I've looked on /usr/lib/std/target.zig it did not contain 'zos'
<RadekCh>
so for some reason libraries are taken from host and not from source code of zig
<RadekCh>
Do you know how to fix it? Should I uninstall zig and then maybe rely on some portable version of zig so that it doesn't look up those definitions in system-wide paths?
cole-h has joined #zig
hspak has quit [Quit: Ping timeout (120 seconds)]
hspak has joined #zig
<RadekCh>
Ok, I guess this is a mystery for another day. Bye guys!
RadekCh has quit [Quit: Connection closed]
nkoz has joined #zig
Bernstein has joined #zig
Mete- has joined #zig
remby has quit [Ping timeout: 246 seconds]
Sahnvour has joined #zig
bitmapper has quit [Quit: Connection closed for inactivity]
Mete- has quit [Ping timeout: 265 seconds]
Bernstein has quit [Remote host closed the connection]
<andrewrk>
my commands were: ./build -j2 x86_64-native-gnu x86_64_v2
<mikdusan>
which host macos version
<andrewrk>
catalina
<andrewrk>
on the m1: ./build -j8 aarch64-native-gnu native
<andrewrk>
native cpu because this is the only known cpu for aarch64-macos :)
<andrewrk>
idk if you want to try to make the aarch64-macos CI take the same strategy as we do for x86, but I can upload a tarball if you want
<andrewrk>
I'm fine with how it is now
<mikdusan>
yes please do upload the aarch64-macos bootstrap and I can tinker with a PR to replace macos_arm64_script
<andrewrk>
ok will do
<andrewrk>
mikdusan, oh I also had to use system linker hack on the m1, and on both I used the `zig libc` feature and ZIG_LIBC environment variable
<andrewrk>
this is going to be so much easier when jakub gets c++ object linking support
<mikdusan>
curious, on the M1 is it a system where someone has forced /usr/include to exist, or perhaps /Library/Developer/ contains /usr/include therein? both of those are hacky
<andrewrk>
ironically zig-bootstrap creating windows tarballs works flawlessly on linux; but running into a lot of trouble trying to use zig-bootstrap on windows to build windows tarballs
<andrewrk>
also just building llvm on windows with msvc is broken.
drakonis has left #zig ["WeeChat 3.1"]
<v0idify>
is there a reason why, for example, std.event.Channel doesn't use std.atomic.Bool?
<ifreund>
blind guess: std.atomic.Bool didn't exist when std.event.Channel was written
<nefix>
hello! I started learning Zig today! Right now I have some questions: 1- what is the state of networking in Zig? 2- Where can I find code examples? 3- Where can I find libraries? 4- How can I define "interfaces"? Thanks! :)
<v0idify>
ifreund, would contributions replacing it be welcome?
<ifreund>
v0idify: not really my area of expertise so I won't review/merge myself, but I'm sure any std cleanup is welcome
v0idifyy has joined #zig
v0idify has quit [Ping timeout: 240 seconds]
nefix has quit [Quit: WeeChat 3.1]
leon-p has quit [Quit: leaving]
<v0idifyy>
is there *any* way to listen to multiple channels at once?
<v0idifyy>
currently*
remby has quit [Quit: Leaving]
Akuli has quit [Quit: Leaving]
LewisGaul has joined #zig
<noam>
andrewrk: do you have any examples of terminating comptime recursion in which the recursed calls use the same arguments?
<noam>
Rather, in which at some point in the recursion, the same exact call appears multiple times?
<noam>
I think, definitionally, any such case cannot be terminating - if a(x) depends on the value of a(x), the loop cannot terminate, because there's no possible branch
<noam>
(regardless of whether it is a direct dependency)
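The argument above assumes the calls are pure: if evaluating a(x) requires the value of a(x) again with the same argument, nothing can have changed, so it loops forever. An evaluator can detect this by remembering which calls are in progress. The Python sketch below illustrates that idea only; it is not how stage1, stage2, or zyg implement comptime, and `make_evaluator` and `CycleError` are hypothetical names.

```python
class CycleError(Exception):
    """Raised when a call depends on its own result with the same argument."""


def make_evaluator(body):
    """Wrap body(recurse, x) with memoization and same-argument cycle detection.

    body must be pure: its result may depend only on x and on recurse(...).
    """
    cache = {}
    in_progress = set()

    def call(x):
        if x in cache:
            return cache[x]
        if x in in_progress:
            # The same call is already on the evaluation stack: it can
            # never finish, directly or through intermediate calls.
            raise CycleError(f"call with argument {x!r} depends on itself")
        in_progress.add(x)
        try:
            result = body(call, x)
        finally:
            in_progress.discard(x)
        cache[x] = result
        return result

    return call
```

A terminating recursion such as factorial never revisits an argument that is still being evaluated, while `a(x) = a(x) + 1` or the indirect `a(x) = a(1 - x)` trips the check immediately.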
ur5us has joined #zig
fireglow has joined #zig
marler8997 has joined #zig
<marler8997>
zig build-exe -Dtarget=native-native-gnu -lc ... prints an error that says WindowsSdkNotFound
<marler8997>
visual studio is not installed on this machine, I thought the native-native-gnu would get around that though...?
sh4rm4^bnc has joined #zig
LewisGaul has quit [Quit: Connection closed]
selby has joined #zig
<andrewrk>
marler8997, yes that's how it's supposed to work, it shouldn't be looking for windows sdk with that triple
<marler8997>
andrewrk, I was using build-exe, but I think it's working now that I've created a build.zig and am now using that
<marler8997>
should it be working with build-exe as well?
<andrewrk>
yes
<marler8997>
I'll create an issue :)
paulgrmn_ has quit [Ping timeout: 260 seconds]
<andrewrk>
this one should be pretty contributor friendly to debug
<andrewrk>
breakpoint on the function that looks for the sdk and then look at the stack trace to figure out why it's incorrectly getting called
selby has quit [Quit: My MacBook has gone to sleep.]
<marler8997>
shoot, I may have found another bug. When I compile with -Dtarget=native-native-gnu, the ws2_32.lib symbols seem to be missing
earnestly has quit [Ping timeout: 252 seconds]
<marler8997>
oh no, it's not the symbols that are missing. For some reason when I use -Dtarget=native-native-gnu, the #pragma comment(lib, "ws2_32.lib") is being ignored?