sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
klickverbot has quit [Ping timeout: 260 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 268 seconds]
<sb0> for running all the tests, elapsedTime=83.350688 for master, elapsedTime=64.582824 for release-1. is the compiler getting slower, or is this something else or just noise?
sandeepkr has quit [Ping timeout: 268 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 248 seconds]
cr1901_modern has quit [Ping timeout: 268 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 276 seconds]
evilspirit has joined #m-labs
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 240 seconds]
sandeepkr has joined #m-labs
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 268 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 260 seconds]
cr1901_modern has joined #m-labs
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 276 seconds]
klickverbot has joined #m-labs
evilspirit has quit [Ping timeout: 264 seconds]
klickverbot has quit [Ping timeout: 246 seconds]
<mithro> sb0: What do you think about misoc generating the device tree fragment needed for booting Linux on a misoc platform?
<whitequark> sb0: dunno
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 246 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 246 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 246 seconds]
sandeepkr has quit [Ping timeout: 244 seconds]
_rht has joined #m-labs
sandeepkr has joined #m-labs
<whitequark> rjo: pulse_rate_dds down to 100us
<whitequark> rjo: I'm not sure what else can be done about it
<whitequark> there's a load of phase_mode and a load/store of now in the inner loop
<whitequark> everything else has been eliminated
<whitequark> the load of phase_mode *might* be eliminable with aggressive TBAA
key2 has joined #m-labs
<whitequark> rjo: oh wait lol
<whitequark> the *IR* doesn't have any soft-FP in the loop
<whitequark> however, the *assembly* does
<whitequark> it looks like the instruction selector decided to fuse the FP comparison with the branch. except it's dumb and doesn't understand that it shouldn't do that with soft-FP
<whitequark> I can address this, but post-1.0.
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]
<GitHub67> [conda-recipes] whitequark pushed 1 new commit to master: https://github.com/m-labs/conda-recipes/commit/b9c7c952a3211cdd8630fddca0ffaf01c68189ee
<GitHub67> conda-recipes/master b9c7c95 whitequark: llvm-or1k: move to artiq branch.
<GitHub63> [conda-recipes] whitequark pushed 1 new commit to master: https://github.com/m-labs/conda-recipes/commit/7823364ab8a3b7c96e397211b1977c49c7919bde
<GitHub63> conda-recipes/master 7823364 whitequark: llvmlite-artiq: bump.
<whitequark> bb-m-labs: force build --props=package=llvm-or1k conda-all
<bb-m-labs> build forced [ETA 5m59s]
<bb-m-labs> I'll give a shout when the build finishes
key2 has quit [Ping timeout: 276 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 268 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 276 seconds]
<rjo> whitequark: i wonder whether we could generally bias the artiq-python code much more away from FP towards integers.
<rjo> something that isn't there in the first place does not need to be optimized away.
_rht has quit [Quit: Connection closed for inactivity]
klickverbot has joined #m-labs
<bb-m-labs> build #105 of conda-win64 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-win64/builds/105
<bb-m-labs> build #131 of conda-lin64 is complete: Failure [failed anaconda_upload] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/131
<whitequark> bb-m-labs: force build --props=package=llvm-or1k conda-lin64
<bb-m-labs> build forced [ETA 1m00s]
<bb-m-labs> I'll give a shout when the build finishes
<bb-m-labs> build #74 of conda-win32 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-win32/builds/74
<bb-m-labs> build #40 of conda-all is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/conda-all/builds/40
<bb-m-labs> build #132 of conda-lin64 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/132
klickverbot has quit [Ping timeout: 248 seconds]
<whitequark> oh wait
<whitequark> that wasn't 100us
<whitequark> that's just the starting value
<whitequark> okay so before D18744, pulse_rate_dds is 36us
<whitequark> after D18744, pulse_rate_dds is 26us
<rjo> whitequark: yay.
<rjo> whitequark: no fp anymore?
<rjo> see my 20 us estimate was right on the money.
<whitequark> yeah. no fp.
<whitequark> the rest is... spills, reloads, doubtful register allocator decisions, manipulation of `now`
<whitequark> some bizarre jumps on the hot path
<rjo> does llvm-or1k 3.8.1-10 have all those?
<whitequark> no.
<GitHub87> [conda-recipes] whitequark pushed 1 new commit to master: https://github.com/m-labs/conda-recipes/commit/90738936fa7cb51217797a3f8844b2d0b9a6870b
<GitHub87> conda-recipes/master 9073893 whitequark: llvm-or1k: bump.
<whitequark> bb-m-labs: force build --props=package=llvm-or1k conda-all
<bb-m-labs> build forced [ETA 5m59s]
<bb-m-labs> I'll give a shout when the build finishes
sandeepkr_ has joined #m-labs
sandeepkr has quit [Ping timeout: 240 seconds]
kuldeep has quit [Ping timeout: 244 seconds]
klickverbot has joined #m-labs
<whitequark> sb0: you said the desired number is 10us/ch. we are at 13us/ch.
sandeepkr__ has joined #m-labs
kuldeep has joined #m-labs
sandeepkr_ has quit [Ping timeout: 246 seconds]
<sb0> whitequark, nice. should be fine for now.
<sb0> there is the option of putting a delay between runs so that the fifo has time to refill
<whitequark> sb0: what's the ns/instruction ratio for mor1kx?
<whitequark> well, clock/instruction and ns/clock
<sb0> CPI is 1 for non-branch instructions
<sb0> clock is 8ns
<whitequark> what about loads?
<whitequark> loads from cache specifically, I think everything should be in cache
<sb0> I think 1 if they hit the cache
<rjo> that would be something interesting to verify
<whitequark> that's odd
<sb0> stekern, ^
<whitequark> why do we take ~812 instructions to set one DDS channel?
<whitequark> well, 812 cycles
<sb0> there are a lot of things in there. look at the C source
<whitequark> okay so
<whitequark> the inner loop has 69 instructions. I can shave off, optimistically, ten
<whitequark> so, 80ns. not worth bothering with.
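The 80ns figure follows from the numbers sb0 gave above (CPI of 1, 8ns clock). A minimal sketch of that arithmetic; the helper name `instructions_to_ns` is hypothetical, and it assumes every instruction is non-branch and every load hits the cache:

```python
# Figures quoted in the log: CPI = 1 for non-branch instructions, clock = 8 ns.
CLOCK_NS = 8
CPI = 1


def instructions_to_ns(count, cpi=CPI, clock_ns=CLOCK_NS):
    """Estimate execution time in ns, assuming a uniform CPI (an idealisation)."""
    return count * cpi * clock_ns


# Shaving ten instructions off the inner loop saves about 80 ns per iteration.
saved_ns = instructions_to_ns(10)
# The full 69-instruction inner loop, under the same idealised assumptions.
loop_ns = instructions_to_ns(69)
print(saved_ns, loop_ns)
```

Branches, spills and cache misses would push the real numbers higher, so this is a lower bound rather than a measurement.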
sandeepkr_ has joined #m-labs
<rjo> 64bit manipulation is not that cheap
kuldeep has quit [Ping timeout: 248 seconds]
sandeepkr__ has quit [Ping timeout: 276 seconds]
<whitequark> hm?
<whitequark> oh
<whitequark> actually, it is not that expensive, when addc is actually used
<bb-m-labs> build #75 of conda-win32 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-win32/builds/75
<whitequark> but someone stubbed that out in or1k...
<sb0> really?
<whitequark> yeah. in the LLVM backend.
<sb0> I remember seeing very short code produced by llvm for a 64-bit add
<whitequark> yes. but it is longer than two instructions.
* whitequark sighs
<sb0> can that be fixed?
<whitequark> okay, I can fix that
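For reference, the two-instruction 64-bit add being discussed can be modelled as a plain add on the low words followed by an add-with-carry on the high words. This is a sketch in Python, not or1k code; `add32` and `add64` are hypothetical helpers that only imitate what `l.add` and `l.addc` are expected to do:

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """32-bit add returning (result, carry_out); a model, not real hardware."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit add on (hi, lo) word pairs using two 32-bit adds."""
    lo, carry = add32(a_lo, b_lo)      # plays the role of l.add
    hi, _ = add32(a_hi, b_hi, carry)   # plays the role of l.addc
    return hi, lo


# The carry out of the low word propagates into the high word.
print(add64(0, 0xFFFFFFFF, 0, 1))  # (1, 0)
```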
<whitequark> sb0: can I express a "subc" with l.addc?
<sb0> how?
<whitequark> I dunno
<whitequark> but they seem related
<whitequark> http://hastebin.com/oluditiqox.avrasm this is the current assembly
<rjo> EARCH
<rjo> sorry: -EARCH
<whitequark> hastebin doesn't have or1k.
<whitequark> ... and anyway the highlighting is close enough
<sb0> whitequark, can you take the opposite of a 64-bit number in a fast manner?
<sb0> I think it's invert all the bits (xor each word with 0xffffffff) and then add 1, isn't it?
<rjo> yes
<whitequark> LLVM generates a pretty obnoxious sequence for negation...
<sb0> so yeah, you can do subc with two xors and then two 64-bit additions
<sb0> s/subc/64-bit sub
<whitequark> that seems way too slow.
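The approach sb0 describes (xor each word with 0xffffffff, add 1, then add) can be sketched as follows. The helper names are hypothetical and the code models 32-bit words in Python rather than showing real or1k output:

```python
MASK32 = 0xFFFFFFFF


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit add on (hi, lo) word pairs with carry from low to high word."""
    lo = a_lo + b_lo
    hi = (a_hi + b_hi + (lo >> 32)) & MASK32
    return hi, lo & MASK32


def neg64(hi, lo):
    """Two's complement negation: invert both words, then add 1."""
    return add64(hi ^ MASK32, lo ^ MASK32, 0, 1)  # first 64-bit addition


def sub64_via_neg(a_hi, a_lo, b_hi, b_lo):
    """a - b computed as a + (-b): two xors and two 64-bit additions."""
    n_hi, n_lo = neg64(b_hi, b_lo)
    return add64(a_hi, a_lo, n_hi, n_lo)          # second 64-bit addition


print(sub64_via_neg(0, 10, 0, 3))  # (0, 7)
```

The two full-width additions are what make this sequence look slow.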
kuldeep has joined #m-labs
<bb-m-labs> build #106 of conda-win64 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-win64/builds/106
<sb0> there doesn't seem to be any support for 64-bit subtraction in or1k
<sb0> I mean, multiprecision with carry flags
<sb0> artiq code should rarely use 64-bit subtraction anyway
<sb0> there are just a few
<rjo> yes. time is mostly increasing.
<rjo> but is that asm for the 64 bit add() good? i remember seeing that style as well months ago.
<whitequark> no, it's not
<rjo> it is also longer than the sub()
<bb-m-labs> build #133 of conda-lin64 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/133
<bb-m-labs> build #41 of conda-all is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-all/builds/41
<whitequark> sb0: rjo: we can subtract quickly.
<whitequark> invert all the bits, *set carry*, and then add.
<whitequark> wow
<whitequark> there's no easy way to set carry.
<whitequark> oh, I can do an l.addi rX, r0, -1
<whitequark> oh, I got a suggestion for an even more optimal way
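The faster subtraction whitequark describes relies on the identity a - b = a + ~b + 1, where the "+ 1" is supplied as the carry-in of the low-word add instead of a separate addition. A sketch under that assumption, with hypothetical helper names and Python integers standing in for 32-bit registers:

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in):
    """32-bit add with carry-in, returning (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def sub64_with_carry(a_hi, a_lo, b_hi, b_lo):
    """a - b as a + ~b with the carry preset to 1: one 64-bit add in total."""
    lo, carry = add32(a_lo, b_lo ^ MASK32, 1)      # carry-in of 1 is the "+ 1"
    hi, _ = add32(a_hi, b_hi ^ MASK32, carry)
    return hi, lo


print(sub64_with_carry(0, 5, 0, 7))  # (4294967295, 4294967294), i.e. -2 mod 2**64
```

How the carry flag actually gets set on or1k is the open question in the discussion; this sketch simply passes it in as an argument.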
<rjo> shouldn't llvm be able to come up with those?
<whitequark> nope
<whitequark> LLVM's codegen is a pattern matcher. it reduces a graph.
<whitequark> it understands almost nothing past basic semantics, e.g. it knows about commutativity
<whitequark> LLVM does have hardcoded canonical expansions, but they can be quite inefficient
<stekern> sb0: yes, loads from cache are one cycle
klickverbot has quit [Ping timeout: 246 seconds]
klickverbot has joined #m-labs
sandeepkr__ has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]
kuldeep has quit [Ping timeout: 248 seconds]
sandeepkr_ has quit [Ping timeout: 252 seconds]
klickverbot has joined #m-labs
kuldeep has joined #m-labs
sandeepkr_ has joined #m-labs
kuldeep has quit [Max SendQ exceeded]
sandeepkr__ has quit [Read error: No route to host]
kuldeep has joined #m-labs
sandeepkr has joined #m-labs
sandeepkr_ has quit [Ping timeout: 244 seconds]
sandeepkr has quit [Quit: Leaving]
mumptai has joined #m-labs
kuldeep has quit [Ping timeout: 246 seconds]
kuldeep has joined #m-labs
kuldeep has quit [Changing host]
kuldeep has joined #m-labs
mumptai has quit [Quit: Verlassend]