sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
_rht has joined #m-labs
klickverbot has quit [Ping timeout: 260 seconds]
<mithro> sb0: How should I go about adding support for using gcc with or1k in misoc?
<sb0> why do you need that?
<sb0> otherwise it's just replacing clang with gcc in the makefile
<mithro> sb0: Because I want to use the same toolchain on lm32 and or1k for the moment
<sb0> well you cannot, gcc needs different compiler builds for different architecture
<sb0> so I'm not sure what this adds
<sb0> you'll need to compile another toolchain anyway
<mithro> sb0: yes, I've already done that bit - I have conda recipes for lm32 and or1k gcc which seem to work okay. I needed the gcc compiler for or1k to compile linux / rtems as that is what the openrisc guys are developing with anyway
<mithro> I was thinking that adding a command line flag to https://github.com/m-labs/misoc/blob/master/misoc/integration/builder.py which allowed you to specify which compiler you wanted (with the default being the same as now) would be the correct approach?
<mithro> sb0: my other thought was using environment variables to override the settings in cpu_interface.py
<mithro> but that felt more "hacky" ?
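The two options mithro describes (a command-line flag versus an environment variable override) can be sketched together as follows. This is only an illustration under stated assumptions: the `--compiler` flag, the `MISOC_COMPILER` variable, and the `DEFAULT_COMPILER` table are hypothetical names and are not taken from misoc's `builder.py` or `cpu_interface.py`.

```python
import argparse
import os

# Hypothetical defaults mirroring the discussion: lm32 with gcc, or1k with clang.
DEFAULT_COMPILER = {"lm32": "gcc", "or1k": "clang"}
KNOWN_COMPILERS = ("gcc", "clang")


def add_compiler_args(parser):
    """Register a hypothetical --compiler flag; omitting it keeps the default."""
    parser.add_argument("--compiler", choices=KNOWN_COMPILERS, default=None,
                        help="compiler for CPU software (default: per-CPU default)")


def select_compiler(cpu_type, args, environ=None):
    """Pick a compiler: flag first, then environment variable, then default."""
    environ = os.environ if environ is None else environ
    if args.compiler is not None:
        return args.compiler
    from_env = environ.get("MISOC_COMPILER")  # hypothetical variable name
    if from_env:
        if from_env not in KNOWN_COMPILERS:
            raise ValueError("unknown compiler: " + from_env)
        return from_env
    return DEFAULT_COMPILER[cpu_type]
```

With an explicit flag the choice is visible in the build invocation, which is roughly why the environment-variable route feels more "hacky": it changes the result without appearing on the command line.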
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 264 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 248 seconds]
evilspirit has joined #m-labs
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]
_rht has quit [Quit: Connection closed for inactivity]
evilspirit has quit [Ping timeout: 244 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 252 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 240 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 260 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 246 seconds]
evilspirit has joined #m-labs
klickverbot has joined #m-labs
ssk1328 has joined #m-labs
kuldeep has quit [Ping timeout: 260 seconds]
ssk1328 has quit [Quit: Page closed]
kuldeep has joined #m-labs
<GitHub147> [artiq] sbourdeauducq pushed 4 new commits to master: https://git.io/vVRXm
<GitHub147> artiq/master 6951613 Sebastien Bourdeauducq: protocols/pc_rpc: add get_local_host to clients
<GitHub147> artiq/master 059836c Sebastien Bourdeauducq: protocols/remote_exec: give access to controller_initial_namespace
<GitHub147> artiq/master 4ce00e3 Sebastien Bourdeauducq: protocols/remote_exec: add connect_global_rpc
<bb-m-labs> build #286 of artiq-kc705-nist_clock is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_clock/builds/286
<bb-m-labs> build #545 of artiq is complete: Failure [failed python_unittest_1] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/545 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
klickverbot has quit [Ping timeout: 276 seconds]
evilspirit has quit [Ping timeout: 268 seconds]
evilspirit has joined #m-labs
kuldeep has quit [Ping timeout: 276 seconds]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 246 seconds]
kuldeep has joined #m-labs
kuldeep has quit [Ping timeout: 268 seconds]
kuldeep has joined #m-labs
FelixVi has joined #m-labs
<GitHub162> [artiq] sbourdeauducq pushed 4 new commits to master: https://git.io/vV0Gy
<GitHub162> artiq/master f860548 Sebastien Bourdeauducq: protocols/pyon: minor cleanup
<GitHub162> artiq/master aa61c29 Sebastien Bourdeauducq: transfer Python builtin exceptions over pc_rpc and master/worker
<GitHub162> artiq/master 7453d85 Sebastien Bourdeauducq: GUI -> dashboard
<GitHub82> [artiq] sbourdeauducq pushed 1 new commit to release-1: https://git.io/vV0GH
<GitHub82> artiq/release-1 eba90c8 Sebastien Bourdeauducq: client: add --async option to scan-repository, recommend usage in git post-receive
FelixVi has quit [Remote host closed the connection]
<bb-m-labs> build #287 of artiq-kc705-nist_clock is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_clock/builds/287
<whitequark> this took me entirely too long but I implemented optimal 64-bit subtraction
<bb-m-labs> build #546 of artiq is complete: Failure [failed python_unittest_1] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/546 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<whitequark> rjo ^
<whitequark> so, a 64-bit sub is a sub+xor+add-with-carry
<whitequark> you know, this is really stupid, because the l.addc opcode has a reserved bit and the ALU already has all the necessary combinational logic for subtraction
<whitequark> they could have added l.subc but did not :/
<whitequark> could've also used the 0x38,0x1 ALU subrange
<whitequark> sb0: do you see any use for the MAC module?
<sb0> no
<whitequark> 64-bit multiplier?
<sb0> how does this work? is l.sub touching the carry flag?
<sb0> doc says it does not
<whitequark> huh? the doc says that it does
<whitequark> rD[31:0] ← rA[31:0] - rB[31:0]
<whitequark> SR[CY] ← carry (unsigned overflow)
<whitequark> SR[OV] ← signed overflow
<sb0> This instruction does not change carry SR[CY] flag.
<whitequark> that's the old version of the architecture
<whitequark> opencores.org/websvn,filedetails?repname=openrisc&path=%2Fopenrisc%2Ftrunk%2Fdocs%2Fopenrisc-arch-1.1-rev0.pdf
<whitequark> this is the recent revision
<whitequark> this was revised in 2012
<sb0> ah yes
<whitequark> why... why does or1k have separate instructions for extending byte and half-word to register size?!
<whitequark> well, zero-extending at least, that's just a waste of opcode space, since they're all representable via l.andi
<whitequark> this is a bizarre architecture
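The point about zero-extension can be checked with a small model. The sketch below assumes that `l.andi` zero-extends its 16-bit immediate, so masks `0xff` and `0xffff` fit; the Python function names are illustrative stand-ins for the instructions, not a real emulator API.

```python
def extbz(ra):
    """Model of l.extbz: zero-extend the low byte of a 32-bit register."""
    return ra & 0xFF


def exthz(ra):
    """Model of l.exthz: zero-extend the low half-word of a 32-bit register."""
    return ra & 0xFFFF


def andi(ra, imm16):
    """Model of l.andi, assuming a zero-extended 16-bit immediate."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    return ra & imm16


# Both zero-extensions are plain AND-with-immediate operations.
for value in (0x00000000, 0x12345678, 0xFFFFFFFF, 0x8000807F):
    assert extbz(value) == andi(value, 0xFF)
    assert exthz(value) == andi(value, 0xFFFF)
```

The sign-extending variants are a different matter, since they cannot be expressed as a single mask.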
<sb0> yes
<sb0> lm32 doesn't have such problems afaik...
<whitequark> so, about that
<rjo> whitequark: nice. but how do you teach this to llvm if you say it can't learn to do this?
<whitequark> with what I learned while fixing OR1K in the last few days, I'm confident I can quickly implement a decent LM32 backend as well as upstream OR1K
<whitequark> I understand pretty much all the moving parts necessary for implementing a backend of this complexity now
<whitequark> rjo: with C++ code.
<whitequark> it has a SUBE instruction (sub-using-carry) and it has built-in legalization code that translates the 64-bit SUB into SUBE+SUBC
<whitequark> I lower SUBC to l.sub which does the right thing, and then manually lower SUBE to l.xor+l.addc
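The idea of building a 64-bit subtraction from 32-bit subtract, xor, and add-with-carry can be modelled as below. This is a sketch under assumptions, not the sequence the backend actually emits: it assumes the carry flag after the low-word subtract holds the unsigned borrow, and it complements the high word both before and after the add-with-carry, relying on the identity ~(~a + b + c) = a - b - c modulo 2^32. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sub32(a, b):
    """Model of a 32-bit subtract; carry is assumed to be the unsigned borrow."""
    return (a - b) & MASK32, int(a < b)


def addc32(a, b, carry_in):
    """Model of a 32-bit add-with-carry; returns (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def sub64(a, b):
    """64-bit a - b (mod 2**64) built only from the 32-bit primitives above."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    lo, borrow = sub32(a_lo, b_lo)               # low word, produces the borrow
    tmp, _ = addc32(a_hi ^ MASK32, b_hi, borrow)  # ~a_hi + b_hi + borrow
    hi = tmp ^ MASK32                             # complement: a_hi - b_hi - borrow
    return (hi << 32) | lo
```

For example, `sub64(0x100000000, 1)` borrows out of the low word and yields `0xFFFFFFFF`, matching `(a - b) & MASK64`.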
<rjo> by the way. soon there will be many 64 bit subtractions because of latency compensation.
<whitequark> you will be pleased with their speed, then.
<whitequark> (and I will be pleased that I didn't waste this time)
<sb0> whitequark, but then there will be libunwind and all
<whitequark> (well, not like it would have gone to waste anyway, with all the things I learned...)
<whitequark> sb0: what about libunwind?
<sb0> I don't trust it will be bug-free for lm32, if available at all
<whitequark> you do remember that libunwind wasn't available at all for OR1K?
<whitequark> OR1K had no exceptions, no DWARF, no debug information whatsoever
<whitequark> libunwind basically needs setcontext+getcontext and a little bit of boilerplate. and it was bug-free from the start, because that code is just too dumb to have bugs
<whitequark> there *were* a few bugs in the OR1K frame lowering code, but they would have manifested even without exceptions or DWARF, that just made them manifest earlier, and in easier to debug ways, for that matter
<sb0> didn't you use something from BSD?
<whitequark> nope
<sb0> I remember seeing some OR1K DWARF/unwind support from there
<whitequark> I have never even heard about that
evilspirit has quit [Ping timeout: 260 seconds]
<cr1901_modern> They probably reimplemented something due to licensing concerns and/or the GNU equivalent being crap
<whitequark> well they sure as hell used binutils, there's no alternative for or1k
<cr1901_modern> Fair. (Though tbh, I'm a little surprised a binutils alt never came to fruition.)
<whitequark> sure it did
<whitequark> LLVM has its own assembler since ages (because it's stupid to fork, serialize and deserialize just to emit machine code)
<whitequark> now LLVM has its own linker too, and it slowly gains all the loose parts ie ar dwarfdump objdump et cetera
* sb0 notices that QFileDialog with QFileDialog::DontUseNativeDialog also has table column layout issues
<cr1901_modern> I guess it's just slow to adopt then. I actually didn't know LLVM had an assembler. Presumably you can write one for any backend you want if motivated?
<sb0> whitequark, so you're motivated to port everything to lm32?
<whitequark> sb0: sure, why not? you're saying it provides concrete advantages, and I see that it's not a lot of work
<sb0> well the architecture is cleaner. but there are no user-visible advantages ...
<whitequark> we also need to decide something about upstreaming the backends. or1k, lm32, both
<whitequark> I'm tempted to try it with or1k because it's already there and in a good state, and see how painful it is
<sb0> on the other hand, a file selector that would not suck clearly would be a user-visible advantage
<sb0> the kde one is okay, but probably hell to integrate
<sb0> on windows and all
<whitequark> might not be that bad actually, but what's wrong with the system one?
<sb0> I want to customize it in two ways: 1) it should not be a dialog but a permanent part of the application window 2) large icons used as previews (rendered by my application)
<sb0> the system one supports neither
<whitequark> I don't think you should base it off the file selector at all, then
<GitHub95> [artiq] jordens pushed 1 new commit to master: https://git.io/vV0P5
<GitHub95> artiq/master d095d48 Robert Jordens: gui.models: style
sb0 has quit [Quit: Leaving]
<bb-m-labs> build #288 of artiq-kc705-nist_clock is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_clock/builds/288
<bb-m-labs> build #547 of artiq is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/547 blamelist: Robert Jordens <rj@m-labs.hk>
<whitequark> mor1kx doesn't even bother to implement the extension instructions
evilspirit has joined #m-labs
<whitequark> well, four out of six
sb0 has joined #m-labs
evilspirit has quit [Ping timeout: 244 seconds]
<whitequark> rjo: sb0: wow.
<whitequark> the 64-bit addc changes have had a *massive* effect, far more than I have anticipated
<whitequark> specifically, PulseRateDDS is down to 20us
<whitequark> so... 10us per channel? that's actually better than what the Oxford group wants, isn't it?
<whitequark> uh
<whitequark> what
<whitequark> *enabling* addc while building the runtime makes the test faster, but *disabling* addc while building the kernel *also* makes the test faster?
<whitequark> a little bit, but it does
<whitequark> yeah, there's a pretty large amount of l.addic's in dds.o, and a few in rtio.o, i think most of them are dead though
<whitequark> i wonder what's up with addc slowing down the kernel though
key2 has joined #m-labs
<rjo> whitequark: it is what i suggested they can get with drtio. 10us is a useful number for a pulse (dds set and ttl pulse combined). but there will be overhead when actually doing them.
<whitequark> yes, if you change phase you will immediately have FP in the loop
<whitequark> (do you change phase?)
<rjo> all the time
<whitequark> or if you set phase mode to not continuous, there will be a bunch of 64-bit multiplications in dds_set
<rjo> not only that but also the overhead of retrieving the pulse data etc. this is not just repeating the same pulse over and over again.
<rjo> but we really need to leave that for later imho.
<rjo> now we should prioritize and say that 10us for repeating the same pulse without phase tracking is good.
<whitequark> there's the 64-bit multiplier in the ISA but not in mor1kx...
<rjo> out of curiosity. how did that help for pulse rate ttl?
<rjo> a 64 bit multiplier would need to be either multi-cycle or bring down the clock speed a lot.
<whitequark> it didn't. the ttl pulse rate is 1484ns
<whitequark> pretty much what it was before I started messing with FP, LICM, etc
<whitequark> this, on the other hand, I actually expected
<rjo> hmm. there should be heavy 64 bit stuff in there as well.
<rjo> but also something for later.
_rht has joined #m-labs
<bb-m-labs> build #199 of artiq-pipistrello-nist_qc1 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-pipistrello-nist_qc1/builds/199
<whitequark> hrm, RCA is inconclusive for that addc slowdown, but probably register pressure
<whitequark> in any case it's 40ns
kuldeep has quit [Ping timeout: 248 seconds]
key2 has quit [Ping timeout: 244 seconds]
kuldeep has joined #m-labs
kuldeep has quit [Client Quit]
kuldeep has joined #m-labs
_rht has quit [Quit: Connection closed for inactivity]
klickverbot has joined #m-labs
klickverbot has quit [Ping timeout: 250 seconds]
<whitequark> sb0: from discussion on #llvm: "like, moving from OR1K to RISC-V would be like moving from a trash can fire to a larger, dumpster-sized fire"
klickverbot has joined #m-labs
<cr1901_modern> Surprised. I was under the impression that RISC-V was the most popular out of LM32, OR1K, and RISC-V. But then again, most popular != best.
<cr1901_modern> (I've been told that "one reason LM32 is ignored is that it's 32-bit only")
<cr1901_modern> although I seem to recall that data width is adjustable? *checks*
<whitequark> datapath width is not really the same as register width
klickverbot has quit [Quit: No Ping reply in 180 seconds.]
klickverbot has joined #m-labs
<cr1901_modern> Yea, I'm not sure where I was going with that in retrospect.
<cr1901_modern> sb0: Ping.
<whitequark> rjo: sb0: we can't use overflows in OR1K.
<whitequark> none of the OR1K shifts set overflow (or carry, for that matter) bits
<whitequark> LLVM will transform *2 into <<1 in instcombine (and do other similar things)
<whitequark> which is, of course, not only legal but desirable.
<whitequark> not only would turning that off make code *much* slower, but I also don't think that optimization can even *be* turned off, it's considered target-independent
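The problem with relying on hardware overflow flags can be illustrated with a toy model. It assumes a multiply that reports signed overflow and a shift that reports nothing, as stated above for OR1K shifts; the function names are hypothetical and this is not a model of any particular core.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Wrap an arbitrary integer to a signed 32-bit value."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mul32_signed(a, b):
    """Multiply that reports signed overflow: returns (wrapped, overflow)."""
    exact = a * b
    wrapped = to_signed32(exact)
    return wrapped, exact != wrapped


def shl32(a, n):
    """Shift left that sets no flags: returns only the wrapped result."""
    return to_signed32(a << n)


x = 0x40000000
product, overflowed = mul32_signed(x, 2)
shifted = shl32(x, 1)
# Same wrapped value, but only the multiply says that it overflowed.
assert product == shifted
assert overflowed
```

Once `x * 2` has been rewritten to `x << 1`, the overflow information is simply not produced, so any check that depended on the flag silently stops firing.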
<GitHub64> [conda-recipes] whitequark pushed 2 new commits to master: https://github.com/m-labs/conda-recipes/compare/90738936fa7c...87172c701297
<GitHub64> conda-recipes/master 004e9e4 whitequark: llvm-or1k: bump.
<GitHub64> conda-recipes/master 87172c7 whitequark: llvmlite-artiq: bump.
<whitequark> bb-m-labs: force build --props=package=llvm-or1k conda-all
<bb-m-labs> build forced [ETA 43m59s]
<bb-m-labs> I'll give a shout when the build finishes
sandeepkr has joined #m-labs
klickverbot has quit [Ping timeout: 244 seconds]