sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
bentley` has quit [Ping timeout: 248 seconds]
rohitksingh_work has joined #m-labs
whitequark has quit [Ping timeout: 260 seconds]
cyrozap has quit [Ping timeout: 260 seconds]
whitequark has joined #m-labs
cyrozap has joined #m-labs
<sb0> whitequark, hadn't you said earlier that llvm was good at optimizing array bound checking? (e.g. hoisting it out of loops)
kuldeep has quit [Ping timeout: 244 seconds]
kuldeep has joined #m-labs
<rjo> whitequark: cython has directives that disable those checks. http://cython.readthedocs.io/en/latest/src/reference/compilation.html#compiler-directives
rohitksingh_wor1 has joined #m-labs
rohitksingh_work has quit [Ping timeout: 260 seconds]
bentley` has joined #m-labs
sandeepkr has joined #m-labs
<whitequark> sb0: sure. but it's not magic and it cannot generally optimize checks away, just merge existing ones in the absence of aliasing
<whitequark> when I add aliasing information for fields, that might improve things at very low cost
<whitequark> rjo: perhaps. but i would be cautious. we do not have any tools that can detect out-of-bounds accesses and the way we perform allocation means that out-of-bounds accesses will just result in bogus data returned most of the time
<whitequark> (the stack frames generally contain large contiguous chunks of data with few interspersed pointers)
<whitequark> *maybe* we could enable stack smashing protection as a first line of defense against this. it's extremely cheap, much cheaper than the checks...
<whitequark> rjo: what I would much prefer is an extension to mor1kx that moves bounds checking to hardware.
<rjo> whitequark: a.k.a. MMU?
<whitequark> rjo: nope, an MMU would not help us at all.
<whitequark> unless we're adding a heap allocator and everything.
<whitequark> well, I guess we could allocate on stack in 4k granularity but that will have worst cases with small arrays
<whitequark> rjo: let me think of some unobtrusive way to implement it
<rjo> whitequark: but in our case out-of-bounds is not worse than on a regular OS. just that the allocator is different.
<whitequark> rjo: not quite.
<whitequark> on a regular OS you have valgrind and ubsan
<whitequark> ubsan especially is taking advantage of "shadow pages" to drive cost of the checks quite low
<rjo> whitequark: in our case one would toggle the "fast-but-dangerous" flag and get exceptions.
<whitequark> so what I expect to happen is that people will get used to having the "fast-but-dangerous" flag on all the time.
<whitequark> then get bogus data.
<whitequark> on an OS you will get crashes pretty quickly because you have weird pointers
<whitequark> Python has pointers stuffed everywhere throughout its heap, and overwriting it in a way that silently does the wrong thing is hard
<whitequark> we will also crash on invalid pointers about 3/4 of the time because of alignment errors, even without an MMU
<whitequark> since we own the CPU why cannot we drive the cost of checks down instead?
<whitequark> e.g. a dedicated "bounds check" instruction.
<rjo> sure. but you still have to carry around the bounds data everywhere.
<whitequark> but we already do.
<whitequark> the slices (and strings, soon) are struct { len, ptr } that are passed by value.
<whitequark> this for one allows slicing that has essentially zero cost because it's just two arithmetic operations
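A minimal sketch of the { len, ptr } representation described above, modelled in Python for illustration. The names `Slice`, `subslice` and `load` are hypothetical and are not taken from the ARTIQ compiler; the point is only that taking a sub-slice costs one subtraction and one addition, and that the length travels with the value, so a bounds check needs no extra lookup.

```python
from typing import NamedTuple


class Slice(NamedTuple):
    """Hypothetical model of a slice passed by value as { len, ptr }."""
    length: int  # number of elements visible through this slice
    ptr: int     # index of the first element in the backing memory


def subslice(s: Slice, start: int, stop: int) -> Slice:
    """Take s[start:stop]: one subtraction and one addition, no copying."""
    if not (0 <= start <= stop <= s.length):
        raise IndexError("slice bounds out of range")
    return Slice(length=stop - start, ptr=s.ptr + start)


def load(memory: list, s: Slice, index: int):
    """Bounds-checked element access; the length is carried in the slice."""
    if not (0 <= index < s.length):
        raise IndexError("index out of bounds")
    return memory[s.ptr + index]


memory = list(range(100, 116))      # stand-in for a stack-allocated array
whole = Slice(length=16, ptr=0)
part = subslice(whole, 4, 8)        # views memory[4:8] without copying
```

With the check removed, `load(memory, part, 4)` would quietly return the neighbouring element, which is the "bogus data" failure mode discussed earlier in the log.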
ohama has quit [Read error: Connection reset by peer]
ohama has joined #m-labs
rohitksingh_wor1 has quit [Read error: Connection reset by peer]
rohitksingh has joined #m-labs
<GitHub118> [artiq] sbourdeauducq pushed 1 new commit to master: https://git.io/v1un2
<GitHub118> artiq/master 4c37179 Sebastien Bourdeauducq: drtio: link layer debugging CSRs
<sb0> rjo, we need to put some of the hardware initialization into the runtime, because we need the clock chips to work before we can use the DRTIO transceivers
<sb0> in this case, can't it just initialize the JESD links at the same time?
<sb0> also, the DRTIO protocol implementation won't extrapolate well to SPI
fengling has quit [Ping timeout: 268 seconds]
fengling has joined #m-labs
<bb-m-labs> build #253 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/253
<GitHub175> [artiq] sbourdeauducq commented on issue #636: Merging the address into the channel sounds OK. https://git.io/v1uBj
<bb-m-labs> build #1153 of artiq is complete: Failure [failed python_unittest_1] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1153 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<GitHub69> [artiq] sbourdeauducq commented on issue #636: The DMA playback engine takes LSB-first data of arbitrary length with byte granularity and zeros the missing MSBs, and DRTIO similarly removes zeros in front of data.... https://git.io/v1u0c
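A rough sketch of the zero-stripping idea in the comment above, assuming byte granularity and LSB-first ordering as stated there. The helper names are hypothetical and this is not the DMA or DRTIO implementation: the sender drops the all-zero most significant bytes, and the receiver zero-extends whatever arrives back to the full width.

```python
def strip_msb_zeros(value: int) -> bytes:
    """Encode a non-negative value LSB-first, dropping zero MSB bytes."""
    if value < 0:
        raise ValueError("only non-negative values are modelled here")
    n_bytes = (value.bit_length() + 7) // 8
    return value.to_bytes(n_bytes, "little")


def zero_extend(data: bytes, width_bytes: int) -> int:
    """Decode LSB-first bytes, treating the missing MSB bytes as zero."""
    if len(data) > width_bytes:
        raise ValueError("data wider than the target field")
    return int.from_bytes(data.ljust(width_bytes, b"\x00"), "little")
```

For example, `strip_msb_zeros(0x1234)` yields the two bytes `34 12`, and `zero_extend` of those bytes into a 4-byte field recovers `0x1234`.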
<rjo> sb0: why does DRTIO not work for SPI?
<sb0> well some parts can be recycled of course, but it's not completely straightforward
<rjo> sb0: ack the autonomous clock tree and JESD bootstrapping.
<rjo> sb0: but still: in the end the SPI interface to the DAC (also) needs to be exposed to the user.
<sb0> for example, DRTIO needs a framing signal. in SPI we can use CS, unless the chipmaker designed the SPI core as you did where you don't precisely control CS
<rjo> sb0: do you mean the non-RTIO SPI in phaser? or the generic RTIO SPI master?
<rjo> in the RTIO SPI PHY, CS is precisely controlled.
<sb0> but it's not if you connect that core to a CPU, and in your defense, you said that another chipmaker (motorola?) also did that
<rjo> it is not always controlled precisely if you do chained transfers. otherwise it is.
<sb0> but for framing a DRTIO frame you'd need chained transfers
<sb0> or the protocol needs to be changed in some way
<rjo> and i'd be happy to accept a patch that only releases CS if all bits of a chained transfer are transferred.
<sb0> yes, but that doesn't help if someone is using another SPI master with an imprecisely controlled CS
<rjo> sb0: i don't understand what you are saying. if you do a chained SPI transfer over DRTIO you just have to set the timestamp so that it will be chained.
<sb0> Jonathan wants to use Sayma with an undefined SPI master, not MiSoC/ARTIQ stuff
<rjo> yeah. but that's an SPI slave then.
<rjo> i just wrote one for PDQ3
<sb0> if that SPI master can't control CS precisely then we can't use it as DRTIO framing signal
<sb0> I'm just using the MiSoC SPI core as an example design that cannot always control CS precisely
<rjo> are you talking about jonathan's idea of RTIO-over-SPI-over-cpu-over-RTIO? or the flat abstraction of Sayma into a SPI "peripheral"
<sb0> Sayma into a SPI peripheral
<sb0> there is no problem putting the SPI PHY into a DRTIO channel
<rjo> ok. if CS comes "late" after the last relevant bit, why is that a problem for DRTIO framing?
<rjo> ack. let's call that thing (which he doesn't want (yet)) "DRTIO-over-SPI".
<sb0> insert just one bit due to a clock glitch (e.g. at power up) and all SPI comms break down as they lose sync
<rjo> but that would be no CS control at all.
<sb0> again that can be solved, but it's not just "plug SPI into the other end of the DRTIO receiver"
<rjo> in my mind, the SPI slave would shift in a variable amount of data, and when CS is deasserted, push that framed packet into the same pipeline the DRTIO packets would go into.
<rjo> when CS is asserted, start a new packet.
<rjo> yes. this breaks if there is no proper CS.
<sb0> also SPI will come with its own clock that has no relation to the RTIO clock and isn't even free-running, unlike DRTIO which is fully synchronous
<rjo> SPI controllers either have precise control over CS or else just don't issue clock cycles before and after the actual data.
<rjo> yes. just like for PDQ.
<sb0> btw how are you sampling the SPI clock?
<rjo> precise CS is not needed as long as there are no extra clock cycles.
<rjo> with hysteresis.
<sb0> what if there are extra clock cycles due to power-up glitches?
<sb0> hysteresis?
<rjo> aka debouncing.
<sb0> you mean there is a schmitt trigger on the pcb?
<sb0> ah, ok good
<rjo> no. multiregs and then a hysteretic debouncer.
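A behavioural sketch of "multiregs and then a hysteretic debouncer", written as a cycle-by-cycle Python model. The structure (a two-stage synchronizer feeding a saturating up/down counter) and the parameter values are assumptions made for illustration, not the PDQ gateware: the output only changes once the counter saturates at either end, so a short glitch on the raw input never reaches it.

```python
class Debouncer:
    """Hypothetical model: synchronizer chain plus hysteretic counter."""

    def __init__(self, sync_stages: int = 2, threshold: int = 3):
        self.sync = [0] * sync_stages   # models the multireg chain
        self.threshold = threshold      # assumed counter depth
        self.count = 0
        self.out = 0

    def tick(self, raw: int) -> int:
        """Advance one system clock cycle with the raw input sample."""
        # Shift the raw sample through the synchronizer registers.
        self.sync = [raw] + self.sync[:-1]
        sample = self.sync[-1]
        # Saturating up/down counter driven by the synchronized sample.
        if sample:
            self.count = min(self.count + 1, self.threshold)
        else:
            self.count = max(self.count - 1, 0)
        # Hysteresis: only switch at the two extremes of the counter.
        if self.count == self.threshold:
            self.out = 1
        elif self.count == 0:
            self.out = 0
        return self.out
```

In this model a one-cycle pulse on `raw` leaves the output low, while a sustained high level appears at the output after the synchronizer delay plus `threshold` cycles.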
<rjo> if there are extra cycles due to glitches then (a) if CS is deasserted they don't matter and (b) if CS happens to be asserted as well due to a glitch then they constitute an incomplete packet and when CS is deasserted, the short packet is just canceled.
<sb0> so all transfers below, say, 32 bits are discarded?
<rjo> below minimum drtio packet length. i'd guess that's timestamp + channel number + a few data bits.
<sb0> i.e. the glitches would need to consist of >=32 clock pulses plus CS asserted at all times to cause trouble
<rjo> yes.
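The framing scheme discussed above can be sketched as a small Python model. The class name, the `step` interface and the 32-bit minimum are assumptions for illustration (the log only guesses at the real minimum packet length): bits are shifted in while CS is asserted, clock edges outside CS are ignored, and on CS deassertion the packet is either forwarded or, if too short, cancelled.

```python
MIN_PACKET_BITS = 32  # assumed value; the log only says "say, 32 bits"


class SpiFramer:
    """Hypothetical model of a CS-framed SPI slave receiver."""

    def __init__(self, min_bits: int = MIN_PACKET_BITS):
        self.min_bits = min_bits
        self.bits = []
        self.cs_prev = False
        self.packets = []   # stands in for the downstream packet pipeline

    def step(self, cs: bool, sclk_rising: bool, mosi: int) -> None:
        """Process one sample of the (already debounced) SPI signals."""
        if cs and not self.cs_prev:
            self.bits = []                      # CS asserted: new packet
        if cs and sclk_rising:
            self.bits.append(mosi)              # shift in one data bit
        if self.cs_prev and not cs:
            if len(self.bits) >= self.min_bits:
                self.packets.append(self.bits)  # complete packet
            self.bits = []                      # short packets are dropped
        self.cs_prev = cs


def send(framer: SpiFramer, bits: list) -> None:
    """Helper: clock in the given bits under CS, then release CS."""
    for b in bits:
        framer.step(cs=True, sclk_rising=True, mosi=b)
    framer.step(cs=False, sclk_rising=False, mosi=0)
```

Because every CS assertion restarts the shift register, a glitch burst costs at most one packet, which is the sense in which the framing is self-healing. Without a usable CS the model has no way to resynchronize.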
<sb0> it's still kinda fragile, a software bug on the other end can easily produce that, and you can't reset
<sb0> maybe add a gpio reset line?
<rjo> or we could even do magic interface-enable sequences.
<rjo> but it doesn't break much. it only inserts an event into a RTIO FIFO.
<rjo> plus a reset command.
<sb0> if CS is not used as framing signal, it can desync the whole SPI comms
<rjo> the framing is self-healing.
<rjo> yes.
<whitequark> rjo: ack re: RTIO over SPI
<rjo> but i'd expect CS.
<sb0> so that's not just a spurious rtio event, that's losing control of the device
<rjo> without CS you'd lose control. yes.
<rjo> but doing SPI without CS is masochistic.
<whitequark> rjo: I'm curious. you're mentioning a "spline knot". so is the phaser branch using DACs to output waveforms defined by splines? which type?
<rjo> yes. b-splines.
<whitequark> that's remarkably flexible
<whitequark> I should figure out how it works
<whitequark> are the internals documented anywhere?
<rjo> whitequark: http://pdq2.readthedocs.io/en/latest/architecture.html#spline-interpolation and spline.py in gateware and coredevice
<whitequark> larsc: https://pbs.twimg.com/media/CzFdspIXAAA2lmp.jpg:large is this... a microwave breadboard?
<whitequark> rjo: thanks
<rjo> yep. that's a microwave breadboard.
<rjo> but afaict in practice they tend to just buy optical breadboards and bolt stuff down on those, still using coax cables to connect boxes.
<GitHub194> [artiq] jordens commented on issue #636: The automatic zero-stripping/extending sounds good.... https://git.io/v1uV6
zoobab has quit [Ping timeout: 245 seconds]
stekern has quit [Ping timeout: 260 seconds]
stekern has joined #m-labs
rohitksingh has quit [Quit: Leaving.]
<whitequark> rjo: ooh, I hadn't realized you can compute b-splines using just addition.
<whitequark> hm, there's also cordic, i never understood how that works. but i guess there is already documentation for it
<whitequark> the gateware is really not complex
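One generic way to evaluate a polynomial segment with additions only is the forward-difference method, sketched below in Python. This is an illustration of the general technique under the assumption of unit time steps; the function names are hypothetical and the code is not taken from spline.py. The initial differences are computed once per segment, after which each output sample costs one addition per accumulator.

```python
def forward_differences(coeffs: list) -> list:
    """Initial value and forward differences of p(t) = sum(c_k * t**k) at t = 0."""
    def p(t):
        return sum(c * t**k for k, c in enumerate(coeffs))

    vals = [p(t) for t in range(len(coeffs))]
    diffs = []
    while vals:
        diffs.append(vals[0])
        # Next row of the difference table.
        vals = [b - a for a, b in zip(vals, vals[1:])]
    return diffs


def evaluate_by_addition(diffs: list, n: int) -> list:
    """Produce p(0), p(1), ..., p(n - 1) using only additions."""
    acc = list(diffs)
    out = []
    for _ in range(n):
        out.append(acc[0])
        # Each accumulator adds the (not yet updated) one above it.
        for i in range(len(acc) - 1):
            acc[i] += acc[i + 1]
    return out
```

For a cubic segment this keeps four accumulators, and with integer coefficients the stepped result matches direct evaluation exactly.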
mumptai has joined #m-labs
<GitHub168> [artiq] r-srinivas commented on issue #407: @whitequark @sbourdeauducq Would this be related to #637 or an independent problem? https://git.io/v1zln
sandeepkr has quit [Ping timeout: 265 seconds]
mumptai has quit [Remote host closed the connection]