sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
<whitequark> rjo: python's parser is unsuitable for the kind of tooling we are building
<whitequark> it only lets you return crude error messages based on a line number and sometimes a column
<whitequark> the one I'll make will give you the column numbers for every token, expression, etc
<whitequark> similar to what clang can do
<whitequark> it's not a significant amount of work (i already wrote one such parser and i'm just rewriting chunks of it in python) and it improves usability greatly
sturmflut_ has joined #m-labs
SturmFlut has quit [Ping timeout: 264 seconds]
sb0 has joined #m-labs
<sb0> whitequark, what's the reason for not reusing actual ast nodes?
<sb0> with ast nodes, you could easily run modified asts - and that's something we may need later
<sb0> also, code becomes more directly reusable (there are often tests like isinstance(x, ast.yyy))
<whitequark> sb0: reusing as in inheriting, or reusing as in taking whatever python's parser gives out?
<whitequark> I don't have an opinion on the former except it's possible that python's AST nodes are some weird C thing you can't properly inherit from
<whitequark> (as in, I'll do it if it's possible)
<whitequark> the latter doesn't really make sense to me, it's just more work for no benefit
<whitequark> (you have to match /your/ nodes to /its/ nodes but you still have to parse everything anyway)
fengling has joined #m-labs
<whitequark> from a quick check it appears that I will be able to inherit from ast's nodes, yes
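A hypothetical sketch of what inheriting from the builtin ast nodes could look like, with a full source range attached (the attribute name and range format are assumptions, not pyparser's actual design):

    import ast

    # Subclass a builtin node and carry a full source range alongside
    # the stock lineno/col_offset.
    class Name(ast.Name):
        def __init__(self, id, ctx, source_range=None):
            super().__init__(id=id, ctx=ctx)
            # e.g. ((begin_line, begin_col), (end_line, end_col))
            self.source_range = source_range

    n = Name("x", ast.Load(), source_range=((1, 0), (1, 1)))
    assert isinstance(n, ast.Name)  # existing isinstance() checks still work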
<GitHub148> [pyparser] whitequark pushed 1 new commit to master: http://git.io/j1pf
<GitHub148> pyparser/master 28671ca whitequark: Add diagnostic module.
<sb0> whitequark, inherit or just patch attributes dynamically
<sb0> the latter works (the transforms already do some of that)
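The dynamic-patching variant needs no subclassing at all, since builtin ast nodes accept arbitrary attributes; a minimal illustration (the attribute name is again hypothetical):

    import ast

    tree = ast.parse("x = 1")
    node = tree.body[0].targets[0]        # the ast.Name for `x`
    node.source_range = ((1, 0), (1, 1))  # attach richer location info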
fengling has quit [Ping timeout: 272 seconds]
fengling has joined #m-labs
kugg has quit [Ping timeout: 240 seconds]
kugg has joined #m-labs
<rjo> whitequark: in that case the right way would seem to be to fix the cpython parser and upstream those features, would it not?
<whitequark> rjo: there are several issues with that
<rjo> whitequark: where do we need more information than the usual level of cpython ast (without column)?
<whitequark> rjo: py2llvm diagnostics?
<whitequark> I mean... compare pre-4.7 (or so) gcc and clang
<rjo> carrying over the ast level of metadata into py2llvm.
<whitequark> sorry?
<rjo> whitequark: yes. i see why good debugging symbols are nice and i enjoy it ;) but is this the most efficient way to spend development time right now?
<whitequark> sure why not
<whitequark> i'm nearly done with the lexer
<whitequark> a few more days and it'll be finished. python is substantially simpler than i thought
<rjo> whitequark: a parsed python source code snippet gives you an ast with some level of debugging information (line numbers, not columns).
<whitequark> yes, I know
<whitequark> I've looked at the builtin ast module and its implementation
<rjo> and that could just be carried over (to the extent reasonably possible) through py2llvm into IR.
<whitequark> thus i can say that extending upstream is /not/ a good way to spend development time, plus i'm not even sure if upstream /wants/ it, plus shipping a modified python binary is a pain
<whitequark> yes
<whitequark> well, it's not just IR
<whitequark> it's the diagnostics of py2llvm itself
<whitequark> IR is mainly useful for backtraces; while convenient, proper location information doesn't have that much impact there
<rjo> if you had to guess, what is the slow-down of lexing/parsing wrt cpython ast?
<whitequark> 2x or less
<whitequark> parsing overhead is negligible, for lexing I'm leaning heavily on python's re module
<whitequark> one token = one re.match invocation
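A minimal sketch of the one-re.match-per-token approach, using named groups so a single pattern classifies each token (the token set here is illustrative, far smaller than Python's):

    import re

    TOKEN = re.compile(r"""
          (?P<int>   [0-9]+)
        | (?P<name>  [A-Za-z_][A-Za-z0-9_]*)
        | (?P<op>    [+\-*/=()])
        | (?P<space> [ \t]+)
    """, re.VERBOSE)

    def tokens(source):
        pos = 0
        while pos < len(source):
            m = TOKEN.match(source, pos)  # one token = one re.match invocation
            if m is None:
                raise SyntaxError("bad character at column %d" % pos)
            if m.lastgroup != "space":
                # token kind, text, and exact source range
                yield m.lastgroup, m.group(), (m.start(), m.end())
            pos = m.end()

    print(list(tokens("x = 1 + 2")))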
<sb0> hmm, parsing isn't that fast
<rjo> yes. that is what i meant with carrying over. the py2llvm passes need to raise their own errors based on the metadata.
<whitequark> sb0: I'm not speculating. I'm basing this on data from my ruby parser
<whitequark> ruby parser lexes in pure ruby (using a ragel-built dfa), not even via regexps
<whitequark> the parsing overhead is in the single-digit percents
<whitequark> the lexing overhead is significant but python's simple enough you can trick re into giving you complete tokens
<whitequark> I'm assuming re has a decent implementation
<sb0> well, by "parsing" I meant the call to "ast.parse", which does lexing + parsing
<rjo> well. i also worry about having to maintain another piece in the puzzle. the language might change, pyparser and cpython might diverge in hidden ways...
<sb0> rjo, ad9726 is a 400MSPS DAC. should that be used like the PDQ?
<sb0> I thought you wanted a slow serial DAC
<whitequark> rjo: I will have integration tests verifying ast equality against current python. plus, if this were ruby, it'd be a problem
<whitequark> rjo: python has a clear grammar specification you can simply diff
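A sketch of such an integration test, assuming a pyparser.parse entry point that returns ast-compatible nodes (both the name and the compatibility are assumptions):

    import ast
    import pyparser  # hypothetical package name and API

    def assert_ast_equal(source):
        reference = ast.dump(ast.parse(source))
        ours = ast.dump(pyparser.parse(source))
        assert reference == ours, "pyparser diverged from cpython on: " + source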
<rjo> sb0: it is the pdq2
<sb0> so what does "spi to rtio-bus for ad9726" mean?
<rjo> who calls for ad9726?
<rjo> ah. did i write that. i meant ad5370 ...
<sb0> ah, ok. makes more sense
<rjo> whitequark: my complaint is not about ability but about having to do it ;)
<sb0> so the API would be "dac_channel.set_voltage()" at now()?
<whitequark> rjo: to rephrase: I don't expect that to require any significant amount of time
<whitequark> I've reviewed the changes in grammar from 2.6 to 3.4
<whitequark> to arrive at this
<rjo> sb0: could there be a mediator that converts from the pdq2 style programs to set_voltage() and interleaves?
<rjo> whitequark: ok. let's see. but i will have a warm and fuzzy feeling when i find the first bug and can say "i told you so" ;)
* whitequark shrugs
<rjo> and carrying over all that heavy debugging information (with column numbers, ranges etc) through py2llvm is no biggie?
<whitequark> i want high quality tooling
<whitequark> no, not a problem
<sb0> just pass by reference
<whitequark> sb0: i create a bunch of heap objects, each 3 words big
<whitequark> if I remember cpython's optimization correctly, every source range is 6 heap words or so
<whitequark> if this becomes unacceptable, I know how to optimize it to a single fixnum, but that's unlikely to be necessary
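One way such a single-integer optimization could look, packing (begin, end) character offsets into one int (purely illustrative; the 31-bit split is an assumption):

    # Pack a source range into a single integer: high bits hold the begin
    # offset, low bits the length. Assumes sources under 2**31 characters.
    def pack_range(begin, end):
        return (begin << 31) | (end - begin)

    def unpack_range(packed):
        begin = packed >> 31
        return begin, begin + (packed & ((1 << 31) - 1))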
<rjo> also acting on it and unwinding a funny stack with inlining and interleaving?
<sb0> rjo, the mediator could call wavesynth to convert the program to samples, put that in a big list, and then the kernel would loop over the list and call set_voltage and delay
<rjo> sb0: sounds about right.
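A sketch of that mediator, assuming wavesynth exposes a host-side sample renderer and the DAC driver a set_voltage() method (wavesynth_compute_samples and the other names are placeholders):

    from artiq import *  # early artiq namespace for @kernel and delay; an assumption

    # Host side: render the pdq2-style wavesynth program into raw samples.
    samples = wavesynth_compute_samples(program)

    # Kernel side: replay the samples at a fixed rate.
    @kernel
    def replay(dac_channel, samples, sample_period):
        for s in samples:
            dac_channel.set_voltage(s)  # one rtio event per sample
            delay(sample_period)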
<whitequark> rjo: that's not really any different to existing ast debugging info
<rjo> whitequark: there is zero ast debugging info carried over into IR at the moment AFAIK.
<sb0> rjo, this won't work in parallel blocks because the delay will appear dynamic (as it's inside a loop), but we can add detection of simple cases like this (delay in a for loop of a fixed length without break)
<whitequark> rjo: yeah, I only meant it's not relevant to pyparser
<rjo> sb0: ack.
<sb0> hm. though the for loop can be broken (through an exception) in case set_voltage underflows rtio
<whitequark> if we're talking about just carrying over location info to IR, yes, it's not that hard, our transforms just have to be aware of location info
<rjo> whitequark: ok. i am not the expert on that. i'll leave it to you guys. but it does sound weird that we need to replace the cpython parser/lexer.
<rjo> sb0: yes. the context would be pretty restricted where the interleaving would work. that is fine.
<whitequark> rjo: i wrote a ruby parser a while ago specifically because existing ruby parser's lineno reporting was not enough for a compiler from ruby-like lang to LLVM IR
<rjo> conceptually, why do you need more debugging info if you do X-to-llvm than if you do X-interpreted?
<sb0> rjo, I think we'll soon end up with a multicore system and a crossbar between the cores and each rtio fifo...
<sb0> and each core running one branch of parallel blocks. also for performance reasons...
<whitequark> rjo: fundamentally it is not about to-llvm but about introducing a type system (and py2llvm effectively has one)
<whitequark> adding unusual control flow has the same effect. if some nontrivial invariant is violated, you better be able to explain to the user really well what is wrong with it
<whitequark> and interpreted dynlangs have comparatively few nontrivial invariants, it's almost exclusively "this value doesn't respond to that method"
<rjo> sb0: "end up" as in "accidental" or as in "according to plan"?
<rjo> ah
<rjo> not the comms-cpu/rtio-cpu split
<rjo> but the rtio1/rtio2 cpu split?
<rjo> hyperthreading...
<rjo> whitequark: hmm. do you have an example?
<whitequark> say if a type of a variable was inferred as int for some reason, and you're adding a float to it
<whitequark> you want to show where exactly it was inferred as int, where you add a float, possibly where the second variable was inferred as float
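A minimal sketch of a clang-style diagnostic carrying a column range, rendered with a caret line (the real pyparser diagnostic module most likely differs):

    class Diagnostic:
        def __init__(self, level, message, source_line, begin_col, end_col):
            self.level, self.message = level, message
            self.source_line = source_line
            self.begin_col, self.end_col = begin_col, end_col

        def render(self):
            # underline the offending range with ^~~ like clang does
            caret = (" " * self.begin_col
                     + "^" + "~" * (self.end_col - self.begin_col - 1))
            return "%s: %s\n%s\n%s" % (self.level, self.message,
                                       self.source_line, caret)

    print(Diagnostic("error", "cannot add float to int",
                     "x = x + 1.5", 8, 11).render())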
<sb0> rjo, well, the more channels you need to control, the more cpu power you need
<rjo> whitequark: ok. you should sell that parsing/debugging feature-set to numba. would you not expect them to be constantly complaining about the problem?
<sb0> rjo, the Oxford people want a hundred or more dds channels...
<sb0> rjo, was there a decision made about duty cycle management?
<whitequark> rjo: that's why I'm writing this as a separate package from artiq
<rjo> sb0: that does not make sense to me. there is > 10k$ worth of equipment on each dds channel.
<whitequark> when I wrote the ruby parser, it was a hit
<rjo> sb0: Joe and John, who I asked to look into this feature, did not seem to feel any pain having to do it the pedestrian way and apparently did not understand how the hardware logic analyzer would be used to do the heavy lifting automatically.
<whitequark> rjo: I've barely crossed paths with people using python but I would be surprised if it's not a problem for numba
<rjo> sb0: i would conclude that we delay it.
<sb0> rjo, so we don't put it in this extension?
<rjo> i would love to see it and it is a prime use case for the hard-LA but there is apparently no market.
<rjo> so yes.
<sb0> I wouldn't do it with the hard-LA, but with counters attached to each RTIO channels
<sb0> ok
<rjo> but they would sit right where the RLE encoder for the hard-LA would be.
<rjo> to me the two are very related. hard-LA gives you history, duty-cycle-counters give you an average since reset.
<sb0> yes. but the difficulty in the hard-LA is RLE encoding, putting all the data together, and managing DRAM
<rjo> whitequark: ok.
<sb0> getting the data from the rtio-bus is straightforward
<rjo> sb0: ack. the hard-LA is bigger.
<sb0> rjo, and how do we keep duty cycle during kernel handovers? preprogram enough pulses and leave the RTIO core running?
<rjo> sb0: i guess all you would need to do is If(rtio, duty_cycle.storage.eq(duty_cycle.storage + 1))...
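Fleshed out slightly, that could look like the following Migen module (import path as of Migen circa 2015; in the real design the counter would presumably sit behind a CSR rather than a bare Signal):

    from migen.fhdl.std import *

    class DutyCycleCounter(Module):
        def __init__(self, rtio_line, width=32):
            self.count = Signal(width)
            # count every cycle the rtio line is asserted
            self.sync += If(rtio_line, self.count.eq(self.count + 1))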
<rjo> sb0: yes. how fast is a handover with a back-buffered kernel?
<sb0> rjo, what if a large part of the experiment is in non-kernel mode, e.g. CPU intensive on the PC, or making a RPC call to a particularly slow device?
<sb0> if everything is already buffered, a handover is the same order of magnitude as a function call
<rjo> sb0: split the rpc or disallow it.
<rjo> that is what i would have guessed.
<rjo> the cpu intensive stuff would need to go into the pipelined prepare() or analyze()
<rjo> sb0: re duty cycle: that register transfer between the clock domains is similar to the one for the debug interface, right?
<sb0> yes, but the debug interface does not need to be precise
<rjo> the duty cycle counters don't need to be precise either. i guess knowing the duty cycle to 8 bits is sufficient for virtually all cases.
<rjo> that is however more than the single bit
<rjo> that is needed for the debug if.
<sb0> whereas with duty cycle counters, you want a clear definition of what the duty cycle is for a given value of now()
<rjo> but now() is in rtio cycles.
<sb0> yes. and the duty cycle counters will use rtio cycles, but the accounting of rtio cycles will happen in the CPU clock domain
<rjo> ah. i get it.
<rjo> you want to accumulate based on what is put into the fifo.
<sb0> yes
<rjo> the delta between pulses.
<rjo> ok.
<rjo> fine. that is perfect.
<rjo> would the hard-LA be in rtio-domain or in cpu?
<rjo> sounds like we have the duty-cycle thing sorted out. agree?
<rjo> then onto wb2rtio.
<sb0> we can do both, since we just want a full dump over a length of time and not a quick readout from the CPU at a given time
<sb0> doing it in CPU domain spares us one (easy) clock domain transfer
<sb0> yes, duty cycle is ok
<rjo> what do you mean by both?
<sb0> I mean: either will work
<rjo> ok.
<sb0> doing it in CPU domain is slightly easier imo
<rjo> whatever is easier as long as it does not tax the cpu at all in normal mode.
<rjo> once you want to download the post-mortem all bets can be off.
<sb0> it will DMA all the time and use DRAM bandwidth. other than that the CPU doesn't have to worry about it.
<rjo> ack
<rjo> re wb2rtio. imho the natural way to represent a wb-read in rtio would be to use that rtio-channel as output for the address and then put the data into an input fifo. then you can process the read data at your convenience and independent of the wb cyc length and wb device latency.
<rjo> that feature should be called rtio2wb (rtio-slave, wishbone-master)
<sb0> wait, there are 2 things:
<rjo> the wb2rtio would replace the test mode for the dds in the runtime.
<sb0> 1) an adapter that takes a device with a RTIO-bus interface on it and wraps it so it can be connected to a wishbone bus in the same clock domain
<sb0> 2) injection of RTIO commands from the CPU in debug mode
<sb0> #2 would just use the same type of register interface used right now I think
<rjo> hmm. 1) seems a bit boring. my use case would be instead of doing all these complicated writes to the rtio-bus fifos, just writing/reading to wishbone. thus not only exposing one rtio device on wb via synthesis changes but exposing the entire rtio-bus on wb at runtime.
<sb0> #1 is trivial, yes. and basically a development thing...
<rjo> is that then equal to 2)?
<sb0> there is no entire rtio-bus. each fifo has a point-to-point link to its device.
<sb0> the demultiplexing is done in the cpu domain, before the fifos
<rjo> yes. they all lie flat on wb or csr.
<rjo> let's step back.
<sb0> rtio-bus is a bit of a misnomer, since the links are point-to-point. I guess it came from dds *bus* on rtio...
<rjo> ack.
<rjo> my use case would be writing rtio-spi.
<rjo> more precisely rtio-ad5370
<rjo> one approach could be to start with wb-ad5370 with mapped registers and debug that thoroughly.
<rjo> then if we had rtio2wb, we could just hide the wishbone device behind rtio and be happy
<rjo> the other approach would be to start with rtio-ad5370 right away but wrap the rtio in wb2rtio and again get memmapped registers
<sb0> so a rtio-bus device would have an optional (as ttls won't need it) abstract concept of addresses
<rjo> either is ok by me. which is easier?
<sb0> that rtio2wb would map to the wishbone address, if it exists?
<rjo> yes
<rjo> yes
<sb0> and then data, of an arbitrary number of bits (1 for ttls, 32 for dds, etc.)
<rjo> the fifo data layout would be (time, addr, dat_w, w_en) for the output fifo and (time, addr, dat_r) for the input fifo.
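Written as Migen record layouts, those fifo words could look like this (field widths are illustrative):

    from migen.genlib.record import Record

    rtio_out_layout = [("time", 64), ("addr", 8), ("dat_w", 32), ("w_en", 1)]
    rtio_in_layout = [("time", 64), ("addr", 8), ("dat_r", 32)]

    out_word = Record(rtio_out_layout)
    in_word = Record(rtio_in_layout)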
<sb0> there should also be the configuration signals (e.g. ttl oe) somewhere
<sb0> though that can be done with a special address, maybe
<sb0> and then ttls would also have addresses
<sb0> memory-mapped-IO within RTIO
<rjo> dat_{rw} being something like (o, oe, rising, falling) for a hypothetical ttl wishbone module that is to be wrapped in rtio2wb
<sb0> oe management isn't that simple, since the action of the output commands depends on the oe state (set the level on the line for output, open/close the gate for input)
<sb0> but a special address that switches oe would work
<rjo> you had suggested wb2rtio last time. and the more i think about it, having to choose at synthesis time is ok.
<rjo> you could just ignore o if !oe the way tstriple already does it, no?
<sb0> you don't want to ignore o. you want to recycle the output commands to open/close the gate when in input mode
<sb0> well I guess that once we have addresses, opening/closing the gate can be done with another address than the one used for o
<sb0> then you can even do loopback tests without having to connect two RTIO pins on the board
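An illustrative address map for such a ttl channel (the addresses and names are hypothetical); with the gate on its own address, a loopback test needs no external wiring between pins:

    # memory-mapped-IO within RTIO, per the discussion above
    TTL_ADDR_O = 0     # drive the output level (output mode)
    TTL_ADDR_OE = 1    # switch between output and input mode
    TTL_ADDR_GATE = 2  # open/close the input gate (input mode)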
<rjo> ack. iirc i had it like that in ventilator (but the ttls were a 32-bit bank) and there were 4 addresses.
<rjo> yes.
<rjo> but isn't wb2rtio (at synthesis time) the pragmatic solution for this debugging?
<rjo> (i guess we have drifted off into partly reorganizing the rtio-ttl registers)
<sb0> the loopback test does more than testing the hardware... it also tests some of the interleaving, drivers, etc.
<rjo> absolutely. i meant rtio2wb vs wb2rtio for debugging rtio-ad5370.
<sb0> yes
<rjo> ok. then this is also agreed, right? lets go for a slim wb2rtio wrapper that can be used at synthesis time.
<sb0> yes
<rjo> afaict wb2rtio would choose a suitable timestamp and push into the fifo for writes
<rjo> for reads it would have a flag and a register with the data (stripping the timestamp?)
<sb0> wb2rtio would give the same register-based interface as the current one
<rjo> a readable flag
<sb0> the cpu would be responsible for choosing the timestamp (e.g. by reading the current counter value and adding a margin)
<sb0> basically when a channel is in debug mode the rtio core would hand over the associated control registers to the comm-CPU
<sb0> (by "debug" I mean "override")
<rjo> ? you mean wb2rtio would consist of a timestamp register, a write register and a bunch of pseudo-mem-mapped registers?
<rjo> isn't the current rtio interface already wishbone?
<sb0> there will be no wb2rtio. just the possibility to access the FIFOs from the comm-CPU, and disable access to the same from the kernel-CPU.
<sb0> memory-mapped CSRs on wishbone, yes
<rjo> ok. yeah. i guess that is fine. doing the mapping in software is easier than in gateware.
<rjo> and debug_rtio_write_soon(rtio_channel, fifo_data) would be it. and fifo_data the aforementioned (addr, data) or the like.
<rjo> easier than the *memmapped_wb2rtio_address = data with a complicated wb2rtio translator that decodes addresses, guesses timestamps etc in gateware.
<sb0> yes
<sb0> and we don't need performance for the debug interface
<rjo> ack. agreed.
<rjo> ok. that was an efficient discussion!
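A sketch of the agreed debug interface on the software side; rtio_counter_read() and rtio_fifo_push() stand in for whatever CSR accessors the rtio core hands over to the comm-CPU in override mode (both names are assumptions):

    def debug_rtio_write_soon(rtio_channel, addr, data, margin=8000):
        # choose the timestamp in software: current counter plus a margin
        now = rtio_counter_read()
        rtio_fifo_push(rtio_channel,
                       timestamp=now + margin,
                       addr=addr, dat_w=data)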
Alain has joined #m-labs
<sb0> rjo, your favorite program :) https://asciinema.org/a/18224
<whitequark> sb0: i've just discovered that pep8 can't automatically fix violations
<whitequark> see this is why tooling aware parsers are necessary :p
<whitequark> neither can pylint, it seems
<whitequark> rubocop, ruby's pylint, specifically uses the rewrite module of my parser to fix these while not breaking the AST by accident
sturmflut-work has quit [Remote host closed the connection]
sturmflut-work has joined #m-labs
stekern has quit [Quit: Lost terminal]
stekern has joined #m-labs
sh[4]rm4 has joined #m-labs
sh4rm4 has quit [Ping timeout: 265 seconds]
fengling has quit [Quit: WeeChat 1.0]
kyak has quit [Ping timeout: 252 seconds]
kyak has joined #m-labs
kyak has joined #m-labs
kyak has quit [Ping timeout: 264 seconds]
kyak has joined #m-labs
kyak has joined #m-labs
kyak has quit [Changing host]
stekern has quit [Quit: Lost terminal]
stekern has joined #m-labs
<kristianpaul> sb0: remember the parameter for not optimizing a design away in ise/xst?
<kristianpaul> hmm signal keep
<kristianpaul> (* KEEP = "TRUE" *)
sb0 has quit [Read error: Connection reset by peer]
sh[4]rm4 has quit [Remote host closed the connection]
sh4rm4 has joined #m-labs
sb0 has joined #m-labs
Zougloub has quit [Ping timeout: 256 seconds]
<rjo> sb0: yes. tmux was my "tool of the year 2014". 2013 was and 2015 will be vim ;)
digshadow-w has joined #m-labs
<digshadow-w> I heard some physics folks hang out here, so...anyone have a user manual for a LeCroy 6010 magic controller (GPIB <=> CAMAC)? A friend is trying to get some instruments running but doesn't have the manual. I found something similar, but it would be good to have the actual user manual. I've collected what I have so far here: http://siliconpr0n.org/wiki/doku.php?id=lecroy:6010_magic_controller
sh[4]rm4 has joined #m-labs
sh4rm4 has quit [Ping timeout: 265 seconds]
<rjo> digshadow-w: ha. got nothing here. nothing in the boat anchor manual archives?
<digshadow-w> nada
rofl__ has joined #m-labs
sh[4]rm4 has quit [Ping timeout: 265 seconds]
sh[4]rm4 has joined #m-labs
rofl__ has quit [Ping timeout: 265 seconds]
Alain has quit [Remote host closed the connection]
sh[4]rm4 has quit [Remote host closed the connection]
rofl__ has joined #m-labs