sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
<GitHub41>
[artiq] hartytp commented on issue #861: I drove it single-ended (soldered a 50R resistor across one SMP). At >=1GHz anything between +0dBm and +10dBm is fine (IIRC, those chips can take up to 2Vpp single ended). https://github.com/m-labs/artiq/issues/861#issuecomment-365787252
jkeller has joined #m-labs
<jkeller>
bb-m-labs: force build --props=package=artiq-kc705-nist_qc2 --branch=release-3 artiq-board
<bb-m-labs>
build forced [ETA 19m09s]
<bb-m-labs>
I'll give a shout when the build finishes
<GitHub103>
[artiq] hartytp commented on issue #908: The read/write delays should be the same for both gateware versions, right (w/wo SAWG)? As a temporary hack, would it make sense to just hard code the numbers to the values found without the SAWG? https://github.com/m-labs/artiq/issues/908#issuecomment-365863075
<hartytp>
sb0: that's certainly the configuration we're planning to use; Kasli puts out its 125MHz/150MHz RTIO clock on the MMCX connectors. We then loop that round to the MMCX on Urukul
<hartytp>
Urukul divides by 4 to get it into the PLL range (we're now at 31.25MHz/37.5)
<hartytp>
the AD9910 PLL takes that to either 31.25*32=1GHz or 37.5*26=975MHz
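For reference, the clock chain described above can be checked with a few lines of Python. This is only an arithmetic sketch: the function name and its arguments are illustrative and are not part of ARTIQ or of any Urukul driver.

```python
def urukul_sysclk_mhz(rtio_mhz, divider=4, pll_n=32):
    """Return (PLL reference, AD9910 system clock) in MHz.

    divider: the divide-by-4 on Urukul mentioned in the log.
    pll_n:   the AD9910 PLL multiplier (32 or 26 in the log).
    """
    ref = rtio_mhz / divider
    return ref, ref * pll_n


# 125 MHz RTIO clock with a x32 PLL multiplier
print(urukul_sysclk_mhz(125, pll_n=32))
# 150 MHz RTIO clock with a x26 PLL multiplier
print(urukul_sysclk_mhz(150, pll_n=26))
```

The two calls reproduce the 31.25 MHz to 1 GHz and 37.5 MHz to 975 MHz cases quoted above.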
<hartytp>
sb0: ready to look at SC1 on Sayma.
<hartytp>
but, as usual, it's been over a month since I last thought about this, so I've forgotten the details
<sb0>
hartytp, okay, thanks for the confirmation re kasli clocking
<sb0>
hartytp, do you have any particular questions about sc1?
<hartytp>
yes: what exactly is the test I need to do.
<hartytp>
so, HMC7043 puts out various clocks, which can all be phase shifted
<hartytp>
DACs are clocked from this at 1.2GHz/2=600MHz
<hartytp>
600MHz goes to FPGA as well. Does that clock the SAWG logic?
<sb0>
hmc7043 also puts out a sysref clock that is used to designate DAC clock edges that correspond to the same sample (for all DACs)
<hartytp>
right.
<sb0>
since the latency of the jesd links is not deterministic
<hartytp>
right, SYSREF marks out edges of the 600MHz clock reaching the DAC (IIRC it basically resets the internal frame clock).
<hartytp>
but, the FPGA also needs to know about these edges to align the data it produces to them, right?
<sb0>
at some point yes, but we want to just test the sync between the DACs at the moment
<hartytp>
okay.
<sb0>
i.e. phase between rtio clock and dac outputs may not be deterministic, but phase between dac channels should be
<hartytp>
okay. does that rely on us meeting setup and hold timings somewhere? (i.e. if the data arrives at the wrong time, the DAC can't tell which clock cycle it belongs in)
<hartytp>
in any case, IIRC the current situation is that we're not seeing any resynchronisation events flagged up by the DAC as we scan the phase of SYS_REF w.r.t. the 600MHz DAC clock.
<sb0>
yes, sysref should meet s/h wrt the DAC clock at each DAC
<sb0>
when it doesn't meet s/h there should be synchronization jitter reported by the dac (since we set the tolerance window to zero)
<sb0>
well, zero is "1/2 DAC cycle" in the datasheet
<hartytp>
so, the first test is probably to look at J60 on a scope while scanning the phase of GTP_CLK_1_IN. Trigger the measurement off the 1.2GHz source
<sb0>
GTP?
<hartytp>
CLKOUT8 on the 7043
<hartytp>
looking at the schematics
<hartytp>
it's not the point I would have chosen to probe, but it's the only test connector after the 7043
<hartytp>
re sysref: right, sysref has to meet s/h at the DAC w.r.t. the DAC clock (of course), but doesn't the FPGA also have to meet some s/h w.r.t. the DAC clock/sysref? Or is that all taken care of by the JESD comma alignment/etc?
<sb0>
the JESD links use clock recovery
<sb0>
that's why they have non-deterministic timing wrt the DAC clock in the first place (in addition to the Xilinx transceivers adding some when the TX buffer is not bypassed)
<larsc>
when you have multiple DACs you also want to make sure that the SYSREF edge is seen by all DAC in the same clock cycle
<sb0>
would that cause more problems than a corresponding (stable) phase shift of the outputs?
<hartytp>
sb0: okay...so IIRC, the DAC receives samples from the FPGA and sticks them in a buffer. They get clocked out of there by the DAC's clock (the DAC doesn't recover a clock from the JESD line, all clocks are just divided versions of the DAC clock). SYSREF marks one sample out as being t=0.
<hartytp>
but, if the FPGA doesn't know anything about SYSREF, I would have still thought there was potential for a 1 cycle phase ambiguity between the DACs that could change between power cycles
<hartytp>
anyway, does looking at J60 while scanning the GTP_CLK_1_IN phase sound like the right plan? If so, do you have code to do that?
<larsc>
you only need sysref in the fpga if you want deterministic latency
<larsc>
if all you want is mcs you only need it at the DAC
<larsc>
deterministic latency is e.g. important for control loops
<hartytp>
larsc: the timing of sysref w.r.t. DAC clock should be guaranteed by trace length matching on the PCB (which was nominally done)
<hartytp>
mcs?
<larsc>
multi-chip-synchronization
<larsc>
sysref tells the DAC when to release data from its buffer
<larsc>
but the buffer must be large enough for one lmfc worth of data
<larsc>
so that's quite a bit
<larsc>
think of it as a fifo. the write pointer is incremented whenever data arrives over the jesd link
<larsc>
the read pointer is incremented with each DAC clock cycle
<larsc>
but the read pointer is kept at zero until sysref is seen
<larsc>
it's a bit more complicated than that, but it gives you an idea
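The FIFO picture above can be sketched as a toy Python model. It is a deliberate simplification under stated assumptions: the class and function names are made up for illustration, link latency is modelled as a fixed per-DAC cycle count, and nothing here reflects the actual DAC or JESD core logic.

```python
from collections import deque


class DacFifo:
    """Toy model of the buffer described above (not the real DAC logic)."""

    def __init__(self):
        self.buf = deque()
        self.released = False  # read side is held until SYSREF is seen

    def write(self, sample):
        # Write side: advances whenever a sample arrives over the link.
        self.buf.append(sample)

    def sysref(self):
        # SYSREF releases the read side.
        self.released = True

    def clock(self):
        # Read side: advances once per DAC clock cycle after release.
        if self.released and self.buf:
            return self.buf.popleft()
        return None


def run(link_latencies, sysref_cycle, n_cycles=12):
    """Simulate DACs whose links deliver sample k at cycle k + latency."""
    dacs = [DacFifo() for _ in link_latencies]
    outputs = [[] for _ in link_latencies]
    for cycle in range(n_cycles):
        for dac, latency in zip(dacs, link_latencies):
            if cycle >= latency:
                dac.write(cycle - latency)
        if cycle == sysref_cycle:
            for dac in dacs:
                dac.sysref()
        for dac, out in zip(dacs, outputs):
            out.append(dac.clock())
    return outputs


# Two DACs with different link latencies, released by the same SYSREF edge.
a, b = run(link_latencies=[1, 3], sysref_cycle=5)
print(a)
print(b)
print("aligned:", a == b)
```

In this model both DACs emit the same sample on the same cycle despite the different link latencies, provided SYSREF arrives after the slowest link has delivered its first sample and the buffers neither overflow nor underflow.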
<sb0>
hartytp, what do you want to test? that we can program a phase shift at all on the HMC7043?
<sb0>
the GTP clock phase should not have an impact on DAC synch
<sb0>
also I'm not sure which of the two GTP clock outputs we're using on the 7043
<sb0>
_florent_, when all RX slots are full, and a packet begins in liteeth, it goes to the DISCARD_REMAINING state. this state then goes to TERMINATE, that attempts to enter a slot into the FIFO, and also increments the counter number.
<sb0>
_florent_, this looks like buggy behavior to me.
<sb0>
1. if a slot was freed by the CPU during the DISCARD_REMAINING state, a slot will be incorrectly reused (i.e. without being written) and result in a "duplicate" (possibly truncated or extended with garbage) packet
<sb0>
2. if no slot was freed, the counter increment can lead to a slot being used even though it is also in the FIFO, potentially leading to the kind of corruption that whitequark is seeing if the CPU is in the middle of reading it
<sb0>
_florent_, or did I misunderstand something?
<sb0>
whitequark, do you have a repro for the bug? if that's the problem and you have one, it should be easily fixed
<sb0>
_florent_, what is the purpose of the DISCARD state? can't you reset the counter and go to IDLE directly in the WRITE state?
<hartytp>
sb0: yes, I thought it would be good to check that you can actually program in a phase shift
<hartytp>
looking at J60 seems to be the best way to do that. Once we know that's working, we can assume that you're correctly varying the SYSREF phase
<sb0>
hartytp, okay. in that case yes, that sounds like a good plan. I don't have code to change this output specifically, but things that can be used are a) patching the runtime that has (since recently) phase adjustments or b) iirc there was some early code based on Florent's LiteX to access the clock chips from Python on a PC
<sb0>
*phase adjustments for sysref
<sb0>
are all the outputs programmed the same on the 7043?
<hartytp>
if you have a different test you'd like me to start with then let me know
<sb0>
sysref also has coupling capacitors, so the signals shouldn't be too hard to access, but this may need a fancy scope probe
<hartytp>
sb0: there is only one HMC7042
<hartytp>
7043
<sb0>
yes, but it has many outputs, are they all the same?
<hartytp>
okay, I can do that.
<sb0>
I remember seeing some "sysref generator" that was only available on some
<sb0>
let me check
<hartytp>
my standard "high frequency" probe consists of a 500R 0603 resistor soldered to the PCB (in this case to the coupling cap). Then solder a short pig tail to the 0603 + some hot glue for strain relief
<hartytp>
then run that to a 50R fast scope
<hartytp>
via coax
<sb0>
yes, it seems odd-numbered clock outputs are somehow designed for sysref and even-numbered ones for DAC clocks
<hartytp>
larsc: thanks for the explanation. That much I do remember. What I wasn't clear on was how the data from the FPGA gets aligned into frames. Is that sorted via the JESD layer?
<hartytp>
If so, I'm happy
<hartytp>
SB0: sorry, misread your previous post. There are two SYSREF outputs from the HMC7043, one for each DAC
<sb0>
hartytp, I mean, inside the chip there are two types of clock outputs
<sb0>
it doesn't make sense to try and adjust the non-sysref type
<hartytp>
yes, both SYSREF signals are connected to SCLKOUT pins on the HMC7043
<larsc>
hartytp: the FPGA will start sending data as soon as the DAC is ready, the first sample of that stream is the first sample produced by the DAC
<larsc>
and then the remainder of the stream is aligned to that
<larsc>
as long as there are no overflows/underflows everything will stay aligned
<hartytp>
okay, that makes sense. Thanks!
<hartytp>
Now I'm happy :)
<hartytp>
(couldn't remember if the FPGA started sending data straight away and the DAC just ignored it until the JESD links were up. But, if the FPGA doesn't start sending data until the DAC is ready so that no samples are lost then that makes sense)
<hartytp>
sb0: to confirm, plan is to start by checking that I can phase shift GTP_CLK_1, which is easy to access. Sound good?
<larsc>
at least that is how it is supposed to be, I don't know what your implementation does
<sb0>
hartytp, that's not a SYSREF-designed pin
<hartytp>
true
<sb0>
if you trust that they are similar enough and there are no obscure silicon shenanigans between the two types, that is fine, but it's another loose screw
<hartytp>
no, you're right
<hartytp>
okay, I'll look at one of the DAC sysref pins directly.
<hartytp>
(solder something to C355)
<hartytp>
is there a plan to write an interface to the HMC7043/HMC830 at some point, so I don't need to look up the register map each time I want to make a change?
<sb0>
there are no plans for either, 830 sounds doable, 7043 is pretty messy (we're using the magic output from the proprietary ADI program)
<hartytp>
okay, that's fine then. Has one of you looked through the magic outputs to at least sanity check the register values it produces?
<hartytp>
e.g. check the on-chip termination is set correctly etc
<hartytp>
(while I'm unhappy that the HMC830 doesn't work atm, this reminds me why I didn't like the 7044: the register map for that thing was HUGE and not something I wanted to debug)
<sb0>
yep. good choice.
<sb0>
hartytp, unrelated, what is the main difference between the doublers with cavities that you have in the labs and the green laser pointer KTP crystals?
<rjo>
sb0: the tool is just for playing around with it. but yes. i am currently running it at each power up.
<rjo>
sb0: for opticlock i'll run the fpga from the si5324 at 125 MHz with 100 MHz ref input. the DDS will get their 100 MHz reference clock on the SMA.
<sb0>
okay I'm nearly done with some simple runtime support
<sb0>
your code for 125M output says "10 MHz CKIN2" in the comment
<sb0>
are you not using that code? is the comment incorrect?
<rjo>
i am using that to play with it. the comment is correct. but that's not the condition i'll use for opticlock.
<sb0>
okay. well it's easy to edit the settings in the runtime later
<sb0>
do you mind if I commit the change that makes the opticlock target use that 10M ref and you change it to 100M afterwards?
Ultrasauce has joined #m-labs
<rjo>
sb0: sure.
<rjo>
BTW: I ran Kasli and Urukul on the PTB Yb clock yesterday. A lot of it worked just fine.
<sb0>
rjo, cool
<sb0>
rjo, what kind of experiment code was it?
<sb0>
rjo, do you still encounter the missing event rtio bug? did you find a repro?
<sb0>
rjo, what is the purpose of clocking the dds and kasli with separate signals, lower noise?
<rjo>
just a bit of cooling, pumping, attempted clock transition probing, detection, and processing logic.
<rjo>
i worked around the failed rtio replacement bug. but we need to exclude some bigger problem first.
<rjo>
the kasli v1.0 clock dist and si5324 power rails are too noisy.
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<rjo>
i was constantly looking at "RTIO busy involving <the SPI channel>" and couldn't figure out why.
<rjo>
sb0: i first saw the replacement events getting lost on a TTLOut channel. then i saw the SPI channel being busy.
<sb0>
okay, the TTLOut replacement bug should be less difficult to debug
<sb0>
though, it may make sense to fix timing first
<rjo>
but have a look at that timing failure. that's a bummer.
<sb0>
that one appeared after you added the SERDES TTLs I suppose?
<rjo>
sb0: yes i added four TTLInOut.
<sb0>
well that's the path from the serdes output to the serdes input through the pad, and seemingly involving the tristate pin
<sb0>
the sdram also uses the serdes in this way. let me check if there is a difference in the way the tristate pin is handled. this may need another OSERDES driving it.
<rjo>
sb0: i also noticed vivado complaining bitterly about the way we clock the oserdes on the sdram.
<sb0>
on sayma?
<rjo>
sayma and kasli iirc.
<rjo>
i have the weird feeling that we are not giving it the right constraints to figure it out.
<rjo>
and someone knowledgeable about the xilinx logic should tell me whether it is expected that vivado tries to insert BUFGs on the reset nets and fails due to placement/routing conflicts. i.e. would debugging that to the point that vivado can insert bufgs on our reset nets (which are driven by AsyncResetSynchronizers) be worthwhile?
rohitksingh_work has quit [Read error: Connection reset by peer]
<sb0>
I see differences between sdram and ttl are around the SERDES_MODE and IOBDELAY parameters
<sb0>
s/are//
<rjo>
sb0: WARNING: [DRC REQP-1580] Phase alignment: Unsupported clocking topology used for OSERDESE2_22. This can result in corrupted data. The OSERDESE2_22/CLK / OSERDESE2_22/CLKDIV pins should be driven by the same source through the same buffer type or by a BUFIO / BUFR combination in order to have a proper phase relationship OSERDESE2. Please refer to
<rjo>
the Select I/O User Guide for supported clocking topologies of the chosen INTERFACE_TYPE mode.
<rjo>
W
<rjo>
that's a sdram OSERDESE2
<rjo>
on all of them.
<rjo>
clk is sys4x_clk, clkdiv is sys_clk.
<rjo>
ah. sys4x_clk is a BUFH. let me change that.
<sb0>
BUFG on the reset nets can improve timing, yes
<sb0>
what does it say regarding the placement/routing conflict?
<rjo>
sb0: nothing. do you see what's the placement/routing conflict that prevents insertion?
cr1901_modern has quit [Read error: Connection reset by peer]
<rjo>
sb0: i didn't even know that FF resets could be on BUFG nets.
<rjo>
sb0: and i think we should convert the IOBUFDS into IOBUFDS_INTERMDISABLE to save power. the new opticlock bitstream is running pretty hot, even with heatsink.
<rjo>
and then finally i think we should do an analysis of our position (or lack thereof) on DRC REQP-1839 and 1940...
<GitHub46>
artiq/master 0ef33dd Robert Jordens: manual: add note about the "correct" vivado version...
<GitHub46>
artiq/master 7002bea Robert Jordens: kasli: clean up urukul example more
rohitksingh1 has joined #m-labs
futarisIRCcloud has joined #m-labs
<sb0>
rjo, no, I don't know about the details, just read somewhere it was possible to use BUFGs for resets. I had thought you saw a specific vivado warning message
<larsc>
they can be used for all control signals
<larsc>
clk, reset and ce
<sb0>
do we care about coveralls.io? it is currently broken (and messes with the build icons on the commits) and doesn't provide useful information imo
<bb-m-labs>
build #2045 of artiq is complete: Failure [failed python_coverage_1] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2045 blamelist: Robert Jordens <rj@m-labs.hk>, Robert Jordens <jordens@gmail.com>
<rjo>
sb0: since (last i checked) we submit both the reduced and the test results with hardware it constantly toggles between two values. yes. that's not useful.
<rjo>
sb0: maybe it's something about the output from the async reset synchronizer that vivado doesn't like when running it through a BUFG.
bb-m-labs has quit [Quit: buildmaster reconfigured: bot disconnecting]
<GitHub-m-labs>
[buildbot-config] sbourdeauducq pushed 1 new commit to master: https://git.io/vAnL0
<hartytp>
Dx is for channel 1 = SCLKOUT1 = DAC2 SYSREF
<hartytp>
ex is DAC 1 (which you have down as DAC 1)
<hartytp>
oops
<hartytp>
Ex is SCLKOUT3 = DAC1 sync
<hartytp>
you have that down as DAC2
<hartytp>
where do you actually program these? In the code, you seem to set all delays to 0.
<hartytp>
^sb0, any ideas about that? Am I missing something?
<rjo>
adding a "normal" FF doesn't enable automatic BUFG insertion. forcing bufg insertion on all ARS doesn't produce an error and doesn't change the timing in any significant way. forcing it on just sys and rio: same.
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
sb0 has joined #m-labs
rohitksingh1 has quit [Ping timeout: 260 seconds]
rohitksingh1 has joined #m-labs
<whitequark>
sb0: I'm not sure how to reproduce the bug unfortunately, I guess I could try to rewrite the packet dumps I have such that all dest MACs are the core device's...
<sb0>
PATH problem? maybe that can be worked around by using python3 instead of python?
mumptai has joined #m-labs
<whitequark>
unlikely
<sb0>
well the coverage command (only available in conda) ran the correct python
<GitHub53>
[smoltcp] whitequark commented on issue #165: > Alternatively, or in addition, it might be a good idea to make the Debug impls pretty print like this. That way I could simply write trace!("{:?}", frame). Thoughts?... https://github.com/m-labs/smoltcp/pull/165#issuecomment-365975463
<sb0>
so, if python3 is similarly only available in conda, it should run the right one
<sb0>
okay. so that can explain why the DACs are not reacting - perhaps the default phase value meets s/h wrt the DAC clock, by chance
<hartytp>
I'm scanning both the analog and the digital delays at once. The digital should shift SYSREF by 1/2 a DAC clock cycle per step IIRC (easy to see). The analog should shift it by 25ps=10deg
<sb0>
though, I did look at the output waveforms, and it wasn't conclusive
<hartytp>
well, the rising edges are v sharp, so the window where there is jitter on this is usually v small
<hartytp>
so, I wouldn't expect to see any errors in general
<sb0>
the plot I made from the waveforms after ~30 power cycles is more like something to make an entomologist jealous ...
<sb0>
oh, most of the plots were showing a nice collection of sayma bugs (1.8v failure, DAC init failure, unexplained distortion of the DAC signal, unexplained noise superimposed on the DAC signal)
<whitequark>
sb0: and *that* is your fix?
<whitequark>
can you at least try to do things properly?
<sb0>
let's see if that's actually the problem
<sb0>
bb-m-labs, force build artiq
<bb-m-labs>
build #2049 forced
<bb-m-labs>
I'll give a shout when the build finishes
<hartytp>
yes, there is an output mux that switches the delays in and out. sec
<rjo>
you are not setting that.
<rjo>
it's zero.
<sb0>
if that's what the problem is, how do you propose fixing it cleanly? editing the path to remove c:\python27?
<whitequark>
appending to the path on windows and prepending on linux
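The ordering idea above (append on Windows, prepend on Linux) can be sketched in plain Python. This is a hypothetical helper for illustration only: `merge_path` and the example directories are assumptions, and none of this is buildbot API.

```python
def merge_path(existing, extra, prepend, sep):
    """Join path entries, placing `extra` before or after `existing`."""
    if prepend:
        parts = list(extra) + list(existing)
    else:
        parts = list(existing) + list(extra)
    return sep.join(parts)


# Hypothetical directories, chosen only to show the two orderings.
linux_path = merge_path(["/usr/bin"], ["/opt/conda/bin"], prepend=True, sep=":")
windows_path = merge_path([r"C:\Windows"], [r"C:\conda\Scripts"], prepend=False, sep=";")
print(linux_path)
print(windows_path)
```

Entries earlier in the joined string are searched first, so the prepend flag decides whether the extra directories take priority over what is already there.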
<sb0>
isn't that something that conda is doing?
<rjo>
whitequark: when you start working on it, could you please write down a plan for developing the PCO driver?
<sb0>
hartytp, I didn't decide to use that tool, and I'd actually prefer programming the chip without it as well, but - are you sure it's not setting some more obscure values?
rohitksingh has quit [Read error: Connection reset by peer]
rohitksingh1 has quit [Read error: Connection reset by peer]
<hartytp>
sb0: well, firstly, I'm not sure exactly which version florent is using. the file I mentioned explicitly disables the analog delays. but, maybe he's not using that version
rohitksingh has joined #m-labs
rohitksingh1 has joined #m-labs
<travis-ci>
m-labs/smoltcp#745 (master - f1e5c73 : Andrew Cann): The build passed.
<hartytp>
this csv crap needs to go. Let's use it as a starting point, but then read the datasheet and get a set of register writes that we understand and add them to the code with comments/docs
<hartytp>
well, the analog delay works. not seeing the digital delay yet, so need to look at that. But, the analog should be enough for SC1
<sb0>
whitequark, what is the proper way to check for windows in buildbot master.cfg? the usual os.name == "nt"?
<hartytp>
okay, with that code, the phase of both SYSREF signals does sweep wrt my 1.2GHz source.
<hartytp>
I'll leave that there until I hear back from _florent_
<sb0>
hartytp, great, thanks a lot for your help. does that include the digital delay as well?
<hartytp>
So, what I see is small steps in the phase that look (by eye) consistent with 10deg steps
<hartytp>
I'm nominally sweeping both the digital and analog delays, so I should see both 1/2 cycle and 10deg steps at the same time.
<hartytp>
So, it seems that the digital delay still doesn't work. Probably some other reg needs setting correctly
<whitequark>
sb0: no
<whitequark>
that's always false because master runs only on linux
<hartytp>
max analog delay is 24*25ps=600PS
<hartytp>
600ps
<hartytp>
okay, so not enough for a 600MHz DAC clock, with 1.66ns period
<hartytp>
well, you might get lucky and see something.
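As a quick check of the numbers above, the analog delay range can be compared with the DAC clock period. The values are the ones quoted in the log (25 ps steps, 24 steps, 600 MHz DAC clock); the variable names are illustrative.

```python
ANALOG_STEP_PS = 25    # analog delay step quoted above
ANALOG_STEPS = 24      # maximum number of analog steps quoted above
DAC_CLOCK_HZ = 600e6   # DAC clock quoted above

max_delay_ps = ANALOG_STEP_PS * ANALOG_STEPS
period_ps = 1e12 / DAC_CLOCK_HZ
coverage = max_delay_ps / period_ps

print(f"max analog delay: {max_delay_ps} ps")
print(f"DAC clock period: {period_ps:.0f} ps")
print(f"fraction of a period covered: {coverage:.2f}")
```

The analog delay alone spans roughly a third of one 600 MHz period (about 1.67 ns), so a sweep would only cross the setup/hold window if that window happens to fall inside the reachable range.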
<hartytp>
anyway I can help you look at that next, after _florent_ confirms which version of the code he's using, so I know I'm not wasting my time debugging out of date code
<sb0>
I see. so I can pass around some parameter that defines the path order when creating the steps
<sb0>
or can I get that from the factory?
rohitksingh1 has quit [Read error: Connection reset by peer]
rohitksingh has quit [Read error: Connection reset by peer]
rohitksingh has joined #m-labs
<sb0>
bb-m-labs, force build --branch=release-3 artiq
<bb-m-labs>
build #2050 forced
<bb-m-labs>
I'll give a shout when the build finishes
<whitequark>
sb0: let me look at this
<whitequark>
sb0: anyway, did your change work or not?
<sb0>
buildbot docs say you can put lists in PATH and PYTHONPATH, and it will join them with the appropriate separator. it doesn't say if it deals with ordering problems.
<whitequark>
oh, okay
<whitequark>
I missed that
bb-m-labs has quit [Quit: buildmaster reconfigured: bot disconnecting]
<GitHub-m-labs>
[buildbot-config] whitequark pushed 1 new commit to master: https://git.io/vAnVa
<GitHub-m-labs>
buildbot-config/master bded5fc whitequark: Get rid of %(pathsep)....
<GitHub100>
[smoltcp] batonius commented on issue #166: Now the layers themselves, the main difference between the `LinkLayer` and `RoutingLayer` traits is the `IfaceId` argument used in the former to identify the interface a packet has been received from or should be transmitted to. The Network Layer knows nothing about the interfaces, their IPs, neighbors, routes, etc. and uses the abstraction provided by a `RoutingLayer`, bu