sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
<sb0>
rjo, yes, lockup without timeout/register dump is a new problem
<cjbe__>
sb0: what is the status of the RTIO analyser for DRTIO systems?
attie has quit [Ping timeout: 245 seconds]
attie has joined #m-labs
futarisIRCcloud has joined #m-labs
rohitksingh has joined #m-labs
rohitksingh has quit [Client Quit]
<GitHub21>
[artiq] jonaskeller commented on issue #902: After running all weekend, it seemed fine when I got here this morning. So, in any case, this is a major improvement. I can't yet tell whether it's entirely solved, however: Since then, I've seen two instances of the coredevice not responding anymore until I manually reset the FPGA, but didn't catch any `panic` entries in the UART log (as described here https://github.com/m
attie has quit [Ping timeout: 245 seconds]
attie has joined #m-labs
<sb0>
cjbe__, it should work normally
<sb0>
cjbe__, it's Si5324 -> IBUFDS_GTE2 -> GTPE2_COMMON -> GTPE2_CHANNEL and goes out of the TXOUTCLK port of the latter
<sb0>
_florent_, have you tested the GTH multilink support? I don't get any link when I try to use it
<sb0>
it still works when configured for single-link
attie has quit [Ping timeout: 240 seconds]
attie has joined #m-labs
<sb0>
well, could be a problem elsewhere
sb0 has quit [Quit: Leaving]
attie has quit [Ping timeout: 240 seconds]
attie has joined #m-labs
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<GitHub2>
[artiq] jordens commented on issue #856: Serwb also appears to hang randomly on sayma-3, mostly when waiting for HMC830 lock. This happens with current master as well as spi2. And @hartytp also sees this (with a slightly older master).... https://github.com/m-labs/artiq/issues/856#issuecomment-368777705
<GitHub116>
artiq/master e565d3f Sebastien Bourdeauducq: kasli: add analyzer and RTIO log to DRTIO master target
<sb0>
cjbe__, ^
<sb0>
_florent_, okay, but please focus on the gtp code for now. i suspect the problem is in my code anyway.
<_florent_>
sb0: yes, i'm already working on gtp, changed should be implemented, i'm testing that on hardware
<_florent_>
changed/changes
<GitHub144>
[artiq] hartytp commented on issue #854: @sbourdeauducq thanks for the update. hmmm...1ns still seems less than I'd expect from Greg's plots. What do you think @gkasprow are you happy that this is all consistent or should we look into it further? https://github.com/m-labs/artiq/issues/854#issuecomment-368821721
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
rohitksingh_work has quit [Read error: Connection reset by peer]
marmelada_ has joined #m-labs
marmelada has quit [Ping timeout: 260 seconds]
<cjbe__>
sb0: I am looking at a master TTL output and a satellite TTL output on a scope, running this experiment: https://hastebin.com/ufinacilud.py
<cjbe__>
I am restarting the master and satellite, then running this experiment. About 1 in 4 times I see write underflow errors (https://hastebin.com/wubiyevayu.sql - master log on left, satellite on right)
<cjbe__>
Additionally, I see the time offset between the master and satellite TTL shift: after a restart there can be shifts of up to 25ns, and shortly after a restart, when the experiment is running, I see relative time drifts of up to 4ns over a few seconds
<whitequark>
rjo: I have no idea how
<rjo>
whitequark: please give me a report of your work yesterday.
<whitequark>
I've updated the conda recipes in release-3 to match those in master and verified that both result in empty packages
<cjbe__>
sb0: ignore the time offset statement for now - I see this between two master output channels, so must be me not resetting the rtiox4 PLL properly
<rjo>
whitequark: when did it break?
<whitequark>
rjo: exactly after your changes
<rjo>
whitequark: show me.
<rjo>
whitequark: show me the last commit that worked and the first that broke it.
<rjo>
whitequark: and show me when you changed the recipes.
<whitequark>
good: 37a0d658 bad: 1de2da56
<whitequark>
and that's on master where I didn't change any recipes.
<whitequark>
ah, no, it does include my changes
<whitequark>
ok, then I know why
<rjo>
whitequark: right.
<rjo>
whitequark: nothing on the camera driver? nothing on anything else?
<GitHub135>
artiq/master b81855c whitequark: conda: don't use globs in file list.
<whitequark>
I've finally fixed conda
<sb0>
whitequark, on release-3 too?
<whitequark>
not yet
<whitequark>
let's see how it passes tests on master first
sb0 has quit [Quit: Leaving]
rohitksingh has quit [Read error: Connection reset by peer]
<cjbe__>
sb0: I still see these latency drifts even using TTLSimple, so I don't believe it is my gateware.
<cjbe__>
After restarting master+satellite I see the master-satellite latency change by up to 30ns.
rohitksingh has joined #m-labs
<cjbe__>
Most of the time (~70%) the latency stays fixed (less than ~1.5ns jitter) after reset.
<cjbe__>
Sometimes it drifts by up to 20ns over 5s or so, then I get 'timeout attempting to get remote buffer space' from the master, and sometimes a stream of 'write underflow' errors from the satellite
<GitHub184>
pdq/spi2 282bac8 Robert Jordens: test: comment unimplemented function
sb0 has joined #m-labs
sb0_ has joined #m-labs
<sb0_>
cjbe__, by "drifts over 5s", is that 5s after the link is established and then it stabilizes? or is it continuously drifting?
<sb0_>
cjbe__, how are you measuring the latencies? the clock phases at the si5324s should be good indicators
attie has quit [Ping timeout: 256 seconds]
attie has joined #m-labs
<cjbe__>
sb0_: I see roughly a linear drift after I start running the kernel. I am starting the kernel a couple of seconds after the link is established
<cjbe__>
I am measuring the latencies by running a kernel that generates simultaneous (by RTIO timestamp) pulses on a local and a remote TTL every 1ms, then monitoring the time difference between edges on a scope
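The measurement described above (one edge-offset sample per 1ms pulse) can be reduced to a drift rate and a peak-to-peak spread. The following is only a sketch: `fit_drift` is a hypothetical helper, and the data is synthetic, made up to resemble a 20ns linear drift over 5s rather than taken from the scope.

```python
def fit_drift(times_s, offsets_ns):
    """Return (least-squares slope in ns/s, peak-to-peak spread in ns)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_o = sum(offsets_ns) / n
    # Ordinary least-squares slope of offset against time.
    num = sum((t - mean_t) * (o - mean_o) for t, o in zip(times_s, offsets_ns))
    den = sum((t - mean_t) ** 2 for t in times_s)
    slope = num / den
    pk_pk = max(offsets_ns) - min(offsets_ns)
    return slope, pk_pk


# Synthetic example (assumption, not measured data):
# one sample per 1 ms pulse for 5 s, offset growing linearly to 20 ns.
times = [i * 1e-3 for i in range(5001)]
offsets = [20.0 * t / 5.0 for t in times]
slope, pk_pk = fit_drift(times, offsets)
print(f"drift {slope:.2f} ns/s, pk-pk {pk_pk:.1f} ns")
```

A slope near zero with a small peak-to-peak spread would correspond to the "latency stays fixed" case; a sustained non-zero slope would correspond to the drifting case.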
<GitHub185>
artiq/release-3 232940e whitequark: conda: don't use globs in file list.
<sb0_>
cjbe__, when exactly does the drift start? how long does it last?
<cjbe__>
sb0_: the drift is visible as soon as the kernel starts (= when I see the TTL pulses on the scope). It lasts for 5s or so. It ends in one of two ways: either the master throws a remote buffer space error and the satellite throws write underflow errors (at which point the satellite TTL output stops toggling), or the drift slows down and stops, after which the latency seems to stay fixed
<cjbe__>
I can look at the si5324 clocks if that would be useful
<whitequark>
rjo: any idea what could be the cause of RTIO underflow, or should I start bisecting?
<whitequark>
that's probably going to take a while
<rjo>
whitequark: only a "feeling". when i was working on the opticlock experiments i had the impression that as a result of merging your compiler-rt fix, i had additional underflows.
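For reference, the bisection mentioned above amounts to a binary search between a known-good and a known-bad commit. This is a minimal sketch, not the actual workflow used here: `first_bad`, the commit names and the `is_bad` predicate are all hypothetical, and in practice `is_bad` would mean building a commit and checking for RTIO underflows.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits: ordered oldest to newest, commits[0] known good,
             commits[-1] known bad.
    is_bad:  callable that tests one commit and returns True if it fails.
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Placeholder history (assumption): the regression appears at "c6".
history = [f"c{i}" for i in range(10)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

The number of builds needed grows with the logarithm of the range, which is why a long commit range still "takes a while" when each test involves a full build and hardware run.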
<bb-m-labs>
build #2127 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2127 blamelist: whitequark <whitequark@whitequark.org>, Robert Jordens <rj@m-labs.hk>
<cjbe__>
rjo: I am trying to test the AD9910 Urukul driver, but currently I cannot get the PLL to lock - any suggestions?
<cjbe__>
I am using the latest version of the CPLD firmware. I am supplying a 125 MHz clock to the Urukul MMCX, and I have the placement option components in the clock path to do this
<cjbe__>
I have set the CPLD clk_sel argument to 0, and the pll_n argument to 32
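A quick sanity check of these settings, as a sketch only: it assumes the reference clock is divided by 4 before reaching the AD9910 PLL, which is not stated in this log, and `ad9910_sysclk` is a hypothetical helper rather than a driver function.

```python
def ad9910_sysclk(refclk_hz, pll_n, ref_div=4):
    """PLL output frequency in Hz, assuming a fixed reference divider."""
    # ref_div=4 is an assumption about the clock path, not taken from the log.
    return refclk_hz / ref_div * pll_n


sysclk = ad9910_sysclk(125e6, 32)
print(f"sysclk = {sysclk / 1e6:.0f} MHz")
```

Under that assumption, 125 MHz with `pll_n` = 32 gives a 1 GHz system clock, so the two values quoted above would at least be consistent with each other.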
hartytp has joined #m-labs
<hartytp>
Cjbe did you check the cpld But?
<hartytp>
Bitstream
<hartytp>
Not sure what's loaded atm
<hartytp>
Oops sorry missed that you said you'd done that
<hartytp>
Nevermind
<cjbe__>
sb0: looking at the si5324 outputs on the master and satellite when the DRTIO link is up, I see significant phase drift: https://streamable.com/9fdku
<cjbe__>
Hartytp: rjo's nice driver checks for version compatibility and moans if there is a mismatch :)
<hartytp>
:)
<cjbe__>
sb0: fanning a little air over the Kaslis I see relative timing shifts of 3ns pk-pk between the master and slave clocks
<hartytp>
What's the loop be set to?
<hartytp>
Bw
<hartytp>
For lower bws the si5324 is pretty noisy. Doesn't follow the reference well and Kasli just has a cheap xtal
hartytp has quit [Ping timeout: 260 seconds]
mumptai has joined #m-labs
<rjo>
cjbe__: which board versions? which artiq versions? which cpld versions?
<rjo>
cjbe__: and did you set the ifc_mode switches to 0b0001?
<GitHub117>
[artiq] gkasprow commented on issue #854: with 12mA of drive strength I observe quite high slew rate so 1ns is quite possible. With 16mA the slew rate is much higher so 3ns makes sense. Take into account that we have really small window. The symbol length is 4ns and if you add setup, hold time and FPGA skew this makes sense. Traces are equalised up to 200ps https://github.com/m-labs/artiq/issues/854#issuecomment
<rjo>
cjbe__: right. that's why i wanted exact versions, not just "latest". but anyway. other than the ifc_mode switch i don't see what's wrong right now. pll lock times out and the led stays red?
<cjbe__>
I assume that this is due to the change to spi2 and is innocuous?
<cjbe__>
rjo: I think it probably is the ifc_mode switch - will let you know tomorrow if this is not the case
<GitHub123>
[artiq] philipkent commented on issue #932: Downgrading openocd to version 0.10.0 build 1 fixed the unknown flash device error. Running `artiq_flash -t kc705 -m nist_qc2` using artiq version 2.4 now complains that it cannot find the binaries directory though:... https://github.com/m-labs/artiq/issues/932#issuecomment-369044181
<GitHub40>
[artiq] philipkent commented on issue #932: Copying the binaries directory over from 2.3 worked. Are there any differences between the 2.3 binaries and artiq 2.4 that might cause problems doing it this way? https://github.com/m-labs/artiq/issues/932#issuecomment-369045878
<sb000>
cjbe__, yes, looking at si clocks (master vs. sat) would be very useful
<sb000>
input to the satellite si at the same time, if possible (probe on pcb or modify gateware to output somewhere else - it's the clock in the rtio_rx0 domain)
<sb000>
where are you blowing air? si? fpga?
<sb000>
if things are working properly, phases of those three clocks are fixed
cjbe has joined #m-labs
<cjbe>
sb000: I posted a video of the si clock output from master+satellite earlier: https://streamable.com/9fdku
<cjbe>
the pk-pk variations are ~3ns
<cjbe>
I am fanning air in the general direction of the master+satellite Kasli, so hitting both FPGAs and both Si5324
<cjbe>
I will have a look at the satellite rtio_rx0 clock tomorrow