<sb0>
cjbe, I suppose you modified the gateware? also if your experiment compiles with "t.pulse(10*us)(" (no syntax error) that would be a compiler bug
<sb0>
the "received packet of an unknown type" errors in the satellite would point to some data corruption, which is consistent with the difference between the cables
<sb0>
whitequa1k, what is the status of the buildbot?
<sb0>
whitequa1k, is that still the wrong version of artiq-dev being installed?
<sb0>
iirc that didn't affect release-3. is that one working at least?
attie has quit [Ping timeout: 240 seconds]
attie has joined #m-labs
whitequa1k is now known as whitequark
<whitequark>
no, needs conda updated
<whitequark>
and release-3 works only by accident
<whitequark>
it's still the wrong version, just the compatible one
<GitHub29>
[artiq] sbourdeauducq commented on issue #854: Well, even *reloading* the *same* bitstream sometimes produces different results. Sayma is such a pain in the ass, screwing you at every turn. https://github.com/m-labs/artiq/issues/854#issuecomment-368285814
rohitksingh has joined #m-labs
attie has quit [Ping timeout: 240 seconds]
attie has joined #m-labs
<GitHub3>
[artiq] sbourdeauducq commented on issue #854: Additionally, the MMCM behaves differently when it has had its phase edited using the ``open_checkpoint`` method, and when using the original phase. ... https://github.com/m-labs/artiq/issues/854#issuecomment-368289503
rohitksingh has quit [Quit: Leaving.]
rohitksingh has joined #m-labs
attie has quit [Ping timeout: 252 seconds]
attie has joined #m-labs
rohitksingh has quit [Ping timeout: 248 seconds]
attie has quit [Ping timeout: 252 seconds]
attie has joined #m-labs
rohitksingh has joined #m-labs
rohitksingh1 has joined #m-labs
rohitksingh has quit [Ping timeout: 240 seconds]
cjbe has quit [*.net *.split]
kristianpaul has quit [*.net *.split]
forrestv has quit [*.net *.split]
cjbe has joined #m-labs
forrestv has joined #m-labs
kristianpaul has joined #m-labs
rohitksingh1 has quit [Quit: Leaving.]
rohitksingh has joined #m-labs
forrestv has quit [Max SendQ exceeded]
forrestv has joined #m-labs
attie has quit [Ping timeout: 252 seconds]
attie has joined #m-labs
<sb0>
okay I am able to get (but non-reproducibly, as usual) 0% packet loss with SAWG enabled
<sb0>
I wonder if the MMCMs work correctly, this behavior looks kinda similar to the SDRAM and serwb shitbugs
rohitksingh1 has joined #m-labs
<sb0>
I just don't get it, the TX phase adjustment may actually be doing nothing, sometimes it works sometimes it doesn't, based on god-knows-what set of conditions
<sb0>
the exact history of vivado commands seems to have an effect, even if they result in the same properties set on the MMCMs
<sb0>
it's like xilinx is taking the piss
<sb0>
the behavior with the smaller traffic generator, of course, was deterministic, so that you have to spend time dealing with a huge and slow to compile design
<GitHub119>
[artiq] gkasprow commented on issue #854: OK, so I will check with scope how TX delay affect the clock and data relationship. Maybe RC matching circuit will solve this issue. The trace length between FPGA and PHY is high so reflections occur. https://github.com/m-labs/artiq/issues/854#issuecomment-368310337
rohitksingh has joined #m-labs
<sb0>
rjo, it's IDDR/ODDR which AFAIK are per-pin dedicated IOB resources with fixed timing
attie has quit [Remote host closed the connection]
attie has joined #m-labs
rohitksingh has quit [Quit: Leaving.]
rohitksingh has joined #m-labs
attie has quit [Remote host closed the connection]
attie has joined #m-labs
rohitksingh has quit [Quit: Leaving.]
rohitksingh has joined #m-labs
rohitksingh has quit [Quit: Leaving.]
<cjbe>
rjo: I have been using lab.m-labs to do some Sayma-Sayma DRTIO tests
<cjbe>
sb0: I modified the Kasli Master and Satellite to have 4 TTLOut and 4 TTLInOut (without serdes) at each end
<cjbe>
sb0: yeah - the trailing '(' was a copy-paste error
<cjbe>
sb0: I assume you would not expect a 2m DAC cable to have SI issues? I have FS.com 36648 (0.5m) and 36651 (2m) - both are 10G SFP+ DAC
<sb0>
cjbe, when you're not using the drtio link and with that cable, is there anything reported?
<GitHub91>
misoc/master 60eb326 Sebastien Bourdeauducq: sayma: work around Vivado MMCM/set_property problems
<GitHub150>
[artiq] sbourdeauducq commented on issue #854: With the current MiSoC (60eb326b), Ethernet will fail upon startup in ~3/4 the cases due to 100% of the transmitted packets being lost. RX is OK as can be seen by using ``net_trace 1``. In the ~1/4 of startup cases when it works, there is no packet loss reported by ``ping``. This works with and without SAWG.... https://github.com/m-labs/artiq/issues/854#issuecomment
<sb0>
the I MMC command doesn't seem to have an effect either way (doesn't unbreak a broken ethernet, doesn't break a working one)
<sb0>
so, it looks a bit more like an FPGA and not an MMC issue
<rjo>
sb0: are the delays also fixed w.r.t. the clock input?
<sb0>
I'm pretty amazed at how something as simple as ethernet can become messed up like that
<rjo>
sb0: or is this definitely only the tx half? and you are changing the timing between the tx clock (output) and the data?
<sb0>
rjo, right now I'm looking at non-reproducibility across reloads of the same bitstream
<sb0>
sometimes I load the FPGA and things are working perfectly, 0% packet loss either way
<rjo>
i understand that. but "packet loss" is which half?
<sb0>
sometimes RX works 100% (or close to 100%), but all TX packets are lost
<sb0>
TX
<rjo>
and the tx clock is derived from sys_clk?
<sb0>
yes
<sb0>
it's the same MMCM that generates both
<sb0>
the ODDRs that drive the data and txctl are clocked by sys_clk
<sb0>
the clock that goes to the PHY is driven by another output of the same MMCM
<rjo>
and (after all the back and forth with the mmc configuring the phy) we know that the phy is using that txclk to register the data?
<sb0>
yes, otherwise I don't see how we can get 0% packet loss
<sb0>
large packets (1000+ bytes) are also transmitted correctly
<sb0>
and even when the board has been running for ~30min with a working ethernet, it keeps working
<rjo>
this is nibble-based, right? how do those phys determine which nibble is which? txclk phase?
<sb0>
it's DDR nibbles
<sb0>
send 4 bits on rising edge, another 4 bits on falling edge
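[editor's note] A minimal sketch of the DDR nibble scheme sb0 describes, assuming the low nibble goes out on the rising edge and the high nibble on the falling edge (the usual RGMII convention; the log itself does not state the order). The function names are hypothetical and are not taken from MiSoC or ARTIQ.

```python
def byte_to_ddr_nibbles(byte):
    """Split one byte into (rising_edge_nibble, falling_edge_nibble).

    Assumption: bits 3:0 are sent on the rising clock edge and
    bits 7:4 on the falling edge.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    return byte & 0x0F, (byte >> 4) & 0x0F


def ddr_nibbles_to_byte(rising, falling):
    """Reassemble a byte from the two nibbles of one clock cycle."""
    return ((falling & 0x0F) << 4) | (rising & 0x0F)


def serialize(frame):
    """Flatten a frame into the nibble sequence seen on the 4 data lines,
    one nibble per clock edge (rising, falling, rising, ...)."""
    nibbles = []
    for b in frame:
        nibbles.extend(byte_to_ddr_nibbles(b))
    return nibbles
```

Under this scheme the receiver tells the two nibbles apart only by which clock edge it samples on, which is why the phase between the TX clock and the data matters in the discussion above.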
<cjbe>
sb0: sample size 1: I get an aux packet error a few ms after the 'Si5324 is locked' message on the slave, but no other errors for ~10 s, until I ran the kernel