sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
ohsix has joined #m-labs
sb0 has joined #m-labs
kaolpr has quit [Ping timeout: 268 seconds]
kaolpr has joined #m-labs
<sb0> rjo, HK board is working again. seems the lab got hotter and the RTM overheated. I added another two sizable fans cooling it.
Shikadi has quit [Ping timeout: 248 seconds]
<sb0> rjo, seems the frequency1 bug is gone!
<sb0> rjo, thanks and congrats!
mumptai_ has joined #m-labs
sb0 has quit [Quit: Leaving]
mumptai has quit [Ping timeout: 265 seconds]
<GitHub10> [smoltcp] crawford commented on issue #223: I just made it public. Here is the code in question (without the suggested changes): https://github.com/crawford/PoE/blob/3f26e93afc082af7fc9c40eb1b5c55354afa071a/firmware/src/main.rs#L113. https://github.com/m-labs/smoltcp/issues/223#issuecomment-393649685
futarisIRCcloud has joined #m-labs
mumptai has joined #m-labs
mumptai has quit [Read error: No route to host]
hartytp has joined #m-labs
<hartytp> sb0: Didn't Greg's guys get Sayma working in the racks?
<hartytp> if so, that might be safer than relying on desk fans
<GitHub-m-labs> [artiq] hartytp commented on issue #1040: @jbqubit ... https://github.com/m-labs/artiq/issues/1040#issuecomment-396034380
<GitHub-m-labs> [artiq] hartytp opened issue #1064: Sayma: UART silence https://github.com/m-labs/artiq/issues/1064
<hartytp> _florent_: I'm going to look at Sayma a bit more soon
<hartytp> is there anything I can do to help track down these serwb issues?
<hartytp> e.g. any better diagnostics we can run? add some probes to anything?
<_florent_> hartytp: thanks
<_florent_> hartytp: can you remind me how you are loading the rtm
<_florent_> after the amc is started or before?
<_florent_> if you can just confirm that when you see the issue, restarting the amc or reloading the rtm makes it work
<hartytp> _florent_ currently, I'm flashing with `artiq_flash -t sayma --srcbuild ./artiq_sayma`
<hartytp> then loading with `artiq_flash -t sayma --srcbuild ./artiq_sayma load`
<hartytp> what do you mean by "after the amc is started"?
<hartytp> usually, I'm running artiq load either after the amc has finished booting
<hartytp> or after there has been an error during the boot
<hartytp> e.g. I wrote a script that loads the FPGAs once per min and scrapes the UART output to gather statistics
<_florent_> ok and you see the error after a few artiq_flash -t sayma --srcbuild ./artiq_sayma start?
<hartytp> yes
<hartytp> I had the impression that the first few loads generally work, then the issues set in
<hartytp> but my statistics on that observation are poor, so it could just be psychology
<hartytp> then something like 1/3 of boots get stuck in a serwb init failure loop
<hartytp> (this is all low linerate stuff)
<hartytp> (the 1gbps linerate builds generally give me large numbers of errors on amc <-> rtm link tests)
<hartytp> I can post a UART log with repeated loads later today if that helps
<hartytp> ?
<_florent_> yes it could be useful (for statistics)
<hartytp> will do
<hartytp> which version of the code? current master?
<_florent_> otherwise, if you can just test that it recovers correctly by just reloading the rtm
<_florent_> you can use current master yes
<hartytp> I see, so while it's in the serwb init failure loop, reload the RTM only?
<_florent_> yes
<_florent_> ok thanks a lot, i have to go and probably won't be available in the afternoon, but if you post results, i'll analyze them and try to fix it tomorrow
<hartytp> will do
<hartytp> if you think of anything else then let me know
hartytp has quit [Ping timeout: 260 seconds]
_whitelogger has joined #m-labs
Shikadi has joined #m-labs
<rjo> sb0: are you sure that migen's expression size and signedness is generally sound? e.g. the example given in verilog 2001 4.4.2 (answer = (a + b) >> 1; //will not work properly).
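For reference, the Verilog pitfall rjo cites can be sketched in plain Python. This is only an illustration: it assumes the standard's example uses 16-bit operands and a 16-bit result, and the function names are made up for this sketch rather than taken from migen. It does not show how migen itself sizes expressions.

```python
WIDTH = 16                 # assumed operand/result width
MASK = (1 << WIDTH) - 1


def average_context_width(a, b):
    """Evaluate the sum at WIDTH bits: the carry out of the MSB is lost before the shift."""
    return ((a + b) & MASK) >> 1


def average_widened(a, b):
    """Evaluate the sum one bit wider so the carry survives the shift."""
    wide_mask = (1 << (WIDTH + 1)) - 1
    return (((a + b) & wide_mask) >> 1) & MASK


a, b = 0xFFFF, 0x0001
print(hex(average_context_width(a, b)))  # carry dropped
print(hex(average_widened(a, b)))        # carry kept
```

The two functions agree whenever the sum fits in 16 bits and differ only when it overflows, which is the case the standard warns about.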
<GitHub-m-labs> [migen] jordens pushed 1 new commit to master: https://github.com/m-labs/migen/commit/4cb07f186bab610060d661d5be0e67ffa15e3f57
<GitHub-m-labs> migen/master 4cb07f1 Robert Jördens: bitcontainer: slices are unsigned...
<bb-m-labs> build #279 of migen is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/migen/builds/279
sb0 has joined #m-labs
<sb0> hartytp, no, µTCA crate management is full of bugs and totally unusable with sayma
<sb0> the best behavior I have seen (intermittently) is the sayma is powered for half a second and then dies
<sb0> ironically, among the pile of bugs that manifest themselves when sayma is inserted, one of them kills the fans completely but makes the MCH think the fans are running
<sb0> then you have to pull the power cord, otherwise the crate PSU cooks itself
<sb0> (this happens intermittently)
<sb0> rjo, maybe not; can you file issues?
<sb0> I have not touched this code for years and don't remember every detail...
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1040: @jbqubit the HK board working again and the SAWG looks OK now, you should test. https://github.com/m-labs/artiq/issues/1040#issuecomment-396045999
sb0 has quit [Quit: Leaving]
hartytp has joined #m-labs
<hartytp> sb0: can you file an issue about that
<hartytp> last I heard from Greg everything was working fine for Sayma in a uTCA rack
<hartytp> so, either you have the wrong firmware version etc or there is some issue that Greg doesn't know about
<hartytp> ping ^
<hartytp> I would like to look at #1064 by trying out other vivado versions
<hartytp> but Sayma doesn't meet timing on anything other than 2018.1 afaict
<hartytp> Vivado complains about the way ethernet timing is currently done, and removing those LOCs gets it to stop complaining and meet timing
<hartytp> so, I want to check that there really isn't a better way of doing this
<hartytp> is this *really* what Xilinx recommend for RGMII?
<hartytp> aah, just saw that you have posted a uTCA issue
<hartytp> never mind then, thanks!
sb000 has joined #m-labs
<sb000> hartytp, if you're not using ethernet, in theory things are working without any LOC
<hartytp> right, and I do plan to remove them during testing
<sb000> also, to meet timing with the other vivado version, and ethernet, in theory you can set the LOC to wherever the autoplacer put the components, and redo rgmii phase tuning
<hartytp> and there is no way of doing this without LOCs?
<sb000> you can use the ethernet-yakshaving repos for the phase tuning, IME the results with LOC could be then reproduced
<sb000> within misoc/artiq
<sb000> i don't know why you like to pick on those constraints so much. AFAICT they aren't causing much of a problem
<rjo> isn't the usual approach to do proper input delay constraints instead?
<rjo> then you are not fighting PnR
<sb000> there's one for the tx clock
<sb000> (iirc)
<sb000> the purpose is to set the phase between the clock sent to the chip and the clock to oddr
<sb000> with the limited odelay range, it is difficult to do
<hartytp> rjo: thanks for fixing that
<hartytp> will post in a sec, but SAWG output looks good
<hartytp> (well, modulo the fact that I don't have a reconstruction filter)
<sb000> one way to deal with that one without loc is to generate both clocks from the same MMCM
<sb000> but! right now the oddr are clocked from the system clock, which has non deterministic phase due to bufgce_div
<sb000> so you need a dedicated mmcm anyway (and you use one more clock in the design)
<hartytp> is there no way to deal with that?
<sb000> also, arguably, two clocks from the same MMCM is a LOC in disguise
<sb000> if you put a new mmcm for both the oddr and chip clock, and redo the phase tuning, then things work in theory for tx without this explicit LOC you dislike
<sb000> the ethernet core does not require the tx clock be synchronous to the system clock
<sb000> it just happens to be so in the sayma case
<hartytp> sb0: it's more a case of "that Vivado doesn't like" than me
<hartytp> why Sayma?
<sb000> because both the system clock and rgmii clock are 125MHz
<sb000> I.e. 125M was already available so I just used it
<sb000> that's all
<sb000> for rx, hm, might be worth a try without LOC
<sb000> it's a more common mmcm use case, from a clock capable pin, so, the skew between mmcm locations may be low enough to be tolerable
<hartytp> I have an ethernet adapter on the way from greg so will be able to test this soon
<GitHub-m-labs> [artiq] hartytp commented on issue #1039: This bug seems fixed now.... https://github.com/m-labs/artiq/issues/1039#issuecomment-396050955
<sb000> hartytp, ok. does the tx mmcm technique sound clear enough?
<sb000> I can send you a draft patch for you to test and improve on
<hartytp> remind me, is there an artiq idiom that lets me do `for sawg in self.sawgs` in a kernel?
<rjo> just that.
<hartytp> with `self.sawgs = [self.setattr_device("sawg{}".format(i)) for i in range(7)]`?
<hartytp> in build
<rjo> yes. range(8)
<rjo> get_device
<hartytp> thanks, that was what I was looking for
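A small sketch of why rjo's correction matters. The class below is a hypothetical stand-in, not ARTIQ's real environment class; it only assumes, as the exchange above implies, that `setattr_device` stores the device as an attribute and returns nothing, while `get_device` returns the handle.

```python
class FakeEnv:
    """Hypothetical stand-in for an ARTIQ experiment's device helpers (not the real class)."""

    def get_device(self, key):
        # Return a handle; a string stands in for the driver object here.
        return "driver:" + key

    def setattr_device(self, key):
        # Store the handle as an attribute; implicitly returns None.
        setattr(self, key, self.get_device(key))


env = FakeEnv()

# As first asked: the list collects return values of setattr_device, i.e. None.
as_asked = [env.setattr_device("sawg{}".format(i)) for i in range(7)]

# As corrected: get_device returns the handles, and range(8) covers sawg0..sawg7.
as_corrected = [env.get_device("sawg{}".format(i)) for i in range(8)]

print(as_asked)
print(as_corrected)
```

With the corrected list built in `build`, a kernel can then iterate with `for sawg in self.sawgs`, as rjo confirms.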
<hartytp> sb0: I haven't totally followed Sayma ethernet, so can you remind me how the clocking is done?
<hartytp> so, for Tx, the FPGA generates a clock that's sent to the phy chip
<hartytp> and clocks the ODDRs from a phase shifted version of that clock
<GitHub-m-labs> [artiq] hartytp opened issue #1065: Sayma: memory corruption? https://github.com/m-labs/artiq/issues/1065
<hartytp> (bus is 4 bits wide, 125MHz clock, DDR)
sb000 has quit [Ping timeout: 260 seconds]
<GitHub-m-labs> [artiq] hartytp commented on issue #794: Looking at this on my Sayma.... https://github.com/m-labs/artiq/issues/794#issuecomment-396053902
<hartytp> nice! Deterministic phase seems to work between the two DAC chips!
<GitHub-m-labs> [artiq] hartytp commented on issue #794: Cool, so we have working SAWG and SC1. If we can just fix the remaining crashes/serwb issues then we're sorted. https://github.com/m-labs/artiq/issues/794#issuecomment-396053989
<GitHub-m-labs> [artiq] jordens commented on issue #1065: Or a legitimate bug in the stack. Could you `addr2line` it? https://github.com/m-labs/artiq/issues/1065#issuecomment-396054441
<GitHub-m-labs> [artiq] jordens commented on issue #1065: Or a legitimate bug in the stack. ~Could you `addr2line` it?~ https://github.com/m-labs/artiq/issues/1065#issuecomment-396054441
<GitHub-m-labs> [artiq] jordens commented on issue #1065: Or a legitimate bug in the stack. ~Could you `addr2line` it?~... https://github.com/m-labs/artiq/issues/1065#issuecomment-396054441
<hartytp> so, we can either delay the clock or the data to meet s/h at the phy IC
sb000 has joined #m-labs
<sb000> we're shifting the clock to the chip (oddr is currently connected to the system clock) but that's the idea
<sb000> how did you test the phase between the dacs?
<hartytp> min delay provided by the ODDR is 1.28ns https://www.xilinx.com/support/answers/60802.html
<hartytp> looked at the DAC output on a scope
<hartytp> how else would you do it?
<sb000> and that can tell you if it slips by 1 cycle of the 1.2ghz clock?
<hartytp> yes
<hartytp> easily
<hartytp> as I said in the issue, the scope tells me the phase to 1 deg
<sb000> scope plus some processing, see the red pitaya python script in the issue
<hartytp> well, it's basically the same thing, but Keysight did the dsp for me
<hartytp> I can put it on a frequency counter with a 1s gate if you want and get the phase shift to silly precision
<hartytp> as I said, the scope tells me the phase to 1 deg trivially, which is 280ps
<sb000> ok
<hartytp> and, in any case, it's not 1.2GHz, it's 600MHz
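The numbers in this exchange can be checked with a few lines of arithmetic. This is a rough sketch: the 280 ps figure and the two clock rates come from the messages above, while the output frequency implied by "1 deg = 280ps" is derived here and is not stated in the log.

```python
def period_ps(freq_hz):
    """Period of a clock in picoseconds."""
    return 1e12 / freq_hz


resolution_ps = 280.0            # quoted scope resolution for 1 degree of phase

slip_1g2_ps = period_ps(1.2e9)   # one cycle of a 1.2 GHz clock
slip_600m_ps = period_ps(600e6)  # one cycle of a 600 MHz clock

# Output frequency at which 1 degree of phase corresponds to 280 ps (derived, not quoted).
implied_freq_hz = 1.0 / (360.0 * resolution_ps * 1e-12)

print("1.2 GHz cycle: {:.0f} ps".format(slip_1g2_ps))
print("600 MHz cycle: {:.0f} ps".format(slip_600m_ps))
print("implied output frequency: {:.2f} MHz".format(implied_freq_hz / 1e6))
```

Either slip size is several times larger than the quoted resolution, which is consistent with hartytp's claim that a one-cycle slip is easy to see on the scope.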
<sb000> I've seen slips of 1.2GHz cycles
<hartytp> I've already checked that the HMC7043 outputs are deterministic (as has Joe) and everything after that is 600MHz or less
<hartytp> when?
<hartytp> sb0: fwiw, if you have a 20GSPS, 11bit scope, these things are easy
<hartytp> :)
<sb000> a while ago. could have been 7043 issues
<hartytp> well, can you recheck?
<sb000> yea, like lasers etc.
<sb000> sigh
<hartytp> all seems fine after rework and using the latest master
<sb000> ok
<hartytp> sb0: can you generate the Tx clock output using a SERDES at 1GHz?
<hartytp> then you get easy bitslip
<sb000> it's more complicated than the mmcm
<hartytp> how so?
<sb000> less margin unless you also include odelay
<hartytp> so, this isn't my field at all, so maybe a dumb question, but can't you just replace https://github.com/m-labs/misoc/blob/3825320ade45217524f0099deb1fcfe99416d24b/misoc/targets/sayma_amc.py#L158
<hartytp> with a serdes
<sb000> since the resolution is so coarse
<hartytp> okay, so worst case it's serdes + odelay
<sb000> yes
<sb000> but you lose resolution
<hartytp> not with odelay as well
<hartytp> that seems pretty easy to implement
<sb000> odelays are complicated on ultrascale, but yes in theory it works
<hartytp> well, in any case, I don't think the odelays should be necessary
<sb000> then when editing the bitstream for phase tuning you need to figure out how to set the serdes pattern plus the odelay value
<hartytp> I know that we had a silly small eye a while back
<hartytp> but AFAICT, that should be resolved now that greg fixed the hw issues with the phy
<sb000> maybe. but ethernet is still so fragile that I want to avoid things that can break like that
<sb000> a large part of the difficulty here is testing
<hartytp> is it still fragile after the latest hw fixes?
<hartytp> that's news to me
<sb000> you tell me. I haven't seen so many boards.
<sb000> note that rgmii is ddr
<hartytp> I know
<sb000> so with the serdes your granularity really is 1/4 ui
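The 1/4 UI figure follows from the rates mentioned above. A minimal sketch, assuming the "SERDES at 1GHz" means an 8:1 serializer running at 1 Gb/s against the 125 MHz RGMII clock; the `clock_pattern` helper is hypothetical and only illustrates how rotating the serialized word shifts the clock edge.

```python
CLOCK_HZ = 125e6   # RGMII clock, from the discussion
SERDES_BITS = 8    # assumed 8:1 serialization, i.e. 1 Gb/s

clock_period_ns = 1e9 / CLOCK_HZ         # 8 ns
step_ns = clock_period_ns / SERDES_BITS  # 1 ns per SERDES bit
ui_ns = clock_period_ns / 2              # DDR: one data bit every half clock period
step_in_ui = step_ns / ui_ns             # phase granularity in unit intervals


def clock_pattern(shift):
    """Return the 8-bit word for a 50% duty clock delayed by `shift` SERDES bits."""
    base = [1, 1, 1, 1, 0, 0, 0, 0]
    shift %= SERDES_BITS
    return base[-shift:] + base[:-shift] if shift else list(base)


print(step_ns, ui_ns, step_in_ui)
for s in range(SERDES_BITS):
    print(s, clock_pattern(s))
```

Each rotation moves the clock by 1 ns, so any finer adjustment has to come from something like an ODELAY, which is the trade-off being discussed.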
<hartytp> ack
<hartytp> although, I thought that RGMII was usually SDR with an 8-bit bus
<hartytp> I agree that this will be a bit of a PITA to tune if the eye is 0.5ns wide. But, if it's a couple of ns, then it will be easy to find a SERDES tap that works and then tweak the ODELAY
<hartytp> that really doesn't seem difficult
<hartytp> e.g. much less frustrating than fighting against timing issues that arise because of LOCs
<hartytp> so, I'll have a look at that when I look at ethernet
<hartytp> Rx...
sb000 has quit [Ping timeout: 260 seconds]
<hartytp> sb0: remind me where the bufg_div comes into this and what the issues are there?
<hartytp> output not generated
<hartytp> nope
<hartytp> sb0: hmmm...it looks like the code tries to drive the ethernet tx clock output in two places
<hartytp> sorry, never mind, was looking at an old branch
<hartytp> oops
<hartytp> Rx clock is generated by PLL here
<hartytp> sb0: for the Rx, can we not just use a timing constraint?
<hartytp> anyway, as I said, not really my field, so looking for advice here. But, AFAICT, what might work better than the current situation is:
<hartytp> 1. generate Tx clock output from SERDES with ODELAY tuning if necessary
<hartytp> 2. then, add a timing constraint on the inputs
<hartytp> 3. remove all LOCs
<hartytp> sb0, rjo: does that sound right?
<tpw_rules> so uh can sb0 get messages when they're offline
<rjo> hartytp: i'd have to look at the code. dunno.
<hartytp> rjo: I did. That's what would make most sense to me. but, as I said, I'm not an expert on these things...
<hartytp> if it doesn't sound obviously wrong then that's a start
<hartytp> sec 5.8
<hartytp> always nice when the defaults do things which are explicitly forbidden, but without any explanation
hartytp has quit [Quit: Page closed]
<GitHub-m-labs> [misoc] whitequark pushed 1 new commit to master: https://github.com/m-labs/misoc/commit/beef2a4f625082cf20fd702ebef5d2bc35148d99
<GitHub-m-labs> misoc/master beef2a4 whitequark: Update CONTRIBUTING....
<bb-m-labs> build #440 of misoc is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/misoc/builds/440
<GitHub-m-labs> [misoc] whitequark pushed 1 new commit to master: https://github.com/m-labs/misoc/commit/0b978297f8bc78468aa0951153f15fdb57c66466
<GitHub-m-labs> misoc/master 0b97829 whitequark: Change CONTRIBUTING to use ReST syntax and rename back.
<bb-m-labs> build #441 of misoc is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/misoc/builds/441
<GitHub-m-labs> [artiq] jbqubit opened issue #1066: Sayma: 0x40023e78 https://github.com/m-labs/artiq/issues/1066
<GitHub-m-labs> [artiq] hartytp commented on issue #998: @sbourdeauducq do you need any other data to help debug this? https://github.com/m-labs/artiq/issues/998#issuecomment-396071272
<GitHub-m-labs> [artiq] jbqubit opened issue #1067: Sayma SAWG setting frequency1 causes lock-up https://github.com/m-labs/artiq/issues/1067
<GitHub-m-labs> [artiq] jbqubit commented on issue #1067: There's nothing printed on the UART. Setting frequency2 also seems to trigger same behavior. ... https://github.com/m-labs/artiq/issues/1067#issuecomment-396074161
<GitHub-m-labs> [misoc] jordens pushed 1 new commit to master: https://github.com/m-labs/misoc/commit/1325aff65cc5d2657bbae0551526982499324d8a
<GitHub-m-labs> misoc/master 1325aff Thomas: correctly use result of Record.connect in Converter (#81)...
<GitHub-m-labs> [artiq] jbqubit commented on issue #1064: I haven't seen this yet. I'll post if I do. https://github.com/m-labs/artiq/issues/1064#issuecomment-396079109
<bb-m-labs> build #442 of misoc is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/misoc/builds/442
<GitHub150> [smoltcp] jhwgh1968 commented on pull request #232 994d8a7: Fixed, @ProgVal.... https://github.com/m-labs/smoltcp/pull/232#discussion_r194270124
<GitHub-m-labs> [artiq] jbqubit commented on issue #794: Running 4.0.dev+1133.g0b086225. Compare phase between a pair of channels spanning both DACs on single Sayma. Set frequency0 to 40 MHz. Measured phase difference using Tek FCA 3003 Timer using 10k samples of 10 ms intervals. Saw stdv < 1 deg with peak deviation < 9 deg. Cycling off power and reloading 5 times I see variation in mean relative phase of ... https://githu
<GitHub-m-labs> [artiq] hartytp closed issue #794: Clocking, DAC support and JESD synchronization on one Sayma card https://github.com/m-labs/artiq/issues/794
hartytp has joined #m-labs
<hartytp> _florent_: just getting round to doing the tests you wanted
<hartytp> you asked "artiq_flash -t sayma --srcbuild ./artiq_sayma start"
<hartytp> I must have misread that...I've never tried `artiq_flash ... start`
<hartytp> hmm...1 crash after 'loading hmc7043...done'
<hartytp> but no serwb init errors yet
<hartytp> sigh
<hartytp> okay, I can't break serwb tonight
<hartytp> will try again in the morning
hartytp has quit [Quit: Page closed]
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: I also had one occasion where Sayma crashed after `loading HMC7043...done` https://github.com/m-labs/artiq/issues/1065#issuecomment-396085471
mumptai_ has quit [Remote host closed the connection]
<GitHub93> [smoltcp] whitequark commented on pull request #232 994d8a7: Nevermind, what @ProgVal said is correct. https://github.com/m-labs/smoltcp/pull/232#discussion_r194273839
<GitHub-m-labs> [artiq] jbqubit commented on issue #794: I got the fancy scope back so can make a better measurement. Scope and 100 MHz clock to Sayma are phase locked. Still using 4.0.dev+1133.g0b086225 . ... https://github.com/m-labs/artiq/issues/794#issuecomment-396091242