sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
kuldeep has quit [Ping timeout: 256 seconds]
kuldeep has joined #m-labs
<GitHub-m-labs> [artiq] whitequark closed issue #1056: artiq_coremgmt config erase broken https://github.com/m-labs/artiq/issues/1056
<GitHub-m-labs> [migen] whitequark pushed 1 new commit to master: https://github.com/m-labs/migen/commit/df0ce4abac668704ed5b18d17fdca976449c84ce
<GitHub-m-labs> migen/master df0ce4a whitequark: Update version in setup.py....
<bb-m-labs> build #2477 of artiq is complete: Failure [failed python_unittest_1] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2477 blamelist: whitequark <whitequark@whitequark.org>
<GitHub-m-labs> [artiq] mfe5003 closed issue #1078: kasli installation https://github.com/m-labs/artiq/issues/1078
<18VAEF38L> [artiq] whitequark closed issue #1079: support runtime build without RTIO DMA https://github.com/m-labs/artiq/issues/1079
<7GHAA46TI> [artiq] whitequark pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/b6dd9c8bb054945ba38cbb64294cad914bb71682
<7GHAA46TI> artiq/master b6dd9c8 whitequark: runtime: support builds without RTIO DMA....
<GitHub82> [smoltcp] whitequark commented on pull request #232 15f2868: Do we really need an `Option` here? The `None` case seems identical to `Some(0)` in every respect. https://github.com/m-labs/smoltcp/pull/232#discussion_r197597258
<GitHub123> [smoltcp] whitequark commented on pull request #232 15f2868: Take a look at how all other log statements are written--you're missing the socket endpoints in this one. https://github.com/m-labs/smoltcp/pull/232#discussion_r197597311
<GitHub197> [smoltcp] whitequark commented on pull request #232 15f2868: Nit (here and elsewhere): we write RFC numbers as `RFC NNNN`, not `RFC-NNNN`. https://github.com/m-labs/smoltcp/pull/232#discussion_r197597342
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: @gkasprow... https://github.com/m-labs/artiq/issues/1065#issuecomment-399618108
<GitHub183> [smoltcp] whitequark commented on pull request #232 15f2868: Maybe `mss_header_len`? https://github.com/m-labs/smoltcp/pull/232#discussion_r197597374
<bb-m-labs> build #290 of migen is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/migen/builds/290
<sb0_> hartytp_, use the vadatech board instead?
<sb0_> there isn't a plan really, I've never seen a mess like Sayma before...
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1080: Have you tried ``--without-sawg``? I suspect that the corruption runs deeper than just the SDRAM. https://github.com/m-labs/artiq/issues/1080#issuecomment-399619339
<sb0_> so, it's just poking around hoping to find something
kuldeep has quit [Ping timeout: 268 seconds]
kuldeep has joined #m-labs
<bb-m-labs> build #1677 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/1677
<bb-m-labs> build #2478 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2478 blamelist: whitequark <whitequark@whitequark.org>
kaolpr has quit [Ping timeout: 240 seconds]
kaolpr has joined #m-labs
<GitHub133> [smoltcp] whitequark commented on issue #235: @m-labs-homu retry https://github.com/m-labs/smoltcp/pull/235#issuecomment-399624218
<GitHub5> [smoltcp] m-labs-homu force-pushed auto from 78651bb to f4c76b2: https://github.com/m-labs/smoltcp/commits/auto
<GitHub5> smoltcp/auto f4c76b2 Dan Robertson: Add MLDv2 packet parsing support to wire...
<GitHub44> [smoltcp] m-labs-homu commented on issue #235: :hourglass: Testing commit 354b3c4b18de1f5c6c3a959c9574db3b0f1b163d with merge f4c76b2a5722e303fa8bd7ca57f826ff3cfa65e6... https://github.com/m-labs/smoltcp/pull/235#issuecomment-399624240
<travis-ci> m-labs/smoltcp#1062 (auto - f4c76b2 : Dan Robertson): The build has errored.
<GitHub65> [pdq] jonaskeller opened pull request #24: Fix wavesynth program transfer via SPI (m-labs/pdq#20) (master...master) https://github.com/m-labs/pdq/pull/24
<GitHub61> [smoltcp] whitequark pushed 1 new commit to master: https://github.com/m-labs/smoltcp/commit/a1d06027358d75df7efbcfae9356d3094a3f1294
<GitHub61> smoltcp/master a1d0602 whitequark: Travis: add Clippy to allowed failures.
<GitHub8> [smoltcp] whitequark commented on issue #235: @m-labs-homu retry https://github.com/m-labs/smoltcp/pull/235#issuecomment-399625381
<GitHub107> [smoltcp] m-labs-homu force-pushed auto from f4c76b2 to 48bc838: https://github.com/m-labs/smoltcp/commits/auto
<GitHub107> smoltcp/auto 48bc838 Dan Robertson: Add MLDv2 packet parsing support to wire...
<GitHub123> [smoltcp] m-labs-homu commented on issue #235: :hourglass: Testing commit 354b3c4b18de1f5c6c3a959c9574db3b0f1b163d with merge 48bc838b69a1168bc0f57c521e2879b0de44a4f8... https://github.com/m-labs/smoltcp/pull/235#issuecomment-399625399
<sb0_> hartytp_, you seem to have found a way to reproduce the "illegal instruction" error. can that be minimized maybe? e.g. no RTM involvement
<travis-ci> m-labs/smoltcp#1063 (master - a1d0602 : whitequark): The build passed.
<sb0_> it's also remarkable that we get bit-flips in a read-only region ...
<GitHub107> [smoltcp] m-labs-homu merged auto into master: https://github.com/m-labs/smoltcp/compare/a1d06027358d...48bc838b69a1
<GitHub8> [smoltcp] m-labs-homu commented on issue #235: :sunny: Test successful - [status-travis](https://travis-ci.org/m-labs/smoltcp/builds/395708809?utm_source=github_status&utm_medium=notification)
<travis-ci> m-labs/smoltcp#1064 (auto - 48bc838 : Dan Robertson): The build passed.
<GitHub196> [smoltcp] m-labs-homu closed pull request #235: Add MLDv2 packet parsing support to wire (master...mldv2) https://github.com/m-labs/smoltcp/pull/235
<travis-ci> m-labs/smoltcp#1065 (master - 48bc838 : Dan Robertson): The build passed.
<GitHub-m-labs> [artiq] sbourdeauducq pushed 2 new commits to master: https://github.com/m-labs/artiq/compare/b6dd9c8bb054...84b3d9ecc604
<GitHub-m-labs> artiq/master 84b3d9e Sebastien Bourdeauducq: bootloader: also check firmware CRC in SDRAM (#1065)
<GitHub-m-labs> artiq/master 68530fd Sebastien Bourdeauducq: sayma: generate 100MHz from Si5324 on standalone and master targets...
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: Here's another tool:... https://github.com/m-labs/artiq/issues/1065#issuecomment-399629182
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: With the DRTIO master that has the SAWG, this insanity manifests itself by breaking the RTM FPGA loading, with the error "Did not exit INIT after releasing PROGRAM". If this symptom is reproducible, that would be something that is less of a PITA to zero in on than the very complicated crash-kernel. https://github.com/m-labs/artiq/issues/1065#issuecomment-39962
<bb-m-labs> build #1678 of artiq-board is complete: Exception [exception conda_build_output] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/1678 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<bb-m-labs> build #2479 of artiq is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/2479 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<sb0_> of course I cannot reproduce the RTM FPGA loading failure ...
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: I did get one memory corruption event reported while running the crash-kernel, by running the CRC in a loop (no 1s delay) and with 1/10 the size, so that it runs faster and it increases the chance of catching an error before the board freezes.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399637216
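The approach described in the comment above (checksum a region that should never change, over and over with no delay, on a reduced size so each pass is quick) can be sketched roughly as follows. This is only an illustration in host-side Python, not the actual ARTIQ bootloader or runtime code: the names `make_checker` and `run_checks`, the use of `zlib.crc32`, and the simulated bit-flip are all assumptions made for the example.

```python
import zlib

def make_checker(region: bytes, size_divisor: int = 10):
    """Record a reference CRC over the first 1/size_divisor of the region.

    Checking a smaller window makes each pass faster, so more passes
    fit in before the (hypothetical) board would freeze.
    """
    length = max(1, len(region) // size_divisor)
    reference = zlib.crc32(region[:length])

    def check(current) -> bool:
        # Recompute the CRC over the same window and compare.
        return zlib.crc32(bytes(current[:length])) == reference

    return check, length

def run_checks(check, read_region, iterations: int):
    """Run the check in a tight loop (no delay between passes).

    Returns the index of the first failing pass, or None if all passed.
    """
    for i in range(iterations):
        if not check(read_region()):
            return i
    return None

# Stand-in for a read-only firmware image (assumption: 4096 bytes).
firmware = bytearray(range(256)) * 16
check, length = make_checker(bytes(firmware))

# No corruption yet: every pass succeeds.
clean_result = run_checks(check, lambda: firmware, 100)

# Simulate a single bit-flip inside the checked window.
firmware[7] ^= 0x04
first_bad = run_checks(check, lambda: firmware, 100)
```

CRC-32 detects any single-bit error, so the simulated flip is caught on the first pass after it happens. On real hardware the check could only report corruption it gets to run after, which is the reason given above for making each pass shorter.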
<sb0_> hartytp_, found any good way of crashing the board without the crash-kernel?
sb0 has joined #m-labs
sb0 has quit [Client Quit]
<GitHub-m-labs> [artiq] hartytp commented on issue #1080: Okay. I'll try that and your blinker next to see if we can find some issue with a simpler logic block that we can focus on instead of debugging complex jesd/memory issues. https://github.com/m-labs/artiq/issues/1080#issuecomment-399643322
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #998: The Sayma DRTIO master, after adding SAWG/JESD, now also exhibits this bug.... https://github.com/m-labs/artiq/issues/998#issuecomment-399643588
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: Okay. Next steps imho are to move the sawg onto a separate CD with reset controlled by a kernel csr. Will then see if enabling/disabling it during the kernel fixes the issue.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399644005
<sb0_> so far, it would seem we have two ways to reproducibly trigger sayma insanity: the crash-kernel, and #998
<sb0_> #998 is simpler...
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: Final test I can think of would be to try reproducing this without the rtm... https://github.com/m-labs/artiq/issues/1065#issuecomment-399644080
<sb0_> hartytp_, how do you want to clock SAWG/JESD without the RTM? the Si5324 output is not routable to all the JESD transceivers (thank Xilinx for that...)
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #998: Reverting commit 83428961ad3dd74c5aa58859da035630c2fc06cf (done on 84b3d9ecc604e3fd30c3b2095b6f336eef3d11c2) makes the bug disappear on the master (and stops the crash-kernel from crashing). So, it really looks like this and #1065 are linked. https://github.com/m-labs/artiq/issues/998#issuecomment-399646597
<sb0_> hartytp_, maybe if you use the drtio master, you can keep rtio/sawg clocked from the si5324 and jesd clocked from the hmc7043
<sb0_> there will be elastic buffer overruns and DAC data corruption etc. but it shouldn't matter
<sb0_> the #998 repro should be independent from that
<sb0_> let me try that actually
<sb0_> basically revert 8b3c12e6ebcf3e72f2a54f9cfea76475cdf73c6d (though it needs to be done manually)
hartytp__ has joined #m-labs
<hartytp__> sb0: I don't know why you have any confidence that the Vadatech board will be easier to get up and running with ARTIQ/SAWG
<hartytp__> that seems unlikely to me
<sb0_> because all artiq boards work correctly except sayma?
<hartytp__> well, Sayma works just fine without SAWG
<hartytp__> we haven't put 8 channels of SAWG on anything else
<sb0_> pipistrello, papilio pro, kasli and its variants, kc705 without sawg, kc705 with sawg
<sb0_> none of those has been the royal PITA that sayma is
<hartytp__> I'm still of the opinion that the level of moaning I've heard about it is far out of proportion to the level of issues
<hartytp__> e.g. turns out that all serwb issues were due to you not bothering to use the correct IO standard
<hartytp__> oops
<hartytp__> how much of my life did the lack of even a basic code review from you cost me?
<hartytp__> so, let's focus on getting this to work and cut the crap?
<hartytp__> I had a go at moving SAWG into a separate CD that's controlled by kernels
<hartytp__> will be interesting to see if the crash kernel still crashes with the SAWG in reset state
<hartytp__> (let me know if you see anything obviously wrong with that code)
<hartytp__> otherwise, I'll look at the blinker
<hartytp__> tbh, I'm not sure how to test without the RTM
<sb0_> not all serwb issues were due to that problem, and _florent_ also tried to debug it for a long time without finding out about the io standard issue (which was his mistake in the first place)
<hartytp__> sb0: your subcontractor = your issue
<sb0_> and of course, when the root cause is identified, everything seems "simple"
<hartytp__> at least, that's the way i see it
<hartytp__> sure, but my point is that I'm still not convinced that these issues are all to do with Sayma
<hartytp__> rather than, say issues with project management/reviews
<sb0_> yes, but I'm saying that you cannot say it's "basic" or "simple"
<sb0_> a "simple" and "basic" capacitor problem could also cause the crashes we're seeing right now ...
<hartytp__> well, not using the right IO standards is about as "simple" as HDL issues get
<hartytp__> anyway, this isn't how I plan to spend my weekend
<hartytp__> arguing about this
<hartytp__> point is just that I don't find your comments about Sayma helpful in moving us towards a good outcome
<hartytp__> for anyone
<hartytp__> ...
<hartytp__> moving on
<hartytp__> if you see something wrong with that commit then let me know, otherwise I'll try playing around with a Kernel
<hartytp__> re: removing the rtm to minimize the example
<hartytp__> I can add a 150MHz JESD clock by soldering coax onto the AMC to RTM connector
<hartytp__> but, how much does that help us? The JESD obviously can't start up without an RTM
<hartytp__> still, I guess that's what I'll do
<hartytp__> make the runtime not crash if the JESD link doesn't start up
<hartytp__> s/crash/panic
<hartytp__> and try with no RTM
<hartytp__> (oh, and, yes a broken capacitor could cause this issue, but Greg has looked at all power supplies several times and found nothing suspicious)
<sb0_> it's not clear to me whether he tried it with the SAWG running, nor if his board crashes at all
<hartytp__> maybe there is something daft like some FPGA pin that isn't connected correctly
<sb0_> yes, it could be that too
<hartytp__> so we should get another design review for the AMC
<hartytp__> I can't do that
<hartytp__> we can ask greg to take another look
<hartytp__> you know any hardware guys who would mind having a look?
<hartytp__> I think he did try with sawg, but if you don't know then why not ask him on GitHub? No point wondering when he's generally very fast at responding to questions
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: @gkasprow ... https://github.com/m-labs/artiq/issues/1065#issuecomment-399650157
<hartytp__> what is that igst supposed to do?
<hartytp__> gist
<hartytp__> marmelada: while this may well not be a hw issue, I think we do need to consider that
<hartytp__> c.f. how many issues we had with the ethernet chip because of a floating pin
<hartytp__> do you know of anyone (e.g. creotech) that could do a full design review of the AMC?
<hartytp__> all the boring stuff like digging through the xilinx guides on capacitors, clocking recommendations, magic pins that have to have something done to them to make things work
<hartytp__> the works
<hartytp__> well, not a full design review, but at least the parts around the FPGA/RAM
<hartytp__> and particularly anything ultrascale-related
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: okay, I modified my code to move the SAWG onto a separate CD, whose RESET can be controlled by Kernels. Running the "crash kernel" with the SAWG in reset does not crash.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399654163
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: Do you see the #998 bug? https://github.com/m-labs/artiq/issues/1065#issuecomment-399655477
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: Haven't looked. I'm seeing what happens if I enable the SAWG in the Kernel. If you have time, can you try running my code and see if you get the same result (no crash with SAWG disabled)?... https://github.com/m-labs/artiq/issues/1065#issuecomment-399656141
<GitHub-m-labs> [artiq] hartytp commented on issue #1080: One data point here: running with the SAWG held in reset, I don't see the "crash kernel" crash. But, I do see a bunch of errors during init (JESD PRBS, can't determine SYSREF margin at FPGA) https://github.com/m-labs/artiq/issues/1080#issuecomment-399656302
<GitHub-m-labs> [artiq] hartytp commented on issue #1080: Note to self: try this with a no-sawg build. It would be interesting to see if there is a difference between no SAWG and SAWG in reset. If there is, then this seems much more like a vivado issue than a hardware issue. https://github.com/m-labs/artiq/issues/1080#issuecomment-399656378
<GitHub-m-labs> [artiq] gkasprow commented on issue #1065: @sbourdeauducq at the moment I cannot run the SAWG because both boards started having PRBS errors during initialisation. https://github.com/m-labs/artiq/issues/1065#issuecomment-399656632
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: Well, then your PI test doesn't say much, we only have problems when JESD/SAWG are running. https://github.com/m-labs/artiq/issues/1065#issuecomment-399657980
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: @gkasprow The SAWG still runs even when there are JESD errors. Look at the logs I've posted, they all have JESD errors but, so long as the board boots up fully and you can run the kernel, I don't think that's a problem. https://github.com/m-labs/artiq/issues/1065#issuecomment-399659751
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: Running this kernel:... https://github.com/m-labs/artiq/issues/1065#issuecomment-399659966
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: @gkasprow While this might well be a code issue, I think it is worth doing a complete design review of Sayma AMC. Is there anyone else (e.g. Creotech) who can lend us a fresh pair of eyes. Looking at the usual things like checking Xilinx decoupling requirements, checking for any pins with special requirements, comparing our schematic to relevant ultra-scale eval boards,
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: @sbourdeauducq what would it take to get the SAWG running on a kintex ultrascale eval board? I would feel *much* more confident that this is likely a HW issue if you could show that a design of comparable complexity runs correctly on an ultrascale eval board.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399660481
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: So, on Monday I'll look at:... https://github.com/m-labs/artiq/issues/1065#issuecomment-399660680
<GitHub-m-labs> [artiq] gkasprow commented on issue #1065: @hartytp Creotech guys will do that. They are currently reviewing EEM modules prior to mass production.... https://github.com/m-labs/artiq/issues/1065#issuecomment-399663451
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: Perfect, thanks Greg!... https://github.com/m-labs/artiq/issues/1065#issuecomment-399664113
<GitHub-m-labs> [artiq] sbourdeauducq commented on issue #1065: > @sbourdeauducq what would it take to get the SAWG running on a kintex ultrascale eval board? ... https://github.com/m-labs/artiq/issues/1065#issuecomment-399664279
_whitelogger has joined #m-labs
<GitHub-m-labs> [artiq] hartytp commented on issue #1065: @gkasprow it would be good to find someone who has used ultrascale fpga and knows their "features" https://github.com/m-labs/artiq/issues/1065#issuecomment-399664782
<sb0_> hartytp_, where is your code for the sawg reset clock domain?
<GitHub-m-labs> [artiq] gkasprow commented on issue #1065: CERN guys are using it in several designs. https://github.com/m-labs/artiq/issues/1065#issuecomment-399668141
sb000 has joined #m-labs
<sb000> another tool we can use to debug those crashes (if resets are not enough) is clock gating with BUFGCE
sb000 has quit [Ping timeout: 260 seconds]
hartytp__ has joined #m-labs
<hartytp__> sb0 it's hacked onto my men test branch
<hartytp__> Afk til Monday now
hartytp__ has quit [Ping timeout: 260 seconds]
Gurty has quit [Ping timeout: 256 seconds]
Gurty has joined #m-labs
rohitksingh has joined #m-labs
rohitksingh has quit [Read error: Connection reset by peer]
rohitksingh has joined #m-labs
rohitksingh has quit [Ping timeout: 264 seconds]
<GitHub159> [smoltcp] dlrobertson commented on commit a1d0602: :+1: Thanks, there has definitely been an increase in the number of travis failures https://github.com/m-labs/smoltcp/commit/a1d06027358d75df7efbcfae9356d3094a3f1294#commitcomment-29472986
<GitHub41> [smoltcp] dlrobertson commented on issue #234: @whitequark any other feedback for this patch? https://github.com/m-labs/smoltcp/pull/234#issuecomment-399682420
<GitHub37> [smoltcp] dlrobertson commented on pull request #236 200ec19: whitespace https://github.com/m-labs/smoltcp/pull/236#discussion_r197613792
<GitHub61> [smoltcp] dlrobertson commented on pull request #236 200ec19: This is `O(3n)`, right? Is there a way we could make this faster? What if we used a `ManagedMap` instead of a `ManagedSlice`? https://github.com/m-labs/smoltcp/pull/236#discussion_r197614219
<GitHub180> [smoltcp] dlrobertson commented on pull request #236 200ec19: :bike: :house: : `ipv4-fragmentation`? Will it ever make sense to have a single generic `ip-fragmentation` feature? Is there a reason to keep two separate features for ipv4 and ipv6? https://github.com/m-labs/smoltcp/pull/236#discussion_r197613823
X-Scale has quit [Quit: HydraIRC -> http://www.hydrairc.com <- s0 d4Mn l33t |t'z 5c4rY!]
Gurty has quit [Ping timeout: 256 seconds]
Gurty has joined #m-labs
Gurty has joined #m-labs
Gurty has quit [Changing host]
<GitHub51> [smoltcp] dlrobertson opened pull request #249: Move more iface tests to test ipv6 (master...ipv6ize_more_tests) https://github.com/m-labs/smoltcp/pull/249
X-Scale has joined #m-labs