sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
marbler has joined #m-labs
jfng has joined #m-labs
<GitHub-m-labs>
[artiq] jbqubit commented on issue #1040: Yes. I've confirmed this at several specific frequencies using startup kernel. Kernel panics prevent sweeping across a range of frequencies so testing hasn't been exhausted. But please close this issue as I suspect kernel panics are a separate problem. https://github.com/m-labs/artiq/issues/1040#issuecomment-396778443
jbqubit has joined #m-labs
bb-m-labs has quit [Ping timeout: 240 seconds]
bb-m-labs has joined #m-labs
<GitHub-m-labs>
[artiq] jordens commented on issue #813: I think this won't work because FPGA_CFG_DIN (on RTM) is not connected to IO_L1N_T0_D01_DIN_14 (L17) but to IO_L1P_T0_D00_MOSI_14 (K16). That's close but not close enough.... https://github.com/m-labs/artiq/issues/813#issuecomment-396677246
<rjo>
1/100 sdram init failures after rtm load with f8627952
<GitHub3>
[smoltcp] pothos commented on pull request #233 277e7a3: Yes, `set_for_retransmit` is called after all was sent out, so this leaves the fast retransmit mode and should wait now instead of looping in `poll`. https://github.com/m-labs/smoltcp/pull/233#discussion_r194977683
hilmipilmi has joined #m-labs
<hilmipilmi>
Xilinx has released a decrypt util for 2018.1: https://pastebin.com/raw/usWNKC2W. Decrypts xilinxt_2017_05 key encrypt sources. Spread before removed :-)...
hilmipilmi has quit [Client Quit]
cr1901_modern has left #m-labs [#m-labs]
<rjo>
sb0: is there a reason you didn't bother with specifying the correct voltages on the rtm fpga?
cr1901_modern has joined #m-labs
hartytp has joined #m-labs
<hartytp>
rjo: are you saying that giving the clock muxes proper reset values affected the memory corruption?
<hartytp>
also, are you saying that you have a situation where you see memory corruption only if the RTM is reloaded, and not if only the AMC is?
<rjo>
hartytp: no. shot noise.
<hartytp>
ok
<rjo>
hartytp: i'll check the latter.
<hartytp>
no to both?
<rjo>
no to 1, will check to 2
<hartytp>
thanks! I really want to narrow this down a bit so we can figure out what is going on here
<rjo>
i am seeing it 5/20 when loading (each time) the rtm from jtag and the amc from flash
<hartytp>
and none when just loading the amc from flash?
<rjo>
1/4 when only loading the amc from flash with the rtm kept running
<hartytp>
wow
<rjo>
2/10 now.
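[note] The failure counts quoted above (5/20, 2/10, and the earlier 1/100 with f8627952) come from small samples, which is presumably what the "shot noise" remark refers to. The sketch below is not anything used in the channel; it computes a standard Wilson score interval for each count to show how wide the uncertainty is. The function name and the run labels are illustrative only.

```python
import math


def wilson_interval(failures, trials, z=1.96):
    """Return the (low, high) Wilson score interval for a failure probability.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = failures / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Counts as quoted in the log; the labels are paraphrases, not official names.
runs = [
    ("rtm from jtag, amc from flash", 5, 20),
    ("amc from flash only", 2, 10),
    ("sdram init after rtm load (f8627952)", 1, 100),
]

for label, failures, trials in runs:
    low, high = wilson_interval(failures, trials)
    print(f"{label}: {failures}/{trials} -> {low:.3f} .. {high:.3f}")
```

With samples this small the intervals for 5/20 and 2/10 overlap heavily, so those two counts alone do not separate the "RTM reloaded" and "AMC only" cases.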
<hartytp>
do you want to give me instructions/binaries for exactly what you're doing and I'll test it here?
<hartytp>
if you can find a reproducible test case then it should be easy to find the cause -- there are only a small number of ways that those two boards can interact
<hartytp>
oops never mind, just re-read your last post, this is not to do with the RTM loading
<hartytp>
what about the RTM physically disconnected?
<rjo>
dunno. sb0?
<rjo>
i am just doing "for i in `seq 100`; do artiq_flash -t sayma --srcbuild artiq_sayma start; sleep 20; done"
<hartytp>
rjo: SAWG SFDR @210 MHz is about 60dB
<hartytp>
biggest spur is at 89.6MHz
<hartytp>
600MHz spur is about the same level
<hartytp>
(doing a wide scan, so 89.6MHz is probably not the exact frequency)
<hartytp>
390MHz is -10dBc ish
<hartytp>
is that (90MHz+210MHz)*2 = fclk?
<rjo>
ok no interpolation then. the 390 is expected to be high. the 90 MHz (shouldn't be 89.6) is fclk/2 - 210, yes. let me dig out the AA filter response.
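[note] The spur arithmetic in this exchange can be written out as a short sketch. It takes fclk = 600 MHz from hartytp's (90 MHz + 210 MHz) * 2 and the 210 MHz output tone from the measurement. Reading the 390 MHz line as the fclk - f image is an assumption here, since the log only says it is "expected to be high"; the variable names are illustrative.

```python
# All frequencies in MHz.
F_OUT = 210.0                 # output tone from the SFDR measurement
FCLK = (90.0 + F_OUT) * 2     # hartytp's guess, confirmed by rjo: 600 MHz

# Spur that rjo identifies as fclk/2 - 210.
half_clock_spur = FCLK / 2 - F_OUT

# Assumption: the strong 390 MHz line is the first image, fclk - f_out.
first_image = FCLK - F_OUT

print(f"fclk            = {FCLK:.1f} MHz")
print(f"fclk/2 - f_out  = {half_clock_spur:.1f} MHz")
print(f"fclk - f_out    = {first_image:.1f} MHz")
```

This matches rjo's remark that the spur should sit at 90 MHz rather than the 89.6 MHz read off the wide scan.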
<hartytp>
rjo, sb0: any interest in Kasli BP adapters?
<hartytp>
getting a quote from TechnoSystem atm
<hartytp>
dominated by tooling, so extra boards will be cheap if produced at the same time
<hartytp>
currently quoting 300EUR/pc at 5pcs
<rjo>
ok. those spurs are pretty much expected. if anybody ever gets around to doing those, i'd be interested in (1) seeing intermodulation performance between f1/f2, (2) confirmation that the AA filter before the 600 MHz DUC works (https://ssl.serverraum.org/lists-archive/artiq/attachments/20170113/f328c247/attachment-0001.pdf) and (3) close-in noise from f1 and f0 (<1 MHz, on top of clock noise).
sb000 has joined #m-labs
<sb000>
rjo, the rtm io defs are florent's iirc, though I fixed some. what voltage is wrong?
<sb000>
hartytp, no bp adapter for me
<sb000>
rjo, do you have a good repro for crashing in the bios or getting bad memory scan results?
<rjo>
hartytp: yes. maybe 3 for me or 4 if they continue to charge that hilarious amount on shipping. but last i checked the price goes down 5% if you hit the 10 qty limit. i'd be interested to see that they do better on the BP adapter.
<rjo>
sb000: afaict there is not a single bank operating at 2.5 v but all definitions are LVCMOS25
<rjo>
just artiq_flash start.
<sb000>
but that only crashes rarely, correct?
<sb000>
is there a way to make the crashes more frequent?
<sb000>
or do we have to do 100's of reboots with the rtm disconnected to see if it comes from there?
<hartytp>
rjo: ack. I'll ask what the pricing is for larger batches and let them know you're interested
<rjo>
i don't have a 100% repro. testing it 100 times takes a couple minutes. it's not that bad.
<rjo>
hartytp: thanks. also please complain about the shipping costs. ;)
<hartytp>
will do
<rjo>
the next EEM i'd like to do is a generic microprocessor one. i.e. the TTL tester with a uC and bunch of pin headers on board in pmod/arduino/whatever-is-hip-these-days style. and i'd like to see the stm32f4 as that has "flight heritage".
<rjo>
and then in a similar vein a board with a ~small FPGA. i.e. i want to have uC/FPGA development boards with (a) EEM connectivity and (b) EEM form factor.
<hartytp>
I see, small FPGA/uP boards that hook up to Kasli, but have some GPIO (ADC/DAC?) broken out to sensible headers/FP connectors?
<hartytp>
sounds nice
<rjo>
sb000: there is something fundamentally wrong. e.g. take that patch that ultimately "resolved" all the SAWG issues: it's not supposed to make any difference AFAICT.
sb000 has quit [Ping timeout: 260 seconds]
<cjbe__>
rjo: some of my suservo use cases require 8 sampler channels, but 16 Urukul channels (i.e. several beams on one PD) - from a quick glance at the suservo gateware this seems to be already in there, just requiring a change in suservo instantiation
<hartytp>
I'd love to see a feedback controller EEM. Simple 16-bit DACs + ADCs and an FPGA. nothing overly fancy in terms of AFE or BW
<cjbe__>
rjo: can you see any difficulties in doing this? (apart from timing closure)
<rjo>
hartytp: that could live as a typical mezzanine (pmod/arduino, maybe something sufficient exists already) on the uC or FPGA EEM.
<hartytp>
if that works out mechanically decently then that would be cool
<hartytp>
however, you'd want to think about FPs etc, so it might not be trivial to do a decent job like that
<rjo>
cjbe__: hmm. haven't tried but i can't see why it shouldn't work. reduce the number of profiles per channel (16 instead of 32), adapt that EEM definition. cycle would be a bit longer but not much.
<rjo>
hartytp: the uC/FPGA carrier gets 4HP panel for USB/leds/raw pins, and then the mezzanine can just occupy as much panel space as it wants next to the carrier.
<hartytp>
true
<hartytp>
yes, that could be really nice
<rjo>
that's not a big issue mechanically afaict. the double constraint from the mezzanine plugged into the carrier and the front panel might be fine.
<rjo>
or just support a way to screw the two halves of the panel together on the inside. that's fine as well.
<hartytp>
In some cases it would be nice to embed that into some other product. e.g. for our coil current control circuit, I'd like to replace the complex analog loop filter with a digital one.
<hartytp>
the afe for the current measurement lives on or close to (the 288A has to pass through the flux gate and the AFE is near the sensor for noise/grounding/emi reasons)
<rjo>
sb0: and i have the feeling that this could either be some vivado miscompilation, or some complete breakage of timing constraints (but it seems unlikely as the pattern doesn't fit), or power/switching activity related (that would be my first wild guess).
<hartytp>
it would be nice to put the digital part in the same box as the AFE, but not to put a complete Kasli there
<hartytp>
so, either supporting control of the feedback controller FPGA via, say RJ45 LVDS, or some kind of "remote EEM" thing
<hartytp>
more generally, I can imagine wanting to scatter those around my lab without having to spend huge amounts of cash on Kasli
<rjo>
hartytp: yes. the vhdci extension or 2xrj45 would work.
<rjo>
those boards are already there.
<hartytp>
VHDCI extension is still quite bulky and pricey + big cable and connector. too much for a single board, that just needs some basic control lines
<hartytp>
1 RJ45 (e.g. SCPI) + a power jack on the FP could work well
<hartytp>
(or, maybe don't bother with the power jack, but have some internal header for supplying power e.g. Molex KK)
<hartytp>
anyway, modulo details, that sounds like a good idea
<hartytp>
rjo: Kasli BP 10pcs is 224EUR/pc
marmelada has joined #m-labs
<hartytp>
rjo: re sayma...
<hartytp>
the power supplies for Sayma AMC have all been tested right by the FPGA
<hartytp>
when we looked at the SDRAM before
<hartytp>
not likely things have changed
<hartytp>
so, it would have to be a PI/SI issue inside the FPGA that's not correctly captured by Vivado's models (so, basically a miscompilation)
<cjbe__>
rjo: ok - thanks. The only sequential section is the IIR DSP, right? Any recollection on roughly how many clks per channel through that?
<marmelada>
hey rjo, what did you use to import issues in sinara-hw/meta repository? I want to import issues to kasli repo
<rjo>
hartytp: have we really tested them at the loads and the amount of switching activity (8 sawg channels at full activity) we are using them at now?
<rjo>
cjbe__: 4 or 5 cycles per dds channel.
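[note] A rough sketch of what "4 or 5 cycles per dds channel" implies for the sequential IIR stage when going from 8 to 16 channels. It only multiplies the figures quoted above and ignores any pipeline latency or other stages of the servo cycle, which are not described in the log; the function name is illustrative.

```python
def iir_cycles(n_channels, cycles_per_channel=(4, 5)):
    """Return (min, max) clock cycles spent in the sequential IIR stage.

    cycles_per_channel is the "4 or 5" range quoted in the log.
    """
    low, high = cycles_per_channel
    return n_channels * low, n_channels * high


for n in (8, 16):
    low, high = iir_cycles(n)
    print(f"{n} DDS channels: {low} to {high} clock cycles in the IIR stage")
```

Doubling the channel count doubles this part of the cycle; how much of the total servo cycle that represents is not stated in the log.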
<hartytp>
rjo: most of the issues we see are at boot, right
<rjo>
hartytp: yes. point taken. but it is weird that this also seems to be triggered by doing stuff with sawg.
<hartytp>
ack
<hartytp>
well, maybe
<hartytp>
but, if there is already memory corruption, then it could just be that doing anything non trivial brings up the issue
<hartytp>
how about simplifying the ethernet clocking and rebuilding with different vivado?
<rjo>
it seemed to be specific to a certain usage pattern (rtio or cpu)
<hartytp>
you mean the crashes after boot?
<marmelada>
rjo: thanks!
<rjo>
hartytp: or just remove the locs as you are not using ethernet anyway.
<hartytp>
I was trying to focus on the boot issues for now, as they are simpler to track down
<hartytp>
rjo: yes, but then that breaks ethernet
<rjo>
hartytp: in the sense that it fails to PnR it?
<marmelada>
oh, it crashed on the 35th comment :(
<rjo>
hartytp: or in the sense that it compiled and didn't work? i thought it doesn't work anyway for you.
<hartytp>
no, it doesn't give any errors, but could alter timing
<hartytp>
in the sense that sb0 said "this will probably break ethernet" and reverted the change
<rjo>
marmelada: ha. talk to them!
<hartytp>
afaict, the original issues with ethernet were due to a HW issue that caused silly narrow eyes
<hartytp>
now that's been fixed, I believe that the eyes are wide enough that we can revert to the original plan
<rjo>
hartytp: testing with another vivado would be one of the next things to do. and if i have to break ethernet on a system where we can't use it anyway then that's an easy decision.
<hartytp>
to clock the ODDR that generates the output Tx clock from the same clock that clocks the ODDRs used to send data to the phy
<hartytp>
rjo: I'll hold off looking at the slave FPGA loading until we've had more of a look at this memory corruption issue
<hartytp>
let me know if there is anything I can do to help there, as it's my top priority atm
<rjo>
hartytp: thanks.
sb0 has joined #m-labs
<sb0>
rjo, when you had the all-11111.... memory scan result, was that immediately after a reboot caused by the crashy startup kernel I had loaded into the board?
<GitHub-m-labs>
migen/master e5cabe1 Sebastien Bourdeauducq: sayma_rtm: fix I/O bank voltages
<hartytp>
marmelada: now that ARTIQ on Sayma kind of works, do you think you guys can ship us some Allaki?
<hartytp>
I'd like to have them for testing
<hartytp>
(we paid for 4 a while back, but IIRC they're sitting with TechnoSystems, waiting for testing)
<rjo>
sb0: i haven't touched the flash since yesterday evening. that case was right after a crash from your startup kernel that spewed out a couple of 0x00 on the uart. and it was in a loop of loading the rtm and then loading the amc from flash.
<GitHub-m-labs>
migen/master 34a3c62 Sebastien Bourdeauducq: sayma_rtm: LVDS_18 is called LVDS
<marmelada>
uh oh
<marmelada>
is it on rtm?
<rjo>
sb0: i have only seen the all-ones case after your startup kernel crashing. the other case (noise and apparent doubled tap delays) happens more frequently and independently of a cpu crash before.
<marmelada>
sb0: LVDS (the 1v8 one) is only on hp banks, and artix fpgas do not have them
<sb0>
CRITICAL WARNING: [Vivado 12-4470] I/O Standard 'LVDS' is not supported on 'xc7a15tcsg325-1' part. [/home/sb/artiq_drtio/artiq_sayma/rtm_gateware/rtm.xdc:211]
<marmelada>
if you have 1V8 bank and want to use LVDS then you're out of luck
<rjo>
marmelada: what testing did you do on the kasli bp adapter?
<marmelada>
all lvds and i2c lines, both power connectors and I checked if mmcx output works
<sb0>
rjo, 100/100 memory test passed
<marmelada>
I also use it to test bp connector on kasli
<sb0>
(after replacing the startup kernel with one that does not crash)
<marmelada>
also sorry for spam on kasli wishlist, but I'm trying to find a tool which will import all comments
<rjo>
marmelada: it's ok. it will dramatically increase your github karma and number of contributions ;) if you use a pair of test repos then try that one PR that i referenced.
<sb0>
I had 7 failures in those 100 restarts before the startup kernel
<GitHub2>
[smoltcp] dlrobertson commented on issue #235: To make the PR a bit smaller I only included packet parsing support in this PR. I'll submit a follow-up with a `Repr` and `Iterator` for the sources. https://github.com/m-labs/smoltcp/pull/235#issuecomment-396915417
<hartytp>
rjo: you guys interested in booster at all?
<rjo>
marmelada: ok. since we don't have a way to mount the backplane adapter, is your impression that that's going to be ok?
<rjo>
hartytp: not yet.
<rjo>
hartytp: did that 224€ include VAT?
<hartytp>
ack
<hartytp>
just forwarded you the quote
<hartytp>
(booster is also there if you're interested)
<hartytp>
np, again, tooling is a large chunk of the bill, so looking to see if anyone else is interested
<GitHub189>
[smoltcp] dlrobertson commented on pull request #234 53fd624: Could you make it clearer that this is only run with the `Borrowed` variant. IMO a doc comment or changing the name to `run_owned_gc` would be sufficient. https://github.com/m-labs/smoltcp/pull/234#discussion_r195059299
<GitHub94>
[smoltcp] dlrobertson commented on pull request #234 53fd624: Could you make it clearer that this is only run with the `Owned` variant. IMO a doc comment or changing the name to `run_owned_gc` would be sufficient. https://github.com/m-labs/smoltcp/pull/234#discussion_r195059299
<rjo>
marmelada: the mounting and mechanical situation if the bp adapter is only held by the DIN connector.
<marmelada>
there shouldn't be any issues
<marmelada>
bp connector requires similar amount of force as idc connectors but is larger
<marmelada>
it won't fall off by itself
<marmelada>
unless you help it ;)
<rjo>
i was thinking about adding proper mounting options. i.e. with screws. because e.g. when shipping a kasli system, i'd expect a lot of problems with the DIN connector and BP adapter to wiggle loose.
<marmelada>
though if you want to use back power connectors you need to use right angle plug
<rjo>
i.e. either the z-rail (but that seems tricky now given the connector choice) or (as described before) making the length so that it can be screwed into the real mounting rails + rear panel.
<GitHub-m-labs>
migen/master a51a5f6 Sebastien Bourdeauducq: sayma: use LVCMOS18 for serwb
jbqubit_ has joined #m-labs
<rjo>
to mount the bp adapter properly, for a 235 mm subrack (every vendor has those) it would need to be longer by a few mm plus mounting holes for the angle brackets. that's the only difference.
<rjo>
marmelada: when doing vibration test you'd be surprised how nasty that can be. e.g. look at the problems with finding a proper connector for spacevpx that survives a bit of launch vibration without eating the connector. on a van when shipping the vibration over a day is not as bad as a rocket but surprisingly high.
sb0 has quit [Quit: Leaving]
<hartytp>
marmelada: thanks!
<rjo>
in short: i'd sleep much better if there was a way to screw the bp adapter to something.
<hartytp>
sb0: so, plan is to run serwb at a v low line rate with unterminated LVCMOS?
<hartytp>
rjo: good catches on Sayma bugs today. Thanks!
salientas has quit [Remote host closed the connection]
sb0 has joined #m-labs
<sb0>
rjo, hartytp, startup kernel was run 100/100 times without crash after changing serwb to lvcmos
<sb0>
100 reboots with artiq_flash start...
FabM has quit [Quit: Leaving]
<rjo>
and all that by dumb review of platform definitions?
<rjo>
sb0: let me give it a spin...
<hartytp>
nice!
<hartytp>
will test that on my hw tomorrow
<hartytp>
does that mean that boot is now 100% successful on your board?
hartytp has quit [Quit: Page closed]
<sb0>
the memory corruption repro kernel still crashes the board, but I see only a lockup now, no more garbage on UART
<sb0>
but that might just be chance
<sb0>
rjo, before changing the io standards, I only saw serwb and stapl failures - no memory-related crash
<sb0>
rjo, beware, the crashy kernel is still flashed
<rjo>
sb0: no memory test issues but the kernel crashes still?
<sb0>
rjo, yes
<sb0>
rjo, the startup kernel that was run correctly 100/100 was core_log("hello world")
<sb0>
the crash repro still locks up the board every single time
<rjo>
in different ways? or always lockup?
<sb0>
always lockup now afaict. but since this is intermittent, this needs more testing
<rjo>
the lockup is intermittent? before that kernel was crashing or locking at 100%, right?
<sb0>
it always crashes, but the symptom (reboot/lockup/garbage) is intermittent
sb0 has quit [Quit: Leaving]
iwxzr has joined #m-labs
<rjo>
vivado 2018.1 also has surprisingly/suspiciously few problems meeting timing. it estimates WNS=1.6ns early and then has 0.4 after place, and 0.6 as the initial summary early in routing. also happily sprays bufgs on the resets.
<cr1901_modern>
Oh are you also playing a fun game of "the synthesizer creates a bitstream that claims to meet timing but it crashes anyway"?
<rjo>
i still get >1/20 failed sdram init/test after a crash and serwb failures after loading the rtm. and interestingly that kernel (the test one from #1039) didn't crash once during the weekend.
hilmipilmi has joined #m-labs
<hilmipilmi>
Unless you missed it: Xilinx has released a decrypt util for their xilinxt_2017_05 encrypted source: https://pastebin.com/raw/usWNKC2W . It works and you should save it before it is removed.
<hilmipilmi>
Ok, I guess the actual private key (if it is RSA after all?) would be better but this is a beginning.
<sb0>
other than with a crash-induced reboot (after which it is reasonable to assume broken device state) I have not seen any
sb0 has quit [Client Quit]
<GitHub99>
[smoltcp] podhrmic opened pull request #237: Fix MTU settings so fragmented packets can be received (master...proper_mtu_handling) https://github.com/m-labs/smoltcp/pull/237