sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
Neuron1k has quit [Quit: ZNC 1.6.1 - http://znc.in]
<GitHub> [artiq] cjbe commented on issue #670: @jordens the only thing pulling in m-labs/pyqt5 now is quamash: the recipe is fixed, but the conda package at m-labs/main has not been rebuilt with this new recipe (current build is 2016-12-01) - could you trigger a rebuild? https://github.com/m-labs/artiq/issues/670#issuecomment-283843885
FabM has quit [Ping timeout: 268 seconds]
<GitHub> [artiq] sbourdeauducq commented on issue #670: @cjbe Things are rarely simple. The rebuild fails (http://buildbot.m-labs.hk/builders/conda-lin64/builds/286). Also what to do about the documentation? https://github.com/m-labs/artiq/issues/670#issuecomment-283846489
FabM has joined #m-labs
<sb0> bb-m-labs: force build --props=package=quamash conda-lin64
<bb-m-labs> build #287 forced
<bb-m-labs> I'll give a shout when the build finishes
<bb-m-labs> build #287 of conda-lin64 is complete: Failure [failed anaconda_upload] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/287
hedgeberg|away is now known as hedgeberg
<whitequark> bb-m-labs: force build --props=package=quamash conda-lin64
<bb-m-labs> build #288 forced
<bb-m-labs> I'll give a shout when the build finishes
<bb-m-labs> build #288 of conda-lin64 is complete: Failure [failed anaconda_upload] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/288
<whitequark> sb0: you need to increment the build number
<sb0> bb-m-labs: force build --props=package=quamash conda-lin64
<bb-m-labs> build #289 forced
<bb-m-labs> I'll give a shout when the build finishes
<GitHub> [conda-recipes] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/conda-recipes/commit/2481e6f1bacd894ec5d2b59b75689a5144ff720d
<GitHub> conda-recipes/master 2481e6f Sébastien Bourdeauducq: quamash: increment build number
<bb-m-labs> build #289 of conda-lin64 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/conda-lin64/builds/289
<sb0> bb-m-labs, force build --branch=release-2 artiq
<bb-m-labs> build forced [ETA 32m53s]
<bb-m-labs> I'll give a shout when the build finishes
<GitHub> [artiq] sbourdeauducq pushed 3 new commits to release-2: https://github.com/m-labs/artiq/compare/791976e6d0ba...90aeb76a2ce6
<GitHub> artiq/release-2 007ae00 whitequark: compiler.embedding: fix an overly lax hasher.
<GitHub> artiq/release-2 fd5cdb7 whitequark: compiler.transforms: implement a typedtree printer.
<GitHub> artiq/release-2 90aeb76 whitequark: transforms.inferencer: do not unnecessarily mutate typedtree....
<bb-m-labs> build #1357 of artiq is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1357
<sb0> wtf
<sb0> bb-m-labs, force build artiq
<bb-m-labs> build forced [ETA 32m53s]
<bb-m-labs> I'll give a shout when the build finishes
<sb0> hm
<sb0> rjo, how to build release-2?
<whitequark> force build --branch=release-2 artiq
<whitequark> bb-m-labs: force build --branch=release-2 artiq
<bb-m-labs> The build has been queued, I'll give a shout when it starts
<sb0> I think the problem is the new artiq-dev conda recipe, which is in master but not in release-2
<whitequark> oh
<whitequark> yeah, you can't build it now
FabM has quit [Ping timeout: 264 seconds]
<sb0> whitequark, also I think there are some problems with the compiler (see latest master tests)
<whitequark> what, test_pulse_rate_dds failed again?
<whitequark> no, that's just the way it works *shrug*
<whitequark> I haven't changed *anything* that could affect code generation
<sb0> TypeError: unhashable type: 'TTuple'
<whitequark> hm
<sb0> many tests in test_embedding fail with that
<whitequark> oh, I missed that part somehow
<whitequark> sec
<whitequark> sb0: are you using kc705?
FabM has joined #m-labs
<sb0> no
<bb-m-labs> build #433 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/433
<bb-m-labs> build #1358 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1358
<bb-m-labs> build forced [ETA 32m53s]
<bb-m-labs> I'll give a shout when the build finishes
<bb-m-labs> build #1359 of artiq is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1359
rohitksingh_work has joined #m-labs
FabM has quit [Ping timeout: 260 seconds]
FabM has joined #m-labs
<bb-m-labs> build #434 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/434
<bb-m-labs> build #1360 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1360 blamelist: whitequark <whitequark@whitequark.org>
<GitHub> [artiq] sbourdeauducq deleted drtio at 5d145ff: https://github.com/m-labs/artiq/commit/5d145ff
FabM has quit [Ping timeout: 256 seconds]
stekern_ is now known as stekern
FabM has joined #m-labs
<whitequark> sb0: hm
<whitequark> so the core analyzer doesn't record *any* messages from the dma core
<sb0> yes, that sounds consistent with what happens on the board
kuldeep_ has quit [Remote host closed the connection]
kuldeep_ has joined #m-labs
kuldeep_ has quit [Remote host closed the connection]
<whitequark> ... hm
kuldeep_ has joined #m-labs
<whitequark> sb0: that's interesting
<whitequark> can you confirm that a runaway DMA core can hang the entire SoC
stekern has quit [Ping timeout: 255 seconds]
stekern has joined #m-labs
<whitequark> sb0: otherwise
<whitequark> I've just modified the encoder to produce the exact same thing as the test
<whitequark> no effect
<whitequark> no error raised (I deliberately put them out of sequence), no output in analyzer
<GitHub> [artiq] whitequark pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/3070a2fac1637e6ceed726fd4a79d85ec07133e3
<GitHub> artiq/master 3070a2f whitequark: runtime: fix more bugs in DMA trace encoder.
<sb0> whitequark, it should only read memory, so no it shouldn't hang
<whitequark> sb0: well I forgot a 0 at the end and the entire system hung
<sb0> the end marker?
<whitequark> yeah
<whitequark> hm
<whitequark> that's odd
<sb0> btw are you using the main (buildbot) or aux kc705?
<sb0> I'll do some tests soon (~30min) on the one you are not using
<whitequark> main
<whitequark> oh, I know why DMA isn't working
<whitequark> that's because that code ends up never actually being called
<bb-m-labs> build #435 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/435
<bb-m-labs> build #1361 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1361 blamelist: whitequark <whitequark@whitequark.org>
AndChat|326081 has quit [Quit: Bye]
AndChat326081 has joined #m-labs
<GitHub> [artiq] whitequark pushed 3 new commits to master: https://github.com/m-labs/artiq/compare/3070a2fac163...4f94709e9f9f
<GitHub> artiq/master 4f94709 whitequark: firmware: move packet dumps to the DEBUG log level.
<GitHub> artiq/master e8c093d whitequark: Allow changing runtime log level without recompilation....
<GitHub> artiq/master fe77fcc whitequark: firmware: fix a warning.
<whitequark> no, it's not, I just forgot you can't always call println in ksupport
<whitequark> hm
<sb0> I've been delaying upgrading the CPU on the buildserver because I'm pretty sure this again will go wrong...
<whitequark> that is interesting
<whitequark> I reordered some code in ksupport and now it hangs
AndChat|326081 has joined #m-labs
AndChat326081 has quit [Ping timeout: 260 seconds]
AndChat|326081 has quit [Quit: Bye]
AndChat326081 has joined #m-labs
<bb-m-labs> build #436 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/436
<bb-m-labs> build #1362 of artiq is complete: Exception [exception interrupted] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1362 blamelist: whitequark <whitequark@whitequark.org>
<GitHub> [artiq] sbourdeauducq commented on issue #562: Remaining steps:... https://github.com/m-labs/artiq/issues/562#issuecomment-282458954
<whitequark> sb0: nope, can't get it to work
<whitequark> as far as I can see the error is not in the encoding and not in the wrappers...
<sb0> can you double check the decoding code (in the gateware)?
<sb0> hey, at least no xilinx transceiver is involved!
rohitksingh_work has quit [Ping timeout: 240 seconds]
<whitequark> sb0: I hung the SoC again.
<whitequark> this happens intermittently with no discernible reason except it relates to using the DMA core.
rohitksingh_work has joined #m-labs
AndChat|326081 has joined #m-labs
AndChat|326081 has quit [Client Quit]
AndChat|326081 has joined #m-labs
AndChat326081 has quit [Read error: Connection reset by peer]
<sb0> maybe it is blocking the DRAM bus somehow?
<sb0> this shouldn't happen, but there can be a few layers of bugs there ...
<whitequark> sb0: I can reliably reproduce it
<sb0> did you align the start of the dma buffer to a sdram word?
<whitequark> oh, hm
<whitequark> what's an SDRAM word?
<sb0> 512 bits
<whitequark> so 64 bytes?
<sb0> same alignment requirement as the analyzer
<sb0> yes
<whitequark> oh.
<whitequark> well that explains it
<whitequark> also, that's a really obnoxious requirement
<whitequark> there's no way to allocate 64-byte-aligned buffers currently
<sb0> realigning stuff in gateware isn't exactly non-obnoxious either
<whitequark> yes, but you only have to do that once
<sb0> maybe you can allocate an extra 63 bits, look at the start address that you got, and pad the beginning with dummy data?
<sb0> *63 bytes
<sb0> that's *a lot* easier to debug than gateware realignment
<whitequark> yes, that's really my only choice here
<whitequark> well, it doesn't have to be 63, sec
<sb0> better than spending a week on fixing obscure gateware bugs in the realignment logic
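[editor's note] The workaround discussed above (over-allocate, inspect the start address, skip the leading padding) can be sketched roughly as follows. This is a minimal illustration, not the ARTIQ firmware code: `ALIGN`, `padding_for` and `alloc_aligned` are hypothetical names, and the only fact taken from the conversation is the 64-byte (512-bit SDRAM word) alignment requirement.

```rust
/// Alignment quoted in the discussion: one 512-bit SDRAM word, i.e. 64 bytes.
const ALIGN: usize = 64;

/// Number of padding bytes needed to round `addr` up to a multiple of ALIGN.
/// The result is between 0 and ALIGN - 1, so it is not always 63.
fn padding_for(addr: usize) -> usize {
    (ALIGN - addr % ALIGN) % ALIGN
}

/// Over-allocate by ALIGN - 1 bytes and return the backing buffer together
/// with the offset at which the aligned region of `len` bytes starts.
fn alloc_aligned(len: usize) -> (Vec<u8>, usize) {
    let buf = vec![0u8; len + ALIGN - 1];
    let offset = padding_for(buf.as_ptr() as usize);
    (buf, offset)
}

fn main() {
    let (buf, offset) = alloc_aligned(256);
    // Only this sub-slice would be handed to the DMA core.
    let aligned = &buf[offset..offset + 256];
    assert_eq!(aligned.as_ptr() as usize % ALIGN, 0);
    println!("padding used: {} bytes", offset);
}
```

The backing `Vec` has to stay alive (and must not be reallocated) for as long as the hardware may read from the aligned region.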
<sb0> also, why is it that rust, a "systems language", cannot align things in memory? alignment requirements are pretty common when dealing with hardware
<whitequark> you can't do that in C any better either?
<sb0> there's posix_memalign. but C doesn't really have memory management...
<whitequark> sure it does, malloc is in the C standard
<whitequark> rust provides __rust_allocate, which takes an alignment argument (and which is not implemented in liballoc_artiq anyway), and its collections will generally ensure the elements are aligned
<whitequark> and there's an (unstable) global attribute, which is equivalent to C's implementation-specific ones
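[editor's note] For comparison, here is a sketch of an alignment-aware allocation using the `std::alloc` API of current stable Rust. It assumes a hosted environment with the standard library and is not what the firmware of the time could use (the log says `liballoc_artiq` did not implement the alignment argument). `dma_layout` is a hypothetical helper name.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Describe a buffer of `len` bytes aligned to a 64-byte SDRAM word.
/// (Hypothetical helper; the 64-byte figure comes from the discussion.)
fn dma_layout(len: usize) -> Layout {
    Layout::from_size_align(len, 64).expect("size/alignment must be valid")
}

fn main() {
    let layout = dma_layout(256);
    unsafe {
        // The global allocator honours the alignment stored in the layout.
        let ptr = alloc_zeroed(layout);
        assert!(!ptr.is_null(), "allocation failed");
        assert_eq!(ptr as usize % 64, 0);
        // The same layout must be passed back when freeing.
        dealloc(ptr, layout);
    }
}
```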
<cr1901_modern> Rust has an attribute for packed I believe (idk if it's stable tho)
<whitequark> so their story here is similar, and also unsatisfyingly weak
<whitequark> actually, most "systems" languages are pretty bad at interfacing with hardware
<sb0> you call malloc() memory management? :)
<whitequark> yes, malloc() is memory management
<whitequark> no systems language I know of provides a decent interface for hardware registers, for example, or DMA
<whitequark> well, that's not quite it
<whitequark> there is an obscure Haskell dialect that is somewhat decent
<cr1901_modern> What do you mean by decent?
<whitequark> well, it has first-class support for bit manipulation and raw memory areas
<whitequark> so you can describe something (I think they use PCI config space as an example in the report) and the compiler will guarantee that the code you write using those definitions conforms to (certain) hardware's expectations
<cr1901_modern> By first class bit manipulation you mean "I can use masking ops as objects to be assigned to a variable"?
<whitequark> no
<whitequark> I'm unconcerned with syntax
AndChat326081 has joined #m-labs
<whitequark> see section 3.6.8
<cr1901_modern> ack
<cr1901_modern> In any case, I think Ada also lets you create type safe device drivers, but I have no experience w/ this
<whitequark> which assignment anyway, it's monadic
<whitequark> though strict
<whitequark> yes. I don't know Ada
AndChat|326081 has quit [Ping timeout: 240 seconds]
AndChat326081 has quit [Quit: Bye]
AndChat326081 has joined #m-labs
AndChat|326081 has joined #m-labs
AndChat|326081 has quit [Client Quit]
AndChat326081 has quit [Read error: Connection reset by peer]
AndChat326081 has joined #m-labs
AndChat|326081 has joined #m-labs
AndChat326081 has quit [Ping timeout: 240 seconds]
FabM has quit [Ping timeout: 258 seconds]
kuldeep_ has quit [Ping timeout: 240 seconds]
hedgeberg is now known as hedgeberg|away
kuldeep has joined #m-labs
AndChat326081 has joined #m-labs
AndChat|326081 has quit [Read error: Connection reset by peer]
<whitequark> hrm: OutputMessage(channel=62, timestamp=34016141000432, rtio_counter=222640455192, address=58706, data=0)
<whitequark> OutputMessage(channel=62, timestamp=34016141000432, rtio_counter=222640455256, address=58706, data=0)
<whitequark> oh, I need to flush the L2 cache, do I not
<whitequark> hm, no difference.
<whitequark> sb0: what does it mean that the analyzer has the following in the output:
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551849360, address=0, data=0)
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551840008, address=0, data=0)
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551849480, address=0, data=0)
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551840176, address=0, data=0)
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551849648, address=0, data=0)
<whitequark> OutputMessage(channel=114, timestamp=3886186820, rtio_counter=95551840296, address=0, data=0)
<whitequark> it looks as if there are two streams of data being interleaved
<whitequark> sb0: ok so looks like trying to feed an unaligned address into the DMA core crashes the SoC
<whitequark> now it just reads garbage data
<sb0> whitequark, it just looks like garbage data...
<whitequark> why is rtio_counter not monotonically increasing?
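[editor's note] The observation about `rtio_counter` can be checked mechanically. The sketch below uses the six counter values pasted above; `first_regression`, `is_monotonic` and `deinterleave` are hypothetical helper names, not part of any ARTIQ tool. It only shows that the sequence as a whole goes backwards while the even- and odd-indexed samples each increase, which is consistent with the "two interleaved streams" impression and says nothing about the cause.

```rust
/// Index of the first sample that is smaller than its predecessor, if any.
fn first_regression(counters: &[u64]) -> Option<usize> {
    counters.windows(2).position(|w| w[1] < w[0]).map(|i| i + 1)
}

/// True if the samples never decrease.
fn is_monotonic(counters: &[u64]) -> bool {
    counters.windows(2).all(|w| w[0] <= w[1])
}

/// Split a sequence into its even-indexed and odd-indexed samples.
fn deinterleave(counters: &[u64]) -> (Vec<u64>, Vec<u64>) {
    let even = counters.iter().step_by(2).copied().collect();
    let odd = counters.iter().skip(1).step_by(2).copied().collect();
    (even, odd)
}

fn main() {
    // rtio_counter values copied from the analyzer output in the log.
    let counters: [u64; 6] = [
        95551849360, 95551840008, 95551849480,
        95551840176, 95551849648, 95551840296,
    ];
    println!("first regression at index {:?}", first_regression(&counters));
    let (even, odd) = deinterleave(&counters);
    println!("even samples monotonic: {}", is_monotonic(&even));
    println!("odd samples monotonic: {}", is_monotonic(&odd));
}
```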
<whitequark> sb0: is there anything else i need to do with the base address?
<sb0> I don't know. weird
<whitequark> shift it somewhere, express in terms of another address space, ...
<sb0> no
<sb0> whitequark, can I take the kc705 for a while? I need the TTL loop_out>loop_in connection
<sb0> whitequark, okay done
<GitHub> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/1e6a33b586107287eae0c7b33690c70796be8ed4
<GitHub> artiq/master 1e6a33b Sebastien Bourdeauducq: rtio: handle input timeout in gateware...
<GitHub> [artiq] sbourdeauducq pushed 1 new commit to release-2: https://github.com/m-labs/artiq/commit/37c9b97bc4a62518dead2cea0ec5c45d2ef7be64
<GitHub> artiq/release-2 37c9b97 whitequark: compiler.types: add missing TTuple.__hash__ implementation.
AndChat326081 has quit [Quit: Bye]
AndChat326081 has joined #m-labs
AndChat|326081 has joined #m-labs
AndChat326081 has quit [Ping timeout: 260 seconds]
<bb-m-labs> build #437 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/437
<GitHub> [artiq] sbourdeauducq pushed 1 new commit to release-2: https://github.com/m-labs/artiq/commit/55f217ceef2657446d45f3ab26715f8f427ae28e
<GitHub> artiq/release-2 55f217c Sebastien Bourdeauducq: conda: use artiq-dev system
<sb0> bb-m-labs, force build --branch=release-2 artiq
<bb-m-labs> The build has been queued, I'll give a shout when it starts
<bb-m-labs> build #1363 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1363 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<bb-m-labs> build forced [ETA 32m53s]
<bb-m-labs> I'll give a shout when the build finishes
<GitHub> [artiq] sbourdeauducq pushed 1 new commit to release-2: https://github.com/m-labs/artiq/commit/bf868284456d540a2a5473f033eef097850dcb81
<GitHub> artiq/release-2 bf86828 Sebastien Bourdeauducq: add missing artiq-dev.yaml
<GitHub> [artiq] sbourdeauducq pushed 1 new commit to master: https://github.com/m-labs/artiq/commit/f121ca20fe158b7ceefb07e53a84654391a3c252
<GitHub> artiq/master f121ca2 Sebastien Bourdeauducq: test: relax test_pulse_rate_dds
<bb-m-labs> build #438 of artiq-board is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/438
<bb-m-labs> build #1364 of artiq is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1364
ashish_ has joined #m-labs
<bb-m-labs> build #439 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/439
<bb-m-labs> build #1365 of artiq is complete: Failure [failed python_unittest_2] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1365 blamelist: Sebastien Bourdeauducq <sb@m-labs.hk>
<whitequark> cp: cannot stat '/var/lib/buildbot/slaves/debian-stretch-amd64-2/miniconda/envs/_build:/lib/libcrypto.so.1.0.0': No such file or directory
<whitequark> mv: cannot stat '/var/lib/buildbot/slaves/debian-stretch-amd64-2/miniconda/envs/_build/lib/libcrypto.so.1.0.0-tmp': No such file or directory
<whitequark> wtf
ashish_ has quit [Ping timeout: 246 seconds]
<whitequark> sb0: no, it still hangs
rohitksingh_wor1 has joined #m-labs
rohitksingh_work has quit [Ping timeout: 246 seconds]
<whitequark> sb0: reset doesn't work correctly
<whitequark> so I'm submitting this set of bytes into the DMA core directly: https://hastebin.com/ixanunarim
<whitequark> it's aligned, at address 40822d40
<whitequark> the first time the sequence passes, and nothing is in the analyzer
<whitequark> the second time the kernel CPU hangs at rtio_arb_dma()
ashish_ has joined #m-labs
kuldeep has quit [Read error: Connection reset by peer]
kuldeep has joined #m-labs
rohitksingh_wor1 has quit [Read error: Connection reset by peer]
<whitequark> sb0: ok. I don't know how to fix it.
kuldeep has quit [Remote host closed the connection]
rohitksingh has joined #m-labs
<sb0> whitequark, are you sure all fields are little endian?
<sb0> whitequark, and one bug at a time. just reload the fpga for now
<sb0> resetting the kernel CPU does not reset the DMA core for now
ashish_ has quit [Ping timeout: 246 seconds]
<sb0> rjo, it seems that your "artiq-dev {{ "{tag} py_{number}+git{hash}".format(tag=environ.get("GIT_DESCRIBE_TAG") ..." dependency isn't working
<sb0> at least for the artiq-board builder, it works with the artiq one
<sb0> it first downloads artiq-dev 3.0 then downgrades, hm
<sb0> bb-m-labs, force build artiq
<bb-m-labs> build forced [ETA 32m53s]
<bb-m-labs> I'll give a shout when the build finishes
mumptai has joined #m-labs
<sb0> rjo, Greg is designing the DAC lanes for 10Gbps (raw), didn't we need 12.5?
ashish_ has joined #m-labs
<bb-m-labs> build #440 of artiq-board is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-board/builds/440
<bb-m-labs> build #430 of artiq-win64-test is complete: Failure [failed python_unittest] Build details are at http://buildbot.m-labs.hk/builders/artiq-win64-test/builds/430
<bb-m-labs> build #1366 of artiq is complete: Failure [failed] Build details are at http://buildbot.m-labs.hk/builders/artiq/builds/1366
<larsc> sb0: which converter do you want to use with those lanes?
<sb0> AD9154
<sb0> yea it's 10, for some reason I had 12 in mind
<larsc> AD9154 can do up to 12.5Gbps, but if your max data output rate is 1GSPS you only need 10Gbps
<larsc> ah no, mistake in my document. 10.96 Gbps is max
<rjo> sb0: i'll bring it up.
kuldeep has joined #m-labs
rqou_ has joined #m-labs
kmehall_ has joined #m-labs
kmehall has quit [*.net *.split]
rqou has quit [*.net *.split]
rqou_ is now known as rqou
ashish_ has quit [Ping timeout: 246 seconds]
AndChat326081 has joined #m-labs
AndChat|326081 has quit [Read error: Connection reset by peer]
AndChat326081 has quit [Client Quit]
AndChat326081 has joined #m-labs
AndChat|326081 has joined #m-labs
AndChat|326081 has quit [Client Quit]
AndChat|326081 has joined #m-labs
AndChat326081 has quit [Read error: Connection reset by peer]
AndChat326081 has joined #m-labs
AndChat|326081 has quit [Ping timeout: 246 seconds]
rohitksingh has quit [Quit: Leaving.]
rjo1 has joined #m-labs
rjo1 has quit [Client Quit]
rjo has quit [Quit: leaving]
rjo has joined #m-labs
<GitHub> [artiq] cjbe commented on issue #670: @sbourdeauducq I see - thanks for fixing that build. There now seems to be a working quamash (0.5.5-py_2) in m-labs/dev, but m-labs/main still has 0.5.5-py_1. Is it possible to update the main as well? https://github.com/m-labs/artiq/issues/670#issuecomment-284100936
<GitHub> [artiq] cjbe commented on issue #670: @sbourdeauducq the documentation relating to 3f556a3 / #361 ? I have not been able to reproduce this on Ubuntu - perhaps @r-srinivas may be able to reproduce? https://github.com/m-labs/artiq/issues/670#issuecomment-284101834
<whitequark> sb0: it doesn't matter if they're LE or BE, because no data is shown as submitted
<whitequark> and yes, I reload the FPGA