#m-labs on 2015-12-31 — irc logs at freenode.irclog.whitequark.org

2015-03-04 14:45 sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs

00:00 fengling has joined #m-labs

00:05 fengling has quit [Ping timeout: 256 seconds]

00:23 rohitksingh has quit [Ping timeout: 250 seconds]

00:37 rohitksingh has joined #m-labs

00:41 fengling has joined #m-labs

00:52 fengling has quit [Ping timeout: 256 seconds]

00:58 fengling has joined #m-labs

01:49 <GitHub77> [artiq] sbourdeauducq pushed 1 new commit to master: http://git.io/vEhfj

01:49 <GitHub77> artiq/master 17802d3 Sebastien Bourdeauducq: test/coredevice/primes: keep output list entirely on the host

02:12 rohitksingh has quit [Quit: Leaving.]

04:31 sb0 has quit [Ping timeout: 260 seconds]

04:40 sb0 has joined #m-labs

05:58 sb0 has quit [Quit: Leaving]

07:30 sb0 has joined #m-labs

07:48 <bb-m-labs> build #67 of artiq is complete: Failure [failed lit_test] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/67

07:58 <whitequark> let me fix this

08:36 fengling has quit [Ping timeout: 256 seconds]

09:00 <whitequark> hrm

09:07 <GitHub80> [artiq] whitequark pushed 1 new commit to master: http://git.io/vEjnx

09:07 <GitHub80> artiq/master 9ed6b54 whitequark: transforms.cfg_simplifier: remove....

09:09 <bb-m-labs> build #68 of artiq is complete: Success [build successful] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/68

09:09 <whitequark> \o/ sb0 ^

09:11 fengling has joined #m-labs

09:20 <whitequark> sb0: there is a problem with exceptions

09:20 <whitequark> sure, I can map exceptions back to something other than ARTIQException, but then you will lose the core device traceback

09:20 <whitequark> ah no wait, I can use the same hack as now, though it's gross

09:27 <whitequark> sb0: ping

09:39 <sb0> whitequark, yes?

09:40 <whitequark> I think I told you before that to make an exception recognizable after going through the core device you have to add a builtin

09:40 <whitequark> which is incorrect, of course, it just has to derive from ARTIQException

09:40 <whitequark> is that still a problem?

09:41 <sb0> I'd say no for now, as there are more pressing issues

09:42 <whitequark> I kinda started working on it already, so would you figure out if this is the semantics you want or not?

09:42 <whitequark> there are some advantages to it, for example, I have to store the core device backtrace and params somewhere

09:43 <whitequark> and if I just map it to bare Python exceptions I have to stash them into strings

09:43 <whitequark> even if I leave the original around this messes with anything that wants to examine them

09:43 <sb0> so you are proposing that all exceptions that go through the core device need to derive from ARTIQException?

09:44 <whitequark> in short, yes

09:44 <sb0> what if e.g. an RPC raises TypeError?

09:44 <whitequark> it will be reraised as an ARTIQException

09:45 <whitequark> and you will get one line of backtrace before it went through the core device, as opposed to zero

09:46 <whitequark> it's not possible to fake backtraces in python. I tried basically every imaginable option

09:46 <sb0> I don't like ARTIQException. can't you catch the exception in the RPC handler, store its class in a map of exceptions that the core device may potentially raise, and raise the correct one?

09:46 <sb0> didn't we talk about fake backtraces sometime ago?

09:47 <sb0> some templating engine did that iirc

09:47 <sb0> ...of course, all this is orders of magnitude less important than e.g. the context managers not working

09:47 <whitequark> yes, it used the C bindings to Python to manipulate exceptions

09:47 <whitequark> poking the innards of the runtime basically

09:48 <sb0> ok, why can't we recycle that?

09:48 <whitequark> because it's going to break when a new Python version comes out someday?

09:50 <sb0> what was that templating engine again?

09:50 <whitequark> https://github.com/mitsuhiko/jinja2/blob/9b4b20aa56fde3a5cd5ac49d4feacd96eacb832d/jinja2/debug.py#L276

09:50 <sb0> do they have a good track record of dealing with that?

09:51 <sb0> looks acceptable to me. also it seems to be a quite popular project...

09:51 <whitequark> "def _init_ugly_crap():" looks acceptable to you? high standards for sure

09:52 <whitequark> hm, now that I think about it, it doesn't really matter, because there is an issue that mirrors this

09:52 <whitequark> which is, you can't catch a host exception on core device properly. and it can't be solved using this hack

09:55 <sb0> why not?

09:55 <sb0> regarding compatibility: the commit log looks rather quiet, https://github.com/mitsuhiko/jinja2/commits/9b4b20aa56fde3a5cd5ac49d4feacd96eacb832d/jinja2/debug.py

09:56 <sb0> the main problem seemed to be the Python 2 to 3 transition

09:56 <whitequark> because right now, only the exception name is put into the core device exception

09:56 <sb0> and i've seen plenty of worse code that didn't call itself "ugly crap"

09:56 <whitequark> so there's no 1:1 mapping from host exceptions to exceptions on core device

09:56 <whitequark> this causes both problems

09:57 <whitequark> ARTIQException solved that by having a 1:1 mapping from core device exceptions to ARTIQExceptions

09:58 <sb0> i'd say keep ARTIQException for now. sounds like the user-friendlier way would take too much time anyway

09:59 <sb0> we can revisit later
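
The map sb0 suggests at 09:46 could look roughly like the sketch below. Every name in it (ExceptionMap, CoreDeviceError, device_traceback) is hypothetical and is not taken from ARTIQ; it only illustrates the idea of remembering host exception classes by name and rebuilding them later. It also shows the limitation whitequark raises: since only the name crosses the core device, two classes with the same name cannot be told apart.

```python
class CoreDeviceError(Exception):
    """Hypothetical fallback wrapper, playing the role ARTIQException has in the chat."""

    def __init__(self, name, message, device_traceback):
        super().__init__("{}: {}".format(name, message))
        self.name = name
        self.device_traceback = device_traceback


class ExceptionMap:
    """Hypothetical host-side registry of exception classes, keyed by class name."""

    def __init__(self):
        self._by_name = {}

    def remember(self, exn):
        # Called from the RPC handler before the exception is sent to the device.
        # Only the bare name is kept, so same-named classes overwrite each other.
        self._by_name[type(exn).__name__] = type(exn)

    def rebuild(self, name, message, device_traceback):
        cls = self._by_name.get(name)
        if cls is None:
            # Unknown name: fall back to the generic wrapper.
            return CoreDeviceError(name, message, device_traceback)
        # Assumes the class accepts a single message argument.
        exn = cls(message)
        exn.device_traceback = device_traceback
        return exn
```

With this sketch, a TypeError raised in an RPC would come back as a TypeError carrying the device traceback as an extra attribute, while an exception that was never registered comes back as the generic wrapper.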

11:15 <bb-m-labs> build #69 of artiq is complete: Failure [failed anaconda_upload] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/69

11:16 <sb0> fuck

11:16 <sb0> ah, nothing serious in fact.

11:26 <whitequark> sb0: artiq_corelog hangs right now

11:26 <whitequark> why?

11:27 <whitequark> "No route to host" 192.168.1.50 actually

11:30 <sb0> whitequark, i just checked that the "external clock not present" detection works

11:31 <sb0> it did report correctly that the clock was not there, i wonder if that crashed it too

11:31 <sb0> let me try again

11:32 <sb0> hmm, even after resetting the fpga the board won't come back up

11:32 <sb0> fuck

11:32 <sb0> the fpga answers jtag though

11:33 <whitequark> can you revert it back so i can test my code?

11:33 <sb0> i loaded the fpga from jtag (not flash) - and it came back up

11:33 <sb0> and i just tried a failed clock switch, and it didn't crash...

11:34 <whitequark> hrm

11:34 <whitequark> maybe my code crashes it?

11:34 <whitequark> yes, yes it does

11:34 bentley` has quit [Ping timeout: 272 seconds]

11:34 <whitequark> wtf

11:35 <sb0> fpga reset brought it back up this time

11:35 <whitequark> no luck getting the corelog out?..

11:36 <whitequark> can you poke memory using jtag maybe?

11:36 <sb0> maybe you crashed it while i was attempting to connect...

11:36 <sb0> corelog works here

11:36 <sb0> you can also look at the serial port if your code crashes lwip

11:36 <whitequark> well, it's empty now.

11:36 <whitequark> how do I look at the serial port?

11:37 <sb0> flterm --port /dev/ttyUSB2

11:39 <whitequark> hm, almost works

11:39 <whitequark> lwip explodes before I get the entire log on serial

11:40 <whitequark> how do you reset the FPGA?

11:41 <sb0> there are "load" and "reload" scripts in my home

11:41 <sb0> reload reads from flash

11:43 <whitequark> ok, that worked

11:50 fengling has quit [Ping timeout: 256 seconds]

11:52 <whitequark> sb0: how do you reflash?

11:53 <GitHub98> [artiq] whitequark pushed 1 new commit to master: http://git.io/vEjHT

11:53 <GitHub98> artiq/master 0b69e48 whitequark: transforms.llvm_ir_generator: compare exn typeinfo using strcmp....

11:54 <sb0> artiq_flash.sh... or you can take care of #103

11:55 <sb0> note that I didn't try flashing with openocd, only loading/resetting

11:56 <whitequark> sigh

12:04 <whitequark> sb0: can you please reflash runtime?

12:04 <whitequark> I've built the package

12:04 <sb0> what doesn't work?

12:04 <whitequark> hm?

12:05 <sb0> 0.0-py_2331+git9ed6b54?

12:05 <sb0> whitequark, ^

12:05 <whitequark> git0b69e488, it should be

12:06 <sb0> where is the package?

12:06 <whitequark> http://101.78.236.68/buildbot/builders/artiq-kc705-nist_qc1/builds/42

12:06 <sb0> ah. but the conda dependencies cause issues.

12:07 <whitequark> oh

12:07 <sb0> let me rebuild the python package

12:07 <whitequark> I already did

12:07 <bb-m-labs> build #71 of artiq is complete: Exception [exception interrupted] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/71

12:08 <bb-m-labs> build #70 of artiq is complete: Success [build successful] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/70

12:09 <sb0> flashing...

12:09 <sb0> btw you can load the runtime with flterm too

12:09 <whitequark> oh?

12:10 <whitequark> also on kc705?

12:10 <sb0> yes, it's the same BIOS

12:10 <sb0> and there's TFTP netboot even

12:11 <sb0> isn't this linking mechanism incompatible with the MPU proposal?

12:12 <whitequark> hm, it would be enough to guard writes, really

12:13 <whitequark> since the only thing we really care about are stack overflows, also maybe random unsoundness

12:13 <whitequark> hang on

12:13 <whitequark> why would it be incompatible? ksupport has a copy of all basic routines

12:14 <sb0> but e.g. strcmp is linked with the main binary, ksupport.c is compiled with it, and therefore &strcmp will point to the function in the main binary

12:14 <sb0> no?

12:15 <sb0> remember there was this problem that the main binary would not know where the functions were in ksupport, and you didn't like my elftools-based script, so you replaced that with a single symbol space

12:16 <sb0> board is reflashed and up

12:16 <whitequark> thanks

12:17 <whitequark> as for ksupport, no

12:17 <whitequark> or1k-linux-nm /tmp/kc705/software/runtime/ksupport.elf |grep strcmp

12:17 <whitequark> 40404464 T strcmp

12:18 <sb0> is it the kernel CPU doing the linking now?

12:18 <whitequark> yep

12:18 <sb0> ah, ok

12:18 <whitequark> I don't remember why exactly I made it so but there was a reason

12:18 <whitequark> yeah, you could also guard reads, to catch bugs earlier

12:19 <whitequark> ... wtf

12:19 <whitequark> it doesn't crash anymore

12:19 <whitequark> I didn't do anything that would have made it not crash

12:20 <whitequark> .... oh

12:22 <whitequark> it was a completely unrelated bug in the unwinder debug logging code which no one has noticed until now by sheer chance

12:22 <sb0> how fast is the linker? or is there still a way to load multiple kernels in advance and switch fast between them, e.g. by having a single copy of ksupport?

12:22 <whitequark> the linker should take no more than a millisecond to run

12:23 <sb0> actually if we do that, the MPU should probably be smarter as well

12:23 <whitequark> loading multiple kernels in advance should be easy

12:33 _whitelogger has joined #m-labs

12:38 _whitelogger_ has joined #m-labs

12:40 <whitequark> sb0: got kicked out... linode having connectivity issues again

12:41 <sb0> ok, nothing happened while you were gone

12:41 <whitequark> so, multiple kernels, you literally just load multiple kernels

12:41 <whitequark> since they're now PIC

12:41 <whitequark> but that's semantically unsound

12:42 <whitequark> because kernels carry the state of host objects with them.

12:44 bentley` has joined #m-labs

12:45 rohitksingh has joined #m-labs

12:53 _whitelogger_ has quit [Read error: Connection reset by peer]

12:54 _whitelogger has joined #m-labs

13:01 <whitequark> perhaps, but you have to multiplex stuff on commcpu and also give them separate stack spaces

13:04 fengling has joined #m-labs

13:34 _whitelogger has joined #m-labs

13:50 _whitelogger has quit [Excess Flood]

13:51 _whitelogger has joined #m-labs

13:55 _whitelogger has quit [Excess Flood]

13:56 _whitelogger has joined #m-labs

14:00 _whitelogger has quit [Excess Flood]

14:05 fengling has quit [Read error: Connection reset by peer]

14:05 _whitelogger_ has joined #m-labs

14:06 <GitHub78> [artiq] whitequark pushed 1 new commit to master: http://git.io/vuesC

14:06 <GitHub78> artiq/master a2618f0 whitequark: runtime/artiq_personality.c: add missing cast.

14:07 <bb-m-labs> build #73 of artiq is complete: Failure [failed lit_test] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/73

14:09 <GitHub70> artiq/master cb90bf6 whitequark: test/coredevice/portability: keep trace list entirely on host.

14:09 whitequark has joined #m-labs

14:09 <GitHub70> [artiq] whitequark pushed 1 new commit to master: http://git.io/vuesp

14:11 _whitelogger_ has quit [Excess Flood]

14:12 _whitelogger has joined #m-labs

14:19 _whitelogger has joined #m-labs

14:22 whitequark has joined #m-labs

14:23 <sb0> Boost GDP—Hyper-personalized advertising, based on quantum computation, will stimulate consumer spending.

14:25 _whitelogger has quit [Excess Flood]

14:25 _whitelogger has joined #m-labs

14:30 _whitelogger has joined #m-labs

14:30 whitequark has quit [Ping timeout: 250 seconds]

14:30 _whitelogger has quit [Excess Flood]

14:31 _whitelogger has joined #m-labs

14:34 <whitequark> I think I figured out why it crashes

14:34 <whitequark> that debug print sometimes causes an unaligned access, which raises an exception, which causes a double fault

14:35 <whitequark> what does OR1K do during double fault?

14:36 <sb0> stekern, ^

14:36 <GitHub153> [artiq] whitequark pushed 3 new commits to master: http://git.io/vueW6

14:36 <GitHub153> artiq/master 71d8cbb whitequark: runtime/artiq_personality: add missing cast.

14:36 <GitHub153> artiq/master 79d020d whitequark: transforms.artiq_ir_generator: handle terminated try body.

14:36 <GitHub153> artiq/master ff0ab73 whitequark: Commit missing parts of 8aa34ee9.

14:38 <bb-m-labs> build #74 of artiq is complete: Failure [failed lit_test] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/74

14:39 <GitHub147> [artiq] whitequark pushed 1 new commit to master: http://git.io/vuelL

14:39 <GitHub147> artiq/master 693a364 whitequark: transforms.artiq_ir_generator: fix typo.

14:42 <GitHub17> [artiq] whitequark pushed 1 new commit to master: http://git.io/vue86

14:42 <GitHub17> artiq/master 05bdd5c whitequark: Commit missing parts of 8aa34ee9.

14:42 <bb-m-labs> build #45 of artiq-kc705-nist_qc1 is complete: Exception [exception interrupted] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq-kc705-nist_qc1/builds/45

14:44 <bb-m-labs> build #75 of artiq is complete: Success [build successful] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq/builds/75

14:52 _whitelogger has quit [Excess Flood]

14:52 _whitelogger has joined #m-labs

14:54 <bb-m-labs> build #46 of artiq-kc705-nist_qc1 is complete: Success [build successful] Build details are at http://m-labs-buildserver.lan/buildbot/builders/artiq-kc705-nist_qc1/builds/46

14:58 _whitelogger has quit [Excess Flood]

15:00 whitequark has quit [Ping timeout: 260 seconds]

15:07 _whitelogger has joined #m-labs

15:28 _whitelogger has joined #m-labs

15:28 whitequark has joined #m-labs

15:46 _whitelogger has joined #m-labs

15:46 whitequark has joined #m-labs

16:08 _whitelogger has joined #m-labs

16:11 whitequark has joined #m-labs

16:28 _whitelogger has quit [Ping timeout: 240 seconds]

16:31 _whitelogger has joined #m-labs

16:53 _whitelogger has joined #m-labs

16:59 _whitelogger has joined #m-labs

17:13 _whitelogger has quit [Excess Flood]

17:13 _whitelogger has joined #m-labs

17:25 _whitelogger has quit [Ping timeout: 250 seconds]

17:28 _whitelogger has joined #m-labs

17:34 <whitequark> ok that was easy enough to fix, at least

17:35 _whitelogger has quit [Excess Flood]

17:36 _whitelogger has joined #m-labs

17:48 _whitelogger has quit [Ping timeout: 240 seconds]

17:49 _whitelogger has joined #m-labs

17:55 _whitelogger has joined #m-labs

18:07 _whitelogger has quit [Excess Flood]

18:09 _whitelogger has joined #m-labs

18:15 _whitelogger has quit [Excess Flood]

18:15 _whitelogger has joined #m-labs

18:23 _whitelogger has joined #m-labs

18:29 whitequark has joined #m-labs

18:29 _whitelogger has quit [Excess Flood]

18:29 _whitelogger has joined #m-labs

18:49 _whitelogger has quit [Excess Flood]

18:50 _whitelogger has joined #m-labs

18:52 whitequark has quit [Ping timeout: 255 seconds]

18:52 _whitelogger has quit [Excess Flood]

18:53 _whitelogger has joined #m-labs

18:57 _whitelogger has quit [Ping timeout: 240 seconds]

19:08 _whitelogger has joined #m-labs

20:04 <GitHub170> [conda-recipes] whitequark pushed 1 new commit to master: https://github.com/m-labs/conda-recipes/commit/a62e03411ef9758bbd9ee5a1964a7609e9ec42ed

20:04 <GitHub170> conda-recipes/master a62e034 whitequark: llvm-or1k: bump.

20:08 <whitequark> sb0: WTF

20:09 <whitequark> so I just looked up and apparently the machine code I generated for finally: was *never* valid

20:09 <whitequark> specifically it did stuff like

20:09 <whitequark> l.movhi r3, (.Ltmp17+1)

20:09 <whitequark> l.ori r18, r3, (.Ltmp17+1)

20:09 <whitequark> and then it jumped to r3.

20:09 <whitequark> how on earth did this ever succeed?!

20:21 _whitelogger has joined #m-labs

20:25 whitequark has joined #m-labs

20:41 _whitelogger has joined #m-labs

20:47 _whitelogger has quit [Ping timeout: 240 seconds]

20:48 _whitelogger_ has joined #m-labs

21:05 _whitelogger has joined #m-labs

21:05 whitequark has joined #m-labs

21:58 <whitequark> ok, all that's still broken in the unittests is 1) resync of attributes and 2) test_loopback_count

21:59 <whitequark> sb0, would you look at how we can deal with that test?

22:07 <whitequark> sb0: speaking of resync of attributes, I'm not quite sure what's the best implementation strategy

22:07 <whitequark> I think they should be sent using an asynchronous (fire-and-forget) RPC mechanism

22:08 <whitequark> furthermore, there is generally a large amount of attributes in any given ARTIQ Python executable, most of which are never modified

22:08 <whitequark> furthermore, there are obvious correctness concerns with the host code reading stale data if it's called from an RPC

22:09 <whitequark> I suggest sending a fire-and-forget RPC /every time/ an attribute is modified

22:09 <whitequark> this solves the problem that most attributes are never modified but still have to be written back

22:09 <whitequark> and the correctness issue

22:10 <whitequark> it doesn't introduce any more latency than the time needed to serialize an object, unless your buffer is full

22:10 <whitequark> it will be slow if you modify a very large array in a loop, but I think we can tell people to just not do that

22:11 <whitequark> I can also make a warning for it fairly easily, e.g. one that fires if you modify a list over 100 elements or something

22:11 <whitequark> objections?

22:11 * whitequark → zzz
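
The write-back scheme proposed at 22:09 can be modelled on the host in plain Python, as a rough sketch only. WritebackChannel and DeviceProxy are hypothetical names, the bounded queue merely stands in for the asynchronous RPC buffer, and the 100-element threshold is the figure floated in the chat; none of this is ARTIQ's actual implementation.

```python
import queue
import warnings

LARGE_LIST_THRESHOLD = 100  # the "over 100 elements" figure suggested in the chat


class WritebackChannel:
    """Hypothetical stand-in for a fire-and-forget RPC link to the host."""

    def __init__(self, capacity=16):
        self._pending = queue.Queue(maxsize=capacity)

    def send(self, obj_key, attr, value):
        if isinstance(value, list) and len(value) > LARGE_LIST_THRESHOLD:
            warnings.warn("writing back a list of {} elements".format(len(value)))
        # Copying the list stands in for serialization. put() blocks only
        # when the buffer is full, matching the latency claim in the chat.
        snapshot = list(value) if isinstance(value, list) else value
        self._pending.put((obj_key, attr, snapshot))

    def drain(self, host_objects):
        # Host side: apply queued writes in order and return how many were applied.
        applied = 0
        while True:
            try:
                obj_key, attr, value = self._pending.get_nowait()
            except queue.Empty:
                return applied
            setattr(host_objects[obj_key], attr, value)
            applied += 1


class DeviceProxy:
    """Hypothetical device-side view of a host object."""

    def __init__(self, key, channel, **initial):
        object.__setattr__(self, "_key", key)
        object.__setattr__(self, "_channel", channel)
        for name, value in initial.items():
            # Initial state is set directly, so it generates no write-back.
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        self._channel.send(self._key, name, value)
```

Only attributes that are actually assigned generate traffic, which is the point made at 22:09 about the many attributes that are never modified. If the host drains the queue before serving a later RPC, that RPC sees current values.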

23:23 rohitksingh has quit [Ping timeout: 264 seconds]

23:37 rohitksingh has joined #m-labs