00:34
<
GitHub6 >
artiq/master 2ae20fb Sebastien Bourdeauducq: runtime: cleanup now_init/now_save
00:34
<
GitHub6 >
artiq/master 8e30848 Sebastien Bourdeauducq: runtime: save now when terminating with exception
00:34
<
GitHub6 >
artiq/master 917cc05 Sebastien Bourdeauducq: test: add test for seamless handover on exception termination
00:35
sb0 has joined #m-labs
00:35
<
GitHub131 >
artiq/release-1 5f8b02a Sebastien Bourdeauducq: runtime: save now when terminating with exception
00:35
<
GitHub131 >
artiq/release-1 e069ce9 Sebastien Bourdeauducq: runtime: cleanup now_init/now_save
00:35
<
GitHub131 >
artiq/release-1 5baba5f Sebastien Bourdeauducq: test: add test for seamless handover on exception termination
01:05
kristian1aul has quit [Quit: Reconnecting]
01:05
kristianpaul has joined #m-labs
02:57
stekern has quit [Ping timeout: 258 seconds]
03:36
stekern has joined #m-labs
04:00
DocScrutinizer05 has quit [Disconnected by services]
04:00
DocScrutinizer05 has joined #m-labs
04:02
sb0 has quit [Quit: Leaving]
04:39
rohitksingh_work has joined #m-labs
05:43
EvilSpirit has joined #m-labs
05:57
sb0 has joined #m-labs
06:26
FabM has joined #m-labs
07:07
cyrozap has quit [Quit: Client quit]
07:07
cyrozap has joined #m-labs
09:18
FabM has quit [Remote host closed the connection]
09:19
FabM has joined #m-labs
09:32
<
whitequark >
sb0: ok, this is not lwip's keepalive
09:32
<
whitequark >
I've disabled keepalive and the bug still occurs
09:32
<
whitequark >
let me try and get a pcap
10:00
acathla has quit [Quit: Coyote finally caught me]
10:01
acathla has joined #m-labs
10:07
sb0 has quit [Quit: Leaving]
10:12
key2 has joined #m-labs
10:13
sb0 has joined #m-labs
10:17
sandeepkr has quit [Ping timeout: 272 seconds]
10:18
kuldeep has quit [Ping timeout: 276 seconds]
10:18
sb0 has quit [Client Quit]
10:28
sandeepkr has joined #m-labs
10:29
kuldeep has joined #m-labs
10:46
fengling has joined #m-labs
10:53
fengling has quit [Ping timeout: 240 seconds]
11:00
key2 has quit [Ping timeout: 250 seconds]
11:03
EvilSpirit has quit [Ping timeout: 244 seconds]
11:03
acathla has quit [Quit: Coyote finally caught me]
11:04
acathla has joined #m-labs
11:11
fengling has joined #m-labs
11:15
acathla is now known as fabien
11:16
fengling has quit [Ping timeout: 240 seconds]
11:19
fabien is now known as acathla
11:19
acathla has quit [Changing host]
11:19
acathla has joined #m-labs
11:34
sb0 has joined #m-labs
11:35
<
whitequark >
sb0: please do not disturb the kc705 for now, I'm trying to get a capture with no keepalive enabled
11:36
<
whitequark >
not sure why it takes so much longer to reproduce that in the lab
11:36
<
whitequark >
I tried ping-flooding the network but it didn't seem to do anything
11:36
<
whitequark >
rather, ping-flooding the kc705
11:36
<
whitequark >
maybe it's just too fast for that stupid 'load
11:43
rohitksingh has joined #m-labs
12:08
<
whitequark >
stupid 'load'
12:08
<
whitequark >
I mean loading the network connection
12:17
cyrozap has quit [Ping timeout: 264 seconds]
12:17
bentley` has quit [Remote host closed the connection]
12:26
cyrozap has joined #m-labs
12:35
<
whitequark >
sb0: hm
12:35
<
whitequark >
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
12:35
<
whitequark >
socket.gaierror: [Errno -3] Temporary failure in name resolution
12:36
<
whitequark >
have you considered not resolving it every time you open a connection?
12:36
sandeepkr has quit [Read error: Connection reset by peer]
12:36
<
whitequark >
(I assume this happens with the scheduler as well)
12:36
<
whitequark >
because this seems like a really stupid reason for an experiment to fail
12:39
key2 has joined #m-labs
12:50
<
sb0 >
what is that from, why should it fail, and why is the solution to cache name resolutions?
12:52
<
whitequark >
that's called by self.open in comm_tcp
12:52
<
whitequark >
it fails because if you're making ten thousand DNS queries, eventually one of them will not go through for some stupid reason
12:53
<
whitequark >
I don't know the exact reason it failed just now
12:53
<
whitequark >
also, I wonder how much time that accounts for
12:53
<
whitequark >
won't show up on the profiler...
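The suggestion above (resolve the name once instead of on every connection) could look roughly like the sketch below. It is not the code in comm_tcp: `resolve` and `open_connection` are hypothetical helper names, and the cache is a plain in-process `functools.lru_cache` with no expiry, which assumes the address of the device does not change while the process runs.

```python
import socket
from functools import lru_cache


@lru_cache(maxsize=None)
def resolve(host, port):
    """Resolve (host, port) once and remember the result (hypothetical helper).

    lru_cache does not store exceptions, so a failed lookup
    (socket.gaierror) is retried on the next call instead of being cached.
    """
    # Each entry is (family, type, proto, canonname, sockaddr).
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return tuple(infos)


def open_connection(host, port, timeout=None):
    """Connect using the cached resolution instead of querying DNS each time."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in resolve(host, port):
        sock = socket.socket(family, socktype, proto)
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and try the next resolved address, if any.
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses found for %r" % (host,))
```

With this, only the first `open_connection` call for a given host and port reaches the resolver; later calls reuse the stored addresses, so a transient resolver failure can only affect a call that has nothing cached yet.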
12:55
sandeepkr has joined #m-labs
13:29
rohitksingh has quit [Quit: Leaving.]
13:55
rohitksingh_work has quit [Read error: Connection reset by peer]
14:02
FelixVi has joined #m-labs
14:38
FelixVi has quit [Remote host closed the connection]
15:29
sb0 has quit [Quit: Leaving]
15:57
rohitksingh has joined #m-labs
16:42
rohitksingh has quit [Read error: Connection reset by peer]
16:51
bentley` has joined #m-labs
17:57
<
GitHub53 >
artiq/master f5deafb Robert Jordens: browser: add a debug message for OSError on HDF5 open
17:59
mumptai has joined #m-labs
18:08
kuldeep has quit [Ping timeout: 240 seconds]
18:08
sandeepkr has quit [Ping timeout: 264 seconds]
18:31
<
whitequark >
rjo: so, hm, seems to be a bug in lwip keepalive
18:31
<
whitequark >
shall I just turn that off?
18:43
<
rjo >
whitequark: are you certain? the dumps don't indicate that.
18:44
<
rjo >
if we turn it off we need to implement heartbeat.
18:46
<
rjo >
and i remember (somewhat foggy memory though) triggering something like that behavior by running the pipistrello at higher ppp speeds (more buffer juggling).
18:47
<
rjo >
and that getaddrinfo() caching is #161 on asyncio
18:49
<
whitequark >
the dumps do not, but I no longer observe this problem after turning off keepalive
18:50
<
whitequark >
I do see other intermittent failures, that are much rarer, and never involve keepalive packets
18:52
<
rjo >
ok. i am fine with turning it off and seeing whether the alternative is "less bad" for now.
18:52
<
whitequark >
okay, let's do that.
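The heartbeat mentioned above as the replacement for TCP keepalive is not specified in this log. As a minimal sketch of the general idea only: each side periodically sends a small application-level message and treats the peer as gone if nothing has arrived for some time. The class name `HeartbeatMonitor` and the `interval` and `timeout` values are assumptions made for illustration, not anything from ARTIQ or lwip.

```python
import time


class HeartbeatMonitor:
    """Bookkeeping for an application-level heartbeat (hypothetical helper)."""

    def __init__(self, interval=5.0, timeout=15.0, clock=time.monotonic):
        self.interval = interval  # seconds between heartbeats we send
        self.timeout = timeout    # silence after which the peer is presumed dead
        self.clock = clock        # injectable clock, so the logic can be tested
        now = clock()
        self.last_sent = now
        self.last_received = now

    def on_receive(self):
        """Call whenever any data (heartbeat or payload) arrives from the peer."""
        self.last_received = self.clock()

    def should_send(self):
        """True once at least `interval` seconds have passed since the last send."""
        return self.clock() - self.last_sent >= self.interval

    def mark_sent(self):
        """Call after a heartbeat has been written to the connection."""
        self.last_sent = self.clock()

    def peer_is_dead(self):
        """True if nothing has been received for longer than `timeout`."""
        return self.clock() - self.last_received > self.timeout
```

The connection loop would call `should_send()` and `peer_is_dead()` on each iteration and close the socket when the latter returns true. Unlike TCP keepalive, this check runs in the application, so it does not depend on how the TCP stack handles keepalive packets.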
18:55
<
GitHub119 >
artiq/master 0db6ef0 whitequark: runtime: disable lwip TCP keepalive....
19:03
sandeepkr has joined #m-labs
19:03
kuldeep has joined #m-labs
19:06
sandeepkr has quit [Max SendQ exceeded]
19:06
sandeepkr has joined #m-labs
19:07
kuldeep has quit [Max SendQ exceeded]
19:07
kuldeep has joined #m-labs
19:15
<
rjo >
and you can try a newer lwip. it's just that that will most likely break ppp.
19:17
<
mumptai >
is there a way to evaluate an fhdl expression in an interactive python session?
20:50
Gurty has quit [Ping timeout: 244 seconds]
21:14
Gurty has joined #m-labs
22:14
mumptai has quit [Remote host closed the connection]