#pypy on 2020-03-12 — irc logs at freenode.irclog.whitequark.org

2019-08-29 19:33 cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin

00:04 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/6861 [default]

00:04 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-32/builds/5769 [default]

00:31 jcea1 has joined #pypy

00:32 jcea has quit [Ping timeout: 265 seconds]

00:32 jcea1 is now known as jcea

00:34 <bbot2> Failure: http://buildbot.pypy.org/builders/own-linux-s390x/builds/1479 [default]

00:36 jvesely has quit [Quit: jvesely]

00:48 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-macosx-x86-64/builds/4954 [default]

01:08 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-win-x86-32/builds/5173 [default]

01:22 adamholmberg has joined #pypy

01:27 adamholmberg has quit [Ping timeout: 255 seconds]

01:29 ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]

01:58 jvesely has joined #pypy

02:07 jvesely has quit [Quit: jvesely]

02:09 jvesely has joined #pypy

02:45 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/396 [default]

03:15 xcm has quit [Remote host closed the connection]

03:17 xcm has joined #pypy

03:27 jcea has quit [Quit: jcea]

03:50 _whitelogger has joined #pypy

04:29 _whitelogger has joined #pypy

04:53 adamholmberg has joined #pypy

04:58 adamholmberg has quit [Ping timeout: 255 seconds]

05:26 _whitelogger has joined #pypy

06:00 <bbot2> Started: http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/2917 [default]

06:00 xcm has quit [Remote host closed the connection]

06:02 xcm has joined #pypy

06:14 suhdonghwi[m] has joined #pypy

06:23 jvesely has quit [Quit: jvesely]

06:44 _whitelogger has joined #pypy

07:30 tsaka_ has quit [Ping timeout: 272 seconds]

07:46 dddddd has quit [Ping timeout: 258 seconds]

07:49 tsaka_ has joined #pypy

07:50 tsaka_ has quit [Client Quit]

07:50 tsaka_ has joined #pypy

08:02 tsaka_ has quit [Ping timeout: 260 seconds]

08:06 otisolsen70 has joined #pypy

08:10 <mattip> can anyone replicate the untranslated cpyext test crash on linux64 that is happening on the buildbot?

08:11 <bbot2> Success: http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/2917 [default]

08:26 tsaka_ has joined #pypy

08:38 tsaka_ has quit [Ping timeout: 258 seconds]

08:55 adamholmberg has joined #pypy

09:00 adamholmberg has quit [Ping timeout: 258 seconds]

09:16 tsaka_ has joined #pypy

09:29 Dejan has quit [Ping timeout: 255 seconds]

09:36 Dejan has joined #pypy

09:42 Dejan has quit [Ping timeout: 255 seconds]

09:43 Dejan has joined #pypy

09:46 <Dejan> mattip, I will give it a try now

09:55 <Dejan> erm ... untranslated?

09:55 <Dejan> does that mean that I do not run translation?

09:56 fryguybob has quit [Ping timeout: 240 seconds]

09:58 fryguybob has joined #pypy

10:11 tsaka_ has quit [Ping timeout: 256 seconds]

10:19 lritter has joined #pypy

10:23 <mattip> right, from a pypy checkout on linux 64 do "python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py"

10:24 <Dejan> ok, omw to do that

10:25 <Dejan> https://bpaste.net/HYAQ

10:26 <Dejan> ok, with CPython 2 it works

10:26 <Dejan> it does not coredump... yet

10:27 <antocuni> marmoute: maybe you are already aware of it, but pushing to heptapod is VERY slow (at least on the pypy repo): http://paste.openstack.org/show/790585/

10:37 ekaologik has joined #pypy

10:39 <Dejan> mattip, with CPython 2 it is stuck in this state: pypy/module/cpyext/test/test_pyerrors.py .........s.....s........s.

10:39 <Dejan> it has been there for the last 15m

10:43 <Dejan> 20m now

10:44 <Dejan> i will keep it running 10m more and kill it if nothing changes

10:48 <Dejan> It seems like it got stuck in:

10:48 <Dejan> File "/home/dejan/oswork/pypy/rpython/tool/runsubprocess.py", line 61, in <module>

10:48 <Dejan> operation = sys.stdin.readline()

10:49 <mattip> can you rerun with python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py -vv

10:53 <Dejan> question is why it core dumps when executed with pypy2

10:53 <Dejan> weird...

10:53 <Dejan> it passes with -vv

10:54 ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]

10:54 <mattip> yes, it seems like a timeout in test_error_thread_race

10:55 <Dejan> i think it coredumps at pypy/module/cpyext/test/test_pyerrors.py::AppTestFetch::test_fetch_and_restore

10:56 <mattip> are you sure? this log seems to suggest the test after the last successful one

10:56 <mattip> http://buildbot.pypy.org/builders/own-linux-x86-64/builds/8058/steps/shell_8/logs/pytestLog

10:57 <Dejan> https://bpaste.net/BWUA

10:57 adamholmberg has joined #pypy

10:58 <Dejan> with python2 that test succeeds

10:59 <Dejan> it is really weird, with -vv and python2 it all passes

10:59 <mattip> the tests are not supposed to succeed with pypy. If you want to test translated, use pypy pytest.py -A pypy/module/cpyext/test/test_pyerrors.py

10:59 <Dejan> without -vv python2 is stuck somewhere

10:59 <Dejan> and it never finishes

10:59 <mattip> just like the buildbot

11:01 <Dejan> i will try to find the exact test where it gets stuck now

11:02 adamholmberg has quit [Ping timeout: 268 seconds]

11:02 <Dejan> it passed even without the -vv now

11:02 <Dejan> so it is completely random behaviour...

11:03 <mattip> try cleaning out the repo

11:03 <Dejan> now i executed it again, and it got stuck!

11:03 <Dejan> :D

11:03 <Dejan> it got stuck at the test_error_thread_race

11:05 <Dejan> yea, looks like some kind of race condition and it is not blocked forever

11:05 <Dejan> s/not/now/

11:06 <Dejan> and like most race conditions it is random, so now it all makes sense

11:28 tsaka_ has joined #pypy

11:45 dddddd has joined #pypy

11:51 tsaka_ has quit [Ping timeout: 255 seconds]

11:54 tsaka_ has joined #pypy

12:02 zmt01 has joined #pypy

12:03 zmt00 has quit [Ping timeout: 255 seconds]

12:20 lastmikoi has quit [Excess Flood]

12:25 lastmikoi has joined #pypy

12:31 <cfbolz> mattip: could be related to the Gil work by arigato and antocuni?

12:49 adamholmberg has joined #pypy

13:14 <Dejan> mattip, fresh clone and pytest run works well

13:14 <Dejan> i have tried it 10 times and it succeeded all 10 times

13:14 <Dejan> with "13 passed, 3 skipped in 3.73 seconds"

13:22 <Dejan> it looks like i had old code...

13:24 <Dejan> no, i was in the tip

13:25 <Dejan> it pointed to the hpy branch

13:25 <Dejan> (i am still learning Mercurial...)

13:27 <Dejan> Blah, still the same - it freezes at test_error_thread_race

13:30 <antocuni> Dejan, cfbolz : our gil-related work was merged to default in cd7261a5a735, so if you still experience the problem it is worth trying that revision and the one immediately before (b79b87185e0b)

13:31 <Dejan> that is too advanced for me, i barely know how to use hg

13:31 <Dejan> but i will try to checkout b79b87185e0b

13:31 <Dejan> and see if i experience the same problem

13:32 <Dejan> if my boss finds out i am doing this i am dead meat

13:35 <antocuni> hg up -r REV

13:35 <antocuni> this is the equivalent of git checkout REV

13:36 <antocuni> and this might be useful as well: https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone

13:37 <Dejan> i did hg checkout b79b87185e0b

13:37 <Dejan> hg tip shows i am at that rev

13:38 <antocuni> hg tip always shows the most recent commit

13:38 <antocuni> hg id shows the one you are currently on

13:38 <Dejan> thanks

13:39 <Dejan> i am at b79b87185e0b and it all works well

13:39 <antocuni> what about cd7261a5a735?

13:39 <Dejan> i am testing that right now

13:41 <Dejan> seems like it is working pretty well

13:41 <antocuni> good

13:41 <antocuni> so probably your problem was not related to our GIL work

13:41 <Dejan> it is not my problem

13:42 <Dejan> it makes buildbot freeze

13:42 <Dejan> actually, timeout

13:42 <Dejan> that is why i started testing, when mattip asked for help

13:45 <Dejan> i am going down the "default" log, in order to find at which commit it starts working as expected

13:46 <Dejan> 8b1743e23736 is definitely failing

13:47 YannickJadoul has joined #pypy

13:47 adamholmberg has quit [Remote host closed the connection]

13:48 adamholmberg has joined #pypy

13:49 adamholmberg has quit [Read error: Connection reset by peer]

13:49 adamholmberg has joined #pypy

13:50 <antocuni> ah but wait, it fails only if you run it on top of pypy?

13:50 <Dejan> no

13:51 <Dejan> I run python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py

13:53 tsaka_ has quit [Ping timeout: 246 seconds]

13:54 jacob22 has quit [Quit: Konversation terminated!]

13:56 <antocuni> Dejan: works for me, although it doesn't mean that the bug isn't there of course

13:56 <Dejan> x86-64?

13:56 jacob22 has joined #pypy

13:57 <Dejan> repeat it a couple of times

13:57 <Dejan> it is completely random

13:58 <Dejan> it gets stuck at the test_error_thread_race

13:58 <antocuni> uh, this is interesting: http://paste.openstack.org/show/790599/

13:59 <Dejan> yep, that is the problematic one

13:59 <antocuni> I see many messages like this in the output: rpython.rtyper.debug.FatalError: GIL not held when a CPython C extension module calls 'PyGILState_Release'

13:59 <antocuni> arigato: could this be related to 0fd6d867bff6 ?

14:02 <Dejan> i did not know about the -k thread_race

14:02 <Dejan> thanks!

14:02 <antocuni> you're welcome

14:03 <antocuni> other useful pytest tricks are: -v to get more verbose output

14:03 <Dejan> that i know of

14:03 <Dejan> like ssh you can combine them with -vv for more verbose output

14:03 <antocuni> -s to avoid capturing stdout/stderr (so you see prints during the test execution)

14:03 <antocuni> --pdb to get a pdb prompt in case of exception

14:04 <antocuni> -x to stop on the first failed test

14:04 <antocuni> and --ff to run the last-failed tests first

14:07 <antocuni> so, I'm running this on 0fd6d867bff6:

14:07 <antocuni> while ./pytest.py pypy/module/cpyext/test/test_pyerrors.py -x -v -k thread_race; do let "x=x+1"; echo $x; done

14:07 <antocuni> after 3 runs, the test got stuck

14:08 * antocuni tries with 2f391434f09e

14:08 <Dejan> i am at 84b0c1357d94

14:08 <Dejan> it failed there too

14:10 <antocuni> yes, but the two suspicious changesets are cd7261a5a735 (which changed the way the GIL is implemented) and 0fd6d867bff6 (which changed the way the GIL is used in cpyext)

14:13 <Dejan> well, you know it better than me :)

14:14 <Dejan> i am using the brute force to find which commit introduced this bug

14:14 <Dejan> s/commit/changeset/

14:16 <antocuni> Dejan: you should learn about hg bisect as well :)

14:17 <Dejan> can it automate this process?

14:19 <antocuni> yes

14:19 <Dejan> then by all means lets run it

14:19 <Dejan> as it takes ages...

14:19 <antocuni> to start with, it can automatically bisect, so you do log2(N) tests instead of N

14:20 <antocuni> then you can also use hg bisect -c, to give it a command to execute automatically

14:20 <antocuni> but it's not very useful in this case because it hangs

14:20 <Dejan> well, i want to bisect it with the command like yours above

14:20 <Dejan> without the infinite while

14:20 <Dejan> :D

14:21 <antocuni> well, it's not really useful because it might give you false negative (like, a changeset seems to work but actually is still buggy)
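[editor's note] The two obstacles antocuni mentions for `hg bisect -c` (the test hangs, and a single passing run can be a false negative) can be softened by wrapping the test command in a checker that repeats it and treats a timeout as a failure. The following is a minimal Python 3 sketch under those assumptions; the function name `passes_every_time` and the default run count and timeout are hypothetical, not anything used by the PyPy developers.

```python
import subprocess
import sys


def passes_every_time(cmd, runs=10, timeout=120):
    """Return True only if cmd exits with status 0 on every run, each within timeout seconds."""
    for _ in range(runs):
        try:
            # subprocess.run kills the child process if the timeout expires.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            # A hang counts as a failure instead of blocking the bisection.
            return False
        if result.returncode != 0:
            return False
    return True


# Hypothetical call, using the test command quoted in the log:
# ok = passes_every_time(
#     ["python2", "pytest.py", "pypy/module/cpyext/test/test_pyerrors.py",
#      "-x", "-k", "thread_race"])
# sys.exit(0 if ok else 1)
```

A script that exits with 0 or 1 in this way could be handed to `hg bisect --command`, which treats exit status 0 as good and most other non-zero statuses as bad. Repeating the run only lowers the chance of a false negative for a random race; it does not remove it.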

14:30 <Dejan> ah, i need to mark changesets as good and bad...

14:31 jcea has joined #pypy

14:32 jvesely has joined #pypy

14:33 <Dejan> would be nice if hg bisect could run concurrent jobs

14:34 <Dejan> ;)

14:36 <antocuni> so, 0fd6d867bff6 is definitely broken

14:37 <antocuni> arigato: ^^^

14:52 tsaka_ has joined #pypy

15:15 xcm has quit [Remote host closed the connection]

15:16 xcm has joined #pypy

15:29 <Dejan> well, it is just that particular test that has some kind of race condition

15:34 <antocuni> well yes, that revision broke this test, so either one should be fixed. This is the point of having tests :)

15:46 tsaka_ has quit [Quit: Konversation terminated!]

16:07 <Dejan> antocuni, how do you track down memory leaks?

16:07 <antocuni> very broad question

16:08 <antocuni> any more specific use case or problem in mind?

16:09 <Dejan> I think I have found a memory leak in Celery... I used pympler and it clearly shows a constant increase of memory allocated for "list" object(s)

16:09 <antocuni> using pypy or cpython?

16:09 <Dejan> and my test case contains a while loop and 3 lines of code that call Celery inspect API

16:09 <Dejan> i used CPython for this

16:10 <Dejan> but I should give PyPy a try...

16:10 <Dejan> I would like to find what "list" object is constantly increasing in size

16:11 <antocuni> in this case, I'd try to bisect the code (e.g. by removing unnecessary code while still checking that the leak is still there) until I pinpoint the piece of code which leaks

16:11 <antocuni> once the code is sufficiently small, you can e.g. try to use objgraph on it

16:11 <Dejan> that is very hard

16:11 <Dejan> as my test case is ~7 lines

16:12 <Dejan> it just calls Celery's inspect API...

16:12 <antocuni> Dejan: you can also use gc.get_objects() to inspect all the lists which are alive; if you find one which has >N items, you know what's inside

16:12 <Dejan> that is a good idea, i never used it before

16:12 <antocuni> Dejan: well of course if the bug is inside Celery, you need to dirty your hands inside celery code

16:13 <mattip> often it is not a bug, maybe just a memory cycle. Did you try using "for i in range(3): gc.collect()" ?

16:15 <mattip> inspecting things can create cycles by holding on to things that were designed to be released
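[editor's note] The two suggestions above (collect a few times, then look through `gc.get_objects()` for unusually long lists) can be combined in a few lines. This is a rough sketch, not code from the discussion; the helper name `large_lists` and the threshold of 10000 items are arbitrary assumptions.

```python
import gc


def large_lists(min_len=10000):
    """Return the gc-tracked lists that currently hold at least min_len items."""
    # Collect a few times first so unreachable cycles are not mistaken for a leak.
    for _ in range(3):
        gc.collect()
    return [obj for obj in gc.get_objects()
            if type(obj) is list and len(obj) >= min_len]
```

Calling it periodically from the loop under test and printing `len(lst)` plus the first few items of each result may be enough to recognise what is accumulating. As mattip notes, holding on to the returned lists keeps them alive, so the result should be dropped once it has been inspected.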

16:17 <Dejan> i kept tracking memory usage for 3h

16:18 <Dejan> it is constantly increasing

16:18 <Dejan> at a steady rate

16:18 <Dejan> and I got determined to find the culprit

16:31 <Dejan> mattip, i find it difficult to understand that GC does not collect in 3h

16:35 <Dejan> i am printing gc.get_stats() output every 20s

16:35 <Dejan> it constantly shows the same output :)

16:35 <Dejan> yet the process consume more and more memory...

16:36 YannickJadoul has quit [Quit: Leaving]

17:20 <Dejan> can you guys install pycrypto in your pypy 3.6 venvs?

17:21 <ronan> AFAICT, the GIL issue is that the untranslated emulation of the RPython GIL is broken

17:22 <ronan> ATM, if any thread has the GIL, all threads think they have it
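[editor's note] The failure mode ronan describes can be illustrated with a toy model. The code below is NOT PyPy's untranslated GIL emulation; both classes and the helper are hypothetical. The first class answers "do I hold the GIL?" by checking only whether the lock is taken at all, so every thread answers yes. The second remembers the owning thread id, which is only an assumption about what a per-thread check could look like.

```python
import threading


class BrokenFakeGIL:
    """Toy emulation that only records whether the lock is taken at all."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()

    def release(self):
        self._lock.release()

    def held_by_me(self):
        # Bug: true for every thread as soon as any thread holds the lock.
        return self._lock.locked()


class OwnerTrackingFakeGIL(BrokenFakeGIL):
    """Same toy, but remembers which thread acquired the lock."""

    def __init__(self):
        super().__init__()
        self._owner = None

    def acquire(self):
        super().acquire()
        self._owner = threading.get_ident()

    def release(self):
        self._owner = None
        super().release()

    def held_by_me(self):
        return self._owner == threading.get_ident()


def seen_from_other_thread(gil):
    """Ask a second thread whether it believes it holds the GIL."""
    answer = []
    worker = threading.Thread(target=lambda: answer.append(gil.held_by_me()))
    worker.start()
    worker.join()
    return answer[0]
```

With the main thread holding the lock, `seen_from_other_thread` reports True for the broken variant and False for the owner-tracking one. A check like "GIL not held when ... calls 'PyGILState_Release'" can only be trusted if the answer is per-thread.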

17:44 xcm has quit [Remote host closed the connection]

17:49 xcm has joined #pypy

17:52 jvesely has quit [Quit: jvesely]

17:55 <Dejan> antocuni, do you by any chance have binary wheel of pycrypto package for pypy3.6 7.3.0 ?

17:56 jvesely has joined #pypy

17:56 <Dejan> it does not build here...

17:57 <antocuni> Dejan: did it work before?

17:59 <Dejan> Yep

17:59 <Dejan> I managed to install a few of our projects (one of them uses celery) and all of them depend on pycrypto

18:00 <antocuni> maybe a recent version of pycrypto broke and previous ones work?

18:04 xcm has quit [Remote host closed the connection]

18:08 xcm has joined #pypy

18:16 <antocuni> ronan: is it a new issue or was it also known before?

18:17 <mattip> the test did pass before, so if it was known before there must have been a work-around.

18:18 <mattip> if it is hard to fix we should skip the test untranslated

18:18 <antocuni> true enough

18:18 <antocuni> I admit that I didn't look at it in detail yet

18:31 <ronan> antocuni: it's new. The old code was checking cpyext_glob_tid_ptr.

18:32 <antocuni> oh

18:38 ekaologik has joined #pypy

19:18 xcm has quit [Remote host closed the connection]

19:21 xcm has joined #pypy

19:53 otisolsen71 has joined #pypy

19:57 otisolsen70 has quit [Ping timeout: 256 seconds]

19:58 otisolsen71 has quit [Ping timeout: 256 seconds]

21:04 xcm has quit [Killed (barjavel.freenode.net (Nickname regained by services))]

21:05 xcm has joined #pypy

21:23 mattip has quit [Ping timeout: 255 seconds]

21:25 xcm has quit [Ping timeout: 256 seconds]

21:29 xcm has joined #pypy

21:51 adamholmberg has quit [Remote host closed the connection]

21:51 adamholmberg has joined #pypy

21:56 adamholmberg has quit [Ping timeout: 265 seconds]

22:08 ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]

22:22 lritter has quit [Quit: Leaving]

23:10 mattip has joined #pypy

23:40 wleslie has joined #pypy