cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
jcea1 has joined #pypy
jcea has quit [Ping timeout: 265 seconds]
jcea1 is now known as jcea
jvesely has quit [Quit: jvesely]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 255 seconds]
ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]
jvesely has joined #pypy
jvesely has quit [Quit: jvesely]
jvesely has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
jcea has quit [Quit: jcea]
_whitelogger has joined #pypy
_whitelogger has joined #pypy
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 255 seconds]
_whitelogger has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
suhdonghwi[m] has joined #pypy
jvesely has quit [Quit: jvesely]
_whitelogger has joined #pypy
tsaka_ has quit [Ping timeout: 272 seconds]
dddddd has quit [Ping timeout: 258 seconds]
tsaka_ has joined #pypy
tsaka_ has quit [Client Quit]
tsaka_ has joined #pypy
tsaka_ has quit [Ping timeout: 260 seconds]
otisolsen70 has joined #pypy
<mattip> can anyone replicate the untranslated cpyext test crash on linux64 that is happening on the buildbot?
tsaka_ has joined #pypy
tsaka_ has quit [Ping timeout: 258 seconds]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 258 seconds]
tsaka_ has joined #pypy
Dejan has quit [Ping timeout: 255 seconds]
Dejan has joined #pypy
Dejan has quit [Ping timeout: 255 seconds]
Dejan has joined #pypy
<Dejan> mattip, I will give it a try now
<Dejan> erm ... untranslated?
<Dejan> does that mean that I do not run translation?
fryguybob has quit [Ping timeout: 240 seconds]
fryguybob has joined #pypy
tsaka_ has quit [Ping timeout: 256 seconds]
lritter has joined #pypy
<mattip> right, from a pypy checkout on linux 64 do "python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py"
<Dejan> ok, omw to do that
<Dejan> ok, with CPython 2 it works
<Dejan> it does not coredump... yet
<antocuni> marmoute: maybe you are already aware of it, but pushing to heptapod is VERY slow (at least on the pypy repo): http://paste.openstack.org/show/790585/
ekaologik has joined #pypy
<Dejan> mattip, with CPython 2 it is stuck in this state: pypy/module/cpyext/test/test_pyerrors.py .........s.....s........s.
<Dejan> it has been there for the last 15m
<Dejan> 20m now
<Dejan> i will keep it running 10m more and kill it if nothing changes
<Dejan> It seems like it got stuck in:
<Dejan> File "/home/dejan/oswork/pypy/rpython/tool/runsubprocess.py", line 61, in <module>
<Dejan> operation = sys.stdin.readline()
<mattip> can you rerun with python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py -vv
<Dejan> question is why it core dumps when executed with pypy2
<Dejan> weird...
<Dejan> it passes with -vv
ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]
<mattip> yes, it seems like a timeout in test_error_thread_race
<Dejan> i think it coredumps at pypy/module/cpyext/test/test_pyerrors.py::AppTestFetch::test_fetch_and_restore
<mattip> are you sure? this log seems to suggest the test after the last successful one
adamholmberg has joined #pypy
<Dejan> with python2 that test succeeds
<Dejan> it is really weird, with -vv and python2 it all passes
<mattip> the tests are not supposed to succeed with pypy. If you want to test translated, use pypy pytest.py -A pypy/module/cpyext/test/test_pyerrors.py
<Dejan> without -vv python2 is stuck somewhere
<Dejan> and it never finishes
<mattip> just like the buildbot
<Dejan> i will try to find the exact test where it gets stuck now
adamholmberg has quit [Ping timeout: 268 seconds]
<Dejan> it passed even without the -vv now
<Dejan> so it is completely random behaviour...
<mattip> try cleaning out the repo
<Dejan> now i executed it again, and it got stuck!
<Dejan> :D
<Dejan> it got stuck at the test_error_thread_race
<Dejan> yea, looks like some kind of race condition and it is not blocked forever
<Dejan> s/not/now/
<Dejan> and like most race conditions it is random, so now it all makes sense
tsaka_ has joined #pypy
dddddd has joined #pypy
tsaka_ has quit [Ping timeout: 255 seconds]
tsaka_ has joined #pypy
zmt01 has joined #pypy
zmt00 has quit [Ping timeout: 255 seconds]
lastmikoi has quit [Excess Flood]
lastmikoi has joined #pypy
<cfbolz> mattip: could be related to the Gil work by arigato and antocuni?
adamholmberg has joined #pypy
<Dejan> mattip, fresh clone and pytest run works well
<Dejan> i have tried it 10 times and it succeeded all 10 times
<Dejan> with "13 passed, 3 skipped in 3.73 seconds"
<Dejan> it looks like i had an old code...
<Dejan> no, i was in the tip
<Dejan> it pointed to the hpy branch
<Dejan> (i am still learning Mercurial...)
<Dejan> Blah, still the same - it freezes at test_error_thread_race
<antocuni> Dejan, cfbolz : our gil-related work was merged to default in cd7261a5a735, so if you still experience the problem it is worth trying that revision and the one immediately before (b79b87185e0b)
<Dejan> that is too advanced for me, i barely know how to use hg
<Dejan> but i will try to checkout b79b87185e0b
<Dejan> and see if i experience the same problem
<Dejan> if my boss finds out i am doing this i am dead meat
<antocuni> hg up -r REV
<antocuni> this is the equivalent of git checkout REV
<antocuni> and this might be useful as well: https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone
<Dejan> i did hg checkout b79b87185e0b
<Dejan> hg tip shows i am at that rev
<antocuni> hg tip always shows the most recent commit
<antocuni> hg id shows the one you are currently on
<Dejan> thanks
<Dejan> i am at b79b87185e0b and it all works well
<antocuni> what about cd7261a5a735?
<Dejan> i am testing that right now
<Dejan> seems like it is working pretty well
<antocuni> good
<antocuni> so probably your problem was not related to our GIL work
<Dejan> it is not my problem
<Dejan> it makes buildbot freeze
<Dejan> actually, timeout
<Dejan> that is why i started testing, when mattip asked for help
<Dejan> i am going down the "default" log, in order to find at which commit it starts working as expected
<Dejan> 8b1743e23736 is definitely failing
YannickJadoul has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Read error: Connection reset by peer]
adamholmberg has joined #pypy
<antocuni> ah but wait, it fails only if you run it on top of pypy?
<Dejan> no
<Dejan> I run python2 pytest.py pypy/module/cpyext/test/test_pyerrors.py
tsaka_ has quit [Ping timeout: 246 seconds]
jacob22 has quit [Quit: Konversation terminated!]
<antocuni> Dejan: works for me, although it doesn't mean that the bug isn't there of course
<Dejan> x86-64?
jacob22 has joined #pypy
<Dejan> repeat it a couple of times
<Dejan> it is completely random
<Dejan> it gets stuck at the test_error_thread_race
<antocuni> uh, this is interesting: http://paste.openstack.org/show/790599/
<Dejan> yep, that is the problematic one
<antocuni> I see many messages like this in the output: rpython.rtyper.debug.FatalError: GIL not held when a CPython C extension module calls 'PyGILState_Release'
<antocuni> arigato: could this be related to 0fd6d867bff6 ?
<Dejan> i did not know about the -k thread_race
<Dejan> thanks!
<antocuni> you're welcome
<antocuni> other useful pytest tricks are: -v to get more verbose output
<Dejan> that i know of
<Dejan> like ssh you can combine them with -vv for more verbose output
<antocuni> -s to avoid capturing stdout/stderr (so you see prints during the test execution)
<antocuni> --pdb to get a pdb prompt in case of exception
<antocuni> -x to stop on the first failed test
<antocuni> and --ff to run the last-failed tests first
<antocuni> so, I'm running this on 0fd6d867bff6:
<antocuni> while ./pytest.py pypy/module/cpyext/test/test_pyerrors.py -x -v -k thread_race; do let "x=x+1"; echo $x; done
<antocuni> after 3 runs, the test got stuck
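The shell loop above keeps rerunning the test but cannot notice a hang by itself. A minimal Python sketch of the same idea with a per-run timeout follows. The helper name `run_until_hang` and the 120 second limit are assumptions made for illustration; neither comes from the PyPy tooling. Note that `subprocess.run` only kills the direct child on timeout, so helper processes started by the test runner may need separate cleanup.

```python
import subprocess
import sys


def run_until_hang(cmd, max_runs=20, timeout=120):
    """Run `cmd` up to `max_runs` times.

    Returns (attempt, "hang") if a run exceeds `timeout` seconds,
    (attempt, "fail") if a run exits with a non-zero status,
    or None if every run finished successfully.
    """
    for attempt in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child and raises on timeout.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            return (attempt, "hang")
        if result.returncode != 0:
            return (attempt, "fail")
    return None


# Hypothetical usage from a pypy checkout (not executed here):
# run_until_hang([
#     "python2", "pytest.py",
#     "pypy/module/cpyext/test/test_pyerrors.py",
#     "-x", "-v", "-k", "thread_race",
# ])
```

Because the hang is intermittent, a `None` result only means the problem did not show up in `max_runs` attempts, not that the revision is good.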
* antocuni tries with 2f391434f09e
<Dejan> i am at 84b0c1357d94
<Dejan> it failed there too
<antocuni> yes, but the two suspicious changesets are cd7261a5a735 (which changed the way the GIL is implemented) and 0fd6d867bff6 (which changed the way the GIL is used in cpyext)
<Dejan> well, you know it better than me :)
<Dejan> i am using brute force to find which commit introduced this bug
<Dejan> s/commit/changeset/
<antocuni> Dejan: you should learn about hg bisect as well :)
<Dejan> can it automate this process?
<antocuni> yes
<Dejan> then by all means let's run it
<Dejan> as it takes ages...
<antocuni> to start with, it can automatically bisect, so you do log2(N) tests instead of N
<antocuni> then you can also use hg bisect -c, to give it a command to execute automatically
<antocuni> but it's not very useful in this case because it hangs
<Dejan> well, i want to bisect it with the command like yours above
<Dejan> without the infinite while
<Dejan> :D
<antocuni> well, it's not really useful because it might give you false negatives (like, a changeset seems to work but actually is still buggy)
<Dejan> ah, i need to mark changesets as good and bad...
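The log2(N) point and the false-negative caveat can be illustrated with a small Python sketch. This is not how hg bisect is implemented; it is a plain binary search over a linear list of revisions, and `first_bad`, `is_bad` and `repeats` are hypothetical names. Repeating a flaky check several times before calling a revision good lowers the chance of a false negative but does not remove it.

```python
def first_bad(revisions, is_bad, repeats=1):
    """Return the first revision for which `is_bad` reports a failure.

    Assumes `revisions` is ordered oldest to newest and that every
    revision after the first bad one is also bad. A revision counts as
    bad if any of `repeats` checks fails. Returns None if none is bad.
    """
    lo, hi = 0, len(revisions)  # the first bad index lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        # any() stops at the first failing check for this revision.
        if any(is_bad(revisions[mid]) for _ in range(repeats)):
            hi = mid          # mid is bad: first bad is mid or earlier
        else:
            lo = mid + 1      # mid looked good: first bad is later
    return revisions[lo] if lo < len(revisions) else None


# Demo with made-up integer "revisions": everything from 37 on is bad.
checked = []

def demo_is_bad(rev):
    checked.append(rev)
    return rev >= 37

print(first_bad(list(range(100)), demo_is_bad), len(checked))
```

With 100 revisions the search needs at most 7 checks per pass, compared with up to 100 for walking the log one changeset at a time.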
jcea has joined #pypy
jvesely has joined #pypy
<Dejan> would be nice if hg bisect could run concurrent jobs
<Dejan> ;)
<antocuni> so, 0fd6d867bff6 is definitely broken
<antocuni> arigato: ^^^
tsaka_ has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<Dejan> well, it is just that particular test that has some kind of race condition
<antocuni> well yes, that revision broke this test, so either one should be fixed. This is the point of having tests :)
tsaka_ has quit [Quit: Konversation terminated!]
<Dejan> antocuni, how do you track down memory leaks?
<antocuni> very broad question
<antocuni> any more specific use case or problem in mind?
<Dejan> I think I have found a memory leak in Celery... I used pympler and it clearly shows a constant increase of memory allocated for "list" object(s)
<antocuni> using pypy or cpython?
<Dejan> and my test case contains a while loop and 3 lines of code that call Celery inspect API
<Dejan> i used CPython for this
<Dejan> but I should give PyPy a try...
<Dejan> I would like to find what "list" object is constantly increasing in size
<antocuni> in this case, I'd try to bisect the code (e.g. by removing unnecessary code while checking that the leak is still there) until I pinpoint the piece of code which leaks
<antocuni> once the code is sufficiently small, you can e.g. try to use objgraph on it
<Dejan> that is very hard
<Dejan> as my test case is ~7 lines
<Dejan> it just calls Celery's inspect API...
<antocuni> Dejan: you can also use gc.get_objects() to inspect all the lists which are alive; if you find one which has >N items, you know what's inside
<Dejan> that is a good idea, i never used it before
<antocuni> Dejan: well of course if the bug is inside Celery, you need to dirty your hands inside celery code
<mattip> often it is not a bug, maybe just a memory cycle. Did you try using "for i in range(3): gc.collect()" ?
<mattip> inspecting things can create cycles by holding on to things that were designed to be released
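The two suggestions above (collect a few times, then look at the live lists via gc.get_objects()) can be combined in a short sketch. The helper name `largest_lists` and the thresholds are assumptions for illustration, and the example is written with CPython in mind, where lists are always tracked by the garbage collector.

```python
import gc


def largest_lists(min_len=1000, top=5):
    """Return up to `top` live lists with at least `min_len` items, longest first."""
    # Collect a few times first so unreachable cycles are not reported.
    for _ in range(3):
        gc.collect()
    candidates = [
        obj for obj in gc.get_objects()
        if isinstance(obj, list) and len(obj) >= min_len
    ]
    candidates.sort(key=len, reverse=True)
    return candidates[:top]


# Demo: a module-level list that keeps growing stands in for the leak.
leaky = list(range(200000))
found = largest_lists(min_len=100000)
for lst in found:
    # Printing the first items is often enough to recognise what is stored.
    print(len(lst), lst[:3])
```

Once a suspicious list is found, gc.get_referrers(lst) can show which objects still hold it. As noted above, keeping references to inspected objects can itself keep them alive, so drop `found` after looking at it.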
<Dejan> i kept tracking memory usage for 3h
<Dejan> it is constantly increasing
<Dejan> at a steady rate
<Dejan> and I got determined to find the culprit
<Dejan> mattip, i find it difficult to understand that GC does not collect in 3h
<Dejan> i am printing gc.get_stats() output every 20s
<Dejan> it constantly shows the same output :)
<Dejan> yet the process consumes more and more memory...
YannickJadoul has quit [Quit: Leaving]
<Dejan> can you guys install pycrypto in your pypy 3.6 venvs?
<ronan> AFAICT, the GIL issue is that the untranslated emulation of the RPython GIL is broken
<ronan> ATM, if any thread has the GIL, all threads think they have it
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
jvesely has quit [Quit: jvesely]
<Dejan> antocuni, do you by any chance have a binary wheel of the pycrypto package for pypy3.6 7.3.0 ?
jvesely has joined #pypy
<Dejan> it does not build here...
<antocuni> Dejan: did it work before?
<Dejan> Yep
<Dejan> I managed to install a few of our projects (one of them uses celery) and all of them depend on pycrypto
<antocuni> maybe a recent version of pycrypto broke and previous ones work?
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<antocuni> ronan: is it a new issue or was it also known before?
<mattip> the test did pass before, so if it was known before there must have been a work-around.
<mattip> if it is hard to fix we should skip the test untranslated
<antocuni> true enough
<antocuni> I admit that I didn't look at it in detail yet
<ronan> antocuni: it's new. The old code was checking cpyext_glob_tid_ptr.
<antocuni> oh
ekaologik has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
otisolsen71 has joined #pypy
otisolsen70 has quit [Ping timeout: 256 seconds]
otisolsen71 has quit [Ping timeout: 256 seconds]
xcm has quit [Killed (barjavel.freenode.net (Nickname regained by services))]
xcm has joined #pypy
mattip has quit [Ping timeout: 255 seconds]
xcm has quit [Ping timeout: 256 seconds]
xcm has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 265 seconds]
ekaologik has quit [Quit: https://quassel-irc.org - Komfortabler Chat. Überall.]
lritter has quit [Quit: Leaving]
mattip has joined #pypy
wleslie has joined #pypy