#pypy on 2019-09-13 — irc logs at freenode.irclog.whitequark.org

2019-08-29 19:33 cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin

00:00 <bbot2> Success: http://buildbot.pypy.org/builders/jitbackendonly-own-linux-armhf-v7/builds/1584 [default]

00:22 <bbot2> Failure: http://buildbot.pypy.org/builders/rpython-win-x86-32/builds/191 [default]

00:25 BPL has quit [Quit: Leaving]

00:31 <bbot2> Success: http://buildbot.pypy.org/builders/rpython-linux-x86-64/builds/219 [default]

01:00 <bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-macosx-x86-64/builds/4588 [py3.6]

01:00 <bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/6446 [py3.6]

01:00 <bbot2> Started: http://buildbot.pypy.org/builders/own-linux-x86-64/builds/7677 [py3.6]

01:00 <bbot2> Started: http://buildbot.pypy.org/builders/own-linux-x86-32/builds/6633 [py3.6]

01:00 <bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-32/builds/5416 [py3.6]

01:02 marky1991 has quit [Ping timeout: 276 seconds]

01:18 <bbot2> Failure: http://buildbot.pypy.org/builders/own-win-x86-32/builds/2116 [default]

01:22 <bbot2> Success: http://buildbot.pypy.org/builders/rpython-linux-aarch64/builds/18 [default]

02:02 _whitelogger has joined #pypy

02:08 Rabid_Python has quit [Quit: KVIrc 4.9.3 Aria http://www.kvirc.net/]

02:30 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/6446 [py3.6]

03:07 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-32/builds/5416 [py3.6]

03:18 <bbot2> Failure: http://buildbot.pypy.org/builders/own-linux-aarch64/builds/61 [default]

03:18 <bbot2> Started: http://buildbot.pypy.org/builders/own-linux-aarch64/builds/62 [py3.6]

03:33 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-macosx-x86-64/builds/4588 [py3.6]

03:36 <bbot2> Failure: http://buildbot.pypy.org/builders/own-linux-x86-32/builds/6633 [py3.6]

04:02 <bbot2> Failure: http://buildbot.pypy.org/builders/own-linux-x86-64/builds/7677 [py3.6]

04:04 dddddd has quit [Remote host closed the connection]

04:41 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-win-x86-32/builds/4811 [default]

04:41 <bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-win-x86-32/builds/4812 [py3.6]

05:00 <bbot2> Started: http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/2717

05:26 cjwelborn has quit [Quit: sys.exit(0)]

06:32 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/59 [default]

06:32 <bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/60 [py3.6]

06:35 <mattip> cfbolz: there is an own failure, some interaction with _readline and utf8

06:35 <mattip> http://buildbot.pypy.org/summary/longrepr?testname=test_readline&builder=own-linux-x86-64&build=7676&mod=pypy.module._io.test.test_interp_textio

06:36 <cfbolz> mattip: thanks, will fix in a bit

06:37 <mattip> cool, thanks for the speedups and fixes

06:37 <cfbolz> mattip: after this change is merged to py3.6 we are beating CPython3 on AliReza's issue

06:38 <cfbolz> Not massively, but by 20% or so

07:01 <mattip> there is not alot of overhead in "read from stdin, write to stdout" in cpython, so even pulling even is an achievement

07:14 <cfbolz> mattip: we are jitting the loop, so I think our io and unicode infrastructure still has some overhead

07:15 <cfbolz> Also, we are hitting the ASCII case in this example, things are probably different for 'real' unicode

07:16 <bbot2> Success: http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/2717

07:52 antocuni has joined #pypy

08:32 ajlawrence has joined #pypy

08:37 <cfbolz> arigato: V8 is doing compressed pointers: https://docs.google.com/document/d/10qh2-b4C5OtSg-xLwyZpEI5ZihVBPtn1xwKBbQC26yI/mobilebasic

08:42 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-win-x86-32/builds/4812 [py3.6]

09:22 <kenaan> cfbolz default 6e25c50447f0 /pypy/module/_io/: fix corner case about readline with limit that goes beyond the end of the string

09:45 BPL has joined #pypy

10:25 antocuni has quit [Ping timeout: 245 seconds]

10:34 <mattip> yay for hypothesis and corner cases

10:39 <cfbolz> mattip: yes, what would we do without it

10:43 dddddd has joined #pypy

10:49 ajlawrence has quit [Remote host closed the connection]

11:01 <kenaan> cfbolz py3.6 6e832892a7f7 /: merge default

11:05 Masklinn has joined #pypy

12:01 xcm has quit [Remote host closed the connection]

12:03 xcm has joined #pypy

12:27 <Dejan> cfbolz, this paper looks like they are still far from compressed pointers in V8

12:28 antocuni has joined #pypy

12:31 <cfbolz> Dejan: I read it as they are getting there ;-)

12:42 astronavt has quit [Quit: ...]

12:43 astronavt has joined #pypy

12:53 <Dejan> Exactly 6h ago: "<bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/60 [py3.6]"

12:53 <Dejan> and it is still waiting for lock

12:57 Masklinn has quit []

12:59 <Dejan> Do you guys need more aarch64 builders?

12:59 <Dejan> I have an AArch64 VPS that is mostly idle

13:00 <Dejan> on scaleway.com

13:14 <cfbolz> Dejan: talk to mattip, we might!

13:16 Dejan has quit [Quit: Leaving]

13:29 Dejan has joined #pypy

13:33 <bbot2> Exception: http://buildbot.pypy.org/builders/own-linux-aarch64/builds/62 [py3.6]

13:38 xcm has quit [Read error: Connection reset by peer]

13:40 xcm has joined #pypy

13:57 marky1991 has joined #pypy

13:58 marky1991 has quit [Remote host closed the connection]

13:58 marky1991 has joined #pypy

14:05 marky1991 has quit [Ping timeout: 265 seconds]

14:08 <astronavt> do packages that don't work with pypy generally fail to pip install? or would i have to try it and find out the hard way

14:08 <astronavt> what _doesn't_ work with pypy anyway at this point? just certain C extensions?

14:09 <simpson> There are some documented quirks of CPython as well: http://pypy.readthedocs.io/en/latest/cpython_differences.html

14:09 <simpson> Generally, code will install fine, if its dependencies are compatible, but only fail at runtime.

14:10 <astronavt> ah thats too bad. i guess youd need the developer to be a good citizen and check for non-cpython in setup.py

14:10 <astronavt> or maybe theres a way to specify in pyproject.toml

14:10 <simpson> The most common failure mode you might not already know about is file-descriptor exhaustion, which occurs when people forget to close what they've open()'d.

14:10 <Dejan> astronavt, that is why there are thousands of tests to run

14:11 <astronavt> simpson that would mean the library developer isnt using 'with open'? i never want to install those packages anyway :P

14:13 <astronavt> the weakref stuff looks more problematic

14:13 <astronavt> and __del__ not being called reliably

14:13 <simpson> astronavt: Such code is depressingly common IRL. And folks don't always come to that conclusion; I recall a session with some folks from Intel a few years ago, where they weren't sure whether their code was buggy, or PyPy was buggy.

14:13 <simpson> Well, as you say, there's style, and then there's smell. .__del__() is already a risky method.

14:14 <astronavt> good point

14:16 <astronavt> "For example, a generator left pending in the middle is — again — garbage-collected later in PyPy than in CPython." this is where i imagine stuff like memory leaks could arise even if you arent messing around with weak references and such

14:18 <simpson> In usual use, no, it's not a problem. You'd have to leak the generator in order to leak its frame. What's really being communicated here is a meta-pattern: PyPy doesn't have reference-counting, so unlike with CPython, sometimes objects aren't destroyed immediately upon becoming inaccessible.

14:18 <astronavt> i see. so it will still be gc'ed if it goes out of scope, just not _immediately_

14:19 <Dejan> I am not an expert, but i think CPython does not use RC either

14:19 <Dejan> It uses generational GC

14:19 <simpson> Right. It's like any other GC'd language with finalizers; the finalizers run when the objects are reaped, but that's disconnected from when the objects are dereferenced.

14:19 <astronavt> i thought reference counting was like the heart and soul of cpython semantics

14:20 <simpson> Dejan: CPython has a GC and also RC. The GC chiefly is used to break cyclic references.

14:20 <astronavt> thats the only reason i know what reference counting even is

14:20 <astronavt> "C-API Differences" <- so other than stuff that uses CPython internals, a "correct" C extension should work, albeit maybe with extra overhead

14:21 <Dejan> how that hybrid even works?

14:21 <astronavt> looks like basic cython code actually works w/ cpyext, cool! http://docs.cython.org/en/latest/src/userguide/pypy.html

14:21 <Dejan> with if both GC decides to free some memory referenced by RC?

14:21 <Dejan> or vice-versa

14:21 <simpson> Dejan: All objects are registered with the GC and also have an RC field. RC happens on every access. GC runs when memory is low.

14:22 <simpson> GC can read the RC field.

14:22 <Dejan> aha!

14:22 <Dejan> thanks!

14:23 <simpson> astronavt: "It's about time." It's really too bad that the Cython team insists on integration on their terms; it could be a lot smoother and simpler if they stopped insisting on so much C.

14:23 <astronavt> what do you mean "insisting on so much C"

14:24 <simpson> Cython's promise is that we can write code that is basically Python, and it goes fast. PyPy's promise is that we can write Python, and it goes fast.

14:25 <simpson> There's another part of Cython focused on CFFI, but really folks should be using cffi for that.

14:26 <Dejan> I do lots of D coding for my personal stuff mostly, at work I use Python predominantly... I came here as a Java programmer...

14:26 <Dejan> to me, PyPy is the reason why I started coding Python

14:27 <Dejan> CPython is just too slow for some use-cases, and I do not want to have to re-write pieces of code in Cython, or C/D

14:29 <Dejan> PyPy makes me believe Python has future :)

14:48 ronan has joined #pypy

14:55 Dejan has quit [Quit: Leaving]

15:04 ronan has quit [Ping timeout: 240 seconds]

15:04 antocuni has quit [Ping timeout: 276 seconds]

15:08 <arigato> cfbolz: compressed pointers: yay?

15:08 <arigato> this document throws some possible plans around

15:09 <cfbolz> arigato: yes, I agree that we don't know it will actually work in the end

15:09 <arigato> it limits the total memory to 4GB though

15:09 <arigato> the old branch in which I played with that compressed pointers to still allow for 32GB of memory

15:11 marky1991 has joined #pypy

15:12 <cfbolz> arigato: right

15:13 <cfbolz> arigato: did our branch work with the jit, do you remember?

15:13 <arigato> no, it didn't

15:14 <arigato> (i.e. yes I remember)

15:14 <cfbolz> Right

15:15 <arigato> also it did the thing that can't work on OS X, which is convincing the kernel to give us the initial 4 GB of addressable space

15:16 <arigato> note that the stm branches we did later implement something similar to the more general indirection needed

15:16 <arigato> basically they implement everything we'd need except they still used 64 bits to represent the modified pointers

15:17 <arigato> ("modified" as in made relative to a single global address (or thread-local in the case of stm))

15:17 <cfbolz> arigato: right, good to know if we ever want to try again

15:18 <cfbolz> arigato: and the jit was supporting stm, right?

15:18 <arigato> yes

15:18 <arigato> so we'd know exactly the places to fix, at least

15:19 <cfbolz> Cool

15:21 <arigato> I'm still a bit skeptical, but maybe a plain max-4-GB-for-everything-allocated-by-the-GC would work to avoid most speed overheads (and maybe be faster in the end)

15:22 <cfbolz> arigato: why is that so much faster than 32gb?

15:22 <arigato> it would of course consume less RAM, but because the limit is 4 GB, it's not very useful for machines with a lot more than 4 GB

15:24 <arigato> the 32 GB version has got overhead because it needs to shift by three bits, and then also (as the V8 page discusses) it's a simplification if someone can look at 8 bytes of memory and get its value as a GC pointer whether it is "compressed" or not

15:24 <arigato> but maybe that's not a great argument, it might only say we were lazy somewhere

15:25 <cfbolz> arigato: the overhead was a small percentage only in any case, iirc

15:26 <arigato> OK

15:27 <arigato> then maybe we should (one day) try again with the 32 GB limit, but this time including all GC arrays? I think GC arrays were allowed to be anywhere

15:27 <arigato> in theory because it makes the memory limit much higher, but in practice because our GC uses plain malloc for large arrays

15:28 <cfbolz> arigato: right

15:28 <arigato> (the stm branch has got that fixed, because large arrays need to be allocated together with the rest)

15:29 <arigato> (and the GC header was still 64 bits when we could also force that inside 32 bits, etc.)

15:29 <cfbolz> arigato: are any of the changes there generally useful and we should merge them at some point?

15:30 <arigato> I'm not sure

15:30 <arigato> remember that it had its own GC written in C

15:30 <cfbolz> Right

15:32 <bbot2> Success: http://buildbot.pypy.org/builders/jitbackendonly-own-linux-armhf/builds/1832 [default]

16:35 Dejan has joined #pypy

16:38 dansan has quit [Remote host closed the connection]

16:43 dansan has joined #pypy

17:03 <bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/60 [py3.6]

17:56 marky1991 has quit [Ping timeout: 240 seconds]

18:01 <Dejan> the failed test is weird

18:05 marky1991 has joined #pypy

18:11 <Dejan> where can i find the source: pypy/interpreter/test/apptest_coroutine.py

18:11 <Dejan> (I am looking at http://buildbot.pypy.org/builders/pypy-c-jit-linux-aarch64/builds/60/steps/shell_12/logs/pytestLog and I found the test in this file failed)

18:11 <Dejan> thing is... the test should not fail

18:12 <Dejan> the result is the same what i have on CPython

18:12 <Dejan> unless with PyPy we expect different behaviour?

18:30 <arigato> so we do gc.collect() in this test to make sure the coroutines are collected and the warnings emitted, but we also find some coroutines left around from previous tests

18:30 <cfbolz> Dejan: fwiw, it fails on x86 too

18:31 <arigato> yes, my quick fix was enough (see b8f000ca2554)

18:32 <kenaan> arigo py3.6 b8f000ca2554 /pypy/interpreter/test/apptest_coroutine.py: Emit warnings from unrelated older tests before catching the warnings from this precise test

18:33 <arigato> so there were other coroutine objects left lying around, which were waiting for the next GC to emit a warning

18:33 energizer_ is now known as energizer

18:34 <arigato> that's why the test failed by getting 3 warnings where only 1 was expected

18:49 <cfbolz> arigato: I've been wondering, this is the struct for the utf8 index storage:

18:49 <cfbolz> UTF8_INDEX_STORAGE = lltype.GcArray(lltype.Struct('utf8_loc_elem',

18:49 <cfbolz> ('ofs', lltype.FixedSizeArray(lltype.Char, 16)),

18:49 <cfbolz> ('baseindex', lltype.Signed),

18:49 <cfbolz> ))

18:49 <cfbolz> arigato: due to alignment, every array entry will be 2*wordsize big, right?

18:50 <cfbolz> ah, sorry. no that works out fine

18:50 <cfbolz> it's three words per array entry on 64 bit

19:11 marky1991 has quit [Ping timeout: 265 seconds]

19:14 <arigato> yes

19:15 <cfbolz> arigato: I feel there are still some optimizations missing somehow. eg if we have a string of length 1 and we index it, we make such an array of size 1

19:16 <cfbolz> (probably to not make a bridge that checks the size?)

19:18 <arigato> maybe create_utf8_index_storage could use a cache for very small string sizes?

19:19 <cfbolz> arigato: to share the structures?

19:19 <arigato> yes

19:20 <cfbolz> right

19:20 <arigato> maybe just for utf8len==1

19:20 <cfbolz> for size 1 there are four different ones, for size 2 16

19:20 <cfbolz> right

19:20 <arigato> yes

19:21 marky1991 has joined #pypy

19:21 <arigato> or for utf8len <= 3

19:21 <cfbolz> ah no, it's less

19:22 <cfbolz> the ascii ones are never created

19:22 <arigato> yes, and there is only the position of indices 1, 5, 9,...

19:22 <cfbolz> true

19:23 <cfbolz> arigato: I think the other optimization that would be useful is to not create a structure at all for things like s[1:] and s[:-1]. we could only do that optimization in the jit, with jit.isconstant

19:24 <arigato> ouch, yes

19:24 <cfbolz> ok, will look into these

19:25 <arigato> s[:-1] could be done always, by just copying the _index_storage of the bigger string and assuming we're not really loosing memory

19:25 <arigato> s[1:] is quite a bit harder

19:25 <arigato> ah no, I see what you mean

19:25 <cfbolz> arigato: the goal for both would be to never even create the index

19:26 <arigato> yes

19:26 <cfbolz> by just calling next_codepoint_pos

19:26 <arigato> yes

19:26 <cfbolz> ok, on it :-)

19:26 <arigato> :-)

19:27 <arigato> right now, even doing s[0] builds the index I think