cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
<arigato>
ronan: if you want to turn off test_recompiler.py in py3.6 I'm fine with it (it runs equivalent tests from extra_tests/cffi_tests/ after translation)
<arigato>
I always do cffi changes in default first and merge them to py3.6 without much trouble
<kenaan>
arigo py3.6 2d88bbee870d /pypy/module/_cffi_backend/test/test_recompiler.py: A large speedup to the own test 'test_recompiler' in py3.6
<arigato>
fixed this particular slowness to some extent (200 seconds)
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
tsaka__ has joined #pypy
BPL has joined #pypy
BPL has quit [Remote host closed the connection]
antocuni has quit [Ping timeout: 245 seconds]
mattip_ has joined #pypy
mattip_ has quit [Ping timeout: 245 seconds]
mattip_ has joined #pypy
antocuni has joined #pypy
mattip_ has left #pypy [#pypy]
dddddd has joined #pypy
BPL has joined #pypy
BPL has quit [Remote host closed the connection]
Dejan has joined #pypy
antocuni has quit [Ping timeout: 265 seconds]
BPL has joined #pypy
marky1991 has quit [Read error: Connection reset by peer]
<kenaan>
cfbolz py3.6 bd340e819dcd /pypy/interpreter/unicodehelper.py: now I get to what I actually wanted to achieve: a fast path in utf8_encode_utf_8 for the common case where no surrog...
oberstet has quit [Remote host closed the connection]
ronan has quit [Ping timeout: 246 seconds]
marky1991 has quit [Remote host closed the connection]
marky1991 has joined #pypy
marky1991 has quit [Remote host closed the connection]
<Alex_Gaynor>
cfbolz: 99.9% of my rows have no multibyte codepoints, so the only important property is that it's cheap to check if you need one, and free otherwise
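The fast path Alex_Gaynor describes can be sketched as follows; this is a minimal illustration of the idea (cheap check, free otherwise), not the actual code in PyPy's unicodehelper:

```python
def encode_utf8_fast(s: str) -> bytes:
    # Fast path: a pure-ASCII string encodes byte-for-byte, so one cheap
    # check covers the overwhelmingly common case and costs nothing extra
    # for strings with multibyte codepoints.
    if s.isascii():  # Python 3.7+; implemented as a tight scan in C
        return s.encode("ascii")
    # Slow path: general UTF-8 encoding for multibyte codepoints.
    return s.encode("utf-8", "surrogatepass")

print(encode_utf8_fast("hello"))   # takes the ASCII fast path
print(encode_utf8_fast("héllo"))   # falls through to the general encoder
```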
<cfbolz>
right
<gutworth>
18:17 < Alex_Gaynor> glyph: we desperately need better security metrics, all the implicit balancing of concerns in "All software has bugs, it's how you respond to them, but also you really shouldn't have like
<gutworth>
1k vulns, but also just counting CVEs is bad" just leaves us with nothing :-/
<gutworth>
sorry
<cfbolz>
gutworth: hey benjamin :-)
<cfbolz>
how's the cpython sprint?
<gutworth>
hi, cfbolz
<gutworth>
good; lots more to be done than a week allows as usual
<cfbolz>
sure :-)
<gutworth>
we were discussing more pure python implementations of stdlib modules
<gutworth>
how useful do you think it would be to take some of the pypy reimplementations and move them to the cpython stdlib?
<simpson>
It'd be extremely healthy for CPython. Folks seem to love those ordered dictionaries by default, after all.
ronan has joined #pypy
<simpson>
Also, less C is always good, isn't it?
<gutworth>
I'm not sure I follow?
<gutworth>
there wouldn't necessarily be less C
<gutworth>
just two implementations for more stdlib modules
<simpson>
But we can hope.
<Alex_Gaynor>
gutworth: It'd be good, it'd mean when pypy updated python versions, we wouldn't have to do stdlib work on our pure python impls
phoe64 has joined #pypy
<cfbolz>
gutworth: would be good, but so far the track record of CPython for keeping C/Python versions in sync isn't that good - see datetime, where the specialized C API functions don't work at all on the pure python versions
<gutworth>
yes, it's a bit tricky for the ones that export c apis
<cfbolz>
yes, not a simple problem admittedly
ronan__ has joined #pypy
ronan has quit [Ping timeout: 246 seconds]
ronan__ has quit [Remote host closed the connection]
ronan__ has joined #pypy
dunpeal has joined #pypy
<dunpeal>
Hi. Question: I remember going to speed.pypy.org a few months ago, and it stated PyPy was overall almost x8 faster than CPython. Now it says it's only x4.3 faster. What happened?
<simpson>
dunpeal: "overall" is probably the faulty reasoning. speed.pypy.org shows a selected list of benchmarks. In general, you'll have to benchmark your application of interest in order to find out how PyPy compares to CPython on that particular app.
<simpson>
e.g. I remember a 60x improvement on a microbenchmark that exercised only one part of an application, but a mere 20x improvement in a fuller benchmark that exhibited human-like usage.
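The advice to benchmark your own application can be sketched with the stdlib's timeit; the workload here is a hypothetical stand-in, and the numbers it prints are only meaningful when you run the same script under both CPython and PyPy:

```python
import timeit

def workload():
    # Stand-in for the code path you actually care about; replace this
    # with a representative slice of your application, not a microbenchmark.
    return sum(i * i for i in range(10_000))

# Run the same snippet under CPython and under PyPy and compare wall times.
elapsed = timeit.timeit(workload, number=100)
print(f"100 iterations: {elapsed:.3f}s")
```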
<dunpeal>
simpson: OK, but still, why is the figure shown by speed.pypy.org right now about half what it showed a few months ago?
<cfbolz>
dunpeal: to answer your question concretely: our benchmarking server died
<cfbolz>
and we started from scratch, and the world changed a bit
<dunpeal>
cfbolz: I see. In that case, what figure is more truthful?
<cfbolz>
dunpeal: both :-)
<simpson>
Aha. I was anticipating a less-dire version of that, where some benchmarks had been added to or removed from the list.
<cfbolz>
dunpeal: you really need to look at your concrete hardware and your concrete code
<dunpeal>
I'm asking since we're evaluating PyPy for a project to get better performance, and I'm not sure I can justify it with x4.
<cfbolz>
yes, sorry, it's not possible to predict without trying
<dunpeal>
What's the status of the numpy stack support on PyPy?
<dunpeal>
Numpy, scipy, pandas etc?
<cfbolz>
it works nowadays
<dunpeal>
Same performance and features as on CPython?
<cfbolz>
if your code relies very heavily on numpy, pypy is probably not going to help your performance
<dunpeal>
Sure, but will numpy perform worse on PyPy, or the same?
<cfbolz>
dunpeal: basically every single numpy call has a slightly higher overhead on pypy. that means if you mostly do things in a proper vectorized way, you are fine. if you have python loops over arrays a lot, it can be worse
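The vectorized-vs-loop distinction cfbolz draws can be shown concretely; this assumes numpy is installed, and both versions compute the same sum of squares:

```python
import numpy as np

a = np.arange(10_000, dtype=np.float64)

# Vectorized: a single numpy call, so the per-call overhead (higher on
# PyPy than on CPython) is paid once for the whole array.
total_vec = float(np.dot(a, a))

# Element-wise Python loop: every iteration crosses the Python/C
# boundary, so the per-call overhead is paid once per element -- the
# pattern that can be slower on PyPy.
total_loop = 0.0
for x in a:
    total_loop += float(x) * float(x)
```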
<dunpeal>
I see, so it's the same status it was some months ago, though the way you speak of it makes it sound like the overhead was reduced.
<cfbolz>
ah, if your question is "did it get a lot faster in the last few months" the answer is no ;-)
<cfbolz>
we are hoping to start an effort to optimize the C-API from october on or so
<dunpeal>
How will that work? IIRC there were some fundamental/architectural reasons PyPy's C API couldn't be made as fast as CPython's.
<simpson>
dunpeal: That 60x example was from me changing a numeric kernel from using Numpy on CPython to using plain array.array on PyPy. It doesn't seem to me that Numpy is fast, just written in low-level languages.
<simpson>
Write less C, write more Python, go faster.
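The kind of plain-Python numeric kernel simpson describes can be sketched with a hypothetical moving-average function over array.array; this is an illustration of the pattern (unboxed doubles, a loop PyPy's JIT can compile), not the original code from the anecdote:

```python
from array import array

def moving_average(xs: array, window: int) -> array:
    # Plain-Python kernel over array.array: unboxed C doubles, no
    # per-call library overhead, and a hot loop a tracing JIT handles well.
    out = array("d")
    acc = 0.0
    for i, x in enumerate(xs):
        acc += x
        if i >= window:
            acc -= xs[i - window]       # slide the window forward
        if i >= window - 1:
            out.append(acc / window)
    return out

data = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])
print(list(moving_average(data, 3)))    # [2.0, 3.0, 4.0]
```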
<dunpeal>
simpson: that's good, however anyone using the numpy/pandas stack needs a lot more than array.array
<dunpeal>
*provides
<cfbolz>
dunpeal: yes, there are, but we can close the gap
<cfbolz>
(we are missing a number of the optimizations that CPython has, for example)
<simpson>
dunpeal: False; I was using Numpy and I turned out not to need it. I certainly see your point, but I hope that you see mine: Numpy isn't magical, just a pile of Somebody Else's C & FORTRAN.
<simpson>
Again, ultimately, there's no substitute for actually getting your app onto PyPy and running your benchmark suite for yourself.
<cfbolz>
yes, I agree with that last part
<dunpeal>
cfbolz: that would be cool. RE earlier discussion, what would be a ballpark speedup figure for a common pure Python application that doesn't rely heavily on C calls?
<dunpeal>
If PyPy's C-API becomes as fast as CPython's, then it seems the last major reason to avoid porting to PyPy would be eliminated.
<cfbolz>
we'll see ;-)
<cfbolz>
I won't commit to a number, the spread is too wide
<dunpeal>
Too bad, I was going to sue you if you were off, there goes my retirement plan.
<dunpeal>
cfbolz: are there any other reasons?
<cfbolz>
eh :-)
<dunpeal>
Last I looked at porting to PyPy, the only major problem that stuck out was the C-API overhead.
<simpson>
Sure, if it's not possible to leave C-API. However, it should always be possible to leave C-API, if one can tolerate using cffi instead.
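The cffi route simpson mentions can be sketched in ABI mode, which needs no C compiler at run time; this assumes cffi is installed and a POSIX system, where dlopen(None) opens the already-loaded C library:

```python
from cffi import FFI

ffi = FFI()
# Declare the C function we want to call (ABI mode: parsed, not compiled).
ffi.cdef("int atoi(const char *nptr);")
# dlopen(None) returns a handle to the C library itself on POSIX.
libc = ffi.dlopen(None)

print(libc.atoi(b"42"))   # calls the real libc atoi
```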
<cfbolz>
dunpeal: well, the fact is that the performance spread is too wide and RAM usage can be high. there is a bit of porting overhead always, and if you then find out that the improvement is 10% you might be annoyed
<dunpeal>
Yeah, 10% would suck.
<dunpeal>
Would you say PyPy is comparable to other contemporary JIT VMs like V8?
<cfbolz>
the techniques are very similar, but the other ones have had at least 10x the investment
<cfbolz>
so they apply them more consistently and cover all the corners better (sometimes - then sometimes pypy is just better ;-) )
<simpson>
PyPy is very roughly comparable, in that there's the same shape underneath. However, the only other Python implementation I'd class with PyPy is ZipPy.
<dunpeal>
cfbolz: Interesting. I thought the type of JIT techniques applied by PyPy were actually very different than V8's JIT?
<cfbolz>
dunpeal: you mean tracing vs method based, and hand written vs generated?
<dunpeal>
Yes.
<dunpeal>
I don't know the latter one (hand written?) but the tracing vs method based.
<dunpeal>
Oh, PyPy is a JIT generator, now I remember
<dunpeal>
Whereas V8's JIT is "hand written".
<cfbolz>
Yes, that's true. But in the end we often achieve very similar effects. Sometimes tracing wins, sometimes method based. And then there are parts where we basically do the same thing, such as the way we both implement user-defined objects, and lists and dicts
<dunpeal>
Very interesting, from what I recall of learning about tracing vs method based, they seemed to be fundamentally different approaches.
<cfbolz>
That's true. But both approaches then work hard to get the best from the other one, thus getting somewhat closer to each other again ;-)
<cfbolz>
simpson: zippy is sort of similar, but they cheat in some ways. Eg they don't support frame objects
<cfbolz>
simpson: there is now TrufflePython though
<simpson>
cfbolz: Yeah. They've got a different approach to deopt, as well.
<simpson>
I think that that might be the biggest distinguisher. All of these JITs have "the trick" of monomorphizing megamorphic call sites one way or another, but their approaches to deopt are all quite different.
<dunpeal>
Someone on #python just said: "lukasz langa, the CPython3.9 release manager, is planning to de-emphasize the central role of CPython"
<dunpeal>
Does that mean that as of 3.9, PyPy can expect a more central placement in the forefront of "official" Python runtimes?
<cfbolz>
dunpeal: it's a bit unclear what 'de-emphasizing the central role' actually means in practice
<tos9>
put pypy download links on python.org hooray
<tos9>
(47% serious suggestion)
<cfbolz>
tos9: not bad
<dunpeal>
Well, that sounds like a practical implementation of "de-emphasizing CPython's central role".
<dunpeal>
Linking any implementation that isn't CPython.
<simpson>
I'll believe it when I see it.
<cfbolz>
I'd like 'consistently use the word CPython instead of python when appropriate'
mattip has quit [Ping timeout: 246 seconds]
dunpeal has quit [Quit: leaving]
marky1991 has quit [Ping timeout: 245 seconds]
<cfbolz>
dunpeal: another difference between V8 and pypy is that js is a much smaller language than python, and in particular its C API (=browser and node integration) is under almost full control of the VM team
antocuni has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
marky1991 has joined #pypy
mattip has joined #pypy
marky1991 has quit [Read error: Connection reset by peer]
marky1991 has joined #pypy
<mattip>
the discussion about dual implementations python/c-extension is currently being driven by Grumpy, they seem to be creating a pure-python _codecs module