<kenaan>
mattip unicode-utf8-py3 914068c8b956 /: remove most runicode from pypy, refactor FormatErrorW, add utf8 to SocketError
<kenaan>
mattip unicode-utf8-py3 18628545b899 /: win32 fixes, still uses runicode for str_decode_mbcs
<cfbolz>
mattip: thanks for the reminder, I'll fix the regalloc test
<cfbolz>
did you get home ok?
ajlawrence has joined #pypy
<kenaan>
cfbolz default 413357a1f973 /rpython/jit/backend/x86/test/test_regalloc.py: make the test not fail; we could do better here, but that should be part of looking at some more examples of the result...
illume has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
illume has joined #pypy
antocuni has joined #pypy
<antocuni>
hi guys
<antocuni>
unless there are objections, I'll send out the release announcement soon
<mattip>
+1
<arigato>
+1
<mattip>
I did not completely follow the build situation; are all the uploads (2.7, 3.5, 3.6) updated to the proper tags?
<fijal>
Alex_Gaynor: I would say "minimal"
<fijal>
Alex_Gaynor: as in, usually we are chasing pointers and arithmetic units are just bored
<fijal>
but it depends on your use case, do you mean pypy or something else?
ajlawrence has quit [Ping timeout: 256 seconds]
<mattip>
float or int overflow?
<antocuni>
mattip: the source tarballs are in sync with the release tag, which e.g. for 2.7 points to c8805ee6d784
<mattip>
antocuni: good, and win32 is up-to-date as well?
<antocuni>
some of the binary builds could point to the previous commit, but it doesn't matter as c8805ee6d784 is only relevant at translation time
<fijal>
I assumed int
<antocuni>
mattip: yes, win32 is the latest I built
<mattip>
nice, thanks
<fijal>
was great to have such a busy sprint
<Ninpo>
BTW the performance "regressions" I suspected over the weekend were network variance. pypy 3.5 and 3.6 v7 are at least as quick as v6
<Ninpo>
(for my use case anyway)
<Ninpo>
Haven't got around to memory profiling yet, hopefully that'll be today
<mattip>
but I fear it is only a symptom of the real problem, which is something else
<cfbolz>
someone2: while it was designed for dynamic languages, it works for different ones too
<Ninpo>
mattip: I fixed it
<mattip>
Ninpo: ?
<Ninpo>
I made a package and the problem went away. I noticed in my other envs there was a pypy openssl lib with the version 7 in the name
<Ninpo>
and we only copied in the libpypy-c so
<mattip>
ahh, so 71 vs 7
<mattip>
sorry to mislead you
<Ninpo>
yeah
<Ninpo>
No worries :)
<Ninpo>
running a test now
<mattip>
about the website - I have reached out to #python-infra and they are starting to look up who has access to the machine
<mattip>
github has github pages, which can host static websites. Apparently bitbucket does too, but only for git repositories?
<Ninpo>
mattip: so this build munches my mysql result set into a dict much much faster, but overall run time (DB communication via threads) is almost twice as slow.
<Ninpo>
32.8s vs 50.2s
<antocuni>
mattip: it would be enough to have a way to force-trigger a pypy.org update, and have a bitbucket pipeline which does it when we push
<mattip>
I think that is what we have
* mattip
checking
<mattip>
maybe the hook changed
<antocuni>
I don't see any bitbucket-pipeline.yml in the repo
<Ninpo>
mattip: anything I can do here to help you see where the slowdown is?
<Ninpo>
it has occurred to me for this to be a fair test I should build stable version 7
<mattip>
Ninpo: what kind of db communication do you use? I know you said so a few days ago, can you repeat?
<Ninpo>
pymysql via sqlalchemy_aio and trio, to a unix socket locally
<Ninpo>
mattip: ^
<mattip>
Ninpo: do benchmarks exist for trio? Maybe it is slower on the branch
<Ninpo>
I can ask njs
<Ninpo>
Oh he's here :)
<Ninpo>
mattip: I'm going to build 35 v 7.0 though as I'm not comparing like for like, build wise. My v7.0 is the portable binaries made by squeaky_pl
<Ninpo>
Who earlier mentioned they build with cpython
<mattip>
if that matters it is a bug
<Ninpo>
mattip: a lot is changing in my comparison atm though, right? I'm building against different libraries, using pypy instead of cpython (and squeaky's pypy2.7v6) etc etc
<Ninpo>
Don't want to give you duff info if it's as a result of the build process or some library I've got
oberstet has quit [Remote host closed the connection]
oberstet has joined #pypy
<fijal>
Ninpo: it's rather unlikely
<fijal>
Compared to the likelihood that something is off with pypy
<antocuni>
fijal: do you know how to force an update of pypy.org?
<Ninpo>
have faith :) besides it's free for me to check, I'm all in now anyway
<fijal>
antocuni: no
<mattip>
someone should have login access to virt-7tac5q.psf.osuosl.org
antocuni has quit [Remote host closed the connection]
<Ninpo>
yeesh if ever there's a good indicator of how much faster pypy is than cpython, it's building pypy
antocuni has joined #pypy
<simpson>
Ninpo: I've gotten into the habit of putting timers or tqdm or similar on my small scripts, and then the difference becomes visible. It's often 2-5x faster.
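The habit simpson describes can be as small as a perf_counter wrapper; a minimal sketch (the `timed` helper and the workload are made up for illustration):

```python
import time

def timed(label, fn):
    """Call fn() and report wall-clock time; a tiny stand-in for tqdm on short scripts."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

# placeholder workload; a real script would wrap its actual hot loop
total = timed("sum of squares", lambda: sum(i * i for i in range(10**6)))
```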
<antocuni>
mattip: some of the links in the download page are broken :(
<antocuni>
(my fault)
<antocuni>
I'm fixing them now
<mattip>
antocuni: scipy CI builds with latest nightlies on pypy3.5
<mattip>
antocuni: push the page and I will copy-paste again
<kenaan>
antocuni pypy.org[extradoc] 43d47b661208 /: fix the download links and regenerate HTML
<antocuni>
mattip: links fixed and pushed
<mattip>
antocuni: check now?
<antocuni>
mattip: seems good, thanks again
<kenaan>
mattip pypy.org[extradoc] 2cfb8fb9d732 /README: document the site update process
ajlawrence has quit [Quit: Page closed]
<cfbolz>
antocuni: thanks for being the release manager!
<antocuni>
not a particularly good one, admittedly :)
<marmoute>
antocuni: did the release make it out? Looks like yes
<marmoute>
So, not too bad
<antocuni>
yeah, it only took half a day to smooth out all the issues, not too bad :)
<antocuni>
the fact that nobody reported the broken links should tell us something, btw
<antocuni>
(not sure what)
<Ninpo>
Er, I did
<Ninpo>
Well I reported the v6 thing
<Ninpo>
Just realised you meant the download links on the fixed v7 page
<antocuni>
ah, I'm stupid of course
<Ninpo>
I wasn't implying that!
<antocuni>
nobody realized that the links were broken because nobody had a chance to see the v7 website anyway :)
<antocuni>
Ninpo: sure, I was talking to myself
<Ninpo>
hehe ok
<Ninpo>
I follow the zen of python when it comes to insults (explicit etc) :D
<Ninpo>
19% away from finding out if it's my libs, or building on cpython that slows it down
<Ninpo>
er not building it on cpython
<antocuni>
don't worry, I surely didn't take it as an offense; the "I'm stupid" part was meant to be read together with the "nobody realized" part
<Ninpo>
It's cool, I'm a dinosaur that struggles to get by in the more modern easy to offend people internet
<Ninpo>
So I worry about that
<Ninpo>
Appreciate you clearing it up :)
<antocuni>
btw, pip install scipy works well on my machine, not sure why it fails on travis
<mattip>
antocuni: numpy 1.16 changed the use of exec_command in f2py. It now calls Popen() rather than some really old thing
<mattip>
which numpy is the travis recipe using?
<antocuni>
mattip: I think it does a "pip install scipy" on a clean environment, so it automatically brings in numpy 1.16.1
Garen has quit [Read error: Connection reset by peer]
Garen has joined #pypy
<mattip>
antocuni: you are building scipy==1.0.0 ? I think 1.2 has been released
<mattip>
can we close release branches that have not been updated in more than 5 years?
<mattip>
cfbolz: ^^^
<antocuni>
mattip: oh, that's true. I am building BOTH scipy==1.0.0 and scipy (which collects 1.2.1)
<antocuni>
but the error is about scipy==1.0.0
<antocuni>
I suppose I can just avoid building scipy 1.0.0 for now
<mattip>
+1
<cfbolz>
mattip: yes please
illume has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<mattip>
did we ever get an issue out of the crashing example for cpyext that someone was working on at the sprint?
<Ninpo>
hmm, the build process with cpython2.7 has been seemingly stuck for a long time at starting stackcheckinsertion, 100% cpu with lots of brk(0x55febe468000)-type lines in strace. Is this normal (and usually much faster on pypy)?
<cfbolz>
mattip: nope, will ping tim again
<cfbolz>
Ninpo: yes it's probably normal, and yes faster on pypy
<Ninpo>
rog
<Ninpo>
I'll wait then :)
<mattip>
<spam>
<kenaan>
mattip default c080e584813c /pypy/module/zlib/test/test_zlib.py: skip test that crashes on old zlib version
<kenaan>
mattip rmod-radd-slots ce5a8c7604a6 /: close abandoned branch
<kenaan>
mattip ndarray-promote cf24682f0f6d /: close abandoned branch
<kenaan>
mattip pypy3-release-2.6.x 7cf7426ddc77 /: close abandoned branch
<kenaan>
mattip pypy3-release-2.3.x 25560ec3b2f5 /: close abandoned branch
<kenaan>
mattip pypy3-release-2.4.x 072df3de15c5 /: close abandoned branch
<kenaan>
mattip release-pypy3.3-v5 6dfb3af9716d /: close abandoned branch
<kenaan>
mattip release-5.x 1ce3b640e7d7 /: close abandoned branch
<kenaan>
mattip numpy_broadcast_nd a32aff107924 /: close abandoned branch
<kenaan>
mattip cpyext-inheritance fba889ae9aaa /: close abandoned branch
<kenaan>
mattip cpyext-debug-type_dealloc 91b5766bb8c6 /: close abandoned branch
<kenaan>
mattip override-tp_as-methods 6e767f90b99a /: close abandoned branch
<kenaan>
mattip matplotlib 6fcafa0bb5ea /: close abandoned branch
<kenaan>
mattip win32-vmprof e4e1582c4390 /: close abandoned branch
<kenaan>
mattip non-linux-vmprof-stacklet-switch-2 fa16a566a0a7 /: close abandoned branch
<mattip>
</spam>
kipras has joined #pypy
<antocuni>
mattip: am I supposed to send the release announcement to some mailing lists other than pypy-dev? The "how-to-release" mentions python-dev and python-announce, but I can't find previous pypy announcements in the archives
ssbr has quit [Remote host closed the connection]
ssbr has joined #pypy
<Ninpo>
mattip: well cpython2.7 build ruled out. Trying a build with squeaky_pl's docker build setup now
<squeaky_pl>
Ninpo, the BUILD.rst is not updated, I'm guilty of not documenting the changes.
<Ninpo>
squeaky_pl: no worries I'm working my way through. How easily can I make this use pypy for the build?
<Ninpo>
Just overwrite the prefix64/cpython dir?
<squeaky_pl>
you would need to change Dockerfile to download PyPy as last step and untar it somewhere
<squeaky_pl>
then delete CPython compilation step from build_deps, and put pypy as python on your path
<Ninpo>
Ok thanks
<squeaky_pl>
as in symlink pypy as python
<Ninpo>
yeah
<Ninpo>
How are you adding the /opt/cpython bit to path?
marky1991_2 has joined #pypy
beystef has joined #pypy
<squeaky_pl>
Ninpo, it's added in build and package to $PATH
<squeaky_pl>
I dont remember why I needed it just there and not globally but there was a reason
marky1991 has quit [Ping timeout: 246 seconds]
antocuni has quit [Ping timeout: 268 seconds]
<Ninpo>
wheyy and we're off
<Ninpo>
squeaky_pl: a patch to curses failed, should I be worried?
<Ninpo>
squeaky_pl: aw, you turned the mandelbrot off :P
<Ninpo>
man alive the difference in speed on the rtyping/rtyper phase between pypy and cpython is utterly ridiculous
<Ninpo>
squeaky_pl: it looks like you have it in both places once for the build phase and once for the packaging part. Presumably so no system pythons interfered
<Ninpo>
I'm building it with YOUR pypy2.7-v7 build :D squeakyception xD
<kenaan>
stevie_92 cpyext-gc-trialdeletion 8ec9653041d2 /: Close branch cpyext-gc-trialdeletion.
asmeurer_ has joined #pypy
someone2 has quit [Quit: Page closed]
<squeaky_pl>
Ninpo, that patch is different for pypy2 and pypy3, one of them always fails. I didn't turn off the mandelbrot; Docker by default doesn't allocate a TTY and the translation toolchain turns it off automatically
<Ninpo>
ahhhhh
<Ninpo>
gotcha
<Ninpo>
squeaky_pl: Incidentally I'm not long away from letting you know how building with pypy2.7 went :)
<Ninpo>
I'm attempting to determine why your builds are a _hell_ of a lot faster than mine done on opensuse
<Ninpo>
leap 15
<squeaky_pl>
Ninpo, I'm using latest bleeding edge GCC, maybe that's a clue
<squeaky_pl>
As in there is GCC 8.2 portable pulled into the build in the Dockerfile
<squeaky_pl>
should not affect JIT but interpreter maybe
<arigato>
mattip: sorry, missing links added to pypy.org/download.html
<Ninpo>
squeaky_pl: ah potentially.
<mattip>
arigato: ok, updating by hand
<mattip>
arigato: done
<Ninpo>
aaaand we're compiling! please don't be slow please don't be slow....
* Ninpo
closes eyes and crosses fingers
<kenaan>
rlamy Opcode-class 0cdae6b1a07e /: Close obsolete branch
<Ninpo>
squeaky_pl: fwiw, the gcc version my build ran was 7.3.1
<Ninpo>
squeaky_pl: hmm the package step falls over with find `pypy-` no such file or dir
<Ninpo>
ah I was missing the revision argument
marky1991 has joined #pypy
marky1991 has quit [Remote host closed the connection]
<Ninpo>
Ok, moment of truth..
marky1991_2 has quit [Ping timeout: 240 seconds]
marky1991 has joined #pypy
<Ninpo>
Argh it's still slower, what on earth...
<squeaky_pl>
Ninpo, maybe that branch you are testing has a speed regression for your use case
<Ninpo>
True, I'll try your build steps on a known good version. That said, it's running similar time as the version 7 I built by hand as a baseline (again, slower than your build)
<Ninpo>
I checked building with cpython2.7 on my system that didn't make a difference either
<Ninpo>
I wonder if it's an optimisation it's picking up for this CPU?
<squeaky_pl>
Ninpo, during build time? I would expect JIT to be agnostic from the CPU used during build time.
<Ninpo>
what about the gcc build
<Ninpo>
oh I'm being dumb that's a portable build
<mattip>
Ninpo: the same code runs faster when using squeaky_pl's build, even though both of you use the same version of pypy?
<mattip>
s/use/build/
<Ninpo>
I built that branch you gave me, I'm about to build 3.5-7 release
<Ninpo>
Just weird that all of my builds so far have the same 20s or so performance regression
<mattip>
no, not so weird if they all are building the same pypy version
<mattip>
it would be weird if one build was faster
<Ninpo>
they weren't all. I built a 7.0 release myself and that had the same issue
<Ninpo>
so I built your build, had a performance regression. Built the earlier branch you gave me, same issue. Tried a version 7.0 build, same issue. So I tested with using cpython as squeaky_pl does, same issue. Now I just built your branch with squeaky_pl's build system, same issue. About to build 3.5-7.0 with squeaky_pl's build system using pypy2.7, and if that has the same issue, I'll use squeaky's build tool
<Ninpo>
again with their defaults and go find something to do for a few hours :P
<mattip>
ah. So all the locally built versions are slower, and only the downloaded is faster (so far)?
<Ninpo>
so far yes
<squeaky_pl>
is it even remotely possible that something from the host build system "contaminates" speed?
<mattip>
can you compare the downloaded version's packaged *.so support library versions to yours?
<Ninpo>
with an md5sum or something you mean?
<mattip>
what os and version are you using?
<squeaky_pl>
Ninpo, the files typically have SONAME encoded in ELF headers
<Ninpo>
opensuse leap15
<Ninpo>
squeaky_pl: starting to approach going over my head now, can you elaborate?
<mattip>
in the portable package, there is a lib directory. It has various *.so files
<Ninpo>
ah right
<Ninpo>
ok
<squeaky_pl>
I'm building everything on Fedora 29, the "OS image" inside docker is latest Centos 6 + gcc 8.2 with latest binutils, I package all the latest stable versions of libs.
<Ninpo>
versions all look the same between squeaky_pl's 3.5-7.0 and my 3.5-7.1alpha
<Ninpo>
even sizes
<LarstiQ>
what exactly are you benchmarking to get this 20s slowdown?
<Ninpo>
I'm timing run time on an app I've written to find utf8 bytes in mysql fields that are supposed to be latin1. Run time on py3.5 v6.0 and squeaky_pl's builds of py3.5v7 and py3.6v7 are all comparable around the 25s run time, I'm getting 45s+ on everything I've built so far
<Ninpo>
an entire day in the hopes I could give mattip some real world feedback on the unicode branch xD the universe is out to get me
marky1991 has quit [Ping timeout: 246 seconds]
<fijal>
Ninpo: at this stage the very obvious thing to do would be to compare the builds using something like valgrind
<fijal>
you will immediately see, if it's in .so
<Ninpo>
Yeah I'm building a release version with squeaky's tool now so I've got a proper baseline
<Ninpo>
fijal: Though, I'd appreciate a pointer on how to do that?
<Ninpo>
The valgrind part
<fijal>
valgrind --tool=callgrind pypy foo.py
<Ninpo>
oh, ha
<fijal>
if you can't get around the output, send them to me fijall at gmail
<Ninpo>
sure
<fijal>
(the best tool for viewing them is kcachegrind)
<fijal>
it slows down everything ~20x, so get your test smaller accordingly, if possible
<fijal>
20x, not 20%
<Ninpo>
Does it wrap the thing while it runs or what? Do I need to be concerned about what it captures from runtime?
<Ninpo>
data wise
<fijal>
the run will output number of times each C function got called
<fijal>
are your C functions data?
<Ninpo>
Right ok
<fijal>
(the output contains just that and relation who called who)
<Ninpo>
okie dokie :)
<Ninpo>
Just being careful, the stuff in the DB is potentially sensitive
<fijal>
yeah sure
<fijal>
I can definitely explain what every single output of profiling data contains :)
<fijal>
at least the ones I ever wrote parser for ;-)
<Ninpo>
I can't believe after all that, I was running cpython every time. Oh my stars.
<simpson>
Happens.
<Ninpo>
mattip: btw, one thing I have noticed from today: the tqdm progress bar I use when my code turns my list of tuples from sqlalchemy results into a dict says about 200k items per second... it's around 50k in pypy (various versions)
<Ninpo>
mattip: ok so performance on my small dataset is about the same, it seems consistently 10s faster on the set that was taking 4m35s or so
<Ninpo>
I'm going to kick off a native build again and see if being non portable is any quicker.
<mattip>
you might want to reduce your use of tqdm, I seem to remember someone saying it is not pypy friendly
<Ninpo>
Me and njs probably
<Ninpo>
I did ditch it for the runtime progress bar
<Ninpo>
it was causing a lockup on threading
<Ninpo>
I just chuck a count out to stderr occasionally now. Not as pretty but does the job.
<Ninpo>
Now I want to use mandlebrot :D
<Ninpo>
mattip: so anything I should be wary of with this special branch?
<mattip>
lists to dictionaries should be fast. Are the keys all one kind: int, unicode, ascii ?
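For the list-of-tuples case under discussion, dict() can consume the rows directly with no Python-level loop; a sketch with invented sample data:

```python
# rows shaped like a fetchmany() result (sample data, not the real schema)
rows = [("id", "42"), ("name", "alice"), ("comment", "caf\xe9")]

# dict() accepts an iterable of 2-tuples; keeping all keys one type
# (here: str) lets PyPy use its specialised homogeneous dict strategy
lookup = dict(rows)
```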
<mattip>
Ninpo: no, it should just work faster with strings
<Ninpo>
and now my issues are sorted (apart from not being able to pull in tcl/curses), any reason not to use the first one you gave me?
* mattip
forgot, which one?
<Ninpo>
the first one that you thought was broken when it wasn't pulling in ssl, but it was once I made a package
<Ninpo>
it had a proper name, unicode something. Then you gave me c0c8a1eba246 to try
<Ninpo>
mattip: ironically on the longer run time we're now 20s or so faster, pretty consistently. That's just on one DB out of a thousand or so as well, so exponentially that'll be nice.
<Ninpo>
oh and the list is a list of tuples in each tuple it's just string keys and values. The bulk of data processing grabs a HEX representation of text fields (text, char, varchar), binascii.unhexlifies, then tries to decode as ascii then as utf8 (then will run ftfy on it later)
<Ninpo>
er, in each tuple it's just strings, I presume ascii; not sure what comes back from mysql natively, possibly unicode
<Ninpo>
since I munch on a _lot_ of strings after I have the result set (async generator, fetchmany)
<mattip>
can you just decode to utf8, why both that and ascii?
<mattip>
if it is ascii, then utf8 == ascii
<Ninpo>
I'm looking for utf8 in latin1. If it decodes fine as ascii I know it's OK. If it fails to decode as ascii, I know I've got non ascii bytes in there. If they're latin1, it should fail if I decode with utf8. If it succeeds, I've a high chance I've found mojibake.
<Ninpo>
This poor DB has been abused and one of the problems is a horrible mix of encodings from the app that talks to it. The app is being fixed, I've taken on the job of fixing the data.
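The detection logic Ninpo describes (fetch as HEX, unhexlify, try ascii, then try utf8) could be sketched like this; the `classify` helper and its labels are hypothetical, not the actual code:

```python
import binascii

def classify(hex_field):
    """Classify a text column fetched as HEX(...) from MySQL.

    Hypothetical helper mirroring the approach described above:
    clean ASCII passes, valid UTF-8 in a latin1 column is probable
    mojibake, and anything else is taken to be genuine latin1/cp1252.
    """
    raw = binascii.unhexlify(hex_field)
    try:
        raw.decode("ascii")
        return "ascii"                 # nothing to fix
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "probable-mojibake"     # UTF-8 bytes hiding in a latin1 column
    except UnicodeDecodeError:
        return "latin1"                # not valid UTF-8; likely real cp1252
```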
<Ninpo>
mattip thanks so much for your time today, I'm sorry I wasted so much of it
<Ninpo>
squeaky left, if they come back and I'm not around please pass on my thanks there too.
<mattip>
:)
<Ninpo>
At least tomorrow I can build a native build and know the symlink caveat
<Ninpo>
did I miss a package option or something? I noticed squeaky's downloads have the syms in
<mattip>
it sounds like you could try scanning the string rather than decoding on pypy
<Ninpo>
How do you mean?
<Ninpo>
check the raw bytes?
<mattip>
[ord(x) > 127 for x in s] or for latin-1 [ord(x) > 255 for x in s]
<Ninpo>
I was doing that and it took a _lot_ longer. I had an if any(ord(c) > 127 for c in binascii.unhexlify(textfield) or something along those lines
<Ninpo>
I could see if it's faster with your build, I've still got that revision somewhere
<mattip>
on pypy it took longer?
<mattip>
or cpython
<Ninpo>
pypy
<Ninpo>
I had the if any on both at first, I didn't switch to decoding the whole thing until later
<Ninpo>
I can try again tomorrow though, I've just put it on my list
<mattip>
huh. interesting
<Ninpo>
Some of the strings are massive text fields; I'd assume walking an iterator that far, if no high chars are found, would be a diminishing return
<Ninpo>
we're talking a good 3TB of text data overall
<Ninpo>
Happy to try it again tomorrow though once I've licked my wounds :P
<mattip>
so do it explicitly, for c in ...: if c > 127: break else:
<mattip>
so it will early exit
<Ninpo>
won't any() early exit then?
<Ninpo>
if any(ord(char) > 127 for char in binascii.unhexlify(row[0]).decode('utf-8')):
<Ninpo>
That's what I originally had
<mattip>
why decode?
<Ninpo>
it's bytes
<Ninpo>
and I was under the impression that was the way to do it because if I got a unicode decode error, it wasn't utf8
<Ninpo>
and if it did decode, walk it for higher than ascii
<Ninpo>
does ord(char) work on bytes/handle multibytes?
<Ninpo>
I'll investigate tomorrow :)
<Ninpo>
Thanks a bunch
<mattip>
multibytes means something is over 256
<Ninpo>
yeah but latin1 bytes are OK
<Ninpo>
utf8 aren't
<Ninpo>
oh I keep confusing, not latin1, mysql latin1, so cp1252
<Ninpo>
besides, for the actual _fixing_, I need to decode it as utf8 to use ftfy on it
<Ninpo>
For the probe I'll have a play. Thanks mattip
<Ninpo>
o/
<mattip>
anyhow, [c for c in b'abc'] gives [97, 98, 99] so no ord needed
<cfbolz>
Ninpo: if you want to try again, probably the explicit loop is really much faster than any with a generator expression
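Concretely, the two variants cfbolz is contrasting might look like this (function names invented for illustration); both short-circuit on the first high byte, but the explicit loop avoids the generator-expression frame:

```python
def has_high_byte_any(data: bytes) -> bool:
    # any() with a generator expression: short-circuits, but each step
    # goes through a generator frame
    return any(b > 127 for b in data)

def has_high_byte_loop(data: bytes) -> bool:
    # explicit loop: same early exit, no generator machinery,
    # usually traces better under the PyPy JIT
    for b in data:
        if b > 127:
            return True
    return False
```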