cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | "the modern world where network packets and compiler optimizations are effectively hostile"
<fijal>
njs: it would make sense if we had immutable objects
<fijal>
njs: and if python had sane semantics
<njs>
fijal: I'm suggesting new semantics :-)
<fijal>
well, it's hard to do
<fijal>
you can't share classes
<njs>
obviously it would be work, I'm not suggesting otherwise, but a priori I can't see anything about python's semantics that stops you from adding some built-in immutable types and hosting multiple independent Python instances inside the same process
<fijal>
and that generally shuts down
<fijal>
what do you gain if all you can share is strings?
<fijal>
as opposed to multiple processes
<fijal>
njs: python semantics are too bad (and too mutable) for anything to work
<njs>
you can have arbitrarily complex immutable types if you want
<njs>
define a frozendict type or whatever (though it's not quite the same as tuple or frozenset, because for this you'd want the rule that an immutable container can only hold other immutable objects)
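A minimal sketch of the rule njs describes, where an immutable container may only hold other immutable objects. `FrozenDict` and `is_transitively_immutable` are hypothetical names chosen for illustration and exist in neither CPython nor PyPy. The set of "leaf" types is also an assumption.

```python
from types import MappingProxyType

# Exact types treated as immutable leaves in this sketch (an assumption).
_IMMUTABLE_LEAVES = (int, float, complex, bool, str, bytes, type(None))


def is_transitively_immutable(obj):
    """Return True if obj and everything reachable from it is immutable."""
    # Exact type checks, so mutable subclasses of int/str/tuple are rejected.
    if type(obj) in _IMMUTABLE_LEAVES:
        return True
    if type(obj) in (tuple, frozenset):
        return all(is_transitively_immutable(item) for item in obj)
    if type(obj) is FrozenDict:
        return True  # contents were already checked at construction time
    return False


class FrozenDict(object):
    """Hypothetical read-only mapping that only accepts immutable contents."""

    __slots__ = ("_data",)

    def __init__(self, items=()):
        data = dict(items)
        for key, value in data.items():
            if not (is_transitively_immutable(key)
                    and is_transitively_immutable(value)):
                raise TypeError(
                    "FrozenDict may only hold immutable objects: %r"
                    % ((key, value),))
        # Bypass our own __setattr__ to fill the slot exactly once.
        object.__setattr__(self, "_data", MappingProxyType(data))

    def __setattr__(self, name, value):
        raise AttributeError("FrozenDict is immutable")

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

Unlike `tuple` or `frozenset`, construction fails for `FrozenDict({"a": [1, 2]})`, because the list value is mutable. That is the property that would make such an object graph safe to hand to another interpreter without copying.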
<fijal>
yes ok
<fijal>
what do you gain over multiple processes?
<fijal>
if you have to serialize everything in those strange types anyway
<njs>
passing large strings with zero overhead is not nothing :-)
<fijal>
you can already do that
<fijal>
fuck yeah shared memory
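One way to pass a large byte string between processes without pickling it is a named shared memory block. This sketch assumes Python 3.8+ and its standard `multiprocessing.shared_memory` module, which is not necessarily what fijal had in mind. Both "sides" run in one process here only to keep the example self-contained.

```python
from multiprocessing import shared_memory

payload = b"x" * (1024 * 1024)  # stand-in for a large string

# Producer side: create a named block and copy the payload in once.
block = shared_memory.SharedMemory(create=True, size=len(payload))
block.buf[:len(payload)] = payload

# Consumer side (normally another process): attach by name, no pickling.
view = shared_memory.SharedMemory(name=block.name)
first_bytes = bytes(view.buf[:4])
# Some platforms round the block size up to a page boundary.
same_size = view.size >= len(payload)

# Every attachment is closed; the creator also unlinks the block.
view.close()
block.close()
block.unlink()
```

Only the block name has to cross the process boundary. The data is still raw bytes, though, which is fijal's point: anything richer than a string still has to be serialized into it.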
<fijal>
njs: my answer is generally "no, the semantics are wrong, you will not gain anything"
<njs>
eh, in principle yes, in practice c'mon
<fijal>
e.g. you would still need N copies of code objects and classes and N copies of jit code
<fijal>
and N warmups
<njs>
code objects are immutable, no?
<fijal>
heh
<fijal>
the BYTECODE is immutable
<fijal>
not the constants
<njs>
oh, meh
<njs>
and anyway the JITted code probably needs to guard on type ids
<fijal>
you can't share types, right?
<njs>
I guess the more ambitious thing would be to try to have CoW for things like types
<fijal>
yes, STM is kinda that
<fijal>
overheads are high
<fijal>
but also you have writes that are not necessarily bad
<fijal>
it's all complicated greatly by the fact that python is such a large language
<fijal>
with tons of C extensions that don't have support for all of that
<fijal>
and there is a metric fuckton of global state
<njs>
but the overall point is (a) shared-everything semantics are really programmer-unfriendly and backwards-incompatible with traditional python, (b) shared-everything is why you have some horrible buggy slog of adding locks to everything
<fijal>
sure
<fijal>
I'm definitely not debating *that*
<njs>
and the multiprocessing *semantic* model is actually great for lots of things, it's just awkward for practical reasons
<fijal>
well, maybe it's not that great if it's awkward to use?
<njs>
well, it's awkward because things like, control-C never works right, and pickle is a hassle in a lot of ways, and it relies on this incredibly complicated set of machinery involving background threads, ...
<njs>
you're right that it would make a huge difference if there were some way to take advantage of living in the same process behind just passing around transitively immutable builtin object graphs
<njs>
s/behind/beyond/
yuyichao has joined #pypy
<fijal>
njs: you're saying that python took a crap concept and implemented it badly?
<fijal>
I would agree!
<fijal>
but not much I can do
<njs>
eh, I'm not sure you can really do *better* than multiprocessing given how it's fighting the OS
exarkun has quit [Read error: Connection reset by peer]
<cfbolz>
of course RPython classes are integer-like for the JIT, obviously
<kenaan>
cfbolz default 1cf0fac8f2fe /pypy/module/__builtin__/test/test_classobj.py: fix fallout from celldict defaultification (this test is anyway much closer now to what we really care about)
<cfbolz>
fijal: you are being a bit too combative, imo ;-)
<fijal>
cfbolz: with?
<fijal>
the blog post or my opinion on python threading options?
<cfbolz>
njs
<cfbolz>
whether or not you like the idea, it's not unlikely that other readers of the post will have similar reactions to njs
<fijal>
so how would we do subinterpreters?
<fijal>
right, so that part is important
<fijal>
we should a) explain what's the plan b) explain why STM does not work and maybe c) explain why subinterpreters don't really give much on top of subprocesses
<cfbolz>
tbh, I find subinterpreters about as unlikely as removing the GIL without going mad
<njs>
cfbolz: meaning you think this whole plan is doomed, or...?
<cfbolz>
maybe not doomed, just extraordinarily painful ;-)
<cfbolz>
fijal: and note that gil-removal is pain you would inflict on the whole project
<fijal>
yeah
<fijal>
cfbolz: so we should not propose it? should we propose it for extraordinary amounts of money?
<fijal>
there is always an option of "never merge it"
<fijal>
which is in some way "fine"
<fijal>
I was thinking of merging things that are definitely an improvement (thread-safe rpython GC) for example and not merge the locks
<cfbolz>
fijal: I am more saying it's probably a good idea to think about it and make sure that there is no alternative
<fijal>
cfbolz: I'm happy to hear opinions :)
<fijal>
we've been thinking for past 10 years no?
<fijal>
obviously it's impossible to prove there are no alternatives
<cfbolz>
we also thought about the GIL for 10 years, and for 9.8 concluded that it's impractical to remove ;-)
<fijal>
right
<fijal>
but maybe there are ways to remove it that have commercial benefits
<cfbolz>
(anyway, of course you can implement whatever you want, particularly on a branch)
<fijal>
cfbolz: look, I would not do that for free
<fijal>
right
<cfbolz>
I know
<fijal>
so one option is to implement it on a branch
<fijal>
and then make people pay for keeping it up to date
<cfbolz>
honestly, using the name recognition of "the gil is terrible" as a marketing argument is the part that sounds very convincing :P
<fijal>
heh
<fijal>
but also I want to get some idea whether it's a whole lot of beating around the bush or someone is actually willing to vote with their wallets
<cfbolz>
yes, there's a not small chance of that
<fijal>
so other than a) b) and c) any other feedback about the blog post?
<fijal>
cfbolz: I promise I'll not merge it randomly
<fijal>
(as I said, I still think stuff like thread-safe rpython GC is not a terrible idea for rpython, or thread-safe JIT compilation)
cstratak has quit [Quit: Leaving]
cstratak has joined #pypy
adamholmberg has joined #pypy
http_GK1wmSU has joined #pypy
adamholmberg has quit [Ping timeout: 276 seconds]
http_GK1wmSU has left #pypy [#pypy]
<LarstiQ>
is there a stm paper with: "we tried these things and it doesn't work for these reasons"?
<LarstiQ>
as some kind of result to extract from the project
<Remi_M>
LarstiQ: the paper shows that STM can work in some cases, but also that especially single-thread overhead is bad and that performance is unpredictable. It's clear that it is nowhere near production-ready.
<Remi_M>
I think it's a matter of opinion if STM is a dead end or if the project lacks manpower
<xorAxAx>
hmm, i am not convinced that people would use nogil-unsafe in any way correctly :-) what if pypy had a mode in which it would activate automatically added per-object locks, and otherwise remain in the GIL mode? i.e. some binary which would automatically switch modes on its own
<xorAxAx>
(it should be as fast as a current pypy in the worst case)
<mattip>
how much of the money would be put in a fund to support it down the road?
<fijal>
antocuni: it does not work
<fijal>
mattip: I think it's pretty simple, we leave it on the side unless we have clients
<fijal>
(or someone wants to do the work)
<antocuni>
so the claim in the blog post is false?
<fijal>
antocuni: it does work on very simple examples
<fijal>
if you're careful enough otherwise segfaults
adamholmberg has quit [Ping timeout: 255 seconds]
<fijal>
it also has bugs (so segfaults even if you're careful sometimes)
* mattip
waiting my turn to request expansion of the "the total cost of doing the work at $50k" statement
<fijal>
mattip: sorry?
<fijal>
antocuni: should I change the statement somehow?
<antocuni>
I know. I propose to provide a github repo which contains a binary and a very simple example which works, with a big disclaimer that everything else doesn't
<antocuni>
fijal: I'm editing the blog post right now, please wait for my reviews first :)
<fijal>
antocuni: ok, but let's not creep the scope of what we're doing
<fijal>
because there is no such thing as a "linux binary" and it does not work anywhere else etc. etc.
<fijal>
and it's stupid because it does not work on anything anyway, so why would you let people download and check it themselves at all?
<fijal>
I'm happy to replace it with "we have a viable path"
<antocuni>
"""Since such work would complicate the code base and our day to day work,
<antocuni>
we would like to judge the interest on the community and the commercial
<antocuni>
PyPy users.
<antocuni>
"""
<antocuni>
is this just a complicated way to say "we are not doing it unless you pay?"
<mattip>
my question is how "the total cost of doing the work..." implies "we leave it on the side unless..."
<mattip>
it sounds like for $50k they get it all
<antocuni>
yes, I also got the same concerns as mattip; is it 50k or 100k?
<mattip>
and for $100K they get it all "right now"
<kenaan>
tobweber stmgc[c8-binary-trx-length-per-thread] cac4878ee56a /c8/stm/nursery.c: Update transaction lengths with learnings from TCP style
<kenaan>
tobweber stmgc[c8-binary-trx-length-per-thread] 114803b15227 /c8/stm/core.c: Update trx length on commit and abort only
<mattip>
so maybe "we propose to create a branch of PyPy and make it pass a concurrent test suite, we estimate the cost of that at X,
<kenaan>
tobweber stmgc[c8-binary-trx-length-per-thread] 3394aed50b06 /c8/stm/timing.h: Fix missing type definitions for custom payload
<exarkun>
arigato: do you know if fc69d0277999b20c630a804e828d5cdb6742cc41 really fixed test_unicode_ord_positive in pypy/module/array/test/test_array.py? afaict it's the array constructor that raises ValueError, not the index access.
<exarkun>
Maybe that revision even introduced the constructor ValueError, in fact
<exarkun>
looking at the revision on bitbucket and seeing more context I think it probably did make the test pass. the ValueError is coming from w_getitem, not the constructor as I thought.
* exarkun
pays a little closer attention to the facts
<cfbolz>
:-)
<LarstiQ>
hah, git-remote-hg, a provider of alternative facts ;)
<mihaid>
hello, sorry to trouble you, I was looking through the buildbot logs and I found a failed test that I don't quite know how to interpret: https://pastebin.com/arquveK5
<exarkun>
mihaid: I'll make a guess. :) It's an annotation error (which is part of translation). There's some RPython that's not actually valid RPython. In interp_socket.py, I suppose. w_ancillary is only known to be `SomeInstance` and `getitems` isn't a legal thing to do to a `SomeInstance`.
<exarkun>
mihaid: I take it this is from a branch where you've been working on the socket module?
<mattip>
(often helping the annotator by "assert isinstance(w_obj, <desired type>)" will solve the ambiguity, but then you should ask yourself why is this ambiguous in the first place)
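A small sketch of the hint mattip describes. `W_Base`, `W_List` and `count_ancillary` are hypothetical stand-ins, not PyPy's real classes or the code in interp_socket.py. The snippet runs as plain Python; under RPython the `assert isinstance(...)` is what lets the annotator narrow a generic instance to a type that actually has `getitems`.

```python
class W_Base(object):
    """Hypothetical stand-in for a generic wrapped object."""


class W_List(W_Base):
    """Hypothetical stand-in for a wrapped list with a getitems() method."""

    def __init__(self, items):
        self._items = items

    def getitems(self):
        return self._items


def count_ancillary(w_ancillary):
    # Without this assert the argument is only known to be "some instance",
    # and calling getitems() on that is not valid for the annotator.
    assert isinstance(w_ancillary, W_List)
    return len(w_ancillary.getitems())
```

In plain Python the assert simply raises `AssertionError` for the wrong type. As mattip notes, it is still worth asking why the type was ambiguous in the first place.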
vkirilic_ has joined #pypy
<cfbolz>
mihaid: regular translation works on your branch?
vkirilichev has quit [Ping timeout: 246 seconds]
vkirilichev has joined #pypy
vkirilic_ has quit [Ping timeout: 248 seconds]
<cfbolz>
exarkun: are you actually touching the array mod?
<exarkun>
cfbolz: I was looking at the failure of that test in the py3.5 branch. I haven't made any real changes yet.
vkirilichev has quit [Ping timeout: 240 seconds]
<exarkun>
I haven't quite figured out what change would make sense. It doesn't look like the test should be failing but it is. And, also, it's asserting behavior that's not the same as CPython 3.5...
<exarkun>
Why do you ask? Should I work on something else?
<cfbolz>
exarkun: because that merge above pulled changes in there
<cfbolz>
just a heads up in case you had local mods
vkirilic_ has quit [Ping timeout: 268 seconds]
* exarkun
nods
<exarkun>
okay, thanks
<mihaid>
cfbolz: yes the translation is successful on my setup and the build bots. exarkun: yes, this is from branch py3.5-sendmsg-recvmsg. I was just looking for errors in the build bots logs, and the only one left that is related to what I implemented is this. However, as I said, the translation is successful on both my setup and the build bots. It is just that the ztranslation test fails.
<cfbolz>
mihaid: is this the last of your failing tests?
<cfbolz>
it's possible that it's a bogus failure - test_ztranslation can sometimes be weird
<mihaid>
yes, well I looked around in the errors logs, and there are some failures in some of the regression tests but none that seem related to sendmsg or recvmsg in any way.
marky1991 has joined #pypy
<cfbolz>
mihaid: yes, you should compare to the failures on the py3.5 branch. If they are the same, you are done
adamholmberg has joined #pypy
marky1991 has quit [Ping timeout: 248 seconds]
lritter has joined #pypy
leto_ni has quit [Ping timeout: 255 seconds]
vkirilichev has joined #pypy
vkirilichev has quit [Read error: Connection reset by peer]
<mihaid>
cfbolz: I compared the logs of py3.5 and my branch with meld and found some differences but none that relate to what I implemented. I think some tests are no longer skipped because of sendmsg & recvmsg ( such as test_multiprocessing_forkserver in the cpython regrtests) but have some failures/ errors due to other issues.
<cfbolz>
mihaid: sorry, I didn't mean the source code. Just the failures on the buildbot
<mihaid>
cfbolz: yes that is what I compared. The failure logs for app-level tests, regrtests and pytests.
<cfbolz>
ah ok
<cfbolz>
mihaid: as I said, it's probably fine to ignore test_ztranslation for a while longer
<cfbolz>
somebody should probably review your branch, but it can't be me, because I have no clue about low level pointer programming
<cfbolz>
s/pointer/socket
yuyichao has quit [Ping timeout: 246 seconds]
raynold has joined #pypy
<mihaid>
cfbolz: I understand. Should I ask somebody? Or to put it another way, is there any way to tag my branch as ready for review? Because I have no idea who to ask :)
<cfbolz>
just open a pull request. armin is a candidate, as usual, but no clue who else
leto_ni has joined #pypy
Remi_M has quit [Quit: See you!]
yuyichao has joined #pypy
<mihaid>
cfbolz: Okay, I will do so shortly. Thanks a lot for the help, Carl!
exarkun has quit [Read error: Connection reset by peer]
exarkun has joined #pypy
q4 has joined #pypy
jcea has joined #pypy
igitoor has quit [Ping timeout: 255 seconds]
<kenaan>
cfbolz getarrayitem-into-bridges 251f979f4bcc /: close to-be-merged branch
<kenaan>
cfbolz default 43ff4a9015e3 /rpython/jit/metainterp/: merge getarrayitem-into-bridges: improvement on what information is retained into a bridge: in particular, knowle...
<njs>
I guess I should re: the conversation about the GIL: while I think it's worth thinking hard about whether there's any way to improve on classic shared-everything threading semantics, pretty much any kind of GIL removal would be super useful (if it can be done sustainably). I guess I wouldn't want to give up memory safety (segfaults are not on), but even if, like, concurrent dict mutations corrupt the dict state, then it'd still be useful for lots of c