<vstinner>
mattip: dropping the C API or moving to webassembly doesn't solve the problems that i described in my doc
<xorAxAx>
i can also host people at my place, its 80 minutes door-to-door to HHU
<vstinner>
mattip: in short, the problem I want to solve is: take a simple C extension from PyPI, convert it, and it now uses a stable ABI that works on PyPy and CPython and doesn't access implementation details
<cfbolz>
xorAxAx: ah, where is that?
<xorAxAx>
cfbolz, essen-altendorf
<mattip>
vstinner: "don't access implementation details" is not worht the effort in my opinion
<vstinner>
mattip: what do you mean?
<xorAxAx>
mjacob, you are also invited given that you might still be a student
<mattip>
what is the use case that you are serving by hiding the details? Is that use case widespread enough to justify the change?
<xorAxAx>
ah, it might be even less than 80 minutes if we go by car
<antocuni>
TL;DR: when you pass an object to C, by default it is a "local reference" which is valid only until the function returns; if you want to store it somewhere, you need to upgrade to a "global reference"
<vstinner>
mattip: sorry but i don't get your point
<antocuni>
this would entirely remove the need to fill the PyHandle table, as long as you don't upgrade them
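A rough JNI-flavoured sketch of the local/global distinction being described here; the Java class and method names are made up, the JNI calls themselves are real:

    #include <jni.h>

    static jobject cached;  /* survives across native calls */

    JNIEXPORT void JNICALL
    Java_Example_remember(JNIEnv *env, jobject self, jobject arg)
    {
        /* 'arg' is a local reference: it dies when this function returns.
           To keep it, it must be upgraded to a global reference. */
        cached = (*env)->NewGlobalRef(env, arg);
    }

    JNIEXPORT void JNICALL
    Java_Example_forget(JNIEnv *env, jobject self)
    {
        (*env)->DeleteGlobalRef(env, cached);  /* global refs must be released explicitly */
        cached = NULL;
    }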
<vstinner>
mattip: people do write C, there are plenty of C extensions all around. that's just a fact
<vstinner>
mattip: you cannot ignore the giant pile of C extensions used in the wild
<mattip>
how likely is it really that they will move to a new c-api? Most of them will never move
<mattip>
the live ones would happily move to a more expressive, performant paradigm, if it were easy
<mattip>
not just "trade a pyobject for a pyhandle" and get really nothing out of the deal
<mattip>
I don't believe CPython will get a 2x performance boost by using this new API. Can you convince me it is true?
<vstinner>
mattip: my plan is to provide converters, so you don't have to port code yourself. "setup.py install" will magically do the trick :)
<vstinner>
(sorry, i start a meeting)
<mattip>
sounds like 2to3, and we know there are corner cases that don't work
<mattip>
In the meantime, your work toward this goal, which many people do not believe in, causes CPython reviewers to work for you
<mattip>
checking your PRs and replying to mails
<mattip>
when that energy could go toward something else, maybe
<mattip>
I work full time for NumPy, but my goals are set by the community via a roadmap
adamholmberg has joined #pypy
jcea has joined #pypy
<arigato>
antocuni: that's similar, I think. A C function would be called with PyHandle arguments, and these are definitely valid only for the duration of the call. You'd need to make a copy with pyhandle_dup(), just like right now you need to Py_INCREF() them in order to store them somewhere for later
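A sketch of that parallel; the PyObject* half is real CPython API, while PyHandle, pyhandle_dup and pyhandle_close are only the names proposed in the discussion:

    #include <Python.h>

    /* today: the argument is borrowed for the call, incref it to keep it around */
    static PyObject *cached_obj;
    static void remember_object(PyObject *arg)
    {
        Py_INCREF(arg);
        cached_obj = arg;
    }

    /* proposed: the argument handle is only valid during the call,
       duplicate it to get a handle you own and may store */
    static PyHandle cached_handle;
    static void remember_handle(PyHandle arg)
    {
        cached_handle = pyhandle_dup(arg);
        /* much later: pyhandle_close(cached_handle); */
    }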
<arigato>
on PyPy there would still be one indirection via the table, but this indirection is really needed, because of the moving GC
<antocuni>
arigato: well, not necessarily, but it depends on how you define the API of course. E.g. by reading your python-dev email, I got the impression that a PyHandle is valid until you call PyHandle_Close
<arigato>
yes, for handles returned to you by a function call you did
<arigato>
for handles that are passed in as arguments to a Python-exposed C function you implement, the lifetime is managed by the caller
<antocuni>
ah I see. If you just *receive* a handle as an argument, the caller owns it and you are not allowed to close
<antocuni>
makes sense
<arigato>
yes
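So the ownership rule, sketched with illustrative names (only pyhandle_dup / PyHandle_Close appear in the proposal; the getattr call is invented):

    static void use_attribute(PyHandle arg)        /* caller owns 'arg': never close it here */
    {
        PyHandle v = pyhandle_getattr(arg, "value");  /* returned handle: this code owns it */
        /* ... use v ... */
        pyhandle_close(v);                         /* ... and is responsible for closing it */
    }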
<antocuni>
I wonder why JNI didn't do that then?
<arigato>
also, the "table" implementation is still open to possible optimizations, like storing an integer object by just copying it into the table and not having any extra memory allocated
<arigato>
or in general (in theory) storing in the table whatever exploded representation comes from our JIT
<antocuni>
yes, that's basically tagged pointers, but only for objects which are proxied by handles
<arigato>
in my PyHandle approach that would be all of them
<arigato>
all of them from the point of view of the C extension
<antocuni>
yes exactly. What I meant is that objects which are NOT passed to cpyext don't get a handle at all
<arigato>
right
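An illustrative (non-PyPy) sketch of that table optimization: a slot either references a real object or carries a small integer inline, so such values never need a heap representation on the C side:

    enum slot_kind { SLOT_FREE, SLOT_OBJECT, SLOT_SMALL_INT };

    struct handle_slot {
        enum slot_kind kind;
        union {
            void *obj;        /* root the moving GC knows about and can update */
            long   small_int; /* integer value stored directly in the slot */
        } u;
    };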
<arigato>
and yes, unsure why JNI needs separate notions of local vs global references
<antocuni>
maybe to avoid the double indirection?
<antocuni>
if you have local/global refs, you can just keep the local refs alive around the C call and pass the raw pointer
<arigato>
right, maybe local references come with pinning?
<antocuni>
here it explicitly says that you have at least one level of indirection
<arigato>
maybe the difference is that they need fully concurrent GCs
<arigato>
so a global reference would contain more stuff to make sure it works even if another thread moves the object
<arigato>
concurrently
<arigato>
it's a problem we can avoid by not touching the GIL---saying "don't call the C API if you don't own the GIL"
<Hodgestar>
vstinner: I'm not sure small steps are necessarily better if moving through the steps is painful and requires people to repeatedly do work. I don't think people would have a problem with a big change if the old way still worked (e.g. old C-API) and the new way was much better (e.g. new PyHandles).
<Hodgestar>
With both Python 3 and CFFI people started writing new stuff using them, and the old stuff didn't go away, even though they were big steps.
<cfbolz>
arigato: I like this proposal. I still wonder whether we can find something to offer to CPython users to entice them to use the new API
<Hodgestar>
cfbolz: Maybe the answer to that is to find something that isn't PyPy that would benefit and get them onboard? No idea what the something might be though at the moment. :|
<antocuni>
note that unfortunately it's not enough to make cpyext faster, because currently we never measured the gc-to-pyobject* code as a bottleneck
dddddd has joined #pypy
<cfbolz>
:-(
<arigato>
cfbolz: I think one milestone would be if Cython would generate C code targeting the new C-API
<cfbolz>
Yes, but again that would mostly benefit PyPy users, no?
<arigato>
directly, yes
<arigato>
still, even if we get "only" that, it's very interesting for PyPy
<arigato>
but maybe more to the point when talking about the future of Python:
<arigato>
this simpler C API would allow more projects to experiment with CPython or CPython-like approaches
<cfbolz>
Well, I like cython-like approaches, because at least in theory cython could have a totally different backend that also produces e.g. JIT code
<cfbolz>
So we could inline into the C extension
<cfbolz>
(which as mattip points out is what we would *really* need)
<arigato>
right
<arigato>
there are two things here, one is what you're saying and the other is what might make sense for the future of Python
<cfbolz>
Yes
<cfbolz>
We probably need both of these things. At least if we want to have a way for CPython to get out of its local maximum
<mattip>
if the code is no longer susceptible to refcount bugs, that is a significant +
<arigato>
there are still issues though
<arigato>
the major one I foresee is that, assuming the CPython implementation is done just as "a PyHandle = a PyObject *",
<arigato>
then you have potential problems because calling pyhandle_dup(x) would just return x, after incrementing the reference counter,
<arigato>
and then possible bugs where people store the wrong x and it works fine on cpython
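A sketch of that bug class, assuming the naive CPython mapping where a PyHandle is just a PyObject* and pyhandle_dup(x) increfs and returns x itself:

    static PyHandle cached;

    static void remember(PyHandle arg)
    {
        PyHandle copy = pyhandle_dup(arg);
        cached = arg;   /* BUG: should store 'copy'. On CPython copy == arg, so this
                           appears to work; on a table-based implementation 'arg'
                           becomes invalid when the call returns and 'cached' dangles. */
        (void)copy;
    }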
<mattip>
maybe a summit of cython, sip, pypy, dask, numba, mypyc, swig, cffi, cppyy, pybind11, ... maintainers would be helpful
<arigato>
we could mitigate that issue a little bit by using another, slower implementation (e.g. when compiling in debug mode) but still
<arigato>
and of course you still have the risk that someone forgets to close a handle
xcm has quit [Ping timeout: 244 seconds]
<cfbolz>
arigato: the latter is like forgetting to decref, right?
<arigato>
yes
xcm has joined #pypy
<cfbolz>
mattip: yes, that would make sense. Though one risk in adding more projects is that nothing gets done because of too many possible directions and concerns
<antocuni>
arigato: one nice advantage of PyHandle is that in theory you can have a debug mode which tells you EXACTLY which handle you forgot to close
marky1991 has joined #pypy
<antocuni>
much better than debugging missing decrefs
<arigato>
yes
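Nothing like this exists yet, but a debug build could plausibly record the call site in each slot and report every slot still open at shutdown:

    struct debug_slot {
        void       *obj;
        const char *open_file;   /* __FILE__ where the handle was created */
        int         open_line;   /* __LINE__ likewise */
    };

    /* in debug builds, creation macros would record their call site, e.g.
       #define pyhandle_dup(h)  pyhandle_dup_at((h), __FILE__, __LINE__)
       and at shutdown the table is walked to print something like
       "handle opened at mymodule.c:123 was never closed" for each live slot */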
<cfbolz>
Nice
squeaky_pl has joined #pypy
<squeaky_pl>
mattip, there is a typo in the blog post, it says multilinux2010 instead of manylinux2010
<mattip>
squeaky_pl: thanks, and hi!
<mattip>
btw, those are just some suggested topics, other ideas or rejection of those is welcome