cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | the secret reason for us trying to get PyPy users: to test the JIT well enough that we're somewhat confident about it
Frankablu has quit [Quit: Leaving]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 268 seconds]
dddddd has quit [Remote host closed the connection]
speeder39 has joined #pypy
speeder39 has quit [Quit: Connection closed for inactivity]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 264 seconds]
froztbyte has quit [Ping timeout: 244 seconds]
<kenaan>
mattip cffi_dlopen_unicode 295154c77400 /pypy/module/_cffi_backend/libraryobj.py: copy unicode handling to _cffi_backend.libraryobj.W_Library
bbot2 has quit [Quit: buildmaster reconfigured: bot disconnecting]
bbot2 has joined #pypy
bbot2 has quit [Quit: buildmaster reconfigured: bot disconnecting]
bbot2 has joined #pypy
adamholmberg has joined #pypy
Snowolf781 has joined #pypy
adamholmberg has quit [Ping timeout: 245 seconds]
mattip has joined #pypy
<mattip>
strange. The buildbot had lots of jobs "forced" by me, that I didn't submit
bbot2 has quit [Quit: buildmaster reconfigured: bot disconnecting]
<mattip>
now I cleared them and restarted the buildmaster. Now no buildslaves are connected.
bbot2 has joined #pypy
<mattip>
I checked the hg id before restarting, it was the head of the buildbot repo
the_drow has joined #pypy
mattip has quit [Read error: Connection reset by peer]
<the_drow>
mattip, I meant: whenever a Py_* function is called, write the appropriate bytecode for the VM to execute the equivalent of that function call. Then you have to figure out when the next Py_* call is. Between those two calls there is pure C code.
the_drow has quit [Remote host closed the connection]
the_drow has joined #pypy
<the_drow>
Imagine an extension where there are only Py_* calls. We can translate it to a representation that PyPy can JIT
<the_drow>
The complex case would be to execute the C code correctly without crashes.
<the_drow>
But with the proper instruction pointer arithmetic it might be doable
<the_drow>
I have given the separate GCs problem another thought. What if we introduce PyPy_TRACK, PyPy_RELEASE and perhaps PyPy_SURVIVE (to mark an object that is going to outlive the local scope and should skip the nursery) as a way of interacting with our GC?
<the_drow>
Cython could use those pretty easily
<fijal>
the_drow: how can you tell those things?
<fijal>
generally speaking, you can't quite tell which parts of the C code are "C things" and which parts are operating on CPython C API, because we have direct access to pointers, for example
<the_drow>
If we have debug symbols we can
<fijal>
how?
<fijal>
no one stops you from casting PyObject* to void* and modifying some memory
<the_drow>
Because you can tell which function is called when
<fijal>
function, yes
<fijal>
but not all the other things
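The point about direct pointer access can be illustrated from Python itself. This is a minimal sketch and it assumes a default CPython build, where `id(obj)` is the object's address and the object header starts with the reference count; it does not apply to PyPy. `Payload` and `raw_refcount` are names made up for the illustration. The read below is a plain memory load, so a tool that only watches Py_* function calls would never see it.

```python
import ctypes


class Payload:
    """Ordinary Python object whose header we peek at."""


def raw_refcount(obj):
    # Assumption (default CPython build only): id(obj) is the object's memory
    # address and the header starts with the reference count. No Py_* function
    # is called to read it; this is a plain memory load.
    return ctypes.c_ssize_t.from_address(id(obj)).value


obj = Payload()
before = raw_refcount(obj)
holders = [obj, obj]          # two new references, created by the interpreter
after = raw_refcount(obj)
print(after - before)         # how many references the list added
```

A C extension can do the same with a cast and a field access, and can write as well as read, which is why tracing function calls alone does not capture everything the extension does to an object.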
tayfun26 has joined #pypy
<fijal>
that's additional to the fact that those functions are ill-suited
<the_drow>
A more complex endeavor would be to transpile the C code to an intermediate representation
<fijal>
you can't quite tell what people want
<fijal>
this is, in theory, possible
<fijal>
but I don't see how it would be better
<the_drow>
Why would you manipulate PyObject* without the Python API?
<fijal>
excellent question!
<fijal>
because this is how people write extensions somehow - by directly reading fields (and writing to them)
<the_drow>
Because we can execute this IR without giving consideration for other runtimes
<the_drow>
That's terrible
<fijal>
but one of the main issues is that the API is ill suited - you can't quite tell what sort of *python* operation people are doing from just seeing the functions executed
<the_drow>
Side note: feel free to dismiss my ideas as nonsense. They likely are nonsense but maybe we'll figure something out here
<the_drow>
But I as a human being can track the flow of execution and translate it to pure python no?
<fijal>
not really no
<fijal>
some are inexpressible
<fijal>
some have slightly different behavior
<the_drow>
So even if we were able to execute the extension in the RPython VM it would be limited
<fijal>
I think your approach has a very serious philosophical problem
<the_drow>
Hehe I like those
<fijal>
you're suggesting a bunch of heuristics for running some common patterns fast, without thinking "ok, but what's the fallback?"
<the_drow>
The normal way cpyext executes
<fijal>
yes, but you cannot
<fijal>
because the normal way requires modifying C-level structures that resemble CPython C API
<fijal>
like refcounting
<fijal>
and only every now and again syncing it to PyPy-level stuff
<fijal>
but if you stop having those C structures, there is no fallback
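The fallback described here can be pictured with a toy model. This is only a sketch under loud assumptions: `CStruct`, `Bridge`, `as_struct` and `sync` are hypothetical names, and none of this is cpyext's actual code. It shows C-side code changing a refcount field directly, with the interpreter side finding out only at an occasional sync.

```python
class CStruct:
    """Toy stand-in for a C-level struct that mimics a CPython object header."""

    def __init__(self, w_obj):
        self.ob_refcnt = 1      # C code reads and writes this field directly
        self.w_obj = w_obj      # link back to the interpreter-level object


class Bridge:
    """Hypothetical table pairing interpreter-level objects with C structs."""

    def __init__(self):
        self._structs = {}      # id(w_obj) -> CStruct

    def as_struct(self, w_obj):
        # Create the C-side struct lazily, the first time C code needs it.
        key = id(w_obj)
        if key not in self._structs:
            self._structs[key] = CStruct(w_obj)
        return self._structs[key]

    def sync(self):
        # Periodic reconciliation: only now does the interpreter side learn
        # which structs the C side has finished with.
        dead = [k for k, s in self._structs.items() if s.ob_refcnt <= 0]
        for key in dead:
            del self._structs[key]
        return len(dead)

    def live_count(self):
        return len(self._structs)


bridge = Bridge()
w_list = [1, 2, 3]
c_view = bridge.as_struct(w_list)
c_view.ob_refcnt -= 1           # "C code" drops its reference, nobody is told
print(bridge.live_count())      # struct is still registered until the next sync
print(bridge.sync())            # number of structs released by this sync
```

In this model, removing `CStruct` leaves nothing for code that touches `ob_refcnt` directly to operate on, which is the "no fallback" problem.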
<the_drow>
So you'd have to inject the code that creates it
<fijal>
we have that code
<fijal>
ok, anyway
<fijal>
if you want to have C tracing/compiling/transpiling *additionally* to the current mammoth cpyext, then it's a no out of sheer complexity of the solution
<the_drow>
Yeh never mind. It's not one of my brightest ideas
<fijal>
I'm happy to strike it down on those grounds, without even thinking if it's feasible :-)
<fijal>
some Ruby implementation (JRuby?) only reinterprets the C/assembler
<fijal>
that's *doable*, but a project the size of cpyext
<the_drow>
It was very very broken
<the_drow>
What do you think about introducing PyPy specific GC macros that Cython and others can use?
<fijal>
the_drow: for what it's worth, I did not believe cpyext can be made to work either ;-)
<fijal>
PyPy specific GC macros for cython - that's probably a step in the right direction
<fijal>
but step one would be to understand a) if they're going to use it b) what macros do they want
<fijal>
from what I found, every major cython extension uses CPython C API directly
<the_drow>
That would allow C extensions like rapidjson to exist since they create tons of objects
<fijal>
well maybe
<fijal>
needs cooperation from cython
<the_drow>
So that needs to be behind a feature flag, or we activate the emulated refcounting once we encounter Py_INCREF on an object we already track
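The second option, switching to emulated refcounting on the first Py_INCREF, could look roughly like the toy model below. Everything in it is hypothetical: PyPy_TRACK does not exist, `ToyObject`, `pypy_track` and `py_incref` are illustration names, and the choice to count the tracking reference as the first owner is an assumption made only to keep the sketch concrete.

```python
GC_ONLY = "gc-only"
REFCOUNTED = "emulated-refcount"


class ToyObject:
    """Toy stand-in for an object created by extension code."""

    def __init__(self):
        self.mode = None
        self.refcount = 0


def pypy_track(obj):
    # Hypothetical PyPy_TRACK: the object is handed to the GC and carries
    # no emulated reference count at all.
    obj.mode = GC_ONLY


def py_incref(obj):
    # First Py_INCREF on a tracked object: fall back to emulated refcounting.
    if obj.mode == GC_ONLY:
        obj.mode = REFCOUNTED
        obj.refcount = 1        # assumption: tracking counts as the first owner
    obj.refcount += 1


item = ToyObject()
pypy_track(item)
print(item.mode)                # stays GC-only until someone increfs it
py_incref(item)
print(item.mode, item.refcount)
```

The sketch leaves out the hard part raised earlier in the conversation: code that reads or writes the refcount field directly never passes through `py_incref`, so the switch would not be triggered.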
<the_drow>
Have we opened an issue about it on their repository? I can do that
<fijal>
the_drow: wait
<fijal>
the_drow: we have zero need for an issue if we have no real will to work on it
<fijal>
"nice to have" issues on an issue tracker are a bit pointless
<fijal>
are you going to find out if cython is happy with that?
<the_drow>
The intention was to gather their feedback
<the_drow>
Yes
<fijal>
well, then start there :-)
<fijal>
if you have feedback, then an issue is very much in place
bbot2 has quit [Remote host closed the connection]
<mattip>
there were two buildmaster processes running
the_drow has quit [Ping timeout: 252 seconds]
adamholmberg has quit [Ping timeout: 272 seconds]
mattip has quit [Remote host closed the connection]
froztbyte has joined #pypy
<cfbolz>
the_drow: fwiw, that sounds pretty close to what Oracle is doing for C extensions in JRuby-Truffle
<cfbolz>
They have a C interpreter and JIT the C code
<cfbolz>
It can be quite fast, but the downside is that they never actually reach C speed, because they never call machine code produced by a C compiler
<cfbolz>
I can search for the papers, if you want
<cfbolz>
They have an important side motivation in addition to speed which is that in this way they can run untrusted C code in a managed environment
<kenaan>
arigo default 91a8a20e0809 /pypy/module/cpyext/: Remove from stubs.py the few functions that are implemented. Checked in the script I used to find them.