<njs>
_aegis_: AFAICT anything you can do with subinterpreters you can do better with subprocesses, in both cpython and pypy
<njs>
or, since making subprocesses work for these kinds of cases is admittedly not trivial, maybe it's better to say that solving these problems with subinterpreters is harder than solving them with subprocesses :-)
Garen has quit [Read error: Connection reset by peer]
Garen has joined #pypy
zmt00 has quit [Read error: Connection reset by peer]
<_aegis_>
I thought about that a bit. subprocesses are outright worse at blocking handoff
<_aegis_>
but the problems of what to share and how aren't solved that differently between subprocesses and subinterpreters
<_aegis_>
like PEP-554 says "we don't move objects, just serialized data", which is totally doable with subprocesses
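(The "serialized data, not shared objects" model _aegis_ describes can be sketched with stdlib `multiprocessing`, whose `Pipe` pickles every message, so the receiver gets a copy and never a reference. This is an illustrative sketch, not PEP 554's actual API; function names are invented.)

```python
# Minimal sketch of "we don't move objects, just serialized data":
# multiprocessing.Pipe pickles each message, so the worker receives
# an unpickled copy, never a shared reference. Names are illustrative.
from multiprocessing import Pipe, Process

def worker(conn):
    msg = conn.recv()          # arrives as an unpickled copy
    conn.send(msg["n"] * 2)    # reply is pickled on the way back
    conn.close()

def roundtrip(n):
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send({"n": n})
    reply = parent_end.recv()
    p.join()
    return reply

if __name__ == "__main__":
    print(roundtrip(21))
```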
<_aegis_>
ok, subprocesses are worse at (1) blocking handoff and (2) pooling
<_aegis_>
by blocking handoff, I mean when you send a message into a channel, if you know the target actor is idle you can just jump into its code instead of doing synchronization
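(A toy version of the blocking handoff _aegis_ describes: if the target actor is idle, the sender runs the actor's handler directly on its own thread rather than enqueueing the message and waking another thread. All names are invented, and a real implementation would also drain the mailbox after running.)

```python
# Toy illustration of "blocking handoff": when the target actor is
# idle, the sender jumps straight into the actor's code instead of
# doing a queue-and-wake synchronization dance. Names are invented;
# a real implementation would also drain self.pending afterwards.
import threading

class Actor:
    def __init__(self, handler):
        self.handler = handler
        self.pending = []            # mailbox, used only when busy
        self.lock = threading.Lock()
        self.busy = False

    def send(self, msg):
        with self.lock:
            if self.busy:            # actor already running: queue and return
                self.pending.append(msg)
                return
            self.busy = True         # claim the idle actor
        self.handler(msg)            # handoff: sender executes actor code
        with self.lock:
            self.busy = False

received = []
actor = Actor(received.append)
actor.send("ping")
```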
<_aegis_>
and subprocesses are better for outright isolation (one crash won't take down everything) but they introduce platform-specific jankiness like zombie processes, children getting signalled, etc
<_aegis_>
by pooling I mean if I have 20 user scripts I want to run in isolation, with subprocesses I'd need to pin each one to a subprocess, I couldn't move them around feasibly
<_aegis_>
with subinterpreters you can do green threading / coroutines across interpreters
<_aegis_>
(and thread pool so not all subprocesses are running at once)
<simpson>
Strange as it sounds, basically all of my desires are either to run some non-Python stuff with `subprocess`, or to be called by GNU Parallel. I'm not saying that those are the only cases, although I *am* saying that GNU Parallel is just that good.
<njs>
_aegis_: don't OSes automagically do "blocking handoff" between processes these days? (i.e., the scheduler knows enough about common IPC primitives to do optimizations like handing off process A's timeslice to process B when they send data)
<njs>
_aegis_: and you can't move scripts between subinterpreters either. you can play games with which subinterpreters get scheduled onto OS threads and when, but that's literally reimplementing what the kernel scheduler does, and anyway you can manually put processes to sleep and wake them up if you really want to
dddddd has quit [Ping timeout: 250 seconds]
<njs>
for sure there are differences around the edges, but given that subinterpreters require implementing a ton of complex machinery inside your interpreter/JIT and still break C extension compatibility (e.g. numpy doesn't support running in subinterpreters)... dealing with zombies seems easier to me :-)
<_aegis_>
in my case I don't need to play games at all, I have explicit control of when user scripts are running and I sure don't want a thread per script
<_aegis_>
you can do blocking handoff with an atomic instead of a syscall, and syscalls are pretty slow these days thanks to meltdown mitigations
<_aegis_>
I have a lot of experience with the nuances of signals on linux and macos, suspending processes is not reliable enough to do
<_aegis_>
I'm also special in that I have no non-cffi C extensions
<_aegis_>
I'd probably be able to put user scripts in a box that can't even import and get away with it
<_aegis_>
which is why the idea of just implementing actors with this sounds nice to me
<_aegis_>
there's honestly way too much warmup time to do it out of process
<njs>
I mean if you can convince the pypy devs to add robust sandboxing, isolated subinterpreters, and a mechanism for sharing warmup time across subinterpreters, then I guess that would solve your problems
<_aegis_>
don't really need true sandboxing
<njs>
but that's like 3 impossible asks, so I hope you have a large budget for this...
<_aegis_>
and the other context is that I'm comfortable writing pypy patches
<simpson>
njs: And then who will pay for brunch at Milliway's~?
<_aegis_>
my question isn't to make the devs prioritize this, it's about what in pypy is going to break
<_aegis_>
because I have personal uses for this and would consider working on it
yaewa has joined #pypy
jacob22_ has joined #pypy
<simpson>
Ah. We might need Europe to wake up first, but that sounds interesting to chat about.
speeder39_ has quit [Quit: Connection closed for inactivity]
<_aegis_>
my case is a desktop app that needs to run multiple high perf user interactable components (transparent 60fps screen overlay, 90fps eye tracker, speech recognition, input simulation, background jobs)
<_aegis_>
this is not a theoretical use, I ship it today with pypy
<simpson>
Oh cool. TIL.
<_aegis_>
and it honestly works great and typically has very low cpu needs, but I could control jitter better if I moved some of the code onto a new thread
<_aegis_>
(the docs are horribly outdated, maybe look at changelog)
<energizer>
_aegis_: ah i see. is this a commercial project?
<_aegis_>
no, it's free but has closed source parts
<_aegis_>
I plan to leave it free and maybe charge for some addons or something eventually
<_aegis_>
I'm going through a difficult crippling injury and in the early stages I really would've loved to have something like this readily available, so I'm building it and giving it away so other people in a similar situation will have guaranteed access to it
<energizer>
yeah i can see that being really useful, assuming the closed parts are optional extensions rather than the core?
<_aegis_>
the core is closed source but core means very specific things here
<_aegis_>
unlike many projects almost the entire user facing experience is defined using scriptable python
<_aegis_>
there's a lot of hard tech in there that's not open-source, but it's also sharp edges stuff that casual users aren't gonna touch anyway
<_aegis_>
if I put up a faq I'm pretty sure the first ~3 entries will be about it not being open-source
adamholmberg has joined #pypy
<energizer>
if you're worried about being able to monetize the tech, there are a number of good licenses available that would allow you to make it open source and free for noncommercial use -- which sounds like your goal
<_aegis_>
I can't work a full-time job, I don't have any large sponsors, and I'm not a corporation with other income, so it's a very particular kind of user who thinks the important bit is whether they get to edit my implementation of how a key is pressed (which they are totally free to override in the scripting layer)
<_aegis_>
you don't know me or my situation and you've never used the project so I'm not sure why this is an important thing for you to presume I haven't considered
<energizer>
sorry i didn't mean to make an assumption. i must have misunderstood what you said
adamholmberg has quit [Ping timeout: 246 seconds]
<_aegis_>
my opinion is the bits that are important for users to tweak are already completely in their control
<_aegis_>
I have a very large list of reasons to not open-source the core
<_aegis_>
(in its entirety anyway)
<_aegis_>
this is totally the wrong place to have this discussion as well
<energizer>
if you dont mind sharing your thinking, i am interested. obviously i'm not in a position to have an opinion about it :)
<energizer>
perhaps #python-offtopic
<_aegis_>
I'm in #pypy to contribute back to a project I use, my project details are only really relevant here when I'm talking about how I'm using pypy :P
<energizer>
_aegis_: sure
<energizer>
_aegis_: out of curiosity, is persisting the jit about startup performance, or do you think it'd help overall speed?
<_aegis_>
can you quote what I said
<_aegis_>
I lost context
<energizer>
actually i just started with a new client, i dont have it either :\
<_aegis_>
ok for subinterpreters
<_aegis_>
pypy's warmup is slower than cpython in my experience
<energizer>
ya
<_aegis_>
so if you need to start from `import site` all over again with no context
<_aegis_>
might be talking like 300ms!
<_aegis_>
one Talon feature is hot code reloading, I don't want to add a quarter second to it
<_aegis_>
(user can modify a script and changes will immediately take effect)
<_aegis_>
I'm saying if I had to invoke a new pypy for any reason there would be random extra latencies
<_aegis_>
my process startup is ~2s right now, if I started a few subinterpreters that would probably add >1s to it
<_aegis_>
there's also the problem of each jit warming up separately
<energizer>
that project uses a pool of fork()ed processes and an import-transaction system so that when you want to start a new process it's already started the interpreter and imported everything you want except the changed files
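(The preload-then-fork trick energizer describes can be sketched with the stdlib on POSIX: pay the import cost once in a parent, then fork() cheap children that inherit the already-warm interpreter state. `run_child` and its payload format are invented for illustration.)

```python
# POSIX-only sketch of preload-then-fork: imports happen once in the
# parent, and each fork()ed child inherits the warm interpreter state,
# so there is no per-task import cost. run_child is a made-up helper.
import json  # stand-in for an "expensive" import, done before any fork
import os

def run_child(payload):
    pid = os.fork()
    if pid == 0:
        # child: json is already imported, nothing left to warm up
        ok = json.loads(payload).get("ok", False)
        os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)  # Python 3.9+

exit_code = run_child('{"ok": true}')
```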
<_aegis_>
no fork() on windows, also a process pool means I need a whole new RPC concept for talking to the screen overlay and eye tracker efficiently
<_aegis_>
(while right now I can just do a C callback into pypy with no rpc)
<energizer>
ah yeah, cross platform may be complicated, i dont know windows much
<_aegis_>
I'm more likely to embed luajit for latency sensitive tasks than spawn a subprocess :)
<ronan>
mattip: the error is raised from rpython/rlib/rsre/rsre_utf8.py:43
<ronan>
that code looks a bit silly
Rhy0lite has quit [Quit: Leaving]
<ronan>
mattip: IIUC, it won't raise MemoryError after translation, but this next_n() method shouldn't even exist because it's necessarily very slow with UTF8
marky1991 has quit [Ping timeout: 250 seconds]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
Zaab1t has quit [Quit: bye bye friends]
<mattip>
ronan: 1. would it be faster to use rutf8.create_utf8_index_storage? and 2. why should calling a function with no side effects run out of memory?
k1nd0f has joined #pypy
xcm has quit [Read error: Connection reset by peer]
adamholmberg has quit [Remote host closed the connection]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<ronan>
mattip: 1. yes, but that doesn't quite fit the current design of rsre. 2. The problem is 'range(n)', where n is 4294967294 in the test (and BTW, it's only a problem untranslated on CPython)
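(Context for ronan's point 2: untranslated RPython runs on CPython 2, where `range(n)` materializes a full list up front, so `range(4294967294)` can raise MemoryError before the loop body ever runs. Python 3's `range`, like Python 2's `xrange`, is lazy; a quick Python 3 contrast:)

```python
# On Python 2 (which untranslated RPython runs on), range(n) builds a
# full list, so range(4294967294) can raise MemoryError immediately.
# Python 3's range (like Py2's xrange) is lazy and uses O(1) memory:
n = 4294967294
r = range(n)            # no list is materialized
first = next(iter(r))   # elements are produced on demand
```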