antocuni changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | "PyPy: the Gradual Reduction of Magic (tm)"
<blachance>
if I want to translate my interpreter so I can profile it (e.g. w/gperftools), do I need to use any particular translation options to get debug symbols?
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
tbodt has joined #pypy
tbodt has quit [Read error: Connection reset by peer]
<ronan>
blachance: not sure what's best, but I'd try with --lldebug first
<arigato>
fijal: there is no way in rutf8 to get the flag apart from check_utf8(), right?
<fijal>
arigato: from a unicodeobject?
<arigato>
no from a utf8 string we just built
<fijal>
as in w_unicodeobject
<fijal>
ah, no
<fijal>
you have to walk characters, if you didn't get it from W_UnicodeObject
<fijal>
(but you can get it from W_UnicodeObject)
<arigato>
and there is no way to incrementally build up the flag, for example, and no way to call a faster function that assumes it *is* valid utf8 because we just built it
<fijal>
we can create such a function
<fijal>
*but*
<fijal>
my thinking was that writing as SSE-based function will be faster than rpython function that assumes it's valid UTF8
<fijal>
incremental building is quite a pain though
<arigato>
well, the SSE function can also be faster by assuming it is valid UTF8
<fijal>
yes right
<arigato>
but
<fijal>
so maybe we should have get_length_and_flag which for now will call check_utf8
<fijal>
but we can replace it with a faster version in the future?
<arigato>
we should have an incremental way
<fijal>
ok
<fijal>
so let's do that
<arigato>
as in, when you build a utf8 string you can usually compute its length as you build it
<fijal>
yes, it has been done by hand in codecs
<arigato>
ok, so what do you do there?
<fijal>
I use combine_flag()
<fijal>
and keep track of length
<arigato>
ah, that's what I'm talking about
<arigato>
where is combine_flag()?
<fijal>
in rutf8
<arigato>
no, it's in unicodehelper.py
<fijal>
ah yes
<fijal>
should be moved to rutf8
<arigato>
ok I can do that
<arigato>
that's what I was looking for
<fijal>
still, should we make a faster version?
<fijal>
even if it's "call the same thing for now"?
<arigato>
there is little point if you can call combine_flags() instead as you go along
<arigato>
which is faster on long strings at least (no re-walking the string)
<fijal>
yes sure
<fijal>
but it's a pain in the ass in a lot of cases
<fijal>
like, splitline
<arigato>
ok
<arigato>
then yes
<fijal>
I must say I have no idea what is ronan doing
<arigato>
:-/
<arigato>
there is rutf8.get_flag_from_code and rutf8.unichr_to_flag
<arigato>
identical
<arigato>
which one should I kill? :-)
<fijal>
heh, pick one :)
<fijal>
like, he's changing code in _io module (but still using unicode instead of utf8)
<fijal>
which makes it much harder to merge into anything
<arigato>
I see your point, but I guess wait until he can explain?
kenaan has quit [Read error: No route to host]
<arigato>
major speed-up of combine_flags(): return flag1 | flag2
<arigato>
with a tweak to the actual values, of course
* arigato
tries to do _cffi_backend but keeps being distracted by corner cases
<arigato>
e.g. u.lower() on a unicode with surrogates would probably get an RPython-level ValueError
kenaan_ has joined #pypy
<kenaan_>
arigo unicode-utf8 a1cf21d7a124 /: Tweak the unicode FLAG_xx values for performance; collapse two identical helpers; move combine_flags() to rutf8
<kenaan_>
arigo unicode-utf8 16bfad77e3d5 /pypy/objspace/std/: Tests and fixes for 'allow_surrogates=True' in various unicode methods
<arigato>
we should actually have a StringBuilder and unichr_as_utf8_append() that computes the flag for us, too
<fijal>
arigato: ah
marr has joined #pypy
<antocuni>
arigato: thanks for fixing vmprof
<antocuni>
although r1cc101a9ee5a apparently is not enough. I have an example using eventlets in which pypy nightly doesn't record any sample, which pypy 5.9 does :(
<kenaan_>
arigo unicode-utf8 dc6582a05b85 /: Review for surrogates
<arigato>
antocuni: well, no tests fail :-(
<antocuni>
arigato: sure, I don't think it's your fault
<antocuni>
I mean, I can observe this buggy behavior also before 1cc101a9ee5a
<arigato>
fijal: just to be clear, sys.maxunicode == 0x10ffff in any future pypy even on platforms where it isn't the case in CPython 2.7, right?
<fijal>
arigato: that question makes no longer any sense, in a way
<fijal>
(but we need to keep track of size of WCHAR anyway)
<arigato>
yes
<fijal>
arigato: I have electricity, but I need to go outside
<fijal>
feel free to apply any form of refactoring
<fijal>
like faster version of check, string builder etc
<fijal>
I'm not TOO happy how haphazard the stuff is right now
<fijal>
arigato: ah, and I'm fine doing the mechanical work of adjusting the current solutions :)
<arigato>
I'm kind of happy that you're not too happy about it :-)
<arigato>
I guess I'll try to have a version of StringBuilder that is really a Utf8StringBuilder
antocuni has quit [Ping timeout: 268 seconds]
<arigato>
also, UTF8_INDEX_STORAGE seems to have added one level of indirection
<arigato>
if the goal was only to provide a place for the flag without overhead, then that is missed
oberstet has joined #pypy
<fijal>
what do you mean?
<fijal>
we still need storage for the index stuff no?
<arigato>
of course, but I'm complaining that it is now one indirection farther
<fijal>
ok?
<arigato>
also, the *only* use of the flag "HAS_SURROGATES" is in unicode.encode('utf8')?
<arigato>
is that right?
<fijal>
yes
<fijal>
but ASCII is used a bit more
<arigato>
maybe we can have a lightway solution for "HAS_SURROGATES"
<arigato>
still thinking
<arigato>
I'm thinking about something that takes care of bytes.decode('utf8').encode('utf8') but not necessarily more complicated cases
the_drow has quit [Ping timeout: 260 seconds]
<fijal>
I think it's kind of important to have encode that does not scan the string
<fijal>
on strings that you might have gotten from splitting other strings, for example
<arigato>
ok
<arigato>
as implemented now I'm unsure you win in the end
<arigato>
i.e.
<arigato>
it creates overhead a bit everywhere
<fijal>
keeping track of the flag?
<arigato>
both in runtime cost and in complexity of implementation
<arigato>
yes
<fijal>
right, but knowing it's ascii is very important
<arigato>
starting with how UTF8_INDEX_STORAGE is now two allocations instead of one
<arigato>
yes
<fijal>
why is it two allocations?
<fijal>
in the common case it should be zero, no?
<arigato>
well, maybe, but in case it's != 0, then it's 2
<fijal>
why is it 2?
<fijal>
because struct and array?
<arigato>
yes
<arigato>
it means that every indexing is slower
<fijal>
so you're worried about this level of indirection, not about keeping the flag at all, ok
<arigato>
no, that's a consequence
<fijal>
arigato: sorry, please explain what do you actually mean, it took us two pages to understand what are you after
<arigato>
we didn't so far :-)
<arigato>
I am saying that indexing in a string is now slower, because it needs to walk through one more pointer indirection
<fijal>
no, "the overhead of keeping a flag this way on UTF8_INDEX_STORAGE" is very different from "why do we need to keep HAS_SURROGATES at all"
<fijal>
yes ok, but that took me two pages to understand
<fijal>
there are other ways
<fijal>
like we can keep an array one longer and the first item of array is always a flag
<fijal>
that increases complexity a bit, but removes a level of indirection
<arigato>
that doesn't seem to be the problem...
<arigato>
you can define the GcStruct in a way that it starts with 'flag' and then has a GcArray, not a Ptr to it
<fijal>
that would really confuse the JIT ;-)
<arigato>
the problem is that you can't do it because of the UTF8_HAS_SURROGATES constant
<fijal>
but maybe
<arigato>
right
<fijal>
why?
<fijal>
the constant is just an id of something, it can be anything
<fijal>
can also be a different GcStruct
<arigato>
so that means that at every character indexing, you need to check "is it actually equal to UTF8_IS_ASCII? is it actually equal to UTF8_HAS_SURROGATES?"
<arigato>
this is overhead
<fijal>
no, because the length is zero
<fijal>
we can have instead of NULL some other thing
<fijal>
sure it's a bit more than just a pointer check, but not that much more
<fijal>
and then just check the length
<arigato>
well, these 500 ifs are a lot
<fijal>
I'm not even sure it does not disappear in the noise of character checking with 500 ifs....
<fijal>
it would need to be measured
<arigato>
I know we said that looping over characters is very slow in CPython so it's ok if it's slowish in PyPy
<arigato>
I'm still trying to optimize a bit :-)
the_drow has joined #pypy
<fijal>
yes :-)
<fijal>
arigato: also note that we probably don't have a single benchmark that would actually execute that part
<arigato>
and again, that's both about the runtime cost and the complexity of implementation adding checks everywhere
<fijal>
so it's a very open question how much do we care
<fijal>
complexity is in single place, common
<fijal>
I think you're overthinking that a tiny bit
<arigato>
well, not the combine_flags mess, but I see
<fijal>
no, but that's different
<fijal>
that's keeping flags at all
<fijal>
can we agree what's the topic of the conversation first?
<fijal>
do you:
<fijal>
a) not like keeping flags at all
<fijal>
b) not like the current flag layout
<arigato>
I mostly rant about the complexity I see everywhere, which is here *only* to speed up .encode('utf8'), because it's unclear to me that the win here is not lost in the slow-down everywhere else
<fijal>
so no
<fijal>
because we also keep the ascii flag
<fijal>
which is far more important
<fijal>
I'm ok arguing for the speeding up of only encode('utf8') btw, but this is not the topic right now, because of the ascii
<arigato>
I don't fully see, because adding the "no surrogates" flag appears to require more implementation efforts than adding just the "ascii" flag, particularly around UTF8_INDEX_STORAGE
<fijal>
how would you add just the ascii flag?
<arigato>
I'm ok with "if it points to UTF8_IS_ASCII"
<fijal>
so is it about a) or b)?
<fijal>
I'm sorry, but each time I'm trying to have an argument, you jump between those two topics
<arigato>
sorry, have to go soon
<fijal>
the difference is:
<fijal>
a) has the complexity problems
<fijal>
and b) has the overheads that can be addressed, but not the complexity
<arigato>
for example, I'm not sure that keeping "is ascii" and "has surrogates" in the same place makes sense
<arigato>
maybe it does, but that is open
<fijal>
ok
<fijal>
that's b) right?
<arigato>
that's neither a) nor b), because that's saying "the way you handle these two flags maybe needs to be different"
<arigato>
as in,
<arigato>
different from each other
<fijal>
ok, maybe
<fijal>
but if we need to have flags than we need to have all that complexity
<fijal>
also, the complexity goes somewhere else than the runtime costs go
* arigato
-> really away
<fijal>
so having them is one problem and how do we store them is another problem
<arigato>
sorry, mostly ranting here
<fijal>
yes, but it's not very comprehensible to me what exactly is the problem
<fijal>
arigato: ok see you, let's chat tonight
<fijal>
ronan: (for logs) I don't believe what you did belongs to this branch at all
<fijal>
either a) do it in default and b) merge to the branch and only then do c) move the stuff from unicode to utf8 on the branch
<fijal>
or a) move unicode to utf8 on the branch, merge the branch, refactor later
raynold has quit [Quit: Connection closed for inactivity]
<fijal>
arigato: my take is that you're unhappy about something, but I don't exactly know what and you don't seem to know either
<fijal>
If flag tracking then utf8stringbuilder should deal with it
<fijal>
If flag storage then we have a few options
<fijal>
I seriously doubt a single extra pointer comparison is measurable though. Most of the cost is probably amortized building of index
jcea has joined #pypy
oberstet has quit [Ping timeout: 252 seconds]
Rhy0lite has joined #pypy
adamholmberg has joined #pypy
oberstet has joined #pypy
antocuni has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 240 seconds]
adamholmberg has joined #pypy
slacky__ has quit [Remote host closed the connection]
slackyy has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Read error: Connection reset by peer]
adamholm_ has joined #pypy
adamholm_ has quit [Remote host closed the connection]
the_drow has quit [Ping timeout: 240 seconds]
ronan has joined #pypy
the_drow has joined #pypy
<fijal>
arigato: ping
<fijal>
arigato: ah sorry your commits are from the morning
<fijal>
please check with me before doing more changes :)
<arigato>
fijal: sure
<arigato>
sorry about this morning
<fijal>
arigato: no worries :-)
<kenaan_>
arigo unicode-utf8 a94b5860dbb3 /: Fixes for _cffi_backend
<fijal>
arigato: I'm adding half of your suggestions (Utf8StringBuilder and Utf8StringIterator)
<arigato>
I just pushed fixes to _cffi_backend, nothing more
<fijal>
I think it's important to get interfaces first
<fijal>
and we can tweak the actual details later
<fijal>
so we only need to agree whether keeping any flags at all makes sense
<arigato>
agreed
<fijal>
let me push that and we can have a quick look
<arigato>
note that UTF8_INDEX_STORAGE being changed to avoid the Ptr in 'contents' would not make the JIT unhappy: the JIT is *already* unhappy about the structure, and it's not seeing it
fryguybob has quit [Ping timeout: 260 seconds]
<arigato>
(it contains a GcArray of Struct, moreover with a FixedSizeArray)
<fijal>
ok
<fijal>
it would make some analyzer unhappy I think
<fijal>
(I whacked at it, but just a bit, not sure if enough)
<fijal>
then that's a non-controversial change
<fijal>
would that fix the problem?
<arigato>
still unhappy about the number of checks for a simple __getitem__, which I tried very hard to keep to a minimum
<arigato>
but that can come later
fryguybob has joined #pypy
<fijal>
right
<fijal>
I kind of agree, but I would vote for pushing towards having everything compiling so we can actually do measurments
<fijal>
arigato: note that some tests fail for me
<arigato>
fijal: agreed
<fijal>
ok, something is off
<arigato>
fijal: for me too. unless you mean inside _cffi_backend, in which case, not on linux
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 240 seconds]
<mattip>
antocuni: around?
<antocuni>
yes
<mattip>
about eventlet and your code, don;t you need to use a vmprof.enable somewhere in cpuburn?
<antocuni>
mattip: no, it's called inside run_profiler
<antocuni>
if you try to run it, you can look at the prints and follow the order of execution
<antocuni>
but basically it is: "start profiling", "cpuburn 0 to 4", "stop profiling"
<mattip>
ahh, the eventlet is running two "threads" concurrently
<antocuni>
yes
<antocuni>
hidden inside eventlet there is a "main hub" which drives the execution
<antocuni>
whenever you call eventlet.sleep() (or any other non-blocking green function), the execution is transfered to the hub, which decides which greenlet to resume
<mattip>
it works as it should on CPython?
<antocuni>
yes, ando also on pypy-5.9
<mattip>
ok, now I got it thanks.
<antocuni>
I think the bug is due to my recent changes to vmprof+rstacklet (which are needed to prevent segfaults)
<antocuni>
but I had no time to investigate properly yet
<mattip>
not sure if I can help, but I might try to look
<arigato>
antocuni: you're running with the trunk, right? I know before my fixes yesterday it would sometimes leave the state to "stopped"
<antocuni>
arigato: yes, with the nightly build which reports 1cc101a9ee5a
<antocuni>
arigato: I have encountered this problem a couple of days ago; then yesterday I saw your commit and though "ahah, that's the fix!"
<antocuni>
but apparently, not :(
<arigato>
ok :-/
<antocuni>
it might be a similar problem, for all I know
<mattip>
arigato: thanks for fixing the non-linux translations. There are failing "own" tests, seems simple, I am looking
<arigato>
thanks to you
<mattip>
antocuni: another topic - cpyext-avoid-roundtrip should get merged, correct?
<antocuni>
yes
<antocuni>
I think it is ready to be merged; last time I tried, it "passed" all the numpy and pandas tests
<antocuni>
"passed" as: it is not more broken than default :)
<mattip>
yes, seems to speed up numpy test suite by ~10%
<antocuni>
I think I asked arigato to review, but then both of us forgot about it
<mattip>
are there corners that need careful review, or can we simply merge?
<antocuni>
I think that the biggest change which I made to the branch after the cape town sprint is 770b53602445, i.e. the merging of cpyext-refactor-methodobject
<antocuni>
this is probably worth a review
jcea has quit [Quit: jcea]
<fijal>
pom pom pom
<fijal>
arigato: do we care about how multibytecodec.incremental works?
<fijal>
ronan: can you park your additions to objspace on a branch?
<fijal>
it kinda breaks my workflow, or suggest some better solution
Nizumzen has quit [Ping timeout: 240 seconds]
<fijal>
maybe I can be just careful about commits?
<ronan>
fijal: I think we should keep some of those additions for tests and/or defining constants
<ronan>
but feel free to deal with them however you want
<kenaan_>
rlamy default 8369cd92f7d0 /pypy/module/_io/interp_textio.py: Simplify _find_line_ending() and fix logic in the case of embedded \r and self.readnl=='\r\n'
<fijal>
ronan: defining constants?
<fijal>
well, it makes everything pass, while it actually shouldn't
<fijal>
that's the problem
<fijal>
sure, you can kill them, redo them etc.
<fijal>
but it's kinda around case, why not make a branch where you can deal with _textio on your own instead?
jcea has joined #pypy
<ronan>
if you prefer it that way, that's fine by me
<fijal>
cool :-)
<fijal>
ronan: and immediately as I said that, I run into the fact that I might need it for sre ;-)
<fijal>
but no, I don't think so
<fijal>
arigato: feel like also doing sre stuff?
<fijal>
it's a tiny bit fragile I think
<kenaan_>
fijal unicode-utf8 5a057586add0 /pypy/module/_sre/interp_sre.py: one part of interp_sre
<fijal>
ronan: anyway, it's ok for now, maybe we can get this fixed till tomorrow anyway and then the problem disappears