antocuni changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | "PyPy: the Gradual Reduction of Magic (tm)"
<blachance> if I want to translate my interpreter so I can profile it (e.g. w/gperftools), do I need to use any particular translation options to get debug symbols?
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
tbodt has joined #pypy
tbodt has quit [Read error: Connection reset by peer]
<ronan> blachance: not sure what's best, but I'd try with --lldebug first
tbodt has joined #pypy
<blachance> hmm, that's what I suspected.. I tried that, and the different cost-centers the profiler shows are (what look like) addresses
<blachance> (there are a few that aren't addresses, e.g. _malloc)
<blachance> so, thanks for confirming about --lldebug ronan. Sounds like I'm at least close to the right path
yuyichao has quit [Ping timeout: 240 seconds]
Thinh has quit [Quit: Bye!]
Thinh has joined #pypy
yuyichao has joined #pypy
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
tbodt has joined #pypy
antocuni has quit [Ping timeout: 248 seconds]
gclawes has quit [Read error: Connection reset by peer]
gclawes has joined #pypy
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
pilne has joined #pypy
Thinh has quit [Quit: Bye!]
Thinh has joined #pypy
slacky__ has joined #pypy
slackyy has quit [Ping timeout: 260 seconds]
Thinh has quit [Quit: Bye!]
Thinh has joined #pypy
marr has quit [Ping timeout: 268 seconds]
jcea has quit [Ping timeout: 250 seconds]
jcea has joined #pypy
ArneBab has joined #pypy
ArneBab_ has quit [Ping timeout: 248 seconds]
jcea has quit [Remote host closed the connection]
<kenaan> rlamy unicode-utf8 b89046216269 /pypy/: Add (back) convenience methods space.newunicode(), space.new_from_utf8() and space.unicode_w()
<kenaan> rlamy unicode-utf8 a8f461710bf8 /pypy/module/_io/interp_textio.py: Do some unicode>utf8 conversions in interp_textio
tav` has joined #pypy
tav has quit [Ping timeout: 240 seconds]
tav` is now known as tav
<kenaan> rlamy default 6eab39056eb5 /pypy/module/_io/: Refactor interp_textio.py a little
<kenaan> rlamy default 870515a86876 /pypy/module/_io/interp_textio.py: Use a UnicodeBuilder in _io.TextIOWrapper.readline
Nizumzen has joined #pypy
the_drow has quit [Ping timeout: 240 seconds]
<kenaan> rlamy unicode-utf8 031e80f0a68e /pypy/module/_io/: Refactor interp_textio.py a little
<kenaan> rlamy unicode-utf8 8c2553a25336 /pypy/module/_io/interp_textio.py: Use a UnicodeBuilder in _io.TextIOWrapper.readline
the_drow has joined #pypy
Nizumzen has quit [Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/]
the_drow has quit [Ping timeout: 248 seconds]
the_drow has joined #pypy
ronan has quit [Ping timeout: 264 seconds]
Nizumzen has joined #pypy
<fijal> ronan: for logs, uh, why are you putting UnicodeBuilder anywhere?
<fijal> the point was not to have it at all
<fijal> blachance: no, you get debug symbols by default
<fijal> lldebug will make everything unevenly slower
drolando has quit [Remote host closed the connection]
drolando has joined #pypy
Nizumzen has quit [Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/]
<arigato> fijal: there is no way in rutf8 to get the flag apart from check_utf8(), right?
<fijal> arigato: from a unicodeobject?
<arigato> no from a utf8 string we just built
<fijal> as in w_unicodeobject
<fijal> ah, no
<fijal> you have to walk characters, if you didn't get it from W_UnicodeObject
<fijal> (but you can get it from W_UnicodeObject)
<arigato> and there is no way to incrementally build up the flag, for example, and no way to call a faster function that assumes it *is* valid utf8 because we just built it
<fijal> we can create such a function
<fijal> *but*
<fijal> my thinking was that writing an SSE-based function will be faster than an RPython function that assumes it's valid UTF8
<fijal> incremental building is quite a pain though
<arigato> well, the SSE function can also be faster by assuming it is valid UTF8
<fijal> yes right
<arigato> but
<fijal> so maybe we should have get_length_and_flag which for now will call check_utf8
<fijal> but we can replace it with a faster version in the future?
<arigato> we should have an incremental way
<fijal> ok
<fijal> so let's do that
<arigato> as in, when you build a utf8 string you can usually compute its length as you build it
<fijal> yes, it has been done by hand in codecs
<arigato> ok, so what do you do there?
<fijal> I use combine_flag()
<fijal> and keep track of length
<arigato> ah, that's what I'm talking about
<arigato> where is combine_flag()?
<fijal> in rutf8
<arigato> no, it's in unicodehelper.py
<fijal> ah yes
<fijal> should be moved to rutf8
<arigato> ok I can do that
<arigato> that's what I was looking for
<fijal> still, should we make a faster version?
<fijal> even if it's "call the same thing for now"?
<arigato> there is little point if you can call combine_flags() instead as you go along
<arigato> which is faster on long strings at least (no re-walking the string)
<fijal> yes sure
<fijal> but it's a pain in the ass in a lot of cases
<fijal> like, splitline
<arigato> ok
<arigato> then yes
<fijal> I must say I have no idea what is ronan doing
<arigato> :-/
<arigato> there is rutf8.get_flag_from_code and rutf8.unichr_to_flag
<arigato> identical
<arigato> which one should I kill? :-)
<fijal> heh, pick one :)
<fijal> like, he's changing code in _io module (but still using unicode instead of utf8)
<fijal> which makes it much harder to merge into anything
<arigato> I see your point, but I guess wait until he can explain?
kenaan has quit [Read error: No route to host]
<arigato> major speed-up of combine_flags(): return flag1 | flag2
<arigato> with a tweak to the actual values, of course
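A minimal sketch of the idea arigato describes here: if the flag constants are numbered so that each "worse" flag contains the bits of the milder ones, combining two flags becomes a single bitwise OR. The constant values below are an assumption made for illustration, not necessarily the values used in rutf8.

```python
# Assumed values for this sketch: each "worse" flag is a superset of the
# bits of the milder ones, so OR-ing two flags yields the worse of the two.
FLAG_ASCII = 0           # only code points below 0x80
FLAG_REGULAR = 1         # some non-ASCII code points, no surrogates
FLAG_HAS_SURROGATES = 3  # at least one surrogate code point


def combine_flags(flag1, flag2):
    """Return the flag describing the concatenation of two strings."""
    return flag1 | flag2
```

With this numbering, concatenating an ASCII string with anything leaves the other string's flag unchanged, and a surrogate anywhere dominates the result.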
antocuni has joined #pypy
* arigato tries to do _cffi_backend but keeps being distracted by corner cases
<arigato> e.g. u.lower() on a unicode with surrogates would probably get an RPython-level ValueError
kenaan_ has joined #pypy
<kenaan_> arigo unicode-utf8 a1cf21d7a124 /: Tweak the unicode FLAG_xx values for performance; collapse two identical helpers; move combine_flags() to rutf8
<kenaan_> arigo unicode-utf8 25ac6121d03c /pypy/: merge heads
<kenaan_> arigo unicode-utf8 16bfad77e3d5 /pypy/objspace/std/: Tests and fixes for 'allow_surrogates=True' in various unicode methods
<arigato> we should actually have a StringBuilder and unichr_as_utf8_append() that computes the flag for us, too
<fijal> arigato: ah
marr has joined #pypy
<antocuni> arigato: thanks for fixing vmprof
<antocuni> although r1cc101a9ee5a apparently is not enough. I have an example using eventlets in which pypy nightly doesn't record any sample, which pypy 5.9 does :(
<kenaan_> arigo unicode-utf8 dc6582a05b85 /: Review for surrogates
<arigato> antocuni: well, no tests fail :-(
<antocuni> arigato: sure, I don't think it's your fault
<antocuni> I mean, I can observe this buggy behavior also before 1cc101a9ee5a
<arigato> fijal: just to be clear, sys.maxunicode == 0x10ffff in any future pypy even on platforms where it isn't the case in CPython 2.7, right?
<antocuni> to be more precise, this is an example which shows the problem: http://paste.openstack.org/show/627200/
<fijal> arigato: yes
<fijal> arigato: that question makes no longer any sense, in a way
<fijal> (but we need to keep track of size of WCHAR anyway)
<arigato> yes
<fijal> arigato: I have electricity, but I need to go outside
<fijal> feel free to apply any form of refactoring
<fijal> like faster version of check, string builder etc
<fijal> I'm not TOO happy how haphazard the stuff is right now
<fijal> arigato: ah, and I'm fine doing the mechanical work of adjusting the current solutions :)
<arigato> I'm kind of happy that you're not too happy about it :-)
<arigato> I guess I'll try to have a version of StringBuilder that is really a Utf8StringBuilder
antocuni has quit [Ping timeout: 268 seconds]
<arigato> also, UTF8_INDEX_STORAGE seems to have added one level of indirection
<arigato> if the goal was only to provide a place for the flag without overhead, then that is missed
oberstet has joined #pypy
<fijal> what do you mean?
<fijal> we still need storage for the index stuff no?
<arigato> of course, but I'm complaining that it is now one indirection farther
<fijal> ok?
<arigato> also, the *only* use of the flag "HAS_SURROGATES" is in unicode.encode('utf8')?
<arigato> is that right?
<fijal> yes
<fijal> but ASCII is used a bit more
<arigato> maybe we can have a lightweight solution for "HAS_SURROGATES"
<arigato> still thinking
<arigato> I'm thinking about something that takes care of bytes.decode('utf8').encode('utf8') but not necessarily more complicated cases
the_drow has quit [Ping timeout: 260 seconds]
<fijal> I think it's kind of important to have encode that does not scan the string
<fijal> on strings that you might have gotten from splitting other strings, for example
<arigato> ok
<arigato> as implemented now I'm unsure you win in the end
<arigato> i.e.
<arigato> it creates overhead a bit everywhere
<fijal> keeping track of the flag?
<arigato> both in runtime cost and in complexity of implementation
<arigato> yes
<fijal> right, but knowing it's ascii is very important
<arigato> starting with how UTF8_INDEX_STORAGE is now two allocations instead of one
<arigato> yes
<fijal> why is it two allocations?
<fijal> in the common case it should be zero, no?
<arigato> well, maybe, but in case it's != 0, then it's 2
<fijal> why is it 2?
<fijal> because struct and array?
<arigato> yes
<arigato> it means that every indexing is slower
<fijal> so you're worried about this level of indirection, not about keeping the flag at all, ok
<arigato> no, that's a consequence
<fijal> arigato: sorry, please explain what do you actually mean, it took us two pages to understand what are you after
<arigato> we didn't so far :-)
<arigato> I am saying that indexing in a string is now slower, because it needs to walk through one more pointer indirection
<fijal> no, "the overhead of keeping a flag this way on UTF8_INDEX_STORAGE" is very different from "why do we need to keep HAS_SURROGATES at all"
<fijal> yes ok, but that took me two pages to understand
<fijal> there are other ways
<fijal> like we can keep an array one longer and the first item of array is always a flag
<fijal> that increases complexity a bit, but removes a level of indirection
<arigato> that doesn't seem to be the problem...
<arigato> you can define the GcStruct in a way that it starts with 'flag' and then has a GcArray, not a Ptr to it
<fijal> that would really confuse the JIT ;-)
<arigato> the problem is that you can't do it because of the UTF8_HAS_SURROGATES constant
<fijal> but maybe
<arigato> right
<fijal> why?
<fijal> the constant is just an id of something, it can be anything
<fijal> can also be a different GcStruct
<arigato> so that means that at every character indexing, you need to check "is it actually equal to UTF8_IS_ASCII? is it actually equal to UTF8_HAS_SURROGATES?"
<arigato> this is overhead
<fijal> no, because the length is zero
<fijal> we can have instead of NULL some other thing
<fijal> sure it's a bit more than just a pointer check, but not that much more
<fijal> and then just check the length
<arigato> well, these 500 ifs are a lot
<fijal> I'm not even sure it does not disappear in the noise of character checking with 500 ifs....
<fijal> it would need to be measured
<arigato> I know we said that looping over characters is very slow in CPython so it's ok if it's slowish in PyPy
<arigato> I'm still trying to optimize a bit :-)
the_drow has joined #pypy
<fijal> yes :-)
<fijal> arigato: also note that we probably don't have a single benchmark that would actually execute that part
<arigato> and again, that's both about the runtime cost and the complexity of implementation adding checks everywhere
<fijal> so it's a very open question how much do we care
<fijal> complexity is in single place, common
<fijal> I think you're overthinking that a tiny bit
<arigato> well, not the combine_flags mess, but I see
<fijal> no, but that's different
<fijal> that's keeping flags at all
<fijal> can we agree what's the topic of the conversation first?
<fijal> do you:
<fijal> a) not like keeping flags at all
<fijal> b) not like the current flag layout
<arigato> I mostly rant about the complexity I see everywhere, which is here *only* to speed up .encode('utf8'), because it's unclear to me that the win here is not lost in the slow-down everywhere else
<fijal> so no
<fijal> because we also keep the ascii flag
<fijal> which is far more important
<fijal> I'm ok arguing for the speeding up of only encode('utf8') btw, but this is not the topic right now, because of the ascii
<arigato> I don't fully see, because adding the "no surrogates" flag appears to require more implementation efforts than adding just the "ascii" flag, particularly around UTF8_INDEX_STORAGE
<fijal> how would you add just the ascii flag?
<arigato> I'm ok with "if it points to UTF8_IS_ASCII"
<fijal> so is it about a) or b)?
<fijal> I'm sorry, but each time I'm trying to have an argument, you jump between those two topics
<arigato> sorry, have to go soon
<fijal> the difference is:
<fijal> a) has the complexity problems
<fijal> and b) has the overheads that can be addressed, but not the complexity
<arigato> for example, I'm not sure that keeping "is ascii" and "has surrogates" in the same place makes sense
<arigato> maybe it does, but that is open
<fijal> ok
<fijal> that's b) right?
<arigato> that's neither a) nor b), because that's saying "the way you handle these two flags maybe needs to be different"
<arigato> as in,
<arigato> different from each other
<fijal> ok, maybe
<fijal> but if we need to have flags then we need to have all that complexity
<fijal> also, the complexity goes somewhere else than the runtime costs go
* arigato -> really away
<fijal> so having them is one problem and how do we store them is another problem
<arigato> sorry, mostly ranting here
<fijal> yes, but it's not very comprehensible to me what exactly is the problem
<fijal> arigato: ok see you, let's chat tonight
<fijal> ronan: (for logs) I don't believe what you did belongs to this branch at all
<fijal> either a) do it in default and b) merge to the branch and only then do c) move the stuff from unicode to utf8 on the branch
<fijal> or a) move unicode to utf8 on the branch, merge the branch, refactor later
raynold has quit [Quit: Connection closed for inactivity]
<fijal> arigato: my take is that you're unhappy about something, but I don't exactly know what and you don't seem to know either
<fijal> If flag tracking then utf8stringbuilder should deal with it
<fijal> If flag storage then we have a few options
<fijal> I seriously doubt a single extra pointer comparison is measurable though. Most of the cost is probably amortized building of index
jcea has joined #pypy
oberstet has quit [Ping timeout: 252 seconds]
Rhy0lite has joined #pypy
adamholmberg has joined #pypy
oberstet has joined #pypy
antocuni has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 240 seconds]
adamholmberg has joined #pypy
slacky__ has quit [Remote host closed the connection]
slackyy has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Read error: Connection reset by peer]
adamholm_ has joined #pypy
adamholm_ has quit [Remote host closed the connection]
the_drow has quit [Ping timeout: 240 seconds]
ronan has joined #pypy
the_drow has joined #pypy
<fijal> arigato: ping
<fijal> arigato: ah sorry your commits are from the morning
<fijal> please check with me before doing more changes :)
<arigato> fijal: sure
<arigato> sorry about this morning
<fijal> arigato: no worries :-)
<kenaan_> arigo unicode-utf8 a94b5860dbb3 /: Fixes for _cffi_backend
<fijal> arigato: I'm adding half of your suggestions (Utf8StringBuilder and Utf8StringIterator)
<arigato> I just pushed fixes to _cffi_backend, nothing more
<fijal> I think it's important to get interfaces first
<fijal> and we can tweak the actual details later
<fijal> so we only need to agree whether keeping any flags at all makes sense
<arigato> agreed
<fijal> let me push that and we can have a quick look
<arigato> note that UTF8_INDEX_STORAGE being changed to avoid the Ptr in 'contents' would not make the JIT unhappy: the JIT is *already* unhappy about the structure, and it's not seeing it
fryguybob has quit [Ping timeout: 260 seconds]
<arigato> (it contains a GcArray of Struct, moreover with a FixedSizeArray)
<fijal> ok
<fijal> it would make some analyzer unhappy I think
<fijal> (I whacked at it, but just a bit, not sure if enough)
<fijal> then that's a non-controversial change
<fijal> would that fix the problem?
<arigato> still unhappy about the number of checks for a simple __getitem__, which I tried very hard to keep to a minimum
<arigato> but that can come later
fryguybob has joined #pypy
<fijal> right
<fijal> I kind of agree, but I would vote for pushing towards having everything compiling so we can actually do measurements
<fijal> arigato: note that some tests fail for me
<arigato> fijal: agreed
<fijal> ok, something is off
<arigato> fijal: for me too. unless you mean inside _cffi_backend, in which case, not on linux
<fijal> arigato: test_rutf8
<fijal> hypothesis caught some examples for me
<fijal> I'll look into them
<kenaan_> fijal unicode-utf8 9ede67aee27e /rpython/rlib/: Utf8StringBuilder
<kenaan_> fijal unicode-utf8 3e45feebc910 /: merge
<arigato> passing there (I guess it means I didn't run it often enough)
<fijal> yes, something like that
<fijal> should I add failing examples?
<fijal> you can list them with @examples or something
<fijal> arigato: please tell me if this is an interface that you wanted
<fijal> (interface, not the actual implementation)
<arigato> yes, would be nice
<arigato> so .append() is for already-checked, valid utf8 strings?
<fijal> yes
marr has quit [Ping timeout: 248 seconds]
mattip has joined #pypy
<arigato> and as usual, we need to be careful when calling append_code() because it could raise ValueError
<arigato> I don't know how to improve that
<arigato> in many places we know by construction that code <= 0x10ffff
<fijal> yeah
<arigato> in other places we need to catch the ValueError or else we crash
<arigato> fine about Utf8StringBuilder
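The builder interface discussed above could look roughly like the plain-Python sketch below. It is not the code from the branch: the class name Utf8BuilderSketch, the exact signature of append(), and the flag values are all assumptions for illustration. It shows the two points made in the conversation: append() trusts its caller to pass already-checked UTF-8 together with its length and flag, while append_code() encodes one code point itself and raises ValueError when the code is above 0x10ffff.

```python
# Assumed flag values, chosen so that flags combine with a bitwise OR.
FLAG_ASCII = 0
FLAG_REGULAR = 1
FLAG_HAS_SURROGATES = 3


class Utf8BuilderSketch(object):
    """Hypothetical builder that tracks length and flag incrementally."""

    def __init__(self):
        self._parts = []
        self._length = 0          # number of code points so far
        self._flag = FLAG_ASCII

    def append(self, utf8, length, flag):
        """Append already-checked UTF-8 bytes with known length and flag."""
        self._parts.append(utf8)
        self._length += length
        self._flag |= flag

    def append_code(self, code):
        """Encode a single code point; raises ValueError if out of range."""
        if code < 0 or code > 0x10FFFF:
            raise ValueError("code point out of range")
        if code < 0x80:
            units, flag = [code], FLAG_ASCII
        elif code < 0x800:
            units = [0xC0 | (code >> 6), 0x80 | (code & 0x3F)]
            flag = FLAG_REGULAR
        elif code < 0x10000:
            units = [0xE0 | (code >> 12),
                     0x80 | ((code >> 6) & 0x3F),
                     0x80 | (code & 0x3F)]
            is_surrogate = 0xD800 <= code <= 0xDFFF
            flag = FLAG_HAS_SURROGATES if is_surrogate else FLAG_REGULAR
        else:
            units = [0xF0 | (code >> 18),
                     0x80 | ((code >> 12) & 0x3F),
                     0x80 | ((code >> 6) & 0x3F),
                     0x80 | (code & 0x3F)]
            flag = FLAG_REGULAR
        self.append(bytes(bytearray(units)), 1, flag)

    def build(self):
        return b"".join(self._parts)

    def get_length(self):
        return self._length

    def get_flag(self):
        return self._flag
```

Because length and flag are updated on every append, the finished string never has to be re-walked to recover them, which is the point arigato makes about long strings.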
<arigato> Utf8StringIterator is probably helpful
<fijal> yes, it's just untested yet
<kenaan_> fijal unicode-utf8 d24fe4f59c96 /rpython/rlib/test/test_rutf8.py: provide explicit examples
<arigato> note that w_u._has_surrogates() means "does this unicode string contain surrogates", right? there was some confusion in _cffi_backend
<fijal> yes
oberstet2 has joined #pypy
<fijal> it should really be .has_surrogates() without an _
<arigato> I think you called it with the expectation of it answering the question "does this unicode string use chars >= 0x10000"
<fijal> uh
<fijal> no, that was never the intention (even if I did)
<arigato> ok, then unicode_size_as_char16() != w_u._len() if and only if there are chars >= 0x10000
oberstet has quit [Ping timeout: 255 seconds]
<arigato> it's not related to surrogates chars
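To illustrate the distinction arigato draws: the number of 16-bit units differs from the number of code points exactly when some code point is >= 0x10000, and that has nothing to do with whether the string contains surrogate code points. The helper below is a hypothetical sketch (its name is not from the branch) and assumes its input is already valid UTF-8, with lone surrogates permitted in their 3-byte form.

```python
def count_codepoints_and_utf16_units(utf8):
    """Return (code points, 16-bit units) for valid UTF-8 bytes."""
    codepoints = 0
    utf16_units = 0
    for byte in bytearray(utf8):
        if byte & 0xC0 == 0x80:
            continue  # continuation byte, belongs to the previous code point
        codepoints += 1
        # A lead byte >= 0xF0 starts a 4-byte sequence, i.e. a code point
        # >= 0x10000, which needs two 16-bit units.
        utf16_units += 2 if byte >= 0xF0 else 1
    return codepoints, utf16_units
```

A lone surrogate such as U+D800 takes three bytes here and still counts as one code point and one 16-bit unit, so the two counts stay equal for it.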
<fijal> ah
<fijal> I did not get that
<fijal> sorry, my _cffi_backend should probably be completely reverted, I was tired and didn't know what I was doing
<fijal> (or at the very least carefully reviewed)
<arigato> yes, I think I carefully reviewed all your changes in _cffi_backend now
<arigato> how far are we to actually translate pypy?
<arigato> and run interesting programs
<fijal> I got stopped at _io module
<fijal> not *that* far
<arigato> ok cool
<fijal> but I think with new interfaces we can do _io module quite quickly
<fijal> then there is _pypyjson and cpyext, without which we can compile
<arigato> ok
<fijal> arigato: ah the tests fail because I have a narrow build of host
<fijal> u'\U00040000'
<fijal> (Pdb++) p len(u)
<fijal> 2
<fijal> that sort of stuff
<fijal> pom pom pom
<fijal> how do I get a *real* length?
<arigato> you can't distinguish between u'\U00040000' and the two surrogate characters
<arigato> there are things in runicode that try to guess anyway
<arigato> better ask the question: which test fails and can we fix the test not to use unicode
<fijal> arigato: it's the checking of check_utf8
<fijal> arigato: so we're checking "is check_utf8 returning the right value"
<fijal> which it is, but we use python len(), which isn't
<arigato> ah, right
<arigato> a bit no clue. you can't use the "guess the length" here
traverseda has quit [Ping timeout: 240 seconds]
<fijal> I mean I have, in rutf8.check_utf8, but I'm trying to see if it works :)
<fijal> we can just skip the test on narrow build maybe?
<arigato> yes
<fijal> easy, I won't check for length if the build is narrow *and* i have surrogates
<kenaan_> fijal unicode-utf8 eb564d44a7c8 /rpython/rlib/test/test_rutf8.py: fix test on narrow host
<kenaan_> fijal unicode-utf8 fa3bcbe5b09f /rpython/rlib/test/test_rutf8.py: fix tests on narrow host
<ronan> fijal: which commits are you complaining about?
<fijal> ronan: well, most of them :-
<fijal> first of all, I don't understand your plan
<ronan> the plan is to get things working
<fijal> yeah ok
<fijal> but then why do you do random refactoring?
<fijal> like, why did you add back stuff to space?
<fijal> why did you change [] to UnicodeBuilder?
<ronan> I changed it to UnicodeBuilder on default
<ronan> adding stuff to space allows tests to pass
<ronan> then I can remove the old things one by one
traverseda has joined #pypy
<fijal> so it is more like default now or less so?
marr has joined #pypy
<fijal> ah ok
<fijal> no, that does make sense sorry
<ronan> it's a bit more like default
<ronan> I mostly did parallel changes
<fijal> yeah I see
<fijal> no that makes sense
<ronan> I forgot about hg graft doing the wrong thing by default, though, sorry
<fijal> ronan: I added Utf8StringBuilder now
<fijal> that tracks flags and stuff
<ronan> good
<fijal> and Utf8StringIterator
<ronan> hmm, TextIOWrapper.readline() still needs some refactoring in order to use that
<ronan> ATM, it really relies on doing unicode[:n], where n is just a number
<fijal> how does it find n?
<ronan> by doing arithmetic on unicode indexes
<fijal> but we can do all that arithmetic on byte indexes too right?
<kenaan_> fijal unicode-utf8 e4a568e4514c /rpython/rlib/test/test_rutf8.py: more tests
<ronan> no
<ronan> well, not without a refactoring
<fijal> why not?
<fijal> they all have corresponding indexes in utf8
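As a rough illustration of fijal's point that every unicode index has a corresponding byte index, the sketch below maps a code point index to a byte offset by walking the bytes. The function name is hypothetical, the input is assumed to be valid UTF-8, and the walk is linear; it says nothing about how the branch's index storage actually speeds this up.

```python
def codepoint_index_to_byte_offset(utf8, index):
    """Return the byte offset at which code point number `index` starts.

    `index` equal to the number of code points maps to len(utf8), so the
    result can be used directly as a slice bound.
    """
    data = bytearray(utf8)
    seen = 0
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:   # start of a new code point
            if seen == index:
                return pos
            seen += 1
    if seen == index:
        return len(data)
    raise IndexError("code point index out of range")
```

Under these assumptions, a slice like unicode[:n] corresponds to utf8[:codepoint_index_to_byte_offset(utf8, n)].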
<fijal> ronan: are you ok continuing with _textio?
<fijal> if so, I would whack at the rest
<ronan> yes, I'm fine working on it
<ronan> fijal: merging default into the branch would be helpful
<fijal> ok
<kenaan_> fijal unicode-utf8 177352fb8cf4 /: merge default
<fijal> ronan: done, the crashes were strange
<ronan> ta
<fijal> sorry, conflicts
adamholmberg has joined #pypy
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 240 seconds]
<mattip> antocuni: around?
<antocuni> yes
<mattip> about eventlet and your code, don't you need to use a vmprof.enable somewhere in cpuburn?
<antocuni> mattip: no, it's called inside run_profiler
<antocuni> if you try to run it, you can look at the prints and follow the order of execution
<antocuni> but basically it is: "start profiling", "cpuburn 0 to 4", "stop profiling"
<mattip> ahh, the eventlet is running two "threads" concurrently
<antocuni> yes
<antocuni> hidden inside eventlet there is a "main hub" which drives the execution
<antocuni> whenever you call eventlet.sleep() (or any other non-blocking green function), the execution is transferred to the hub, which decides which greenlet to resume
<mattip> it works as it should on CPython?
<antocuni> yes, and also on pypy-5.9
<mattip> ok, now I got it thanks.
<antocuni> I think the bug is due to my recent changes to vmprof+rstacklet (which are needed to prevent segfaults)
<antocuni> but I had no time to investigate properly yet
<mattip> not sure if I can help, but I might try to look
<arigato> antocuni: you're running with the trunk, right? I know before my fixes yesterday it would sometimes leave the state to "stopped"
<antocuni> arigato: yes, with the nightly build which reports 1cc101a9ee5a
<antocuni> arigato: I have encountered this problem a couple of days ago; then yesterday I saw your commit and thought "ahah, that's the fix!"
<antocuni> but apparently, not :(
<arigato> ok :-/
<antocuni> it might be a similar problem, for all I know
<mattip> arigato: thanks for fixing the non-linux translations. There are failing "own" tests, seems simple, I am looking
<arigato> thanks to you
<mattip> antocuni: another topic - cpyext-avoid-roundtrip should get merged, correct?
<antocuni> yes
<antocuni> I think it is ready to be merged; last time I tried, it "passed" all the numpy and pandas tests
<antocuni> "passed" as: it is not more broken than default :)
<mattip> yes, seems to speed up numpy test suite by ~10%
<antocuni> I think I asked arigato to review, but then both of us forgot about it
<mattip> are there corners that need careful review, or can we simply merge?
<antocuni> I think that the biggest change which I made to the branch after the cape town sprint is 770b53602445, i.e. the merging of cpyext-refactor-methodobject
<antocuni> this is probably worth a review
jcea has quit [Quit: jcea]
<fijal> pom pom pom
<fijal> arigato: do we care about how multibytecodec.incremental works?
<fijal> right now it's kinda silly
Nizumzen has joined #pypy
<kenaan_> rlamy default ff05ee1c4b6a /pypy/module/_io/interp_textio.py: refactor
slackyy has quit [Ping timeout: 268 seconds]
raynold has joined #pypy
<kenaan_> fijal unicode-utf8 99ca8cf9bbc4 /pypy/module/_multibytecodec/: fix multibytecodec
<fijal> ronan: can you park your additions to objspace on a branch?
<fijal> it kinda breaks my workflow, or suggest some better solution
Nizumzen has quit [Ping timeout: 240 seconds]
<fijal> maybe I can be just careful about commits?
<ronan> fijal: I think we should keep some of those additions for tests and/or defining constants
<ronan> but feel free to deal with them however you want
<kenaan_> rlamy default 8369cd92f7d0 /pypy/module/_io/interp_textio.py: Simplify _find_line_ending() and fix logic in the case of embedded \r and self.readnl=='\r\n'
<fijal> ronan: defining constants?
<fijal> well, it makes everything pass, while it actually shouldn't
<fijal> that's the problem
<fijal> sure, you can kill them, redo them etc.
<fijal> but it's kinda around case, why not make a branch where you can deal with _textio on your own instead?
jcea has joined #pypy
<ronan> if you prefer it that way, that's fine by me
<fijal> cool :-)
<fijal> ronan: and immediately as I said that, I run into the fact that I might need it for sre ;-)
<fijal> but no, I don't think so
<fijal> arigato: feel like also doing sre stuff?
<fijal> it's a tiny bit fragile I think
<kenaan_> fijal unicode-utf8 5a057586add0 /pypy/module/_sre/interp_sre.py: one part of interp_sre
<fijal> ronan: anyway, it's ok for now, maybe we can get this fixed till tomorrow anyway and then the problem disappears
* fijal tries pypyjson
<kenaan_> rlamy unicode-utf8 0797bb6394b6 /pypy/module/_io/interp_textio.py: hg merge default
<fijal> antocuni: I believe _pypyjson can be made a lot faster
<antocuni> cool
<antocuni> how?
<fijal> SSE & stuff
<fijal> how much do you care?
<fijal> also, maps
<fijal> antocuni: can you explain a few things for me?
<antocuni> about _pypyjson? If I remember :)
<fijal> so create_string calls strslice2unicode_latin1
<fijal> right?
<fijal> but we know that there is no 0x80 set, so this must be ascii not latin1?
<antocuni> yes, I think so
<fijal> ok
<fijal> why is it called latin1 then?
<fijal> or no reason?
<fijal> ah it says so
<antocuni> because if you call it *ascii, then in theory you need to check that it's actually ascii
<antocuni> latin1 has the nice property that codepoints are between 0 and 255, so if you have bytes, you don't need to check anything
<fijal> we just checked, no?
<antocuni> yes
<antocuni> I'm just saying that if you rename the function to *ascii, I expect you to do the range check inside the function
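antocuni's naming argument can be shown with ordinary Python codecs: every byte value is a valid latin-1 code point, so a latin-1 decode needs no validation, whereas a function named after ASCII is expected to reject bytes >= 0x80 itself. The two helper names below are hypothetical and are not the _pypyjson functions.

```python
def decode_latin1(raw):
    """Every byte 0..255 maps to the code point of the same value."""
    return raw.decode("latin-1")


def decode_ascii_checked(raw):
    """An *ascii helper is expected to do the range check itself."""
    for byte in bytearray(raw):
        if byte >= 0x80:
            raise ValueError("non-ASCII byte 0x%02x" % byte)
    return raw.decode("ascii")
```

If the caller has already verified that no byte has the 0x80 bit set, both helpers return the same result, and the latin-1 one simply skips the second check.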
<fijal> (Pdb++) self.space
<fijal> 'fake space'
<fijal> awesome
<antocuni> fijal: I'm about to go off; any more questions?
<fijal> antocuni: seems to be it for now
<antocuni> ok
Nizumzen has joined #pypy
<fijal> antocuni: one more thing, why JSONDecoder gets a string but does ll_string?
<antocuni> I don't remember; maybe because if you do s[0] it checks the index and thus it's slower?
<fijal> well, it would make sense if we used that for load()
<fijal> and not for loads()
<antocuni> I'm pretty sure I measured every change when doing it
<antocuni> so if I did that, it's because it was faster :)
<fijal> my question is why it's not even faster :)
<antocuni> ah, looking at f5cbf0738f31 it might be that we need str2charp to call rdtoa
<antocuni> anyway, /me off
<antocuni> bye
<fijal> bye
antocuni has quit [Ping timeout: 240 seconds]
tav has quit [Quit: tav]
<mattip> with this branch, https://github.com/mattip/matplotlib/tree/tkagg-cffi, tk matplotlib backend works with pypy
<mattip> but
<mattip> opening the subplot-adjustment dialog box on a plot gets to a point where it has a dead weakref
<mattip> on the sliders callback
<mattip> any hints how I would debug why the weakref is dying? The object is definitely alive, just the weakref dies
<mattip> well actually it is a weakref.ref(func.im_self)
<mattip> and I am sure the func is alive
<fijal> mattip: im_self is not the same as the function btw
<fijal> although if the bound method is alive, the weakref should be alive as well
yuyichao has quit [Ping timeout: 248 seconds]
yuyichao has joined #pypy
tav has joined #pypy
<mattip> fijal: im_self is the original object the method is bound to
<mattip> right, and both the method and the object are alive
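The expectation being discussed can be written down as a small sketch. The class and names are hypothetical, and it uses __self__, the spelling of im_self that also works on Python 3. As long as the bound method is alive it holds a strong reference to the object, so a weakref to that object should not be dead.

```python
import weakref


class Slider(object):
    """Stand-in for the object whose method is used as a callback."""

    def on_change(self):
        return "changed"


slider = Slider()
callback = slider.on_change               # bound method, keeps the object alive
ref = weakref.ref(callback.__self__)      # weak reference to the object

del slider                                # drop the only other strong reference
# The bound method still references the object, so the weakref must resolve.
assert ref() is callback.__self__
```

This only states the expected behaviour; it does not reproduce the reported failure, which per the conversation appears only with the JIT enabled.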
<fijal> mattip: pfff, hard to say
<fijal> but looks like a bug
<fijal> I would put a gdb where it dies
<fijal> time to reverse db?
<fijal> rr
<fijal> because you have to go back in time quite a bit to find out why
<fijal> and why is "because it's not referenced by anyone"
Nizumzen has quit [Ping timeout: 240 seconds]
<mattip> thanks
<fijal> well, it's not very helpful :-)
<fijal> it's a bit "hours of painful debugging"
<fijal> I would say the first step would be to get the rr/undodb set up in screen, is that possible?
<mattip> well it's a thread
<mattip> maybe, I need a block of time
<fijal> ok
<fijal> maybe something is not referenced from shadowstack?
<fijal> does it reproduce with --jit off?
<mattip> will check, I am kind of off right now
<kenaan_> mattip default 8c42f0f755c0 /requirements.txt: cannot pip install vmprof on arm, s390x
<kenaan_> mattip default 72001f56a97f /rpython/rlib/rvmprof/cintf.py: fix test use of eci for vmprof_start_sampling, vmprof_start_sampling
<mattip> bye for now
mattip has left #pypy ["bye"]
<kenaan_> mattip py3.5 ce6402cbdf3c /: merge default into py3.5
danieljabailey has quit [Read error: Connection reset by peer]
jamesaxl has joined #pypy
mattip has joined #pypy
<mattip> fwiw, with --jit off the weakref is not killed :(
<mattip> now how to distill this down to a failing test
<mattip> it is single threaded
<mattip> maybe I can make a generic version
mattip has left #pypy ["bye"]
marvin_ has quit [Read error: Connection reset by peer]
marvin has joined #pypy
marvin is now known as Guest48660
oberstet2 has quit [Ping timeout: 240 seconds]
jamesaxl has quit [Read error: Connection reset by peer]
jamesaxl has joined #pypy
oberstet2 has joined #pypy
Nizumzen has joined #pypy
Nizumzen has quit [Client Quit]
agronholm has quit [Ping timeout: 250 seconds]
Nizumzen has joined #pypy
agronholm has joined #pypy
oberstet2 has quit [Ping timeout: 240 seconds]
agronholm has quit [Ping timeout: 255 seconds]
agronholm has joined #pypy
ArneBab has quit [Ping timeout: 260 seconds]