00:07
jcea has joined #pypy
00:45
dansan has quit [Remote host closed the connection]
01:41
todda7 has quit [Ping timeout: 246 seconds]
01:48
tsaka__ has joined #pypy
01:51
tsaka__ has quit [Remote host closed the connection]
02:04
rubdos has quit [Ping timeout: 244 seconds]
02:07
rubdos has joined #pypy
02:08
dstufft has quit [*.net *.split]
02:08
avakdh has quit [*.net *.split]
02:08
krono has quit [*.net *.split]
02:08
oberstet has quit [*.net *.split]
02:08
mgedmin has quit [*.net *.split]
02:08
[Arfrever] has quit [*.net *.split]
02:08
ebarrett has quit [*.net *.split]
02:14
_whitelogger has joined #pypy
02:14
cfbolz has joined #pypy
02:14
dnshane has joined #pypy
02:14
ronan has joined #pypy
02:14
EWDurbin has joined #pypy
02:14
Guest68750 has joined #pypy
02:14
jaraco has joined #pypy
02:14
jeroud has joined #pypy
02:14
phlebas has joined #pypy
02:14
string has joined #pypy
02:14
mwhudson has joined #pypy
02:14
simpson has joined #pypy
02:14
JStoker has joined #pypy
02:14
arigo has joined #pypy
02:14
tbodt has joined #pypy
02:14
antocuni has joined #pypy
02:14
Civil has joined #pypy
02:14
gsnedders has joined #pypy
02:14
dmalcolm_ has joined #pypy
02:14
epsilonKNOT has joined #pypy
02:14
energizer has joined #pypy
02:14
trfl has joined #pypy
02:14
marvin_ has joined #pypy
02:14
gsnedders has quit [*.net *.split]
02:14
dmalcolm_ has quit [*.net *.split]
02:14
energizer has quit [*.net *.split]
02:14
marvin_ has quit [*.net *.split]
02:14
epsilonKNOT has quit [*.net *.split]
02:14
trfl has quit [*.net *.split]
02:14
ronan has quit [*.net *.split]
02:14
dnshane has quit [*.net *.split]
02:14
jaraco has quit [*.net *.split]
02:14
EWDurbin has quit [*.net *.split]
02:14
cfbolz has quit [*.net *.split]
02:14
Guest68750 has quit [*.net *.split]
02:14
string has quit [*.net *.split]
02:14
jeroud has quit [*.net *.split]
02:14
phlebas has quit [*.net *.split]
02:14
mwhudson has quit [*.net *.split]
02:14
simpson has quit [*.net *.split]
02:14
JStoker has quit [*.net *.split]
02:14
Civil has quit [*.net *.split]
02:14
antocuni has quit [*.net *.split]
02:14
arigo has quit [*.net *.split]
02:14
tbodt has quit [*.net *.split]
02:14
eregon has quit [*.net *.split]
02:14
Alex_Gaynor has quit [*.net *.split]
02:14
whitewolf has quit [*.net *.split]
02:14
lastmikoi has quit [*.net *.split]
02:14
danilonc has quit [*.net *.split]
02:14
pjenvey has quit [*.net *.split]
02:14
ulope has quit [*.net *.split]
02:14
runciter has quit [*.net *.split]
02:14
bogner has quit [*.net *.split]
02:14
_habnabit has quit [*.net *.split]
02:14
jerith has quit [*.net *.split]
02:14
glyph has quit [*.net *.split]
02:14
tazle has quit [*.net *.split]
02:14
tumbleweed has quit [*.net *.split]
02:14
raekye has quit [*.net *.split]
02:14
bbot2 has quit [*.net *.split]
02:14
marmoute has quit [*.net *.split]
02:14
the_rat has quit [*.net *.split]
02:14
Lightsword has quit [*.net *.split]
02:14
shodan45 has quit [*.net *.split]
02:14
Hodgestar has quit [*.net *.split]
02:14
oberstet has quit [*.net *.split]
02:14
mgedmin has quit [*.net *.split]
02:14
[Arfrever] has quit [*.net *.split]
02:14
ebarrett has quit [*.net *.split]
02:14
pmp-p has quit [*.net *.split]
02:14
epony has quit [*.net *.split]
02:14
Ninpo has quit [*.net *.split]
02:14
commandoline has quit [*.net *.split]
02:14
nopf_ has quit [*.net *.split]
02:14
jiffe has quit [*.net *.split]
02:14
atomizer has quit [*.net *.split]
02:15
Alex_Gaynor has joined #pypy
02:15
whitewolf has joined #pypy
02:15
eregon has joined #pypy
02:15
pulkit25 has quit [Ping timeout: 244 seconds]
02:16
oberstet has joined #pypy
02:16
mgedmin has joined #pypy
02:16
[Arfrever] has joined #pypy
02:16
ebarrett has joined #pypy
02:17
altendky has quit [Ping timeout: 260 seconds]
02:18
toad_polo has joined #pypy
02:18
idnar has quit [Ping timeout: 260 seconds]
02:18
ulope has joined #pypy
02:18
danilonc has joined #pypy
02:18
pjenvey has joined #pypy
02:18
lastmikoi has joined #pypy
02:18
jerith has joined #pypy
02:18
bogner has joined #pypy
02:18
runciter has joined #pypy
02:18
_habnabit has joined #pypy
02:19
energizer has joined #pypy
02:19
trfl has joined #pypy
02:19
dmalcolm_ has joined #pypy
02:19
marvin_ has joined #pypy
02:19
gsnedders has joined #pypy
02:19
epsilonKNOT has joined #pypy
02:19
tazle has joined #pypy
02:19
bbot2 has joined #pypy
02:19
raekye has joined #pypy
02:19
glyph has joined #pypy
02:19
Hodgestar has joined #pypy
02:19
the_rat has joined #pypy
02:19
marmoute has joined #pypy
02:19
Lightsword has joined #pypy
02:19
shodan45 has joined #pypy
02:19
tumbleweed has joined #pypy
02:19
nopf has joined #pypy
02:19
dnshane has joined #pypy
02:19
cfbolz has joined #pypy
02:19
ronan has joined #pypy
02:19
JStoker has joined #pypy
02:19
antocuni has joined #pypy
02:19
phlebas has joined #pypy
02:19
Civil has joined #pypy
02:19
jeroud has joined #pypy
02:19
simpson has joined #pypy
02:19
arigo has joined #pypy
02:19
mwhudson has joined #pypy
02:19
jaraco has joined #pypy
02:19
tbodt has joined #pypy
02:19
idnar has joined #pypy
02:19
idnar has quit [Changing host]
02:19
idnar has joined #pypy
02:19
idnar has joined #pypy
02:19
idnar has quit [Changing host]
02:20
graingert has quit [Ping timeout: 256 seconds]
02:20
idnar has quit [Changing host]
02:20
idnar has joined #pypy
02:21
Ninpo has joined #pypy
02:21
pmp-p has joined #pypy
02:21
commandoline has joined #pypy
02:21
epony has joined #pypy
02:21
jiffe has joined #pypy
02:21
atomizer has joined #pypy
02:21
EWDurbin has joined #pypy
02:22
pulkit25 has joined #pypy
02:22
Guest68750 has joined #pypy
02:23
altendky has joined #pypy
02:23
string has joined #pypy
02:25
graingert has joined #pypy
02:42
infernix has joined #pypy
02:42
the_drow[m] has joined #pypy
02:42
astrojl_matrix has joined #pypy
02:44
andi- has quit [Remote host closed the connection]
02:47
andi- has joined #pypy
02:56
lritter_ has joined #pypy
02:56
lritter has quit [Ping timeout: 244 seconds]
03:16
jcea has quit [Quit: jcea]
03:39
forgottenone has joined #pypy
06:52
<
fijal >
arigo: that's a bit bizarre
07:05
lritter_ has quit [Ping timeout: 240 seconds]
07:41
_whitelogger has joined #pypy
09:47
_whitelogger has joined #pypy
09:50
dddddd has joined #pypy
11:27
forgottenone has quit [Read error: Connection reset by peer]
11:30
forgottenone has joined #pypy
11:42
agronholm has joined #pypy
11:42
<
agronholm >
hello, could somebody shed a little light on this? why does this script produce different output on pypy than on cpython?
https://bpa.st/FVWA
11:44
<
agronholm >
the buffer still contains two bytes on pypy at exit, nothing on cpython
11:45
BPL has joined #pypy
11:58
forgottenone has quit [Quit: Konversation terminated!]
12:13
<
mattip >
agronholm: what version of PyPy, what platform?
12:25
<
agronholm >
mattip: pypy3 7.3.1, Linux (Fedora 32)
12:50
<
agronholm >
for some reason it stops on the first encoding error, unlike its cpython counterpart
12:50
<
agronholm >
*decoding error
13:04
<
mattip >
can you try with latest HEAD?
13:04
<
agronholm >
I don't suppose there's a precompiled version lying around anywhere?
13:04
<
agronholm >
if not, I'll get to building it
13:09
forgottenone has joined #pypy
13:15
<
agronholm >
mattip: that gives me the same result
13:18
<
agronholm >
I am having trouble finding the implementation in the sources
13:18
<
agronholm >
maybe I could track down the problem then
13:28
<
mattip >
add a test to pypy/module/_codecs/test, run it with python2 pytest.py pypy/module/_codecs ...
13:29
<
mattip >
and look in pypy/interpreter/unicodehelper
13:31
<
mattip >
you don’t need the decode part if the encode is different
13:40
<
agronholm >
I'm not sure where to look, and the utf_8_decode() function is a builtin so I can't step into it with a debugger either
13:40
<
agronholm >
are you sure it invokes functions from the unicodehelper module?
16:19
tos9_ is now known as tos9
17:16
<
mattip >
agronholm: this is 32-bit fedora?
17:16
<
mattip >
I get identical results on 64-bit linux.
17:18
<
mattip >
what I was trying to explain before is that the way to debug this is via a test with untranslated pypy
17:19
<
mattip >
the tests live in pypy/module/_codecs/test/test_codecs.py
17:19
<
mattip >
and are run via
17:19
<
mattip >
python2 pytest.py pypy/module/_codecs/test/test_codecs.py
17:20
<
mattip >
for instance you can run the test_decoder_state function via
17:20
<
mattip >
python2 pytest.py pypy/module/_codecs/test/test_codecs.py -k test_decoder_state
17:45
speeder39_ has joined #pypy
17:45
<
agronholm >
64-bit Fedora
18:05
<
agronholm >
mattip: I get the same results also when I run the script against the latest pypy:3 docker image
18:05
<
agronholm >
I will try to run the test against the untranslated pypy, once I understand how
18:16
<
agronholm >
mattip: when I run the test (python3 pytest.py -D pypy/module/_codecs/test/test_decoder.py) it passes
18:17
<
agronholm >
I'm not entirely sure what the point of that is, since it loads the function from the host python, doesn't it?
18:31
<
mattip >
note my command line specifies python2 with no -D
18:31
<
mattip >
and you need to write/change a test to use your string
18:36
<
mattip >
that will allow you to run untranslated, and add a pdb.set_trace() and poke around (not in the test, in the RPython code inside interp_codecs or unicodehelper)
18:43
<
agronholm >
yes, I did add a test, I just didn't understand that running it with python3 runs it against the host and on python2 it does something entirely different
18:44
<
mattip >
so when I run decoder = codecs.getincrementaldecoder('utf-8')(errors='replace'); [ord(x) for x in decoder.decode('åäö'.encode('iso-8859-1'))]
18:45
<
mattip >
I get [65533, 65533, 65533]
18:45
<
mattip >
on both CPython3.7 and PyPy3.6-v7.3.1
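For reference, mattip's one-liner can be expanded into a small script that also inspects the decoder's pending buffer, which is what agronholm reported as still holding bytes on PyPy. This is only a sketch: the variable names are illustrative, and the values noted in the comments are what CPython 3 is expected to produce according to the discussion.

```python
import codecs

# 'åäö' encoded as ISO-8859-1 yields three bytes that are not valid UTF-8.
data = 'åäö'.encode('iso-8859-1')

# Incremental UTF-8 decoder that substitutes U+FFFD for undecodable bytes.
decoder = codecs.getincrementaldecoder('utf-8')(errors='replace')
text = decoder.decode(data)  # final defaults to False

codepoints = [ord(ch) for ch in text]
# getstate() returns (bytes still buffered, extra state flag).
pending, _flags = decoder.getstate()

print(codepoints)  # CPython 3: [65533, 65533, 65533]
print(pending)     # CPython 3: b''
```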
18:45
<
agronholm >
I'm thoroughly confused
18:46
<
agronholm >
I ran that test on python 2 (why do I have to make it python 2 compatible when it's supposed to run on pypy3?)
18:46
<
agronholm >
it still gives me the wrong result though
18:46
<
mattip >
so you run with python2 because RPython is written in python2
18:48
<
mattip >
and the command python2 pytest.py pypy/module ... takes the test, notes that it imports _codecs, runs the test on top of the untranslated pypy while building enough of pypy to use the _codecs module
18:48
<
mattip >
what do you call "the wrong result"?
18:50
<
mattip >
are you using a locale?
18:51
<
agronholm >
I call it the wrong result because it differs from the cpython result
18:51
<
agronholm >
(and the result you get)
18:52
<
agronholm >
locale should have no bearing on these functions
18:53
<
mattip >
what is the value of the wrong result?
18:53
<
agronholm >
[65533]
18:53
<
mattip >
ahh, a single value?
18:53
<
mattip >
btw, you can write this as [ord(x) for x in 'åäö'.encode('iso-8859-1').decode('utf8', 'replace')]
18:54
<
agronholm >
no, that gives me the correct result
18:54
<
agronholm >
only using codecs.utf_8_decode() (or using the incremental decoder) gives the wrong result
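The difference between the two code paths can be seen side by side with a minimal sketch like the one below. The variable names are illustrative. On CPython 3 both paths yield three replacement characters and `consumed` is 3; the report here is that PyPy 7.3.1 diverges only on the `codecs.utf_8_decode()` path.

```python
import codecs

data = 'åäö'.encode('iso-8859-1')  # three bytes, invalid as UTF-8

# High-level API: always decodes the whole input.
via_method = data.decode('utf-8', 'replace')

# Low-level codec function: returns (text, number_of_bytes_consumed).
# Arguments are positional: data, errors, final (final defaults to False).
via_codec, consumed = codecs.utf_8_decode(data, 'replace')

print([ord(ch) for ch in via_method])
print([ord(ch) for ch in via_codec], consumed)
```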
18:54
<
mattip >
ahh, ok, so we are getting somewhere
18:59
<
mattip >
maybe connected to sys.getdefaultencoding() or sys.getfilesystemencoding() ? For me both those are utf-8
19:00
<
agronholm >
same here
19:01
<
agronholm >
although that should also not affect anything since we're being explicit about the encodings
19:03
rubdos has quit [Ping timeout: 260 seconds]
19:04
rubdos has joined #pypy
19:09
<
mattip >
ok, I can reproduce. Here is the test
19:09
<
mattip >
for some reason it is returning 1 for the length, not 3
19:12
<
agronholm >
so unicodehelper.str_decode_utf8() is the first place to look for trouble
19:13
<
mattip >
I don't know what I was doing before, but now pypy3 translated is showing that error as well, sorry I must have been doing something wrong
19:13
<
agronholm >
np, glad we're on the same page now :)
19:19
<
mattip >
it is a problem with the handling of final, which by default is False
19:19
<
mattip >
to support incremental decoding
19:21
<
mattip >
and by problem I mean incompatibility with cpython. I am not sure I understand why CPython finalizes the buffer
19:24
<
agronholm >
your patch did not trigger the debugger for me
19:26
<
mattip >
did you run the test: python2 pytest.py pypy/module/_cpyext/test/test_codecs.py -k test_utf_8_decode
19:26
<
agronholm >
no, and your patch doesn't touch that file either
19:26
tos9 has joined #pypy
19:27
<
agronholm >
it creates pypy/module/_codecs/test/test_codecs.py
19:27
<
agronholm >
other than that difference, I did use that command line
19:27
<
agronholm >
it already fails at the "assert utf8 == b'\\xe5\\xe4\\xf6'" line
19:30
Rhy0lite has joined #pypy
19:31
<
agronholm >
isn't the code wrong though? u'\xc3\xa5\xc3\xa4\xc3\xb6'.encode('iso-8859-1') does give b'\xc3\xa5\xc3\xa4\xc3\xb6'
19:31
<
agronholm >
as it should
19:31
<
mattip >
copy paste mess. replace the double backslash with single
19:33
<
mattip >
and the utf8 string should be your original unicode string (in the line above)
19:33
<
agronholm >
those code points are the utf-8 representation of my original string
19:34
<
agronholm >
yup – translated to escape codes, it would be '\xe5\xe4\xf6'
19:35
<
agronholm >
ok, now it triggers the debugger
19:41
<
mattip >
ok, so in CPython the equivalent to unicodehelper.str_decode_utf8 is PyUnicode_DecodeUTF8Stateful
19:42
<
mattip >
but with one big difference: it passes in a `consumed` rather than a boolean `final`
19:44
<
mattip >
the only place `consumed` is used is in _codecs.utf_8_decode, where if final is False it is the length in bytes of the utf8 string
19:44
<
mattip >
so, in short, a bug
20:04
jcea has joined #pypy
20:09
<
agronholm >
ok, so the cpython behavior is wrong?
20:11
<
mattip >
no, PyPy's interface is not sophisticated enough to mimic CPython's final=True/False handling
20:11
<
mattip >
we should be using "consumed" in the unicodehelper functions, not "final"
20:12
<
mattip >
and consumed==-1 will be the default value (like when consumed==NULL in CPython)
20:13
<
mattip >
sorry. ... will be the final=True value ...
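To illustrate why the consumed count matters for incremental decoding, here is a rough sketch of a buffering loop built on `codecs.utf_8_decode()`. The helper `decode_stream` is hypothetical and is not PyPy's or CPython's actual implementation; it only shows how a caller keeps the unconsumed tail of the input until more bytes arrive, and treats a leftover tail as an error only on the final call.

```python
import codecs


def decode_stream(chunks, errors='replace'):
    """Decode an iterable of byte chunks as UTF-8, one chunk at a time.

    Hypothetical helper for illustration only.
    """
    buffer = b''
    pieces = []
    for chunk in chunks:
        buffer += chunk
        # final=False: an incomplete trailing sequence is not an error;
        # it is simply left unconsumed.
        text, consumed = codecs.utf_8_decode(buffer, errors, False)
        pieces.append(text)
        buffer = buffer[consumed:]
    # final=True: anything still buffered is now treated as truncated input.
    text, _consumed = codecs.utf_8_decode(buffer, errors, True)
    pieces.append(text)
    return ''.join(pieces)


# 'å' in UTF-8 is b'\xc3\xa5'; here it is split across two chunks.
print(decode_stream([b'\xc3', b'\xa5']))
```

With `final=False`, a lone `b'\xc3'` is reported as zero bytes consumed, whereas bytes that can never form a valid sequence (as in the example under discussion) are replaced and counted as consumed straight away.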
20:15
<
agronholm >
I'll create an issue for this and copy the IRC logs there
20:35
speeder39_ has quit [Quit: Connection closed for inactivity]
20:53
<
mattip >
thanks for pursuing this
21:01
forgottenone has quit [Quit: Konversation terminated!]
21:14
lritter_ has joined #pypy
21:44
mattip has quit [Ping timeout: 264 seconds]
21:47
mattip has joined #pypy
21:48
Rhy0lite has quit [Ping timeout: 240 seconds]
21:57
xcm has quit [Remote host closed the connection]
21:58
xcm has joined #pypy
22:41
Rhy0lite has joined #pypy
23:02
_whitelogger has joined #pypy
23:25
BPL has quit [Quit: Leaving]