<GitHub199>
[jruby] perlun opened pull request #5140: spec/ruby: Bring in some changes from upstream (master...spec/update-with-sigterm-specs) https://git.io/vxhLb
sidx64 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
sidx64 has joined #jruby
olle has joined #jruby
sidx64 has quit [Read error: Connection reset by peer]
sidx64 has joined #jruby
shellac has joined #jruby
shellac has quit [Quit: Computer has gone to sleep.]
claudiuinberlin has joined #jruby
shellac has joined #jruby
olle has quit [Quit: olle]
olle has joined #jruby
olle has quit [Quit: olle]
<kares>
enebo: rdubya: well go for it but that is just weird - was under the impression that millis/nanos worked properly at some point
drbobbeaty has joined #jruby
olle has joined #jruby
Osho has quit [Ping timeout: 260 seconds]
me- has quit [Ping timeout: 264 seconds]
Osho has joined #jruby
me has joined #jruby
me is now known as Guest1483
shellac has quit [Quit: Computer has gone to sleep.]
shellac has joined #jruby
<kares>
rdubya: so I have looked into your complaint about binary - that is definitely not a recent thing
<kares>
it was simply hiding due to tests - also the CI results somehow seem unreliable in this sense ;(
<kares>
which is almost 2 months old and it is still failing
<kares>
it will probably go even further down the road, the problem is that I'm getting the max_identifier_length problem with these older commits
<kares>
so it's not easy to get down to - but I have a feeling this might have been the same thing as the binary 'in-house' test suite failure
<kares>
... so really someone needs to chase it down :)
<rdubya>
kares: ah, I thought we were good with binary data as long as prepared statements were disabled previously
<kares>
its true that I am 'unemployed' for a few months already :) but I need to focus on stuff I care for the moment
<kares>
rdubya: hey! seems not
<rdubya>
were those tests being skipped or something?
<kares>
maybe it's due to the AR version upgrade ... who knows
<kares>
they were not but they're failing for me now as I check out older commits
<kares>
+ the CI outcome I am not reproducing ;( so this is not something I can help out with quickly ... it seems
<kares>
(I mean the older CI outcome - it's always failing for me)
<rdubya>
with prepared statements off? I know they've always failed with them on (sorry if I'm being dense, I just don't remember seeing them fail previously)
<kares>
even on that old commit of yours I posted above
<kares>
hmm yes I had an env set
<kares>
will double check
<kares>
it's on by default for PG right?
<rdubya>
ok thanks, guess if its all the same problem I can try digging in again
<rdubya>
yeah its on by default
<kares>
oh well you're right - was checking the wrong thing ;(
<kares>
used PS=false
<kares>
but that wasn't supported all the way back with the Rails suite
<kares>
so I guess I'll redo a few tests again
<rdubya>
ok thanks for checking it out
<kares>
its looking better
<kares>
seems to be pointing to the ByteaUtils update
<enebo>
This is what strftime does in JRuby itself
<enebo>
So the way we make that will return "000" onto the front and it then formats weird
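The leading-zero behavior enebo mentions is visible from plain Ruby: `strftime`'s `%N` directive zero-pads the fractional second to the requested width (9 digits by default), which is where a "000" prefix comes from for small sub-second values. A minimal illustration (the `Rational` value is my own example, not from the code under discussion):

```ruby
# Exactly one millisecond past the epoch, expressed exactly with Rational.
t = Time.at(Rational(1, 1000))

t.strftime("%N")   # => "001000000"  nanoseconds, zero-padded to 9 digits
t.strftime("%3N")  # => "001"        first 3 digits, i.e. milliseconds
```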
<kares>
yeah it felt like its being hacked around ;(
<enebo>
kares: I guess since nano is not in DateTime this is a big part of the problem
<kares>
enebo: well it was handled before
<kares>
there are places where it's set onto the RubyTime
<kares>
so changing DateTimeUtils to format it did not seem right ...
<kares>
but thought you would figure it out sooner or later since you know JRuby's internals :)
<enebo>
kares: hmm so the formatting code is from 13 so perhaps it is createTime which changed
<kares>
be careful to watch for test failures (regressions) in all the suites ...
<kares>
time handling is quite tricky territory
<kares>
I can probably count time spent in weeks at this point
<kares>
+ did some of the things twice already :)
sidx64 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<enebo>
kares: well so long as I understand the actual format this is just how we call jruby itself (or I fix jruby itself but how we call it seems reasonable to me)
me has joined #jruby
me is now known as Guest23490
Osho has joined #jruby
<rdubya>
enebo: sorry, just trying to catch up, are you saying we should fix the issue in arjdbc or that it is something that needs to be fixed in jruby internals?
<enebo>
rdubya: well it is possible I fix it in JRuby itself so it no longer has this limitation but for all current jruby versions this calling convention does not work for strftime itself (maybe other methods) so I think we make a millis and nanos change to arjdbc
<enebo>
rdubya: from a basic logic perspective I think we just make a millis value and a nanos value but I will try and verify what JRuby really expects here
<enebo>
so my first sentence may be unclear. We should change ARJDBC for sure since not everyone will immediately use 9.1.17 but that if I can fix it on JRuby itself I will also correct it there as well (so older uses of RubyTime will work)
<enebo>
I should almost go back and see when this did work in the past
<enebo>
strftime itself came into existence as Java code in 2013 so unless setNano logic in RubyTime affected DateTime instance this probably never worked for strftime
<enebo>
setNTime is 2012 so unless constructor does it?
<rdubya>
enebo: sounds good
<enebo>
yeah constructor is even older I highly doubt strftime ever worked but it appears pretty much everything else does
<enebo>
but this is wacky on jruby side
<enebo>
we have two forms of constructors and one does not take nsec at all and does not muck with nsec as a field
<enebo>
Logically how ARJDBC sets this up makes sense to me though. We set a 10^9 value of nanoseconds; milliseconds in DateTime should just be the first 3 of those digits
<enebo>
so if we set 0 and a 10^9 should we be setting millis in DateTime also? Should we always set the nanos based on DateTime if we do not pass in a nanos
<enebo>
On one hand I think milliseconds maybe is important in a DateTime? Some weird timezone or something? Don't know but having it be 0 when it is not 0 by itself seems weird. But when passed with nanos it seems like nanos could be more relevant number.
<enebo>
rdubya: I think my PR is silly btw. We can calculate nanos as it was but then just calculate millis by dividing by 10^6
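The split enebo describes is just integer division and remainder; a minimal sketch (variable names are mine, not taken from the ARJDBC code):

```ruby
# Split a sub-second value given in nanoseconds (0..999_999_999)
# into its millisecond part and the leftover nanoseconds.
nanos = 123_456_789
millis      = nanos / 1_000_000  # => 123
extra_nanos = nanos % 1_000_000  # => 456789
```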
<enebo>
rdubya: are you willing to give that a quick shot for your tests?
<rdubya>
enebo: yeah I'm trying to wrap some stuff up here but I'll try to get to it
<enebo>
rdubya: ok
shellac has quit [Quit: Computer has gone to sleep.]
<enebo>
kares: you advocated moving 9.2 back to java.util.Date right?
<enebo>
kares: it may be late but we are mandating Java 8 so it does look very attractive. It solves this nanosecond thing so long as it is as accurate for historical dates as joda is
olle has quit [Quit: olle]
olle has joined #jruby
olle has quit [Client Quit]
<rdubya>
enebo: I had to tweak it a little bit (start + 5 => start + 4), but it looks like that fixes it
<enebo>
rdubya: oh ok weird after I made that I figured that would not work because nanos is just a 10^6 number
<enebo>
rdubya: so semantically in JRuby's constructor I am confused by what nanos even means now...everything beyond ms perhaps?
<enebo>
rdubya: I would double check the fields on that time object as well as strftime (like asking for milliseconds and nsecs)
<rdubya>
yeah i verified them
<rdubya>
actually is there a good way to get milliseconds? I see usec and nsec but don't see a method for milliseconds…
<enebo>
rdubya: I don't know offhand
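For the record, plain Ruby has no dedicated milliseconds accessor on Time, but it is easy to derive from `nsec`, and `strftime` has a `%L` directive for it (the `Rational` value below is my own example):

```ruby
# An exact sub-second value: 123,456,789 ns past the epoch.
t = Time.at(Rational(123_456_789, 1_000_000_000))

t.nsec              # => 123456789
t.nsec / 1_000_000  # => 123     milliseconds, derived
t.strftime("%L")    # => "123"   strftime's millisecond directive
```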
<enebo>
long milliseconds = nanoseconds / 1000000;
<enebo>
long extraNanoseconds = nanoseconds % 1000000;
<enebo>
return RubyTime.newTime(runtime, new DateTime(milliseconds), extraNanoseconds);
<enebo>
This is how we handle a sole long value for making a Time instance
<enebo>
so I guess that method implies nanos is not full nanos but just fraction of nanos after the millis
<enebo>
I will at a minimum add some comments around the fields as to what nsec means
<enebo>
We have a really confusing type with no documentation explaining what we expect here
<rdubya>
I'll run the full suite but it looks like its generating it correctly now
<enebo>
(times/dates are crazy sauce in general -- not necessarily our time/date stuff :) )
<enebo>
rdubya: sweetness
<rdubya>
kares: it looks like your change for the bytea stuff was correct, it must have just uncovered that those two other tests should have been breaking but weren't
<rdubya>
and it gives me a path to consider for debugging those ones, so thanks for pointing me in that direction
<kares>
well I asked, the truffle guys mentioned it should be easy
<kares>
but I do not think so
<enebo>
kares: ok. yeah I sort of assumed that but I was not sure if something changed with Java Date since we abandoned it last and I think I misattributed it to you
<kares>
however the thing I will implement is Date/Time/DateTime toJava conversion to the new Java date-time types
<kares>
enebo: with java.util.Date?
<enebo>
kares: yeah
<kares>
you mean the nanos thing
<kares>
?
<kares>
Date does not have either
<kares>
am really out of context
<enebo>
kares: heh
<kares>
and its a late friday night over here
<enebo>
kares: I think nanos is in Java 8 date
<kares>
RubyTime is fine - the nanopart is confusing but works
<kares>
well
<kares>
it just needs some Java API for users
<kares>
but that is beyond the current discussion here
<enebo>
kares: yeah I am happy to not move away from joda as we have it working but I just remembered some older conversation wrong
<enebo>
kares: but the reason I even remembered it was that Java 8 has nano resolution now
<enebo>
kares: so ignore all that talk
<enebo>
I am not advocating anything
<kares>
btw. JODA APIs are I believe 'better' than what the JDK did
<kares>
more flexible for sure
<kares>
+ a common DateTime type which you do not have in 8
<kares>
you have LocalDateTime etc
<kares>
it certainly isn't a direct port - so far as I looked (haven't been using the 8 stuff much myself)
<enebo>
kares: if only you could just drop in a newer version and have it just work...but maybe that is true now. You did it recently
<kares>
yy it seems fine so far
<kares>
but a release will really tell
<enebo>
historically something seemed to always break on each joda update. They probably have finished breaking changes
<enebo>
and I guess data is another issue
<enebo>
we will find out some babylonian date bug in new version or something :P
xardion has quit [Remote host closed the connection]
<kares>
its very stable - went through the changelogs
<kares>
mostly adapting to newer TZ rules
<kares>
though I'm still not sure whether JRuby needs to regenerate
<kares>
... I mean that piece of the pom project that JRuby has
<GitHub187>
jruby/master fbfc388 Marcin Mielzynski: untag String#test_casecmp? and Symbol#test_casecmp?
sidx64 has joined #jruby
sidx64_ has joined #jruby
sidx64 has quit [Ping timeout: 240 seconds]
<enebo>
lopex: I should not have even asked the question about performance. It just makes sense that since we scanned the bytes in a symbol we can leverage CR
<lopex>
enebo: like :foo.upcase
<lopex>
it doesnt rescan but goes slow path
<lopex>
well, like mri
<enebo>
lopex: yeah my original question was mostly does that happen but it is not a fair question since I cannot think of when that would happen either (but it might)
<lopex>
enebo: but can a symbol end up utf-8 with unscanned 7 bit cr ?
<enebo>
no I doubt it
<lopex>
I mean, a string from symbol
<enebo>
I think any string from a symbol which is 7bit is almost guaranteed to be 7bit clean
<enebo>
so only mbc will end up utf-8 generally
<enebo>
unless it is a non-ascii supporting encoding
<lopex>
enebo: since there's no force_encoding on symbol
<enebo>
yeah if it can be ascii from any ascii encoding it is just made into ascii encoding
<lopex>
which invalidates cr
<enebo>
err any 7bit ascii will be US-ASCII
<lopex>
enebo: but still, it omits 7 bit cr specializations
<enebo>
as a symbol even if it comes from UTF-8
<lopex>
er
<enebo>
if it is US-ASCII we can probably just mark CR as valid and 7bit
<enebo>
err 7bit
<enebo>
lopex: another thing we could do is maybe not use CR but merely length in bytes
<enebo>
I guess it will only try to casemap ascii chars in utf8?
<lopex>
enebo: jcodings has casemap ascii only flag
<enebo>
I don't quite get it though. I mean to be utf-8 I think means it must contain non-7bit chars
<enebo>
they just ignore them here don't they?
<enebo>
lopex: just seems like a weird feature
<enebo>
lopex: so this is their fast path for string
<lopex>
enebo: and only for upcase/downcase
<enebo>
ok
<enebo>
yeah I see this is an option the user asks for
<lopex>
enebo: we can do for swap/cap too
<enebo>
perhaps this is super common with strings where anything non-ascii is not cased data
<enebo>
so if that flag is set it will use any single byte encoding or utf-8 OR any 7bit which is not turkish
<lopex>
enebo: well, the spec is for upcase/downcase
<lopex>
:ascii is supported for all of those
<enebo>
lopex: and the bug you are working through is this fast path is ignored because we make a symbol into a string which is 7bit but we do not know that because we lost cr
<lopex>
enebo: that bug was fixed in jcodings some time ago
<lopex>
just havent been updated
<enebo>
lopex: even without passing in CR for new string we can just set it to 7BIT if it is US-ASCII right?
<lopex>
enebo: but it seems because we lost cr
<enebo>
because symbols must be valid CR
<enebo>
symbols cannot be binary data
<enebo>
just adding 7BIT would eliminate 99.999% of all slow paths just doing that tweak
<enebo>
without having to change all symbol creation paths to accept or calc CR
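The invariant enebo is leaning on is observable from Ruby itself: an ASCII-only symbol is already US-ASCII, so a string made from it is 7-bit clean by construction, while only symbols with multi-byte characters carry UTF-8:

```ruby
:foo.encoding          # => #<Encoding:US-ASCII>
:foo.to_s.ascii_only?  # => true  (7-bit clean by construction)
:"żółć".encoding       # => #<Encoding:UTF-8>
```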
<headius>
good afternoon
<lopex>
enebo: well I thought newShared could
<lopex>
enebo: because it's the most frequent producer of such strings
<lopex>
enebo: not that I'm saying it's worth the penny
<enebo>
lopex: but you saw my point about just marking 7bit if bytelist is US-ASCII?
<lopex>
enebo: otoh newShared is used on vast majority of symbol methods
<lopex>
enebo: yeah
<enebo>
lopex: I think that eliminates all common uses of symbols
<lopex>
enebo: still in newShared ?
<enebo>
maybe
<lopex>
enebo: this way you dont have to add cr there
<enebo>
lopex: it could be done in to_s(context) too
<lopex>
enebo: I'm saying about return newSymbol(runtime, newShared(runtime).downcase(context).getByteList());
<enebo>
ah
<lopex>
lots of those
<enebo>
newShared may be best place then
<lopex>
headius: looks like test/mri/ruby/enc isnt run at all ?
<enebo>
lopex: I did see that to_s calls RubyString.newStringShared and not newShared
<enebo>
lopex: so you may want to audit to make sure all symbol to string conversions call through that
<enebo>
heh
<enebo>
only to_s(runtime) does that
<enebo>
everything else seems to use newShared
<lopex>
which is RubyString.newStringShared(runtime, symbolBytes);
<lopex>
hmm
<lopex>
enebo: what if an encoding instance had a cr field?
<lopex>
and 7bit for single-byte encodings
<lopex>
enebo: we wouldn't have to have any conditions for initing cr
<lopex>
and unknown for utf-8 for example
<lopex>
it would simplify quite a few bits there
<enebo>
lopex: so if I make something which is mutable then its encoding might change
<enebo>
lopex: but I guess so would the CR
<enebo>
if we did not have this
<lopex>
yeah
<enebo>
It would somewhat mean that if I did Encoding theEncoding = someStringIwillMutate.getEncoding(); it may not be valid later in the method
<lopex>
er, that's the question - what does it change?
<lopex>
enebo: just for initial setting the cr
<enebo>
well it will be valid for what think of today as encoding but we would have to know not to trust the cr status
<enebo>
so we can look at it initially but not later?
<lopex>
dunno
<lopex>
maybe we could
<enebo>
yeah something "feels" wrong :)
<lopex>
but what ?
<enebo>
we store encoding as temp local a lot
<lopex>
it might not matter
<lopex>
that's why I'm saying for init
<enebo>
if we ask for CR from it or even think we can after a particular point then we will get a bad answer
<enebo>
lopex: but if I don't know the codebase how do I know that?
<enebo>
I mean RubyString and RubySymbol will have cr field/methods but if you are just jumping around through source code you will see Encoding has it too
<lopex>
not if encoding wont change
<enebo>
yeah true
<lopex>
and changing enc invalidates cr anyways
<enebo>
I don't mean changing enc though
<enebo>
I mean if I have 7bit CR on utf-8 encoding instance for a String and I add an mbc then I need to at a minimum change the encoding right?
<enebo>
if we want encoding to match what the string has set as a cr field
<headius>
lopex: maybe not...I know I ran them when implementing the rest of transcoding though
<enebo>
so cr is advisory coming in, and we replace the encoding in the constructor with a non-cr-holding encoding?
<enebo>
lopex: perhaps that was what you meant
<lopex>
enebo: another question
<lopex>
if we force_encoding(:ascii-8bit)
<lopex>
can we say cr is valid anyways ?
<enebo>
I don't know :)
<enebo>
probably valid
<enebo>
it is binary at that point
<enebo>
we cannot walk it wrong at that point
<enebo>
I don't know if valid means that specifically though
<enebo>
perhaps nirvdrum can tell me :)
<lopex>
and force_encoding("iso-8859-1") ?
<lopex>
s = "ą";s.force_encoding("iso-8859-1");p s.length
<enebo>
seems reasonable it will print >1
<lopex>
now it doesnt matter that there's no 7bit
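Running lopex's example confirms enebo's guess: "ą" is one character but two bytes in UTF-8, and reinterpreting those bytes as single-byte ISO-8859-1 makes the length 2:

```ruby
s = "ą".dup  # dup in case of frozen string literals
s.bytesize   # => 2  (UTF-8: 0xC4 0x85)
s.length     # => 1  one multi-byte character

s.force_encoding("ISO-8859-1")
s.length     # => 2  each byte is now its own character
```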
<enebo>
lopex: yeah but why are you asking?
<enebo>
I look at both of those force_encodings as basically saying, fuck it I don't care what the date is...it will be a stream of bytes now
<enebo>
s/date/data
<enebo>
too much ardjbc debugging :)
<lopex>
enebo: and then there's tons of specializations for encodings based on their instances
<lopex>
and other properties like max length
<enebo>
well I think passing CR as a subtype to the actual encoding is just a hack to get around needing to change signatures. I am not sure if we are talking about that now though
<lopex>
if cr had more priority then maybe it could be simplified?
<lopex>
enebo: well, it would be "the lowest possible cr"
<enebo>
lopex: if we talk about solving this we should not even be using CR as bit flags
<lopex>
yeah, that's a different story
<enebo>
I think putting CR into Encoding would make it harder to remove
<enebo>
Same for bytelist for that matter
<enebo>
If I had my way since we almost always scan bytelists I would add a charlength field
<enebo>
if not scanned it would be -1 or something like that. If valid it would just be number of chars. If invalid it would be another special number -2
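The three code-range states being discussed (7-bit, valid, broken - my mapping to JRuby's CR flag names is an assumption) correspond to things observable from Ruby; a char-length field as enebo sketches would effectively cache what String#length has to compute by scanning:

```ruby
"abc".ascii_only?                                # => true   7-bit clean
"ą".valid_encoding?                              # => true   valid, but not 7-bit
"\xFF".b.force_encoding("UTF-8").valid_encoding? # => false  broken byte sequence
```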