<headius> did I break the build?
<headius> oops, I did!
<headius> hooray!
<GitHub22> [jruby] headius pushed 1 new commit to jruby-9.1: https://git.io/vxpbr
<GitHub22> jruby/jruby-9.1 f25b8be Charles Oliver Nutter: Delete unused method that no longer has a path.
<GitHub176> [jruby] headius pushed 1 new commit to master: https://git.io/vxpbi
<GitHub176> jruby/master bb8e0ad Charles Oliver Nutter: Merge branch 'jruby-9.1'
<travis-ci> jruby/jruby (jruby-9.1:f25b8be by Charles Oliver Nutter): The build passed. (https://travis-ci.org/jruby/jruby/builds/365882593)
drbobbeaty has quit [Ping timeout: 245 seconds]
<headius> zing
bga57 has quit [Ping timeout: 264 seconds]
bga57 has joined #jruby
akp has joined #jruby
oxddmr has joined #jruby
oxddmr has quit [Remote host closed the connection]
niKorigan has joined #jruby
niKorigan has quit [Remote host closed the connection]
eschwartz737C7X has joined #jruby
sidx64 has joined #jruby
eschwartz737C7X has quit [Remote host closed the connection]
james41382288YQX has joined #jruby
james41382288YQX has quit [Remote host closed the connection]
BizarreFruit has joined #jruby
<BizarreFruit> Good morning!
<BizarreFruit> Is anyone actually up yet? :)
BizarreFruit has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
<GitHub199> [jruby] perlun opened pull request #5140: spec/ruby: Bring in some changes from upstream (master...spec/update-with-sigterm-specs) https://git.io/vxhLb
sidx64 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
sidx64 has joined #jruby
olle has joined #jruby
sidx64 has quit [Read error: Connection reset by peer]
sidx64 has joined #jruby
shellac has joined #jruby
shellac has quit [Quit: Computer has gone to sleep.]
claudiuinberlin has joined #jruby
shellac has joined #jruby
olle has quit [Quit: olle]
olle has joined #jruby
olle has quit [Quit: olle]
<kares> enebo: rdubya: well go for it but that is just weird - was under the impression that millis/nanos worked properly at some point
drbobbeaty has joined #jruby
olle has joined #jruby
Osho has quit [Ping timeout: 260 seconds]
me- has quit [Ping timeout: 264 seconds]
Osho has joined #jruby
me has joined #jruby
me is now known as Guest1483
shellac has quit [Quit: Computer has gone to sleep.]
shellac has joined #jruby
<kares> rdubya: so I have looked into your complaint about binary - that is definitely not a recent thing
<kares> it was simply hiding due to tests - also the CI results somehow seem unreliable in this sense ;(
<kares> which is almost 2 months old and it is still failing
<kares> it will probably go even further down the road, the problem is that I'm getting the max_identifier_length problem with these older commits
<kares> so its not easy to get down to - but I have a feeling like this might have been the same thing as with the binary 'in-house' test suite failure
<kares> ... so really someone needs to chase it down :)
<rdubya> kares: ah, I thought we were good with binary data as long as prepared statements were disabled previously
<kares> its true that I am 'unemployed' for a few months already :) but I need to focus on stuff I care for the moment
<kares> rdubya: hey! seems not
<rdubya> were those tests being skipped or something?
<kares> maybe its due to the AR version upgrade ... who knows
<kares> they were not but they're failing for me now as I check out older commits
<kares> + the CI outcome I am not reproducing ;( so this is not something I can help out with quickly ... it seems
<kares> (I mean the older CI outcome - it's failing for me always)
<rdubya> with prepared statements off? I know they've always failed with them on (sorry if I'm being dense, I just don't remember seeing them fail previously)
<kares> even on that old commit of yours I posted above
<kares> hmm yes I had an env set
<kares> will double check
<kares> its on by default for PG right?
<rdubya> ok thanks, guess if its all the same problem I can try digging in again
<rdubya> yeah its on by default
<kares> oh well you're right - was checking the wrong thing ;(
<kares> used PS=false
<kares> but that wasn't supported all the way back with the Rails suite
<kares> so I guess I'll redo a few tests again
<rdubya> ok thanks for checking it out
<kares> its looking better
<kares> it seems to be pointing to the ByteaUtils update
<kares> so I guess revert... ;(
<kares> that will bring up some new failures on Rails' suite
<kares> that directly test escaping
<kares> feel free to revert that commit and push to 50-stable
<rdubya> ok, thanks for tracking it down
Guest1483 has quit [Ping timeout: 246 seconds]
Osho has quit [Ping timeout: 264 seconds]
Osho has joined #jruby
me_ has joined #jruby
me_ has quit [Ping timeout: 268 seconds]
Osho has quit [Ping timeout: 268 seconds]
Osho has joined #jruby
me_ has joined #jruby
me_ has quit [Ping timeout: 240 seconds]
Osho has quit [Ping timeout: 260 seconds]
me_ has joined #jruby
Osho has joined #jruby
Osho has quit [Ping timeout: 256 seconds]
me_ has quit [Ping timeout: 256 seconds]
mee has joined #jruby
mee has quit [Client Quit]
ebarrett has quit [Quit: WeeChat 1.9.1]
<enebo> kares: rdubya: output = RubyTimeOutputFormatter.formatNumber(dt.getMillisOfSecond(), 3, '0');
ebarrett has joined #jruby
<enebo> This is what strftime does in JRuby itself
<enebo> So the way we make that will return "000" onto the front and it then formats weird
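The zero-padding behavior enebo describes can be sketched as follows. This is a hypothetical stand-in for JRuby's `RubyTimeOutputFormatter.formatNumber` (the class name comes from the pasted call above; the body here is illustrative, not JRuby's actual implementation): left-pad a number to a fixed width, which is why an unset millisecond field in the `DateTime` comes out as `"000"`.

```java
// Illustrative re-creation of the padding behavior discussed above;
// not JRuby's actual RubyTimeOutputFormatter source.
public class PadDemo {
    // Left-pads `value` to `width` digits using `pad`, as strftime's
    // %L (milliseconds) formatting does.
    static String formatNumber(long value, int width, char pad) {
        StringBuilder sb = new StringBuilder(Long.toString(value));
        while (sb.length() < width) sb.insert(0, pad);
        return sb.toString();
    }

    public static void main(String[] args) {
        // 7 ms of a second prints as "007"; if millis were never set
        // on the DateTime, the field collapses to "000".
        System.out.println(formatNumber(7, 3, '0'));
        System.out.println(formatNumber(0, 3, '0'));
    }
}
```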
<kares> yeah it felt like its being hacked around ;(
<enebo> kares: I guess since nano is not in DateTime this is a big part of the problem
<kares> enebo: well it was handled before
<kares> there's places where its set onto the RubyTime
<kares> so changing DateTimeUtils to format it did not seem right ...
<kares> but thought you would figure it out sooner or later since you know JRuby's internals :)
<enebo> kares: hmm so the formatting code is from 13 so perhaps it is createTime which changed
<kares> be careful to watch for test failures (regressions) in all the suites ...
<kares> time handling is quite tricky territory
<kares> I can probably count time spent in weeks at this point
<kares> + did some of the things twice already :)
sidx64 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<enebo> kares: well so long as I understand the actual format this is just how we call jruby itself (or I fix jruby itself but how we call it seems reasonable to me)
me has joined #jruby
me is now known as Guest23490
Osho has joined #jruby
<rdubya> enebo: sorry, just trying to catch up, are you saying we should fix the issue in arjdbc or that it is something that needs fixed in jruby internal?
<enebo> rdubya: well it is possible I fix it in JRuby itself so it no longer has this limitation but for all current jruby versions this calling convention does not work for strftime itself (maybe other methods) so I think we make a millis and nanos change to arjdbc
<enebo> Not sure if this is actually right though
<enebo> rdubya: from a basic logic perspective I think we just make a millis value and a nanos value but I will try and verify what JRuby really expects here
<enebo> so my first sentence may be unclear. We should change ARJDBC for sure since not everyone will immediately use 9.1.17 but that if I can fix it on JRuby itself I will also correct it there as well (so older uses of RubyTime will work)
<enebo> I should almost go back and see when this did work in the past
<enebo> strftime itself came into existence as Java code in 2013 so unless setNano logic in RubyTime affected DateTime instance this probably never worked for strftime
<enebo> setNTime is 2012 so unless constructor does it?
<rdubya> enebo: sounds good
<enebo> yeah constructor is even older I highly doubt strftime ever worked but it appears pretty much everything else does
<enebo> but this is wacky on jruby side
<enebo> we have two forms of constructors and one does not take nsec at all and does not muck with nsec as a field
<enebo> Logically how ARJDBC sets this up makes sense to me though. We set a 10^9 value of nanoseconds; milliseconds in DateTime should just be the first 3 of those digits
<enebo> so if we set 0 and a 10^9 should we be setting millis in DateTime also? Should we always set the nanos based on DateTime if we do not pass in a nanos
<enebo> On one hand I think milliseconds maybe is important in a DateTime? Some weird timezone or something? Don't know but having it be 0 when it is not 0 by itself seems weird. But when passed with nanos it seems like nanos could be more relevant number.
<enebo> rdubya: I think my PR is silly btw. We can calculate nanos as it was but then just calculate millis by dividing by 10^6
<enebo> rdubya: are you willing to give that a quick shot for your tests?
<rdubya> enebo: yeah I'm trying to wrap some stuff up here but I'll try to get to it
<enebo> rdubya: ok
shellac has quit [Quit: Computer has gone to sleep.]
<enebo> kares: you advocated moving 9.2 back to java.util.Date right?
<enebo> kares: it may be late but we are mandating Java 8 so it does look very attractive. It solves this nanosecond thing so long as it is as accurate for historical dates as joda is
olle has quit [Quit: olle]
olle has joined #jruby
olle has quit [Client Quit]
<rdubya> enebo: I had to tweak it a little bit (start + 5 => start + 4), but it looks like that fixes it
<enebo> rdubya: oh ok weird after I made that I figured that would not work because nanos is just a 10^6 number
<enebo> rdubya: so semantically in JRuby's constructor I am confused by what nanos even means now...everything beyond ms perhaps?
<enebo> rdubya: I would double check the fields on that time object as well as strftime (like asking for milliseconds and nsecs
<enebo> )
<rdubya> yeah i verified them
<rdubya> actually is there a good way to get milliseconds? I see usec and nsec but don't see a method for milliseconds…
<enebo> rdubya: I don't know offhand
<enebo> long milliseconds = nanoseconds / 1000000;
<enebo> long extraNanoseconds = nanoseconds % 1000000;
<enebo> return RubyTime.newTime(runtime, new DateTime(milliseconds), extraNanoseconds);
<enebo> This is how we handle a sole long value for making a Time instance
<enebo> so I guess that method implies nanos is not full nanos but just fraction of nanos after the millis
<enebo> I will at a minimum add some comments around the fields as to what nsec means
<enebo> We have a really confusing type with no documentation explaining what we expect here
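The convention implied by the pasted `RubyTime.newTime` snippet can be shown in isolation: a full nanosecond-of-second value is split into the millisecond part (which Joda's `DateTime` carries) and an "extra nanos" remainder below 10^6. This sketch only demonstrates the arithmetic, not JRuby's actual classes.

```java
// Sketch of the millis/nanos split discussed above: Joda DateTime
// holds milliseconds, and the RubyTime "nanos" field holds only the
// sub-millisecond remainder (always < 1,000,000).
public class TimeSplit {
    static long millisPart(long nanosOfSecond) {
        return nanosOfSecond / 1_000_000;
    }

    static long extraNanos(long nanosOfSecond) {
        return nanosOfSecond % 1_000_000;
    }

    public static void main(String[] args) {
        long nanos = 123_456_789L; // 0.123456789 of a second
        System.out.println(millisPart(nanos)); // millisecond part: 123
        System.out.println(extraNanos(nanos)); // remainder: 456789
    }
}
```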
<rdubya> I'll run the full suite but it looks like its generating it correctly now
<enebo> (times/dates are crazy sauce in general -- not necessarily our time/date stuff :) )
<enebo> rdubya: sweetness
<rdubya> kares: it looks like your change for the bytea stuff was correct, it must have just uncovered that those two other tests should have been breaking but weren't
<rdubya> and it gives me a path to consider for debugging those ones, so thanks for pointing me in that direction
claudiuinberlin has quit [Quit: Textual IRC Client: www.textualapp.com]
<kares> enebo: I advocated what?
<kares> moving to Java 8 date-time APIs
<kares> its a dead end - don't go there
<enebo> kares: wasn't that you?
<kares> would break a lot
<kares> well I asked, the truffle guys mentioned it should be easy
<kares> but I do not think so
<enebo> kares: ok. yeah I sort of assumed that but I was not sure if something changed with Java Date since we abandoned it last and I think I misattributed it to you
<kares> however the thing I will implement is Date/Time/DateTime toJava conversion do new Java date-time types
<kares> enebo: with java.util.Date?
<enebo> kares: yeah
<kares> you mean the nanos thing
<kares> ?
<kares> Date does not have either
<kares> am really out of context
<enebo> kares: heh
<kares> and its a late friday night over here
<enebo> kares: I think nanos is in Java 8 date
<kares> RubyTime is fine - the nanopart is confusing but works
<kares> well
<kares> it just need some Java API for users
<kares> but that is beyond the current discussion here
<enebo> kares: yeah I am happy to not move away from joda as we have it working but I just remembered some older conversation wrong
<enebo> kares: but the reason I even remembered it was that Java 8 has nano resolution now
<enebo> kares: so ignore all that talk
<enebo> I am not advocating anything
<kares> btw. JODA APIs are I believe 'better'than what JDK did
<kares> more flexible for sure
<kares> + a common DateTime type which you do not have in 8
<kares> you have LocalDateTime etc
<kares> it certainly isn't a direct port - so far as I looked (haven't been using the 8 stuff much myself)
<enebo> kares: if only you could just drop in a newer version and have it just work...but maybe that is true now. You did it recently
<kares> yy it seems fine so far
<kares> but a release will really tell
<enebo> historically something seemed to always break on each joda update. They probably have finished breaking changes
<enebo> and I guess data is another issue
<enebo> we will find out some babylonian date bug in new version or something :P
xardion has quit [Remote host closed the connection]
<kares> its very stable - went through the changelogs
<kares> mostly adapting to newer TZ rules
<kares> which is still not sure for me whether JRuby needs to regenerate
<kares> ... mean that piece of pom project that JRuby has
<enebo> ah
<enebo> well we will find out :)
xardion has joined #jruby
guilleiguaran has joined #jruby
claudiuinberlin has joined #jruby
<GitHub33> [jruby] lopex pushed 1 new commit to master: https://git.io/vxjG9
<GitHub33> jruby/master dac6261 Marcin Mielzynski: split upcase/downcase/swapcase/capitalize arities and optimize String#casecmp?
<GitHub71> [jruby] lopex pushed 1 new commit to master: https://git.io/vxjZ3
<GitHub71> jruby/master 3691861 Marcin Mielzynski: restore accidentally modified poms
subbu is now known as subbu|lunch
<GitHub137> [jruby] lopex pushed 1 new commit to master: https://git.io/vxjCC
<GitHub137> jruby/master a7b0833 Marcin Mielzynski: fix typos
subbu|lunch is now known as subbu
<lopex> enebo: it seems corerange isnt propagated in Symbol -> String
<lopex> *coderange even
<enebo> lopex: oh
<enebo> lopex: yeah that is something I had not considered
<lopex> enebo: just realized casemapping goes slow paths for it
<enebo> so perhaps we should save CR in Rubysymbol
<lopex> enebo: easily done from what I can see
<lopex> yes
<enebo> lopex: bytelist_love though uses Symbol a lot
<enebo> lopex: and in most places it is only used as a RubyString for exception messages
<enebo> lopex: so I am not sure my branch is in jeopardy
<enebo> lopex: where have you seen this crop up?
<enebo> lopex: I mean have you see a common case where we make a symbol into a string and then hit the slow path for something
<lopex> enebo: :foo.upcase had a bug in jcodings jruby currently uses
<lopex> enebo: it went us-ascii.caseMap
<lopex> enebo: and not a specialized version was run
<enebo> lopex: heh. but that should have worked
<enebo> lopex: so you just found a bug and happened to notice that was slow
<lopex> enebo: yes
<enebo> lopex: I am not against the idea of having CR in RubySymbol but I am wondering if this is a real performance issue at the same time
<enebo> lopex: I mean it obviously can be a performance problem but is it in practice
<lopex> enebo: hmm
<enebo> lopex: another thing of interest is most symbols are tiny so I also wonder how much faster the fast path is
<lopex> enebo: like "ąsss".foo
<lopex> it will go the slowest path possible
<lopex> er
<lopex> wait
<enebo> heh
<enebo> lopex: afk for about 15 minutes I need to buy coffee beans
<lopex> kk
<GitHub30> [jruby] lopex pushed 2 new commits to master: https://git.io/vxjRY
<GitHub30> jruby/master de4e6cd Marcin Mielzynski: Merge branch 'master' of https://github.com/jruby/jruby
<GitHub30> jruby/master f20c4e6 Marcin Mielzynski: update jcodings
<GitHub187> [jruby] lopex pushed 1 new commit to master: https://git.io/vxjRV
<GitHub187> jruby/master fbfc388 Marcin Mielzynski: untag String#test_casecmp? and Symbol#test_casecmp?
sidx64 has joined #jruby
sidx64_ has joined #jruby
sidx64 has quit [Ping timeout: 240 seconds]
<enebo> lopex: I should not have even asked the question about performance. It just makes sense that since we scanned the bytes in a symbol we can leverage CR
<lopex> enebo: like :foo.upcase
<lopex> it doesnt rescan but goes slow path
<lopex> well, like mri
<enebo> lopex: yeah my original question was mostly does that happen but it is not a fair question since I cannot think of when that would happen either (but it might)
<lopex> enebo: but can a symbol end up utf-8 with unscanned 7 bit cr ?
<enebo> no I doubt it
<lopex> I mean, a string from symbol
<enebo> I think any string from a symbol which is 7bit is almost guaranteed to be 7bit clean
<enebo> so only mbc will end up utf-8 generally
<enebo> unless it is a non-ascii supporting encoding
<lopex> enebo: since there's no force_encoding on symbol
<enebo> yeah if it can be ascii from any ascii encoding it is just made into ascii encoding
<lopex> which invalidates cr
<enebo> err any 7bit ascii will be US-ASCII
<lopex> enebo: but still, it omits 7 bit cr specializations
<enebo> as a symbol even if it comes from UTF-8
<lopex> er
<enebo> if it is US-ASCII we can probably just mark CR as valid and 7bit
<enebo> err 7bit
<enebo> lopex: another thing we could do is maybe not use CR but merely length in bytes
<lopex> it's bug ?
<enebo> symbol.bytes == length && ascii is 7bit
<lopex> why not 7bit for both ?
<enebo> yeah good question
<lopex> plus ascii compatible
<enebo> maybe for non-scanned sequences?
<lopex> and why care about utf8?
<enebo> I don't get this if stmt though
<enebo> yeah how can they casemap utf8
<enebo> I guess it will only try to casemap ascii chars in utf8?
<lopex> enebo: jcodings has casemap ascii only flag
<enebo> I don't quite get it though. I mean to be utf-8 I think means it must contain non-7bit chars
<enebo> they just ignore them here don't they?
<enebo> lopex: just seems like a weird feature
<enebo> lopex: so this is their fast path for string
<lopex> enebo: and only for upcase/downcase
<enebo> ok
<enebo> yeah I see this is an option the user asks for
<lopex> enebo: we can do for swap/cap too
<enebo> perhaps this is super common with strings where anything non-ascii is not cased data
<enebo> so if that flag is set it will use any single byte encoding or utf-8 OR any 7bit which is not turkish
<lopex> enebo: well, the spec is for upcase/downcase
<lopex> :ascii is supported for all of those
<enebo> lopex: and the bug you are working through is this fast path is ignored because we make a symbol into a string which is 7bit but we do not know that because we lost cr
<lopex> enebo: that bug was fixed in jcodings some time ago
<lopex> just havent been updated
<enebo> lopex: even without passing in CR for new string we can just set it to 7BIT if it is US-ASCII right?
<lopex> enebo: but it seems because we lost cr
<enebo> because symbols must be valid CR
<enebo> symbols cannot be binary data
<enebo> just adding 7BIT would eliminate 99.999% of all slow paths just doing that tweak
<enebo> without having to change all symbol creation paths to accept or calc CR
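The tweak enebo proposes can be sketched as a small decision rule. Everything here is hypothetical and illustrative (the constant values and method names are not JRuby's actual API): when a symbol's bytes are converted to a string, an encoding of US-ASCII is enough to mark the result 7-bit clean, since symbols cannot hold broken data.

```java
// Hypothetical sketch of the discussed tweak: derive an initial code
// range for a symbol-to-string conversion from the encoding alone.
// Constant values and names are illustrative, not JRuby's real ones.
public class SymbolCr {
    static final int CR_UNKNOWN = 0; // not yet scanned
    static final int CR_7BIT    = 1; // every byte is 7-bit clean

    // Symbols are always valid, so US-ASCII implies 7-bit content;
    // anything else stays unknown until scanned.
    static int initialCodeRange(String encodingName) {
        return "US-ASCII".equals(encodingName) ? CR_7BIT : CR_UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(initialCodeRange("US-ASCII")); // 1 (7-bit)
        System.out.println(initialCodeRange("UTF-8"));    // 0 (unknown)
    }
}
```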
<headius> good afternoon
<lopex> enebo: well I thought newShared could
<lopex> enebo: because it's the most frequent producer of such strings
<lopex> enebo: not that I'm saying it's worth the penny
<enebo> lopex: but you saw my point about just marking 7bit if bytelist is US-ASCII?
<lopex> enebo: otoh newShared is used on vast majority of symbol methods
<lopex> enebo: yeah
<enebo> lopex: I think that eliminates all common uses of symbols
<lopex> enebo: still in newShared ?
<enebo> maybe
<lopex> enebo: this way you dont have to add cr there
<enebo> lopex: it could be done in to_s(context) too
<lopex> enebo: I'm saying about return newSymbol(runtime, newShared(runtime).downcase(context).getByteList());
<enebo> ah
<lopex> lots of those
<enebo> newShared may be best place then
<lopex> headius: looks like test/mri/ruby/enc isnt run at all ?
<enebo> lopex: I did see that to_s calls RubyString.newStringShared and not newShared
<enebo> lopex: so you may want to audit to make sure all symbol to string conversions call through that
<enebo> heh
<enebo> only to_s(runtime) does that
<enebo> everything else seems to use newShared
<lopex> which is RubyString.newStringShared(runtime, symbolBytes);
<lopex> hmm
<lopex> enebo: what if the encoding instance had a cr field ?
<lopex> and 7 bit for singlebyte encodings
<lopex> enebo: we wouldnt have to have anyconditions for initing cr
<lopex> and unknown for utf-8 for example
<lopex> it would simplify quite a few bits there
<enebo> lopex: so if I make something which is mutable then its encoding might change
<enebo> lopex: but I guess so would the CR
<enebo> if we did not have this
<lopex> yeah
<enebo> It would somewhat mean if I did Encoding theEncoding = someStringIwillMutate.getEncoding(); may not be valid later in the method
<lopex> er, that's a question what does it change
<lopex> enebo: just for initial setting the cr
<enebo> well it will be valid for what think of today as encoding but we would have to know not to trust the cr status
<enebo> so we can look at it initially but not later?
<lopex> dunno
<lopex> maybe we could
<enebo> yeah something "feels" wrong :)
<lopex> but what ?
<enebo> we store encoding as temp local a lot
<lopex> it might not matter
<lopex> that's why I'm saying for init
<enebo> if we ask for CR from it or even think we can after a particular point then we will get a bad answer
<enebo> lopex: but if I don't know the codebase how do I know that?
<enebo> I mean RubyString and RubySymbol will have cr field/methods but if you are just jumping around through source code you will see Encoding has it too
<lopex> not if encoding wont change
<enebo> yeah true
<lopex> and changing enc invalidates cr anyways
<enebo> I don't mean changing enc though
<enebo> I mean if I have 7bit CR on utf-8 encoding instance for a String and I add an mbc then I need to at a minimum change the encoding right?
<enebo> if we want encoding to match what the string has set as a cr field
<headius> lopex: maybe not...I know I ran them when implementing the rest of transcoding though
<enebo> so cr as advisory coming in and we replace the encoding in constructor with non cr holding encoding?
<enebo> lopex: perhaps that was what you meant
<lopex> enebo: another question
<lopex> if we force_encoding(:ascii-8bit)
<lopex> can we say cr is valid anyways ?
<enebo> I don't know :)
<enebo> probably valid
<enebo> it is binary at that point
<enebo> we cannot walk it wrong at that point
<enebo> I don't know if valid means that specifically though
<enebo> perhaps nirvdrum can tell me :)
<lopex> and force_encoding("iso-8859-1") ?
<lopex> s = "ą";s.force_encoding("iso-8859-1");p s.length
<enebo> seems reasonable it will print >1
<lopex> now it doesnt matter that there's no 7bit
<enebo> lopex: yeah but why are you asking?
<enebo> I look at both of those force_encodings as basically saying, fuck it I don't care what the date is...it will be a stream of bytes now
<enebo> s/date/data
<enebo> too much ardjbc debugging :)
<lopex> enebo: and then there's tons of specializations for encodings based on their instances
<lopex> and other properties like max length
<enebo> well I think passing CR as subtype to actual encoding is just a hack to get around needing to change signatures. I am not sure if are talking about that now though
<lopex> if cr would have more priority then maybe it could be simplified ?
<lopex> enebo: well, it would be "the lowest possible cr"
<enebo> lopex: if we talk about solving this we should not even be using CR as bit flags
<lopex> yeah, that's a different story
<enebo> I think putting CR into Encoding would make it harder to remove
<enebo> Same for bytelist for that matter
<enebo> If I had my way since we almost always scan bytelists I would add a charlength field
<enebo> if not scanned it would be -1 or something like that. If valid it would just be number of chars. If invalid it would be another special number -2
<lopex> yeah, we discussed that I think
<enebo> CR_VALID != -2, CR_UNKNOWN == -1, CR_7BIT (realSize == characterLength)
<enebo> yeah but getting back to original problem this would be in the bytelist
<enebo> so we would not need to pass in CR since we could derive it entirely from the bytelist
<enebo> but it would obviously have other significant benefits like immediately know it's length
<enebo> main downside would be more memory
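The charLength-encodes-code-range idea sketched above can be made concrete. This is a hypothetical scheme under the assumptions in the conversation (sentinel values and names are illustrative): negative sentinels mark unscanned or broken content, any non-negative value is a valid character count, and 7-bit follows for free whenever the character count equals the byte count.

```java
// Hypothetical single-field code range encoding, per the discussion:
// charLength == -1 means not yet scanned, -2 means broken bytes, and
// any value >= 0 is a valid character count. 7-bit is implied when
// charLength == realSize (every character occupies one byte).
public class CharLen {
    static final int UNKNOWN = -1;
    static final int BROKEN  = -2;

    static boolean isValid(int charLength) {
        return charLength >= 0;
    }

    static boolean is7Bit(int charLength, int realSize) {
        return charLength == realSize;
    }

    public static void main(String[] args) {
        System.out.println(isValid(5));       // valid, 5 characters
        System.out.println(isValid(UNKNOWN)); // not scanned yet
        System.out.println(is7Bit(5, 5));     // pure ASCII: chars == bytes
        System.out.println(is7Bit(3, 6));     // multibyte: chars < bytes
    }
}
```

A side benefit mentioned above falls out directly: `String#length` becomes a field read whenever the bytes have already been scanned.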
<nirvdrum> Storing char length makes a lot of sense.
<enebo> nirvdrum: I know you have thought about this a lot
<enebo> nirvdrum: other than memory is there really any downside over this bit stuff
<nirvdrum> I haven't put much thought into using special length values for code range encoding. We actually use an enum right now.
<nirvdrum> But I wouldn't mind getting rid of another field.
<enebo> I like the idea of string.length just returning immediately as well
<nirvdrum> Your definition of CR_7BIT doesn't work for ASCII-8BIT though.
<lopex> the best thing is cr/length invalidation in one shot
<enebo> unless it is some big creation from IO where we need to walk it but we would do that now
<nirvdrum> lopex: You could also get rid of that packed long you use now to encode CR and length.
<lopex> that's the point
<lopex> and more atomicity :P
<enebo> nirvdrum: ah well ASCII-8BIT can just be -2 right?
<nirvdrum> I think there might be one of the IBM encodings that's byte-wide and ASCII-compatible as well.
<enebo> realSize will still == length
<nirvdrum> ASCII-8BIT is always CR_7BIT or CR_VALID
<enebo> nirvdrum: so scan still has to happen so if not 7bit it is set to -2
<enebo> otherwise it is characterLength = realSize
<enebo> but in whether VALID or 7BIT realSize will be length
<nirvdrum> You wrote CR_VALID != -2
<nirvdrum> Maybe I'm confused.
<enebo> oh yeah I wrote that wrong :)
<enebo> so perhaps we have one more negative number
<enebo> or flip that but we have a lot of negative numbers
<nirvdrum> That works for ASCII-8BIT. But where would you store the char length for CR_VALID UTF-8?
<lopex> enebo: you have one more number since there's no -0 :P
<nirvdrum> This whole thing is kinda dumb. CR_BROKEN strings just shouldn't be allowed to propagate through the system.
<enebo> characterLength is validLength if positive
<lopex> er, less
<enebo> so non-valid is a special designator
<enebo> not known yet as another
<enebo> I guess we need to still know valid vs 7bit on 8bit ascii/raw
<nirvdrum> Don't you store CR in the object header?
<lopex> in flags
<enebo> nirvdrum: currently yeah
<enebo> at least in what I think of as part of ruby object header :)
<nirvdrum> So whether you encode CR in the length or not, you're still going to have to eat the cost of allocating a new int for charLength.
<enebo> nirvdrum: yep
<nirvdrum> I'd recommend starting by caching charLength and leave CR alone for the time being.
<nirvdrum> You're going to have to deal with essentially a new type of invalidation even without CoW.
<enebo> nirvdrum: this whole conversation started from realizing symbols made into strings lose their ability to know the CR
<nirvdrum> E.g., taking a substring of a ByteList means a new byte scan.
<enebo> nirvdrum: maybe anyways
<nirvdrum> Why would they lose that ability?
<enebo> we only pass in the bytelist when we make the string (or we make with a jlString)
<enebo> err make the symbol
<enebo> we do not pass CR in as part of that even if we already know it
<nirvdrum> Ahh.
<nirvdrum> You guys need to decide what ByteList should be :-)
<lopex> nirvdrum: and then intermediate string on :foo.upcase didnt have a cr
<nirvdrum> It's a nifty class for working with bytes, but since it already tracks an Encoding, it's really just the JRuby string representation.
<lopex> and upcase went slow path
<nirvdrum> Storing the CR in there makes sense to me.
<enebo> nirvdrum: I agree personally
<nirvdrum> But then you basically have mutable ropes (oxymoron, yes).
<enebo> Actually I suspect we all agree but history
<lopex> exaclty
<enebo> nirvdrum: obviously our other big issue with bytelist is encapsulation :P
<headius> I've tried several times to migrate CR and encoding logic into ByteList...it's not easy to do incrementally
<headius> I have a branch that at least makes CharSequence methods do proper encoding logic (toString, charAt, etc)
<enebo> are lambdas good enough to fix our issues?
<enebo> for encapsulation
<lopex> enebo: bytelist should be private github project :P
<enebo> headius: oh btw did you happen to notice subSequence was broken in bytelist 1.x
<headius> yes, I fixed that as well
<headius> kinda had to
<nirvdrum> I had migrated everything over to that CodeRangeable interface before we forked away.
<headius> but I haven't returned to that work
<nirvdrum> Maybe that would make things easier?
<enebo> yeah it is broken in arjdbc now
<headius> there are many methods in ByteList that are broken that we leave there because of legacy
<enebo> or a single usage of it is
<headius> but we need to make a clean break at some point
<nirvdrum> ByteList 2.0?
<enebo> yeah knowing it is 2.0 is fine since it is an internal dep
<enebo> I nearly have all bytelist.toString() killed in bytelist_love
<headius> yeah 2.0
<headius> i pushed a snapshot and a JRuby branch to see how it affected things but have not gotten back to it
<headius> pretty swamped right now
<GitHub168> [jruby] lopex pushed 1 new commit to master: https://git.io/vxjV5
<GitHub168> jruby/master e85f1f7 Marcin Mielzynski: update joni
claudiuinberlin has quit [Quit: Textual IRC Client: www.textualapp.com]
sidx64_ has quit [Ping timeout: 264 seconds]
hbautista has joined #jruby
hbautista has quit [Remote host closed the connection]
drbobbeaty has quit [Ping timeout: 240 seconds]