#jruby on 2017-04-04 — irc logs at freenode.irclog.whitequark.org

2017-03-06 17:31 ChanServ changed the topic of #jruby to: Get 9.1.8.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

01:28 _whitelogger has joined #jruby

01:43 baweaver is now known as baweaver_away

01:48 baweaver_away is now known as baweaver

02:20 byteflam1 has quit [Ping timeout: 246 seconds]

02:21 byteflame has quit [Ping timeout: 260 seconds]

03:42 kenrestivo has quit [Read error: Connection reset by peer]

05:02 ankitr has joined #jruby

05:06 andrewvc_ has joined #jruby

05:06 codefinger_ has joined #jruby

05:07 electrical_ has joined #jruby

05:07 bga57 has quit [Ping timeout: 246 seconds]

05:09 Scorchin_ has joined #jruby

05:14 bga57 has joined #jruby

05:14 andrewvc has quit [*.net *.split]

05:14 Scorchin has quit [*.net *.split]

05:14 electrical has quit [*.net *.split]

05:14 zph has quit [*.net *.split]

05:14 codefinger has quit [*.net *.split]

05:14 andrewvc_ is now known as andrewvc

05:15 codefinger_ is now known as codefinger

05:16 electrical_ is now known as electrical

05:17 ankitr is now known as atm0sphere

05:18 Scorchin_ is now known as Scorchin

05:23 kenrestivo has joined #jruby

05:51 madgoat has joined #jruby

05:51 madgoat has left #jruby [#jruby]

06:02 atm0sphere has quit [Remote host closed the connection]

06:05 ankitr has joined #jruby

06:05 ankitr is now known as atm0sphere

06:25 donV has joined #jruby

07:10 thedarkone2 has quit [Quit: thedarkone2]

07:15 zph has joined #jruby

07:45 vtunka has joined #jruby

07:47 atm0sphere has quit [Remote host closed the connection]

07:51 ankitr has joined #jruby

07:52 ankitr is now known as atm0sphere

08:05 donV has quit [Quit: donV]

08:16 donV has joined #jruby

08:27 vtunka has quit [Quit: Leaving]

08:29 vtunka has joined #jruby

08:43 prasunanand has joined #jruby

09:02 drbobbeaty has joined #jruby

09:04 cschneid has quit [Ping timeout: 246 seconds]

09:07 prasunanand has quit [Ping timeout: 246 seconds]

09:17 atm0sphere has quit [Read error: Connection reset by peer]

09:20 prasunanand has joined #jruby

09:25 ankitr has joined #jruby

09:25 drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

09:33 cschneid has joined #jruby

09:38 donV has quit [Quit: donV]

09:47 prasunanand has quit [Ping timeout: 246 seconds]

09:49 donV has joined #jruby

09:53 etehtsea has joined #jruby

09:59 etehtsea has quit [Ping timeout: 240 seconds]

10:32 etehtsea has joined #jruby

10:35 prasunanand has joined #jruby

10:41 etehtsea has quit [Quit: Computer has gone to sleep.]

11:08 etehtsea has joined #jruby

11:14 etehtsea has quit [Ping timeout: 240 seconds]

11:17 drbobbeaty has joined #jruby

11:20 etehtsea has joined #jruby

11:21 etehtsea has quit [Client Quit]

11:21 bbrowning_away is now known as bbrowning

11:25 etehtsea has joined #jruby

11:43 knu has quit [Ping timeout: 260 seconds]

11:46 knu has joined #jruby

11:48 donV has quit [Quit: donV]

11:49 prasunanand has quit [Ping timeout: 246 seconds]

11:49 etehtsea has quit [Quit: Textual IRC Client: www.textualapp.com]

11:54 donV has joined #jruby

12:03 shellac has joined #jruby

12:30 ankitr has quit [Ping timeout: 256 seconds]

12:30 lance|afk is now known as lanceball

13:05 yahonda has joined #jruby

13:07 yahonda has left #jruby [#jruby]

13:10 vtunka has quit [Quit: Leaving]

13:28 vtunka has joined #jruby

13:34 donV has quit [Quit: donV]

13:35 byteflame has joined #jruby

13:35 byteflam1 has joined #jruby

13:47 donV has joined #jruby

13:51 ankitr has joined #jruby

13:58 ankitr has quit [Ping timeout: 240 seconds]

14:27 enebo has joined #jruby

14:29 vtunka has quit [Quit: Leaving]

14:32 vtunka has joined #jruby

14:59 donV_ has joined #jruby

15:00 ankitr has joined #jruby

15:01 donV_ has quit [Client Quit]

15:03 donV has quit [Ping timeout: 260 seconds]

15:03 vtunka has quit [Quit: Leaving]

15:30 camlow325 has joined #jruby

15:50 Osho has joined #jruby

16:02 camlow325 has quit [Quit: WeeChat 1.5]

16:37 ankitr has quit [Ping timeout: 260 seconds]

16:37 ankitr has joined #jruby

16:38 shellac has quit [Ping timeout: 264 seconds]

16:48 bbrowning is now known as bbrowning_away

16:49 tcrawley-away is now known as tcrawley

17:03 ankitr has quit [Ping timeout: 258 seconds]

17:16 tcrawley is now known as tcrawley-away

17:20 ankitr has joined #jruby

17:46 bbrowning_away is now known as bbrowning

17:55 camlow325 has joined #jruby

18:05 subbu is now known as subbu|away

18:19 thedarkone2 has joined #jruby

18:32 ankitr has quit [Ping timeout: 240 seconds]

18:57 bbrowning is now known as bbrowning_away

19:33 bbrowning_away is now known as bbrowning

19:37 subbu|away is now known as subbu

20:50 <lopex> enebo: so what's the stance wrt that coding preamble issue ?

21:16 drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

21:21 <enebo> lopex: as far as adding UTF8 as an alias?

21:22 <lopex> yeah

21:22 <enebo> lopex: I see no harm but I also have not made the change

21:22 <lopex> enebo: yeah saw your message

21:22 <enebo> lopex: if you want to then go for it

21:23 <lopex> enebo: can coding: appear in multiple places in a file ?

21:23 <enebo> lopex: ask nirvdrum

21:23 <enebo> lopex: but no I don't think so

21:23 <enebo> lopex: I think coding is only at the topish

21:23 <lopex> ah, yeah he was involved in that conv

21:23 <lopex> yeah

21:23 <lopex> actually any other comment should give an error

21:23 <enebo> lopex: so add the alias if you have time and hopefully nirvdrum will amaze us

21:24 <nirvdrum> I pushed some changes to TruffleRuby. I scaled back what I was thinking of doing quite a bit though because it wasn't paying off as much as I thought. I'm going to prepare a JRuby patch soon.

21:24 <lopex> cool

21:24 <lopex> nirvdrum: cool

21:24 <lopex> enebo: but there is such an alias in mri encodings

21:25 <lopex> enebo: and we have it too

21:25 <nirvdrum> There's https://github.com/graalvm/truffleruby/commit/57cabaec33ce9ca45e5d11dc0ab1922357799897, which is a straightforward port.

21:25 <nirvdrum> And https://github.com/graalvm/truffleruby/commit/adca85d2a98eb27f6c4de3a25d0c04d5208fd6bb, which I think is the biggest bang for the buck.

21:26 <lopex> now I'm confused

21:26 <nirvdrum> And then the reason I haven't ported to JRuby yet is https://github.com/graalvm/truffleruby/commit/ccf8d4331a98e9cb1e0da64488f961ee7e16dab5. Since JRuby does CoW for ByteLists, this commit may be unnecessary.

21:26 <nirvdrum> Oh, you guys are talking about that other bug. I wasn't looking at that. I was looking at improving startup time by not spending so much time parsing magic comments.

21:26 <nirvdrum> And I came across a couple other bugs along the way.

21:27 <enebo> nirvdrum: although do we have !tokenSeen?

21:27 <lopex> yeah, that commit is about case insensitive preambles

21:27 <lopex> enebo: do we have it ?

21:27 <enebo> lopex: checking

21:28 <lopex> nirvdrum: I'm still lost why that utf8 was not found

21:28 <lopex> nirvdrum: since there's an alias in jcodings already

21:28 <enebo> lopex: different discussions

21:28 <nirvdrum> enebo: https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/lexer/yacc/RubyLexer.java#L875

21:28 <lopex> ah ok

21:29 <enebo> nirvdrum: but is JRuby lexer missing something

21:30 <enebo> nirvdrum: you skip parseMagicComment potentially based on it

21:30 <lopex> aah

21:31 <lopex> so there's more such comments

21:31 <nirvdrum> enebo: As far as I can tell, this is completely safe. The encoding comment must be on the first line. The frozen string literals comment must occur before the first token is seen. And the warn on indent isn't implemented.

21:31 <lopex> yeah, I recall

21:31 <nirvdrum> enebo: The only thing that's funky is if you run with warnings enabled, you'd be notified about using frozen_string_literals after a token. So, I added that in as a check.

21:32 <nirvdrum> All of ruby spec passes with it. And the number of lines actually visited drops quite a bit.
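
Editor's sketch, not part of the log: the shortcut nirvdrum describes could look roughly like the Ruby below. It assumes a naive line-based scan that stops running the magic-comment regexp once a real token has been seen, unless warnings are enabled. `scan_magic_comments`, `MAGIC` and `SAMPLE` are hypothetical names, not JRuby or TruffleRuby lexer code, and the first-line rule for `coding` is not modelled.

```ruby
# Hypothetical stand-in pattern; not the regexp used by either lexer.
MAGIC = /(coding|frozen[_-]string[_-]literal)\s*[:=]\s*([\w-]*\w)/

SAMPLE = <<~'RUBY'
  # Copyright notice
  # frozen_string_literal: true
  puts "A"
  # frozen_string_literal: false
  # just a comment
RUBY

# Returns [pragmas, regexp_runs, notes] for the given source text.
def scan_magic_comments(source, warnings: false)
  pragmas = {}
  regexp_runs = 0
  notes = []
  token_seen = false

  source.each_line do |line|
    # Naive split: does not handle '#' inside string literals.
    code, comment = line.split("#", 2)
    token_seen = true unless code.to_s.strip.empty?
    next if comment.nil?
    # The shortcut: after the first token, only look when warnings are on.
    next if token_seen && !warnings

    regexp_runs += 1
    match = MAGIC.match(comment)
    next unless match

    if token_seen
      # Too late to take effect; only worth reporting.
      notes << "#{match[1]} ignored after first token"
    else
      pragmas[match[1]] = match[2]
    end
  end

  [pragmas, regexp_runs, notes]
end

p scan_magic_comments(SAMPLE)
p scan_magic_comments(SAMPLE, warnings: true)
```

With warnings off, the regexp runs only for the two comments ahead of `puts`; with warnings on, it runs for all four comment lines so the late pragma can be reported.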

21:32 <enebo> nirvdrum: I am really confused you must have other commits

21:32 <nirvdrum> Confused by what?

21:32 <enebo> nirvdrum: this.tokenSeen = tokenSeen where tokenSeen is unconditionally true isn't it?

21:32 <lopex> nirvdrum: so what if someone invents another magic comment ?

21:32 <lopex> nirvdrum: how will they mix ?

21:33 <lopex> the coding is first I gather ?

21:33 <nirvdrum> lopex: It would need to be revisited. But so would parseMagicComment.

21:33 <lopex> mhm

21:33 <enebo> ah it is the local

21:33 <nirvdrum> Since we do an if-else chain.

21:33 <enebo> and the local starts false but immediately sets field to true

21:33 bbrowning is now known as bbrowning_away

21:33 <lopex> but since that warn is undefined it's still a problem

21:34 <enebo> nirvdrum: does the file 'puts "A" # coding: UTF-16LE\n' work?

21:34 <lopex> enebo: how far ahead is 2.4 wrt that stuff ?

21:35 <enebo> ah space

21:35 <enebo> maybe 'puts"A"# coding: UTF-16LE\n'

21:36 <nirvdrum> enebo: Work in what sense? I modified it to be 'puts "A".encoding' and the value is UTF-8 with MRI.

21:36 <lopex> with that utf 16 preamble ?

21:36 <enebo> I added a second line p "a".encoding

21:37 <enebo> It still prints UTF-8 for second string

21:37 <enebo> so It does not see that coding

21:38 <nirvdrum> So, this diverges from MRI. I'm happy to submit a PR and you can evaluate it. But it saves a lot of time in joni.

21:38 <enebo> derp I get it

21:38 <enebo> we call into yylex more than once so first assign marks it as too late

21:38 <nirvdrum> As lopex notes, it makes an assumption that future magic comments will follow suit.

21:39 <enebo> nirvdrum: can you measure a significant difference?

21:39 <enebo> I really thought we could see some of these anywhere

21:41 <nirvdrum> Keep in mind it is a bit more pronounced for us because a lot of our core library is authored in Ruby and has 30+ lines of preamble for license & copyright.

21:41 <enebo> nirvdrum: if you cannot figure out any mechanism where it is anywhere but first line then it is crazy to keep looking at every comment

21:42 <lopex> enebo: so they are not includec in the grammar ?

21:42 <lopex> *included

21:42 <nirvdrum> It dropped the number of lines visited from 7,721 to 2,885.

21:42 <lopex> that's insane

21:42 <nirvdrum> I had that further reduced, but it came at the cost of a more complicated regexp and was basically a wash.

21:42 <lopex> wow

21:43 <enebo> nirvdrum: so the only semantic issue is encountering multiple feature comments after a token is encountered right?

21:43 <lopex> so it's just a state in yy right ?

21:43 <nirvdrum> I removed the search for '-*-' and the set_file_encoding search by pushing both into the magic comment regexp. And then guarded all of that by a quick scan for ':' or '='.
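
Editor's sketch, not part of the log: the "quick scan for ':' or '='" guard might look like this in Ruby. `parse_magic_comment` and `MAGIC_COMMENT` are hypothetical names, and the pattern is only a stand-in for whatever the real lexer uses.

```ruby
# Hypothetical stand-in pattern for a key/value magic comment.
MAGIC_COMMENT = /(coding|frozen[_-]string[_-]literal|warn[_-]indent)\s*[:=]\s*([\w-]*\w)/

# Returns [key, value] or nil.
def parse_magic_comment(comment)
  # Cheap substring scan first: without ':' or '=' there is no key/value
  # pair, so the more expensive regexp match can be skipped entirely.
  return nil unless comment.include?(":") || comment.include?("=")

  m = MAGIC_COMMENT.match(comment)
  m && [m[1], m[2]]
end

p parse_magic_comment("# plain license text")
p parse_magic_comment("# -*- coding: utf-8 -*-")
p parse_magic_comment("# warn_indent = true")
```

The guard only pays off when most comment lines contain neither character, which is plausible for license and documentation preambles but is an assumption to measure rather than a given.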

21:43 <enebo> nirvdrum: which since it is a warning we can probably add a second boolean for that

21:43 <nirvdrum> I need to play with that more.

21:43 <enebo> nirvdrum: yeah I could see a double scan paying off

21:44 <nirvdrum> But, like I said, I had to fight joni a bit.

21:44 <lopex> how ?

21:44 <enebo> nirvdrum: can you have a comment like # :::::: coding: UTF-16LE?

21:44 <lopex> nirvdrum: ah, at parse time ?

21:45 <nirvdrum> I started off with something like ^.*?(coding|frozen[-_]string[-_]literal). That was bad.

21:45 <nirvdrum> It scanned that byte array many times.

21:45 <nirvdrum> I realized I was dumb and didn't need to anchor it.

21:45 <enebo> heh

21:45 <lopex> ah

21:46 <nirvdrum> But by complicating the regexp and having to fight with greedy matches to handle '-*-', I didn't really gain much.

21:46 * lopex wonders about that regex absent feature

21:46 <nirvdrum> I didn't think the pay-off was worth diverging from MRI enough.

21:46 <nirvdrum> But, I'm going to give the ':' & '=' scan another shot.

21:46 <enebo> Is there a way to scan backwards?

21:46 <enebo> lopex: ^

21:46 <lopex> enebo: no

21:47 <lopex> enebo: apart from back matching

21:47 <enebo> lopex: could there? Seems like regexp where it walked from end to start could be interesting

21:47 <lopex> sure

21:47 <nirvdrum> For its part, MRI seems to have dumped the state machine for that regexp into C and embedded that in the lexer.

21:47 <lopex> enebo: look behind is already useful

21:47 <enebo> lopex: ultimately find [:=] and look to the left

21:48 <enebo> lopex: so we can look behind on that

21:48 <nirvdrum> bbiab. Dinner time here.

21:48 <enebo> lopex: I never used look behind but this seems perfect

21:48 <lopex> enebo: look behind is somewhat restricted

21:48 <lopex> but you can do some things there

21:49 <lopex> enebo: like you cannot have alternatives there afaik

21:49 <lopex> and some quantifier restrictions too

21:50 <enebo> lopex: just consider if you can make one for this specific search...I am afk for about 15 minutes

21:50 <lopex> enebo: which one exactly ?

21:50 <lopex> the one nirvdrum posted ?

21:50 <enebo> lopex: matching coding: or frozen_string_literal

21:51 <enebo> lopex: the three features

21:51 <lopex> but that's exact matching

21:51 <enebo> match back from [:=] to look for those words

21:51 <enebo> so not case insensitive?

21:51 <lopex> afaik it can

21:51 <lopex> yeah it must

21:53 <lopex> nirvdrum: I'm not sure what enebo refers to := by ?

21:53 <lopex> the grammar ?

21:56 subbu is now known as subbu|afk

21:59 <enebo> lopex: perhaps I just mean coding: utf-8 style matches

21:59 <enebo> lopex: I thought coding= utf-8 also existed

21:59 <lopex> hmmm

21:59 <enebo> but I don't see that in our code

22:00 <enebo> so in any case it is a bit immaterial

22:00 <lopex> hmm p /(?<=foo=).*bar/ =~ "foo=bar bar"

22:00 <lopex> this will match
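
Editor's sketch, not part of the log: lopex's look-behind example spelled out in Ruby, plus a variant aimed at a `coding` comment. Because look-behind has quantifier restrictions (as lopex notes), the variant keeps `\s*` outside the look-behind and uses a fixed-width `coding` inside it. `CODING_VALUE` is a hypothetical name.

```ruby
text = "foo=bar bar"

# (?<=foo=) asserts that "foo=" sits immediately to the left of the match
# start without consuming it, so the match begins right after the '='.
m = /(?<=foo=).*bar/.match(text)
puts m.begin(0)   # 4
puts m[0]         # bar bar

# Variant for a magic comment: fixed-width look-behind, quantifiers outside.
CODING_VALUE = /(?<=coding)\s*[:=]\s*([\w-]*\w)/
c = CODING_VALUE.match("# -*- coding: utf-8 -*-")
puts c[1]         # utf-8
```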

22:00 <enebo> Search back from a char

22:00 <lopex> enebo: is that close ?

22:01 <enebo> lopex: I think something like /(?<=(?:coding|set_string_literal)\s*:)

22:01 <enebo> heh I get a smiley

22:02 <lopex> hmm alternatives are allowed afaik

22:02 <enebo> lopex: basically we want fast find of ':' then see what is in front

22:02 <lopex> but arbitrary quantifiers no

22:02 <lopex> yeah

22:02 <enebo> lopex: we can rescan with more expensive regexp

22:03 <enebo> lopex: since it almost never happens

22:03 <enebo> although the nexttoken check seems like it would have a huge effect on this

22:03 <lopex> yeah

22:03 <enebo> making the regexp faster would drop way off in value

22:04 <enebo> although most mri stdlib files now have comments

22:04 <lopex> enebo: but you can save positions in yy right ?

22:04 <lopex> oh I'm missing something

22:04 <lopex> enebo: so why mri does that ?

22:05 <enebo> lopex: I don't understand you

22:06 <lopex> enebo: mri invented that right ?

22:06 <enebo> lopex: invented what?

22:06 <lopex> parsing of those magic comments

22:07 <enebo> lopex: yeah they added coding and later the two pragmas

22:07 <lopex> yeah

22:07 <enebo> lopex: they call them 'features' I think

22:07 <enebo> lopex: you can also pass them in as command-line options

22:07 <lopex> and they're position aware

22:07 <enebo> lopex: well you mean only at the top?

22:08 <lopex> top comments ?

22:08 <lopex> oh yeah

22:08 <enebo> lopex: from what nirvdrum said it looks like they must precede first real token

22:08 <lopex> arent they ?

22:08 <lopex> hmm

22:08 <enebo> lopex: but if that is so then I don't know why they constantly parse them

22:08 <enebo> lopex: I can see one reason which is to add a warning

22:08 <lopex> and they do ?

22:08 <enebo> but I think maybe -w is needed

22:08 <lopex> yeah

22:09 <lopex> why does such a simple thing get so complicated

22:09 <lopex> it's just a map of values in a comment

22:10 <enebo> so we could if (!tokenSeen || warning) && parseMagicComment

22:10 <enebo> at that point then only the first comment lines would be slow

22:10 <enebo> which if there are tons of doc comments then regexp is important again

22:10 <enebo> which is typical of a lot of stdlib files now

22:11 <enebo> so regexp probably still has some importance

22:12 <lopex> well

22:12 <lopex> it's all madness anyways

22:13 <enebo> almost makes you wish you had a db

22:13 <enebo> :)

22:13 <lopex> yeah like redis

22:13 <lopex> whatever

22:14 <lopex> it's all about moral choices

22:14 <enebo> just would be nice to have all code in IRScopes and completely parsed and totally lazy

22:14 <enebo> ir persistence can lazily load methods and it is not faster so perhaps AST would be better

22:16 <enebo> doing the simplest test

22:16 <enebo> which matters to me gem list

22:17 <enebo> makes no obvious difference

22:17 <enebo> although I would need to do a lot of runs to know

22:17 <enebo> It is not big

22:19 <enebo> 2100 calls to magicComment with !tokenSeen for gem list

22:19 <enebo> 13400 without

22:19 <lopex> quite a lot

22:19 <enebo> yeah I guess 13 regexps is not a ton of time though

22:19 <enebo> err 13k

22:19 <enebo> nonetheless it seems like a good idea

22:20 <lopex> depends on the regexps

22:20 <enebo> lopex: of course

22:20 <enebo> lopex: I just mean even though this could be a better regexp it is not going to be a lot of time

22:20 subbu|afk is now known as subbu

22:20 <enebo> lopex: and consider before/after gem list was 1.6s and that is pretty short run time

22:21 <lopex> but joni regexps are quite heavy

22:21 <enebo> lopex: let's see if gem list will run if I comment it out

22:22 <enebo> lopex: If I completely comment out parseMagicComment I see no timing difference

22:23 <enebo> lopex: I am not saying it is not a reasonable change and no doubt less work is less work since this is not a complicated change but it is no smoking gun for us

22:23 <lopex> sure

22:23 <enebo> lopex: ok early dinner for me

22:24 <enebo> well somewhat normal dinner for me but I eat early :)

22:24 enebo has quit [Quit: Leaving.]

22:31 camlow325 has quit [Quit: WeeChat 1.5]

22:50 tcrawley-away is now known as tcrawley

22:53 tcrawley is now known as tcrawley-away

23:09 ankitr has joined #jruby

23:38 ankitr has quit [Ping timeout: 256 seconds]

23:49 <nirvdrum> lopex: I was using "(coding|frozen[_-]string[_-]literal|warn[_-]indent)\\s*(:|=)\\s*(\\\"?[\\w-]*\\w\\\"?)";
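
Editor's sketch, not part of the log: the Java string literal above, transliterated to a Ruby regexp literal with the Java escapes undone, and applied to a few sample comment lines. `MAGIC` and `magic_pair` are hypothetical names; any matching options the real lexer passes alongside the pattern (such as case-insensitivity) are not shown in the log and are not assumed here.

```ruby
# Same pattern as the quoted Java string, written as a Ruby literal.
# Group 1 is the key, group 2 the separator, group 3 the (optionally quoted) value.
MAGIC = /(coding|frozen[_-]string[_-]literal|warn[_-]indent)\s*(:|=)\s*("?[\w-]*\w"?)/

# Returns [key, value] for a comment line, or nil when nothing matches.
def magic_pair(line)
  m = MAGIC.match(line)
  m && [m[1], m[3]]
end

[
  "# coding: UTF-16LE",
  "# -*- coding: utf-8 -*-",
  "# frozen-string-literal = true",
  "# ordinary comment"
].each do |line|
  pair = magic_pair(line)
  puts pair ? "#{pair[0]} => #{pair[1]}" : "no match: #{line}"
end
```

The pattern is unanchored, which fits nirvdrum's earlier remark that the leading `^.*?` was unnecessary.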

23:52 byteflam1 has quit [Ping timeout: 260 seconds]

23:52 byteflame has quit [Ping timeout: 260 seconds]