<joy_>
Why does this happen? java -cp /Users/leonardostefani/.rvm//rubies/jruby-9.0.5.0/lib/jruby.jar test.class -> Error: Could not find or load main class test.class
<GitHub97>
[jruby] eregon force-pushed truffle-travis from 0e33a2b to a561152: https://git.io/v6FPm
<GitHub97>
jruby/truffle-travis a561152 Benoit Daloze: Travis: simplify install and script
<GitHub18>
[jruby] nirvdrum pushed 11 new commits to truffle-head: https://git.io/v6AuP
<GitHub18>
jruby/truffle-head a6c7d33 Kevin Menard: [Truffle] Added a faster path for fetching the bytes of a RepeatingRope whose child has its bytes populated.
<GitHub18>
jruby/truffle-head 10726d2 Kevin Menard: [Truffle] Added a faster path for fetching the bytes of a SubstringRope whose child has its bytes populated.
<GitHub18>
jruby/truffle-head 9efb699 Kevin Menard: [Truffle] Sped up making a RepeatingRope from a RopeBuffer....
hobodave has quit [Quit: Computer has gone to sleep.]
<GitHub18>
[jruby] eregon deleted truffle-thread-group-layout at 1dee397: https://git.io/v6AxC
camlow325 has quit [Read error: Connection reset by peer]
claudiuinberlin has quit []
camlow325 has joined #jruby
cremes has quit [Quit: cremes]
<bascule>
[figlet-style ASCII banner reading "FRIDAY!!!"]
cprice404 has joined #jruby
<GitHub46>
[jruby] eregon deleted truffle-travis at a561152: https://git.io/v6xUq
zacts has joined #jruby
nicksieger has quit [Remote host closed the connection]
rsim has quit [Quit: Leaving.]
claudiuinberlin has joined #jruby
zacts has quit [Quit: WeeChat 1.5]
camlow325 has quit [Quit: WeeChat 1.5]
camlow325 has joined #jruby
temporalfox has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
nicksieger has joined #jruby
<GitHub94>
[jruby] nirvdrum force-pushed truffle-fix_match_backref from bae45f6 to bee562d: https://git.io/v6uWv
<GitHub94>
jruby/truffle-fix_match_backref 3eba4f2 Kevin Menard: [Truffle] Replaced calls to Regexp.last_match= with a direct primitive invoke to cut out an extra frame.
<GitHub94>
jruby/truffle-fix_match_backref 14a6dfa Kevin Menard: [Truffle] Simplified the RegexpNodes#matchCommon call a bit by removing support for 'operator' mode.
<GitHub94>
jruby/truffle-fix_match_backref dafe3ba Kevin Menard: [Truffle] Hide frame-local global variables from the local_variables list.
<GitHub141>
[jruby] nirvdrum pushed 1 new commit to truffle-head: https://git.io/v6x3H
camlow325 has quit [Read error: Connection reset by peer]
rsim has joined #jruby
camlow325 has joined #jruby
zacts has joined #jruby
zacts has quit [Quit: WeeChat 1.5]
subbu|lunch is now known as subbu
<headius>
nirvdrum: hey, enebo mentioned IO benchmarking to me so I was looking at that micro/core/file.rb
<headius>
I'm not sure writing to /dev/null is really a good test
<nirvdrum>
Oh?
<headius>
the kernel may short-circuit all writes to /dev/null to do nothing
<headius>
I'm basically getting nearly the same ips for writing a GB as writing a KB on MRI
<nirvdrum>
I think that might be fine. It's mostly to check the syscall overhead, I believe.
<nirvdrum>
chrisseaton: ^
<nirvdrum>
Some of these are really to test string performance, too.
<headius>
if I modify it to write to a file, MRI is 7 orders of magnitude slower
<headius>
that gives me pause
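A minimal benchmark-ips sketch of the comparison headius is describing, writing the same payload to /dev/null and to a regular file; the 1 MB payload, paths, and structure here are illustrative assumptions, not the contents of micro/core/file.rb:

    require "benchmark/ips"
    require "tempfile"

    payload = "x" * (1024 * 1024)            # 1 MB per write (assumed size)
    devnull = File.open("/dev/null", "wb")
    tmpfile = Tempfile.new("io_bench")

    Benchmark.ips do |x|
      x.report("write 1MB to /dev/null") { devnull.write(payload) }
      x.report("write 1MB to real file") { tmpfile.write(payload); tmpfile.rewind }
      x.compare!
    end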
<enebo>
7ORDERS
<chrisseaton>
What we want to check is that we can get the bytes to the kernel
<headius>
I guess it just feels weird to benchmark IO that isn't doing IO
<nirvdrum>
At least, that's how I've been using them.
<chrisseaton>
Well it's outputting it to the kernel
<headius>
enebo: this is probably the reason we're slower...we have a managed byte buffer that has to get transferred to a native buffer before the write can happen
<chrisseaton>
That's what the benchmark highlights, yeah
<enebo>
chrisseaton: when you say kernel you just mean OS kernel time to invoke the system-level write(2)
<headius>
then perhaps there should also be a proper benchmark of writing to something that won't nop
<chrisseaton>
well yeah I guess if someone wants to benchmark that
<chrisseaton>
I didn't when I wrote that benchmark
<enebo>
I admit from a microbench standpoint it makes sense to have this bench, but why would a real user care about this result at a programming conference?
<enebo>
It really distorts expectation doesn’t it?
<enebo>
Maybe I don’t give average attendees enough credit
<chrisseaton>
Whoah back up guys, we aren't distorting anyone's expectation with a benchmark in a repo
<enebo>
chrisseaton: no, sorry, I don’t want to imply that
<chrisseaton>
It's there to check performance of that operation stays the same
<headius>
not to compare perf across impls?
<chrisseaton>
If someone wants to talk about real IO performance, we'd probably use something like the asciidoctor end-to-end translation benchmark
<chrisseaton>
Well I do compare with this benchmark, but only in terms of checking we aren't slower than anyone else
<chrisseaton>
If the VMs actively detected it was /dev/null then that would be different, but if the kernel detects it the same from all VMs, then that's fine by me
<headius>
my concern is if these results get presented as "here's our IO performance" it's leaving out something kinda important... IO
<headius>
if that isn't going to happen then I don't care :-)
temporalfox has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
<chrisseaton>
Yeah, it's a tool for building up to a benchmark you might want to report
<headius>
ok
<chrisseaton>
We do have a problem at the moment with the basic IO primitive being really slow, as the Rubinius code copies quite a bit, and then to implement that we copy again, and this highlights that for us - we're using it because we're bad at it at the moment
<enebo>
chrisseaton: yeah sorry I really did not want to imply that. You guys do have this in the micro subdir too. My context was talking to Kevin about some results for his ropes talk
<headius>
jruby proper can destroy MRI on every one of these write benchmarks if they don't write to null
<enebo>
chrisseaton: I have looked at it and the comparisons he is using are common string operations, looking at things from an impl difference standpoint
<chrisseaton>
But you'd like to match MRI even if the IO is dropped by the kernel, right? That's what we want to do
<nirvdrum>
enebo: Sorry if that came out incorrectly. I'm looking at this for the 10,000 String#+ operations. The writing to /dev/null is just to keep everyone honest.
<nirvdrum>
If I don't reify the rope bytes, you'd be surprised how much faster that can be :-)
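A rough sketch of the micro pattern nirvdrum describes: build a string with 10,000 String#+ operations, then write it to /dev/null so the accumulated bytes have to be realized; the chunk contents and setup are assumptions, not the actual benchmark source:

    require "benchmark/ips"

    devnull = File.open("/dev/null", "wb")
    chunk   = "abcdefghij"

    Benchmark.ips do |x|
      x.report("10k String#+ then write") do
        str = ""
        10_000.times { str = str + chunk }  # String#+ allocates a new string each time
        devnull.write(str)                  # forces the result's bytes to be materialized
      end
    end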
<headius>
chrisseaton: is that something people need to be fast?
<enebo>
chrisseaton: but we were talking about these results this morning so this is where that came from
<headius>
IO that does nothing?
<enebo>
nirvdrum: yeah, I think part of the confusion in the last few minutes is that this is a file called ‘file.rb’
<nirvdrum>
headius: It's actually not that uncommon to redirect output to /dev/null *shrug*
<headius>
nirvdrum: a GB of output per second?
<nirvdrum>
Probably not to that extreme.
<enebo>
nirvdrum: I think if these string tests were in string.rb it would be less confusing
<headius>
I mean, maybe that happens...it just isn't something I've had to do :-)
<headius>
yeah
<chrisseaton>
headius: I'm not saying anyone wants it to be fast, but if there's something in my code that's making it slow, I'd like to remove it
<headius>
I took these to mean that they're IO subsystem benchmarks
<headius>
specifically File operations
<headius>
which usually work with Files
<nirvdrum>
enebo: I tried to warn you of that from the outset, you fool ;-)
<enebo>
nirvdrum: hahaha sorry
<chrisseaton>
Well they are IO subsystem benchmarks - they test how fast we can send bytes to the kernel
<enebo>
do you guys need some side-effect to not eliminate all the work?
<chrisseaton>
And I know they're useful benchmarks, as we're currently slower than MRI on them
<headius>
but the kernel is not something you can ignore
<headius>
if it can't write everything you have to write again
<headius>
that goes back into your runtime
<headius>
by writing to something that always succeeds, you never have to deal with buffering, write loops, etc
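A sketch of the kind of write loop headius is alluding to: against a real descriptor a single write may accept only part of the buffer, so the runtime has to retry with the remainder, a path /dev/null never exercises; the helper name is hypothetical:

    def write_fully(io, data)
      until data.empty?
        begin
          written = io.write_nonblock(data)   # may write fewer bytes than requested
          data = data.byteslice(written..-1)  # keep only the unwritten remainder
        rescue IO::WaitWritable
          IO.select(nil, [io])                # wait until the descriptor is writable again
          retry
        end
      end
    end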
<chrisseaton>
You're just saying the benchmark doesn't cover all cases, which is obviously true, but it doesn't make the benchmark useless
<headius>
I agree
<headius>
it helped me realize our IO is slower than MRI writing to null :-)
<headius>
so there's a win of a sort
<chrisseaton>
To back this up, don't worry about us focusing on micros, we're looking at bigger stuff like asciidoctor actually outputting HTML, and PSD.rb actually reading PSDs as our big benchmarks soon
<headius>
oh sure...I just saw some bench numbers flying around related to IO and wanted to have a look
<chrisseaton>
Oh and opt carrot of course
<headius>
I've never spent any time optimizing the ported MRI IO logic, so there's a lot of potential
<enebo>
chrisseaton: so because we want to use these micro benches as well, could we perhaps point out this is a devnull test, since I saw MRI results and figured something weird was going on
<enebo>
chrisseaton: the MRI result looks nuts
<chrisseaton>
In the name of the benchmark? Sure
<headius>
that would help clarify
<enebo>
chrisseaton: I mean I can see it writes to devnull by reading what it does, but the results from this were super unexpected and we spent time trying to figure out how MRI could have that result
<headius>
devnull was just a red flag for me
<headius>
so there it is
<enebo>
now that we know, in hindsight we perhaps won’t forget it, but I think anyone viewing it will wonder what the hell is going on
<nirvdrum>
FWIW, I think measuring the IO machinery is also useful in that there is certainly more than one way to do it. AFAIK, you guys use the JVM for all your IO and we use jnr-posix.
<headius>
ahh here's another issue with our impl
<headius>
it will allocate a contiguous 1GB native buffer when the managed buffer needs to be written
<headius>
that could be a single smaller buffer with N writes
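The actual fix would live in JRuby's Java internals, but the idea sketches easily at the Ruby level: stage and write the data in fixed-size slices rather than one native buffer the size of the whole string; the 64 KB chunk size and helper name are arbitrary assumptions:

    CHUNK_SIZE = 64 * 1024

    def write_in_chunks(io, data)
      offset = 0
      while offset < data.bytesize
        io.write(data.byteslice(offset, CHUNK_SIZE))  # only a small slice is staged at a time
        offset += CHUNK_SIZE
      end
    end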
<chrisseaton>
we do that, but then I wasn't sure if that broke any expectations of atomic-ness in the file system
<headius>
it certainly may
<headius>
there's multithreaded writes to consider
<headius>
it's because there's so much nuance in IO behavior that I decided to port MRI's logic for much of it
<headius>
hmm well that's much slower :-)
<headius>
assuming I did it right
<chrisseaton>
Does our handy little benchmark tell you that?
nicksieger has quit [Remote host closed the connection]
<headius>
well, I didn't run it to test the impl
<headius>
but the reduced case is still a devnull write
rsim has quit [Quit: Leaving.]
<headius>
I'm not yet sure if it's telling me this is a bad approach or I did something wrong
<enebo>
chrisseaton: we are happy this test is here. I think we both just got really confused what the test was actually testing
<enebo>
chrisseaton: we both spent time wondering how the hell MRI could do this so fast…which is a value in itself, but had this been labelled with core-write-gigabyte-devnull then I think we would have realized this quickly
<chrisseaton>
Yeah maybe that's a better name
<enebo>
chrisseaton: in a way I feel dumb for not just realizing that but I feel I am in good company
<headius>
the confusing thing for me is that 1.7 seems to have similar devnull write perf
<headius>
it should have just as much overhead copying 1GB string to native
<enebo>
headius: it uses an NIO channel and we use native now?
<headius>
yes, and that's a place I'll investigate certainly
<enebo>
headius: oh yeah it probably does copy still? Or maybe something is really smart
<headius>
but Java's NIO channels aren't a whole lot different
<headius>
so we are using FFI to bind write, and they bind it directly in a JNI function
<headius>
but jffi makes pretty tight bindings
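JRuby's real binding goes through jnr-ffi on the Java side, but the general idea headius describes, binding libc's write(2) through FFI rather than going through JVM NIO channels, can be illustrated at the Ruby level with the ffi gem; this is purely an illustration, not JRuby's code:

    require "ffi"

    module LibC
      extend FFI::Library
      ffi_lib FFI::Library::LIBC
      # ssize_t write(int fd, const void *buf, size_t count);
      attach_function :write, [:int, :pointer, :size_t], :ssize_t
    end

    data = "hello\n"
    buf  = FFI::MemoryPointer.new(:char, data.bytesize)  # copy from the Ruby string into native memory
    buf.put_bytes(0, data)
    LibC.write(1, buf, data.bytesize)                     # fd 1 is stdout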
<enebo>
perhaps they can just pass off the raw byte[] addr somehow
<headius>
it's possible
<headius>
I didn't think modern hotspot could pin though
<enebo>
It is sad…all that JNI work I did and I don’t remember how references to primitives work any more
<headius>
or at least, not via any API you can call
<enebo>
you can acquire refrences in JNI
<enebo>
it lets GC know it cannot free
<headius>
yeah but they're indirections back into managed heap
<enebo>
well perhaps with a primitive byte[] it knows it cannot relocate and uses the memory directly
<headius>
right, pinning
<headius>
I didn't believe it could do that
<nirvdrum>
I haven't kept up with Project Panama at all. Does any of this get better?
<enebo>
no?
<headius>
there's an API in JNI but it says you don't have to pin for it, and I thought I'd heard they don't
<headius>
nirvdrum: it doesn't change anything about managed versus native
<headius>
it does make it easier/lighter to work with blocks of native memory
<headius>
big wins in panama are reduced jni overhead and JVM-aware layout of native structs
<nirvdrum>
So posix calls should get cheaper?
<headius>
yes
<nirvdrum>
Is this still landing in Java 9?
chrisseaton has quit [Read error: Connection reset by peer]
andrewvc has quit [Read error: Connection reset by peer]
<enebo>
GetPrimitiveArrayCritical()
<headius>
nirvdrum: as a jdk package I believe, yes
<headius>
not a standard Java API
<headius>
enebo: yeah that's the one
<enebo>
So it looks like they could be using that method to mark a critical section
<enebo>
It sounds like it may or may not force a copy
<headius>
right
<headius>
I thought it did
<enebo>
but that it can prevent a copy
<headius>
but it would be an explanation for 1.7 being faster
<headius>
I'll look at the C code for NIO
<headius>
as far as I can tell 1.7 and 9k are mostly the same otherwise
<enebo>
nirvdrum: Are my results different for you?
<enebo>
nirvdrum: I am not seeing big-concat as slower
<nirvdrum>
I'm not running with benchmark-ips.
<nirvdrum>
It's hard to compare that way.
<enebo>
nirvdrum: but I don’t see how JRuby would run slower running longer vs MRI
<headius>
yeah, I just want to see the bad + perf so I can fix it
<enebo>
nirvdrum: I guess the tool gives what the tool gives
<headius>
but I can't see it
<enebo>
nirvdrum: perhaps we are deopting much later
<nirvdrum>
enebo: Well, with bench9000 you also have some outside process managing things. Whereas with benchmark-ips, you have the VM reporting its own results.
<nirvdrum>
I'm a bit surprised there's such a difference though.
<headius>
I get the same ratio as enebo, 2x faster on the + concat bench
<nirvdrum>
I'll try running with benchmark-ips then.
<enebo>
I should run with perfer
<headius>
jruby 2x MRI
<headius>
I'm running the "all" version with benchmark-interface, which I assume uses ips for this one
<headius>
nirvdrum: what ratio were you seeing?
<nirvdrum>
~1x
<enebo>
headius: it can use whatever you choose but ips is default
<headius>
certainly could be linux
<headius>
nirvdrum: jruby proper on graal or hotspot?
<nirvdrum>
HotSpot, indy enabled.
<headius>
and that was current master, only 1x MRI?