<rjnienaber>
should i just add a comment to the issue comparing the output between MRI 2.2.0 and JRuby head or should i be checking JRuby 1.7.* as well?
<fniephaus>
Hi folks, anyone here who could help me with maven/truffle?
<rjnienaber>
i'm just adding comments at the moment, should I re-raise them on GitHub?
<electrical>
rjnienaber: commenting on Jira should be good. the people on the watch list will get a notification, and I'm sure headius checks recently updated tickets as well. worst case you can compile a list of checked issues and create a single GitHub issue for it with links to them.
<rjnienaber>
okay, cool
<CodeWar>
[Question about implementing a language on top of the JVM] Do you just compile the source to bytecode and classes, hand it to the VM, and wash your hands of it, or is there an interpreter that is written by the language designer?
<CodeWar>
In other words, what parts does the language designer control at the moment? Just the AST-to-bytecode part, or can one take traps from the VM, maintain a runtime interpreter, and dance back and forth between the interpreter and compiled code?
<lopex>
CodeWar: most implementations just go from AST -> bytecode; since jruby9000 there is also an IR for most language-specific optimizations
<CodeWar>
lopex: yes I saw their presentation recently
<CodeWar>
what libraries do you make use of for AST -> bytecode? how about the optimizations? Are they all written from scratch or is there a library like LLVM?
<lopex>
CodeWar: for bytecode generation there's ASM java library
<lopex>
it's pretty low level, but it keeps track of stack consistency
<CodeWar>
lopex: this new IR you mentioned, is that pre bytecode ?
<lopex>
CodeWar: yes
<lopex>
pre jvm bytecode
<CodeWar>
Got it .. so you have your own parser to AST layer
<lopex>
CodeWar: it's based on basic blocks
<CodeWar>
I see
<CodeWar>
Any good libraries for parsing source to AST?
<lopex>
well, I'm not an expert on all of this
<lopex>
CodeWar: just parser generators
<CodeWar>
Just out of curiosity, have folks considered compiling Ruby to bytecode and then loading this bytecode in the VM with an interpreter written in Java that executes said bytecode sequence? Of course, at some point, once you have all the knowledge of the code, you could transform the bytecode some more and then give it to the VM to do with it what it does
<CodeWar>
That lets the language designer stay in control of not just the static compilation phase but also the dynamic compilation + profiling phase.
<lopex>
you mean custom bytecode ?
<lopex>
CodeWar: there is an AOT compiler for one, but I believe having separate bytecode would have considerable overhead for cold runs, and it doesn't give you the opportunities an IR has
<CodeWar>
lopex: yes
<lopex>
also, big switches on jvm have some bad performance characteristics
<lopex>
it would have to be a generated interpreter, just as the JVM does
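The "big switch" dispatch being discussed here can be sketched as follows. This is a toy, hand-invented mini-bytecode for illustration, not JRuby's actual instruction format; the point is that every instruction funnels through one large dispatch switch, which is what tends to perform poorly on the JVM:

```ruby
# Toy switch-dispatched stack-machine interpreter. Opcodes are invented.
def run(bytecode)
  stack = []
  pc = 0
  while pc < bytecode.length
    op, arg = bytecode[pc]
    case op                       # one big dispatch switch per instruction
    when :push then stack.push(arg)
    when :add  then stack.push(stack.pop + stack.pop)
    when :mul  then stack.push(stack.pop * stack.pop)
    else raise "unknown opcode #{op}"
    end
    pc += 1
  end
  stack.pop
end

# (2 + 3) * 4
program = [[:push, 2], [:push, 3], [:add], [:push, 4], [:mul]]
run(program) # => 20
```

Every fetched instruction pays for the dispatch itself, on top of the operand-stack traffic, which is the cold-run overhead argument above.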
<CodeWar>
So in the current model, if Ruby requires some special profile guided optimization you cannot get that right? Because the JVM profiles for Java (ignoring invokedynamic for now)
<CodeWar>
I understand your points about interpreters on JVM being slow btw
<lopex>
the 1.7 branch does JIT already (I believe it keeps the profile within the AST)
<lopex>
CodeWar: I recall Charlie said this second jruby JIT can skew jvm JIT since new bytecode is introduced just in time
<lopex>
CodeWar: but the JVM's profiles are already being saturated in a different direction
<lopex>
so it can skew inlining, devirtualization, and so on
<lopex>
CodeWar: also, the AST doesn't give you enough semantic information, hence the IR in jruby9000
<CodeWar>
These things are best described on a white board :-) But I see what you are saying
<lopex>
I guess it resembles SSA in some way
<CodeWar>
I get why you need the IR, lopex. My point being, you probably want to profile + interpret on your own and then let the JVM take over the generated bytecode classes, which are a mixture of compile-time-generated bytecode with some runtime augmentation
<CodeWar>
but yes you want to maintain your IR at all times with the correct amount of metadata in it
<lopex>
it also allows more meaningful rewrites
<lopex>
(in optimization passes)
<CodeWar>
effectively you want to go from your IR to bytecode, right? I get that. What I'm saying is you probably also want to interpret this IR at runtime, collect profile information, and then do the IR -> bytecode at runtime
<lopex>
yes
<CodeWar>
So the model would probably be: [a] jrubyc source.rb -> source.ir [b] at runtime, jrubyvm source.ir, which interprets this IR and generates bytecode on the fly
<lopex>
CodeWar: yes, but the IR is not being stored/cached atm (not sure if it will be ?)
<CodeWar>
interesting ... so this is about persisting the IR to runtime
<lopex>
CodeWar: the jruby parser takes considerable overhead since it's always cold code
<lopex>
IRWriterFile.java seems to write to byte buffers for file storage
<CodeWar>
lopex: I am not Ruby educated, but it seems if you store the IR by methodhandle(?), compile to bytecode at compile time, and leave a hook in the generated bytecode that calls into your runtime, then we have the best of both worlds
<CodeWar>
your cold start problem goes away since it's bytecode being run; your runtime gets a trap each time the method is invoked with call site information, and you can retrieve the IR for this method and do profiling and generate more bytecode if needed (an O2 version)
<CodeWar>
and you still get to ship the ruby compiled files as .jars and .classes which I think is important to the ecosystem
<lopex>
CodeWar: well, I guess storing IR with profile info saves a lot, but going AOT to jvm bytecode would prevent deoptimizations
<chrisseaton>
lopex: there is already an interpreter layer in the IR - I'm not sure what kind of profiling it does
<lopex>
chrisseaton: yeah, I was wondering if the profile is being stored
<CodeWar>
the current layout of the VM makes it tricky to get a complete sub-VM implemented, I agree. A typical DBT (dynamic binary translation) runtime requires the following: [a] compile-time source to IR [b] runtime: interpreter [c] runtime: profiler (trace or method) [d] runtime compiler [e] region former (if trace compiled) [f] ability to switch back and forth between the interpreted stack and the compiled stack [g] deoptimizer
<lopex>
for example callsite states?
<CodeWar>
Didn't know you guys had an interpreter. What does it interpret, bytecode or your IR? How does it get invoked at runtime?
<chrisseaton>
CodeWar: there is an interpreter for IR instructions - each instruction class has an 'interpret' method, and they're run for each instruction
<chrisseaton>
it's run until a method is chosen for compilation to JVM bytecode
<chrisseaton>
lopex: I don't know where any profile info is stored, and looking around I can't really see an example case either
<chrisseaton>
lopex: the call site profiling info is separate - it's stored in the call site objects - which the IR uses
<lopex>
the seentypes thing for example ?
<CodeWar>
chrisseaton: Interesting. Care to elaborate on how the startup sequence works? The JVM starts up, calls the interpreter. How do the classes get executed?
<chrisseaton>
the Ruby code is translated into IR instructions for each method, when the method is called, the IR instructions are run, one by one, by calling their interpret method
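The structure chrisseaton describes — each IR instruction is an object with its own interpret method, run one by one — is the command pattern. A minimal sketch, with instruction and method names invented here for illustration (not JRuby's real IR classes):

```ruby
# Each hypothetical IR instruction knows how to interpret itself
# against a table of temporary variables.
class AddInstr
  def initialize(dst, a, b)
    @dst, @a, @b = dst, a, b
  end

  def interpret(temps)
    temps[@dst] = temps[@a] + temps[@b]
  end
end

class CopyInstr
  def initialize(dst, src)
    @dst, @src = dst, src
  end

  def interpret(temps)
    temps[@dst] = temps[@src]
  end
end

# Run the IR one instruction at a time by calling interpret on each.
def interpret_ir(instructions, temps)
  instructions.each { |instr| instr.interpret(temps) }
  temps
end

temps = interpret_ir([AddInstr.new(:t2, :t0, :t1), CopyInstr.new(:t3, :t2)],
                     { t0: 1, t1: 2 })
temps[:t3] # => 3
```

The virtual `interpret` call per instruction is what makes this slower than compiled code, which is why hot methods get promoted to JVM bytecode.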
<CodeWar>
what does the static compiler spit out ...
<CodeWar>
bytecode or IR
<lopex>
I guess it's a more linearized command object pattern
<lopex>
chrisseaton: so the CFG is built on hot methods only?
<chrisseaton>
the static (we would say AOT) compiler emits byte code, a bit like the normal IR compiler could
<chrisseaton>
the CFG is now built for all methods, but only compiled to byte code for hot methods
<chrisseaton>
and optimisations on the IR (CFG) are only applied on hot methods
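The tiering described here (interpret everywhere, but optimize and compile only hot methods) is typically driven by an invocation counter. A toy sketch of that policy; the threshold value and class name are invented, not JRuby's actual compile threshold:

```ruby
# Toy hot-method promotion policy: interpret until an invocation
# threshold is crossed, then mark the method "compiled".
class MethodProfile
  THRESHOLD = 10   # arbitrary, for illustration only

  attr_reader :compiled

  def initialize
    @calls = 0
    @compiled = false
  end

  def on_call
    @calls += 1
    @compiled = true if @calls >= THRESHOLD
  end
end

profile = MethodProfile.new
9.times { profile.on_call }
profile.compiled  # => false (still interpreted)
profile.on_call
profile.compiled  # => true  (hot: promoted to bytecode)
```

Keeping cold methods in the interpreter avoids paying bytecode-generation and class-loading costs for code that runs once.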
<lopex>
ah, on all so BBs can be built ?
<lopex>
er, "seen"
<chrisseaton>
yeah
<CodeWar>
chrisseaton: the bytecode spit out by the AOT is Java bytecode, I take it? at what point is the Ruby-specific IR generated and loaded by the runtime interpreter?
<CodeWar>
I guess I'm asking how one goes from bytecode to the Ruby runtime layer
<chrisseaton>
not really sure what you're asking any more - a lot of these terms are overloaded which doesn't help
<chrisseaton>
ask me one question from scratch being very explicit
<CodeWar>
chrisseaton: probably a brief description of what the static compiler spits out and how it is loaded, interpreted, and profiled at runtime
<chrisseaton>
by static compiler you mean the ahead of time compiler, jrubyc, right?
<CodeWar>
Thats right
<CodeWar>
Sorry lets use AOT from now onwards
<chrisseaton>
that's broken on 9k, so I'll explain how it works in 1.7
<chrisseaton>
it reads Ruby code into an AST; that AST is walked, and for each node some JVM byte code is generated; that's written to a Java class file, which is loaded when you want to run the Ruby program
<chrisseaton>
some of that JVM byte code will use objects such as CallSites
<chrisseaton>
these are instructions coupled with state, where profiling data can be stored
<chrisseaton>
but I think that's the limit of profiling in the AOT compiler
<chrisseaton>
in 9k it will be the same, except from the AST an IR will be generated, that IR will be transformed several times by the optimiser, and then JVM byte code will be generated from that
<CodeWar>
thanks. This makes complete sense. One question
<CodeWar>
at runtime how do you use the CallSites information. Are you regenerating more bytecode based on profile feedback?
<lopex>
afaik it's just pregenerated monomorphic callsites
<chrisseaton>
I'll simplify a bit here, but each CallSite contains an expected class and a method, when it's run the actual class is checked against the expected class and if they match the method is run, if they don't the method is looked up from scratch
<chrisseaton>
so I don't think new byte code is generated there; it's just some state in the object that changes
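chrisseaton's simplified CallSite description — cache an expected class and method, check the receiver's class on each call, and fall back to a full lookup on a miss — is a monomorphic inline cache. A self-contained sketch of that idea (this class and its names are illustrative, not JRuby's actual CallSite implementation):

```ruby
# Monomorphic inline cache: remember one (class, method) pair per call site.
class CallSite
  def initialize(name)
    @name = name
    @expected_class = nil
    @cached_method = nil
  end

  def call(receiver, *args)
    if receiver.class == @expected_class
      @cached_method.bind(receiver).call(*args)  # cache hit: no lookup
    else
      # cache miss: look the method up from scratch and re-cache
      @expected_class = receiver.class
      @cached_method = receiver.class.instance_method(@name)
      @cached_method.bind(receiver).call(*args)
    end
  end
end

site = CallSite.new(:upcase)
site.call("hello")  # => "HELLO" (miss, then cached for String)
site.call("world")  # => "WORLD" (hit: same class, no lookup)
```

Changing state in the site object is all that happens at runtime; no new bytecode is emitted, matching the description above.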
<CodeWar>
chrisseaton: so the extent of profiling here is limited to inline caching and not much more, is that correct?
<chrisseaton>
with invoke dynamic it's different, but not sure if the AOT uses that
<lopex>
chrisseaton: oh I mean preinstalled
<chrisseaton>
I think so in 1.7 yes
<CodeWar>
Got it
<chrisseaton>
the new IR is designed to allow more profiling I think
<chrisseaton>
profiling is what is needed to make Ruby fast, plus a way to readapt if your profiling turns out to be wrong (which is deoptimization)
<CodeWar>
but if you have more profile data you would probably need to regenerate fresh byte code from the IR + profile data
<chrisseaton>
Truffle does a huge huge amount of profiling - which does make the interpreter a little slower
<chrisseaton>
in Truffle every value is profiled for runtime constants, every branch is profiled for probability of going either way, every call site, dynamic or not, implicit or explicit is cached
<CodeWar>
Branch profiling for a high level VM is a bad idea
<CodeWar>
:-)
<chrisseaton>
we even profile the possible range of values, and I'm trying to use that to remove overflow checks on Fixnums at the moment
<chrisseaton>
but we profile at the Ruby level - we profile Ruby branches, and branches in the Ruby runtime
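The value-range profiling chrisseaton mentions for removing Fixnum overflow checks can be sketched like this. The class, method names, and the tagged-integer bound are all invented for illustration; a real compiler would consult such a profile when deciding whether to emit an overflow guard:

```ruby
# Toy value-range profile: record min/max seen, so a compiler could
# prove an addition cannot overflow a tagged integer.
class RangeProfile
  FIXNUM_MAX = 2**62 - 1   # illustrative tagged-integer bound

  attr_reader :min, :max

  def initialize
    @min = nil
    @max = nil
  end

  def record(value)
    @min = value if @min.nil? || value < @min
    @max = value if @max.nil? || value > @max
    value
  end

  # could `self + other` be compiled without an overflow check?
  def add_without_overflow_check?(other)
    max + other.max <= FIXNUM_MAX
  end
end

prof = RangeProfile.new
[1, 500, 42].each { |v| prof.record(v) }
prof.add_without_overflow_check?(prof)  # => true (500 + 500 is well under the bound)
```

If the profile later turns out to be wrong (a larger value appears), that is exactly the case deoptimization exists for, as discussed above.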
<CodeWar>
unless you are trace scheduling micro-ops into the CPU's scheduling back end it's almost always a bad idea ... the only exception is cold code removal ... i.e., 90-10 branches
<CodeWar>
70-30 branches, 80-20 branches, or 60-40 branches probably won't give you any better information for recompilation at such a high-level VM like the JVM or Ruby. I used to work on a DBT (think Transmeta) where this information was useful, but that was in the CPU's trace cache + frontend
<lopex>
chrisseaton: are ruby branches and runtime branches treated differently?
<lopex>
well, those are different
<CodeWar>
but anyways, I saw the Truffle/Graal presentation and was very interested
<chrisseaton>
no, and we selectively profile the ones that make sense in the runtime
<chrisseaton>
CodeWar: yeah, the most important branch profile info is taken-vs-not-taken, rather than actual probabilities
<CodeWar>
chrisseaton: yes the cold branch removal I agree...
<CodeWar>
asserts :-)
<chrisseaton>
I haven't been able to show that anything other than 0 or 1 makes a measurable difference - I think branch prediction is just so good these days
<CodeWar>
chrisseaton: not sure if it is much better these days, but unless you are directly scheduling into the CPU's reservation stations you cannot do much better with branch weights
<lopex>
do you happen to know how the CPU correlates code and entries in the BTB?
<CodeWar>
what would you do with 70-30 information if you had it? Well, simple: use more aggressive speculation on the 70 branch and less aggressive instruction scheduling on the 30 branch :-) Replace speculative load/stores with regular ones. But none of that makes sense once you are on the other side of the CPU front end
<chrisseaton>
if I had it in the compiler? well basically I would put the 70 branch behind a backward jump and the 30 behind a forward jump - the intel (and everything else) branch predictors will start off assuming that the 70 will be taken - so I may get one less stall! pretty useless really. if I had those numbers in the CPU - no idea - open research question I guess
<CodeWar>
chrisseaton: the trouble is branch weights change. Look at SPECInt GCC or libquantum from FP.
<CodeWar>
a static compiler should not make those kinds of decisions. A runtime compiler could, but the cost of profiling and regenerating usually outweighs the benefits
<chrisseaton>
well that's why we JIT rather than statically compile - but actually we don't deoptimize on bad branch profile numbers, as profiling in the compiled code to find that out would be far far too expensive
<CodeWar>
which is what I was typing :-) the cost of getting this information by placing counters is expensive. CPUs should be able to give you this information in a cheaper way
<CodeWar>
AMD did something like that, LWP or something I believe, not sure what information it had
<CodeWar>
the other thing you could do is every `n' ticks go back to interpreted mode and recollect branch weights; if the deviation is not much, continue with your compiled code
<CodeWar>
if its too much recompile :-)
<CodeWar>
I implemented this feature in a DBT I cannot name
<chrisseaton>
or randomly deoptimize every now and again to reprofile
<chrisseaton>
just try not to do it at the wrong time
<CodeWar>
ahh well its always the wrong time :-)
<CodeWar>
If you are profiling in the CPU front end to schedule u-ops there's a ton of profiling information available for cheap: [a] branch weights [b] cache miss information [c] MRU rollback (coherence conflicts) [d] LQ/SQ stalls ... On the mobile side quite a few CPUs do it to save power (a full OoO costs power)
<chrisseaton>
yeah, but you're well off into 'sufficiently smart compiler' territory now
<chrisseaton>
and I imagine working out how to do that stuff optimally is an NP-complete problem
<CodeWar>
it's a game of guesses and rollbacks
<CodeWar>
and making benchmarks look good
CodeWar has quit [Quit: (null)]
<lopex>
CodeWar: you could even execute both branches given no conflicts
<chrisseaton>
lopex: that's thread level speculation - it's not free though - you are still burning power
<lopex>
chrisseaton: I mean not to stall the pipeline
JRubyGithub has joined #jruby
<JRubyGithub>
[jruby] mkristian pushed 2 new commits to master: http://git.io/pJ31wg
<JRubyGithub>
jruby/master c55172a Christian Meier: use uri:classloader://META-INF/jruby.home for all cases with a jar-context...
<JRubyGithub>
jruby/master 8e34ffa Christian Meier: cleanup GEM_PATH when evaluating it...
JRubyGithub has left #jruby [#jruby]