<elskwid>
dkubb: Thanks for looking over that code!
<dkubb>
elskwid: no worries
<dkubb>
elskwid: that's some really nice code
<elskwid>
dkubb: Aw shucks.
<elskwid>
That means a lot!
<elskwid>
dkubb: Good catch on skipping the module eval, at that point my brain was so liquefied I just put down what was working.
<elskwid>
ha ha
<dkubb>
hehe
<elskwid>
dkubb: All cleaned up.
<dkubb>
sweet
<elskwid>
dkubb: It's been a blast to play around in here. Looking forward to getting involved in the other libs if you guys will have me.
<dkubb>
yeah, for sure
<dkubb>
there's so many things that are really interesting in this area
<dkubb>
we're not just building an ORM, it's a toolkit for building your own ORM
<elskwid>
dkubb: I've lurked for far too long. ha. Figured I'd better speak up when I saw solnic's post on the parley list.
<dkubb>
of course, we'll have some recommended configurations, and probably metagems that package things up in a few common ways
<elskwid>
dkubb: Yes. I've watched with interest.
<elskwid>
and of course, I can't stand the ORM I use day-in-and-day-out
<elskwid>
dkubb: It's such a smart way to do it. We do the same thing for our internal dependencies.
<elskwid>
Similar to your devtools just not as crazy.
<elskwid>
HA!
<dkubb>
hehe
<dkubb>
devtools is pretty crazy
<elskwid>
I mean that in a good way...
<dkubb>
but it saves a ton of time I think
<elskwid>
We just have ours in a metagem but I can see why you do it the way you do.
<elskwid>
and yes, the gains far outweigh the weirdness.
<elskwid>
Just having a toolset in sync is HUGE
<dkubb>
because it helps get all the low-hanging fruit out of the way before code needs to be reviewed, so code reviews can focus on higher-level stuff
<dkubb>
I still do one low level pass beforehand though
<dkubb>
just because no tool is perfect
<dkubb>
and I find my brain gets hung up on small things first. once they're taken care of I'm able to see the bigger picture
<elskwid>
dkubb: What do you mean one low level pass? Do you mean review your code?
<dkubb>
when I review code I usually do two passes
<dkubb>
my first pass is just looking at the code structure. variable names, idioms, etc
<dkubb>
just kind of low level stuff.. like how the code was put together
<dkubb>
when I do a second pass I'm usually looking at higher level design. how the classes fit together, the overall feature the code is trying to add, how it fits in with the rest of the code, etc
<dkubb>
I can't do both passes simultaneously because I can't context switch between the two. I think they require different ways of thinking
<elskwid>
I hear that.
<elskwid>
I'm currently struggling with a switch to emacs. I was telling solnic earlier that I had no idea how much I relied on the tree view (or folders or whatever) to help me maintain project-level context.
<dkubb>
code reviews are also a good way to communicate. it's one thing to talk about how you'd do something, but to have a working example to discuss is 10x better .. it gets everyone on the same page and is a good way to transfer skills
<dkubb>
heh
<elskwid>
dkubb: Yes!
<dkubb>
I made the switch to vim a year or two ago and it was hard
<elskwid>
dkubb: I'm thinking that pull requests should be opened the second you have code down that runs.
<dkubb>
but I love it. I wouldn't go back to what I was using before (Textmate)
<elskwid>
It has been huge for me to have you guys looking over that pull request.
<elskwid>
dkubb: I was on Sublime.
<dkubb>
elskwid: heh, sometimes at work I make a commit with --allow-empty, add a comment, and then push the new branch. I open a PR with that branch
<elskwid>
dkubb: Perfect.
<dkubb>
the nice thing is that the comment is used as the PR description
<elskwid>
I think that is a great way to do it.
<dkubb>
this only works if you have a single commit though
<Gibheer>
morning
<dkubb>
Gibheer: good evening
mbj has joined #rom-rb
<mbj>
.
<dkubb>
mbj: I laid out a few really simple skeleton files for sql
<dkubb>
mbj: I've also added you as a committer
<dkubb>
mbj: good idea to start with the AST outline.
<mbj>
dkubb: yeah
<dkubb>
mbj: should we start with a few simple SQL queries as goals for what we want to be able to parse and generate? something simple like SELECT * FROM users
<mbj>
dkubb: I typically start with literals, boolean expressions and then the more complex stuff
<mbj>
dkubb: it's more fun when speccing
<dkubb>
mbj: ok, so you build up from the smallest primitives?
<mbj>
dkubb: Yeah, I've done this so often, and it turns out bottom-up works.
<dkubb>
mbj: ok, I'll follow your lead on the parser/ast side of things ;)
<mbj>
dkubb: thx
<mbj>
dkubb: The main reason is: Specs look really nice!
<dkubb>
mbj: luckily I think SQL syntax is probably simpler than ruby
<mbj>
dkubb: So once you have something like s(:select, s(:all_fields), s(:identifier, :users))
<mbj>
dkubb: For this you need to specify how the s(:identifier, :users) should be generated first
<mbj>
it will go really smoothly. I'll redo aql with this style, and yeah, SQL will probably be many times easier than Ruby
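[A minimal Ruby sketch of the bottom-up approach mbj describes, using the s() helper from the whitequark ast gem; the emitter and its node handling are a toy illustration, not the project's actual generator design.]

    require 'ast'
    include AST::Sexp   # provides s(type, *children)

    # Emit the smallest nodes first, then the composite :select built from them.
    def generate(node)
      case node.type
      when :identifier then %("#{node.children.first}")
      when :all_fields then '*'
      when :select
        fields, relation = node.children
        "SELECT #{generate(fields)} FROM #{generate(relation)}"
      else
        raise ArgumentError, "unknown node type: #{node.type}"
      end
    end

    generate(s(:identifier, :users))                              # => '"users"'
    generate(s(:select, s(:all_fields), s(:identifier, :users)))  # => 'SELECT * FROM "users"'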
<dkubb>
are you still thinking about using ragel?
<mbj>
We could also try to do it in parslet
<dkubb>
I don't mind ragel if you want to use it
<mbj>
So the whitequark ruby parser uses ragel for tokenizing and racc for parsing
<dkubb>
I've kinda been interested in learning it. either one is fine with me though, I'll let you decide
<mbj>
We should reuse an existing grammar if possible
<mbj>
We call methods on the builder object within actions
<mbj>
I'd just clone the whitequark design here.
<dkubb>
oh I see, and the builder puts those into an ast or a stack or something as they are seen?
<mbj>
yeah
<mbj>
Also I think peter (whitequark) will be able to help us if we hit a dead end. Neither of us is a parser god ;)
<mbj>
But let's play around before calling for help ;)
<dkubb>
and then this y file is used by ragel to generate ruby source that tokenizes the sql
<mbj>
not by ragel
<mbj>
.y files are processed by racc
<dkubb>
ahh I see
<mbj>
into giant .rb files
<mbj>
parsers only work on tokens
<dkubb>
so racc builds the parser from this
<mbj>
Ragel is used in whitequark/parsers to break the input source into a token stream
<mbj>
That gets consumed by the parser defined in the .y files
<dkubb>
ahh I see
<mbj>
racc is able to generate a parser with a lookahead of one token (fast).
<mbj>
ragel is not!
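[A rough picture of that split in data terms; the token names below are illustrative, not whitequark/parser's actual token set.]

    # A tokenizer (e.g. one generated by ragel) hands the parser a flat stream of
    # [type, value] pairs; an LALR(1) parser like the ones racc generates consumes
    # this with a single token of lookahead and no backtracking.
    TOKENS_FOR_SELECT_ALL = [
      [:tSELECT,     'SELECT'],
      [:tSTAR,       '*'],
      [:tFROM,       'FROM'],
      [:tIDENTIFIER, 'users'],
      [false,        false]     # racc's conventional end-of-input marker
    ]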
<dkubb>
what's the speed on all this?
<mbj>
It depends on the complexity of the grammar
<mbj>
For example ruby must be tokenized and parsed at once.
<dkubb>
OT, but can this be used to parse a stream? i.e. does the whole string need to be known before parsing can start?
<mbj>
depends
<mbj>
I typically say stream processing is a lie, because the last character of the stream can still make the whole input invalid.
<dkubb>
right
<mbj>
stream processing implies you need error tolerance
<mbj>
A parser generated by racc only needs one token of lookahead
<mbj>
So it is pretty fast. But I don't have measurements here, nor the theoretical background.
<mbj>
I like to get stuff working and correct first ;)
<dkubb>
yeah, I was mostly curious
<mbj>
Once I talked to whitequark I also asked the same
<mbj>
The parsers generated by racc are from the "fastest" class because of the one token lookahead.
<mbj>
And no need for backtracking
<mbj>
But as ruby drives the parser it will still be slow
<mbj>
I noticed a 200% penalty when going from melbourne to whitequark/parser
<mbj>
But this is okay for 1.9, 2.0 and 1.8 support
<mbj>
And for pure Ruby. That 200% was on MRI; I expect JITed Rubies can easily optimize the generated parsers.
<dkubb>
once we have a good pure ruby reference implementation, I suppose someone could always optimize it in C or java
<mbj>
Yeah
<mbj>
And we can measure what exactly is slow
<dkubb>
if it's 100% covered, and mutated, then our test cases will help make a faster version bullet proof
<dkubb>
yeah
<mbj>
I think it is possible to parse SQL with a pure ragel parser
<mbj>
But lets start with a proven approach ;)
<dkubb>
yeah
<dkubb>
the round-trip tests etc. will all still be valid
<mbj>
BTW the tokenizer in MRI was "handwritten", I really hope some implementations use the whitequark thing
<mbj>
I'd love to write an RBX compatible bytecode emitter.
<dkubb>
so above you said you wanted to start with literals. we could first parse a literal into an AST, then generate it so we can round-trip it. then we could look at the fuzzer side of things.. going node-type by node-type
<mbj>
yeah, but maybe I'll go faster and cover more nodes for generation
<mbj>
rather than dealing with the parser
<mbj>
This allows me to spot cases where the target AST is misdesigned.
<mbj>
Round trip tests are really nice I know, but my time is limited.
<dkubb>
you can attack this from any angle you want ;)
<mbj>
If we have a stable target ast we have all we need to plug this into axiom
<mbj>
Basically we transform the axiom ast (node tree) into an sql ast.
<dkubb>
yeah
<mbj>
The better the SQL AST is designed, the more fun we'll have doing this.
<dkubb>
then I'll be optimizing the sql ast
<mbj>
Yeah
<mbj>
First do a naive 1:1 translation
<dkubb>
there are some optimizations we can do, like transforming a natural join into an inner join
<dkubb>
yeah
<mbj>
And then collapse nodes where possible.
<dkubb>
yeah, I already learned a bunch of the collapsing rules
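[A sketch of the "naive translation first, then collapse" idea using the ast gem; the single rewrite shown (x AND TRUE => x) is only an illustration of the mechanism, and rules like natural join => inner join would slot in the same way once the node shapes are settled.]

    require 'ast'
    include AST::Sexp

    # Post-order pass: collapse the children first, then try to simplify self.
    def collapse(node)
      return node unless node.is_a?(AST::Node)
      node = node.updated(nil, node.children.map { |child| collapse(child) })
      if node.type == :and
        left, right = node.children
        return left  if right == s(:true)   # x AND TRUE => x
        return right if left  == s(:true)   # TRUE AND x => x
      end
      node
    end

    collapse(s(:and, s(:eq, s(:identifier, :a), s(:int, 1)), s(:true)))
    # => the s(:eq, s(:identifier, :a), s(:int, 1)) node, with the redundant AND stripped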
<mbj>
nice, and IMHO it will be fast enough
<mbj>
I do not care if axiom -> axiom-optimizer -> axiom-sql-generator -> sql-optimizer takes 5-10ms
<mbj>
The typical db roundtrip will be longer
<mbj>
And if we can use a more complex query that lowers the number of SQL queries, we've already won.
<dkubb>
yeah, the cost in time to optimize an sql query is probably 1/100th to 1/10th of a ms
<dkubb>
at least in terms of collapsing, pruning and transforming
<dkubb>
what we're really doing is normalizing
<mbj>
yeah
<dkubb>
axiom-optimizer is super fast. it's much less than that, and it's probably the most expensive part
<mbj>
I'm getting my child ready for kindergarten and moving to the office, back in 30min
<dkubb>
k, I will probably be sleeping then, so good night!
<mbj>
ptico: I'd suggest you still use --rspec-dm2 especially as you whitelist the objects.
<ptico>
mbj: thanks! that's what i want
<mbj>
ptico: Remember, mutant will "recurse" into the namespace. So if you have Foo::Bar::Baz and specify Foo::Bar, it will also mutate Foo::Bar::Baz. I fixed this in master already.
<mbj>
For recursion you need to specify ::Foo::Bar** ;)
<ptico>
mbj: great
<mbj>
ptico: So if you're in the situation where you want ::Foo::Bar but not ::Foo::Bar::Baz, you need to specify the methods like ::Foo::Bar#{a,b,c} etc.
<ptico>
mbj: btw, i'm ok with --rspec-dm2, shared contexts help me keep it clean
<mbj>
ptico: Yeah, but sometimes it is ugly; I'll provide overrides in the future. And I think we'll have a .mutant.yml ;) But I'd like a .mutant.rb more, because you may want to add specific logic. Dunno, will explore.
<mbj>
ptico: Thx for the feedback! Keep it coming!
<solnic>
mbj: hmm what'd be the spec for unary plus? s(:+, '2') # => '+2'?
<mbj>
solnic: no, don't use the symbolic operator as the node type
<solnic>
ugh yeah that was just a shortcut here
<mbj>
s(:uplus, argument)
<mbj>
s(:uplus, s(:int, 2))
<mbj>
s(:uminus, s(:float, 3.0))
<mbj>
So you do
<mbj>
def dispatch
<mbj>
write(operator); visit(first_child)
<mbj>
end
<solnic>
rite
<solnic>
I wasn't sure about that visit(first_child)
<solnic>
rite
<solnic>
thanks
<mbj>
And you can try to collapse these into a single scalar unary node if you like
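[A toy Ruby version of that collapsed unary handling, again with the ast gem's s() helper; the method names and operator table are made up for illustration.]

    require 'ast'
    include AST::Sexp

    UNARY_OPERATORS = { uplus: '+', uminus: '-' }.freeze

    # One dispatch path for both unary operators, keyed off the node type.
    def emit_unary(node)
      operand = node.children.first
      "#{UNARY_OPERATORS.fetch(node.type)}#{emit_literal(operand)}"
    end

    def emit_literal(node)
      node.children.first.to_s   # good enough for :int and :float here
    end

    emit_unary(s(:uplus,  s(:int, 2)))     # => "+2"
    emit_unary(s(:uminus, s(:float, 3.0))) # => "-3.0"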
<mbj>
The boring fact is, you're doing stuff I should do. It takes more time to guide you than to do it myself.
<mbj>
But I hope this changes soon :D
<mbj>
This is not blaming you! It's normal that it takes time until the message sinks in.
zekefast has joined #rom-rb
<mbj>
elskwid: hola
<mbj>
solnic: And yeah, nice you made it there!
<elskwid>
Hi solnic , Hi mbj
<elskwid>
You guys are rockin' that AST
<solnic>
mbj: I warned you alright
<mbj>
solnic: haha
<solnic>
mbj: also, it's bad only you and dkubb could do this stuff
<mbj>
solnic: You can already do it, you dont trust yourself here.
<solnic>
mbj: no I don't
<solnic>
elskwid: really? I feel like I've spent the entire day doing something that should take 15 minutes lol
<mbj>
solnic: That is the point where I normaly move to a pice of paper and convince person I'm helping he knows all that stuff already.
<solnic>
mbj: ok so identifiers
<elskwid>
solnic: Ha ha.
<elskwid>
I'll let you guys get back to it. solnic, when you get a chance take a look at that PR. It feels pretty good to me. When it's merged assign me the next one! (I would suggest a spec update to the new syntax perhaps)
<elskwid>
I'm going to go run before the heat.
<mbj>
solnic: s(:ident, "foo") => %q("foo")
<solnic>
elskwid: ok I will merge it in!
<mbj>
solnic: s(:ident, "foo\"") => %q("foo""")
<mbj>
solnic: IMHO s(:id) is best!
<solnic>
mbj: yeah I like short
<solnic>
names
<mbj>
solnic: We'll have to call that s() macro often ;)
<solnic>
mbj: what's the diff between id and string?
<mbj>
solnic: SELECT * FROM "foo" <- This is the identifier
<mbj>
not a normal string
<mbj>
identifiers get quoted with double quotes
<mbj>
strings with single quotes.
<solnic>
rite
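[A small sketch of the two quoting rules just described; the helper names are made up for illustration.]

    # Identifiers: wrap in double quotes; escape an embedded " by doubling it.
    def quote_identifier(name)
      %("#{name.gsub('"', '""')}")
    end

    # String literals: wrap in single quotes; escape an embedded ' by doubling it.
    def quote_string(value)
      "'#{value.gsub("'", "''")}'"
    end

    quote_identifier('foo')    # => %q("foo")
    quote_identifier('foo"')   # => %q("foo""")
    quote_string("it's")       # => %q('it''s')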
<mbj>
solnic: After that I'd say we can create a single class that handles ALL binary operators, scalar and boolean.
<mbj>
solnic: IMHO, as far as I remember SQL.
<mbj>
solnic: Then we should add "attributes", nothing more than scoped identifiers.
<mbj>
solnic: binary operators, scalar and boolean
<mbj>
solnic: We don't have all the literals yet, but I think we should do some more structural nodes now.
<mbj>
Because they are more fun.
<mbj>
solnic: s(:and, s(:true), s(:false)) => 'TRUE AND FALSE'
<mbj>
def dispatch; visit(left); write(WS, TABLE.fetch(node.type), WS); visit(right); end
<mbj>
where left and right are children[0] and children[1]
<mbj>
I'd do a multiple assignment like left, right = children
<solnic>
mbj: ok cool
<mbj>
solnic: then we have to "fix" nested binaries
<mbj>
solnic: s(:and, s(:and, left, right), right2) => (left and right) and right2
<mbj>
And because this would be stupid to implement (inspecting the child to see whether it needs parentheses), I'd say we wrap both left and right in parentheses.
<mbj>
so:
<mbj>
s(:and, left, right) => (left) and (right)
<mbj>
My rule is to only wrap children, never wrap self. This way we at least don't have an unneeded outer pair of parentheses.
<mbj>
We can optimize later, but it is okay for generated code to be very explicit; maybe we'll also avoid some precedence bugs in implementations.
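[A toy binary-operator emitter following the "wrap the children, never self" rule; TABLE and the emit method are illustrative stand-ins, not the real generator classes.]

    require 'ast'
    include AST::Sexp

    TABLE = { and: 'AND', or: 'OR', eq: '=' }.freeze

    def emit(node)
      case node.type
      when :true  then 'TRUE'
      when :false then 'FALSE'
      else
        left, right = node.children
        # Parenthesize both children unconditionally: verbose, but always correct.
        "(#{emit(left)}) #{TABLE.fetch(node.type)} (#{emit(right)})"
      end
    end

    emit(s(:and, s(:true), s(:false)))
    # => "(TRUE) AND (FALSE)"
    emit(s(:and, s(:and, s(:true), s(:false)), s(:true)))
    # => "((TRUE) AND (FALSE)) AND (TRUE)"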
<dkubb>
mbj: at some point we'll probably want to do some inspection of the child to only wrap when necessary.. but for now what you're proposing is safer
<mbj>
dkubb: yeah
<mbj>
dkubb: We'll prettify once we can round trip.
<mbj>
dkubb: Also it reduces lots of complexity with explicit parentheses.
<mbj>
dkubb: I plan for the child to be able to tell whether it has a higher or lower precedence relative to its parent, so we can create a reusable visit_at_precedence(child)
<mbj>
dkubb: This would query the child and wrap it inside parens if needed.
<mbj>
dkubb: Or we just patch #visit to do this and remove all the parentheses { visit(some_child) } calls.
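[One possible shape for that precedence-aware helper, reusing the toy emit from the sketch above; the precedence values are invented, and the signature takes the parent's type explicitly just to keep the sketch standalone.]

    # Wrap the child in parens only when it binds more loosely than its parent.
    PRECEDENCE = { or: 1, and: 2, not: 3, eq: 4 }.freeze

    def visit_at_precedence(parent_type, child)
      if PRECEDENCE.fetch(child.type, 100) < PRECEDENCE.fetch(parent_type, 0)
        "(#{emit(child)})"
      else
        emit(child)
      end
    end

    visit_at_precedence(:and, s(:or, s(:true), s(:false)))
    # => "((TRUE) OR (FALSE))"   OR binds more loosely than AND, so it gets wrapped
    visit_at_precedence(:or, s(:and, s(:true), s(:false)))
    # => "(TRUE) AND (FALSE)"    AND binds tighter, so no extra wrapping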
ddfreyne has quit [Excess Flood]
ddfreyne has joined #rom-rb
kapowaz has quit [Read error: Operation timed out]
<snusnu>
mbj: the reek integration cucumber features want it to report duplicate method call for: puts(@s.title)
<snusnu>
mbj: imo that's wrong
<snusnu>
mbj: it should complain about @s.title
<snusnu>
mbj: there's really nothing wrong with calling a method (especially one like puts) with the same argument more than once?
<snusnu>
mbj: or maybe i'm just crazy after trying to read reek code plus grokking the sexps
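[A guess at the kind of snippet the cucumber feature exercises; whether reek should flag puts(@s.title) or only @s.title is exactly the disagreement here.]

    # Illustrative code of the sort the duplicate-method-call check runs on:
    class Report
      def initialize(story)
        @s = story
      end

      def print_title
        puts(@s.title)
        puts(@s.title)   # the feature expects reek to report a duplicate method
                         # call for puts(@s.title); snusnu argues only @s.title
                         # is the duplicated call worth flagging
      end
    end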
solnic has quit [Ping timeout: 248 seconds]
<snusnu>
mbj: btw, looks like rubocop's functionality is nowhere near reek's
<mbj>
snusnu: okay
<mbj>
snusnu: ruby_parser sexps are stupid
<snusnu>
mbj: the reason i was initially asking whether we want to only warn on methods with no parameters kinda came from me not knowing how to work around that integration spec issue
<snusnu>
mbj: but tbh, i'm not interested in writing that patch anymore