SeanTAllen changed the topic of #wallaroo to: Welcome! Please check out our Code of Conduct -> https://github.com/WallarooLabs/wallaroo/blob/master/CODE_OF_CONDUCT.md | Public IRC Logs are available at -> https://irclog.whitequark.org/wallaroo
moas has joined #wallaroo
moas has quit [Ping timeout: 240 seconds]
moas has joined #wallaroo
moas has quit [Ping timeout: 248 seconds]
moas has joined #wallaroo
moas has quit [Ping timeout: 256 seconds]
<cajually> SeanTAllen: Sorry, I lost my internet connection for the rest of the day. I realized it was just the print output that was affected; the scripts seem to manage pipefail etc. correctly. I'm working on a quick and dirty functioning machida with python3.5, currently working out some strings/unicode differences.
<cajually> I'm the asshole behind #2334 btw and I have a similar amount of questions around the state management :)
<cajually> anyway what is the de facto preferred medium for wallaroo discussion?
moas has joined #wallaroo
moas has quit [Ping timeout: 248 seconds]
moas has joined #wallaroo
moas has quit [Ping timeout: 248 seconds]
moas has joined #wallaroo
<SeanTAllen> cajually: I think that depends. For freeform discussions, either IRC or the mailing list. If you have something specific and actionable, GH issues are best.
<SeanTAllen> Given the difference in timezones, I think mailing list would probably work out better than IRC cajually
moas has quit [Remote host closed the connection]
<cajually> Yeah, was thinking for less actionable stuff, and preferably somewhere that doesn't send ~40 people an email from GitHub every time you write something
<cajually> Mailing list sounds about right. Anyway I've made some progress with my python3 port; porting the pony side of things has been slower as I don't know the language. I'll probably make some helper functions for dealing with python bytes(), as the old PyString stuff that is used for buffers currently doesn't quite match the new ways
moas has joined #wallaroo
<cajually> for python3 we have the option of targeting the Unicode interface or the Buffer interface instead; sometimes the Unicode one is the correct way, but in most situations Buffer is much better. Neither of them has a simple size operation: in the Unicode case, serialization has to happen before we can know the size, and the reason we try to figure out the size ahead of time is to allocate the correct
<cajually> amount of memory, it seems. And the buffer protocol requires different things to figure out the size, and potentially dealing with non-contiguous buffers. All in all I'm not trying to make the best implementation, just one that lets me find the issues and port the tests
<cajually> I'll be traveling until tuesday and probably will have no more time from now to then btw
moas has quit [Ping timeout: 240 seconds]
<cajually> Typing that out I realized that there should be a PyBytes type I could have used that should map very nicely. Disregard most of that stuff I guess
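The sizing difference discussed above can be seen from the Python side. A minimal sketch (illustrative only, not machida code) of why `bytes` maps more directly onto a C buffer than `str` does:

```python
# Python 3: bytes knows its byte size up front; str must be encoded
# (serialized) before the byte size is knowable.

text = "héllo"                 # str: a sequence of code points
data = text.encode("utf-8")    # bytes: a sequence of raw bytes

# len() on str counts code points, not bytes.
print(len(text))   # 5 code points
print(len(data))   # 6 bytes: 'é' takes two bytes in UTF-8

# For a buffer handed to C, bytes gives a stable size directly,
# which is why targeting the bytes/Buffer side avoids an extra
# serialization pass just to learn how much memory to allocate.
assert len(data) == 6
```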
rblasucci has joined #wallaroo
moas has joined #wallaroo
moas has quit [Ping timeout: 268 seconds]
<SeanTAllen> cajually: will do. For the Pony stuff, we can definitely help you out there. What timezone are you in? We might be able to work out some pairing from time to time to assist if that would be something that interests you.
moas has joined #wallaroo
<cajually> I'll be in GMT+8 (china, taiwan, singapore etc), and that sounds interesting. I'm currently waiting on a working visa process and have an unknown amount of full time available. Regardless, I really like the project and would love to be able to contribute something that can help traction. And I think that the python3 support is a perfect start; I've been looking for something like Wallaroo to do real time data
<cajually> processing for ML setups, a world that severely lacks open source tooling
<cajually> I think Wallaroo is closest to having the IMO perfect approach to allow for this
<cajually> I don't think python is going anywhere, and I don't think that data scientists have the time to care for JVM languages. Flink isn't amazing, Spark has too high latency. The others are plentiful but too small, and for good reason. In the end, doing in-memory stuff correctly and distributed, while not confined to the JVM, is the way forward.
<cajually> ... I realize I have like 10 more paragraphs on why I think Wallaroo is the project in its space (and adjacent) that I'm putting my money (read: time) on, and I should just make a completely independent blog post on it
<SeanTAllen> cajually: i shared what you said with the team. it's definitely nice to hear from folks who get what we are doing. thank you.
<SeanTAllen> let me know how you would like to get assistance with the pony portions of what you are doing and i'll get you help.
<cajually> Happy to hear that! I know what external validation can mean, having worked on very technical products in what was probably the smallest team possible for the task.
<cajually> Regarding Pony, I fail to find a language specification, is there one? The very soft documentation in the form of a tutorial and some stdlib docs doesn't really provide a direct path to understanding how the stack works, the builtin types with boxing, etc
<aturley> cajually there's no spec at that level right now.
<aturley> folks in the irc channel and mailing list are pretty happy to answer those kinds of questions, but at this point there's not a good single repository for those pieces of knowledge.
<aturley> i mean, other than reading the compiler and runtime code.
<aturley> (it is pretty readable code, but maybe not the fastest way to get questions answered)
<SeanTAllen> Following on what aturley said, I'm happy to answer Pony language questions either here or #ponylang channel cajually.
<cajually> that's great to know. I've wanted a language like pony for a long time tbh, just very hard to google things with pony in the query and get good results currently..
<cajually> I'll see if I can find a reasonable workflow reading the compiler source but often it's very hard to trace the details through lexing and code generation
<cajually> I'll probably have questions
<aturley> yeah, agreed. you're probably better off asking in IRC if you want to know something specific.
<cajually> I was part of a team making something like pony a long time ago, https://github.com/hnsl/librcd, though under a different github account that I've lost access to
<cajually> in the end that company died from NIHS
<cajually> (and competitors having better product-market fit)
moas has quit []
<SeanTAllen> cajually: librcd looks interesting
<cajually> I noticed that wallaroo uses libgold; does it provide a large performance boost for pony, maybe in lieu of more language-specific optimisation?
<cajually> librcd was honestly a lot of fun, we experimented with strange ways to do concurrency and memory management
<cajually> in the end the way we did memory management was very expensive. Didn't stop us from building a massive stack on top of it tho
<SeanTAllen> do you mean the gold linker cajually ?
<cajually> yeah
<cajually> the link time optimisation
<SeanTAllen> there can be some non-negligible improvements for some code.
<cajually> I can imagine that is a fantastic band aid if inlining is not done properly
<SeanTAllen> like many things in optimization, it's an "it depends" sort of answer
<cajually> yeah of course
<SeanTAllen> so far, we haven't found any cases of link time optimization introducing bugs
<SeanTAllen> so it's come a long way from when it was introduced and was usually more of a way to create bugs
<SeanTAllen> you can turn LTO on and off with Pony.
<slfritchie> cajually: My TODO-soon list includes adding a bit to the Gotchas section of the Pony language docs to summarize stack use. I'd shot myself in the foot, overrunning the Pthreads stack and then having some very odd actor behavior and SIGSEGV crashes result. (Silly me.)
<cajually> Last time I used it, a long time ago, it could not be used with gcc -O3 at all
<cajually> happy to see that it is used with -O3
<cajually> I can imagine that there are a lot of stack gotchas, because I read it as C yet it is an actor model
<SeanTAllen> Pony was the first time I used LTO and didn't have it blow up on me. I'd avoided it for a while.
<cajually> another thing I have not yet understood is how the pony calling convention works
<cajually> (gah, my ssh is killing me, 300ms from where I am)
<cajually> is it pure C stack-based, with no TCO unless you get lucky?
nisanharamati has joined #wallaroo
<SeanTAllen> TCO is an optimization only in Pony
<SeanTAllen> There's no Pony specific support for TCO outside of the optimizations that LLVM can do.
<cajually> figured, the only alternative that made sense was if the erlang or haskell calling convention was implemented
<cajually> but CFFI looks too smooth for that to be true
<slfritchie> Yes. Ignoring optimizations, each behavior or regular function call uses the stack in the same way that C does. Instead of `main` at the bottom of the stack, a Pony scheduler Pthread has a variable number of frames related to scheduling, then a mostly 1-1 mapping of the behavior's function calls on the stack. The consequences of overrunning the Pthreads stack size are identical to plain old C/C++/etc.
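The foot-gun slfritchie mentions, overrunning a fixed-size pthread stack, exists in any runtime that maps calls directly onto the native stack. A hedged Python analogy (CPython raises RecursionError before the native stack blows up, where C/C++/Pony code would SIGSEGV, but the per-thread stack-size knob is the same pthread mechanism):

```python
import threading

# threading.stack_size() sets the pthread stack size for threads
# created afterwards -- the same knob a native runtime's scheduler
# threads depend on. 4 MiB is comfortably above any platform minimum.
threading.stack_size(4 * 1024 * 1024)

def depth(n=0):
    # Recurse until Python's own guard rail (RecursionError) trips.
    try:
        return depth(n + 1)
    except RecursionError:
        return n

result = []
t = threading.Thread(target=lambda: result.append(depth()))
t.start()
t.join()
# The thread bottomed out at a finite depth instead of crashing:
# CPython's recursion limit fires before the native stack overruns.
print(result[0] > 0)  # True
```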
<cajually> o7 nisanharamati
<nisanharamati> hiya
<cajually> that sounds similar to what we did in librcs
<cajually> librcd*
<cajually> are there a lot of TLA traps associated with actors?
<SeanTAllen> cajually: there are some areas where C-FFI needs to be improved, some things aren't possible but the straightforward stuff is straightforward
<SeanTAllen> im not familiar with the term TLA trap cajually. Trap, yes, not TLA trap.
<slfritchie> TLA trap? (Sorry, Google is distracted by music & not compilers)
<cajually> say that you use some C library that wants to allocate heap memory and calls its own malloc that it has pulled in; that could cause issues with pony if memory allocation is done as thread-local allocations
<SeanTAllen> cajually: it shouldn't. depending on how you define "cause issues". thread locals dont play well with pony because no actor is guaranteed to always be run by the same scheduler thread.
<cajually> ok that is good to know
<SeanTAllen> the allocating of memory itself should be an issue
<SeanTAllen> * shouldn't
<cajually> scheduling of actors/fibers can often be made a lot faster by making all their memory thread-local and allowing for that move to be expensive in case of work stealing
<SeanTAllen> man, i had to read that 5 times to realize it said should instead of shouldn't
<cajually> haha
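The thread-local hazard SeanTAllen flags above can be sketched in Python with `threading.local` (an analogy, not Pony code): state a library stashes in thread-local storage silently disappears once the work hops to another thread, which is exactly what happens when an actor is rescheduled onto a different scheduler thread.

```python
import threading

# threading.local() gives each OS thread its own copy of attributes.
tls = threading.local()
tls.value = "set on the main thread"

seen = []

def worker():
    # Runs on a different thread: the attribute set above is absent.
    seen.append(hasattr(tls, "value"))

t = threading.Thread(target=worker)
t.start()
t.join()

print(hasattr(tls, "value"))  # True on the thread that set it
print(seen[0])                # False on the worker thread
```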
<SeanTAllen> Pony has a pool allocator that all actor memory comes from
<SeanTAllen> If you use "new" in some form, it comes from the pool
<SeanTAllen> With the usual heap/stack constraints
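For readers unfamiliar with the idea: a pool allocator hands out fixed-size blocks from a free list of previously released blocks instead of hitting the system allocator for every object. A toy Python sketch of the mechanism (purely illustrative; nothing like the actual size-classed pools in the Pony runtime):

```python
class Pool:
    """Toy fixed-size block pool: reuse freed blocks before growing."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.free = []  # free list of reusable blocks

    def alloc(self):
        if self.free:
            return self.free.pop()          # reuse: no new allocation
        return bytearray(self.block_size)   # grow the pool

    def release(self, block):
        self.free.append(block)             # return block to the pool

pool = Pool(64)
a = pool.alloc()
pool.release(a)
b = pool.alloc()
print(a is b)  # True: the freed block was reused, not reallocated
```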
<cajually> interesting, has this been considered a performance bottleneck?
<SeanTAllen> There's an optimization pass that will move heap allocations to the stack where possible
<SeanTAllen> cajually: nope.
<cajually> I remember having issues where we ended up effectively spending a lot of time pushing data between cores
<SeanTAllen> Pony tries to keep actors on the same core, but, in the end, that is basically magic.
<cajually> because the scheduler kept moving our threads, then we pinned the threads with cgroups
<SeanTAllen> So, pony supports pinning threads to a core.
<SeanTAllen> There's an option you can pass at runtime
<cajually> and then we ended up moving to TLA
<SeanTAllen> well, sorry
<SeanTAllen> it will pin scheduler threads by default
<SeanTAllen> you have to ask for them to not be pinned
<SeanTAllen> you have to ask for the thread that handles asio events to be pinned
<cajually> hah
<SeanTAllen> by default the runtime will start 1 scheduler thread per core and pin to it
<cajually> this is almost identical to what we did
<cajually> by trial and error
<SeanTAllen> you can use cgroup and the --ponythreads=X option to only use some cpus
<SeanTAllen> and to devote them solely to your app (this is best practice and discussed in the Pony performance notes on the website)
<cajually> I'll check that out, anyway I feel like we could go on for hours talking about implementation details
<cajually> and I think it's better I do some reading now
<cajually> close to 1 AM here :)
<SeanTAllen> Enjoy the rest of your night cajually
<cajually> SeanTAllen: thanks for your patience and good night! I got some level of python3 running, currently going through the python module and the unit tests. Once I'm done I'm sure we will have a great conversation about how it should actually work and how to proceed etc
<SeanTAllen> Awesome
<nisanharamati> 👍
rblasucci has quit [Quit: Connection closed for inactivity]
nisanharamati has quit [Quit: Connection closed for inactivity]
puzza007 has quit [Quit: ZNC 1.8.x-nightly-20180801-e2a96470 - https://znc.in]
puzza007 has joined #wallaroo