SeanTAllen changed the topic of #wallaroo to: Welcome! Please check out our Code of Conduct -> https://github.com/WallarooLabs/wallaroo/blob/master/CODE_OF_CONDUCT.md | Public IRC Logs are available at -> https://irclog.whitequark.org/wallaroo
<SeanTAllen> nemosupremo: can you share your python source? there's probably some error in the source we aren't currently handling and printing out a nice message for. we had a couple of those in the past related to Python objects being different than what was expected. We probably missed a case where that was possible.
<SeanTAllen> But yes nemosupremo that would be #960 that you are probably running into
<nemosupremo> I get to '|v|v|v|Initializing Local Topology|v|v|v|' before it segfaults
<nemosupremo> @SeanTAllen there are 2 imports (from x import y) that you can comment out or remove
<nemosupremo> ok, I exited the docker container, then started it again and it worked
<SeanTAllen> that's really weird
<SeanTAllen> yeah it works for me over here fine.
<SeanTAllen> hmmm...
<SeanTAllen> i wonder what could have happened
<SeanTAllen> so you restarted the docker container and it worked... hmmm... ok, i'm going to be thinking on that and try to figure out a way to recreate it.
<nemosupremo> weird, ever since I restarted the docker container, I haven't been able to reproduce the issue
<SeanTAllen> some days, computers are not my favorite things
<nemosupremo> I must have messed up the environment somehow, sorry
<SeanTAllen> i make no such assumption. but something happened. i'm bummed we can't reproduce.
<nemosupremo> One thing I noticed is if I ctrl+c the machida application, and restart it, I get stuck on 'Need ClusterInitializer to inform that topology is ready'
<nemosupremo> is that the wrong way to stop the application?
<SeanTAllen> that should work. we found a bug in the linux signal handling we set up, so it wasn't doing a clean shutdown correctly. that is fixed on master and will be in the next release.
<SeanTAllen> it works on OSX, not Linux at the moment.
<SeanTAllen> there's a cluster shutdown tool you can use to shut it down cleanly or...
<SeanTAllen> all the files it would remove on clean shutdown can be manually removed in /tmp
<nemosupremo> If it's broken on Linux, that might mean it's broken inside docker?
<SeanTAllen> ya
<SeanTAllen> it would be an issue in docker
<SeanTAllen> for my application with application module "a", these are the files it removes on shutdown...
<SeanTAllen> and the cluster shutdown tool would initiate a clean shutdown and remove all those
<SeanTAllen> https://github.com/WallarooLabs/wallaroo/tree/master/utils/cluster_shutdown has the directions for the cluster shutdown tool (it should already be built in the Docker container)
<SeanTAllen> You'd want to make sure the command line you start the application with includes an "external command address" https://docs.wallaroolabs.com/book/running-wallaroo/wallaroo-command-line-options.html
<SeanTAllen> --external
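(For reference, a minimal sketch of what that looks like on the command line; the addresses and other options here are illustrative, not taken from the conversation above:)

    # start the application with an external channel it can receive the
    # shutdown message on (address is an example)
    machida --application-module a ... --external 127.0.0.1:5050

    # later, initiate a clean shutdown by pointing the cluster shutdown
    # tool at that same external address
    cluster_shutdown 127.0.0.1:5050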
<SeanTAllen> Here's a question we've been discussing internally nemosupremo...
<SeanTAllen> let's say you have a cluster of 3 wallaroo workers running and you ctrl-c one of them... should only that one shut down or should it do a clean shutdown of all 3? currently it will do all 3.
<nemosupremo> It should only shut down that worker, but depending on how wallaroo is architected, I don't know if that's the right answer. Another thing - I don't think I'm giving a good answer either. When starting out, one thing that wasn't immediately clear to me was "what is a wallaroo cluster". Right now, when I ctrl+c I can't tell if I'm shutting down a worker node, or if I'm shutting down a worker process. Is wallaroo multi-tenant?
<nemosupremo> Other "cluster services" have "masters" and "nodes" and its pretty clear what doing a ctrl+c, or a kill -9, or a terminate will do
<SeanTAllen> each worker runs a single application, perhaps in conjunction with other workers, so it is not multi-tenant in the way most folks would think of it
<nemosupremo> I may also be very confused, because wallaroo seems too easy to use
<SeanTAllen> worker node/process would be the same thing in wallaroo i think. what is the difference in your mind between a worker node and a worker process?
<SeanTAllen> the basic idea is there is no master, except when you have cluster join/leave events and then 1 node makes itself the "master" for work redistribution.
<SeanTAllen> but the rest of the time, there's no master, just a series of workers that are working together in a cluster.
<nemosupremo> a worker node = the actual process wallaroo is running. a worker process = a thread that is running my application.
<nemosupremo> that may be confusing, because I was confused about how wallaroo is architected
<nemosupremo> Sometimes I want to ctrl+c my process (I have a bug in my application), sometimes I want to ctrl+c my node (I don't need 3 servers)
<nemosupremo> wallaroo seems very magical in that I can just spin up a process and it will join my cluster and start distributing work to it!
<SeanTAllen> there's no distinction like that in wallaroo... there are workers
<SeanTAllen> workers are unix processes
<SeanTAllen> they work together
<SeanTAllen> you can't shutdown a given thread within a worker
<SeanTAllen> so when you say "worker process" i think of process as "unix process"
<nemosupremo> Right, I think I understand now, I came in with a very Storm/Spark oriented model where you have your "storm nodes" and developers write sinks/spouts
<SeanTAllen> for the "i dont need 3 servers", in wallaroo that would be, i have 3 workers, i only want two. there's a tool for that, it allows you to shrink the cluster size.
<SeanTAllen> im quite familiar with storm, happy to try and talk in terms of its terminology if it helps
<SeanTAllen> which will be in the next release
<SeanTAllen> which should be soon
<nemosupremo> Ok, now that I understand it a bit more I think ctrl+c should kill the entire thing
<nemosupremo> in that case I would try to own the fact that the developer doesn't care about how many resources wallaroo is running on
<nemosupremo> and whether it runs on 3 or 9 nodes is an operator detail
<nemosupremo> (and he has tools to shrink/grow applications)
<SeanTAllen> im going to disappear on you soon. wife is home and i will be finished cooking dinner soon.
<nemosupremo> no problem
<SeanTAllen> slfritchie: tends to be more of a night owl. he might be around.
<nemosupremo> just want to say that seems very magical. I used storm in pre-1.0, and adding cluster capacity was almost always something you had to think about
<SeanTAllen> a couple of us ran storm in production for several years and that experience has informed some of the Wallaroo experience
<SeanTAllen> I even wrote a book about it... https://www.manning.com/books/storm-applied
<SeanTAllen> unfortunately, it came out a couple months before Twitter announced heron. That pretty much killed the book sales.
<SeanTAllen> ok dinner time for me. ill check back in the morning to see if you have more questions. if you log off and log back on, check the irc logs for responses.
<SeanTAllen> the Running Wallaroo section of the book has more information https://docs.wallaroolabs.com/book/running-wallaroo/running-wallaroo.html. Hopefully it can provide some answers. And fire away with questions you have. It helps us figure out what we need to improve in the documentation.
<nemosupremo> With the Kafka Source - it doesn't look like there is any way to get the partition key of an incoming message
nemosupremo has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<SeanTAllen> That’s an interesting comment. What would you use the partition key for nemosupremo?
nemosupremo has joined #wallaroo
<nemosupremo> @SeanTAllen 1.) In this specific application I'm working on, since it uses the RandomKafka Partitioner, it actually stores data in the key that is not stored in the message. Implementation detail. 2.) I might be mistaken on how partitioning works in wallaroo, but to me it seems like in a lot of cases it would make sense to partition off of the kafka key (it's something I've done often in home grown frameworks)
<SeanTAllen> that's something we could possibly do, but how to go about it could be a little tricky. the "simplest" thing would be for every computation etc to take a message and a message-metadata object, which could be nothing.
<SeanTAllen> or the kafka decoder could combine message and metadata together (but that involves a decent amount of copying and could be slow).
<SeanTAllen> is there other metadata from kafka you can see yourself using?
<SeanTAllen> hmmm i suppose in a decoder you could have a message that takes both payload and headers. no copying there, but there's extra allocation for the wrapper object. still, that seems doable.
<SeanTAllen> and given not every source might have metadata, that probably makes more sense
<SeanTAllen> definitely something we will be discussing internally.
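(A hypothetical sketch of the "wrapper object" idea discussed above; the current Kafka source does not pass the key to the decoder, so the key argument and the KafkaRecord class below are assumptions, not existing Wallaroo API:)

    class KafkaRecord(object):
        # Wrapper combining the Kafka key with the decoded payload, so
        # downstream computations could partition on the key. The payload
        # is not copied; the only cost is one extra allocation for this
        # wrapper object.
        def __init__(self, key, payload):
            self.key = key
            self.payload = payload

    def decoder(key, payload_bytes):
        # Hypothetical decoder signature that also receives the Kafka key.
        return KafkaRecord(key, payload_bytes.decode("utf-8"))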
nemosupremo has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
nemosupremo has joined #wallaroo
nemosupremo has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
nisanharamati has joined #wallaroo
nemosupremo has joined #wallaroo
<nemosupremo> The way the 'alphabet_partitioned.py' example (https://github.com/WallarooLabs/wallaroo/blob/master/examples/python/alphabet_partitioned/alphabet_partitioned.py) is written, I'm assuming a TotalVotes object is guaranteed to always be on the same letter, but I'm not sure this is the case
<nemosupremo> more specifically, line 47 looks like a bug
<nemosupremo> if self.letter != votes.letter, then the votes get mixed
<nemosupremo> or, if wallaroo is supposed to ensure that never happens, line 47 looks like a foot gun
<aturley> nemosupero the total votes object is guaranteed to always be on the same letter
<aturley> when TotalVotes objects are created, though, there is no way to tell them which letter that will be.
<aturley> ideally self.letter would only be set the first time update was called.
<aturley> i seem to recall a few of us had a discussion around what to do in this example, and we decided that we wanted to avoid getting bogged down in the mechanism for only setting the letter on the first call to update(...)
<aturley> ^^ nemosupremo (sorry, i typed your username wrong up at the top)
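(A minimal sketch of what aturley describes, with class and attribute names approximating the example rather than quoting it: set the letter once, on the first update, and rely on Wallaroo to keep each TotalVotes object pinned to one letter:)

    class TotalVotes(object):
        def __init__(self):
            self.letter = None  # unknown until the first update
            self.votes = 0

        def update(self, votes):
            # Only set the letter on the first call; after that, Wallaroo's
            # partitioning guarantees every incoming `votes` has this letter.
            if self.letter is None:
                self.letter = votes.letter
            self.votes += votes.votes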
<nemosupremo> Can you use `to_state_partition` if you don't know how many partitions you will have beforehand?
<slfritchie> Currently, Wallaroo's state partition scheme needs to know how many in advance. It's a restriction we wish to loosen. I'm not aware of when that might happen, though.
<nemosupremo> @slfritchie how are people partitioning in the real world today?
<nemosupremo> just select an arbitrary number and bucket into those?
<nisanharamati> predefined keys/buckets
<nisanharamati> So the current approach for unbounded/unknown keys is to treat it like database sharding
<nisanharamati> where you partition on something known (like the first letter of the word in the word_count example)
<nisanharamati> and then use an unbounded hashmap to store the data inside each shard
<SeanTAllen> nemosupremo: we are planning on adding "dynamic partition keys" within the next few months. that would allow you to not know the number of partitions ahead of time and could then have a state object per word in an example like the word count one or say, a key per client in another system etc.
<SeanTAllen> That would be this issue nemosupremo: https://github.com/WallarooLabs/wallaroo/issues/751
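(A rough sketch of the predefined-shards approach described above, loosely modeled on the word_count example; the names and the exact to_state_partition wiring depend on your Wallaroo version, so treat this as illustrative:)

    import string

    # Bounded, predefined partition keys: one shard per first letter.
    letter_partitions = list(string.ascii_lowercase)

    def partition(word):
        # Route each word to the shard for its first letter.
        return word[0].lower()

    class WordTotals(object):
        # Inside a shard the key space (individual words) is unbounded,
        # so keep an ordinary dict, like a shard in a sharded database.
        def __init__(self):
            self.totals = {}

        def update(self, word):
            self.totals[word] = self.totals.get(word, 0) + 1
            return self.totals[word]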
nemosupremo has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
nemosupremo has joined #wallaroo
nisanharamati has quit []