SeanTAllen changed the topic of #wallaroo to: Welcome! Please check out our Code of Conduct -> https://github.com/WallarooLabs/wallaroo/blob/master/CODE_OF_CONDUCT.md | Public IRC Logs are available at -> https://irclog.whitequark.org/wallaroo
nisanharamati has joined #wallaroo
nemosupremo has joined #wallaroo
<nemosupremo> Does the giles receiver work in 0.4.0?
<nemosupremo> The encoder is being executed, but received.txt stays at 0 bytes
<nisanharamati> Yes, but it depends on the output data being formatted correctly
<nisanharamati> you can try replacing it with `nc` to see whether the output is (a) arriving and (b) formatted as expected
<nemosupremo> makes sense, I see the format in the docs
<slfritchie> Our release test process has moved away from using giles/receiver/receiver everywhere, but AFAIK it works. If your encoder isn't emitting a 4-byte length header followed by the binary payload of the message, then giles receiver will interpret the first 4 bytes as a length, calculate the wrong number of bytes to read, and then wait a long time for more data.
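For reference, a minimal sketch of a length-framed encoder along the lines slfritchie describes, assuming a 4-byte big-endian length header; the function name `encode` and the framing details here are illustrative, not taken verbatim from the Wallaroo codebase:

```python
import struct

def encode(data):
    # Frame the payload as a 4-byte big-endian length header followed by
    # the raw bytes. Without this header, giles receiver misreads the
    # first 4 bytes of the payload as a length and stalls waiting for
    # (usually far too much) additional data.
    payload = data if isinstance(data, bytes) else str(data).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload
```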
<nemosupremo> I'm not formatting it at all
<nemosupremo> I'm trying to understand how data "flows" in wallaroo in order to optimize my database lookups. I'm using a kafka sink and this is what I see: 1000 messages (or a batch) get "decoded", then my messages get "stuck" in a state computation step. What I would like is to do as many db lookups as I can in parallel
<nemosupremo> nevermind
<nemosupremo> as I was explaining it to myself I figured it out
<nisanharamati> nice. What was it? (And consequently, is there anything we could improve in the docs or examples to help new users avoid that issue?)
<nemosupremo> @nisanharamati I think it only applies to me, but using a combination of partitioning with compute_multi I can get "batches" that I can work with at once
<nemosupremo> One thing I noticed though is that compute_multi doesn't drop null elements
<nemosupremo> (compute will drop a message if I return None)
<nisanharamati> So that `[1, 2, 3, None, 5, 6]` would be received as the sequence `[1, 2, 3, 5, 6]` at the next step, instead of the original?
<nemosupremo> As I understand it if I return `[1,2,3,None,5,6]` from compute_multi
<nemosupremo> my next compute steps should receive `1`, `2`, `3`,`5`,`6`
<nemosupremo> but I get `3`,`None`,`5`
<nisanharamati> I think that's a bug
<nisanharamati> I'll open an issue to look into it.
<nisanharamati> Thanks for pointing it out!
<nemosupremo> ok, to be a bit clearer, this was the output of a computation_multi that went into a state_computation_multi, and in the partition function I was seeing the null
<nisanharamati> Thanks
<aturley> nemosupremo you *only* see `3`, `None`, `5` at the next computation?
<nemosupremo> what do you mean only?
<aturley> or are you saying that you were surprised to see the `None` along with the other values?
<aturley> do you see 1, 2, 3, None, 5, 6?
<aturley> or do you see 3, None, 5?
<nemosupremo> I was surprised to see the None
<nemosupremo> I was lazy in typing 1,2,...
<aturley> ok.
<aturley> that's expected behavior. perhaps the documentation is unclear in this case.
<aturley> let me rephrase that. the behavior that you've described is consistent with how i understood the code.
<aturley> that said, i can see how it could be unexpected.
<nemosupremo> "If the incoming data isn't an integer, we filter (drop) the message by returning None."
<nemosupremo> It's a bit surprising you can drop a message from a compute step by returning None
<nemosupremo> but that isn't true when using compute_multi
<nemosupremo> The developer could just filter out their own nulls
<nemosupremo> so I don't think this is a stop the world issue
<aturley> the intention was that the user could return `None` to indicate that, for a given incoming message, there would be no outgoing messages.
<aturley> nemosupremo i think your line of reasoning makes sense. i'm not sure all users would expect that behavior, but i'm beginning to think that most users wouldn't be surprised if they found out it worked the way you expected.
<nemosupremo> Yep - in my case it was converting a compute step to a compute_multi step
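To summarize the behavior discussed above: `compute` drops the message when it returns `None`, while `compute_multi` forwards every element of the returned list as-is, `None` included, so the user filters those out themselves. A sketch (the per-message results here are hypothetical stand-ins):

```python
def compute_multi(data):
    # Hypothetical multi-output computation: each input message may
    # produce zero or more outputs, with None marking "no output" for
    # a given sub-result. Since compute_multi forwards None elements
    # unchanged to the next step, drop them before returning.
    results = [1, 2, 3, None, 5, 6]  # stand-in for real per-message results
    return [r for r in results if r is not None]
```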
<nemosupremo> any common reason, off the top of anyone's head, why I would see just "Killed" as the output after running machida?
<nemosupremo> is to_state_partition -> to_state_partition legal in wallaroo?
<nisanharamati> should be legal
<nisanharamati> scratch that, _is_ legal.
<nisanharamati> I think I've only seen `Killed` from machida if I had `exit(0)` in my python code
<nemosupremo> I removed the second to_state_partition, and now I get a segmentation fault after the first to_state_partition completes on one partition
<nisanharamati> Is the second partition using the same function or same set of keys? (e.g. same list object)
* nisanharamati > now I get a segmentation fault after the first to_state_partition completes on one partition
<nemosupremo> err
<nisanharamati> sounds like a different error than I'm imagining
<nemosupremo> by partition, I mean I see the compute function exit once and print out data
<nemosupremo> *execute once
<nemosupremo> and then seg faults after I return
<nisanharamati> so it's: `decode -> stateful_partition.partition(data) -> stateful_partition.compute(data) -> segfault` ?
<nemosupremo> yep
<nemosupremo> I've had a couple segfaults already, but I could see where those were happening in "my" code, and it was usually just a python error or something whose stack trace didn't get printed
<nemosupremo> but this feels like the segfault is happening in the framework
<nemosupremo> my function is essentially `def join(d, s): s.lookup(d); return d`
<nemosupremo> I can print just before the return
<nemosupremo> if my stateful_partition.compute does nothing, I get a segfault, so the problem must be in the previous step somehow...
<nisanharamati> Could be either the framework barfing over `d`, or `s`'s garbage collection, `__del__` method, or `atexit` hook, if it gets collected
<nisanharamati> if you switch the partitioned step to non-partitioned, does it still segfault?
<nemosupremo> yes it does
<nemosupremo> Feels like I'm programming in C 😂
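For reference, the one-liner above roughly corresponds to a state computation of this shape. This is a sketch assuming the class-based machida API, in which `compute` receives the message and the state object and returns a tuple of (output, whether-state-was-modified); the class name `Lookup` and the `lookup` method on the state object are the user's own, hypothetical names:

```python
class Lookup(object):
    def name(self):
        return "lookup"

    def compute(self, data, state):
        # Hypothetical join step: enrich the message from state, then
        # pass it downstream. The second element of the tuple tells the
        # framework whether state was modified (False here, since this
        # step only reads from it).
        enriched = state.lookup(data)
        return (enriched, False)
```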
<nisanharamati> would you mind sharing a gist of your app and some sample input data, if it's not sensitive data?
<nisanharamati> I'm running up against the wall of my imagination of where it could be going wrong
<nemosupremo> I'm trying to figure out how to send it to you because the data comes from a live kafka source
<nisanharamati> the fields in a single decoded message would be fine. I can send it in from there
<nisanharamati> you could `print repr(bs)` in the decoder
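The `print repr(bs)` suggestion would sit inside the source decoder. A sketch of where it fits, assuming the class-based machida decoder hooks (`header_length`, `payload_length`, `decode`) with the same 4-byte big-endian framing discussed earlier; treat the structure as illustrative:

```python
import struct

class Decoder(object):
    def header_length(self):
        # 4-byte big-endian length header precedes each payload
        return 4

    def payload_length(self, bs):
        # bs is the raw header; unpack it to learn how many payload
        # bytes to read next
        return struct.unpack(">I", bs)[0]

    def decode(self, bs):
        # Print the raw payload bytes for debugging, as suggested
        # above, before handing the message to the pipeline
        print(repr(bs))
        return bs
```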
<nemosupremo> I just printed out some msgs and I can't send you the message content, although I think the actual messages may be irrelevant
<nemosupremo> okay
<nemosupremo> This causes a segfault after about ~800 messages
<nemosupremo> message content does not matter
<nemosupremo> to_state_partition references a DeviceEventsWrapper; you can use a Votes state object as I don't touch the state either
aturley has quit [Ping timeout: 265 seconds]
<nemosupremo> FUCK
<nemosupremo> nvm