#ponylang on 2017-03-02 — irc logs at freenode.irclog.whitequark.org

2016-11-09 00:11 jemc changed the topic of #ponylang to: Welcome! Please check out our Code of Conduct => https://github.com/ponylang/ponyc/blob/master/CODE_OF_CONDUCT.md | Public IRC logs are available => http://irclog.whitequark.org/ponylang | Please consider participating in our mailing lists => https://pony.groups.io/g/pony

03:25 amclain has quit [Quit: Leaving]

03:32 Shorttail_ has joined #ponylang

03:35 <Shorttail_> I'm running a prime number counter in a single actor on my 64-bit Windows 7 machine, ponyc version 0.10.0. It takes 50% of the 4-core, 8-thread i7. If I run multiple actors all doing calculations, it also runs at 50%, although it finishes faster. Is the program supposed to use 50% all the time? Is it not detecting hyperthreading?

03:36 <Shorttail_> The program is here, switch the comment to disable multiple actors

03:36 <Shorttail_> http://pastebin.com/zSEyNdZb

03:46 <SeanTAllen> Shorttail_: I don't use Windows so I can't answer that. cpu.c has the code where Pony figures out what cpus are available. on my i7, OSX, it detects 4 cpus and uses those

03:47 <SeanTAllen> i've never specifically checked to see what it is doing with hyperthreads

03:49 <SeanTAllen> ponyint_cpu_init() is probably what you are most interested in

03:49 <Shorttail_> It would seem like it completely ignores threads then, unless you manually disabled yours.

03:49 <SeanTAllen> it's using the GetLogicalProcessorInformation() system call

03:51 <SeanTAllen> i don't know the Windows data structures but...

03:51 <SeanTAllen> https://www.dropbox.com/s/ftmjwl1thih91nt/Screenshot%202017-03-01%2022.51.22.png?dl=0

03:51 <SeanTAllen> from what i've just read, I assume a relation there ties together a cpu and its hyperthreads, in which case, yes, it just uses the cpu

03:53 <Shorttail_> Yep, only physical cores. I don't see any comments in cpu.c about performance. Maybe adding hyperthreads doesn't improve performance, or maybe it wasn't tried
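
The counting described above can be sketched in a few lines. This is a minimal illustration, not the code in cpu.c: it assumes the records returned by GetLogicalProcessorInformation() have already been reduced to (relationship, processor mask) pairs, and the sample data for a 4-core, 8-thread machine is made up. The function name `count_cpus` is hypothetical.

```python
# Value of RelationProcessorCore in the Windows LOGICAL_PROCESSOR_RELATIONSHIP enum.
RELATION_PROCESSOR_CORE = 0


def count_cpus(records):
    """Count physical cores and logical processors.

    records: iterable of (relationship, processor_mask) pairs, a simplified
    stand-in for SYSTEM_LOGICAL_PROCESSOR_INFORMATION entries.
    """
    physical = 0
    logical = 0
    for relationship, mask in records:
        if relationship != RELATION_PROCESSOR_CORE:
            continue  # skip cache, NUMA node, and package entries
        physical += 1  # one record per physical core
        logical += bin(mask).count("1")  # each set bit is one hardware thread
    return physical, logical


# Made-up data: 4 cores, each exposing 2 hardware threads.
records = [(RELATION_PROCESSOR_CORE, 0b11 << (2 * core)) for core in range(4)]
records.append((3, 0xFF))  # a RelationProcessorPackage entry, ignored above

physical, logical = count_cpus(records)
print(physical, logical)  # 4 8
```

A runtime that starts one scheduler per core record would start 4 threads on this machine, which matches the 50% total CPU usage reported above.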

03:55 <SeanTAllen> so, at Sendence

03:55 <SeanTAllen> we are building a high performance streaming data processing system

03:55 <SeanTAllen> we do a lot of testing in amazon

03:56 <SeanTAllen> where they expose hyperthreads as part of a "VCPU"

03:56 <SeanTAllen> so if they list 8 VCPUS, that's 4 real cores and 4 hyperthreads

03:56 <SeanTAllen> performance in that env for us, is far worse when the hyperthreads are used. we avoid using them.

03:58 <Shorttail_> I think the main reason for hyperthreading and other SMT is that the extra performance it offers, even if not much, costs almost no extra power

03:58 <SeanTAllen> sylvan and i have been looking at strategies for making pony be able to handle more network throughput, perf on that front is already quite good, but we want to make it really good. for that, we might end up leveraging hyperthreads

03:58 <SeanTAllen> pony is really good at using all the cpu without hyperthreads

03:59 <Shorttail_> Do you run tasks that take 100% cpu? If you max out all cores it should be faster than without hyperthreading

03:59 <Shorttail_> I see

03:59 <SeanTAllen> yes

03:59 <SeanTAllen> it's easy to max out all the cores when pushing a pony program

04:00 <Shorttail_> Some of the threads are surely doing nothing though, with a single actor I still hit max

04:00 <SeanTAllen> so...

04:00 <SeanTAllen> by default, pony will start X number of schedulers where X is the number of cpus

04:00 <Shorttail_> By doing nothing I mean busy waiting

04:01 <SeanTAllen> you can pass --ponythreads to a pony program to change the number of threads

04:01 <SeanTAllen> in addition to those

04:01 <Shorttail_> Ahh, I'll try that

04:01 <SeanTAllen> there is 1 additional thread for asio events

04:01 <SeanTAllen> on Linux when we want best performance, what we do is...

04:01 dougmacdoug has joined #ponylang

04:01 <SeanTAllen> lets say we want 4 scheduler threads

04:02 <SeanTAllen> we will set aside 5 cpus only for the pony program

04:02 <SeanTAllen> the first 4 get used for scheduler threads

04:02 <SeanTAllen> the last for asio

04:02 <SeanTAllen> --ponypinasio pins the asio thread to the last available cpu

04:03 <SeanTAllen> that is going to get your best performance

04:03 jemc has quit [Ping timeout: 240 seconds]

04:03 <Shorttail_> And those threads busy wait even if the program could potentially be single threaded?

04:03 <SeanTAllen> again on Linux, we set aside 1 cpu for the OS and use the rest for pony
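
The layout described above can be sketched as a small planning helper. This is only an illustration under stated assumptions: the function `plan_cpu_layout` and its parameters are hypothetical, how the CPUs are actually reserved for the program is left out, and only the `--ponythreads` and `--ponypinasio` flags come from the conversation (the space-separated argument form is assumed).

```python
def plan_cpu_layout(available_cpus, scheduler_threads, reserve_for_os=1):
    """Split CPU ids into an OS group, scheduler CPUs, and one asio CPU."""
    cpus = sorted(available_cpus)
    needed = reserve_for_os + scheduler_threads + 1  # +1 for the asio thread
    if len(cpus) < needed:
        raise ValueError(f"need {needed} cpus, only {len(cpus)} available")
    os_cpus = cpus[:reserve_for_os]
    pony_cpus = cpus[reserve_for_os:needed]
    return {
        "os": os_cpus,
        "schedulers": pony_cpus[:-1],  # the first N cpus run scheduler threads
        "asio": pony_cpus[-1],         # the last cpu is left for the asio thread
        # Runtime flags named in the conversation; argument form is an assumption.
        "args": ["--ponythreads", str(scheduler_threads), "--ponypinasio"],
    }


layout = plan_cpu_layout(range(6), scheduler_threads=4)
print(layout["os"], layout["schedulers"], layout["asio"])  # [0] [1, 2, 3, 4] 5
print(" ".join(layout["args"]))  # --ponythreads 4 --ponypinasio
```

With 6 CPUs and 4 scheduler threads, one CPU goes to the OS, four to schedulers, and the last one to asio, which is the 5-CPU set for the Pony program mentioned above.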

04:03 <SeanTAllen> so

04:04 <SeanTAllen> that would be a work stealing question

04:04 <SeanTAllen> work stealing needs some work, right now it involves some hand tuning

04:04 <SeanTAllen> lets talk pony scheduling for a moment

04:04 <Shorttail_> My single actor does CPU work, nothing else happens, the behavior is pretty long

04:04 <SeanTAllen> by default when an actor sends a message to another actor

04:05 <SeanTAllen> if the receiving actor isn't already scheduled, it will be scheduled on the same scheduler as the sender

04:05 <SeanTAllen> so when your program starts

04:05 <SeanTAllen> everything would be on 1 scheduler

04:05 <SeanTAllen> the other schedulers, when they have no actors to schedule, will attempt to steal actors from other schedulers

04:05 <SeanTAllen> and in this way, work gets distributed across the available schedulers

04:06 <SeanTAllen> there is overhead to work stealing and if it happens too often, performance can suffer, in which case it's best to lower the number of ponythreads to get better performance

04:06 <SeanTAllen> if you were to profile such a program you would see most of its time spent in work stealing
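
The behaviour described above can be illustrated with a toy model. This is not the Pony runtime: the function `simulate`, the round-based stepping, and the rule that an idle scheduler only steals from a victim holding more than one actor are all simplifying assumptions made for the sketch. It only shows why steal attempts dominate when there is too little work for the number of schedulers.

```python
from collections import deque


def simulate(num_schedulers, num_actors, steps):
    """Toy work-stealing model; every actor always has more work to do."""
    queues = [deque() for _ in range(num_schedulers)]
    queues[0].extend(range(num_actors))  # everything starts on one scheduler
    stats = {"runs": 0, "steals": 0, "failed_steals": 0}
    for _ in range(steps):
        for me, queue in enumerate(queues):
            if queue:
                actor = queue.popleft()
                stats["runs"] += 1
                queue.append(actor)  # reschedule on the same scheduler
                continue
            # Idle scheduler: look at the busiest other scheduler.
            victim = max(
                (q for i, q in enumerate(queues) if i != me),
                key=len,
                default=None,
            )
            if victim is not None and len(victim) > 1:
                queue.append(victim.pop())  # take one actor from the victim
                stats["steals"] += 1
            else:
                stats["failed_steals"] += 1  # time spent finding nothing
    return stats


print(simulate(4, 1, 100))  # {'runs': 100, 'steals': 0, 'failed_steals': 300}
print(simulate(4, 8, 100))  # {'runs': 397, 'steals': 3, 'failed_steals': 0}
print(simulate(1, 1, 100))  # {'runs': 100, 'steals': 0, 'failed_steals': 0}
```

With one actor on four schedulers, three quarters of all scheduler turns are failed steal attempts. Dropping to one scheduler, the analogue of lowering `--ponythreads`, removes them without reducing the number of actor runs.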

04:07 <SeanTAllen> i've tried a number of strategies to back off work stealing when there isn't enough work for all threads, but thus far they have all had a large impact on "full bore" performance so i haven't opened any PRs

04:07 <SeanTAllen> that's a bit more detailed of an answer than you might have been looking for, hopefully not too much info

04:08 <Shorttail_> I get it. It makes sense to not senselessly tune the runtime for single thread performance when that is not what pony is made for

04:08 <SeanTAllen> well

04:08 <SeanTAllen> it's not just single thread

04:08 <SeanTAllen> for example at sendence

04:08 <SeanTAllen> if we have 8 cores set up

04:09 <SeanTAllen> but only run at 100k messages a second, we get a lot of work stealing overhead

04:09 <SeanTAllen> ideally we wouldn't

04:09 <SeanTAllen> but that is a difficult thing to balance

04:09 <SeanTAllen> at the moment pony does the simple thing and leaves it to you to tune using ponythreads to your workload

04:10 <SeanTAllen> the problem is you might have a variable workload, so it's something we are working on

04:10 <SeanTAllen> and by "we", that has really been me.

04:10 <SeanTAllen> i've tried about 20 strategies so far, none worked out

04:11 <SeanTAllen> if you have runtime questions, i'm probably one of the best people to ask. feel free to get my address off the mailing list if you ever have runtime questions and i'm not around here to answer

04:11 <Shorttail_> I tested with all 8 threads enabled, and it had a speedup of only 20% over 4 threads, so I guess it's not worth it to use hyperthreads by default, seeing as they affect the cache as well
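
As a rough check on that measurement, the arithmetic can be written out. This sketch assumes "a speedup of 20%" means 1.2x the throughput of the 4-thread run; the helper name `per_thread_efficiency` is hypothetical.

```python
def per_thread_efficiency(base_threads, new_threads, speedup):
    """Throughput per thread in the new run, relative to the base run."""
    return speedup * base_threads / new_threads


efficiency = per_thread_efficiency(base_threads=4, new_threads=8, speedup=1.20)
print(round(efficiency, 2))  # 0.6
```

Under that assumption each of the 8 threads delivers about 60% of what one of the 4 threads delivered on its own, so doubling the thread count buys only a fifth more total work.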

04:11 <SeanTAllen> ya

04:11 <Shorttail_> Thank you

04:11 <SeanTAllen> you're welcome

04:21 Shorttail_ has quit [Quit: Page closed]

04:41 dougmacdoug has quit []

04:54 jemc has joined #ponylang

06:00 rurban has joined #ponylang

06:00 rurban has quit [Client Quit]

06:04 jemc has quit [Ping timeout: 268 seconds]

06:20 jemc has joined #ponylang

07:25 rurban has joined #ponylang

07:41 graaff has quit [Quit: Leaving]

07:43 abeaumont has quit [Ping timeout: 264 seconds]

08:01 jemc has quit [Ping timeout: 240 seconds]

09:03 jkleiser has joined #ponylang

09:27 rurban has left #ponylang [#ponylang]

10:43 jkleiser has quit [Remote host closed the connection]

11:07 jkleiser has joined #ponylang

11:32 _andre has joined #ponylang

11:34 jkleiser has quit [Remote host closed the connection]

12:49 jkleiser has joined #ponylang

14:00 rurban1 has joined #ponylang

15:05 dougmacdoug has joined #ponylang

15:28 abeaumont has joined #ponylang

15:54 jemc has joined #ponylang

16:01 abeaumont has quit [Ping timeout: 240 seconds]

16:15 jkleiser has quit []

16:57 amclain has joined #ponylang

18:44 abeaumont has joined #ponylang

18:46 obadz has quit [Ping timeout: 252 seconds]

18:46 obadz has joined #ponylang

19:02 Matthias247 has joined #ponylang

19:35 abeaumont has quit [Ping timeout: 240 seconds]

19:41 rurban1 has quit [Quit: Leaving.]

21:12 dougmacdoug has quit [Remote host closed the connection]

21:18 _andre has quit [Quit: leaving]

21:28 prettyvanilla_ has joined #ponylang

21:29 prettyvanilla has quit [Ping timeout: 268 seconds]

22:10 kr1shnak has quit [Quit: bye bye]

23:00 <lisael> Hi, there... I have to sleep, so I'll just drop this here:

23:00 <lisael> https://github.com/lisael/pony-peg

23:01 <lisael> jemc: you may be interested :)

23:05 abeaumont has joined #ponylang

23:07 <lisael> the goal is to output pony-ast from pony code

23:11 <jemc> lisael: interesting, thanks for sharing

23:11 <jemc> lisael: you may be interested in looking at https://github.com/jemc/pony-pegasus

23:12 <jemc> it's an unfinished project, but it takes the approach of an in-language "DSL" rather than an actual DSL with a first-class syntax

23:13 <lisael> jemc: I know pegasus, of course ( otherwise my project would be named pegasus :D )

23:14 aedigix- has quit [Ping timeout: 264 seconds]

23:14 <lisael> BTW I don't think pony is made to make DSLs, and the approach is different here.

23:16 <lisael> I generate pony code ( I read somewhere, maybe in the tutorial, that it's not something desirable, though )

23:17 <lisael> I'd like to create tooling that allows someone to generate the parser and ship it in their project without even depending on pony-peg

23:18 <lisael> (for some reason, at the moment, the generated code has to `use "peg"`

23:18 <lisael> )

23:21 <jemc> I've definitely come to have a positive stance toward code generation in Pony

23:22 <jemc> especially for this sort of thing, like a codec or large state machine

23:24 aedigix has joined #ponylang

23:27 abeaumont has quit [Ping timeout: 256 seconds]

23:30 <lisael> to be clear, what i have in mind is writing the pony grammar (not too hard, I think, just have to port the ANTLR grammar)

23:30 <lisael> then generate pony-ast

23:31 <lisael> and experiment with macros or stuff like elixir sigils

23:32 <lisael> I have to sleep, really, now :)

23:32 <lisael> bye.

23:34 <jemc> lisael: FYI I am in the middle of massively revising pony-ast to be more static in nature

23:34 <jemc> using code generation, in fact

23:35 <jemc> not sure whether the current "dynamic" `AST` class will be kept around as an intermediate step that can be transformed into the static version, or not

23:35 <jemc> would definitely be interested in feedback about generating the static version of the codebase directly

23:36 kr1shnak has joined #ponylang

23:38 kr1shnak_ has joined #ponylang

23:40 <jemc> sorry, I mean generating the static data structure directly

23:41 kr1shnak has quit [Ping timeout: 260 seconds]

23:41 <jemc> the only reason this work is on pause is because I found that the ponyc compiler gets bogged down and becomes quite slow when compiling the pony-ast/static codebase

23:46 jemc has quit [Ping timeout: 240 seconds]

23:54 jemc has joined #ponylang

23:58 kr1shnak_ has quit [Quit: bye bye]