<wolfspraul> if a full nanonote openwrt build on the buildhost takes 30 hours now, how can we determine what the bottleneck is?
<wolfspraul> 1) cpu 2) hdd 3) memory
<wolfspraul> others?
<larsc> if you haven't already, you might want to consider using ccache
<wolfspraul> thanks!
<wolfspraul> I guess we can try that on the existing machine first
<mth> yes, most builds will be almost the same as the previous one, so ccache should work really well
<wolfspraul> can it easily be enabled in openwrt?
<mth> alternatively, try to not do full builds, but that might be a bigger developer time investment
<wolfspraul> also I guess we assume it's
<wolfspraul> bug-free :-)
<wolfspraul> well, one purpose of the "full" builds is to rule out problems from incremental builds
<wolfspraul> there's a reason so many devs first erase everything and build from scratch, must be from their experience :-)
<mth> yes, it's something that is necessary in practice but not in theory
<mth> I'm wondering if it is feasible to set up a build system where dependency checking is reliable enough to actually trust it
<wolfspraul> show me one dev who is doing some "incremental" build magic, and when running into anything "strange" wouldn't first nuke all the temp files and start over? :-)
<mth> but that's a very long term approach
<wolfspraul> ok, so ccache. good idea, how can it be enabled?
<wolfspraul> is it easily supported with openwrt? I'll look into it
<wolfspraul> in parallel the raw performance of the machine is also something that can be improved
<wolfspraul> but I'm trying to find out the bottleneck - cpu/mem/hdd
<mth> afaik it's done by setting CC and CXX to point to ccache rather than the actual compiler
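The hookup mth describes can be sketched like this (hedged: the exact mechanism depends on the build system, and the compiler names here are just examples; newer OpenWrt trees also carry a ccache option in menuconfig, so check yours before wiring variables by hand):

```shell
# Point the usual compiler variables at ccache-wrapped compilers.
# "gcc"/"g++" are placeholders; a cross build would name the
# target toolchain binaries instead.
export CC="ccache gcc"
export CXX="ccache g++"

# Verify what the build will now invoke:
echo "CC=$CC CXX=$CXX"
```

After that, `make` picks up `$CC`/`$CXX` as usual and every compile goes through the cache first.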
<wolfspraul> I'm wondering what it is doing all these 30 hours
<wolfspraul> the hdd is a raid-0 over 2 disks (no ssd)
<mth> "time" should tell you whether CPU is the bottleneck: compare real time with user time
<wolfspraul> we could increase memory and try with a memory based /tmp or so
<mth> mem could be checked by monitoring how much is swapped out and how much cache is available
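The swap/cache check mth mentions can be done directly from the kernel's counters (Linux-specific; field names are from /proc/meminfo):

```shell
# Little swap in use plus a large page cache means the *amount* of RAM
# is not the bottleneck; heavy swap traffic would mean the opposite.
grep -E '^(MemTotal|MemFree|Cached|SwapTotal|SwapFree)' /proc/meminfo
```

Watching the `si`/`so` columns of `vmstat 1` over a few minutes gives the same answer as a rate rather than a snapshot.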
<wolfspraul> the cpu is a single-core 64-bit, that could be increased as well
<mth> a quad core would build about 3-3.5 times as fast as a single core in my experience
<mth> maybe a bit less if you have lots of small packages
<wolfspraul> that's assuming that amount of memory or hdd/sdd speed are not the bottleneck
<wolfspraul> so you say in your experience it will be the CPU?
<mth> I can build the OpenDingux rootfs in a quad-core VM on an i7 in 25 minutes
<wolfspraul> ok I think we build a few thousand packages here, in 30 hours
<wolfspraul> and I'm trying to understand which hardware improvement would help the most
<mth> that's far fewer packages than OpenWRT, I guess, but still quite a lot
<wolfspraul> cpu, memory, hdd/ssd
<wolfspraul> I don't think the build process will max out mem
<wolfspraul> and we don't have a ramdisk (maybe we should?)
<wolfspraul> so yeah, probably the cpu. an SSD would probably also help a lot.
<mth> would ramdisk be faster than sufficient memory for caching?
<mth> at least with caching you don't have to manually manage it
<wolfspraul> well I don't know who is using the resources and to which degree
<wolfspraul> actually I think it is running builds most of the time
<wolfspraul> checking...
<mth> you'd need some background process gathering vital stats of the system, say once a minute, and logging them
<mth> perhaps existing network monitoring tools already do that?
<mth> the kind they use to keep track of server farms
<mth> nagios etc
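The once-a-minute logger mth describes can be sketched in a few lines of shell (a minimal sketch, not a replacement for munin/nagios; the log filename and sample counts are made up here):

```shell
# log_stats N INTERVAL: take N timestamped load-average samples,
# INTERVAL seconds apart, on stdout. Redirect to a file to keep history.
log_stats() {
    n=$1
    interval=$2
    while [ "$n" -gt 0 ]; do
        printf '%s %s\n' "$(date -u +%FT%TZ)" "$(cat /proc/loadavg)"
        n=$((n - 1))
        [ "$n" -gt 0 ] && sleep "$interval"
    done
}

# e.g. three samples, one second apart:
log_stats 3 1 > build-stats.log
```

Run with a 60-second interval from cron or a backgrounded shell to get the week-long picture kristianpaul talks about below.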
<mth> one big problem that's hard to get rid of is autoconf
<mth> that won't utilize multiple cores
<mth> and it takes a significant amount of time for the build of small packages
<mth> it's really overdue for replacement, imo
<wolfspraul> it's so badly designed that it will never be possible to be replaced
<wolfspraul> survival strategy
<mth> you could speed it up by caching probe results, but I don't know how reliable that is if you mix different versions and possibly customized rules
<wolfspraul> no no
<wolfspraul> I am looking for some easy way to speed up
<wolfspraul> not to be stuck with arcane problems for a few years
<wolfspraul> ccache sounds interesting if a) it's easy to enable b) it's bug-free
<wolfspraul> :-)
<mth> nothing non-trivial is bug-free, but I think ccache's approach is low-risk
<mth> since it uses the preprocessed input to do the lookup in the cache
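The cache-key idea mth describes can be sketched directly: hash the preprocessed translation unit and use the digest as the lookup key (a conceptual sketch only; real ccache adds compiler identity, flags, and more to the key):

```shell
# Preprocess the source (gcc -E), then hash the result. Edits that
# disappear during preprocessing (e.g. comment changes) produce the
# same key and hence a cache hit; anything that survives preprocessing
# changes the key and forces a real compile.
cat > t.c <<'EOF'
int main(void) { return 0; }
EOF
key=$(gcc -E t.c | sha256sum | cut -d ' ' -f 1)
echo "$key"
```

This is also why the approach is low-risk: the key is derived from exactly the bytes the compiler would see.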
<wolfspraul> sure I was joking
<wolfspraul> a build is indeed running today
<wolfspraul> and I think the machine is doing this for weeks
<wolfspraul> is the kernel or anybody collecting any load statistics that I can easily look at now?
<mth> you might have to flush the cache if you update the compiler, I'm not sure about that
<mth> "top" would be a start
<mth> it should at least give you an impression of CPU and memory use
<kristianpaul> iotop may help a bit too
<wolfspraul> ok I looked at vmstat 1 for a while. indeed it looks like mostly cpu bound, and/or memory speed
<wolfspraul> not amount of memory (1.5gb of 2 used, but lots of buffers, swap very lightly used if at all)
<wolfspraul> also not disk speed I think
<wolfspraul> all seems to be cpu and/or memory speed
<mth> disk speed might become a factor once you switch to multiple cores
<mth> so don't spend all your money at once
<wolfspraul> sure, something always bubbles up
<wolfspraul> you make one piece faster, then one or multiple of the others become relatively bigger :-)
<wolfspraul> ok so: 1) try ccache 2) upgrade cpu, maybe a little more memory just in case
<kristianpaul> or if you still like visual/fun debugging try watch --color -d 'ps -x -kpcpu -o pid,pcpu,args'
<mth> not just relatively, if you start using multiple cores the access pattern will change as well
<mth> it will be less localized
<kristianpaul> vmstat won't tell you about i/o problems, as i remember
<mth> you can detect I/O problems indirectly: if there is enough memory and the CPUs are not fully utilized, the I/O must be the bottleneck
<mth> well, or you're not actually running in parallel (small packages, scripts like configure)
<mth> buildroot will only use multiple jobs within one package, not build two packages at once
<mth> I don't know if OpenWRT still has that limitation as well or whether it was removed there
<kristianpaul> anyway, if it took you 30hrs, it's worth installing munin and munin-node, i bet
<kristianpaul> at least you can get interesting resources utilization stats over a week
<kristianpaul> not just the last second :)
<wolfspraul> not sure
<wolfspraul> all I've seen munin create so far is a lot of data that adds a lot of confusion
<kristianpaul> just check what you need ;)
<wolfspraul> whereas I can just login to the running machine and look at the load for a little while with simple commands, and get a good understanding where the bottleneck is
<wolfspraul> well, just saying from past experience. that could have well been me.
<kristianpaul> not over time, though
<wolfspraul> but I just see dozens of pretty charts but little conclusive value
<kristianpaul> indeed, it always depends on what you're looking for
<wolfspraul> the pattern is quite stable, if you don't see something over 5 minutes I'd say it's not very relevant to the machine's performance anyway
<wolfspraul> if you have some backup running once every 24h, that's a special thing and what is happening in those x minutes is not representative either
<kristianpaul> for example the process that eats the most cpu/mem over a longer period of time, but i haven't got to that yet, though :/
<kristianpaul> (5 minutes) yeah ;)
<wolfspraul> cpu seems super busy, ca. 80% us, ca. 20% sy
<wolfspraul> cpu upgrade it is
<wolfspraul> and faster memory
<kristianpaul> whats load average?
<wolfspraul> no need to waste money on an ssd now, I think the raid-0 over two normal hdds is not bad
<wolfspraul> load average: 1.03, 1.01, 1.15
<kristianpaul> didn't look that bad
<kristianpaul> is still compiling right?
<wolfspraul> yes compiling all the time I think :-)
<mth> load is more-or-less the number of processes waiting for CPU time, correct?
<mth> then a load of ~1 is what I'd expect on a single core -j1 compile
<mth> well, if I/O were a big problem the load would be below 1, so it does point towards the CPU as the bottleneck
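The load figures being discussed come straight from the kernel; reading them directly makes mth's interpretation easy to check (Linux-specific file):

```shell
# /proc/loadavg: 1-, 5- and 15-minute averages of runnable (and
# uninterruptible-sleep) tasks, then running/total tasks and last PID.
cat /proc/loadavg
# A steady ~1.0 on a single core running a -j1 build means the CPU is
# saturated; a value well below 1.0 would suggest the build is mostly
# waiting on something else, such as disk I/O.
```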
<wpwrak> ccache is reasonably safe. i once managed to create a pathological case where the difference was deep in one of the more unusual ELF sections (i don't remember the details, but i think it was with umlsim), where the ccache folks just accepted defeat, but if you don't drive it to extremes, it'll serve you well. even compiler upgrades should be okay.
<wpwrak> ah, he already left
<kyak> viric: ping
<viric> kyak: pong from thousands of km
<kyak> viric: nevermind, i just built offrss and i'm giving it a try :)
<kyak> had some problems with my eyes. Somehow i thought that libmrss-0.9 > libmrss-0.19.2
<kyak> that's a mindtrick :)
<viric> haha
<viric> sometimes even configure scripts make that error
<viric> kyak: oth, I feel honored :)
<kyak> viric: btw, i had to add -I/usr/include/curl in the Makefile and #include <curl.h> in the offrss.c
<kyak> :)
<viric> ah
<viric> interesting
<viric> I never built offrss on non-nix
<kyak> damn the X over network is slow.. even in my home network
<kyak> i have to keep it locally or use a console browser
<kyak> for newsbeuter, i sometimes use the "External actions" feature or whatever it is called. It is when the article is passed to some external program; i use it to download things from torrents
<kyak> viric: hm, it's interesting - when i start it like "WEBBROWSER=links ./offrss -w", it won't work. The links shows up, but can't connect to server
<kyak> when i start as ./offrss -w and then just links http://localhost:8090, it works fine
<kyak> oh, a segfault in podofo...
<viric> in podofo?
<viric> kyak: what version of podofo? Have you linked podofo?
<viric> (any gdb bt?)
<kyak> viric: podofo 0.7.0, the one supplied with my distro, no gdb bt yet
<qi-bot> [commit] Werner Almesberger: m1/perf/eval.pl: warn if an instruction reads and writes from the same register (master) http://qi-hw.com/p/wernermisc/5bf9ae0
<qi-bot> [commit] Werner Almesberger: m1/perf/sched.c: use calloc instead of malloc plus memset (master) http://qi-hw.com/p/wernermisc/0a7e5b1
<qi-bot> [commit] Werner Almesberger: m1/perf/sched.c: return -1 if malloc fails (master) http://qi-hw.com/p/wernermisc/24a9b85
<qi-bot> [commit] Werner Almesberger: m1/perf/sched.c: code cleanup (no functional changes) (master) http://qi-hw.com/p/wernermisc/35e9903
<wpwrak> i like it when qi-bot calls me "master". i always imagine "i dream of jeannie" ;-)
<viric> :)
<larsc> i don't want to destroy your dreams, but i think it's referring to the branch name ;)
<wpwrak> i'll choose to disregard this opinion of yours :)