ChanServ changed the topic of #linux-sunxi to: Allwinner/sunxi development discussion - Don't ask to ask. Just ask! - See http://linux-sunxi.org | https://github.com/linux-sunxi/ | Logs at http://irclog.whitequark.org/linux-sunxi
dl9pf has quit [Ping timeout: 245 seconds]
dl9pf has joined #linux-sunxi
dl9pf has quit [Changing host]
dl9pf has joined #linux-sunxi
dl9pf has quit [Read error: Operation timed out]
mdfe has quit [Remote host closed the connection]
dl9pf has joined #linux-sunxi
dl9pf has joined #linux-sunxi
dl9pf has quit [Changing host]
bsdfox has joined #linux-sunxi
bsdfox has joined #linux-sunxi
egbert has quit [Disconnected by services]
egbert has joined #linux-sunxi
bsdfox has quit [Ping timeout: 255 seconds]
bsdfox has joined #linux-sunxi
bsdfox has quit [Changing host]
bsdfox has joined #linux-sunxi
vinifm has quit [Remote host closed the connection]
christopher has joined #linux-sunxi
christopher has quit [Quit: Leaving]
wingrime has joined #linux-sunxi
eebrah_ has joined #linux-sunxi
eebrah_ has quit [Ping timeout: 272 seconds]
n01 has quit [Read error: Connection reset by peer]
n01 has joined #linux-sunxi
ZaEarl has quit [Ping timeout: 245 seconds]
rz2k has joined #linux-sunxi
ganbold_ has quit [Ping timeout: 255 seconds]
rellla has joined #linux-sunxi
paulk-desktop has joined #linux-sunxi
ganbold_ has joined #linux-sunxi
shineworld has joined #linux-sunxi
shineworld has quit [Remote host closed the connection]
rellla has quit [Quit: Nettalk6 - www.ntalk.de]
n01 has quit [Read error: Operation timed out]
bsdfox has quit [Ping timeout: 252 seconds]
<oliv3r> ssvb: i'll try lcd + hdmi dual-head then; but at least that means the hardware is capable
<oliv3r> ssvb: what I wonder, with these tablets showing off 'hdmi port included!!', does it work on any of the devices?
<ssvb> oliv3r: well, it only worked for me if vga was assigned to /dev/fb0 and hdmi to /dev/fb1, but not the other way around
<ssvb> the code is quite fragile and buggy
<ssvb> also I don't know what kind of hdmi port is included in supposedly a13 tablets (if you are using one), maybe it needs its own support code in the kernel driver
<ssvb> but at least dual-head seems to work in some configurations on a10 if configured via fex file
<wingrime> I found 2 things I can fix with dma
<oliv3r> ssvb: and extremely messy :p
<oliv3r> ssvb: A10
<oliv3r> ssvb: do you think the video driver can be 'fixed' or will it really need a complete rewrite?
<oliv3r> also, while going over the disp_mode section in the fex guide, initially I just copied the original part of the wiki and had no clue what it meant. Now I think at least I know mode 0 and mode 1: both use /dev/fb0 and you enable either output 0 or 1. Mode 4 sounds like 'clone' mode
<oliv3r> mode 2, 'dual head'
<oliv3r> mode 3 is 'mysterious' as it's probably not documented properly there
<ssvb> hmm, I have not checked mode 3, but might be one big framebuffer which spans over both monitors
<ssvb> I guess it would be reasonable to first try it with both monitors having the same resolution
vicenteH has quit [Ping timeout: 248 seconds]
<wingrime> oliv3r: funny
<wingrime> oliv3r: sdhost uses its own strange dma
<wingrime> ssvb: I need your help
<wingrime> ssvb: I need someone who can measure nand speed with/without my patch for dma
<wingrime> ssvb: same for ethernet
<wingrime> oliv3r: you wanted
vicenteH has joined #linux-sunxi
<wingrime> hramrach:
mdfe has joined #linux-sunxi
hipboi has joined #linux-sunxi
<oliv3r> ssvb: i was just running outside, and was thinking the same thing. I'll document that in the wiki
<oliv3r> ssvb: i don't have ethernet on my tablet :)
<wingrime> the patch affects ethernet, usb, sound, nand
<oliv3r> well i can run simple tests on my tablet; i'm booting stage/3.4
<oliv3r> i'll pull in that patch then and see what it does
<oliv3r> this against stage/3.4?
<oliv3r> gimme a few, need prob an hour
<oliv3r> testing hdmi stuff for hansg first
<oliv3r> ssvb: i assume hdmi support is non-hotplug; it's based on whats in script.bin and edid?
<wingrime> oliv3r: I want some measurements before and after
<wingrime> oliv3r: the patch is for 3.4/stage
<oliv3r> how do you want me to measure?
<wingrime> something like time and dd
<oliv3r> on nand? i can test that
<oliv3r> read or write?
<wingrime> both
<oliv3r> ok
vinifm has joined #linux-sunxi
<ssvb> wingrime: is cache flushing done in hard irq context after your patch?
<wingrime> yes
<wingrime> I tested: no dmesg messages, everything works fine
<wingrime> ssvb: the irq handler needs rework
<wingrime> ssvb: see my repo
<wingrime> ssvb: I will send one more patch
<wingrime> ssvb: the cpu actually doesn't care about irq context for a dcache flush
<techn_> you could enable dma debug stuff.. I tried that once and it gave some warnings/errors
<wingrime> techn_: I want to move the dma irq handler to a worker thread
<wingrime> It will affect sound, nand, ethernet and usb speed
<ssvb> wingrime: cache flushing takes time, and you don't normally want the irq handler doing heavy work
<wingrime> ssvb: I actually will try to make the irq lighter for dma
<ssvb> maybe it would be better to flush cache in the beginning of 'sw_dma_enqueue'?
<wingrime> ssvb: actually sw_dma_enqueue does not always send to dma
<wingrime> ssvb: it may save the buffer and do it later
<wingrime> ssvb: the code needs a rewrite to use linux queues
<wingrime> ssvb: also the cpu is definitely stopped during a cache flush
<wingrime> ssvb: so it makes no difference when we do it
<wingrime> ssvb: the next patch is more interesting for optimisation
<ssvb> you want to have dma transfers always running for best performance
<ssvb> if there is a long delay (doing cache flush) between the completion of previous dma transfer and the start of a new dma transfer, then this is not good
<wingrime> for a new dma transfer you must call sw_dma_enqueue
<wingrime> ssvb: you need to flush the cache
<wingrime> for dest-addr
<techn_> why is that cache flushed?
<ssvb> but you don't need to delay the cache flush until the very last moment
<wingrime> techn_: the CPU keeps some frequently used data in SRAM
<wingrime> techn_: dcache
<wingrime> techn_: when you use DMA the cpu doesn't know that the data changed
<techn_> but why does dma require a cache flush?
<wingrime> techn_: the CPU doesn't know that the data behind the cache changed
<techn_> oh.. so when you use dma you should disable cache
<wingrime> techn_: actually you need to drop the cached data
<wingrime> techn_: if that data is not in the cache, it costs no time
<wingrime> techn_: looks like arm can check whether this data is in the cache and flush it with an instruction
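The stale-data problem wingrime and techn_ are discussing can be sketched with a toy model. This is only an illustration in Python, not kernel code: the class and method names (ToySystem, dma_write, invalidate_range) are made up for this example, and the "cache" here is a plain dictionary with no write-back behaviour.

```python
# Toy model (not kernel code): a CPU-side "dcache" in front of "RAM".
# All names here are hypothetical and exist only for this illustration.
class ToySystem:
    def __init__(self, size):
        self.ram = [0] * size   # main memory, the only thing the DMA engine sees
        self.dcache = {}        # address -> cached value (CPU-side copy)

    def cpu_read(self, addr):
        # A cache hit returns the cached copy without looking at RAM.
        if addr not in self.dcache:
            self.dcache[addr] = self.ram[addr]
        return self.dcache[addr]

    def dma_write(self, addr, values):
        # The device writes straight to RAM; the cache is not updated.
        for i, v in enumerate(values):
            self.ram[addr + i] = v

    def invalidate_range(self, addr, length):
        # Drop cached copies so the next CPU read goes back to RAM.
        for a in range(addr, addr + length):
            self.dcache.pop(a, None)


def demo():
    s = ToySystem(16)
    s.cpu_read(4)             # CPU caches the old value 0
    s.dma_write(4, [42])      # e.g. a NAND -> RAM transfer
    stale = s.cpu_read(4)     # still served from the cached copy
    s.invalidate_range(4, 1)
    fresh = s.cpu_read(4)     # reread from RAM after the invalidate
    return stale, fresh


print(demo())
```

In this model, invalidating an address that was never cached does nothing, which matches the remark that dropping data that is not in the cache costs no time.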
<ssvb> wingrime: so is it a transfer from DMA to CPU (for example NAND read)?
<wingrime> ssvb: dma can do nand->ram
<wingrime> ssvb: supported on a13: IR, uart, audio, sram, sdram, spi, usb
<vinifm> I've been having problems with DMA, when using sockets
<ssvb> wingrime: well, for this direction of transfer, invalidating the cache before dma transfer is complete seems wrong
<wingrime> ssvb: maybe, but I don't know how to do it after
<wingrime> ssvb: but maybe not
<wingrime> ssvb: because we need the CPU to reread dram on first access
<wingrime> ssvb: 14606475 branch-misses # 7.96% of all branches
<wingrime> ssvb: yesterday it was more
<wingrime> or i don't know
<wingrime> whether this is a side effect or something else
<ssvb> branch prediction misses should be totally unrelated to data cache
<wingrime> ssvb: it related
<ssvb> how so?
<wingrime> ssvb: I think that D cache clean will clean I cache
<wingrime> ssvb: at least you must reset the pipeline when you drop data
<ssvb> somehow this does not make much sense to me
<ssvb> but in any case, I can confirm the high rate for branch prediction misses
<wingrime> ssvb: that's why I asked for help with this stuff
<wingrime> ssvb: I-cache and D-cache are linked to each other
<ssvb> ok, let's see what can be done
<wingrime> ssvb: I'm asking you to test performance with/without my patches
ganbold_ has quit [Remote host closed the connection]
<ssvb> but I-cache and D-cache are not linked in ARM, for example for JIT you need to explicitly clean D-cache and then invalidate I-cache before executing the modified chunk of code
<ssvb> for example check 'v7_coherent_user_range' function from https://github.com/linux-sunxi/linux-sunxi/blob/sunxi-3.4/arch/arm/mm/cache-v7.S
<ssvb> and the Branch Target Buffer (BTB) is also a somewhat separate entity
hipboi has quit [Ping timeout: 245 seconds]
<wingrime> ssvb: simply test speed with/without, please
ganbold_ has joined #linux-sunxi
<wingrime> ssvb: looks like the cache flush in irq is never called
<wingrime> ssvb: some "app callback"
<wingrime> ssvb: find sw_dma_set_opfn
<wingrime> this is the callback code
ZaEarl has joined #linux-sunxi
fra79Wii_ has joined #linux-sunxi
<wingrime> WTF
<wingrime> the dma driver can call a callback function into some other driver in irq context
<wingrime> I will remove this strange "halfdone"
<wingrime> it's even stranger than I thought earlier
<wingrime> the transfer-done callback runs in interrupt context
<wingrime> wemac will transmit in irq context too
<wingrime> ssvb: look at sw_dma_set_buffdone_fn
<wingrime> ssvb: look at sw_dma_set_halfdone_fn
<wingrime> ssvb: look at sw_dma_set_opfn
<wingrime> omg: all of it can run in irq context sometimes
fra79Wii has joined #linux-sunxi
ganbold__ has joined #linux-sunxi
fra79Wii has quit [Quit: AndroIRC - Android IRC Client ( http://www.androirc.com )]
fra79Wii has joined #linux-sunxi
ganbold_ has quit [Ping timeout: 260 seconds]
bsdfox has joined #linux-sunxi
bsdfox has quit [Changing host]
bsdfox has joined #linux-sunxi
fra79Wii has quit [Client Quit]
fra79Wii has joined #linux-sunxi
fra79Wii_ has quit [Ping timeout: 245 seconds]
fra79Wii has quit [Quit: AndroIRC - Android IRC Client ( http://www.androirc.com )]
shineworld has joined #linux-sunxi
fra79Wii has joined #linux-sunxi
<wingrime> ssvb:
<wingrime> ping
<shineworld> good evening wingrime
<wingrime> )
<fra79Wii> hi so what's the situation with CedarX... Is someone keeping up the reverse engineering? I've tried to make the current version work on android 4.2 but there is no way..
<fra79Wii> Or should we wait until allwinner releases a new SDK for 4.2?
fra79Wii has quit [Remote host closed the connection]
fra79Wii has joined #linux-sunxi
<oliv3r> wingrime: what branch is your dma test on? i added your github as remote, but can't find it :)
<oliv3r> i guess i can cherrypick 384a649b09f928fd2065068cb40b73b52f724210 on stage/3.4
<wingrime> wingrime-wip
<wingrime> oliv3r
<wingrime> wait
<wingrime> test this
<wingrime> please test performance "general"
<wingrime> please test nand speed
<wingrime> and someone test audio and ethernet
<wingrime> oliv3r: I made a new interesting patch
<wingrime> oliv3r: are you using a13 ?
n01 has joined #linux-sunxi
<wingrime> oliv3r: I made a patch that moves IRQ handling to a workqueue
<wingrime> oliv3r: It should generally change performance
uro_ has quit [Quit: leaving]
<ssvb> wingrime: tried to compare branch prediction rate for atom, a10 and exynos5 - https://gist.github.com/ssvb/5281249
<ssvb> wingrime: does not look too bad, considering that atom and exynos5 executed roughly twice as many branches in total (apparently trivially predictable)
eebrah_ has joined #linux-sunxi
<ssvb> wingrime: the absolute number of mispredicted branches is quite comparable
<wingrime> ssvb: I've done some interesting stuff
<wingrime> ssvb: I moved the irq handler to a workqueue
<wingrime> and will push it soon
<wingrime> it will generally have a performance impact (positive or negative)
<ssvb> :)
<ssvb> but in any case, this whole dma irq handler looks very suspicious
<ssvb> if anyone is up to fixing it, the fixed implementation probably should be clean and correct
<ssvb> I mean reshuffling code and only fixing parts of it may have unpredictable effects (triggering some latent bugs)
<wingrime> wait, I'll send it soon
<wingrime> see my github
<wingrime> ssvb: try using my 3.4 head
eebrah_ has quit [Ping timeout: 252 seconds]
<wingrime> I have some cleanup patches
<wingrime> branch wingrime-wip
<wingrime> ssvb: and test the performance difference with the irq-to-workqueue patch
<ssvb> wingrime: I surely can, but I would prefer if you could initially benchmark your code yourself ;)
ganbold__ has quit [Ping timeout: 256 seconds]
<ssvb> if you expect performance improvements, then I can try to confirm them
<wingrime> ssvb: bad/good performance is secondary, not important; long irq handlers must be moved to workqueues
<wingrime> ssvb: this fixes a strange bug, that dma_callback functions are called in irq context
<wingrime> ssvb: for example wemac will send a message in this context (in the callback)
<wingrime> ssvb: that's totally unacceptable for responsiveness reasons
<wingrime> ssvb: I have no ethernet (a13) so I can only predict
<wingrime> ssvb: I want someone else to test this
<wingrime> ssvb: because dma is used for audio, ethernet, usb, nand
Dave77 has joined #linux-sunxi
<paulk-desktop> HI
<paulk-desktop> so it seems that my patches were sent after all
torqu3e has quit [Quit: torqu3e]
eebrah has quit [Ping timeout: 255 seconds]
Guest60022 has joined #linux-sunxi
Dave77 has quit [Ping timeout: 256 seconds]
fra79Wii has quit [Remote host closed the connection]
fra79Wii has joined #linux-sunxi
Guest60022 has quit [Quit: Leaving]
Guest60022 has joined #linux-sunxi
simosx has joined #linux-sunxi
simosx has joined #linux-sunxi
<vinifm> hi, what is the difference between linux drivers and u-boot drivers?
<oliv3r> wingrime: ah, the wip branch; ok ,well i cherry-picked it for now; booting 3.4 now to see its performance
Guest60022 has quit [Quit: Leaving]
eebrah has joined #linux-sunxi
eebrah is now known as Guest79196
Guest79196 has quit [Client Quit]
ZaEarl has quit [Ping timeout: 245 seconds]
gzamboni has quit [Ping timeout: 240 seconds]
gzamboni has joined #linux-sunxi
eebrah_ has joined #linux-sunxi
ZaEarl has joined #linux-sunxi
Dave77 has joined #linux-sunxi
bsdfox has quit [Ping timeout: 256 seconds]
rz2k has quit []
ZaEarl has quit [Ping timeout: 245 seconds]
<wingrime> ssvb: oliv3r: I tested and rebuilt 4 times and can say that there is a small gain
<wingrime> ssvb: oliv3r: I'm talking about the first patch, for "flush"
<wingrime> ssvb: oliv3r: without: Throughput 0.711998 MB/sec, with: 0.806631 MB/sec
<wingrime> ssvb: oliv3r: without: 9.48% of all branches, with: 7.95% of all branches
<wingrime> so I can say that it "mystically" changes the branch-miss count
<wingrime> but the results are at least stable
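To put wingrime's two before/after figures side by side, here is a small arithmetic sketch. The numbers are copied from the messages above; the helper name relative_change is just for this example, and a single pair of runs says nothing about run-to-run noise.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# dbench throughput in MB/sec, without and with the "flush" patch
throughput_gain = relative_change(0.711998, 0.806631)

# branch-miss rate in percent of all branches, with minus without
miss_rate_delta = 7.95 - 9.48

print(f"throughput: {throughput_gain:+.1f}%")
print(f"branch-miss rate: {miss_rate_delta:+.2f} percentage points")
```

So the reported numbers correspond to roughly a 13% throughput gain and a drop of about one and a half percentage points in the branch-miss rate.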
<wingrime> last patch (move to workqueue) is unstable
<wingrime> I get a hang with it, it looks like hidden bugs or similar
<oliv3r> i'm running time dd if=/dev/zero of=test.zero bs=64k count=8192
<oliv3r> on the sd card to start with, then on nand
<oliv3r> then i'll do it with your patches; i'll pastebin the results
<wingrime> oliv3r: I use "dbench 1 -t 20 -s -S -F --directory=/media/0000-006F/"
<wingrime> for disk
<wingrime> and sudo perf_3.2.0-39 stat -B dd if=/dev/zero of=/dev/null count=1000000
<wingrime> for branches
<wingrime> but it needs performance counting enabled in the kernel config
<oliv3r> i haven't gotten all that installed :p
<wingrime> apt-get ))
<oliv3r> but i did remember to put performance gov. on
<wingrime> oliv3r: this is not the performance gov
<wingrime> oliv3r: "performance counting tools"
<wingrime> oliv3r: something similar
<wingrime> oliv3r: under General in menuconfig
<oliv3r> i know :)
<oliv3r> but my kernel boots ondemand by default
<ssvb> wingrime: performance counters should be already enabled by default
<wingrime> ssvb: a13 have other config
<ssvb> which one?
<wingrime> ssvb: a13_defconfig
<wingrime> ssvb: it's really strange to see a branch-prediction impact here
<ssvb> hmm, I think a13_defconfig should also enable the performance counters
<ssvb> if not, then IMHO it would make sense to update the configs
Dave77 has quit []
<ssvb> regarding branch prediction impact, I think the biggest problem there might be that the BTB is too small on Cortex-A8
<ssvb> and the old entries are just evicted when running large code with huge number of branches
<wingrime> ssvb: there is something we can do
<ssvb> and because of associativity and aliasing effects, even the minor shifts in branch addresses because of unrelated code insertion/deletion could affect average prediction rate
n01 has quit [Ping timeout: 255 seconds]
<wingrime> ?
<wingrime> can we do magic alignment ?
<ssvb> if it's associativity and collisions problem for BTB entries, then it's kind of random and hard to control
<ssvb> some explanation for set-associative cache is here - http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/set.html
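The aliasing effect ssvb describes can be illustrated with a toy set-associative table. Everything below is an assumption made for the sake of the example: the indexing scheme (low bits of the instruction address), the table size and the function names are hypothetical and are not the real Cortex-A8 BTB organisation.

```python
def btb_set_index(branch_addr, num_sets, insn_bytes=4):
    # Hypothetical indexing: low bits of the instruction address pick the set.
    return (branch_addr // insn_bytes) % num_sets


def count_conflicts(branch_addrs, num_sets, ways):
    # Count branches that cannot all stay resident at once:
    # in each set, anything beyond `ways` entries has to evict another entry.
    per_set = {}
    for addr in branch_addrs:
        idx = btb_set_index(addr, num_sets)
        per_set[idx] = per_set.get(idx, 0) + 1
    return sum(max(0, n - ways) for n in per_set.values())


# Two hot branches that happen to map to the same set of a tiny 8-set,
# 1-way table keep evicting each other.
print(count_conflicts([0x1000, 0x1020], num_sets=8, ways=1))

# Inserting one unrelated 4-byte instruction before the second branch
# shifts it to 0x1024, which lands in a different set.
print(count_conflicts([0x1000, 0x1024], num_sets=8, ways=1))
```

This is why an unrelated code change can move the average prediction rate in either direction: which branches collide depends on their exact addresses.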
<wingrime> ssvb: what do you think about my last patch
<wingrime> ssvb: moving to workqeue
<oliv3r> while you all speak mumbo jumbo now
<oliv3r> :p
<oliv3r> is this arm-core specific, or Allwinner implementation specific (from the hardware)
<oliv3r> or is this all 'softwinner's bad work'
<ssvb> wingrime: I'm not really a kernel developer, better ask mnemoc or mripard_ :)
<ssvb> wingrime: but the current allwinner dma code looks horrible to me, so the improvements should be generally a good thing
<wingrime> ssvb: at least in this state it works well under the stress test
<wingrime> but after the "workqueue" patch it hangs on the stress test
<wingrime> I understand where, but I don't understand why
eebrah_ has quit [Read error: No route to host]
eebrah_ has joined #linux-sunxi
<ssvb> like I said before, it might be some latent bug just exposed by unrelated changes
<ssvb> it's always dangerous to touch messy codebase
<wingrime> ssvb: agreed, simply put, I moved code from the interrupt handler to a kernel thread that runs LATER
<wingrime> ssvb: we must make IRQ handlers as small as possible
<wingrime> ssvb: that makes the kernel work more smoothly
<ssvb> sure
<wingrime> ssvb: and improves responsiveness
<wingrime> ssvb: we make the code spend less time in the "locked" state
<wingrime> ssvb: there are some side effects that I have not tested
<wingrime> ssvb: it's really dangerous to call a callback in irq context
<wingrime> ssvb: I noticed that wemac does TX in that callback
<oliv3r> emac* :p
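The split wingrime is describing (a short interrupt handler that only queues work, with the heavy part running later in a thread) can be sketched in user-space Python. This is an analogy only: the names irq_handler and worker are invented for the example, a queue.Queue stands in for a kernel workqueue, and none of this reflects the actual sunxi dma driver.

```python
import queue
import threading

work = queue.Queue()   # stands in for a kernel workqueue
log = []               # what the deferred handler actually did


def irq_handler(channel):
    # "Top half": only record which channel finished, then return quickly.
    work.put(channel)


def worker():
    # "Bottom half": runs later in an ordinary thread, where slow work
    # (callbacks into other drivers, cache maintenance) is acceptable.
    while True:
        channel = work.get()
        if channel is None:   # sentinel value: stop the worker
            break
        log.append(f"buffer done on channel {channel}")


t = threading.Thread(target=worker)
t.start()
for ch in (0, 1, 2):          # three simulated "interrupts"
    irq_handler(ch)
work.put(None)
t.join()
print(log)
```

With a single worker the events are handled in the order they were queued, but only after the handler has returned, which is the latency trade-off being debated here.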
<oliv3r> ssvb: wingrime so is this just bad code that causes bugs, or is the arm core bugged and the code works around it?
<oliv3r> i mean, the arm core, they bought from ARM; so the code should be pretty ironed out?
<wingrime> ssvb: NAND_WaitDmaFinish simply waits endlessly; I should add a timeout and figure out what's wrong
<wingrime> oliv3r: the ARM core must have a proper config and good devices around it
<wingrime> oliv3r: there are many bottleneck effects around
eebrah_ has quit [Ping timeout: 255 seconds]
<wingrime> oliv3r: slow dram, slow sdram, not much L1/L2 cache, bad code around, slow nand ctl, bad PCB tracing
<wingrime> small details control everything
<oliv3r> cherry-pick won't compile
<oliv3r> i'll use your branch
<wingrime> good
<wingrime> please use HEAD~1
<wingrime> last patch is not stable
<oliv3r> ok i'll pull first
fra79Wii has quit [Quit: AndroIRC - Android IRC Client ( http://www.androirc.com )]
eebrah_ has joined #linux-sunxi
<oliv3r> /silo/build/sunxi-bsp/linux-sunxi/arch/arm/mach-sun4i/dma/dma.c: In function 'sw_dma_loadbuffer':
<oliv3r> /silo/build/sunxi-bsp/linux-sunxi/arch/arm/mach-sun4i/dma/dma.c:526:4: error: implicit declaration of function '__cpuc_flush_dcache_area' [-Werror=implicit-function-declaration]
<oliv3r> commit bfb2dfb06a15f8191631f2f9a1aed0ea945f29fd
<wingrime> ok
eebrah_ has quit [Ping timeout: 246 seconds]
<oliv3r> so can't test it :(
<wingrime> why?
<oliv3r> see those errors?
<oliv3r> :p
<oliv3r> can't build your kernel
<wingrime> what errors?
<wingrime> ok
<oliv3r> i pasted them 10 mins ago :p
<wingrime> ok wait
<wingrime> I just missed something for a10
<wingrime> oliv3r
<oliv3r> http://paste.debian.net/246258/ <- my base test; i'll add the others after you tell me to pull :)
<wingrime> simply add #include <asm/cacheflush.h>
<wingrime> at arch/arm/mach-sun4i/dma/dma.c
<wingrime> oliv3r: try some 60 Mb file with caching turned off
<wingrime> dd if=/dev/zero of=/media/0000-006F/test.zero count=100000 bs=1k oflag=dsync
<wingrime> oliv3r: test the disk, not memory copy
<oliv3r> 1k bsize yeah? i'll do oflag too
<oliv3r> wingrime: i did a 512mb file :p that should have been reasonable
eebrah_ has joined #linux-sunxi
wingrime has quit [Ping timeout: 256 seconds]
paulk-desktop has quit [Quit: Ex-Chat]
eebrah_ has quit [Ping timeout: 258 seconds]
shineworld has left #linux-sunxi ["Leaving"]
eebrah_ has joined #linux-sunxi
eebrah_ has quit [Ping timeout: 256 seconds]
simosx has quit [Quit: Αποχώρησε]
<oliv3r> wingrime: here's some results
<oliv3r> i ran both tests on mmc and nand, i cut off the bs=1k dsync version after a few minutes as it was horribly slow :p
<oliv3r> your dma patches do improve performance quite a bit.
<oliv3r> interestingly, overall mmc is faster, but with 1k blocks and dsync, nand is faster
<oliv3r> i cherry-picked the dma fix on stage/sunxi-3.4 + header change. and you're right, it was unstable. while waiting initially (after about 10 minutes) i had at least a reboot. but it was only the one
torqu3e has joined #linux-sunxi
vinifm has quit [Remote host closed the connection]