2013-06-24

<hglm> I saw some of your bandwidth numbers and at least it doesn't seem to be crippled (some improvement with same DRAM bus and speed as the A10)
<hglm> Is Cortex-A7 the dual core used in the A20 and Rockchip 3166?
<hglm> Yeah that's what I see, you can get 15-20% over standard (glibc for example) memcpy on Cortex.
<hglm> ssvb: yeah, the newer ARM cores do automatic predictive prefetching I think.
<hglm> Yeah memset easily reaches the maximum unless the code is really dumb.
<hglm> Regarding the kernel memset bug, that could be worth investigation to see whether it is the reason gcc 4.8+ can't be used to compile linux-sunxi and other ARM kernels. I already fixed it but haven't tried new gcc yet.
<ssvb> hglm: I kinda saved a guy from tuning optimizations for misconfigured hardware - http://lists.freedesktop.org/archives/pixman/2012-December/002455.html :)
<hglm> Did they fix that?
<ssvb> hglm: to add insult to injury, RPi was a bit misconfigured originally to have write-allocate enabled for L2 cache (which is not a very good idea because it has different cache line size than ARM core)
<hglm> ssvb: Yeah I know, the RPi is odd-ball probably because it was never designed to have a general purpose CPU
<hglm> And then there is the bug in memset that may be the reason newer versions of gcc can't be used to compile ARM kernels like linux-sunxi.
<ssvb> hglm: to put it mildly, the RPi hardware has a lot of weird quirks, so it is not a very good example
<hglm> For example on the RPi I saw an almost doubling of performance with a few tweaks to memcpy/copy_page, on allwinner memcpy also can go somewhat faster
<hglm> ssvb: I understand, but there is a lot of room for improvement.
<ssvb> hglm: it's hard to make something that is universally good everywhere
<ssvb> hglm: the optimal implementations of memcpy tend to be SoC specific (not even ARM core specific)
<hglm> Some critical functions are still optimized for a 15 year old ARM chip.
<hglm> Yeah but upstream there isn't much improvement I believe
<hglm> ssvb: I have been looking at the kernel ARM memcpy functions recently...they are slightly out-of-date to put it mildly

2013-06-22

<hglm> yeah most 1GB devices have the same DRAM parameters, but on a tablet the LCD screen will be different so you need the correct script.bin to get the LCD working.
<hglm> Superpelican: Yes, using the script.bin from Android will help make the LCD screen work, apart from HDMI
<hglm> But you can move it to a ramdisk which makes Midori a lot faster.
<hglm> Midori by default uses a diskcache in the home directory which is very slow on SDCARD.
<wingrime> hglm: and can't find fast browser
<wingrime> hglm: I use icewm
<hglm> OK. Did you notice faster window dragging with new driver in LXDE?
<hglm> wingrime: do you rerun autoreconf -vi and ./configure --prefix=/usr?
<hglm> I noticed there is room for improvement in the various memcpy implementations for linux-sunxi -- both in userland and in the kernel, although I guess userland (glibc) is distro-dependent.

2013-05-31

<hglm> ssvb: Thanks, and I did add my own copyright notices.
<ssvb> hglm: you are also free to add your own copyright notices when adding non-trivial changes
<ssvb> hglm: that's free software, derivative works are encouraged (unless you try to remove the original copyright notices and claim that it's written by you) :)
<hglm> ssvb: Are you OK with me uploading the experimental RPi port to a public repo? Since it's your code mostly.
<ssvb> hglm: but clean and maintainable support for multiple platforms would be surely a nice addition
<ssvb> hglm: yes, this is to be expected, writing clean code needs a bit more work than a quick proof of concept prototype
<hglm> ssvb: BTW I can upload my RPi port to github, but it does change a lot of identifiers and some configurations, it is not in a state mergeable with sunxifb.
<ssvb> hglm: ok, thanks
<hglm> ssvb: OK, I'll do that.
<ssvb> hglm: also about your G2D related patches for sunxifb, could you please pick a few initial patches that you think are ready to be pushed and make a separate branch for them?
<hglm> ssvb: Yeah IRQ would be nice. Not sure if they are using DMA IRQs in the RPi kernel though (you would assume so though).
<ssvb> hglm: this means a bit of kernel hacking and proper use of IRQ which can signal DMA completion
<ssvb> hglm: but in any case, if we want to also have a good RPi support, DMA needs to be taken into use
<hglm> ssvb: I understand, the RPi seriously lacks a good real CPU cache (that's why it's slow I think).
<ssvb> hglm: L2 cache is a part of the GPU, and ARM CPU can observe it as physical memory having L2 cache
<hglm> ssvb: L2 cache enabled? That explains some things. Is that a configuration bug?
<ssvb> hglm: btw, there is one more hack possible - https://github.com/ssvb/linux-rpi/commit/e9d325d1bce9b4bc921b38550d3839e5f0fb1dcf :)
<ssvb> hglm: also RPi is interesting because it has external L2 cache enabled for the framebuffer, that's making it kinda unique
<ssvb> hglm: yeah, shadowfb is 'cheating' a lot, but that's how fbdev ddx driver is normally designed to work
<hglm> ssvb: without shadowfb, with shadowfb the screen is updated only sporadically
<ssvb> hglm: fbdev with or without shadowfb enabled?
<hglm> ssvb: I am seeing 70-100% speedup for unaligned screen-to-screen blits using the sunxifb CPU back-end on the RPi compared to standard fbdev, 200% for rightwards overlapped blits (dragging windows to the right).
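A minimal sketch of why rightwards overlapped blits are the awkward case: when the destination lies to the right of the source on the same scanline, a plain forward copy overwrites pixels before they have been read, so the copy has to run backwards (or go through a second pass). This is not the sunxifb code; the helper name `copy_overlapping` and the one-byte "pixels" are made up for illustration.

```python
def copy_overlapping(buf, src, dst, n):
    """Copy n bytes inside buf from offset src to offset dst.

    Stays correct when the two ranges overlap by picking the copy
    direction, the same rule memmove() follows.
    """
    if dst > src:
        # Destination is to the right: copy from the end backwards so
        # source bytes are read before they get overwritten.
        for i in range(n - 1, -1, -1):
            buf[dst + i] = buf[src + i]
    else:
        # Destination is to the left (or identical): forward copy is safe.
        for i in range(n):
            buf[dst + i] = buf[src + i]


row = bytearray(b"ABCDEFGH")
copy_overlapping(row, 0, 2, 6)   # drag 6 "pixels" two places to the right
print(row)                       # bytearray(b'ABABCDEF')
```

A real driver would do the same thing per scanline with wide, aligned loads and stores rather than a byte loop.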
<hglm> ssvb: OK, I guess the two-pass approach in the CPU back-end works reasonably well with standard memcpy (because some of the arguments are aligned).
<hglm> ssvb: memcpy seems to be optimized on the RPi.
<hglm> ssvb: I ported the cpu backend to ARM simply using memcpy, it helps a lot.
<ssvb> hglm: have you implemented a VFP based overlapped copy function?
<ssvb> hglm: where does the improvement come from?
<hglm> ssvb: I guess so, I looked at dmaer, it could be improved.
<ssvb> hglm: they have a bit messed up dmaer module, but this should be just rewritten properly
<ssvb> hglm: well, this does not seem to be entirely true
<hglm> ssvb: No, I haven't touched "DMA-hell" on the RPi. They seem to require busy-waiting, no interrupt.
<ssvb> hglm: are you using RPI DMA for blits and fills?
<ssvb> hglm: :)
<hglm> ssvb: slightly off-topic: I ported sunxifb to RPi. It is a little faster than the standard fbdev they use.

2013-05-30

<hglm> Default defconfig had a lot of drivers and debugging options that could be disabled.
<hglm> I'm not a kernel hacker, but did compile the kernel.
<hglm> Did you run 'make defconfig'?

2013-05-25

<hglm> Most OpenGL code could be ported with a small effort, OpenGL ES 2.0 is mostly compatible with OpenGL 2.0.
<hglm> I was planning to make it freely available, but it is not yet. It does support features such as shadow map, shadow volumes, HDR rendering etc.
<hglm> It's a full 3D rendering engine. I developed it on OpenGL, but it also works (in a more limited way) on GLES. I also ported it to RPi.
<hglm> composited sounds slow, but I guess it's reasonably fast on a unified framebuffer.
<hglm> Ah, thanks. I have ported some of my own code but it currently runs in the console.
<ssvb> hglm: try glmark2-es2
<hglm> Are there any nice OpenGL ES apps to test, besides "test/test"? I guess most available software uses OpenGL not OpenGL ES.

2013-05-21

<hglm> Most software will be doing forwards copying
<hglm> ssvb: Yes, I noticed backwards copying was showing clearly higher numbers in those situations.
<ssvb> hglm: this also could be used as a workaround
<ssvb> hglm: forgot to mention one more thing - writing to the memory backwards is slightly more resistant to this screen refresh related performance drop than writing forward :)
<hglm> Dual core is nice, but on the A20 memory bandwidth will be a bottleneck.
<hglm> ssvb: I guess there were some bugs in the 4430.
<hglm> Yeah, OMAP was dual-channel which made me almost buy one for a tablet.
<hglm> ssvb: I guess designing an optimal DRAM controller isn't easy...
<ssvb> hglm: in theory :)
<hglm> A31 is dual-channel (64-bit DRAM bus) = 2 x memory bandwidth = fast
<techn_> hglm: yeah.. I just thought that there was enough bandwidth.. never done the math :p
<hglm> techn: It's not a bug, it's normal :) Just increase memory clock or lower resolution/color/refresh rate for optimal system.
<hglm> Screen refresh at 1920x1080x32bpp @ 60 Hz is 480 MB/s bandwidth - that is a lot of the bandwidth of the A10.
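A rough check of the scanout figure above, as a minimal sketch. It assumes the display controller reads every visible pixel of the framebuffer exactly once per refresh and ignores blanking and DRAM burst overhead; the function name `scanout_bandwidth` is made up for this illustration.

```python
def scanout_bandwidth(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second read from DRAM just to refresh the screen.

    Assumption: one full framebuffer read per refresh, no blanking overhead.
    """
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * refresh_hz


MIB = 1024 * 1024

# Compare the modes discussed in the log for a 1920x1080 display.
for bpp, hz in [(32, 60), (32, 50), (16, 60), (16, 50)]:
    rate = scanout_bandwidth(1920, 1080, bpp, hz)
    print(f"1920x1080 {bpp}bpp @ {hz} Hz: {rate / MIB:.1f} MiB/s")
```

Under these assumptions the 32bpp @ 60 Hz case comes to about 475 MiB/s, in line with the figure quoted above, and halving the color depth halves the scanout traffic.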
<hglm> ssvb: Do you have any idea whether Xorg fb's imageblt is optimized or suboptimal?
<hglm> techn: memcpy must read + write, and then you have DRAM latencies etc.
<hglm> ssvb: I guess cairo apps mostly use imageblt from off-screen to the screen.
<hglm> ssvb: Is it in any way possible to replay cairo-traces on screen? It seems to use off-screen rendering.
<hglm> ssvb: I was looking on the net for X benchmarks and downloaded cairo-perf-tools...then I noticed you have a trimmed version in your repo.
<ssvb> hglm: if somebody could run tinymembench on A20 or A31 system, we could get at least some preliminary information
<techn_> hglm: dunno
<hglm> techn: Does the A20 have faster memory or a wider memory bus? A10 is 360-80 MHz 32-bit.
<hglm> ssvb: tinymembench is a nice program, I hope it helps now that the info is on the wiki
<ssvb> hglm: and I also did the same benchmarks a long time ago, but nobody listened :)
<hglm> Yes certainly on standard memory clocks.
<hglm> The moral is that lowering the resolution, color depth, or refresh rate a little can speed up your system a lot.
<techn_> hglm: oh.. sorry :p
<hglm> techn: I think I did those benchmarks, the drop has to do with some kind of timing bottleneck in the DRAM controller and buffer.
<hglm> ssvb: Probably yes.
<ssvb> hglm: maybe you had scaler mode enabled?
<hglm> The sunxi-disp-vsync-demo was showing tearing on my system or wasn't smooth.
<hglm> ssvb: I've made available a patch for sunxifb for testing/evaluation purposes. Anyone who is using the sunxifb Xorg driver, feel free to test this patch. It may improve performance a little (especially when running in 16bpp mode and when the CPU load is high).

2013-05-20

<hglm> ssvb: I guess there could be room for improvement in how easily the boot-up mode can be selected. I have written a program to change modes at run-time, but that's not ideal.
<hglm> I see, I'm not sure whether most monitors like reduced blanking. It shouldn't matter much for an LCD I guess.
<hglm> ssvb: Reduced blanking has to do with mode timings?
<ssvb> hglm: it would be also interesting to try reduced blanking modes, because with lower pixel clock they should be stressing the RAM a bit less intensively
<ssvb> hglm: 56Hz ("valid" according to EDID data of most monitors) might be useful for cubieboard, because it runs memory at 480MHz
<ssvb> hglm: added some data about reducing the refresh rate to the wiki - http://linux-sunxi.org/Optimizing_system_performance#Reducing_refresh_rate_for_1920x1080_video_mode
<ssvb> hglm: ok, that's nice
<hglm> ssvb: I implemented 16bpp G2D Fillrect using 32bpp pixel format in up to three segments, it works pretty well for large triangles (double the throughput), for smaller triangles (100x100) I think the non-sequential memory access when drawing the edges makes the gain smaller
<slapin_nb> hglm: there are 2 flavors of 24bpp, packed and unpacked; also the byte order might be different; sometimes for pixman it is different ordering than for X11 (as pixman doesn't care for X's color representation).
<hglm> I just ran X in 24bpp mode, it worked, but there were a few rendering problems (lxterminal). I guess 24bpp has a few quirks (especially when the default pixmap depth of 32 doesn't correspond to the root window depth of 24).
<hglm> I also get that, with the actual frequencies specified. Useful for debugging display controllers.
<hglm> I guess monitors targeted at the PC market may not support HDMI TV modes. It may be a limitation of LCD circuitry or the system software inside the monitor. My monitor supports 50 Hz and 60 Hz (it has DVI and HDMI inputs).
<hglm> 24bpp console mode is running fine here, I need to set the refresh rate to 50 Hz to get decent memory copy bandwidth, it seems at 60 Hz there's some kind of bottleneck (with memory clock 408).
<hglm> It seems the linux console supports 24bpp also.
<ssvb> hglm: yes, NEON optimized 32bpp->24bpp conversion is faster than 32bpp->32bpp copy
<hglm> ssvb: I get it, that would be doable. And 32bpp->24bpp is fast when done efficiently (probably asm).
<ssvb> hglm: we need to only override the part which does shadowfb->framebuffer copy and replace it with 32bpp->24bpp conversion
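A minimal sketch of the shadowfb-to-framebuffer step described above: the 32bpp shadow buffer keeps four bytes per pixel, and the copy to a packed 24bpp framebuffer simply drops the unused fourth byte. The byte order (three color bytes followed by one padding byte) and the function name `pack_32bpp_to_24bpp` are assumptions for illustration; a real implementation would be NEON assembly working on whole scanlines.

```python
def pack_32bpp_to_24bpp(src):
    """Convert 4-byte pixels to packed 3-byte pixels.

    Assumption: each source pixel is three color bytes followed by one
    unused byte, and the three color bytes keep their order.
    """
    if len(src) % 4 != 0:
        raise ValueError("source length must be a multiple of 4 bytes")
    pixels = len(src) // 4
    out = bytearray(pixels * 3)
    # Copy each color channel with a strided slice; byte 3 of every
    # source pixel (the padding/alpha byte) is never read.
    out[0::3] = src[0::4]
    out[1::3] = src[1::4]
    out[2::3] = src[2::4]
    return bytes(out)


shadow = bytes([0x10, 0x20, 0x30, 0xFF, 0x40, 0x50, 0x60, 0xFF])
print(pack_32bpp_to_24bpp(shadow).hex())   # 102030405060
```

The output is 25% smaller than the input, which is why the conversion writes less memory than a straight 32bpp to 32bpp copy.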
<ssvb> hglm: the xorg does not need to care, it will only see and work with the 32bpp shadow framebuffer
<ssvb> hglm: but this brings all the shadowfb drawbacks and disables G2D acceleration :)
<hglm> ssvb: I see. Does xorg still support 24bpp?
<ssvb> hglm: I agree that it's a pita to work with, that's why a shadow framebuffer using 32bpp format could be used with real 24bpp framebuffer
<hglm> ssvb: 24bpp is never easy to test, the pixels are a little awkward :)
<ssvb> hglm: yes, 3 bytes per pixel framebuffer seems to work, got a bit of doubt when trying to verify it because of a glitch
<hglm> ssvb: the pixel format flag for 24bpp exists, not sure it would work. The chip does support a lot of formats.
<ssvb> hglm: hmm, or maybe not, somehow I thought that the display controller supported 24bpp for scanout
<hglm> ssvb: So sunxi really supports 24bpp natively? It wouldn't be easy/possible to integrate with other video components (console, VE, G2D, Mali) though.
<ssvb> hglm: for 32bpp graphics, we could probably also use some sort of shadowfb alike hack with the real framebuffer configured as 24bpp (3 bytes per pixel)
<hglm> ssvb: That makes sense I guess. BTW: tinymembench shows a big jump running at 50 Hz (1920x1080x16bpp) compared to 60 Hz, 20% faster copy.
<ssvb> hglm: I'm currently looking at EDID parsing code to check if we can maybe automatically make it prefer 50Hz if this refresh rate is supported
<ssvb> hglm: but many people would still use 32bpp by default or because of thinking that it is better for them
<ssvb> hglm: ok, makes sense
<hglm> ssvb: I'm running 1920x1080 at 16bpp now (50 Hz).
<ssvb> hglm: driving 1920x1080 monitor at 60Hz and 32bpp is degrading memory speed a lot even with 480MHz dram clock
<ssvb> hglm: and you also need 432+ memory clock frequency to get the best results
<hglm> ssvb: It seems to work, I'll check the memory bandwidth.
<ssvb> hglm: LCDs are a bit different, 60Hz also used to be bad on CRT
<ssvb> hglm: just try it, in the worst case the monitor will refuse it with something like "out of range" error message :)
<hglm> ssvb: Not yet, not sure my monitor supports it. I may have an aversion to low refresh rates due to dealing with CRTs in the past :)
<ssvb> hglm: have you tried 50Hz refresh rate for 1920x1080?
<ssvb> hglm: yes, it is surely fun
<hglm> ssvb: Hi. It's been fun experimenting with the X driver.
<ssvb> hglm: hi

2013-05-19

<ssvb> hglm: yes, I will reply in a few minutes, thanks
<hglm> ssvb: I did some benchmarks with the sunxifb driver. I posted some suggestions on github.

2013-05-18

<hglm> Bootloaders usually let the BIOS initialize the graphics...not possible for an embedded chip unless there is display-controller ROM or something.
<hglm> That would require initializing all the display controller registers, but you have no control over the output configuration (there's no BIOS). You would need HDMI code etc etc.
<ssvb> hglm: we have all the hardware, like the pieces in a chess or tetris game, we just need to use it all efficiently :)
<hglm> Maybe there can be some parallelism between CPU rendering and G2D (if the areas drawn do not intersect).
<ssvb> hglm: I'm not an Android guy and only briefly looked at gralloc implementation, there is a use of Mali mixed with G2D and display controller layers
<ssvb> hglm: G2D (Mixer Processor) has a fixed pipeline with limited features, there is no way to avoid fallbacks to CPU rendering for some operations in X
<hglm> ssvb: Something like an OpenGL-like pipeline would be useful. Isn't there a 100% GPU rendering path in Android? Unlikely that it would perform great on an embedded GPU though.
<ssvb> hglm: software can be fixed, but hardware limitations in A10 are still going to cost some performance
<ssvb> hglm: the use of blocking ioctls to do any graphics operation which sleep waiting for a completion is a braindead software design
<hglm> ssvb: I think it may also reduce context switches to the kernel and make it easier for the kernel scheduler to handle all those short sleep requests.
<ssvb> hglm: I suppose it can prevent irq spam when doing lots of small graphics operations
<hglm> ssvb: OK, I guess command queue may help in an X-style driver.
<ssvb> hglm: from what I could see in the newer sources, A31 has an updated G2D hardware with the support for premultiplied alpha and command queue
<hglm> ssvb: I see, I guess sleeping is better than busy-waiting at least.
<ssvb> hglm: do you mean the current G2D driver from allwinner? it sleeps waiting for an irq, which is not terribly great design
<hglm> Does the G2D driver do busy-waiting for command completion or is it fully parallel with CPU processing (using an interrupt or similar)?

2013-05-17

<hglm> gl
<lkcl> all good: thanks hglm, thanks hno.
<lkcl> hglm: eek!
<hglm> lkcl: Have you tried wiggling/forcing the HDMI plug? They can be unreliable. For example on my tablet I sometimes have to wiggle/apply some force to my miniHDMI plug to get it recognized.
<lkcl> hglm: yes
<hglm> lkcl: Have you tried rebooting with script.bin with changed screen0_output_mode?
<lkcl> hglm: do you have a known-good kernel config i can compare against? (stage/sunxi-3.4 preferably)
<lkcl> hglm: not yet. let's try...
<hglm> lkcl: Have you tried changing screen0_output_mode to a lower resolution that your monitor supports?
<lkcl> hglm: ack
<hglm> lkcl: yes, I meant that. But if it's 32 it should be normal. You could remove the option because 32 is the default I think.
<lkcl> hglm: ahh... did you mean sunxi_fb_mem_reserve=32 ?
<lkcl> hglm: ack
<hglm> lkcl: screen0 looks OK, scaler mode is disabled, maybe try enabling that. And don't use fb_framebuffer_size=16 in the kernel command line options.
<lkcl> hglm: i have another board, it's got android, and it works fine on HDMI
<lkcl> hglm: star
<hglm> lkcl: OK
<lkcl> hglm: any chance you could take a look - http://hands.com/~lkcl/eoma/script.fex ?
<lkcl> hglm: was experimenting with that yesterday.... 1sec...
<hglm> lkcl: Maybe you need to change the script.bin (fex file) settings.
<lkcl> hglm: tried that (a dozen times yesterday....) :)
<hglm> lkcl: maybe changing kernel cmdline options helps, not using EDID.
<lkcl> hglm: hmmm, the monitor keeps switching itself off :)
<hglm> lkcl: Yeah, that should give a list of modes supported by the monitor. Means the kernel disp driver doesn't report any mode as supported.
<lkcl> hglm: should there be any "supported" modes? or am i going to need hdmi set (with "force")?
<lkcl> hglm: ^
<lkcl> hglm: it looks good. yes screen0, hdmi fine
<hglm> lkcl: my program only works on screen0 at the moment and doesn't support VGA or TV output. Does running a10disp info give any clues?
<JonnyH> hglm: I wish - it's a non-gpl compliant binary blob libnand.a that's linked into a module so I doubt I can just link it into the kernel
<lkcl> am just trying out hglm's program to get hdmi up-and-running. fuuuun
<hglm> JonnyH: Maybe compile nand.ko into the kernel (not module).
<JonnyH> hglm: It seems to build a binary 'nand.ko' - I guess that needs to be loaded from a ramdisk
<hglm> JonnyH: Not sure but maybe try a ramdisk root fs to see if it works?
<JonnyH> hglm: I've got the kernel at git.hands.com booting on my a31 tablet but seems to have difficulty loading the root fs off the nand
<hglm> JonnyH: I am not an expert, but probably too different from a1x to be supported easily. Probably better chances for the a20.
<lkcl> hglm: cool. i think that's exactly what i need, trying to get hdmi output working on the a10eoma68 board
<hglm> ssvb: I noticed that, yes. I guess Linux will use /dev/fb0 by default.
<ssvb> hglm: this is configured via "disp_mode" in http://linux-sunxi.org/Fex_Guide#.5Bdisp_init.5D
<hglm> Yes, I use them in my program (and ignore /dev/fb1 (screen1) for setting modes).
<ssvb> hglm: depends on configuration, two different framebuffers are also supported and visible as /dev/fb0 and /dev/fb1
<hglm> ssvb: Is dual output mode (different framebuffers) supported in linux-sunxi? Or only same framebuffer?
<ssvb> hglm: yes, for example driving two 1080p monitors is particularly bad for the memory bandwidth remaining available to applications - https://github.com/ssvb/tinymembench/wiki/Mele-A2000-%28Allwinner-A10%29
<hglm> I am using it to switch reliably from LCD to HDMI and vice versa, and to set different HDMI modes with 32bpp and 16bpp...I measured how much difference it makes in benchmarks (memory bandwidth).
<techn__> hglm: you should be able to make pipe and use different properties of layers combined. normal->scaler->output
<hglm> I am not touching the layers except setting the working mode and modifying scaler mode :)
<techn__> hglm: nice. have you tried to pipe layers? :)
<hglm> I wrote an improved utility to change the display output mode on tablets and other A10 devices: https://github.com/hglm/a10disp Read the README before running it. It has only been tested on one device (my tablet). You can use it to set a lower resolution or 16bpp mode to improve system performance.
<hglm> It should be safe I think. The -j flag is more important for memory usage for compiles (number of parallel compilations).
<hglm> It's a gcc flag, I don't think the extra amount of memory it uses is significant, it should just be faster.
<hglm> Ah
<hglm> mnemoc: Thanks, it works now!
<mnemoc> hglm: try now :)
<hglm> Still the same error message.
<hglm> I will.
<mnemoc> hglm: can you try now?
<hglm> Mailer returned: Unknown error in PHP's mail() function.
<mnemoc> hglm: can you paste me the whole error?
<hglm> The linux-sunxi wiki's email confirmation seems to be broken (it says PHP mail error). I wanted to edit a page, but without confirmation I can't.

2013-05-16

<hglm> I don't suppose you can run F2FS with a 3.4.x sunxi kernel...
<hglm> Thanks, I'll look that up.
<specing> hglm: F2FS
<hglm> What's the best root file system for an sdcard? ext3? ext4 with write-back mode? Disable journalling? Any experimental flash filesystems?
<hglm> A10 tablets were advertised as 1.2 GHz when they were running at 1.0 GHz, is Antutu really measuring the speed or just reporting a spec string?
<hglm> lol, I had the same problem a few days ago, those microSD slots have a lot of recoil, I found it fortunately.
<hglm> Loco, gentoo is not ideal for smaller/slower systems -- compiling everything takes a long time. It's a nice concept though, I've used it on a PC.
<hglm> Just wondering, is it safe to mount internal NAND partitions on a tablet, or even write to them? (Linux is on the sdcard)
<ssvb> hglm: good :)
<ssvb> hglm: btw, scaler mode does not fully solve the problem, I guess it just introduces a larger intermediate buffer between reading the data from the framebuffer and sending it over hdmi, so underruns are less likely (but still can happen)
<hglm> It seems to work now! I disabled scaler mode.
<hglm> ok
<ssvb> hglm: but if you are interested in 16bpp, then it should not be affected
<ssvb> hglm: yes, there is some issue apparently related to Mali GPU starving the framebuffer scanout for high resolutions, so that you can observe rolling waves on screen
<ssvb> hglm: and if you enable scaler mode on a13, you can say goodbye to the hardware scaled video playback
<hglm> ssvb: I had the impression that scaler mode helps with the stability of higher resolutions like 1920x1080, not sure though.
<ssvb> hglm: a10 has two scalers, but a13 has only one
<ssvb> hglm: do you really need scaler mode? it wastes one hardware scaler which could have a better use
<ssvb> hglm: yes, with scaler mode enabled, 16bpp is not supported, so fbcon thinks that it uses 16bpp but the scanout is still done as 32bpp
<hglm> ssvb: It didn't quite work, display was messed up (half screen), I think it's because SCALER mode wasn't disabled (it should be disabled in 16bpp).
<hglm> ssvb: OK, will try.
<ssvb> hglm: try "fbset -depth 16 -rgba 5,6,5,0"
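For reference, the `5,6,5,0` argument gives the bit lengths of the red, green, blue and alpha fields of a 16bpp pixel. A minimal sketch of what that layout means for one pixel follows; it assumes the common RGB565 arrangement with red in the top bits, which is an assumption about this driver rather than something stated in the log, and the helper names are made up.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit pixel (5 + 6 + 5 bits, no alpha).

    Assumption: red occupies bits 15-11, green bits 10-5, blue bits 4-0.
    """
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(pixel):
    """Split a 16-bit pixel back into its 5-, 6- and 5-bit fields."""
    return (pixel >> 11) & 0x1F, (pixel >> 5) & 0x3F, pixel & 0x1F


print(hex(pack_rgb565(255, 0, 0)))   # 0xf800
print(hex(pack_rgb565(0, 255, 0)))   # 0x7e0
print(hex(pack_rgb565(0, 0, 255)))   # 0x1f
```

Each pixel takes two bytes instead of four, which is where the memory bandwidth saving of 16bpp mode comes from.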
<hglm> ssvb: Thanks, fbset -rgba 5650 didn't work for me though.
<ssvb> hglm: also you can change video mode at runtime using 'fbset' tool, X server uses the same ioctl to do this
<ssvb> hglm: you can set 16bpp mode in .fex file - http://linux-sunxi.org/Fex_Guide#.5Bdisp_init.5D
<hglm> OK, if X/framebuffer run in different depth that could cause problems when switching VTs.
<Turl> hglm: ask ssvb to be sure
<hglm> Turl: So 16bpp is supported in the X but not console fb?
<hglm> Mostly console, sometimes X.
<Turl> hglm: you're using X right?
<hglm> Turl: I have tried, but I don't know how, kernel command line options for 16bpp for format and seq didn't seem to work.
<Turl> hglm: you could use a lower color depth too
<hglm> Lowering framebuffer resolution has a bigger effect on performance though.
<hglm> I might try it again sometime.
<Turl> hglm: I didn't see anything weird happening
<hglm> Turl: Is it stable? I ran 2G/2G with nohighmem and it was about 5% faster in some memory-intensive benchmarks, but I had some kernel oops that might be related.
<hglm> Has anyone tried any edgy kernel optimizations like 2G/2G split/no highmem, or Thumb2?
<hglm> I wouldn't mind fast accelerated 2D graphics in X.
<slapin_nb> hglm: as soon as G2D is well supported, it will be mandatory, I think
<hglm> I thought G2D is 2D acceleration (which is not enabled by default in X and doesn't work too well), VE is video acceleration (CedarX), I think.
<slapin_nb> hglm: G2D is good for watching videos it seems...
<slapin_nb> hglm: I don't see a point in disabling G2D, disabling mali is good, or disable all these things on a headless device
<hglm> No, Mali is still enabled, VE is disabled and G2D is disabled.
<slapin_nb> hglm: disabled mali?
<hglm> I have 950MB free now, that's a nice boost (was 835MB). It would really make a difference on a 512MB device.
<shine|meeting> sorry hglm I mean
<hglm> I have a 1920x1080 monitor but using a lower resolution like 1280x720 (scaled by the monitor) speeds up all aspects of the A10 by increasing memory bandwidth.
<hglm> 16M should be just enough for double buffering on 1920x1080.
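A quick sanity check of the 16M figure, as a minimal sketch. It assumes 32bpp, tightly packed scanlines (no stride padding or alignment) and that the reserve value is in MiB; the helper names are made up for illustration.

```python
MIB = 1024 * 1024


def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Memory needed for `buffers` tightly packed framebuffers."""
    return width * height * bits_per_pixel // 8 * buffers


def fits(reserve_mib, width, height, bits_per_pixel, buffers=1):
    """True if the framebuffers fit into a reserve of reserve_mib MiB."""
    return framebuffer_bytes(width, height, bits_per_pixel, buffers) <= reserve_mib * MIB


print(fits(16, 1920, 1080, 32, buffers=2))   # True
print(fits(16, 1920, 1200, 32, buffers=1))   # True
print(fits(16, 1920, 1200, 32, buffers=2))   # False
```

Two 1920x1080 buffers at 32bpp need about 15.8 MiB, so 16M is indeed just enough; 1920x1200 fits single-buffered but not double-buffered under these assumptions.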
<hglm> I wrote a little utility to switch modes using the new disp driver (switch LCD/HDMI and between HDMI modes).
<hglm> OK, by default the fb is too small if your disp init mode is small. That option is useful.
<hansg> hglm, with the 16M I use you can go up to 1920x1200, and more than that the hdmi out cannot handle. Actually I've had to add code to kick the hdmi encoder real hard to make it do 1920x1200 in the first place
<hglm> The extra fb memory is handy if you want to change modes to a higher resolution after boot.
<hglm> I noticed double buffering isn't really working (or hard to get working) in OpenGL ES2 anyway.
<hglm> OK
<hglm> what about mali for framebuffer?
<hglm> And w
<hglm> You mean compiled and installed sunxi-mali for X?
<hansg> hglm, right, unless you've installed the mali blob
<hglm> OK, thanks, I think the X server uses only fb by default (although I use the console mostly).
<hansg> hglm, assuming that like me you're only using the framebuffer and not any of the other gfx / video blocks
<hansg> hglm, Add something like this to the kernel boot cmdline: sunxi_g2d_mem_reserve=0 sunxi_ve_mem_reserve=0 sunxi_fb_mem_reserve=16 sunxi_no_mali_mem_reserve
<hglm> Thanks, 80MB for VE (video acceleration?) is a bit much.
<oliv3r> hglm: yes, read hansg's README for his fedora 18 image; i THINK it might also be documented on the wiki
<hglm> I noticed there's a lot of reserved memory after you boot...(VE, G2D, LCD) -- is there any way to make this available to the OS when you don't need video acceleration for example?
<hglm> I just did a test and it seems A10 sdcard speed is at least as good as internal NAND. I did get one "too much ecc err" warning when reading internal NAND.
<hglm> Has anyone tried installing on internal NAND of an A10 tablet instead of SD-card? Is it safe or faster?
<hglm> It's a bug in gcc that it doesn't give a warning when a negative value is assigned to a char on a machine where char is unsigned by default.
<hglm> I guess so, yes. I think there may be apps too that are not in Debian etc. armhf because they need to be compiled with -fsigned-char.
<oliv3r> hglm: the kernel (and its drivers) should use u8 and s8 for that reason
<hglm> Compiling with -fsigned-char may solve some problems.
<hglm> I am guessing this feature is a major reason why some apps break in armhf.
<hglm> I just read man gcc, it seems char can be unsigned or signed, you have to use "signed char" to get guaranteed signed char.
<hglm> It's very easy to reproduce: char c = -1; printf("%d", (int)c); prints out 255.
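A minimal sketch of what is going on in that one-liner: `char c = -1` stores the byte 0xFF either way, and the printed value depends only on whether plain `char` is treated as signed or unsigned when it is widened to `int`. Plain `char` is unsigned in the ARM Linux ABI, hence 255. The helper names below are made up, and Python is used here only to show the two interpretations of the same byte.

```python
def as_signed_char(byte_value):
    """Interpret one byte the way a signed char would be widened to int."""
    return int.from_bytes(bytes([byte_value]), "little", signed=True)


def as_unsigned_char(byte_value):
    """Interpret one byte the way an unsigned char would be widened to int."""
    return int.from_bytes(bytes([byte_value]), "little", signed=False)


stored = (-1) & 0xFF             # 0xFF: the byte left behind by `char c = -1`
print(as_signed_char(stored))    # -1  (plain char signed, e.g. x86)
print(as_unsigned_char(stored))  # 255 (plain char unsigned, e.g. ARM)
```

Code that needs negative values in a byte should declare `signed char` (or `s8` in the kernel) explicitly instead of relying on the default.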
<hglm> OK, I'll investigate that. I fixed the problem by using unsigned char instead of signed char.
<n01> hglm: can you produce the error and paste the problem somewhere?
<hglm> Could it be a compiler bug then?
<n01> hglm: weird o_o