Topic for #qi-hardware is now Copyleft hardware - http://qi-hardware.com | hardware hackers join here to discuss Ben NanoNote, atben / atusb 802.15.4 wireless, and other community driven hw projects | public logging at http://en.qi-hardware.com/irclogs
<roh> oomk are stupid anyhow.
Textmode has joined #qi-hardware
urandom__ has quit [Remote host closed the connection]
phirsch has quit [Ping timeout: 244 seconds]
phirsch has joined #qi-hardware
Aylax- has joined #qi-hardware
Aylax has quit [Ping timeout: 260 seconds]
methril has quit [Quit: Leaving]
xwalk has quit [Read error: Connection reset by peer]
xwalk has joined #qi-hardware
Aylax- has quit [Quit: Bye]
compcube has joined #qi-hardware
compcube has quit [Changing host]
compcube has joined #qi-hardware
GNUtoo has quit [Ping timeout: 245 seconds]
xwalk_ has quit [Quit: Leaving]
xwalk has quit [Read error: Connection reset by peer]
xwalk has joined #qi-hardware
methril has joined #qi-hardware
xwalk_ has joined #qi-hardware
compcube has quit [Quit: Leaving]
pabs3 has quit [Ping timeout: 260 seconds]
pabs3 has joined #qi-hardware
kristianpaul has quit [Ping timeout: 248 seconds]
kristianpaul has joined #qi-hardware
kristianpaul has joined #qi-hardware
rejon_ has joined #qi-hardware
xwalk_ has quit [Ping timeout: 240 seconds]
xwalk_ has joined #qi-hardware
jekhor has joined #qi-hardware
<viric> can it be that I have a hang on oomk, because I have the syslog outputting to a tmpfs? maybe it needs to allocate tmpfs pages to save what oomk printks.
<viric> and deadlocks.
<viric> mh it may be related to reiserfs too... grr
larsc has joined #qi-hardware
Aylax has joined #qi-hardware
Textmode has quit [Ping timeout: 255 seconds]
<roh> viric: people using reiserfs wondering over strange system behaviour are weird anyhow ;)
<roh> viric: you could try syslogging to udp network and let another host on the local lan dump it to disk. that way you may have a chance to see your logdata
Aylax- has joined #qi-hardware
Aylax has quit [Ping timeout: 260 seconds]
<viric> roh: yes, I'm building the netconsole module now
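The suggestion above (send log data over UDP and let another host on the LAN write it to disk) can be sketched from the receiving side. This is a minimal sketch, not a replacement for a real syslog daemon: the function names `open_listener` and `dump_datagrams`, the default port 6666 and the output file name are assumptions for illustration, and the port must match whatever the sender (netconsole or a syslog forwarder) is configured to use.

```python
import socket
from typing import BinaryIO


def open_listener(host: str = "0.0.0.0", port: int = 6666,
                  timeout: float = 5.0) -> socket.socket:
    """Bind a UDP socket. The port is an assumption; match the sender's setting."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock


def dump_datagrams(sock, out: BinaryIO, count: int) -> int:
    """Append up to `count` received datagrams to `out`, one record per line.

    Returns the number of datagrams written; stops early on a socket timeout.
    """
    written = 0
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        # Payloads are plain text; strip trailing NUL/newline bytes and
        # terminate every record with exactly one newline.
        out.write(data.rstrip(b"\x00\n") + b"\n")
        # Flush per record so the data is on its way to disk before the
        # sending machine hangs.
        out.flush()
        written += 1
    return written
```

On the logging host one would open a file in append-binary mode, for example `open("remote-kernel.log", "ab")`, and call `dump_datagrams(open_listener(), logfile, 100)` in a loop. Flushing after every record is deliberate: the last lines before a hang are the ones that matter.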
<viric> roh: reiserfs worked very stable for me for the last 10 years, but in 3.3 and 3.4 I'm seeing deadlocks in it
<viric> stable = no deadlocks, never lost data.
<viric> reiserfs was famous for using the BKL a lot... and I think that recently someone broke up the locks into something more scalable. And maybe there are deadlocks now.
<viric> (scalable in terms of SMP)
<roh> dunno. i am happy with ext3/4
<Aylax-> Just out of curiosity, why are you using reiserFS?
<viric> because I lost data with ext3 :)
<viric> several times
<viric> (on system hang, and things like those)
<roh> eh. no. that doesnt happen with unbroken hw. dead ram/diskcontroller/etc yes... but there is raid and ecc against that
<viric> but I can't tell for ext4. Never used it still. But I don't trust at all the promises of ext3 journaling
<viric> nah, the hw has always been the same. ext3 - corrupted files. reiserfs, never corrupted files.
<viric> But maybe ext4 fixes all those things.
<viric> I don't know.
<viric> I'm starting to use ext4, due to the recent deadlocks in reiserfs.
<viric> 2 days of ext4 already; I'll tell :)
<viric> last time I tried ext3, it took me 3 days to have my boot scripts full of binary data.
<viric> (that was around 2.6.34)
<roh> huh? without crashes?
<viric> Yes, there were crashes
<roh> due to what?
<viric> well, I was setting up a sheevaplug system... and sometimes I got left without network, ... and I powered off the device
<roh> ive only seen such things with broken blockstorage happen. means.. bad diskcontrollers, etc. anything that makes the blocks not keep state or corrupt below the fs layer.
<viric> I thought the journal would hold all fine.
<roh> sheeva is flash. there should never be ext on there without ubi or something similar between that and mtd.
<viric> Well, I remember experiencing the same kind of corruption on PC over spinning disks when I had used ext3 on PC... so I conclude that should be ext3. And replacing the fs with reiserfs always solved for me any troubles.
<Aylax-> They can't just use ext3 on top of mtd
<viric> roh: no, I don't use the flash. I used ext3 over SD.
<viric> mmc.
<viric> I used ext3 only temporarily, because I wanted to setup the initrd to have reiserfs modules. And before I could do that, the ext3 got corrupted :)
<roh> sdcards fail in a similar way when it comes to blocks. there is a dumb remapper in there but it's usually slow. on sd make sure the journal is 'in the front' where the FAT usually resides on msdosfs
<viric> roh: lots of people tell me that ext3 worked fine for them
<roh> gives better performance and less possibility of losing data due to unrecoverable lost blocks
<viric> well, my experience is that I lost data with ext3 (always *soon*), and never with reiserfs, no matter how bad I powered off the systems
<viric> so I don't care much about ext3 promises. :)
* roh needs to learn more about ubifs and ubi for such cases. usually i got squash and jffs2 on small stuff
<viric> If I had lost data only once, I could doubt. But it happened several times in my sporadic short usages of ext3.
<viric> But you'll find people that tell you the opposite: lost data with reiserfs, and never with ext3.
<viric> Why that? I've no idea.
<roh> i did a test of how well journaling keeps data sane about 10 years ago... reiser didnt even perform as well as xfs and i got that to 'crashing kernel on boot'
<viric> Most people that told me that ext3 worked fine for them, they use UPS and stable systems. :)
<roh> we needed something for the dvr storage drive inside a set-top-box
<viric> aha
<viric> reiserfs3 you mean, right?
<viric> 3.6
<roh> so i tested it by repeatedly unsafe shutdowns under write-load
<viric> well, my trouble with xfs is that it can't shrink
<roh> back then? dunno. 10-11 years ago. long time. but it was bad. across the board. ext always came up. took time to fsck (no ext3 back then)
<viric> (and they have those weird (although correct) semantics that used to leave 0-byte files)
<roh> but didnt lose data or even segfault on recovery like xfs.
<roh> reiser corrupted but always came up i think
<viric> roh: ah you mean xfs segfaulted?
<viric> I thought you meant reiserfs segfaulted, and xfs not.
<roh> fsck.xfs was a 'exit 0' dummy and the kernelcode to 'fix it on mount' just crashed the kernel. not nice
<viric> ok
<viric> yes, xfs never had fsck in the sense of 'run before mounting'
<viric> and they use that 'exit 0'.
<viric> The same does btrfs.
<roh> reiser actually had a 'if the kernel says its too bad to mount' fsck tool which even worked, but corrupted some files.
<roh> i have high hopes for btrfs. didnt test it tho.
<roh> yet.
kristoffer has joined #qi-hardware
<viric> I'm using it in my home and office computers...
<viric> It performs very bad for files that get randomly written (vm images, databases, ...)
<viric> because that gets very fragmented.
<roh> the only thing i am kinda annoyed by is sometimes the performance of my badly overcrowded ext3 disks in my workstations. everywhere else i have no issues.
<viric> for the rest, fine.
<viric> Ah, another issue with ext* filesystems.... I often ended up running out of inodes.
<viric> Well, for my taste, *too often*. I wouldn't expect it never to happen.
<viric> having free space, and not being able to write files.
<roh> hm. never had that. (outside of quotas)
<viric> reiserfs does not have inode limits
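The situation described above (free space left, yet no new files can be created) can be detected by comparing free inodes with free bytes. A minimal sketch using `os.statvfs`; the helper names `inode_report` and `out_of_inodes` are hypothetical, and the handling of a zero inode total is an assumption that some filesystems without a fixed inode table report it that way, which should be checked per filesystem.

```python
import os


def inode_report(path: str) -> dict:
    """Collect inode and free-space figures for the filesystem holding `path`."""
    st = os.statvfs(path)
    return {
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
        # Space available to unprivileged users, in bytes.
        "bytes_free": st.f_bavail * st.f_frsize,
    }


def out_of_inodes(report: dict, min_free_inodes: int = 1) -> bool:
    """True when space remains but fewer than `min_free_inodes` inodes are free.

    Assumption: a total of 0 means the filesystem has no fixed inode
    table, so it is never flagged.
    """
    if report["inodes_total"] == 0:
        return False
    return report["inodes_free"] < min_free_inodes and report["bytes_free"] > 0
```

Calling `out_of_inodes(inode_report("/"))` gives roughly the same answer as comparing `df -i` with `df` by hand.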
<viric> Finally I used ext3 only for /boot, due to broader loader support :)
<viric> roh: different users, different experiences! :)
lekernel__ has joined #qi-hardware
lekernel_ has quit [Ping timeout: 260 seconds]
<viric> anyway, why the sheevaplug deadlocks on OOMK, we'll see once I have netconsole.
<viric> I never used it before.
<viric> hm is it me, or their logo http://www.plugcomputer.org/ looks like a man going to poo, seen from the back?
<viric> well, not necessarily a man. but with a hairy bottom.
<Aylax-> Once you see it ... x_x
<viric> weird. :)
<roh> viric: i dont own one but i have heard the sheevas do like to overheat and corrupt ram
<viric> hm no, I can reproduce the troubles quite quickly
<viric> I never saw ram corruption - my only hangs are always related to oomk
<viric> roh: and it's not super-hanged... Just in deadlock. sysrq works fine.
<viric> roh: do you want to see the backtraces? I'll prepare a txt
kilae has joined #qi-hardware
<viric> oops that lacks the out_of_memory
<viric> http://sprunge.us/GdBO that's from the same session, where only sysrq answers.
wej has joined #qi-hardware
<roh> i'm a bit confused... reiser needs to dynamically allocate memory to get a write lock for sth. on disk?
<roh> atleast thats what that dump says to me
<roh> or is that just incidental in there since something else tries to get ram and its swapping to a file on reiser? /me confused
<viric> what makes you think it needs to allocate memory?
<viric> I think that the 'vm' threw away read-only pages from memory, expecting that they can be read from disk again when needed.
<viric> (.text sections for example)
<viric> Then, there are 'page faults' that trigger reiserfs_get_block.
<roh> yeah. well.. why does it try "reiserfs_write_lock_once" then?
<roh> do you need a lock to read sth?
<viric> reiserfs has a lock to read, yes.
<viric> who owns that lock, I've no idea...
<viric> I should have built kdb...
<roh> it's still weird that this doesnt happen in ram the kernel doesnt give away (static buffers)
<viric> 'this'?
<viric> what should happen in ram?
<viric> I think that the culprit is having one process locked at "out_of_memory"
<roh> hm. not 'in ram' 'in memory which is never unavailable to the kernel' .. like static buffers.
<viric> but what? what should be using static buffers?
<roh> i find it kinda odd for fs code to depend on dynamic memory allocation when that(the fs code) is what needs to work if you run out of free memory (need to swap)
<viric> What makes you think reiserfs depends on dynamic memory allocation?
<roh> it should USE memory for cache (as the fs cache does) but not depend on it.
<roh> viric: the trace. or do you see anything else in there?
<viric> I told you
<viric> 11:45 < viric> I think that the 'vm' threw away read-only pages from memory, expecting that they can be read from disk again when needed.
<viric> page faults that trigger read from disk. Nothing more. What do you see about dynamic memory allocation?
<roh> viric: in your other sysrqs
<roh> well. the oops
<viric> there are no oops
<viric> somehow I always related these hangs to tmpfs, not reiserfs...
<roh> hm. well.. why is the reiser in some mutex then? or is that just a snapshot (the lower trace)
<viric> maybe the trouble is the out_of_memory being called in a pagefault exception, simply.
<viric> roh: snapshots. sysrq-regs and sysrq-blocked, simply
<roh> the lower parts of both traces look kinda similar
<viric> maybe out_of_memory should never be called from a page fault exception! could it be that?
<viric> that would be a broad kernel bug though
<roh> up to [<c0068be4>] (filemap_fault+0x1e0/0x4b0) from [<c007e3dc>] (__do_fault+0x68/0x4b4)
lekernel__ is now known as lekernel
<viric> roh: yes, that's the page fault
lekernel has quit [Quit: Konversation terminated!]
<roh> __alloc_pages_nodemask is called from the fault, and thats ooms
<viric> I see. maybe that's the cause of deadlock: trying to allocate from a page fault
<roh> do you have weird overcommit settings maybe?
<viric> no no, default.
<roh> sigh. i'm out of ideas
<viric> it's pretty easy to reproduce: I don't have any swap at all, I've some tmpfs for /var/log and /tmp, and just have some process ask lots of ram
<viric> 1st oomk works fine, 2nd too...
<viric> and around 3rd or 4th, it deadlocks
<roh> doesnt make it better. a system should never have to oomk. doesnt help make stuff work better.
<viric> searching the www for "filemap_fault out_of_memory" gives deadlock situations
<roh> only leads to stuff like "sshd adjusting its /proc/self/oom_adj"
<viric> uh?
<viric> I only find 'crashes' :)
jurting has joined #qi-hardware
<viric> hm no, some people have successful oomk.
<viric> roh: usually, all my processes have oom score = 0.
<viric> That may mean that the oomk can't find any process to kill, and then it hangs
<viric> Or maybe all processes with oom_score > 0 are locked in reiserfs :)
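The claim that most processes have an OOM score of 0 can be checked by reading `/proc/<pid>/oom_score` for every process. A minimal sketch; the function names `read_oom_scores` and `kill_candidates` are hypothetical, and whether an empty candidate list actually explains the hang is only the hypothesis raised above, not something this code can confirm.

```python
import os


def read_oom_scores(proc_root: str = "/proc") -> dict:
    """Map each PID found under `proc_root` to its current oom_score."""
    scores = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                scores[int(entry)] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or the file was unreadable
    return scores


def kill_candidates(scores: dict) -> list:
    """Return PIDs with a non-zero score, highest score first."""
    return sorted((pid for pid, score in scores.items() if score > 0),
                  key=lambda pid: scores[pid], reverse=True)
```

Running `kill_candidates(read_oom_scores())` shortly before triggering the memory hog shows which processes have a non-zero score at that moment.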
<Aylax-> Speaking about filesystems. What FS would you recommend for a very slow SSD?
<viric> at this point, I've no idea :)
<viric> I'd say all fs work bad ;)
<Aylax-> ...
<roh> Aylax-: do you need rw or is ro ok?
<Aylax-> roh, I'd prefer r/w
<viric> Aylax-: btrfs has some wins, in that it always writes big new blocks (due to COW)
<roh> slowness doesnt really matter. you can always add more ram to cache that away. is it big?
<roh> if you have some embedded device which only needs rw for system updates and storing config, i think the best combo is the openwrt way. squashfs for the initial userland and jffs2 for the rest (configs, later installed packages)
<Aylax-> No, I can't add more RAM, the mother board does not accept more than 2GB
<Aylax-> It's my ACER AAO 110
<Aylax-> 8GB of a super-crappy SSD
<viric> I've a crappy 32GB SSD :)
<roh> well.. then swap it for a cheap 2.5" hdd
<roh> those are faster than slow ssd mostly. and a lot cheaper and bigger
<viric> Aylax-: why ask for software solutions, when you can have hardware replacement solutions? ;)
<Aylax-> viric: I tried btrfs, was very bad
<Aylax-> Because of one thing: fsync
<roh> i still use them in my thinkpad... proper ssd is much too expensive for my taste (and also fails like harddisks sometimes)
<Aylax-> roh: there's no space for a 2.5" hdd :-)
<Aylax-> 1.8" could fit
<roh> Aylax-: huh? the acer webpage said that its either coming with a 2.5" hdd or a ssd
<roh> weird
<Aylax-> Yes, but the SSD version is a bit different inside
<Aylax-> The SSD is a tiny PCI-E card
<roh> doh.
<roh> and what happened to the disk slot the other model has?
<viric> Aylax-: did you try it in recent kernels? it has lots of development
<Aylax-> No, I should try again
<Aylax-> Do they have a working fsck now?
<Aylax-> And support from Grub 2?
<viric> they have some fsck, but most of recovery is done in-kernel, mounting with "-o recovery"
<viric> someone wrote grub2 support, but it was someone at grub, not at btrfs.
<viric> grub2 has an incompatible license with btrfs
<Aylax-> Meh.
<viric> and so grub2 people have to write a btrfs reader not reading btrfs code.
<viric> (gplv3 vs gplv2)
<viric> I imagine it's not only the btrfs case.
<viric> if grub2 people want to keep the gplv3 label in their code, they can't pick gplv2-only code.
<Aylax-> The problem I had with btrfs is that it took hours just to install a random package on Debian
<viric> ok
<viric> I know apt does a lot of fsync
<viric> they've been working on that I think
<Aylax-> I tried last summer I think
xwalk_ has quit [Ping timeout: 240 seconds]
Aylax has joined #qi-hardware
Aylax- has quit [Ping timeout: 265 seconds]
Aylax has quit [Quit: Bye]
Aylax has joined #qi-hardware
Aylax has quit [Quit: Bye]
Aylax has joined #qi-hardware
DocScrutinizer06 has joined #qi-hardware
DocScrutinizer2 has joined #qi-hardware
DocScrutinizer has quit [Disconnected by services]
DocScrutinizer05 has quit [Ping timeout: 244 seconds]
<whitequark> roh: iirc jffs2 cannot work on non-mtds
jluis has quit [Remote host closed the connection]
<Aylax> whitequark: how does it compare vs. UBI
<Aylax> ubifs I mean
<roh> whitequark so what?
<roh> whitequark: if i dont have mtd i usually dont need jffs2 ;)
<whitequark> Aylax: ubifs is significantly faster than both jffs and especially yaffs on big mtds
<whitequark> roh: the ssd in that notebook isn't an mtd, but is a SATA drive
<whitequark> or something like that
<whitequark> a block device.
<whitequark> Aylax: while some of jffs/yaffs have a mode for block devices (I don't recall details, but there was something like that), ubifs only works on mtds
<viric> ubifs works only on ubi, and ubi works only on mtds.
<viric> something like that.
<roh> whitequark: yeah. true. but there is already badblock and ecc management in the ssd
<roh> whitequark: so one doesnt need jffs features
phirsch has quit [Ping timeout: 245 seconds]
phirsch has joined #qi-hardware
jekhor has quit [Ping timeout: 246 seconds]
<whitequark> viric: well, I do not know of anything else that works on top of ubi
<viric> me neither
GNUtoo has joined #qi-hardware
<roh> ubi should provide a 'cleaned' blockdevice upstairs. similar to regular disks
<roh> it does badblock reallocations and handling as well as write balancing for upper layers like ubifs.
<roh> jffs2 does that internally afaik.
<roh> both need mtd below (ubi as well as jffs2)
<roh> squash runs better on ubi afaik
<viric> 'squash'?
<roh> squashfs
<viric> can it run on ubi then?
<viric> or ubifs you mean?
kyak_ has joined #qi-hardware
Aylax- has joined #qi-hardware
kyak has quit [Ping timeout: 244 seconds]
kyak_ is now known as kyak
kyak has quit [Changing host]
kyak has joined #qi-hardware
Aylax has quit [Ping timeout: 256 seconds]
B_Lizzard has joined #qi-hardware
kristoffer has quit [Quit: Leaving]
GNUtoo has quit [Ping timeout: 260 seconds]
GNUtoo has joined #qi-hardware
<roh> viric: on ubi afaik
Aylax- has quit [Ping timeout: 252 seconds]
kristoffer has joined #qi-hardware
Aylax has joined #qi-hardware
<larsc> you can also run jffs2 ontop of ubi
<larsc> squashfs is read-only so you'd normally not need ubi underneath it
<roh> larsc: well.. if you got badblocks...
<roh> afaik squash cannot deal with holes on its own
<roh> on small nor thats usually not an issue, but on nand it is for sure (only the first block is guaranteed to be biterror-free on sale)
<larsc> hm
<viric> does anyone happen to know how a serial port (16550 based, in a PC) can be rendered unusable? All operations on it give EIO
<viric> dmesg shows ttyS0 and ttyS1 (as usual), but /sys/.../serial8250 only shows ttyS2 and ttyS3. weird.
<viric> stty -F /dev/ttyS1 worked first, but after some work, it only gives EIO.
<viric> maybe it's faulty hw... but I'd expect the serial port always to work
<larsc> viric: cd drivers/tty/serial; grep EIO *
<viric> :)
<viric> couldn't be easier
<larsc> what you see there is that all operations return EIO if the TTY_IO_ERROR flag is set
<viric> and how could I reset that?
<larsc> according to the code reopen the device
<viric> there was a moment where, more or less, one of every "stty -F /dev/ttyS1" worked fine, the rest gave EIO
<viric> nah, it wasn't that. lsof said it wasn't opened by anyone
<viric> weird
<viric> larsc: in http://www.easysw.com/~mike/serial/serial.html, it says it can give EIO in case DCD is not up
<viric> well, strange. I ended up rebooting the computer to get it back :)
<viric> thank you for the hints
<viric> maybe increasing the loglevel could say something
wolfspraul has quit [Quit: leaving]
wolfspraul has joined #qi-hardware
GNUtoo has quit [Quit: Program received signal SIGSEGV, Segmentation fault.]
GNUtoo has joined #qi-hardware
jekhor has joined #qi-hardware
Aylax has quit [Quit: Bye]
Aylax has joined #qi-hardware
Aylax- has joined #qi-hardware
Aylax has quit [Ping timeout: 244 seconds]
DocScrutinizer06 is now known as DocScrutinizer05
compcube has joined #qi-hardware
kristoffer has quit [Remote host closed the connection]
jekhor has quit [Ping timeout: 245 seconds]
jekhor has joined #qi-hardware
kilae has quit [Quit: ChatZilla 0.9.88.2 [Firefox 13.0/20120601045813]]
urandom__ has joined #qi-hardware
xwalk_ has joined #qi-hardware
emeb has joined #qi-hardware
emeb has quit [Remote host closed the connection]
rlifchitz has quit [Remote host closed the connection]
kristoffer has joined #qi-hardware
kristoffer has quit [Quit: Leaving]
Textmode has joined #qi-hardware
emeb has joined #qi-hardware
jekhor has quit [Ping timeout: 248 seconds]
jurting has quit [Ping timeout: 252 seconds]
Aylax- has quit [Quit: Bye]
B_Lizzard has quit [Remote host closed the connection]
urandom__ has quit [Read error: Connection reset by peer]
<qi-bot> [commit] Werner Almesberger: modules/Makefile (MODULES): add bat-clip-aa-th (master) http://qi-hw.com/p/kicad-libs/8d40b38
<qi-bot> [commit] Werner Almesberger: modules/pads-array.fpd: like pads.fpd, but in a array formations (WIP) (master) http://qi-hw.com/p/kicad-libs/86ce0c0
<DocScrutinizer05> wpwrak: http://www.youtube.com/watch?v=9Ww1RH8iAR4 :-D
<DocScrutinizer05> (my new toy)
<wpwrak> how to dramatically improve the personal hygiene of the average hacker ;-)
<DocScrutinizer05> lol
rejon_ has quit [Ping timeout: 244 seconds]
rz2k has quit [Ping timeout: 245 seconds]
rzk has joined #qi-hardware