<wpwrak> hmm .. "oggenc" ... let's see ...
<wpwrak> ah, but that's audio only :-(
<Thihi> .ogm is at least the filetype for an ogg movie file, I think
<Thihi> But I have no idea about encoders
<wpwrak> ah well, everyone has mplayer to see it anyway :)
<wpwrak> patch sent. upload is still running
<wpwrak> ... sync-before.mov is done, sync-after.mov at 71% ...
<wpwrak> lekernel: did you see this ?
<wpwrak> Chain_Control looks a bit suspicious. there's one definition in cpukit/score/include/rtems/score/chain.h and another one in doc/tools/bmenu/chain.h
<wpwrak> gdb tells me flickernoise uses the latter definition. i find the path name rather scary ...
<wpwrak> video upload complete
<errordeveloper> kristianpaul: ?
<wolfspraul> yeah, I also wondered what that URL meant :-)
<aw_> xiangfu, one question: i took out the audio codec chip and used the test program to test vga, and it displayed fine, even the video source. but when I tried to boot up m1, nothing showed on the monitor. http://dpaste.com/645906/
<aw_> xiangfu, what could be the reason for no screen after boot-up? Is this expected in the boot procedure if the audio codec is unmounted on my m1?
<xiangfu> aw_, probably because the audio chip is not mounted. the system stopped when checking the audio chip.
<aw_> xiangfu, after "Unable to open audio mixer: No such device" msg, the d2/d3 is fully ON, i think it's in rendering mode, isn't it?
<xiangfu> aw_, since there is an error during boot, I cannot be sure whether it already booted to rendering mode.
<wpwrak> when we ran into the audio grounding problem (Lxxx), lekernel mentioned - if i understood this correctly - that the codec provided a clock that was vital for the system
<xiangfu> aw_, it should boot to rendering mode.
<wpwrak> so perhaps you're just hitting a condition where the system depends on it
<xiangfu> wpwrak, 'that the codec provided a clock that was vital for the system' oh.
<wpwrak> i'm not entirely sure i understood this correctly. but he basically suggested that, if the codec somehow crashed, the whole system would hang (which is what i had indeed observed)
<wpwrak> and of course, an absent codec can only be worse ;-)
<aw_> xiangfu wpwrak so that clock is designed from AC97_SOUT then feeding into fpga to identify and depend?
<wpwrak> no idea which clock it is and what exactly depends on it. but lekernel should wake up soonish ;-)
<aw_> wpwrak, hmm...this may really explain my condition now. the codec was mounted before and worked well for only a couple of minutes last time, then a very loud noise occurred, then my screen froze and the monitor never showed anything again.
<aw_> i doubted my reworks on codec previously. now I took out the audio codec and used test program to test vga/video source etc. all pass except audio section.
<aw_> so seems now i have no choice to go. just go remount codec again... I hope that codec itself is still good. ;-)
<xiangfu> wpwrak, those 'clock' stuff is under bitstream code right?
<wpwrak> aw_: heroic rework ;-)
<wpwrak> xiangfu: i would think so, yes. but then, i don't really know what it is or even whether i understood this correctly
<xiangfu> aw_, we trust your soldering skill, but don't trust the chip :)
<aw_> xiangfu, ha...no...  consolation is not bad. but fact is I reworked two m1, reward: 1 done 1 failed, so poor skill though. ;-)
<wpwrak> probably not a bad result - you're moving from relatively simple and well-understood changes into increasingly difficult territory
<wpwrak> also, the board you're working on may have had other rework in the past. so the potential errors add up ...
<wpwrak> hmm. setting a conditional breakpoint on _CORE_message_queue_Seize seems to be a bit too much work for this poor system
<aw_> wpwrak, yes. taking risk to potential err added up already.
<wpwrak> it's funny. the system does seem to advance. but very very slowly. the queue grows by about ten messages per minute :)
<wpwrak> hmm, or less :) amazingly slow. but it keeps on growing. just a question of time until it hits 64. and then ....
<aw_> xiangfu, it goes into rendering mode once an audio codec is successfully detected. Your guess was right. Hope this second board can still render well for a couple more hours after the temperature goes up.
<wolfspraul> aw_: cool, so everything works *right now* on the second upgraded rc2 board as well?
<aw_> wolfspraul, it needs to run rendering for more hours. yes, it works well after the test program passed all items. Don't know what exactly happened in my previous work. It froze and never showed anything afterwards. ;-)
<wolfspraul> sure, I understand
<wolfspraul> but that's a good first step
<wolfspraul> of course let's do more testing now, let it run for 24hours, then wait a day, then again for a few hours, etc.
<wolfspraul> we are in no rush with this
<aw_> I'll append this board into here for records: http://en.qi-hardware.com/wiki/Milkymist_One_run_3_schedule#Upgrade_h.2Fw_RC2_to_RC3
<wolfspraul> ok, good
<wolfspraul> how are the rc3 reworks going?
<xiangfu> aw_, great. also thanks to Werner. :-)
<aw_> still have 14 remaining in rc3. meanwhile I started to gather boards which will go for x-ray.
<aw_> 14 boards: 1) midi 0x46 2) nor 0x55 / 0x67 / 0x6d / 0x6f 3) no boot up 0x32 / 0x70 4) video i2c 0x4d 5) dimly lit 0x3a
<aw_> xiangfu, oh..yes thanks to Werner too. ;-)
<aw_> 6) short 0x57 / 0x59 / 0x5d / 0x62 / 0x70
<aw_> fixed one short board, at C104 (0805) in the area surrounding D16/R30. that must have been caused by carelessness when R30 was first replaced in the factory.
<aw_> I keep checking short boards now.
<wolfspraul> ok, so 5 completely fixed so far?
<wolfspraul> remaining boards down to 14?
<aw_> yes. down to 14
<aw_> (x-ray candidates) 0x32, 0x3a, 0x46, 0x4d, 0x70.....will gather more I think.
<aw_> on 0x32 / 0x70, BTN2 (bga ball AA4) stays at a high voltage after power-on, where it must be 0. As a rough guess: AA4 is near the area of D16 and R30 (i.e. at the corner of the fpga), so it may already have been damaged by the first hot-air rework in the factory to replace R30.
<wolfspraul> ok, good news still
<wolfspraul> so the yield is 76/90 now, 14 to analyze
<wolfspraul> and those 14 are very important as preparation for rc4
<aw_> 0x46, midi_rx (ball AB21). 0x4d, videoin_sda (ball AB17). those are at abnormal levels.
<aw_> 0x32 / 0x70 may also be affected by my several reworks fix2/fix2b (potential errors added up in the past)
<GitHub100> [flickernoise] sbourdeauducq pushed 3 new commits to master: http://git.io/SjK5zQ
<GitHub100> [flickernoise/master] input.c: synchronize with MIDI status and ignore real-time messages - Werner Almesberger
<GitHub100> [flickernoise/master] input: remove MIDI timeout - Sebastien Bourdeauducq
<GitHub100> [flickernoise/master] New X2 patch from Werner - Sebastien Bourdeauducq
<GitHub17> [flickernoise] sbourdeauducq pushed 2 new commits to stable_1.0: http://git.io/7OYdsQ
<GitHub17> [flickernoise/stable_1.0] input.c: synchronize with MIDI status and ignore real-time messages - Werner Almesberger
<GitHub17> [flickernoise/stable_1.0] input: remove MIDI timeout - Sebastien Bourdeauducq
<kristianpaul> wolfspraul, errordeveloper, nah just vaporware it seems, until i see a dev kit with zynq chip
<lekernel> I don't think it's really "vaporware"... Xilinx often ships experimental stuff to a few lab-rat companies before it is generally available
<wpwrak> (midi timeout) ah, interesting ... how long is a "tick" ?
<lekernel> 10ms (iirc)
<lekernel> wpwrak, btw, if you think you are getting lost bytes because of the interrupts not being serviced fast enough, mwalle has made a new UART interface design that should be a lot friendlier to implement a small hardware FIFO
<lekernel> it's in soc git head, but there's no RTEMS driver for it yet
<wpwrak> (timeout) then it probably would only have worked if there's no clock. the clock ticks at 24*bpm, so something like 30-50 Hz. i may actually have observed some slight changes when playing with the clock. interesting :)
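For reference, the arithmetic behind the "30-50 Hz" estimate: the MIDI timing clock sends 24 pulses per quarter note, so its rate in Hz is 24 * bpm / 60. The sketch below compares that with the 10 ms tick lekernel recalls (he says "iirc", so treat it as an assumption). The function names are made up for illustration and are not from flickernoise.

```python
MIDI_PPQN = 24  # MIDI timing clock: 24 pulses per quarter note


def midi_clock_hz(bpm):
    """Rate of MIDI timing-clock messages for a given tempo."""
    return MIDI_PPQN * bpm / 60.0


def clock_period_in_ticks(bpm, tick_ms=10.0):
    """Spacing between clock messages, in system ticks.

    tick_ms=10.0 is the tick length recalled in the chat, not a verified value.
    """
    period_ms = 1000.0 / midi_clock_hz(bpm)
    return period_ms / tick_ms


if __name__ == "__main__":
    for bpm in (75, 100, 125):
        print(bpm, midi_clock_hz(bpm), round(clock_period_in_ticks(bpm), 2))
```

At 75 to 125 bpm the clock messages arrive every 2 to 3.3 ticks, so any timeout longer than a few ticks would be reset continuously while a clock is running.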
<wpwrak> (UART) great ! that's definitely something worth considering. at the moment, i seem to get very few losses, maybe even none. but a lot more of those hangs :-(
<lekernel> ah, there is no clock with my MIDI keyboard
<lekernel> maybe that explains why you got bugs and not me
<wpwrak> heh, yes, that might be just the trigger
<wpwrak> lekernel: any ideas about the hang ? i've now set a conditional breakpoint on _CORE_message_queue_Seize (for the_message_queue->number_of_pending_messages == 64) and i can watch it crawl towards its doom, but i still don't have any smoking gun
<wpwrak> it appears that disaster doesn't necessarily strike the very first time the queue fills up
<wpwrak> also, with the conditional breakpoint in place, I didn't get to stop in memcpy. instead, the first evidence of trouble I see is the_message_queue->Pending_messages.last = 0x0
<kristianpaul> lekernel: i wrote to them just out of curiosity and they pointed me to use qemu instead of a board :)
<kristianpaul> but yes, they may have a real board for sure i guess
<kristianpaul> oh http://digilentinc.com/Products/Detail.cfm?NavPath=2,400,836&Prod=ATLYS
<kristianpaul> ha, supported by petalinux ;-)
<kristianpaul> hum it uses serial flash instead
<lekernel> wpwrak, not off the top of my head, sorry
<lekernel> what sets the_message_queue->Pending_messages.last to 0?
<lekernel> iirc you can also use watchpoints
<wpwrak> watchpoints ? hmm, let's see. the conditional breakpoints are glacially slow. takes something like 10-30 seconds per queue size increment
<lekernel> hmm... I don't know how they work. if they result in a lot of traffic exchanged between the PC and the M1 every time the code is executed, that may explain it
<lekernel> the serial link is not fast
<lekernel> btw - the FT2232H might support 30Mbps there as well. and with a redesign of the FPGA UART, the SoC could support similar speeds too.
<kristianpaul> or start by moving the uart core from csr to wishbone?
<wpwrak> watchpoints would only work usefully if there's hardware support for them
<lekernel> there should be hardware support for them
<lekernel> I have not tested it, but maybe mwalle did
<wpwrak> perfect. let's put it to good use then :)
<wpwrak> oh, btw, did you implement any NULL pointer dereferencing trap ?
<lekernel> no
<lekernel> this would happily land in the flash
<wpwrak> that would be a worthwhile feature for catching bugs
<lekernel> in theory, you could easily generate a bus error on such a condition with something like
<wpwrak> oh, even in the flash ? wow ;-)
<wpwrak> yes, a bus error is what i had in mind
<wpwrak> if it's even NOR address space, then the CPU has no business accessing the beginning of that range anyway (standby bitstream)
<lekernel> assign wb_err_i = wb_adr_o[31:lower_bit] == <# of bits>'d0 & wb_stb_o & wb_cyc_o;
<lekernel> right on the CPU buses
<lekernel> I have never tested bus errors with LM32 though
<lekernel> I don't know how the current debugger handles them (they are never asserted with the current design)
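The address decode lekernel sketches can be modelled in a few lines to see which accesses it would flag. This is only a behavioural model of the one-line expression above, not tested hardware. The value of `lower_bit` is not given in the chat, so the 12 used in the usage lines (a 4 KiB trap window) is a hypothetical choice.

```python
def null_trap_err(adr, stb, cyc, lower_bit):
    """Model of: wb_err_i = (wb_adr_o[31:lower_bit] == 0) & wb_stb_o & wb_cyc_o

    Returns True when an active bus cycle targets the lowest
    2**lower_bit bytes of the 32-bit address space.
    """
    upper_bits_zero = ((adr & 0xFFFFFFFF) >> lower_bit) == 0
    return upper_bits_zero and bool(stb) and bool(cyc)


if __name__ == "__main__":
    LOWER_BIT = 12  # hypothetical: trap accesses to 0x00000000..0x00000fff
    print(null_trap_err(0x00000010, 1, 1, LOWER_BIT))  # small offset from a NULL pointer
    print(null_trap_err(0x00001000, 1, 1, LOWER_BIT))  # first address past the window
    print(null_trap_err(0x00000010, 0, 1, LOWER_BIT))  # no strobe, so no bus cycle
```

A larger `lower_bit` catches dereferences at bigger offsets from NULL but takes more of the low address range away from legitimate accesses.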
<wpwrak> anyone here who got time ? :)
<wpwrak> hmm. watchpoints kinda pseudo-work :-(
<wpwrak> the watchpoint per se seems fine
<wpwrak> but the conditional part is weird
<wpwrak> the "Backtrace stopped: previous frame inner to this frame (corrupt stack?)" in the backtrace seems to be "normal". at least i get it from very early on. hmm.
<wpwrak> amazing. there are no less than three instances of chain.inl in RTEMS. two of them overlap in what they define. the third is a set of wrappers for (which ?) one of the others. if i was looking for a design that made broad allowances for letting subtle but nasty errors creep in, that approach would be a good candidate.
<lekernel> I know you dislike this system and I will easily admit it's far from perfect. but... seriously try running FN under Linux, and you'll see it's a lesser evil :)
<wpwrak> oh, i hope very much to meet these evils ;-)
<wpwrak> what's irritating with these lists/chains is that they're such a fundamental thing and there are at least two potentially dangerous things in how they're done. of course, i keep telling myself that, given that they're so fundamental, everything must pan out in the end. but still, ...
<wpwrak> of course, the code says "1989-2006". not that lists would be particularly new, but, say, the considerably more elegant solution linux uses for the same problem (not just lists but some internal properties of them as well) may not have been common knowledge back then. (not that i'd expect the solution in linux to originate from linux, of course)
<mwalle> wpwrak: lekernel: yeah conditional watchpoints/breakpoints are handled by gdb (not by the gdbstub)
<mwalle> and watchpoints are hardware watchpoints, but i dont know if they are switched on the MM1, i remember the comparators were within the critical path and we wont meet timing
<wpwrak> the watchpoints seem to work. but the conditional part isn't handled correctly.
<wpwrak> (or so it seems)
<wpwrak> interestingly, i get conditional breakpoints to work just fine
<wpwrak> like this: http://pastebin.com/t6fbcSqa
<wpwrak> also tried  watch ...  with  condition ...  which should be equivalent to  break/watch ... if ...   but got the same result
<wpwrak> it traps all the time in _Chain_Append_unprotected, which is indeed where "last" changes
<wpwrak> regarding the mixed-up types, at least gdb is confused: http://pastebin.com/Tg3Xqyvk
<wpwrak> the struct with first/permanent_null/last is from doc/tools/bmenu/chain.h while gdb locates the sources for the rest from the more plausible cpukit/score/inline/rtems/score/ universe
<mwalle> btw iirc watchpoints are always two instructions behind
<wpwrak> at least these structures should be compatible (both by intention and by the way they were compiled), but such things don't exactly inspire confidence ...
<mwalle> watch or awatch? (or are these cmds equivalent?)
<wpwrak> watch is for writes, says the manual :)
<wpwrak> awatch for read/write
<mwalle> lm32 only supports access (read and write)
<wpwrak> oh
<mwalle> mh
<wpwrak> i sense potential for some improvement ;-)
<mwalle> wpwrak: forget it, should be fixed within the latest gdbstub, it supports write and read and access
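For orientation, gdb's three watchpoint commands map onto three packet types of the remote serial protocol: `watch` requests a write watchpoint (Z2), `rwatch` a read watchpoint (Z3) and `awatch` an access watchpoint (Z4). The sketch below only builds the packet payload text. The address in the usage line is a hypothetical example, and whether a given gdbstub build accepts all three types is exactly the question discussed above.

```python
# gdb command -> remote protocol watchpoint type (Z2 write, Z3 read, Z4 access)
WATCH_PACKET_TYPE = {"watch": 2, "rwatch": 3, "awatch": 4}


def insert_watchpoint_payload(command, addr, length):
    """Payload of the 'insert watchpoint' packet: Z<type>,<addr>,<kind>.

    addr and kind (the number of bytes to watch) are sent in hex
    without a 0x prefix.
    """
    return "Z%d,%x,%x" % (WATCH_PACKET_TYPE[command], addr, length)


if __name__ == "__main__":
    # 0x40001000 is a made-up address used only for illustration.
    for cmd in ("watch", "rwatch", "awatch"):
        print(cmd, insert_watchpoint_payload(cmd, 0x40001000, 4))
```

A stub that only implements the access type would have to reject Z2 and Z3, which is one way "watch" and "awatch" can end up behaving differently on the same target.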
<wpwrak> wheee ! :)
<mwalle> wpwrak: but have a look at $pc, i guess its two instructions behind the actual sw or lw instruction
<mwalle> i don't know if this has some influence to gdb's conditional logic
<wpwrak> hm, shouldn't ... after all, i'm giving it a constant address
<mwalle> wpwrak: so you made sure $pc is a sw or lw instruction?
<mwalle> gdb does some weird single stepping after a watchpoint
<mwalle> iirc ;)
<mwalle> i guessed that some archs break before the actual store/load instruction and some after the instruction was executed
<mwalle> but gdb is always singlestepping one instruction
<mwalle> you may turn on gdbstub debugging to see whats actually going on
<mwalle> set debug remote on
<wpwrak> it's a sw
<mwalle> mh ;)
<wpwrak> one of many. so it may very well be a little off
<wpwrak> in fact, it probably is
<mwalle> whats $r1 + 72 ?
<mwalle> your watchpoint? :)
<wpwrak> hmm. i'm not entirely sure about those offsets. the difference seems too large to make sense
<wpwrak> naw, nowhere near right :)
<wpwrak> oh, wait. typo
<wpwrak> $r1+88 is my watchpoint
<wpwrak> so $pc is correct
<mwalle> mh
<mwalle> i should probably update my gdb ;)
<wpwrak> in any case, the calculation should be affected by where $pc is only in as much as what instructions have executed since the trap
<wpwrak> the calculation of the value does not depend on any local context (except for the symbol table)
<mwalle> watch if cond.. should set the hw watchpoint, and then gdb checks cond on every exception. to find out whats broken, assuming you want to use conditional watchpoints, a little test binary, which triggers the bug, would be helpful :)
<wpwrak> watch <var> if <cond>  is what i tried. it breaks all the time, no matter what the condition evaluates to :-(
<mwalle> wpwrak: so try to enable remote debug and see whats going on
<mwalle> the packets are described here: http://sourceware.org/gdb/onlinedocs/gdb/Packets.html#Packets
<mwalle> you should see the set hw watchpoint packet, then continue, then a signal packet, when the watchpoint has hit and after that there should be some memory read commands where the reply should be interesting
<mwalle> sorry but i have to go to bed now, my alarm wakes me up very early ;)
<wpwrak> oh dear
<mwalle> yeah and some register info packets ;)(
<mwalle> so gdb doesn't read the memory at all?!
<wpwrak> maybe this is it ? $m408dfe68,4#06 -> 7fffffff
<wpwrak> but i'm not so sure what it thinks it's reading :)
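The quoted packet can be decoded by hand. In the gdb remote protocol a packet is framed as `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The payload `m<addr>,<length>` asks the stub to read `length` bytes at `addr`, and the reply is the memory contents in hex. A minimal sketch (helper names are made up for illustration):

```python
def rsp_checksum(payload):
    """Two-hex-digit checksum: sum of payload bytes modulo 256."""
    return "%02x" % (sum(payload.encode("ascii")) % 256)


def frame(payload):
    """Wrap a payload as $<payload>#<checksum>."""
    return "$%s#%s" % (payload, rsp_checksum(payload))


def parse_read_memory(payload):
    """Split an 'm<addr>,<length>' payload into (addr, length) integers."""
    if not payload.startswith("m"):
        raise ValueError("not a read-memory packet")
    addr, length = payload[1:].split(",")
    return int(addr, 16), int(length, 16)


if __name__ == "__main__":
    print(frame("m408dfe68,4"))
    addr, length = parse_read_memory("m408dfe68,4")
    print(hex(addr), length)
```

So the packet is a 4-byte read at 0x408dfe68, and the checksum computed here matches the `#06` in the log. Which variable lives at that address is not something the packet itself reveals; that would need the symbol table.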
<mwalle> gdb disables the watchpoint and reads 401365dc, 401365e0 and 401365e4
<mwalle> do you have some print statements on break enabled?
<wpwrak> no
<mwalle> maybe you should try raw memory addresses :)
<wpwrak> let's see ..
<mwalle> gn8 :)
<wpwrak> not even .. if *(uint32_t *) 0x408da714 == 0  does the trick :-(
<wpwrak> kewl. now i killed it so hard gdb doesn't get through anymore
<lekernel> hmm... I wonder if this could be because the CPU tries to access unmapped bus areas that never get acked
<lekernel> generating a bus error in those cases (I'm not sure if they exist) would solve the problem
<wpwrak> i certainly get a very hung CPU. i suppose with some jtag magic, i could also find out where exactly it hangs :)
<wpwrak> now, i set a watchpoint on 0x10, since this seems to be a popular "NULL" pointer. and it tripped in rtems_message_queue_send: http://pastebin.com/t1zHBWwM