#milkymist on 2011-11-19 — irc logs at freenode.irclog.whitequark.org

2011-11-18 12:31 Topic for #milkymist is now Milkymist One, Milkymist SoC & Flickernoise development channel (LLHDL/Antares are welcome too) :: Logs: http://en.qi-hardware.com/mmlogs :: JFDI

00:27 mumptai joined #milkymist

00:59 wolfspraul joined #milkymist

01:36 wolfspraul joined #milkymist

01:40 wpwrak joined #milkymist

04:40 mumptai joined #milkymist

04:52 <wpwrak> let's see if git-send-email works ...

05:00 <kristianpaul> seems it does

05:00 <wolfspraul> wpwrak: oh wow, lots of patches

05:01 <wolfspraul> 'full-speed almost works now', that sounds like the rest can also be fixed in software?

05:01 <wolfspraul> I am referring to my question from a few days ago whether we can rule out hardware bugs or needed improvements to support full-speed.

05:01 <wolfspraul> I guess now we can?

05:25 <wpwrak> let's see. it could still be that we're weak in the analog domain

05:25 <wpwrak> but at least we're not hopeless :)

05:26 <wpwrak> now i'm unrolling the loop of usb_rx ... that should improve timing quite a lot ... and make the code really cryptic ;-)

05:39 <wolfspraul> [analog domain] you mean the signals going out on the wire are not clean, wrong timing?

05:39 <wolfspraul> can't we just look at that on a scope and tell whether it's good or not?

05:39 <wpwrak> there could be signal distortion, yes

05:40 <wpwrak> tricky. a 100 MHz / 400 MSa/s scope should be sufficient for this kind of task, but i get a lot of noise, so it's hard to see the exact shape

05:43 <wolfspraul> and the noise is coming from the m1 board?

05:43 <wpwrak> another way to find out is to check the CRC, and count errors

05:43 <wpwrak> no, from my environment, the measurement setup

05:44 <wolfspraul> should adam take a quick measurement? when he goes to minbo or xray shop or other places, maybe they have a good scope standing around - if measurement is easy...

05:45 <wolfspraul> sounds like just power m1, plug in usb keyboard, take measurement

05:46 <wpwrak> something like that, yes :)

05:46 <wpwrak> you have to be careful not to distort the signal with the probe

05:47 <wpwrak> usb doesn't like capacitative loads on the signals

05:47 <wpwrak> already full-speed is a bit on the finicky side

05:48 <wolfspraul> hmm, well. should we do it or not?

05:49 <wolfspraul> in case we do, can you describe in a line or two how you would do the measurement setup?

05:49 <wpwrak> can't hurt if he has a look, if he feels it's useful. there's always the chance of someone noticing something. and maybe he can find a fast scope with a FET probe. like that GHz monster we had at openmoko. with such a device, you rule :)

05:50 <wpwrak> (okay, that one ran windows, so in that particular case, you still lose)

05:52 <wpwrak> hmm. 12 bit times, down from 16. still darn tight.

05:53 <wpwrak> (sample size 1, as any good statistician would)

05:59 <mumptai> full-speed usb should be below 100MHz bandwidth

06:13 <wpwrak> let's see ... full-speed rise/fall time is 4-20 ns. let's say 10 ns. for an accuracy of 5% or better, according to lecroy, the scope's rise/fall time should be 1/3 of that.

06:15 <wpwrak> the 3 dB rise/fall time would be 1/3/analog_bw. so yes, a 100 MHz scope would just do. the next problem is the sample rate. some low-cost brands have really low sample rates. e.g., mine does typically only up to 200 MSa/s. 400 MSa/s if i use only one channel.

06:16 <wpwrak> (rigol improved that in the successor)

06:16 <wpwrak> the rule of thumb is somewhere around sample rate = 5 x signal bandwidth

06:17 <wpwrak> so my scope could barely make sense of such a signal

06:17 <wpwrak> of course, i could spot any truly massive distortion, but i couldn't tell with certainty whether the signal meets specs
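
Note: the scope estimate above can be reproduced in a few lines of Python. This is only a sketch of the rules of thumb as wpwrak states them in the log (scope rise time of about 1/(3 x analog bandwidth), scope rise time at most 1/3 of the signal's rise time, sample rate of about 5 x bandwidth). The function names are made up for illustration, and using 100 MHz as the "signal bandwidth" in the sample-rate rule is an assumption.

```python
def scope_rise_time_ns(analog_bw_mhz):
    """Approximate scope rise time, using the 1/(3 * bandwidth) rule from the log."""
    return 1e3 / (3.0 * analog_bw_mhz)


def required_bw_mhz(signal_rise_ns, ratio=3.0):
    """Bandwidth needed so the scope's rise time is signal_rise_ns / ratio."""
    max_scope_rise_ns = signal_rise_ns / ratio
    return 1e3 / (3.0 * max_scope_rise_ns)


def required_sample_rate_msa(signal_bw_mhz, factor=5.0):
    """Rule of thumb from the log: sample rate is about 5 x signal bandwidth."""
    return factor * signal_bw_mhz


if __name__ == "__main__":
    rise_ns = 10.0  # "let's say 10 ns" within the 4-20 ns full-speed range
    print("needed bandwidth (MHz):", required_bw_mhz(rise_ns))
    print("100 MHz scope rise time (ns):", scope_rise_time_ns(100.0))
    print("needed sample rate (MSa/s):", required_sample_rate_msa(100.0))
```

Under these assumptions a 100 MHz scope just meets the bandwidth requirement, while 400 MSa/s falls short of the roughly 500 MSa/s the sample-rate rule asks for, which fits the "could barely make sense of such a signal" remark.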

06:23 <wpwrak> wolfspraul: one of these cases where expensive tools do improve the results. with a slow scope, there's just a lot you can't see

06:24 <wpwrak> grr. 15 cycles. despite adding extra exit points to the poll loop. this is messy.

06:32 mumptai joined #milkymist

06:33 <wpwrak> hehe, 6 cycles. that's more like it ;)

06:53 <wolfspraul> wpwrak: oh, you totally misquote me and out of context too. I have not said anything against expensive tools :-)

06:53 <wolfspraul> it's safe to assume that in most cases there is a reason they are expensive, i.e. someone buys them and they create that value.

06:54 <wolfspraul> that's why I'm wondering now how we can effectively and quickly rule out analog usb issues... very aware that bad tools may complicate our effort and even lead us in the wrong direction.

06:54 <wolfspraul> one time we had this nasty potential sdram problem, and Sebastien got access to a high-end scope at Xilinx to track down the issue... which was great!

06:54 <wolfspraul> otherwise I take the risk on manufacturing to produce something I need to repair or even discard later

06:57 <wolfspraul> here http://www.milkymist.org/wiki/index.php?title=RC1_signal_integrity_measurements

06:57 <wpwrak> yeah. it's nice to have definite answers on such issues. not things that are guesstimates based on indirect evidence, etc. a bit like they do astronomy of distant planets :)

06:57 <wolfspraul> man, I cannot believe I still haven't done another news release

06:57 <wolfspraul> urgh

06:57 <wolfspraul> MUST get it done before Monday!

06:57 <wolfspraul> (note to self)

06:57 <wpwrak> eleven days left until the date for the quarterly news :)

06:57 <wpwrak> or sooner. always better :)

07:07 mumptai joined #milkymist

07:22 cjdavis joined #milkymist

07:57 isa_ joined #milkymist

07:58 rejon joined #milkymist

08:04 r33p joined #milkymist

08:20 azonenberg joined #milkymist

08:40 isa_ joined #milkymist

08:48 rejon joined #milkymist

10:05 mumptai joined #milkymist

10:48 sb0 joined #milkymist

11:35 antgreen joined #milkymist

12:13 r33p joined #milkymist

12:33 aeris- joined #milkymist

12:54 gbraad_ joined #milkymist

13:32 kilae joined #milkymist

14:30 mumptai joined #milkymist

15:01 DJTachyon joined #milkymist

15:08 stekern joined #milkymist

15:45 DJTachyon joined #milkymist

15:51 r33p joined #milkymist

16:11 sb0 joined #milkymist

16:35 <sb0> http://www.xilinx.com/txpatches/pub/documentation/misc/improving%20ddr%20sdram%20efficiency.pdf

16:36 * kristianpaul click

16:43 <kristianpaul> sb0: is that not what you do with hpdmc ?

16:44 <sb0> no, hpdmc is in-order

16:44 <sb0> with fast page mode and pipelining

16:44 <sb0> I don't know if the article says it (haven't finished reading yet) but OOO controllers have a latency penalty

16:46 <sb0> if the cores using the DRAM can't maintain performance in the presence of the extra latency, the benefit of OOO diminishes

16:48 <sb0> but I think the future is OOO and prefetching. especially if we use DDR3 (or more) someday.

16:51 <sb0> well it just says: "However, since reordering improves efficiency and therefore reduces memory controller occupancy, the result is to improve average read latency.". I'm not sure the case is so clear-cut, especially with DDR1

16:58 <wpwrak> ah, what would be needed to use DDR2/3 in M1 ? could pin-compatible chips plus a SoC change be enough ? or would it be a more complex redesign ?

16:59 <kristianpaul> oh, FEL was removed from F16, sb0 ?

16:59 <sb0> way more complex redesign

16:59 <sb0> different voltage, different package, different pinout, different timings

16:59 <sb0> DDR 2/3 is BGA

16:59 <kristianpaul> but DDR3 with a 80Mhz worth ?

17:00 <wpwrak> ah, pity. it's never easy, isn't it ? :(

17:00 <kristianpaul> 80mhz soc*

17:00 <sb0> we'd use a clock multiplier

17:01 <sb0> and probably the memory controller would generate several DRAM commands in one cycle at 80MHz that would then be serialized and sent into the DRAM in 10 cycles at 800MHz

17:01 <kristianpaul> :-)

17:02 <sb0> wpwrak, I think there is already a 4x increase in memory bandwidth possible with the current system

17:02 <sb0> we can use a 2x clock multiplier and serdes

17:02 <sb0> and get another 2x increase with OOO and prefetching

17:02 <wpwrak> oh, wow

17:03 <wpwrak> 1080p and deep color, here we come ! ;-)

17:03 <sb0> btw, the TMU already has an experimental prefetch system in SoC head

17:04 <sb0> 'experimental' = it works, but the performance boost isn't so high

17:04 <sb0> some 30% with the current memory system

17:06 <sb0> if you want to know how it's done, it's explained here: http://www.graphics.stanford.edu/papers/texture_prefetch/texture_prefetch_down.pdf

17:06 <sb0> without the reorder buffer of course, there's instead a simple register that holds one transaction

17:07 <sb0> that's actually one big limiting factor

17:09 <sb0> s/register/FIFO

17:12 <wpwrak> hmm. something is still screwy with USB timing. that MS mouse comes and goes. make a small tweak and it dies. make another tweak and it works again.

17:12 <sb0> full HD would need more unfortunately, especially if we use 10:10:10 colors (which I think is the right thing to do, with more potential than resolution)

17:13 <sb0> but we can still scale on the fly (-:C

17:13 <sb0> (for the GUI it should be OK though)

17:14 <wpwrak> hm yes, 4x even isn't quite enough for 1024x768 (2.56x) if the pixel size doubles too

17:15 <wpwrak> well, you mentioned once that 800x600 would work. so maybe there is enough room :)
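
Note: wpwrak's 2.56x figure is the pixel-count ratio between 1024x768 and 640x480. A minimal Python sketch of that back-of-envelope check follows. It assumes, as the log suggests, that "pixel size doubles" means a factor of 2 in bytes per pixel and that the available headroom is the 4x sb0 mentions; the names are illustrative, and the estimate ignores the 512x512 internal texture discussed later.

```python
def bandwidth_factor(new_res, old_res=(640, 480), pixel_size_factor=1.0):
    """Relative framebuffer bandwidth versus old_res at the same refresh rate."""
    new_pixels = new_res[0] * new_res[1]
    old_pixels = old_res[0] * old_res[1]
    return pixel_size_factor * new_pixels / old_pixels


HEADROOM = 4.0  # the "4x increase in memory bandwidth" mentioned by sb0

if __name__ == "__main__":
    for res in [(800, 600), (1024, 768)]:
        plain = bandwidth_factor(res)
        deep = bandwidth_factor(res, pixel_size_factor=2.0)
        print(res, "same pixel size:", plain,
              "doubled pixel size:", deep,
              "fits in 4x:", deep <= HEADROOM)
```

With these assumptions 1024x768 at doubled pixel size needs more than 4x, while 800x600 stays below it.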

18:45 <sb0> the internal resolution is 512x512

18:46 <sb0> that's the size of the renderer's internal texture, which is then scaled to 640x480 (or 800x600)

18:46 <sb0> this works

18:46 <sb0> switching to 1024x768, I also increased the internal buffer to 1024x1024. this creates a lot of slowdown.

18:50 <wpwrak> ah, i see. 1024x512 ?

18:51 <wpwrak> 4:3 is dying anyway :)

18:53 <sb0> wpwrak, what clock recovery algorithm would you recommend?

18:53 <sb0> right now it's resyncing at every transition

18:54 <sb0> there's a counter running at 48MHz, and a transition resets it

18:56 <sb0> note that the 48MHz clock is not synchronized to the 12MHz transmission clock from the device

18:56 <wpwrak> hmm, maybe some deglitching could help ? (if that's the issue)

18:58 <wpwrak> 1:4 may also be a bit low. 1:8 may give more margin of error.

18:58 <wpwrak> s/of/for/

18:59 <wpwrak> at which phase offset to you sample ? +1 cycle ?

18:59 <wpwrak> s/to/do/

18:59 <wpwrak> grmbl. can't type.

19:00 <sb0> 96MHz is more difficult on this slow FPGA

19:00 <sb0> would be easier if we had a virtex or such

19:00 <sb0> :P

19:01 <wpwrak> ;-)

19:01 <sb0> 2nd cycle, just in the middle

19:03 <sb0> yes, we can try deglitching ...
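
Note: the scheme sb0 describes (a counter on the 48 MHz clock, reset by every transition, with the line sampled mid-bit) can be modelled in Python. This is a behavioural sketch, not the softusb code: the function name is invented, mapping "2nd cycle" to counter value 1 is an assumption, and NRZI decoding, bit stuffing and deglitching are left out.

```python
def recover_bits(samples, oversample=4, sample_phase=1):
    """Recover one level per bit from a 4x oversampled line.

    samples: line levels (0/1) captured on the 48 MHz clock.
    The counter is reset on every transition and wraps every
    `oversample` ticks; the line is sampled when it equals sample_phase.
    """
    bits = []
    counter = 0
    prev = samples[0]
    for level in samples[1:]:
        if level != prev:
            counter = 0  # resync on every transition
        else:
            counter = (counter + 1) % oversample
        if counter == sample_phase:
            bits.append(level)
        prev = level
    return bits


def oversampled(bits, widths):
    """Expand each bit to the given number of samples (models clock drift)."""
    out = []
    for bit, width in zip(bits, widths):
        out.extend([bit] * width)
    return out


if __name__ == "__main__":
    sent = [1, 0, 0, 1, 1, 1, 0]
    ideal = oversampled(sent, [4] * len(sent))
    print(recover_bits(ideal) == sent)
    # One bit is only 3 samples wide because the clocks are not synchronized.
    drifted = oversampled([1, 0, 1], [4, 3, 4])
    print(recover_bits(drifted) == [1, 0, 1])
```

Because the counter restarts at each edge, a bit that is one sample short or long is still sampled once, as long as transitions arrive often enough. A spurious edge would also restart the counter, which is where the deglitching suggestion comes in.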

19:04 <wpwrak> i wonder if the problem could be with SYNC. it seems that the first edge of SYNC can be a bit slower (?) than the rest

19:04 <wpwrak> at least i've seen something like this mentioned

19:12 <sb0> meh, debian is no longer shipping gcc-avr?

19:14 <sb0> http://packages.debian.org/search?keywords=gcc-avr&searchon=names&suite=testing&section=all

19:15 <sb0> http://packages.debian.org/search?keywords=gcc-avr&searchon=names&suite=unstable&section=all

19:15 <sb0> wtf

19:15 <kristianpaul> switch to sid :-)

19:16 <kristianpaul> or stay in stable :-)

19:16 <kristianpaul> http://packages.qa.debian.org/g/gcc-avr/news/20110708T163910Z.html

19:17 <sb0> debian stable is for meter-long bearded sysadmins

19:18 <kristianpaul> or people with poor, low bandwidth ;-) (me)

19:19 <kristianpaul> ah, http://release.debian.org/migration/testing.pl?package=gcc-avr

19:23 <wpwrak> sb0: wrong. it's for their grandparents ;-)

19:24 <wpwrak> sb0: do you have full-speed HID devices at home to test things with ?

19:24 <lars_> i use debian stable

19:24 <lars_> at least on the hardware where it still works

19:25 <sb0> no, but I have that LV3

19:26 <sb0> I didn't know full speed HID devices existed. but why make it simple ...

19:26 <sb0> anyway, I'm still in Norway, back tomorrow

19:27 <sb0> wpwrak, btw what is the use case for the software controlled USB power switch? heavy-handed debugging a la norruption?

19:29 <wpwrak> that, power-cycling devices that don't respond to any nicer form of reset, turn off things when not needed or when causing trouble, etc.

19:29 <wpwrak> being able to power-cycle is also nice for remote debugging

19:30 <wpwrak> beats shipping USB devices around the globe ;-)

19:30 <sb0> i see...

19:30 <sb0> let's add it then

19:31 <sb0> should the switch be controlled from navre or lm32?

19:31 <sb0> I can easily add two outputs to the existing GPIO controller in charge of the LEDs

19:32 <wpwrak> can control pass from one to the other with a soc update ? or are there routing restrictions that get in the way ?

19:32 <sb0> in a first version, the BIOS would turn them on and we can forget about them for a while

19:33 <wpwrak> sure. easy does it :)

19:33 <sb0> there are no routing restrictions, especially for such low speed signals

19:33 <wpwrak> you were concerned about glitches. at what time scales would they be ? < 1 us ? or longer ?

19:33 <wpwrak> perfect. that's what i thought

19:34 <sb0> I don't really mean a 'glitch' in the strict sense of the term - in fact, there shouldn't be any

19:34 <sb0> what happens though is that there are weak pull ups when the FPGA is unconfigured

19:34 <sb0> which happens at power up and when switching between SoC and standby bitstreams

19:35 <sb0> those have to be taken into account

19:35 <sb0> I don't like the idea of switching power to the USB device for < 1s until the FPGA is configured and then switch it off

19:35 <wpwrak> okay, that shouldn't be too critical. we can just default to vusb off.

19:36 <sb0> which would happen at power up if the pull up is not neutralized (assuming the switch control is active high)

19:36 <wpwrak> we can choose between active low or high

19:36 <wpwrak> we just have to tell adam that it's no longer up to his coin toss ;-)

19:37 <sb0> ok, if we choose active high, then we don't need to neutralize the FPGA pull ups

19:37 <sb0> and the USB devices will be off while the FPGA is reconfigured

19:38 <wpwrak> yup. and if the pull-ups are too weak, we can easily help them.

19:38 <wpwrak> or if they're likely to transition into Z

19:38 <sb0> we can then drive those pins high in the standby bitstream to keep USB off

19:38 <sb0> and when the SoC is loaded, the GPIO defaults to 0 and turns USB on, so we have nothing to do

19:39 <wpwrak> sounds good to me

19:40 <sb0> s/choose active high/choose active low

19:40 <sb0> of course

19:40 <wpwrak> err, yes :)

19:41 <sb0> I think the FPGA pull ups are strong enough ... is that a CMOS input on the switch?

19:41 <sb0> I don't remember the value from the datasheet, but as you can see they let pass a current sufficient to light the LEDs noticeably

19:42 <wpwrak> 1 uA leakage (max)

19:42 <sb0> yes, no problem then

19:42 <wpwrak> yeah. that'll be plenty :)
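
Note: the polarity argument above (with sb0's active high -> active low correction applied) can be written down as a small Python truth table. This is a sketch under the assumptions stated in the log: the pin is pulled high while the FPGA is unconfigured, driven high by the standby bitstream, and defaults to 0 once the SoC is loaded. The phase names and the function are illustrative only.

```python
# Level on the switch-enable pin in each phase, as described in the log.
PIN_LEVEL = {
    "unconfigured": 1,  # weak FPGA pull-up
    "standby": 1,       # standby bitstream drives the pin high
    "soc": 0,           # GPIO defaults to 0 when the SoC is loaded
}


def usb_power_on(phase, active_low=True):
    """Return True if the USB power switch would be enabled in this phase."""
    level = PIN_LEVEL[phase]
    return level == 0 if active_low else level == 1


if __name__ == "__main__":
    for phase in PIN_LEVEL:
        print(phase,
              "active low:", usb_power_on(phase, active_low=True),
              "active high:", usb_power_on(phase, active_low=False))
```

With an active-low enable, USB power stays off until the SoC is running. With active high, the pull-up would briefly power the device before configuration, which is the case sb0 wanted to avoid.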

19:43 <wpwrak> and now, lunch ...

19:44 <sb0> bon appetit

19:44 <wpwrak> merci !

20:15 <sb0> ah, seems you fixed the DATAx mismatch bug I sometimes experienced... thanks :)

20:30 <wpwrak> heh, it was fun to see with just how many protocol violations one can almost get away :) well, i probably added a few of my own, so the fun probably isn't quite over yet ...

20:41 <sb0> hmm... your first series of patches doesn't apply cleanly

20:42 <sb0> do you still use the trigger code?

20:44 <sb0> ah, seems you didn't merge my new SOF/keepalive generation code

20:45 <wpwrak> no, i have that

20:45 <wpwrak> maybe you tripped over 4k vs. 8k ?

20:46 <wpwrak> i branched off "USB: send SOFs and keepalives on both ports and immediately after reset"

20:46 <wpwrak> commit f6c7474ae3b181157d8950e25c4705d53d9ae9c1

20:48 <sb0> no

20:49 <sb0> is there a debug mode for patch? it rejects a hunk that looks totally OK for me

20:49 <wpwrak> heh ;-)

20:50 <sb0> it just says '1 out of 1 hunk FAILED'... no way to know more?

20:51 <wpwrak> maybe -l helps ? (ignore whitespace) but i usually just apply it manually if patch starts hallucinating

20:51 <sb0> http://pastebin.com/LisFw0AK

20:51 <sb0> ah, yes, -l helps

20:53 <wpwrak> mystery difference :)

20:53 <wpwrak> maybe we'll have a trailing blank more or less now :)

20:54 <wpwrak> ah, there are some tailing tabs. maybe that's it

20:54 <wpwrak> tRailing

21:00 <GitHub15> [milkymist] sbourdeauducq pushed 11 new commits to master: http://git.io/_DfZ1g

21:00 <GitHub15> [milkymist/master] softusb: 4 kB hack - Werner Almesberger

21:00 <GitHub15> [milkymist/master] softusb: use OE# of port A for trigger - Werner Almesberger

21:00 <GitHub15> [milkymist/master] softusb: send SETUP and DATA0 back-to-back - Werner Almesberger

21:17 <GitHub5> [milkymist] sbourdeauducq pushed 6 new commits to master: http://git.io/vOd9Rg

21:17 <GitHub5> [milkymist/master] softusb: partially unroll usb_in - Werner Almesberger

21:17 <GitHub5> [milkymist/master] softusb: send ACKs from dedicated inline function - Werner Almesberger

21:17 <GitHub5> [milkymist/master] softusb: fail garbled packets fatally again - Werner Almesberger

21:18 <sb0> wpwrak, all merged, thanks a lot!

21:28 <sb0> http://www.smsc.com/index.php?pid=28&tid=143

21:28 <sb0> if someone wants high speed, that could be useful... I'm not sure if and how the fpga could do clock recovery at 480 Mbps

21:43 <lars_> mwalle: there seems to be a problem with the new uart linux code. milkymist_uart_tx_char is sometimes called although the uart tx path is still busy

21:46 <lars_> my fix is to check whether THRE is set, and if not leave the routine right away and wait for the next interrupt
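
Note: the guard lars_ describes can be illustrated with a toy Python model. This is not the Linux driver code: `FakeUart` and `tx_char` are hypothetical names, and the model only shows the idea of returning early when THRE (transmit holding register empty) is not set, then retrying on the next interrupt.

```python
class FakeUart:
    """Toy UART with a one-character transmit holding register."""

    def __init__(self):
        self.thre = True  # transmit holding register empty
        self.sent = []

    def write(self, ch):
        if not self.thre:
            raise RuntimeError("write while transmitter busy")
        self.sent.append(ch)
        self.thre = False

    def tx_done(self):
        """Models the TX-done interrupt: the register is empty again."""
        self.thre = True


def tx_char(uart, queue):
    """Send at most one queued character; return True if one was sent."""
    if not uart.thre:
        return False  # still busy: leave and wait for the next interrupt
    if not queue:
        return False
    uart.write(queue.pop(0))
    return True


if __name__ == "__main__":
    uart, queue = FakeUart(), list("ab")
    print(tx_char(uart, queue))  # first character goes out
    print(tx_char(uart, queue))  # transmitter busy, nothing is sent
    uart.tx_done()
    print(tx_char(uart, queue))  # second character goes out
```

Without the THRE check, the second call would write into a busy transmitter, which matches the symptom reported above.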

21:46 r33p joined #milkymist

21:55 sh4rm4 joined #milkymist

22:36 <wpwrak> sb0: (merged) thanks !