wbraun has joined ##openfpga
jevinskie has joined ##openfpga
wbraun has quit [Quit: wbraun]
futarisIRCcloud has joined ##openfpga
emeb has quit [Quit: Leaving.]
unixb0y_ has quit [Ping timeout: 244 seconds]
unixb0y has joined ##openfpga
m_w has quit [Quit: Leaving]
lovepon has joined ##openfpga
Miyu has quit [Ping timeout: 252 seconds]
wbraun has joined ##openfpga
kuldeep has quit [Read error: Connection reset by peer]
kuldeep has joined ##openfpga
azonenberg_work has joined ##openfpga
rohitksingh_work has joined ##openfpga
<wbraun> I have been playing with the icestorm setup and the picorv32 core. There is a build for the iCE40 ultraplus 5k (on the icebreaker board)
<wbraun> It's actually failing timing for me at the default clock frequency (something like 12MHz)
<wbraun> and then passes if I disable a few things on the core. It's at about 80% utilization for the FPGA.
<wbraun> Given the low max clock frequency, are the iCE40 FPGAs really derpy, or is there some limitation with the icestorm toolchain?
<Bob_Dole> ice40UP5K are really slow
<wbraun> The picorv32 repo claims at least a few hundred MHz on a 7 series FPGA
<Bob_Dole> ice40HX8K are faster speed grades, but they lack a few features.
Bike has quit [Quit: Lost terminal]
<wbraun> The ultraplus are the newer iCE40 FPGAs, right?
<Bob_Dole> I think the HX ones should get upwards of 50mhz
<wbraun> I passed timing at whatever the default timing was for the HX dev board build.
<Bob_Dole> yeah, ultra plus are newer, but they're slower apparently. they add some features though
<wbraun> How can they be an order of magnitude slower than a 7 series FPGA....
<wbraun> I guess the high utilization (~80%) may have also played a role.
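For reference, the icestorm flow can report where a clock target fails: icetime runs a static timing estimate on the routed design. A minimal sketch, assuming the icestorm tools are installed; flags are from memory and the file names are hypothetical:

```shell
# Estimate the critical path of a routed design (top.asc) and check it
# against a 12 MHz clock target (-c). Device (-d) here is the UP5K.
icetime -d up5k -p top.pcf -c 12 top.asc
```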
<wbraun> Apparently the HX4k and the HX8k are the same die? Can I generate 8K sized bitstreams to target the 4k devices with the icestorm toolchain?
<wbraun> I am trying to avoid the BGA devices…
<Bob_Dole> from my understanding it's the "fabric" that interconnects the LUTS, 7-series xilinx chips are just faster there.
<Bob_Dole> and up5k models are exceptionally slow at it
<Bob_Dole> (I am not a specialist here, at all.)
<wbraun> Yah, it's the fabric that is slower. Just surprising that it's that much slower.
<wbraun> according to wikipedia the iCE40 is on a 40nm process node. The artix-7 is on something like a 28nm process node. Those two nodes should not have an order of magnitude speed difference.
<Bob_Dole> there's more than node-size for speed
<wbraun> It's a pretty good figure of merit though.
<wbraun> So is it possible to generate 8k sized bitstreams for the 4k devices?
<Bob_Dole> I don't know, will need to wait for a response from someone more familiar.
<wbraun> Or is that not built into the tools to avoid the wrath of lattice?
<wbraun> I have heard multiple FPGAs from multiple vendors employ the same product differentiation (only JTAG ID) but there is a conspicuous lack of tools that take advantage of that.
ayjay_t has quit [Remote host closed the connection]
ayjay_t has joined ##openfpga
<kc8apf> wbraun: yes, it's common that a single die design is fused for different SKUs
<kc8apf> it's unclear if the fused parts are binned (tested and bad sections mapped out) or just sold as lower end parts for market segmentation
<Bob_Dole> are they actually fused off, is his question
<kc8apf> generally, no
<kc8apf> the fusing is to set the JTAG ID
<Bob_Dole> so possible to put an 8k bitstream on the 4k parts? maybe useful
<kc8apf> No one wants to have a tool end-user find their part has occasional failures
<kc8apf> but yes, it's entirely possible to do
<kc8apf> ice40 4k and 8k
<kc8apf> many of the artix7 line
<kc8apf> max-v too
Xark has quit [Ping timeout: 252 seconds]
<rqou> max-v (and other altera parts) seem to be literally controlled by an "if" statement in the software
<rqou> max-v at least doesn't even differ in jtag idcode
* kc8apf suspects rqou has an alert on max-v
<rqou> nor is it "geometrically" restricted; literally "if (LEs in design > limit) raise an error"
<rqou> not e.g. "can only use the top half"
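A toy sketch of the kind of software-only limit rqou is describing; this is an illustration, not actual Quartus code, and the variable names and numbers are made up:

```shell
# Hypothetical market-segmentation check: nothing geometric, just a count.
les_in_design=5000
le_limit=4000   # made-up LE cap for the smaller SKU of the same die
if [ "$les_in_design" -gt "$le_limit" ]; then
  echo "Error: design uses more LEs than the target device allows"
fi
```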
<rqou> and no, just happened to notice the conversation
<TD-Linux> has anyone done anything with mach 4
<kc8apf> i haven't heard anything
<wbraun> They are not binned because you can typically route to any LUT on the device. You are just limited to some number of LUTs by the software.
<kc8apf> I have a MachXO2 on my desk to poke at at some point
<wbraun> So I guess the answer is that no one has bothered adding support for doing that with the iCE40 devices then?
<rqou> uh, it works for the ice40
<rqou> you can generate a bitstream with 8k worth of LUTs for a 4k device
<kc8apf> just need to patch the bitstream with the 4k device ID
Xark has joined ##openfpga
<rqou> i think it does that automatically
<kc8apf> oh, I thought it didn't
<wbraun> oh cool.
<wbraun> So I can use the ICE40HX4K-TQ144 (in an easy to deal with non-BGA TQFP package) as an 8k device then? And not have to deal with BGA packages while getting to play with the biggest device?
ym has joined ##openfpga
<kc8apf> wbraun: should be able to
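A sketch of what that flow might look like with the open tools, assuming nextpnr's "tq144:4k" package variant (the HX8K die in the HX4K's TQFP-144) exists as described; flags are from memory and file names are hypothetical:

```shell
# Synthesize, then place and route for the 8k die in the 4k's TQ144 package.
yosys -p 'synth_ice40 -top top -json top.json' top.v
nextpnr-ice40 --hx8k --package tq144:4k --pcf top.pcf --json top.json --asc top.asc
icepack top.asc top.bin   # bitstream intended for an ICE40HX4K-TQ144
```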
<rqou> yeah, there's already some board that does this
<rqou> iirc it's a shield thing for some "maker" board form factor and has sdram
<Bob_Dole> me and solra are going with the BGAs because honestly, since I have a spare toaster oven, I think BGAs sound easier. they'll pull themselves into place if you get close enough.
<rqou> can confirm, BGAs (at least at 1mm pitch on ENIG) work much more reliably than fine-pitch QFPs
<wbraun> the iCE40 BGA packages top out at 0.8mm pitch
<rqou> probably works, but ymmv
<wbraun> 1mm pitch is within cheap PCB design limits, 0.8mm typically requires you to push / break some of the limits
<wbraun> I have done 1mm in the past, its not that bad.
<Bob_Dole> why did I only just now realize what ymmv is?
<wbraun> What project are you working on Bob_Dole?
<SolraBizna> tricking me into making cryptocurrency mining devices
<SolraBizna> all I wanted was a 65816 and some memory, but noooo :P
<wbraun> With what FPGAs?
<wbraun> Is there anything that can be profitably mined with even FPGAs nowadays?
<wbraun> What does not have an asic?
<wbraun> I was working on a sia coin miner last year but then I got bored with it. A few months later they announced an asic...
<Bob_Dole> wbraun, there's a few that need FPGAs, monero specifically. They're mostly going for high-end xilinx parts for that, and mid-tier xilinx parts for accelerating gpu mining for various algos.
<Bob_Dole> I'm trying to get solra to make a fully FOSS gpu architecture, and if possible, include the ability to mine cryptocurrency stuff. first part is most important.
<SolraBizna> he's also working on tricking me into fully open hardware he can actually have
<SolraBizna> which is much easier because I also want that
ayjay_t has quit [Quit: leaving]
<Bob_Dole> being able to pair said gpu with a fully open cpu, with fully open everything-else, is desired
<wbraun> My attempted edge was to use AWS FPGA instances. They have hefty FPGAs
<wbraun> I wonder if I would have actually been profitable if I finished it. Probably not.
<wbraun> I wonder if I would have been more cost effective than the current best AWS credit >> crypto >> USD laundering method. Possibly. But still not likely.
<wbraun> Also, turns out that they restrict the FPGA instances quite heavily.
<Bob_Dole> I don't know much about the AWS fpgas.
<wbraun> It's a top-end xilinx part.
<wbraun> The toolchain for loading stuff on it is a bit convoluted though.
<galv[m]> How do they restrict the FPGA instances? I was able to reserve one just fine with some credits. Do they put restrictions on the total number?
ayjay_t has joined ##openfpga
<wbraun> They won’t let you just run any bitstream, and you have to use their wrapper for the interface
<galv[m]> The toolchain is a gory mess :(
<Bob_Dole> they're working on a custom VCU15... something, based card with some tweaks to cooling and power delivery, changing the V to a B, and then some artix-7 parts to put into M.2 slots called Acorn
<kc8apf> wbraun: that's so they can regain control, etc
<Bob_Dole> not aws. but SQRL/Mineority
<wbraun> They restrict the number. I was trying to determine the price elasticity of the spot instances and I had to beg them for the ability to have 10 instances
<wbraun> Probably mostly so you can’t dick with interfaces you are not supposed to or fry the device with a dud bitstream
<Bob_Dole> the acorns look interesting, M.2 Artix 7 cards, they're being focussed at accelerating parts of mining algos that can get GPUs more competitive again
<wbraun> And single FPGA instances mind you, not the 10x of the full box with multiple cards
<kc8apf> a few of their security people told me about the lengths they go to ensure the fpgas are clean before transferring between users
<wbraun> the price elasticity was actually measurable with only 10 spot instances, so I guess they did not have that many at the time.
<wbraun> I had to pretend to be working on some research project at my university to even get 10
<wbraun> Clean? What stored state is there?
<kc8apf> I recall them having a bunch of RAM attached
<wbraun> Yah. Should be pretty easy to flush volatile ram though
<kc8apf> also, bitstream loading doesn't necessarily clear latches
ayjay_t has quit [Client Quit]
ayjay_t has joined ##openfpga
<wbraun> So they load some “cleaning” bitstream to initialize everything to a known state?
<wbraun> What are the latches you are talking about though? Latches would be created with LUTs, no?
<kc8apf> in a 7-series device, bitstreams don't always include the storage bits in a LUT
<kc8apf> or BRAMs
<kc8apf> think of partial reconfig situations
<kc8apf> I didn't get concrete details on what they do. I'm guessing a bit based on what I know of 7-series bitstream format
<wbraun> Oh yah, each slice contains some distributed ram / shift registers
<wbraun> Now that I think about it, don’t they always use partial reconfiguration and always keep their interface wrapper configured or something?
<kc8apf> I think so
<wbraun> I don’t remember, it’s been about a year and a half since I read the docs
<wbraun> It was fun to learn about. Not very accessible though.
<wbraun> But it is cool to theoretically have access to such a big FPGA at relatively low cost.
<kc8apf> it would avoid having to reset the PCIe block
<Bob_Dole> my thought is: cache is good. sram is "cheap enough" to get 8MB of fast-ish SRAM. cryptonight needs ~2MB per thread. ECP5s don't Go Faaast, but have room to get a decent number of threads. cache is good for a CPU, so even if it can't mine well that'll probably make a cpu more Tolerable for general use on it. >.>
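The arithmetic behind that, as a back-of-envelope check using the numbers in the message above:

```shell
# ~2 MB cryptonight scratchpad per thread, 8 MB of SRAM available.
sram_mb=8
per_thread_mb=2
threads=$((sram_mb / per_thread_mb))
echo "$threads resident threads"   # prints "4 resident threads"
```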
<Bob_Dole> DDR3 isn't working in the open toolchain right now, right? how's ddr/ddr2 doing?
<SolraBizna> iCE40s can do DDR
jevinski_ has joined ##openfpga
jevinskie has quit [Ping timeout: 252 seconds]
Zorix has quit [Ping timeout: 264 seconds]
jevinskie has joined ##openfpga
Zorix has joined ##openfpga
jevinski_ has quit [Ping timeout: 245 seconds]
<sorear> DDR means two things
<emily> dance dance revolution 3
jevinski_ has joined ##openfpga
jevinskie has quit [Ping timeout: 250 seconds]
<wbraun> So Arachne-pnr and nextpnr are both place and route tools?
<wbraun> I am looking through the example builds on picorv32 and it looks like one of the builds (for the iCE40 UltraPlus 5K) uses nextpnr and the other (for the iCE40 hx8k) uses arachne-pnr
<wbraun> It seems that both were building yesterday but something is not working today. I was poking around in the makefile to debug and noticed that there was a difference between the two builds.
<wbraun> Not that it's causing my problem. Just curious. Both repos seem to be similarly updated / active.
<Bob_Dole> arachne-pnr is the old pnr tool,replaced by nextpnr
<wbraun> ok. Cool.
<wbraun> is the command interface the same?
<wbraun> Ooh, looks like it's not. I will figure it out though.
<wbraun> So nextpnr is the tool to use for the near future? Are there other alternatives?
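The interfaces do differ. A rough side-by-side for an HX8K target, assuming the stock icestorm flow; flags are from memory and the file names are hypothetical:

```shell
# arachne-pnr takes BLIF from yosys and uses short options:
yosys -p 'synth_ice40 -top top -blif top.blif' top.v
arachne-pnr -d 8k -p top.pcf top.blif -o top.asc

# nextpnr takes JSON and uses long options instead:
yosys -p 'synth_ice40 -top top -json top.json' top.v
nextpnr-ice40 --hx8k --pcf top.pcf --json top.json --asc top.asc
```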
<kc8apf> VPR but it doesn't have working support for any FPGAs yet
<wbraun> cool! Thanks for answering the questions! Hopefully I will have something building soon.
lovepon has quit [Ping timeout: 250 seconds]
<gnufan> Bob_Dole: with regard to "FOSS GPU", let me point you to http://libre-riscv.org/3d_gpu/ ; LKCL, the guy backing it, may not have such a great success story behind him, but some of the ideas could be worth a look..
<Bob_Dole> gnufan, that the eoma68 guy?
<gnufan> indeed..
<gnufan> i subscribed for the microdesktop... still waiting for it! :-)
<gnufan> but things are "slowly" moving there.. it looks like..
<Bob_Dole> One of his ideas is basically what I was proposing to solra. take a risc-v core, modify it some, modify llvmpipe to use those modifications, call it a day.
<Bob_Dole> and same, I ordered a compute card and enough to use it... has it been a year yet? I think it's been a year+ now.
<Bob_Dole> he was doing great at keeping updates about progress and hindrances and then went silent for a while, it looks like he's gotten back to giving updates.
<whitequark> Bob_Dole: llvmpipe is really slow...
<whitequark> well, it's impressively fast for what it is
<whitequark> but you pretty much have to back it with like an i7 to get anything meaningful
<whitequark> and even then
<Bob_Dole> whitequark, yes. the point is desktop environments to run smoothly, nothing more.
<whitequark> well
<whitequark> macOS up to 10.8.5 works well on llvmpipe
<whitequark> their variant of
<whitequark> 10.10+? nope, very slow
<Bob_Dole> since everything requires 2.5D acceleration now
<whitequark> just barely enough to not be completely unusable
<Bob_Dole> I haven't touched Mac OS since 10.6
<whitequark> i run it in a vm, which.... HANG ON
<whitequark> i can run it in a VM with GPU passthrough now
<whitequark> let's try it
m4ssi has joined ##openfpga
GuzTech has joined ##openfpga
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
<Bob_Dole> I wonder how well the Nyuzi core ports over to ECP5.. since it looks like Cyclone IVs are also LUT4?
<Bob_Dole> but there's more to it than hat
<Bob_Dole> s/hat/that
Miyu has joined ##openfpga
Miyu has quit [Ping timeout: 268 seconds]
thehurley3[m] has quit [Remote host closed the connection]
galv[m] has quit [Remote host closed the connection]
AlexDaniel-old[m has quit [Remote host closed the connection]
pointfree[m] has quit [Remote host closed the connection]
indefini[m] has quit [Remote host closed the connection]
jfng has quit [Remote host closed the connection]
edmund20[m] has quit [Remote host closed the connection]
Wallbraker[m] has quit [Remote host closed the connection]
nrossi has quit [Remote host closed the connection]
Xark has quit [Ping timeout: 252 seconds]
<daveshah> Bob_Dole: I think ECP5 and Cyclone IV are quite comparable resource-wise
<daveshah> Although their low-level architectures are different, they have similar LUT, RAM and multiplier widths
<daveshah> Timing should be similar too
Xark has joined ##openfpga
<Bob_Dole> nyuzi was using 74k of its LE/LUT4s it looked like, so unless it's using DSP blocks..
<daveshah> Should just about fit in an 85k then
<daveshah> 85k has plenty of DSPs too
<Bob_Dole> wasn't there only one operation being supported on them presently?
AlexDaniel-old[m has joined ##openfpga
<daveshah> Well the FOSS tools won't build Nyuzi until late next year at the earliest due to its heavy use of SystemVerilog
<daveshah> So DSP feature set in ECP5 would be the least of my worries
<Bob_Dole> Ah, well, there's plenty enough to do in the mean time.
nrossi has joined ##openfpga
pointfree[m] has joined ##openfpga
indefini[m] has joined ##openfpga
jfng has joined ##openfpga
Wallbraker[m] has joined ##openfpga
thehurley3[m] has joined ##openfpga
edmund20[m] has joined ##openfpga
galv[m] has joined ##openfpga
<openfpga-github> [Glasgow] whitequark pushed 1 new commit to master: https://github.com/whitequark/Glasgow/commit/77e3989de5cf30d59051edba0e41b4a75fce0702
<openfpga-github> Glasgow/master 77e3989 whitequark: applet.{i2c,spi}: add missing __init__.py to satisfy setuptools.
<openfpga-github> [Glasgow] whitequark opened issue #70: Add a DNP resistor footprint between CYP_MEM A0 and Vcc https://github.com/whitequark/Glasgow/issues/70
<openfpga-github> [Glasgow] whitequark closed issue #66: Firmware does not boot when built with sdcc 3.7.0 https://github.com/whitequark/Glasgow/issues/66
<openfpga-github> [Glasgow] whitequark commented on issue #66: Was an issue with libfx2 xmemcpy using DPS. Fixed by not using DPS, at a small per-iteration penalty. https://github.com/whitequark/Glasgow/issues/66#issuecomment-434644556
<openfpga-github> [Glasgow] whitequark commented on issue #38: We don't have any margin for two-row escape routing on 0.8 mm with 5/5 rules already, so this is not viable. https://github.com/whitequark/Glasgow/issues/38#issuecomment-434645525
<travis-ci> whitequark/Glasgow#112 (master - 4c115e9 : whitequark): The build has errored.
<travis-ci> whitequark/Glasgow#111 (master - 77e3989 : whitequark): The build has errored.
Bike has joined ##openfpga
<openfpga-github> [Glasgow] whitequark opened issue #71: Remove all hardcoded instances of system clock frequency https://github.com/whitequark/Glasgow/issues/71
Bike is now known as Bicyclidine
<openfpga-github> [Glasgow] whitequark opened issue #72: Rename USB pipes to Q, R, ... https://github.com/whitequark/Glasgow/issues/72
Kamots has quit [Quit: Lost terminal]
mumptai has joined ##openfpga
rohitksingh_work has quit [Read error: Connection reset by peer]
<openfpga-github> [Glasgow] whitequark commented on issue #70: Should be A1. First, it's equivalent to shorting pins 1 and 2 (already easy, but you have to remember the pinout). Second, it gives the EEPROM an unique address, unlike shorting A0 to Vcc. https://github.com/whitequark/Glasgow/issues/70#issuecomment-434674257
rohitksingh has quit [Ping timeout: 252 seconds]
mumptai has quit [Quit: Verlassend]
rohitksingh has joined ##openfpga
<openfpga-github> [Glasgow] whitequark opened issue #73: Pipeline USB reads https://github.com/whitequark/Glasgow/issues/73
Miyu has joined ##openfpga
Bicyclidine has quit [Ping timeout: 240 seconds]
mumptai has joined ##openfpga
rohitksingh has quit [Ping timeout: 268 seconds]
Bicyclidine has joined ##openfpga
rohitksingh has joined ##openfpga
<openfpga-github> [Glasgow] marcan pushed 2 new commits to revC: https://github.com/whitequark/Glasgow/compare/5321ef457ba2...e709a5bcac5f
<openfpga-github> Glasgow/revC e709a5b Hector Martin: revC: FX2<->FPGA routing take 2
<openfpga-github> Glasgow/revC 4df636b Hector Martin: revC: FX2<->FPGA routing take 1
GuzTech has quit [Quit: Leaving]
<openfpga-github> [Glasgow] marcan force-pushed revC from e709a5b to 5281b1b: https://github.com/whitequark/Glasgow/commits/revC
<openfpga-github> Glasgow/revC 5281b1b Hector Martin: revC: FX2<->FPGA routing take 2
<travis-ci> whitequark/Glasgow#113 (revC - e709a5b : Hector Martin): The build has errored.
<openfpga-github> Glasgow/revC 31ab573 Hector Martin: revC: delete old FPGA...
<openfpga-github> [Glasgow] marcan pushed 1 new commit to revC: https://github.com/whitequark/Glasgow/commit/31ab573b2303a2573a8f0387fdcd07a048947cf5
<travis-ci> whitequark/Glasgow#114 (revC - 5281b1b : Hector Martin): The build has errored.
<openfpga-github> [Glasgow] marcan pushed 1 new commit to revC: https://github.com/whitequark/Glasgow/commit/91819304a3f92aac0472b97e10ca26c6cf9dbc1e
<openfpga-github> Glasgow/revC 9181930 Hector Martin: revC: route missing D0 and D1
<travis-ci> whitequark/Glasgow#115 (revC - 31ab573 : Hector Martin): The build has errored.
<openfpga-github> [Glasgow] marcan force-pushed revC from 9181930 to f6b74c4: https://github.com/whitequark/Glasgow/commits/revC
<openfpga-github> Glasgow/revC f6b74c4 Hector Martin: revC: route missing D0 and D1
<travis-ci> whitequark/Glasgow#116 (revC - 9181930 : Hector Martin): The build has errored.
<travis-ci> whitequark/Glasgow#117 (revC - f6b74c4 : Hector Martin): The build has errored.
m4ssi has quit [Remote host closed the connection]
<awygle> wbraun: re: the UP5K being so slow - that's because it's optimized for power consumption, not speed
<awygle> why are 7:1 SERDES so common? that seems like a weird ratio
<daveshah> awygle: because 7-bit LVDS is very commonly used
<daveshah> For LCDs and some cameras
<awygle> ah, okay, that's what i was missing
<daveshah> Why that in the first place, I don't know
<awygle> still seems like a weird number of bits
<whitequark> i know why
<whitequark> uhhh
<whitequark> look at the encoding lcd lvds uses
<whitequark> oh yeah
<whitequark> awygle: 6 red, 6 green, 6 blue, hsync, vsync, data enable
<whitequark> the typical lcd lvds
<whitequark> transmitted in 21 bit chunks
<whitequark> via three diffpairs
<whitequark> so basically ,it's because of vga
<whitequark> i assume cameras do something like that in reverse
<whitequark> so you can feed the lcd from it or something
<whitequark> or because 7:1 SERDES were already a thing
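The bit budget whitequark is describing works out exactly; a quick check for the 18-bit-color case:

```shell
# 6R + 6G + 6B color bits plus hsync, vsync and data enable = 21 bits
# per pixel clock, carried on 3 LVDS data pairs: 21 / 3 = 7 bits per pair.
color_bits=$((6 + 6 + 6))
total_bits=$((color_bits + 3))   # + hsync, vsync, DE
pairs=3
echo "$((total_bits / pairs)):1 serialization"   # prints "7:1 serialization"
```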
<sorear> why would you spend 1/7 of your bits on sync data
<whitequark> because of vga
<whitequark> it's for converting vga transmitters and vga displays directly to lvds
<whitequark> like
<whitequark> it is literally made for feeding lcd controllers that have the parallel lcd bus
* awygle writes "learn about VGA" on TODO list, thinks about it for a second, and then crosses it off again
<whitequark> awygle: vga is
<whitequark> digital hsync, digital vsync, and analog r+g+b
<whitequark> that's the entire thing
<sorear> I assume it's also wasting a stupid fraction of the clock cycles on blanking intervals
<Bob_Dole> awygle, but you can make a DAC out of just a bunch of resistors you can drive off the pins of an MCU
<whitequark> vsync is a frame start strobe, hsync is a line start strobe
<whitequark> yes
<whitequark> you now understand vga
<Bob_Dole> and have vga, just like that
<sorear> despite the fact that LCDs fundamentally don't have a retrace period
<whitequark> sorear: but when are you going to generate music if not during vsync intervals?
<whitequark> and play it if not during hsync intervals? :P
<awygle> how will my mouse tear if there's no vsync
<awygle> (i might have that backwards)
<whitequark> lol
<qu1j0t3> awygle: is that a trick or rhetorical question
<whitequark> awygle: did you know that
<qu1j0t3> awygle: or are you looking for an answer
<whitequark> your gpu includes support for drawing the cursor
<whitequark> specifically to avoid cursor tearing
<whitequark> it's called "silken mouse" by nvidia iirc
<sorear> tearing is a kind of aliasing artifact, and will be relevant as long as time is sampled discretely
<awygle> qu1j0t3: it's a question demonstrating my total lack of comprehension of how any of this works or, in fact, what "mouse tearing" actually is
<qu1j0t3> and just "mouse" by everyone else, who uses vertical retrace interrupts since forever ;-)
<whitequark> i assume so that hardware that can't into double buffering still has usable cursor
<sorear> I knew that basically every GPU has a cursor sprite but not that it was about tearing
<sorear> thought it was just another "we added this when computers were slow and never revisited the decision"
<awygle> ... did seeed make their website spooooky for halloween? everything is orange
<qu1j0t3> awygle: If you change the (single buffered) vram during the raster, you can see a partially updated mouse "sprite". This is CRT terminology because the problem is very old, and was solved originally by using a vertical retrace interrupt and doing the update during CRT blanking
<qu1j0t3> (doesn't need double buffering, but double buffering also needs that interrupt anyway)
<whitequark> double buffering means double ram
<awygle> i guess "seeed" is also kind of spoooky by itself
<qu1j0t3> yes.
<whitequark> and i assume that wasnt available back them
<whitequark> then
<awygle> qu1j0t3: okay that makes sense, kind of
<awygle> concurrency problem
<qu1j0t3> whitequark: yeah, but it just wasn't needed for such a small problem
<whitequark> right
<awygle> seems like a big-hammer fix for it
<awygle> but i guess with CRTs maybe not
<qu1j0t3> whitequark: double buffering gets used when you have a full screen of sprites, like Dark Castle on Mac, then of course you flip buffers during blanking
<qu1j0t3> but you know that
* qu1j0t3 shuts up
<whitequark> well i've never heard of that specific application of double buffering
<qu1j0t3> awygle: not really, the interrupt had many uses
<whitequark> im not much of a crt person
<whitequark> i'm vaguely aware they exist
<whitequark> i used an actual crt for a few years i think
<whitequark> but that was sooooo long ago
<qu1j0t3> heh, i'm still a crt person, but electrostatic
<qu1j0t3> doing vector stuff for fun
<whitequark> nice
<awygle> qu1j0t3: "don't concurrently access these bits" seems like a solveable problem at higher granularity to me, but i guess it's not worth it
<qu1j0t3> i think with LCDs you still have to fake the retrace interrupt in principle, no doubt GPUs do that
<openfpga-github> Glasgow/revC 78bfffc Hector Martin: Update SOT563 footprints to Glasgow version and more DRC fixes
<openfpga-github> Glasgow/revC a2fd2f5 Hector Martin: revC: fix some misc DRC errors
<openfpga-github> [Glasgow] marcan pushed 2 new commits to revC: https://github.com/whitequark/Glasgow/compare/f6b74c4249c7...78bfffc056ad
<qu1j0t3> awygle: you can't NOT concurrently access those bits
<qu1j0t3> awygle: you can't block the video signal
rohitksingh has quit [Ping timeout: 240 seconds]
<qu1j0t3> that's the root of the problem, it's real time and mapped
<awygle> but the video signal could (in principle) block the memory write
<qu1j0t3> true
<openfpga-github> [Glasgow] marcan force-pushed revC from 78bfffc to 5ff2fd7: https://github.com/whitequark/Glasgow/commits/revC
<openfpga-github> Glasgow/revC 5ff2fd7 Hector Martin: revC: Update SOT563 footprints to Glasgow version and more DRC fixes
<qu1j0t3> awygle: yeah, possibly some systems solved it that way
<qu1j0t3> awygle: but the interrupt is not a very complex solution
<whitequark> that seems like a lot of pain
<whitequark> blocking writes
<awygle> sure, and what i'm describing does add a lot of complexity
<whitequark> like it'd fuck up anything else realtime
<awygle> whitequark: only if the "anything else realtime" was also writing to display memory
<qu1j0t3> anyway, for example, the Mac had "silken cursor" since 1984 (and other guis probably earlier)
<qu1j0t3> for this reason
<openfpga-github> [Glasgow] marcan force-pushed revC from 5ff2fd7 to 5a50dd6: https://github.com/whitequark/Glasgow/commits/revC
<openfpga-github> Glasgow/revC 5a50dd6 Hector Martin: revC: Update SOT563 footprints to Glasgow version and more DRC fixes
<whitequark> awygle: i automatically assume the implementation of bus wait cycle insertion would be fucked up somehow
ZipCPU_ has joined ##openfpga
<whitequark> because it's intel
<awygle> lol, fair
<awygle> i was thinking of software level locking
<whitequark> oh
<awygle> rather than doing it in hardware
<whitequark> ... wouldn't that involve a vsync interrupt *anyway*
<whitequark> assuming you don't want to poll a bit
<whitequark> so you'd need a vsync interrupt and then either block there (ew) or longjmp out of the display update code (double ew)
ZipCPU has quit [Ping timeout: 250 seconds]
<sorear> awygle: the video signal does block writes, but moving the cursor needs to be atomic relative to that
<sorear> the fine-granularity but traditional approach here would be "read a current-scanline register, poll if it's too close to where you want to write"
<sorear> some old systems didn't have enough memory for *single* buffering - ancient video systems are a trip
<sorear> would the NES PPU be considered a CRTC
<awygle> okay i'm convinced
<azonenberg_work> sorear: well if you want to do really low memory stuff
<azonenberg_work> you have a bunch of sprites and coordinates
<azonenberg_work> and synthesize pixels in real time off that :p
<sorear> but if you don't have a hardware sprite for the cursor, the window system needs to save the content under the cursor somewhere
<sorear> which would then significantly complicate all drawing routines if you aren't already double-buffering everything
<sorear> i guess you could not bother and just fire expose events on every mouse move but ughh
<travis-ci> whitequark/Glasgow#120 (revC - 5a50dd6 : Hector Martin): The build has errored.
ZipCPU_ is now known as ZipCPU
wbraun has quit [Quit: wbraun]
<azonenberg_work> sorear: well if you're doing that kind of low time rendering
<azonenberg_work> you never store the framebuffer
<azonenberg_work> you spit out pixel data in real time and always re-render at the frame rate
<sorear> sorry, I'm talking about two things at once
<azonenberg_work> i.e. your cpu clock is the pixel clock
<sorear> the NES PPU doesn't store a framebuffer anywhere
<sorear> but we've also been discussing cursor rendering in Toolbox, and, uh, it's been too long since I read Inside Macintosh
<sorear> iirc classic mac os, X11, and windows pre-2000 or so all use the "there is one framebuffer, 'windows' exist only to drive the event loop, moving a window results in expose events" approach
<sorear> but if you have a framebuffer, and the framebuffer contains the cursor, drawing is tricky
<azonenberg_work> Yeah
<azonenberg_work> if i ever get around to making any embedded gizmos with a UI
<azonenberg_work> i'm thinking of having a hardware compositor in the fpga :p
<azonenberg_work> each app renders using a combination of hw and sw to its own private framebuffer
<azonenberg_work> then as you finish updating you push the current framebuffer pointer to the compositing block
<azonenberg_work> which knows the position of each window and generates the final framebuffer from that
wbraun has joined ##openfpga
<sorear> why would you generate a final framebuffer instead of just compositing during scanout
<sorear> unless you want to do weird transforms
<awygle> wait so in a 7-series
<awygle> does every signal that a bufg drives end up on one of the 12 clock nets in each region?
<awygle> that was not a super comprehensible phrasing of that question i guess
<azonenberg_work> sorear: because the latency of memory reads is hard to predict if they're coming from all different sources
<azonenberg_work> compositing during scanout is easier if you have linear reads so you can prefetch etc
<azonenberg_work> awygle: Yes, that is my understanding
<awygle> azonenberg_work: then why are there 32 BUFGs if you can never use more than 12
<azonenberg_work> I dont think
<azonenberg_work> sec
<awygle> (this is specifically a zynq 7020, exact numbers may vary)
<azonenberg_work> awygle: so, if i'm reading UG472 right
<azonenberg_work> There are 32 vertical clock lines down the center spine of the device
<azonenberg_work> one BUFG drives each one
<azonenberg_work> Each clock region has 12 horizontal clock lines, which can each be fed by some or all (docs are unclear on exact routability here) of the 32 global lines
<azonenberg_work> then the HCLKs drive the per-region clock tree
<azonenberg_work> So basically, you have 32 global clocks but any one clock region can only use 12 of the 32
<azonenberg_work> but they can be different subsets
<azonenberg_work> you can also use a BUFH to drive a HCLK directly without using one of the 32 global lines
<awygle> hm okay
<awygle> this design _might_ fit then
<azonenberg_work> This is one of many situations where floorplanning is handy
<awygle> what counts as a clock region?
<azonenberg_work> have you ever looked at a 7 series chip in the floorplanner? :)
<azonenberg_work> The boxes with color-coded outlines
<awygle> no, because we're using some kind of horrible mess of mostly-EDK
<awygle> don't @ me bro
<azonenberg_work> planahead then?
<awygle> uh sec
<azonenberg_work> in other news woo 500 error
<azonenberg_work> (fixed now)
<azonenberg_work> Roughly speaking a clock region is half the chip wide and 50 CLBs high
<azonenberg_work> So all 7-series parts are 2xN clock regions
<azonenberg_work> Height of a clock region is constant, width varies by size of the part
<azonenberg_work> (ultrascale changes this, iirc all ultrascale clock regions are the same size and you can have >2 columns of them)
<awygle> where is the planahead button
<azonenberg_work> um... i know how to get to it from projnav but not edk
<azonenberg_work> But you can launch it standalone if you just want to look at the design
<awygle> what is projnav
<azonenberg_work> _pn
<azonenberg_work> the normal ISE ide
<azonenberg_work> source $XILINX/settings64.sh
<azonenberg_work> planAhead
<azonenberg_work> is how you'd launch it in the CLI
<azonenberg_work> Then once you get it up, create a new dummy project in /tmp or something, select the device, and import the ngc (synthesized netlist) and ncd (par'd netlists) from your build
<azonenberg_work> it should default to opening up the device floorplan
<azonenberg_work> Will look something like this http://thanatos.virtual.antikernel.net/unlisted/wtfplacement.png except without the color coding for the different blocks of hierarchy, which i did myself
<awygle> hm it won't open my edk project i guess
<awygle> okay so "X1Y2" indicates a clock region?
<awygle> can i make this show me the actual clock nets? i see the tiles but not the nets...
<SolraBizna> blast, there was a whole discussion of video generation and I was asleep
<SolraBizna> the original Mac had a single 16x16 hardware "cursor sprite" and a single hardware framebuffer
<SolraBizna> it updated the cursor sprite's position in the vertical-blank handler, and applications that wanted to do smooth animations would render to the heap and start to copy that rendered data onto the screen after the vertical blank had occurred
<SolraBizna> I would recommend an architecture like that for new embedded systems with mostly-static graphics requirements
<azonenberg_work> awygle: yes those are clock regions
<azonenberg_work> Planahead has a bug where each time the window refreshes, the box gets a little smaller
<azonenberg_work> this is fixed in vivado and i reported it in ISE but they wontfix'd it since ISE was basically EOL by that point
<awygle> that is an amusing bug
<balrog> azonenberg_work: they did FINALLY put out a Win10 compatible ISE build
<balrog> but it only supports Spartan 6, wtf
<azonenberg_work> balrog: because basically all the older parts are "even more EOL" than s6
<azonenberg_work> and for 7 series they want everyone using vivado
<balrog> Spartan 3A is still too common
<azonenberg_work> o_O
<balrog> due to 5V tolerant I/Os
<azonenberg_work> wait what? s3a is 5v tolerant?
<azonenberg_work> i thought you had to use cplds for that
<Bob_Dole> :o
<balrog> hm maybe I'm confusing myself
<azonenberg_work> xc9500, the original series, is 5V
<azonenberg_work> 9500xl might be 5v tolerant but is 3.3v core, i dont remember
<Bob_Dole> I thought even the 5V CPLDs were EOL
<azonenberg_work> coolrunner and beyond are 3.3 max on io
<balrog> azonenberg_work: yeah, nope, I was wrong
<azonenberg_work> balrog: my guess? the support engineers are tired of dealing with ise and want it to just die already
<awygle> verdict - this design will fit but i'll probably have to hand-optimize what buffers are used (instead of just slapping them all on BUFGs)
<awygle> i'm so sick of this design lol
<balrog> will they add CPLD support to Vivado?
<balrog> coolrunner is still supported I thought
<azonenberg_work> yeah, you have to use old ise on pre-win10 or linux
<azonenberg_work> vivado will never support cplds
<azonenberg_work> they consider cplds a dead end
<azonenberg_work> they still sell the chips but arent really wanting people to do new designs with them
<azonenberg_work> awygle: i generally use bufh's when i can for regional clocks that are only used for a small module or something
<azonenberg_work> It helps placement considerably in ISE with BUFH's if you floorplan the module to one clock region though
<azonenberg_work> so it doesnt have to spend a lot of time fighting unroutability to make it work
<azonenberg_work> awygle: btw if you upgrade this design to ultrascale at some point it will get a lot easier since you have 24 clocks per clock region (and the clock regions are smaller)
<awygle> azonenberg_work: pins from the same logical bus are in different banks, clocked by clocks which are not on CC pins
<azonenberg_work> plus 24 more "routing" clocks that are used for feed-through to adjacent regions
<awygle> floor planning is unlikely to help
<awygle> at this late date
<awygle> so ise doesn't use timing for placement /routing, does it?
<azonenberg_work> it should be timing driven
<awygle> oh OK, I thought it wasn't for some reason
<azonenberg_work> the really old parts did not, like spartan3 with some par options
<azonenberg_work> awygle: side note, i almost never use BUFGs in 7 series designs
<azonenberg_work> Since PLLs can do funny things during boot and i normally have my clocks come from a PLL
<azonenberg_work> So what i do instead is i feed each PLL output to a BUFGCE gated by the PLL lock signal
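The BUFGCE-gated-by-lock idea relies on glitch-free clock gating: the enable is only sampled while the clock is low, so the gated output never emits a truncated high pulse. Below is a behavioral cycle-level sketch of that latch-based gating, not a model of the actual Xilinx primitive:

```python
# Behavioral sketch of glitch-free clock gating in the style of a BUFGCE:
# the enable (e.g. a PLL "locked" flag) is captured by a latch that is
# transparent only while the clock is low, so enable changes never chop a
# high pulse in half.

def gate_clock(clk_samples, enable_samples):
    """clk_samples/enable_samples: equal-length lists of 0/1 levels.
    Returns the gated clock; enable is latched while clk is low."""
    latched_en = 0
    out = []
    for clk, en in zip(clk_samples, enable_samples):
        if clk == 0:              # transparent latch, open on clk low
            latched_en = en
        out.append(clk & latched_en)
    return out

clk = [0, 1, 0, 1, 0, 1, 0, 1]
en  = [0, 1, 1, 1, 1, 0, 0, 1]   # enable rises and later falls mid-high-phase
# Changes only take effect from the next low phase, so every output pulse
# is a full pulse:
assert gate_clock(clk, en) == [0, 0, 0, 1, 0, 1, 0, 0]
```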
<awygle> azonenberg_work: i would love to do all kinds of cool shit but this design came pre-fucked
<awygle> these are non-free-running clocks coming in on non-clock-capable inputs
<awygle> no PLLs, no BUFIOs, only pain
<azonenberg_work> lol
<azonenberg_work> enjoooooooy
<azonenberg_work> oh, and get the pcb engineer fired :p
<whitequark> wtf
<balrog> whitequark: your logging server is down?
<azonenberg_work> awygle: at least if they were CC inputs you'd be ok-ish
<azonenberg_work> and that could totally have been fixed at layout time
<azonenberg_work> Also if at all possible try and change policy so the FPGA guy(s) get inserted into the design flow before the PCB tapes out to fab
<azonenberg_work> you can avoid so much pain by having a less siloed design flow
<azonenberg_work> (not sure if this design predates you or what though)
<whitequark> balrog: is it? lemme see
<whitequark> balrog: seems up to me?
<balrog> it was a hiccup
<balrog> got a "could not load data from this location"
<azonenberg_work> same thing with asic work, having the RTL and layout guys/teams talking to each other from day one avoids lots of issues before they become big problems
<azonenberg_work> (looking at you, big-cpu-company-that-isnt-amd)
<balrog> azonenberg_work: looooooooool
<whitequark> pffffffff
<azonenberg_work> seriously, siloed workflows like that are asking for trouble
<whitequark> yeah intel's culture is royally fucked
<whitequark> every time someone talks about it i'm just "welp"
<whitequark> "how do they produce anything at all functional"
<qu1j0t3> :)
<azonenberg_work> whitequark: to give you an idea of how bad it is
<azonenberg_work> one time i had an intel engineer ask me for recommendations on a 10G NIC
<azonenberg_work> I suggested an intel chipset
<azonenberg_work> he was like "wait, we make those?"
<rqou> wtf
<azonenberg_work> he literally had no idea an entire line of their business existed
<azonenberg_work> its one thing to not have commit access to the repo or something, but not knowing the product exists??
<rqou> azonenberg_work: although to be fair i have very little idea what the rest of my employer is doing
<rqou> because we also do a ton of random shit
<azonenberg_work> rqou: yeah but like, these arent classified government contracts or something
<azonenberg_work> these things are sold openly on amazon
<rqou> yeah, but considering my $WORK, the red team really has very little visibility into e.g. how the huffpost writers get paid (which was a thing that came up a while back)
<sorear> the "three xeons in a trenchcoat" thing caught me by surprise, but it was at least chips I knew about
<rqou> it's not even in the same "legacy half of the company"
<azonenberg_work> yeah but if you have a huge holding company that has many different sub-brands its understandable
<azonenberg_work> as they're effectively their own companies who just share profit at the highest levels
<azonenberg_work> but intel nic and intel cpu?
<zkms> also theres intel buttbands
<rqou> wat
<sorear> *mumble* run lspci on a typical intel laptop and count the things *not* made by intel
<sorear> i guess at this point they could expand into batteries or screens
<SolraBizna> 2
<SolraBizna> (out of 21)
<SolraBizna> (one is an ExpressCard I've inserted, the other is a FireGL[?!!?!] GPU)
<whitequark> 20:30 < azonenberg_work> he was like "wait, we make those?"
<whitequark> incredible
<emily> sorear: realtek pcie card reader, ath10k networking, samsung SSD controller, apparently
<emily> I'd be happier if the networking were intel >>
<sorear> does intel make wifi chips
<sorear> i've only seen ethernet nics from them
<zkms> they do make wifi parts
<sorear> do they work on linux
<emily> iirc the old xps 13s specifically had a variant with an intel wifi chip in them because the normal one was terrible on linux?
<openfpga-github> [Glasgow] marcan pushed 1 new commit to revC: https://github.com/whitequark/Glasgow/commit/7d67f584612335f12ed225045db4bf5bdf3abee8
<openfpga-github> Glasgow/revC 7d67f58 Hector Martin: revC: fix connector placement, LVDS connector testing
<zkms> sorear: IME yes
<emily> but now they only offer that one
<emily> because shrug :/
<zkms> lmao if i do lspci |grep -v Intel
<zkms> no output
<emily> how many levels of vertical integration are you on right now
<openfpga-github> [Glasgow] marcan force-pushed revC from 7d67f58 to 328e0b7: https://github.com/whitequark/Glasgow/commits/revC
<openfpga-github> Glasgow/revC 328e0b7 Hector Martin: revC: fix connector placement, LVDS connector testing
<sorear> does that include an intel nvme ssd, or are you still on sata
<travis-ci> whitequark/Glasgow#121 (revC - 7d67f58 : Hector Martin): The build has errored.
<emily> i'm kind of amused by the realtek pci-e sd card reader
<emily> i forgot this even has a microsd slot
<sorear> (which intel does make)
<emily> 00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 08)
<emily> tfw half of my lspci output is thermals stuff
<sorear> I would have hoped that a "PCI Express Card Reader" would have something to do with ExpressCard
<zkms> apparently now intel has a thing called CNVI where the actual RF components for bluetooth/802.11 live in a separate hardware module
<zkms> that just ingests/shits out stuff on a high speed serial link
<zkms> idk if it's I/Q samples or if there's more modulation/demodulation stuff happening in the RF module but still
<SolraBizna> that sounds like my approach to analog
<whitequark> zkms: converged network... what?
<whitequark> or how does that decode
<awygle> azonenberg_work: this design does predate me
<awygle> also it predates this project
<whitequark> Intel® Integrated Connectivity (CNVi)
<awygle> this is add-on capability
<whitequark> awygle: predates as in predator.
<awygle> whitequark: that too
<awygle> now, i happen to think that this problem was Eminently Forseeable at design time
<awygle> but i wasn't there so probably they had constraints i'm not aware of
<zkms> whitequark: i have no fucking clue what the Intel® people think that acronym means
<awygle> our hw and sw teams do talk (they just don't always speak the same language)
<whitequark> oh god
<whitequark> zkms: "Connectivity Integration"
<whitequark> that's what it means
<zkms> ic
<zkms> the cell people call this a "remote radio head" which i feel is a lot better naming sense
<zkms> almost makes up for the vicious panoply of long acronyms that fill their standards ;;
<azonenberg_work> zkms: yeah i was just thinking that sounded like the same thing cell is doing
<SolraBizna> the Raspberry Pi I was gifted with was unstable, so I hooked up its serial connection and monitored it for four months
<travis-ci> whitequark/Glasgow#122 (revC - 328e0b7 : Hector Martin): The build has errored.
<SolraBizna> for three months, it was stable, so I assumed I had fixed it
<SolraBizna> Three days ago, I unplugged the serial connection
<SolraBizna> It has hung twice
<rqou> solution: update the deployment requirements to require the serial connection be plugged in
<SolraBizna> I can't think of any reason for it to fail like this except that having the ground hooked up was stabilizing it somehow
<whitequark> that's actually possible
<rqou> [14:16] (rqou) solution: update the deployment requirements to require the serial connection be plugged in
<whitequark> try hooking up just the ground?
<rqou> am I doing the cursed vendor thing correctly?
<SolraBizna> that's a good idea, I'll do that if it's stable with the serial cable connected for a week or so
<reportingsjr> whitequark: what is your IRC channel?
<whitequark> reportingsjr: ##whitequark
<whitequark> rqou: yes.
<whitequark> unfortunately
<SolraBizna> so glad I don't have to deal with real cursed vendors
<SolraBizna> on the other hand, I have to deal with being poor...
<awygle> would it be faster to route pin->BUFR->BUFG than pin->BUFG?
<azonenberg_work> no i dont think so
<azonenberg_work> probably a lot slower in fact?
<awygle> hm why? my reasoning is "BUFR is close to pin, clock fabric is faster than general routing"
<awygle> i guess that second thing may not be true
<azonenberg_work> its not faster than general routing per se, it's controlled for skew
<azonenberg_work> its a balanced tree with the same number of loads on each leaf etc
<awygle> right
<awygle> MMCMs aren't guaranteed to free-run if their input clock goes away...
<azonenberg_work> Why would you want a pll/mmcm to free run anyway? if it loses lock the output frequency is completely unpredictable
<azonenberg_work> and might go too fast for your constraints or something
<awygle> mhm
<azonenberg_work> if anything i'd want it to insta-stop and gate all outputs
<azonenberg_work> after only a couple of vco cycles, before it could drift out of the safe range
<awygle> i can't find anything about "bufr O to bufg I" timing in the datasheet....
<sorear> the problem with a stopped clock is that a glitchless clock mux can't change away from a stopped clock
<sorear> a PLL that free-runs at the lowest possible frequency could be useful, idk
<azonenberg_work> awygle: that's routing fabric
<azonenberg_work> routing fabric numbers arent in the datasheets
<azonenberg_work> heck, past 7 series they don't even give you CLB timing
<awygle> yeah but it's dedicated routing, i was hoping it was
<azonenberg_work> sorear: yes but the VCO isn't stopped
<azonenberg_work> the idea is that you detect loss of input clock, vco is still running, then glitchlessly gate the output between cycles before the vco period drifts significantly
<sorear> azonenberg_work: replying to "I'd want it to insta-stop and gate all outputs"
<azonenberg_work> That's my point, you can glitchlessly stop the outputs because the pll outputs are generated from the VCO which is free-running
<sorear> yes but if you have a clock mux downstream of the PLL you're hosed
<awygle> you spend like a nanosecond in the bufr itself
<azonenberg_work> depends on what you're muxing and how its configured
<awygle> but it takes >3ns to get to bufg
<azonenberg_work> all i can say is, my approach has always been that lack of a stable clock = lack of a clock
<awygle> so who knows
<azonenberg_work> shut down until you get a clock back
<awygle> find out when this build finishes i guess
<azonenberg_work> awygle: why do you care so much about this delay?
<azonenberg_work> what are you trying to do with the input clock?
<awygle> sample the incoming data
<azonenberg_work> a bufh is the best thing to do with a fabric-sourced clock (not that there are GOOD things to do with fabric clocks)
<awygle> hm, wonder if BUFH->BUFG is faster actually
<azonenberg_work> Why bufg at all??
<awygle> i can't BUFH the whole way because the clock regions don't work out that way
<azonenberg_work> thats my question
<azonenberg_work> Sample the data in the BUFH'd domain
<azonenberg_work> then feed into a dual clock fifo
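The dual-clock FIFO suggested here conventionally crosses its read/write pointers between domains in Gray code, so only one bit changes per increment and a pointer sampled mid-transition is off by at most one. A minimal sketch of just the conversions (assumed standard technique, not a full FIFO implementation):

```python
# Gray-code pointer helpers as used in asynchronous (dual-clock) FIFOs.

def bin2gray(n):
    """Binary -> reflected binary Gray code."""
    return n ^ (n >> 1)

def gray2bin(g):
    """Gray code -> binary (cumulative XOR of shifted copies)."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Successive Gray codes differ in exactly one bit, and the conversion
# round-trips:
for i in range(15):
    assert bin(bin2gray(i) ^ bin2gray(i + 1)).count("1") == 1
    assert gray2bin(bin2gray(i)) == i
```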
<awygle> i _can't_. the clock comes in in (say) bank 34, along with 2 out of 8 bits
<azonenberg_work> ...
<awygle> the other 6 are in (say) bank 13, one clock region over and one down
<azonenberg_work> waaait
<azonenberg_work> the bits arent even coming in on the same side of the chip, much less the same bank?
<awygle> nope
<azonenberg_work> my previous recommendation to fire the pcb guy just got a lot stronger
<awygle> not necessarily at least
<awygle> like i said this board wasn't, strictly speaking, _designed_ to do this
<awygle> oh the BUFHs are in the center spine, so they're not going to be any faster than routing through fabric to the BUFG, probably
<azonenberg_work> My understanding is that a BUFH is purely a software construct, they have no existence in the actual chip
<azonenberg_work> so is a BUFG
<azonenberg_work> ish
<azonenberg_work> Basically, you have routes from various sources into the 32 global clock lines
<azonenberg_work> and the 12 regional clocks
<azonenberg_work> Then you can route from the global clock lines into the regional clocks
<azonenberg_work> a BUFH is just a way of saying, drive this regional clock but don't backfeed into the global clock
<azonenberg_work> i.e. the 12 BUFH's per clock region are literally the horizontal branches of the global clock tree within that clock region
<azonenberg_work> there are not two separate sets of routes
<azonenberg_work> So a BUFH is just a pip going from fabric routing to the horizontal clock wire
<azonenberg_work> And it likely uses the same high-fanout buffer as the redriver between the global clock spine and the horizontal clock row
<sorear> What would happen if a clock tree had loops?
<awygle> right
<awygle> but the redriver is still on the center spine
<azonenberg_work> near, not in
<daveshah> sorear: you mean going back through a global buffer?
<awygle> whatever, close enough
<daveshah> You'd make a delay line memory
<daveshah> Most arches don't have bidirectional switches off the clock tree so you couldn't form a loop within it
<azonenberg_work> a very power hungry and slow one :p
<azonenberg_work> and yes
<azonenberg_work> that too
<azonenberg_work> you could loop one bufg into another and another
<azonenberg_work> but not full feedback
<sorear> I mean if there were actual metal loops in clock distribution
<azonenberg_work> you'd have to close the loop in fabric
<awygle> bufg i-to-o is much faster than bufh i-to-o
<openfpga-github> [Glasgow] marcan pushed 1 new commit to revC: https://github.com/whitequark/Glasgow/commit/8da369c6a4212780e813f1fdbb6b345d648934fd
<openfpga-github> Glasgow/revC 8da369c Hector Martin: revC: route LVDS
<awygle> per datasheet
<sorear> naively, a clock plane would not have any skew problems (because nearby-in-2D flops will always get the pulse at nearly the same time), and a clock plane-with-holes would seem to have the same advantage while using not much more metal than a tree
<sorear> i'm wondering if (a) they don't do this to save metal (reduce clock capacitance, reduce dynamic power) (b) this doesn't actually work
<daveshah> I suspect capacitance comes into it
<daveshah> Most clock tree structures have buffers in the tree structure
<daveshah> At least in ice40 and ecp5 they can be turned off to save power too
<travis-ci> whitequark/Glasgow#123 (revC - 8da369c : Hector Martin): The build has errored.
<awygle> i'm so sick of this project...
<sorear> which
<azonenberg_work> sorear: metal density issues w fab too
<azonenberg_work> there are min and max percent cover allowed
<azonenberg_work> and yes cap is an issue
<azonenberg_work> An ideal clock setup is typically a fractal of H shapes
<azonenberg_work> So each buffer only needs a fanout of ~4
<sorear> I guess my question is, "do they use fractal trees because that's the most efficient use of metal, or because there are voodoo RF reasons to studiously avoid loops?"
<azonenberg_work> Both
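With a fanout of roughly 4 per buffer (one H shape per level), the number of buffer levels needed to reach N leaf loads grows as log4(N). A quick integer sketch of that scaling (illustrative only, real clock trees also balance wire length and load per leaf):

```python
# How many buffer levels does an H-tree with fanout 4 need to reach a
# given number of leaf endpoints? Each level multiplies the reach by the
# fanout.

def htree_levels(leaves, fanout=4):
    """Buffer levels needed to fan one clock out to `leaves` endpoints."""
    levels, reach = 0, 1
    while reach < leaves:
        reach *= fanout
        levels += 1
    return max(levels, 1)

assert htree_levels(4) == 1
assert htree_levels(16) == 2
assert htree_levels(1000) == 5   # 4**5 = 1024 >= 1000
```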
<azonenberg_work> I did see a square grid on a 350nm part once
<azonenberg_work> Myricom lan switch
<azonenberg_work> unsure if clock or power
<azonenberg_work> https://siliconpr0n.org/archive/lib/exe/fetch.php?cache=&media=azonenberg:myricom:pcidma_03_bf_neo5x_annotated.jpg
<azonenberg_work> Given the fat paired busing on M2 i conjecture the M3 grid is clock
<azonenberg_work> And this wasnt the crossbar it was a pci dma card
<azonenberg_work> https://siliconpr0n.org/archive/lib/exe/fetch.php?cache=&media=azonenberg:myricom:pcidma_01_bf_neo40x_cropped.jpg
<azonenberg_work> https://siliconpr0n.org/archive/lib/exe/fetch.php?cache=&media=azonenberg:myricom:pcidma_08_bf_neo40x_annotated.jpg taps off the clock tree for sram wordline drivers
<awygle> loop == loop antenna
<awygle> so i'm looking at a .twr file, and it has "clock to setup on destination clock", "setup/hold to clock", and
<awygle> "Hold Paths" under hold errors
<awygle> and they all have different numbers
<awygle> how do i interpret this?
<awygle> hm can i set this to analyze timing only between 0 and 40 C?
<Bob_Dole> I saw someone saying RX 550s have working drivers on RV64 with the FOSS AMDGPU stuff?
<Bob_Dole> s/I saw/someone told me but I forget who/
<awygle> huh, apparently there's a TEMPERATURE constraint, that's interesting