##openfpga on 2019-07-17 — irc logs at freenode.irclog.whitequark.org

00:24 sgstair has quit [Read error: Connection reset by peer]

00:51 Maya-sama has joined ##openfpga

00:55 Miyu has quit [Ping timeout: 276 seconds]

01:10 dj_pi has joined ##openfpga

01:35 Richard_Simmons has joined ##openfpga

01:39 Bob_Dole has quit [Ping timeout: 264 seconds]

02:07 flea86 has joined ##openfpga

02:47 genii has quit [Remote host closed the connection]

03:04 sgstair has joined ##openfpga

03:06 Bike has quit [Quit: Lost terminal]

03:08 mkdir has joined ##openfpga

03:08 <mkdir> yo yo

03:08 <mkdir> what it be

03:59 hackkitten has joined ##openfpga

04:02 Maya-sama has quit [Ping timeout: 245 seconds]

04:18 <emeb_mac> 2b | /2b

05:02 rohitksingh_wor1 has joined ##openfpga

05:03 rohitksingh_wor1 has quit [Client Quit]

05:12 dj_pi has quit [Ping timeout: 258 seconds]

05:20 cr1901_modern has quit [Ping timeout: 248 seconds]

05:23 cr1901_modern has joined ##openfpga

06:22 m4ssi has joined ##openfpga

06:28 cr1901_modern1 has joined ##openfpga

06:30 cr1901_modern has quit [Ping timeout: 268 seconds]

06:46 hackkitten has quit [Ping timeout: 252 seconds]

07:17 emeb_mac has quit [Ping timeout: 248 seconds]

07:54 mkdir has quit [Ping timeout: 260 seconds]

08:14 Asu has joined ##openfpga

08:16 Jybz has joined ##openfpga

08:59 Maya-sama has joined ##openfpga

09:01 Jybz has quit [Quit: Konversation terminated!]

09:02 Maya-sama is now known as Miyu

09:14 Miyu is now known as hackkitten

09:21 plaes has quit [Remote host closed the connection]

09:45 <_whitenotifier-3> [Boneless-CPU] zignig commented on pull request #4: directives bikeshed - https://git.io/fj1cq

09:46 <_whitenotifier-3> [Boneless-CPU] zignig synchronize pull request #4: directives bikeshed - https://git.io/fjXmy

10:18 ironsteel__ has joined ##openfpga

10:20 ironsteel has quit [Ping timeout: 245 seconds]

10:43 Asu has quit [Ping timeout: 245 seconds]

10:56 Asu has joined ##openfpga

11:46 cr1901_modern1 has quit [Quit: Leaving.]

12:00 Asu has quit [Ping timeout: 248 seconds]

12:19 Miyu has joined ##openfpga

12:21 hackkitten has quit [Ping timeout: 244 seconds]

12:21 Asu has joined ##openfpga

12:26 Asu` has joined ##openfpga

12:26 Asu has quit [Ping timeout: 268 seconds]

13:21 flea86 has quit [Quit: Goodbye and thanks for all the dirty sand ;-)]

13:38 dh73 has joined ##openfpga

13:39 cr1901 has joined ##openfpga

13:56 rohitksingh has joined ##openfpga

14:09 rohitksingh has quit [Ping timeout: 244 seconds]

14:18 genii has joined ##openfpga

14:28 rohitksingh has joined ##openfpga

14:28 <eddyb> daveshah: apologies for discussing ECP5 previously and not being aware of https://github.com/daveshah1/TrellisBoard

14:29 <eddyb> it looks amazing o_O

14:40 <eddyb> almost missed this: "Remaining two SERDES channels on M.2 E-key connector"

14:41 <eddyb> seems like it would be possible to get DisplayPort out of there :P

14:44 <eddyb> at this point I should just order the ECP5 evaluation board and do a demo with it (if I can even get the electrical side to behave)

14:46 <tnt> daveshah: you said there was a demo of how to use the SERDES somewhere ? what does that demo do ?

14:47 <daveshah> I wrote this which just transmits a counter and displays the received value on the LEDs

14:47 <daveshah> https://gist.github.com/daveshah1/e5eaa1b3da02c59f2892fdaa4a9737f8

14:47 <daveshah> Intention being to loop Tx to Rx

14:47 <daveshah> whitequark's Yumewatari also has an example implementing the PCIe physical layer

14:48 <daveshah> she also documented some of the parameters a bit

14:48 <daveshah> https://github.com/whitequark/Yumewatari/blob/master/yumewatari/gateware/platform/lattice_ecp5.py

14:53 <eddyb> what I never asked was what board did wq test all that on?

14:53 <tnt> versa IIRC

14:53 <eddyb> thanks, that makes sense

14:54 <eddyb> daveshah: have you tried Yumewatari on your TrellisBoard? :D

14:54 <daveshah> No, haven't tested the SERDES stuff at all yet

14:54 <daveshah> on the ever growing todo list...

14:59 somlo has quit [Remote host closed the connection]

15:04 somlo has joined ##openfpga

15:06 Thorn has quit [Ping timeout: 244 seconds]

15:13 Asu` has quit [Read error: Connection reset by peer]

15:13 emeb has joined ##openfpga

15:15 Asu` has joined ##openfpga

15:16 somlo has quit [Ping timeout: 268 seconds]

15:24 somlo has joined ##openfpga

15:26 Asu has joined ##openfpga

15:28 Asu` has quit [Ping timeout: 272 seconds]

15:29 wpwrak has quit [Ping timeout: 245 seconds]

15:43 wpwrak has joined ##openfpga

15:43 rohitksingh has quit [Ping timeout: 258 seconds]

16:02 rohitksingh has joined ##openfpga

16:21 m4ssi has quit [Remote host closed the connection]

16:47 <mithro> Anyone know where the page for that open source eda tooling event co-located with DATE at the beginning of the year went?

16:47 <mithro> daveshah: ^

16:47 <daveshah> mithro: https://osda.gitlab.io/

16:50 <mithro> daveshah: For some reason it doesn't seem to come up in my Google searches

16:51 <daveshah> https://usercontent.irccloud-cdn.com/file/PY9yEyQw/Screenshot%20from%202019-07-17%2017-50-49.png

16:51 <daveshah> #worksforme

17:05 Miyu has quit [Ping timeout: 268 seconds]

17:25 rohitksingh has quit [Ping timeout: 268 seconds]

18:15 rohitksingh has joined ##openfpga

18:44 hackerfoo has quit [Remote host closed the connection]

18:44 m_hackerfoo has quit [Remote host closed the connection]

19:00 Miyu has joined ##openfpga

19:20 Miyu has quit [Ping timeout: 246 seconds]

19:57 rohitksingh has quit [Ping timeout: 245 seconds]

20:26 hackerfoo has joined ##openfpga

20:27 m_hackerfoo has joined ##openfpga

20:51 <whitequark> daveshah: poke

20:53 <daveshah> whitequark: hi!

20:55 <whitequark> daveshah: currently thinking about the best way to map decision trees in processes to lookup tables that output one-hot signals

20:55 <whitequark> i feel like BDDs are definitely related in that their reduced form is good for eliminating irrelevant terms, but also i don't see any form of BDDs so far that would directly be a good fit

20:56 <whitequark> for several reasons:

20:57 <whitequark> 1. in yosys processes you can end up with *several* cases in the tree simultaneously active. what's worse is that not only are the decision trees priority-encoded in that the first case wins, but they are *also* priority-encoded in that the last switch assigning a specific signal wins

20:59 <whitequark> 2. it would be nice to share one-hot functions between several parallel muxes (with some preprocessing/creative wiring), but that requires turning the usual BDD structure "inside-out" in a way
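[editor's sketch] The two priority rules from point 1 can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Yosys' actual process representation: `evaluate`, the flat list-of-switches layout, and the signal names are all hypothetical.

```python
# Toy model of the two priority rules; the data layout is an assumption,
# not the RTLIL process structure.
def evaluate(switches, inputs, defaults):
    """switches: list of switches, each a list of (predicate, assignments) cases."""
    signals = dict(defaults)
    for cases in switches:              # a later switch overrides earlier ones
        for predicate, assignments in cases:
            if predicate(inputs):       # the first matching case in a switch wins
                signals.update(assignments)
                break
    return signals


switches = [
    [(lambda i: i["a"] == 1, {"y": "A"}), (lambda i: True, {"y": "B"})],
    [(lambda i: i["b"] == 1, {"y": "C"})],
]

result_both = evaluate(switches, {"a": 1, "b": 1}, {"y": "default"})
result_a_only = evaluate(switches, {"a": 1, "b": 0}, {"y": "default"})
result_neither = evaluate(switches, {"a": 0, "b": 0}, {"y": "default"})
```

With both `a` and `b` set, two cases are active at once and the second switch decides `y`, which is the situation a plain BDD does not capture directly.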

21:00 <whitequark> so for example, in a typical BDD you would have two leaf nodes, 0 and 1, and you build up functions by branching on variables, where a more complex function can be split into several smaller ones that are effectively subtrees

21:03 <whitequark> on the other hand, in the hypothetical structure i want, each node can potentially drive a signal, and indeed if an entire switch subtree chooses some particular value for a particular mux, you want to get the selection signal as close to the root as possible

21:05 <whitequark> on the other hand, canonicity isn't that important because combining these structures isn't that useful

21:06 <whitequark> any thoughts?

21:07 <daveshah> Not really close to anything I've looked at before but I'll think about it

21:07 <whitequark> I'm not even sure how to *build* this structure tbh

21:08 <whitequark> but I suspect what might work is "pushing from the top"

21:10 <whitequark> that is, there is a shared decision tree for all muxes, and each node can have *both* a set of mux selections *and* a set of edges

21:10 <daveshah> Yes, I think that is starting to make sense

21:11 <whitequark> each time you add a new node you look at whether the sub-switches have a variety of selections for any particular mux, or only one

21:11 <whitequark> and if it's one, you pin it there, or if it's many, you push the decision to sub-nodes
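[editor's sketch] A minimal Python sketch of the "pushing from the top" idea described above. It assumes the priority encoding has already been expanded into an unambiguous tree; the tuple/dict tree layout, the function names, and the toy `decoder` are hypothetical and only illustrate the pin-or-push rule.

```python
# Input tree (assumed layout): a leaf is a dict {mux: selection};
# an internal node is a tuple (variable_name, {value: subtree}).
def selections(tree, mux):
    """Set of selections that `mux` can take anywhere under `tree`."""
    if isinstance(tree, dict):
        return {tree[mux]}
    _, branches = tree
    return set().union(*(selections(sub, mux) for sub in branches.values()))


def build(tree, muxes):
    """Build a shared node holding both pinned mux selections and edges."""
    pinned, pending = {}, []
    for mux in muxes:
        choices = selections(tree, mux)
        if len(choices) == 1:
            pinned[mux] = next(iter(choices))   # only one selection: pin it here
        else:
            pending.append(mux)                 # many selections: push down
    node = {"pinned": pinned, "var": None, "edges": {}}
    if pending:
        var, branches = tree    # a leaf never has pending muxes
        node["var"] = var
        node["edges"] = {v: build(sub, pending) for v, sub in branches.items()}
    return node


decoder = ("op", {
    0: {"alu": "add", "wb": "reg"},
    1: ("mode", {
        0: {"alu": "sub", "wb": "reg"},
        1: {"alu": "sub", "wb": "mem"},
    }),
})
shared = build(decoder, ["alu", "wb"])
```

In this toy decoder the `alu` selection for `op == 1` is pinned one level above the `mode` branch, so only `wb` still depends on `mode`. Subtrees where nothing is pending lose their edges entirely, which is where irrelevant terms drop out.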

21:14 cr1901 has quit [Ping timeout: 245 seconds]

21:18 cr1901 has joined ##openfpga

21:19 <mithro> whitequark: A while back you had a page which discussed some of your theory behind nmigen but I can't find it right now...

21:20 <whitequark> mithro: https://github.com/m-labs/nmigen/blob/master/doc/PROPOSAL.md ?

21:20 Thorn has joined ##openfpga

21:20 <mithro> whitequark: I felt like there was something else to that...

21:21 <daveshah> Just to make sure I understand what is going on, what will the ultimate final inputs to the BDD be?

21:22 <daveshah> e.g. if a case was selecting on a==5 would the input be a or a==5?

21:23 <daveshah> With a being some kind of top level input

21:25 <whitequark> daveshah: hmmm, do you mean like `case (a==5) 1: ... end` or `case (a) 5: ... end` ?

21:25 <daveshah> Let's say the first

21:26 ZipCPU has quit [Ping timeout: 250 seconds]

21:27 ZipCPU has joined ##openfpga

21:27 <whitequark> then the process would only ever see the result of a==5, which is a 1-bit $eq.\Y signal

21:27 <daveshah> Which it seems like Yosys will create if you have a chain of if/else instead of a case

21:27 <whitequark> indeed, and nmigen does that as well

21:28 <whitequark> what I'm really interested in though is things like instruction decoders

21:28 <whitequark> where for example you can have nested switches branching on the *same bits*

21:28 <whitequark> boneless' decoder branches up to three times on the same \i_insn[15:0]

21:29 <whitequark> it's not hard at all to handle if/else chains because it pretty much just becomes a tree of muxes and all downstream passes should handle that more or less optimally

21:29 <whitequark> but the moment you get an FSM it becomes bad already

21:29 <whitequark> and decoders are the worst

21:47 <whitequark> daveshah: so I think I'm going to write some code that builds *some* decision diagram (not necessarily a good one), and then go from there, because,

21:48 <whitequark> with BDDs, choosing variable order is very important, but how do you choose variable order? for example by looking at which choices affect the most primary outputs

21:49 <whitequark> but to do this, I need to be able to quickly assess the impact of a particular decision on the primary outputs... which means I need to expand the priority encodings somehow, which means I need basically the same structure I'm building

21:50 <whitequark> or at least, that would be an easy way to do it, since, given this tree makes every decision explicit, evaluating the choices is as simple as cutting off some branches
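[editor's sketch] One way to read "evaluating the choices is as simple as cutting off some branches", sketched in Python. The scoring rule (count the outputs whose reachable selections change with the variable) is an assumed heuristic, not something stated in the discussion, and the tree layout and names are hypothetical.

```python
# Same assumed layout as an explicit decision tree: a leaf is a dict
# {output: selection}; an internal node is (variable_name, {value: subtree}).
def restrict(tree, var, value):
    """Cut off every branch that contradicts var == value."""
    if isinstance(tree, dict):
        return tree
    name, branches = tree
    if name == var:
        return restrict(branches[value], var, value)
    return (name, {v: restrict(sub, var, value) for v, sub in branches.items()})


def reachable(tree, output):
    """Selections of `output` that remain reachable in `tree`."""
    if isinstance(tree, dict):
        return {tree[output]}
    return set().union(*(reachable(sub, output) for sub in tree[1].values()))


def impact(tree, var, values, outputs):
    """Count outputs whose reachable selections depend on the value of var."""
    count = 0
    for output in outputs:
        per_value = {frozenset(reachable(restrict(tree, var, v), output))
                     for v in values}
        if len(per_value) > 1:
            count += 1
    return count


toy = ("op", {
    0: {"alu": "add", "wb": "reg"},
    1: ("mode", {
        0: {"alu": "sub", "wb": "reg"},
        1: {"alu": "sub", "wb": "mem"},
    }),
})
outputs = ["alu", "wb"]
impact_op = impact(toy, "op", [0, 1], outputs)
impact_mode = impact(toy, "mode", [0, 1], outputs)
order = sorted(["mode", "op"],
               key=lambda v: impact(toy, v, [0, 1], outputs), reverse=True)
```

Here `op` affects both outputs and `mode` only one, so this heuristic would branch on `op` first.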

21:53 <sorear> it sounds like you’re hitting the “synthesizing decoders requires specialized tools” problem that, as you have observed, other vendors solve with spreadsheets and bespoke logic optimization scripts

21:54 <whitequark> sorear: yeah but it's not just decoders

21:55 Asu has quit [Quit: Konversation terminated!]

21:55 <whitequark> like, FSMs with rich conditionals inside them have the same problem

21:55 <whitequark> also this is a stupid ass problem to have. you have a logic optimizer *right here*

21:57 <tnt> I'm curious what you'll end up with and how much better than the naive/brute-force approach to build a giant truth table and minimizing that.

21:58 <whitequark> tnt: "giant truth table" might work for 16-bit instructions but probably not for 32-bit ones
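[editor's sketch] The scaling concern is plain arithmetic: a full truth table has one row per possible instruction word. A small Python illustration, where `toy_decode` and its 4-bit encoding are hypothetical stand-ins for a real decoder:

```python
def build_truth_table(decode, width):
    """Enumerate every instruction word of the given bit width."""
    return [decode(word) for word in range(2 ** width)]


def toy_decode(word):
    # Hypothetical 4-bit encoding: the top bit picks the instruction class.
    return "alu" if (word >> 3) == 0 else "branch"


table = build_truth_table(toy_decode, 4)

rows_16 = 2 ** 16   # rows for a 16-bit instruction word
rows_32 = 2 ** 32   # rows for a 32-bit instruction word
growth = rows_32 // rows_16
```

Going from 16-bit to 32-bit instructions multiplies the row count by 65536, before any minimization has run.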

21:58 <tnt> yeah, of course, it's not a universal approach, that's just the one I'm using atm :p

21:59 <whitequark> oh. lol.

21:59 <whitequark> what for?

21:59 <tnt> cpu instruction decoding :)

22:00 <whitequark> right, what cpu?

22:04 <tnt> A custom 16 bit one I wrote for the ice40. I posted the ISA a while ago https://gist.github.com/smunaut/dac7a096add6bcdb5b70f0e205e4d16d

22:05 <tnt> Really need to get back to it and put it online somewhere.

22:06 <whitequark> tnt: strongly reminiscent of boneless, heh

22:09 <tnt> whitequark: IIRC I took the register window concept from it, but other than that, it was actually mostly defined on paper before boneless. I guess the design space isn't that big :) Although there are quite a few fundamental differences like different data/program/io spaces.

22:09 <tnt> (at least AFAIK, I haven't actually dug much into boneless)

22:11 <whitequark> tnt: yeah, not saying you even borrowed anything

22:11 <whitequark> it's just there are not so many ways to make a small cpu

22:11 <whitequark> i'm curious if you would consider using boneless v3

22:14 <tnt> I was definitely planning to look at it a bit closer. Especially because chances are you'll make a much better toolchain for it than I would for my softcore :)

22:15 <whitequark> tnt: there is already the assembler and disassembler (both into text and structured data aka python code)

22:15 <whitequark> the core, in theory, is functional, the FSM-driven one, but it does not have a formal spec yet, because it pisses me off how inefficient the yosys synthesis results are

22:15 <whitequark> and I will not drive a pipelined core without a formal spec for sure

22:16 <tnt> tbh when I saw you were working on one, I considered not even pursuing mine at all, but then I always wanted to try and design and implement one ever since I used and tweaked the picoblaze like 20y ago (damn I'm old) so I figured I would see it through.

22:16 <whitequark> oh yeah, designing one really gives you a lot of insight into that

22:17 <whitequark> for example... on a LUT architecture you don't really care about a prefix encoding

22:17 <tnt> mostly because I'm targeting fmax of 48 MHz (to be synchronous with usb core) on an up5k and that's ... tricky and requires hacks that I didn't think you'd go for :)

22:17 <whitequark> that's actually something I want to do.

22:17 <whitequark> for the same reason, amusingly

22:18 <whitequark> the entire instruction set is carefully designed such that the path between any two registers is at most a 16-bit adder + a 2-way 16-bit mux

22:18 <whitequark> well, in theory, it doesn't seem to synthesize to that yet

22:19 <whitequark> that's why boneless has so many multicycles, among other things

22:19 <tnt> I saw your "python assembler" for an earlier version of boneless and copied it when I had to write an easy way to make the microcode for my usb core ( https://github.com/smunaut/ice40-playground/blob/master/cores/usb/utils/microcode.py#L142 )

22:19 <tnt> I found it easy and elegant :)

22:20 <whitequark> in particular I am going to be (ab)using registers with reset and 4-wide OR gates to replace muxes

22:20 <tnt> Hehe, yeah, that's how I implemented 'muxes' too.
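[editor's sketch] A behavioural Python model of the trick as the editor understands it: every source register that is not selected is held in reset and so outputs 0, and an OR of all register outputs then passes the selected value through. This is an interpretation, not code from either core; `or_mux` and the sample values are hypothetical, and the select is assumed to be one-hot.

```python
def or_mux(sources, one_hot_select):
    """OR together sources, forcing every unselected source to 0."""
    result = 0
    for value, selected in zip(sources, one_hot_select):
        gated = value if selected else 0    # register held in reset reads as 0
        result |= gated                     # wide OR in place of a mux
    return result


sources = [0x1234, 0xABCD, 0x00FF, 0x8000]
picked = or_mux(sources, [0, 0, 1, 0])
none_selected = or_mux(sources, [0, 0, 0, 0])
```

The model only holds when at most one select bit is set, which ties back to the earlier interest in lookup tables that output one-hot signals.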

22:20 <whitequark> oh yeah, I since replaced that assembler with a more complex version

22:21 <whitequark> mostly because zignig wanted a text assembler

22:21 <whitequark> https://github.com/whitequark/Boneless-CPU/blob/master/boneless/arch/instr.py https://github.com/whitequark/Boneless-CPU/blob/master/boneless/arch/opcode.py

22:21 <tnt> Yup, I saw some tweets about that, pretty impressive stuff you managed to do with metaclasses etc ...

22:21 <whitequark> it's a bit silly really

22:21 <whitequark> i didn't *have* to go that far

22:22 <whitequark> but i wanted the instruction tables to be really pretty, and now they are

22:41 Bike has joined ##openfpga

23:05 dh73 has quit [Ping timeout: 260 seconds]

23:56 dh73 has joined ##openfpga