<digshadow-c>
rqou: nice! Is this posted to a public repo?
<rqou>
not yet
<rqou>
it's also not 100% clean since i've been using the .dev files for hints, so i'm not sure you _want_ it in a public repo
digshadow has joined ##openfpga
<azonenberg>
rqou: xilinx uses active-low bits a lot for flash parts
<azonenberg>
b/c they want "off" = "blank flash"
<rqou>
i meant that the macrocells are not in the "obvious" order
<rqou>
they seem to be in the order "0 | 9 | 1 | 10 | 2 | 11 | 3 | 12 | 4 | 13 | 5 | 14 | 6 | 15 | 7 | 16 | 8 | 17"
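(Editor's sketch, not from the log: the order rqou lists looks like a plain interleave of the lower half, 0-8, with the upper half, 9-17, of an 18-macrocell function block. The function name `interleaved_order` is made up for illustration and is not taken from any Xilinx tool or from rqou's code.)

```python
def interleaved_order(n_macrocells=18):
    """Return macrocell indices interleaved as low half / high half.

    Assumption: the bitstream order is exactly 0, 9, 1, 10, ... as quoted
    in the log; nothing here is derived from the .dev files.
    """
    half = n_macrocells // 2
    order = []
    for i in range(half):
        order.append(i)         # macrocell from the lower half
        order.append(i + half)  # its partner from the upper half
    return order


print(" | ".join(str(m) for m in interleaved_order()))
```

The printed line can be compared directly against the sequence quoted above.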
<azonenberg>
oh interesting
<rqou>
well, xc2 has that too for some of the larger parts
<azonenberg>
i wonder if that's because the config logic is in the middle
<azonenberg>
and it goes out up and down the array?
<rqou>
ugh, now time to figure out how the product term allocator works
Bike has quit [Quit: Lost terminal]
<rqou>
ugh i think the product terms are all in a crazy order too
<rqou>
azonenberg: how do i construct a giant many-product-term thing that can't be demorgan'ed?
<sorear>
if you're looking for pathological cases, try xor
<azonenberg>
sorear: with coolrunner you cant do that
<azonenberg>
as they have dedicated xor logic exactly to avoid that
<azonenberg>
i dont think xc9500 does though
<rqou>
it does too
<rqou>
almost all cplds actually do
<rqou>
it appears to me that it's a major missing feature in abc
<rqou>
xpla3 is even better in that it has an entire LUT2 between a sum-of-products and an extra product term
<rqou>
ugh ise really really doesn't want to use the product term cascades
<rqou>
hmm, a big xor doesn't seem to work either
uovo has quit [Ping timeout: 268 seconds]
<rqou>
but i don't understand enough about the interconnect to understand what it did instead
<rqou>
azonenberg: ok, i got it
<rqou>
the naive assign o = (a & b) | (c & d) | (e & f) | (g & h) | (i & j) | (k & l); works :P
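(Editor's sketch, not from the log: a rough count of why this expression is a useful test case. It assumes 5 local product terms per macrocell, the figure rqou mentions for the xc9500 elsewhere in this log, and it only handles a flat sum of products over distinct inputs. `count_product_terms` is a hypothetical helper, not part of any tool.)

```python
EXPR = "(a & b) | (c & d) | (e & f) | (g & h) | (i & j) | (k & l)"
LOCAL_PTERMS = 5  # assumption: local product terms available per macrocell


def count_product_terms(expr):
    """Count the OR-ed terms of a flat sum-of-products string."""
    return len([term for term in expr.split("|") if term.strip()])


needed = count_product_terms(EXPR)
# Terms beyond the local budget must come from a neighbouring macrocell.
borrowed = max(0, needed - LOCAL_PTERMS)

# De Morgan does not help: the complement is a product of six 2-literal
# sums over distinct inputs, which multiplies out to 2**6 product terms.
inverted_terms = 2 ** needed

print(needed, borrowed, inverted_terms)
```

With six terms against a budget of five, at least one term has to be borrowed, and the inverted form is far larger, so the tool cannot dodge the allocator by inverting the output.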
oeuf has joined ##openfpga
<rqou>
azonenberg: does this make sense:
<rqou>
there appears to be a bit in the product term allocator settings that is set in order to export product terms "up"
<rqou>
but it doesn't appear to need to be set when exporting terms "down"
<rqou>
hrm, maybe it can only export one direction?
<azonenberg>
interesting
<azonenberg>
i've never worked with a product term allocator
<rqou>
but overall it wasn't nearly as hard as you made it out :P
<azonenberg>
i've used xc9500* a tiny bit but most of my cpld work, and all of my RE, was coolrunner
oeuf has quit [Ping timeout: 256 seconds]
<rqou>
the interconnect is probably going to be the hardest part
uovo has joined ##openfpga
<rqou>
anyways azonenberg it turns out that more-or-less everything in the schematic in the datasheet is just controlled by (a set of) bits
<azonenberg>
Surprise surprise
<azonenberg>
yeah the interconnect is gonna be fun to do black box
<rqou>
i'm doing it "gray box" :P
<azonenberg>
I did entirely black, other than the comments in the bitfile
<rqou>
yeah, i'm reading the .dev
<rqou>
oh wtf
<rqou>
the product term "stride" seems to be 6
<rqou>
even though there are only 5 product terms per macrocell
<rqou>
hmm it appears the non-xl 9500 has some unnecessary muxes on the global signals
<rqou>
and indeed the xl doesn't have those bits anymore
sgstair_ has joined ##openfpga
sgstair has quit [Disconnected by services]
sgstair_ is now known as sgstair
<rqou>
oh wtf?
<rqou>
azonenberg: xc9500 seems to use 5 bits to control the output tristate
<rqou>
but there are only 7 possible values
<rqou>
it seems
<azonenberg>
maybe some are one-hot?
<azonenberg>
xilinx loves one-hot coding for stuff
<azonenberg>
they're not trying to make small bitstreams
<rqou>
01001 = use GTS3
<rqou>
01110 = use GTS4
<rqou>
01101 = use GTS2
<rqou>
01011 = use GTS1
<rqou>
01111 = use local oe
<rqou>
10111 = always enabled
<rqou>
11111 = output disabled
<rqou>
this doesn't seem to be one-hot
<rqou>
this means that the datasheet lied
<rqou>
it can't actually invert OE
<azonenberg>
interesting
<azonenberg>
so it looks like bit 3 is inverted then ORed with the OE mux
<azonenberg>
or something like that
<azonenberg>
actually no
<azonenberg>
bit 4 = ignore OE mux
<azonenberg>
bit 3 = output disable flag, dontcare if bit 4 isn't set
<azonenberg>
2:0 = dense coded GTS mux setting
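(Editor's sketch, not from the log: azonenberg's field split checked against the seven values rqou listed. It assumes the leftmost character of each pattern is bit 4, and it treats bit 3 as meaningful only when bit 4 is set, as proposed above. The names `OBSERVED`, `MUX_NAMES`, and `decode` are illustrative only.)

```python
# The seven patterns quoted in the log, written MSB (bit 4) first.
OBSERVED = {
    "01001": "GTS3",
    "01110": "GTS4",
    "01101": "GTS2",
    "01011": "GTS1",
    "01111": "local OE",
    "10111": "always enabled",
    "11111": "output disabled",
}

# Bits 2:0 as a dense mux select, read off the table above.
MUX_NAMES = {
    0b001: "GTS3",
    0b110: "GTS4",
    0b101: "GTS2",
    0b011: "GTS1",
    0b111: "local OE",
}


def decode(bits):
    """Decode a 5-bit OE setting under the hypothesised field layout."""
    value = int(bits, 2)
    ignore_mux = (value >> 4) & 1  # bit 4: bypass the OE mux
    disable = (value >> 3) & 1     # bit 3: only consulted when bit 4 is set
    mux_sel = value & 0b111        # bits 2:0: OE mux select
    if ignore_mux:
        return "output disabled" if disable else "always enabled"
    return MUX_NAMES.get(mux_sel, "unknown")


for pattern, meaning in OBSERVED.items():
    print(pattern, decode(pattern), decode(pattern) == meaning)
```

All seven observed patterns decode consistently under this split. It says nothing about the other 25 patterns, which the log does not cover.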
<rqou>
O_o there is a bit here called SAPWRn (sense amp power)
<rqou>
i thought it's 0 = on 1 = off
<rqou>
but it turns out it's 0 = high power 1 = low power
<rqou>
but there's an individual bit for _every_ product term
<rqou>
meaning that you can purposely make fucky impossible-to-analyze timing
<azonenberg>
yes they let you tweak it per pterm
<azonenberg>
theres a constraint for it somewhere
<rqou>
what
<rqou>
are you supposed to be able to do that?
<azonenberg>
yes
<rqou>
hmm
<rqou>
the datasheet implies it's supposed to be per-macrocell
<azonenberg>
the older cplds used analog sense amps, closer to nmos style logic
<azonenberg>
and you could tweak current to trade slew rate against static power
<rqou>
yeah, that makes sense
<azonenberg>
modern stuff is fully static cmos and doesnt have that problem :p
<rqou>
but per-pterm rather than per-macrocell?
<azonenberg>
i dont remember the specifics
<azonenberg>
i wonder if they do that timing driven
<azonenberg>
pterms not on the critical path are slower or something
* sorear
wishes it were easier to find information about this sort of thing
<azonenberg>
if it was easier
<azonenberg>
we wouldn't all be here :p
<sorear>
what I've found online is mostly static CMOS if it covers electrical-level logic design at all
<rqou>
also i love how there are typos in the xilinx internal data files
<azonenberg>
lol i've seen typos all over the place
<azonenberg>
But as you know i try to keep clean from that sort of stuff so i dont poke too much
<rqou>
hmm, there's a big block of config bits that i have no idea what it does
<rqou>
and it's not even in the .dev
<azonenberg>
unused?
<azonenberg>
remember coolrunner has that big hole
<azonenberg>
that afaik does zilch
<rqou>
nope, they change as the design changes
<azonenberg>
i dont think they even have matching sram cells
<azonenberg>
interesting
<azonenberg>
zia?
<azonenberg>
or whatever passes for routing in those things
<rqou>
there's a separate block for that i think?
wpwrak has quit [Read error: Connection reset by peer]
wpwrak has joined ##openfpga
user10032 has joined ##openfpga
soylentyellow has quit [Read error: Connection reset by peer]
soylentyellow has joined ##openfpga
<rqou>
azonenberg: ok, i think you're right
<rqou>
the separate block for the interconnect seems to only control the wire-and part of the matrix
<rqou>
because a) it only has enough bits for that b) the xl version doesn't have that
<rqou>
so the bits that are still not understood are:
<rqou>
* the big block that is related to the interconnect
<rqou>
* "Input paths control bits" whatever that means
<rqou>
* "Mux decoding bits"
<rqou>
and miscellaneous global bits
<rqou>
as well as fuzzing the ordering of pterms
<rqou>
i think overall not bad for the amount of time put in so far
<rqou>
also azonenberg you're right that the sense amp speed is supposed to be individually controllable
<rqou>
according to patent 6038386 they indeed do timing-driven low-power-enabling
futarisIRCcloud has quit [Quit: Connection closed for inactivity]
user10032 has quit [Remote host closed the connection]
rohitksingh_work has quit [Read error: Connection reset by peer]
futarisIRCcloud has joined ##openfpga
ondrej2 has quit [Quit: Leaving]
eduardo_ has joined ##openfpga
eduardo__ has quit [Ping timeout: 260 seconds]
jhol has quit [Quit: Coyote finally caught me]
jhol has joined ##openfpga
m_w has joined ##openfpga
m_w has quit [Read error: Connection reset by peer]
<awygle>
proposal - add some sort of SAT solver into compilers which tries all the possible numeric casts until your arithmetic expression gives the right answer
m_w has joined ##openfpga
m_w has quit [Read error: Connection reset by peer]
wpwrak has quit [Quit: Leaving]
wpwrak has joined ##openfpga
<mithro>
awygle: SGTM! shipit!
<mithro>
daveshah: Morning!
<mithro>
daveshah: Will look at your pull request shortly
<daveshah>
mithro: Thanks! Sorry must have missed that PR. afk this evening, don't know if I'll get a chance to rebase mine or not
<mithro>
daveshah: Okay
<mithro>
daveshah: I might rebase for you (since you gave maintainer permission to push to your branch :-) and see if it works
<daveshah>
Sure, give it a go :) . I think the only conflict will be the folder renaming
<rqou>
azonenberg: ping?
<rqou>
do you have any idea how they managed to make the entire input mux (i assume these bits i'm looking at are the input mux) be controlled by only 15 bits?
<rqou>
they appear to be grouped in 3 groups of 5 (and the xl version only has 2 groups of 5)
<rqou>
i'm not familiar enough with SAT solvers to try reimplementing it though
<pie___>
pff what do you mean you havent read all of knuth
<rqou>
pie___: "i'm the hardware guy" :P
<rqou>
(to real EEs): "i'm the software guy" :P
<pie___>
:D
<rqou>
i wonder if awygle does this too :P :P
<pie___>
we are half blood princes :x
* pie___
isnt actually a harry potter fanatic
<awygle>
ha! actually i tend to do the opposite
<awygle>
To hardware guys: "I'm one of you!"
<pie___>
xD
<awygle>
To software guys: "I'm one of you!"
<awygle>
From bosses: "Have twice as much work!"
<rqou>
lool
<awygle>
(this strategy may require some modification in the months ahead)
<awygle>
actually i'm more and more realizing how much more stressful software is than hardware. i should find a hardware job and only do software for fun....
<pie___>
qu1j0t3, ^
<mithro>
I'm in the software camp :-P
<rqou>
hmm, so right now i've got a decent chunk of the xc9536 bitstream understood
<rqou>
should i:
<pie___>
rqou, yes
<rqou>
a) finish fuzzing all the bits
<rqou>
b) write up what I have sanely
<rqou>
which should i do first?
<pie___>
probably (a) and keep notes?
<awygle>
i would write what you have but not publish it, then finish fuzzing, add that, then publish
<rqou>
why not publishing?
<awygle>
which diagonalizes to what pie___ said
<mithro>
rqou: Do you have enough bits that you could do *something* even if you didn't have the remaining bits?
<rqou>
maybe
<awygle>
because i'm a perfectionist :P
<rqou>
there's some missing information for feedback paths and for IOs
<pie___>
idk anything but, too much effort to do a proper writeup, get the rest of the picture and maybe then $OVERARCHING_FRAMEWORK++?
<mithro>
rqou: I would aim for getting *something* that can be used
<rqou>
oh, i'm not intending to write any synthesis tools for this
<rqou>
this is just so that people can RE extracted bitstreams
<pie___>
well unless you decrease someone elses workload by publishing early
<rqou>
it does
<rqou>
quite a bit is understood already
<pie___>
i mean, how long would it take you to finish
<rqou>
probably a few more days
<rqou>
but i have to go to class right now
<pie___>
idk, that doesnt seem too bad
<mithro>
rqou: What is your aim with doing the xc9536 effort?
<rqou>
so that people like MAME can RE bitstreams they get their hands on
<pie___>
rqou, if nothing else drop a line somewhere "contact me for info on ___ otherwise wait a bit for me to do a proper writeup"
<pie___>
but then actually finish and do the writeup UNLIKE ALL THOSE OTHER PEOPLE
<rqou>
hey, the coolrunner-ii writeup is technically "done"
<mithro>
rqou: Do you have any internal details of that part? It seems to be a CPLD?
<rqou>
yes, it's a CPLD
<rqou>
was pretty popular
<pie___>
alternatively, do whatever you have energy for
<rqou>
some weird people who are afraid of level translators still use it because it's a 5V cpld
<awygle>
do the thing that brings you maximum reward per calorie
<rqou>
lol
<awygle>
life is a lazy hill-climbing algorithm
<rqou>
ok, i think i'll start writing up what i have so far
<mithro>
rqou: Do you have some info on the internal cells?
* pie___
attempts to watch an episode of nichijou for motivation
<awygle>
semi-related, has anyone else ever wished you could sort restaurant menus like SQL databases?
<rqou>
mithro: what do you mean?
<rqou>
no, i have no die photos nor sem photos if that's what you're asking
<rqou>
this is all "gray" box
<awygle>
i am continually frustrated by the existence of two anime called nichijou
<mithro>
rqou: No - does it internally have like LUTs or NAND or etc?
<rqou>
that's all documented publicly
<mithro>
rqou: Link me?
<rqou>
it's a traditional sum-of-products
<awygle>
dumb git question - i did a rebase, i made all my changes, do i want to "git commit" normally, "git commit --amend", or "git rebase --continue" now?
<rqou>
well, traditional sum-of-products with product term allocating/steering/stealing
<awygle>
mithro: "git status" lists all three of those options
<rqou>
alright, really need to go to class now brb
<awygle>
i don't know what "once you are satisfied with your changes" means. i'm satisfied with them, and i want them to continue to exist rather than being snuffed out by a capricious god
<qu1j0t3>
awygle: the changes were conflict fixes?
<awygle>
qu1j0t3: not really. i just removed a bunch of lines that were changed in a non-functional way from the diff so it could be code reviewed more easily.
<qu1j0t3>
awygle: generally you add the modifications then continue the rebase
<awygle>
qu1j0t3: i did "git reset @~" to get it into a state where i could actually edit it, do i need to re-commit it?
<qu1j0t3>
check git diff --staged
<qu1j0t3>
if the changes are shown there, theyre in the index, and you can just continue the rebase
<awygle>
mk
<qu1j0t3>
during rebase you don't `commit` yourself.
<qu1j0t3>
rebase will do it.
<qu1j0t3>
you only need to make sure they're added to the index (staged)
<qu1j0t3>
if they aren't there, then git add them (or git add -p)
<awygle>
okay, thank you
<mithro>
rqou: It would be useful to have some very *simple* FPGAs supported in arch-defs
<awygle>
lmao "you have uncommitted changes in your working tree please commit them first"
<mithro>
rqou: But I'm guessing that the difference between LUT and sum-of-product logic devices is probably big enough that it doesn't make sense here
<rqou>
no it probably doesn't
<qu1j0t3>
awygle: Ah yes that will be a problem.
<rqou>
also you know we have Coolrunner-II fully done right?
<qu1j0t3>
awygle: i thought you were already in a rebase?
<awygle>
i fix. that was annoying.
<qu1j0t3>
k
<awygle>
qu1j0t3: i was, i think the git reset is why i needed to commit
<mithro>
rqou: That is also a sum-of-product too though?
<awygle>
i undid the commit, i needed to redo it
<qu1j0t3>
awygle: rebase -i can be very useful
<awygle>
i did do rebase -i but i wanted to be able to see the changes that had been made last time
<mithro>
the Lattice iCE40 actually _is_ a pretty good example given how regular / simple it is
<awygle>
without reset @~ it just said "working directory clean"
<rqou>
mithro: yes, but it has a fully programmable OR array
<mithro>
Just not really small enough
<rqou>
it's actually simpler than xc9500
<mithro>
rqou: Yeah - I assume it was simpler than the xc9500 - I was hoping that maybe xc9500 was a very simple LUT based device
<rqou>
that would be the 4000
<rqou>
which is so old and obsolete you shouldn't bother
<awygle>
mithro: the 384 ice40s seem like they're the smallest/simplest you're likely to get
<mithro>
awygle: Yeah probably
<mithro>
awygle: I guess I'm after a "useless device" which is a LUT device which is about ~10-20 cells which are all identical with single LUT+FF in the cell
<pie___>
mithro, can you fit that muhh into a literal LUT? program an eeprom
<mithro>
muhh?
<pie___>
nvm
<mithro>
There is just a *huge* gap between my "test" architectures and architectures I can generate a real bitstream for
digshadow has quit [Ping timeout: 256 seconds]
<rqou>
hmm, I have a weird hypothesis for how the xc9500 interconnect mux bits might work
<rqou>
maybe each set of five bits controls the muxes for a set of mux lines?
<rqou>
so there are three sets of twelve, and each set shares the mux config
<rqou>
does that make any sense?
<awygle>
sounds reasonable
<cyrozap>
rqou: If I were you, I'd just post all my notes and code as-is into a public git repo, then clean it up later.
<cyrozap>
It's really easy to fall into the trap of "oh, I'll publish all that once it's ready" and then never feel like it's "ready" enough and so it never gets published.
<cr1901_modern1>
Contrary to that tweet, I thought the XC2064 had 3-input LUTs. A pnr/bitstream generator for these would be fun as a joke
<cr1901_modern1>
digshadow-c knows someone who needs an XC2064 bitstream analyzer if memory serves
<cr1901_modern1>
(Also, why won't my client let me change my nick back ._.)
cr1901_modern1 has quit [Quit: Leaving.]
cr1901_modern has joined ##openfpga
<qu1j0t3>
cyrozap: +1
<rqou>
sgstair: um, that's a PAL
<rqou>
not lut-based
<rqou>
although PAL support for yosys might happen because it's been requested by robert baruch
<rqou>
amazingly, PALs are easy again because it has so _few_ features
<rqou>
so Coolrunner-II --> pretty much any design that doesn't exceed capacity should work (with exceptions)
<rqou>
PALs --> pretty much any design that uses more than basic features can't work anyways
<sgstair>
yeah
<rqou>
xc9500(xl) --> oh god so many weird limitations
<awygle>
rqou: remind me again why SAT/SMT didn't work for Coolrunner?
<rqou>
the naive way of encoding it is too big and doesn't give enough hints to the tool on how to exploit symmetry
user10032 has joined ##openfpga
<rqou>
awygle: consider as a simplification that you just use backtracking search rather than a modern sat solver
<rqou>
the algorithm will spend tons of time e.g. swapping p-terms around
<rqou>
however, it doesn't matter where you put a p-term inside any given FB, because all FB inputs are accessible by all p-terms and all p-term outputs are accessible by all or gates
<rqou>
so this swapping is useless
<rqou>
suppose we fixed this
<rqou>
then the solver will spend all its time e.g. assigning each pterm to each FB
<rqou>
with 16^54 possible choices
<rqou>
in the largest part
user10032 has quit [Quit: Leaving]
<rqou>
but that's also useless, because the p-terms need to be in the same FB as the macrocells
<rqou>
etc. etc.
<rqou>
and SAT solvers can't automagically discover this
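(Editor's sketch, not from the log: a rough feel for the first symmetry rqou describes. If every p-term slot inside a function block is interchangeable, then every way of placing k distinct p-terms into the slots is the same solution, yet a naive search treats each placement as different. The figure of 56 slots per function block is the editor's assumption for Coolrunner-II, not something stated in the log, and `redundant_placements` is a made-up name.)

```python
import math

SLOTS_PER_FB = 56  # assumption: p-term slots in one Coolrunner-II function block


def redundant_placements(slots, used):
    """Number of slot assignments a naive search would tell apart.

    Counts ordered placements of `used` distinct p-terms into `slots`
    positions. Under the symmetry described in the log, all of them are
    functionally identical, so this is pure wasted search space.
    """
    return math.perm(slots, used)


for used in (1, 2, 5, 10):
    print(used, redundant_placements(SLOTS_PER_FB, used))
```

This covers only the within-block swapping. The cross-block assignment problem rqou raises next is a separate source of blow-up and is not modelled here. `math.perm` requires Python 3.8 or later.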
<awygle>
hmm. seems like you should be able to do it with SMT but my theoretical background is too weak to say for sure
<rqou>
so that's how i arrived at the current mostly-greedy-with-min-conflicts algorithm
<awygle>
*shrug* C is for cookie
<rqou>
hrm, maybe SMT can do it
<rqou>
but it's not clearly any less work
digshadow has joined ##openfpga
<digshadow>
cr1901_modern: yes, it is me
<digshadow>
FWIW I did document the bitstream format a while back
<digshadow>
but I'd need to dig up the notes
<cr1901_modern>
How'd you generate the bitstreams? You made a Windoze 3.1 VM :P?
<cr1901_modern>
s/the/test/
<awygle>
is it important to put e.g. oscilloscopes or spectrum analyzers on/in an ESD controlled environment?