<azonenberg> So using this methodology, I get...
<azonenberg> Input buffer delay = 0.562
<azonenberg> Crossbar delay = 10.531
<azonenberg> Schmitt trigger delay = 0.937
<azonenberg> Output buffer delay = 6.408
<rqou> that's a really slow obuf/crossbar
<azonenberg> i think the ibuf is slower than that but it's a start
<azonenberg> that's also with x1 drive
<azonenberg> with x2 it's half that
<azonenberg> ... i should test x4 drive for lulz
<rqou> hmm are obufs normally this slow?
<azonenberg> some of the pins have x4
<azonenberg> On the coolrunner, which is generally a faster chip
<rqou> i see 1.2ns for a 32a-6 Tout
<azonenberg> a lvcmos33 input buffer is 1.8 ns, a schmitt trigger is another 3 on top of that
<azonenberg> an output buffer is 2.8
<azonenberg> i'm measuring the schmitt trigger as way faster
<azonenberg> you know, i should look at how many mV of hysteresis it provides
<azonenberg> that should hint a bit
<rqou> oh but then the 32a Tslew33 is 5ns :P
<rqou> which is more equivalent to a x1
<rqou> i guess io pads are just slow
<azonenberg> 32a is 500 mV hysteresis
<azonenberg> A greenpak is about the same
<azonenberg> 0.47 nominal
<azonenberg> anyway, i'll use these values until i can get more accurate data from somewhere :p
<azonenberg> Next step is to see if they're fairly consistent across multiple paths
pie_ has quit [Ping timeout: 240 seconds]
<azonenberg> Hmmm interesting
<azonenberg> schmitt trigger delay adder is not consistent across paths o_O
<azonenberg> it varies by nearly 50%
<azonenberg> Which means my computed input buffer delays are unstable too
<azonenberg> :(
Zarutian has quit [Quit: Zarutian]
<azonenberg> rqou: see above
<azonenberg> this is data based on those assumptions
<azonenberg> note the range in variation of schmitt trigger delays
<rqou> that doesn't seem right
<rqou> this is a result of the 1x vs 2x assumption?
<azonenberg> well
<azonenberg> not exactly
<azonenberg> the other data is
<azonenberg> the schmitt is just subtracting
<azonenberg> pin x to y with x2 drive on output and no schmitt on input
<azonenberg> vs pin x to y with x2 drive on output and schmitt on input
<azonenberg> same crossbar route, same output driver
<azonenberg> bitstreams are literally one bit off
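The subtraction described above can be sketched in C++ as follows. This is only an illustration: `DelayAdder` and `RelativeSpread` are hypothetical names, and the spread definition, (max - min) / max, is an assumption, since the log does not say how the "nearly 50%" figure was computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: extra delay contributed by a single feature (here the
// input Schmitt trigger), taken as the difference between two measurements of
// the same path whose bitstreams differ only in that feature.
double DelayAdder(double delayWithFeature, double delayWithoutFeature)
{
	return delayWithFeature - delayWithoutFeature;
}

// One possible way to quantify path-to-path variation: (max - min) / max.
// Expects a non-empty list of positive adders.
double RelativeSpread(const std::vector<double>& adders)
{
	assert(!adders.empty());
	const auto mm = std::minmax_element(adders.begin(), adders.end());
	return (*mm.second - *mm.first) / *mm.second;
}
```

Computing one adder per pin pair and passing the list to `RelativeSpread` gives a single number to compare across dies.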
<rqou> that seems totally nuts
<azonenberg> It does :p
<azonenberg> next step
<azonenberg> try with another die
<azonenberg> see if it's at all consistent :p
<rqou> i want to see some process analysis of a greenpak now
<rqou> what fab is it on?
<azonenberg> TSMC 180 nm
<azonenberg> I decapped a 46140 and 46620 and had digshadow do top metal and delayered images
<azonenberg> Not sharing publicly b/c i dont want to upset silego
<azonenberg> i get special treatment so i respect that
<rqou> wait this is 180nm?
<rqou> seems really slow for that
<rqou> er actually
<rqou> the crossbar seems really slow (compared to coolrunner-ii)
<rqou> but the schmitt trigger seems way too fast
<azonenberg> So
<azonenberg> The coolrunner ZIA is a very efficient routing
<rqou> coolrunner's schmitt trigger is a whopping 4ns of delay
<azonenberg> the greenpak matrix is a *full* crossbar
<azonenberg> every input can route to every output
<azonenberg> it's massive
<azonenberg> rqou: see PM
gameredan has joined ##openfpga
m_w has quit [Quit: leaving]
<mtp> CP/M? never between meals
<qu1j0t3> bwahahahha
amclain has quit [Quit: Leaving]
_whitelogger has joined ##openfpga
DocScrutinizer05 has quit [Disconnected by services]
DocScrutinizer05 has joined ##openfpga
<azonenberg> Oook so, architecture question
<azonenberg> whitequark, lain, rqou: i want you guys' input on this
<lain> avoid converging lines
<azonenberg> :P
<azonenberg> So, i'm starting to collect data on the slg46620
<azonenberg> Trying to figure out how this should fit in the data model
<azonenberg> the two main options are, add fields to every Greenpak4BitstreamEntity-derived class for calculating delays
<azonenberg> or create a separate parallel data structure for timing measurements
<azonenberg> the former seems to make more sense as i wouldn't have to re-create the whole interconnect model twice
<azonenberg> once for routability and once for timing
<azonenberg> i'm only doing combinatorial delays to start but will have to do setup, hold, clock-to-out, etc down the road
<azonenberg> (and i'm only doing rising edge delays for now)
<lain> hmmm
<azonenberg> Keep in mind, libxbpar needs a way to query timing properties of various paths in the netlist
<azonenberg> Eventually, this will require clock constraints and such
<azonenberg> as well as some metadata to show what's a combinatorial vs a sequential path
<azonenberg> which cells are registers
<azonenberg> etc
<lain> would it be simpler to add a pure virtual method like Greenpak4BitstreamEntity::getTiming(int what), where the arg is actually from a device-specific enum?
<lain> then the derived classes can implement that
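A rough C++ sketch of the pure-virtual approach suggested here might look like the following. `EntitySketch`, `IOBSketch` and the `TimingKind` enum are hypothetical stand-ins, not the real libgreenpak4 classes, and the returned numbers are placeholders rather than measured delays.

```cpp
#include <cassert>

// Hypothetical device-specific enum; the real set of measurements is not
// defined in the discussion.
enum TimingKind
{
	TIMING_INPUT_BUFFER,
	TIMING_OUTPUT_BUFFER
};

// Stand-in for a bitstream entity base class (not the real libgreenpak4 class).
class EntitySketch
{
public:
	virtual ~EntitySketch() {}

	// "what" is a value from the device-specific enum above.
	virtual double getTiming(int what) const = 0;
};

// Example derived class returning placeholder numbers, not measured data.
class IOBSketch : public EntitySketch
{
public:
	virtual double getTiming(int what) const override
	{
		switch(what)
		{
			case TIMING_INPUT_BUFFER:	return 1.0;
			case TIMING_OUTPUT_BUFFER:	return 2.0;
			default:					return 0.0;
		}
	}
};
```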
<azonenberg> So, my thought was
<azonenberg> there would be methods to getSetup(pin name)
<azonenberg> getHold(pin name)
<azonenberg> getPropagation(srcpin, destpin)
<azonenberg> But the default data model would be for the bitstream entity to just have a map<pin, TimingValue> for each measurement
<azonenberg> where a TimingValue would be a set of DelayPair objects at fast/slow and possibly typical P/T/V
<lain> ah ok
<azonenberg> And a DelayPair is rising/falling combinatorial edge delays
<azonenberg> i'll figure out setup/hold in a bit
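The data model outlined above (a map from pins to TimingValue, each holding DelayPair objects per corner) could be sketched roughly like this. Only `DelayPair`, `TimingValue` and the getPropagation idea come from the discussion. The `Corner` enum, the class name `TimedEntitySketch`, the use of pin-name strings as keys and the out-parameter signature are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Rising/falling combinatorial edge delays (units assumed to be ns).
struct DelayPair
{
	double rising;
	double falling;
};

// Hypothetical corner labels for process/temperature/voltage extremes.
enum class Corner { Fast, Typical, Slow };

// One measurement: a DelayPair per characterized corner.
struct TimingValue
{
	std::map<Corner, DelayPair> corners;
};

// Stand-in for a bitstream entity carrying its own timing data.
class TimedEntitySketch
{
public:
	void SetPropagation(const std::string& src, const std::string& dst, Corner c, DelayPair d)
	{
		m_propagation[std::make_pair(src, dst)].corners[c] = d;
	}

	// Returns true and fills "out" if the path/corner has been characterized.
	bool GetPropagation(const std::string& src, const std::string& dst, Corner c, DelayPair& out) const
	{
		auto path = m_propagation.find(std::make_pair(src, dst));
		if(path == m_propagation.end())
			return false;
		auto corner = path->second.corners.find(c);
		if(corner == path->second.corners.end())
			return false;
		out = corner->second;
		return true;
	}

private:
	// (source pin, destination pin) -> delays; setup/hold maps keyed by a
	// single pin name would follow the same pattern.
	std::map<std::pair<std::string, std::string>, TimingValue> m_propagation;
};
```

Keeping the maps inside the entity matches the "add fields to the class" option, so the interconnect model does not have to be duplicated for timing.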
<azonenberg> I also have to figure out how to post-process the data
<azonenberg> gp4tchar will characterize one die at one temperature
<azonenberg> (although eventually across voltage corners)
<azonenberg> it may eventually interface to a peltier as well
<azonenberg> but you will have to manually swap dies in and out of the zif / header either way
<azonenberg> And so this will require some way to merge multiple timing datasets
<azonenberg> basically you'll load a timing data file
<azonenberg> and it'll collect new data
<azonenberg> then do fast = min(previous fast, new value) and slow = max(previous slow, new value) for each delay value
<lain> yeah
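The merge rule just described (fast = min, slow = max) can be sketched in C++ as below. `DelayRange`, `TimingDatabase` and `MergeMeasurement` are hypothetical names, and keying the database by a plain string is an assumption. The first measurement for a key seeds both bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Fastest and slowest value seen so far for one delay (units assumed ns).
struct DelayRange
{
	double fast;
	double slow;
};

// Hypothetical in-memory form of a loaded timing data file.
typedef std::map<std::string, DelayRange> TimingDatabase;

// Folds one new measurement into the database:
// fast = min(previous fast, new value), slow = max(previous slow, new value).
void MergeMeasurement(TimingDatabase& db, const std::string& name, double value)
{
	auto it = db.find(name);
	if(it == db.end())
	{
		// First time this delay is seen: both bounds start at the measurement.
		db[name] = DelayRange{value, value};
		return;
	}
	it->second.fast = std::min(it->second.fast, value);
	it->second.slow = std::max(it->second.slow, value);
}
```

Because min and max are order-independent, dies can be characterized in any order and the merged bounds come out the same.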
<azonenberg> Thinking of having YAML as the serialization format as i'm already using that heavily
<azonenberg> sorry i meant JSON
<lain> I think adding the fields to the class makes more sense, rather than having a separate data structure
<azonenberg> since i have a json library pulled in as a dependency already
<azonenberg> Yeah
<azonenberg> Ok so, i guess at this point i'll try actually poking the device to load the timing values
<lain> yeah I use json for everything nowadays too, I still have some xml stuff but I'd like to move it over at some point, particularly once I find a good json schema parser :P
<azonenberg> i use yaml for human authored data
<azonenberg> and json for machine authored
<azonenberg> since yaml is easier to write and has comments
<openfpga-github> [openfpga] azonenberg pushed 6 new commits to master: https://git.io/vHuwk
<openfpga-github> openfpga/master 0261e03 Andrew Zonenberg: solver: Added support for coefficients other than 1
<openfpga-github> openfpga/master 408f4fa Andrew Zonenberg: Fixed signed/unsigned warning
<openfpga-github> openfpga/master fd146b7 Andrew Zonenberg: Greenpak4IOB: Added SetSchmittTrigger
<rqou> azonenberg: will future timing-driven placement be more advanced than the current single "ComputeTimingCost"?
<azonenberg> rqou: Quite possibly
<azonenberg> For starters, the timing cost will be the sum (appropriately scaled) of amounts by which FF-to-FF paths fail timing
<azonenberg> added to the sum of pin-to-pin combinatorial delays
<azonenberg> with some scaling factor
<azonenberg> That may be enough for the existing optimizer to find a good result
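As a minimal sketch of the cost function described above: sum the amounts by which FF-to-FF paths miss their requirement, add the sum of pin-to-pin combinatorial delays, and scale each term. The struct `RegisterPath`, the function name and the two scale parameters are hypothetical, and this is not the actual ComputeTimingCost implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One register-to-register path: its actual delay and the delay it must meet.
struct RegisterPath
{
	double delay;
	double required;
};

// Hypothetical timing cost: scaled sum of timing violations on FF-to-FF paths
// plus scaled sum of pin-to-pin combinatorial delays.
double TimingCostSketch(
	const std::vector<RegisterPath>& ffPaths,
	const std::vector<double>& combDelays,
	double failScale,
	double combScale)
{
	double failSum = 0;
	for(const RegisterPath& p : ffPaths)
		failSum += std::max(0.0, p.delay - p.required);	// paths that pass add nothing

	double combSum = 0;
	for(double d : combDelays)
		combSum += d;

	return failScale * failSum + combScale * combSum;
}
```

Choosing `failScale` much larger than `combScale` would make the optimizer fix failing paths before shaving combinatorial delay, but the right ratio is left open in the discussion.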
<azonenberg> But before i do timing driven placement i'm going to do post-par static timing
<azonenberg> lain: ok, the plot thickens
<azonenberg> Greenpak4BitstreamEntity is a libgreenpak4 class
<azonenberg> which isn't directly understood by libxbpar
<azonenberg> now, this isnt a huge problem
<azonenberg> because Greenpak4PAREngine can override PAREngine::ComputeTimingCost()
<azonenberg> But it means that if we add timing driven placement to e.g. coolrunner-2
<azonenberg> we have to reimplement more than if it was in xbpar
<azonenberg> Thoughts?
<rqou> i'm not actually sure what would be required in either case (gp4 or xc2)
<rqou> hmm
<azonenberg> rqou: so my plan was to first do techmapping
<rqou> so right now the PAR seems to be completely unaware of any "actual hardware stuff"
<rqou> e.g. it doesn't know about FFs vs combinatorial or setup/hold times
<azonenberg> Correct
<rqou> would that need to change?
<azonenberg> it's a pure graph based router right now
<azonenberg> Not necessarily
<azonenberg> xbpar basically tries to map one graph onto a subset of another
<azonenberg> while minimizing a virtual cost function
<azonenberg> and given restrictions like, nodes must be placed at compatible sites
<azonenberg> I think all of that stuff would have to be implemented in the timing analyzer
<azonenberg> Which, i think, will initially be implemented in gp4par
<azonenberg> then gradually refactored out into a separate library once it's a bit better developed
<azonenberg> at which point it'd be nicely packaged up for cr2 to use
<azonenberg> lain: also, i think crossbar delays will have to be in Greenpak4Device
<azonenberg> Greenpak4BitstreamEntity only cares about pin to pin delays for various configurations
<rqou> gp4 and cr2 aren't actually that similar at all
<azonenberg> Yeah but i think the par is generic enough
<azonenberg> i was going to have the cr2 cell library have a cell for a PLA AND gate
<azonenberg> a cell for a PLA OR gate
<azonenberg> a cell for a macrocell XOR
<azonenberg> a cell for a macrocell FF
<azonenberg> etc
<rqou> what about "muck around with PLA logic synthesis and try again?"
<rqou> or can this be done in a purely feed-forward way?
<azonenberg> Dont know yet
<azonenberg> i will almost certainly have to replicate pterms
<azonenberg> if they need to be used in multiple FBs
<azonenberg> But i dont know how/when to do that yet
<azonenberg> With greenpak it was easy b/c there was no pterm sharing
<azonenberg> one lut = one lut
<azonenberg> and yosys did the techmapping
<azonenberg> the only techmapping i had to do was things like, when you use a gp_vref you need to attach it to a gp_acmp
<azonenberg> so if there's not one in use, infer one
<azonenberg> etc
<rqou> oh btw i'm looking into playing with getting cr2par to exist right now
<rqou> nothing actually exists yet though
<azonenberg> :)
<azonenberg> Awesome
<rqou> i noticed your code doesn't seem to be very const-correct
<azonenberg> If you mean i underuse consts
<azonenberg> yes
<azonenberg> if you see a const function feel free to tag it as such, within reason
<rqou> also wtf you made another submodule
<rqou> i hate submodules
<azonenberg> Lol
<azonenberg> well the alternative is to require the end user to manually install a dozen separate libs
<azonenberg> Or to copy the code into every repo
<azonenberg> neither scales well
<azonenberg> actually i need to pass voltage to the timing functions too
<azonenberg> because unlike coolrunner
<azonenberg> greenpak needs timing characterized at multiple voltages
<openfpga-github> [openfpga] azonenberg pushed 1 new commit to master: https://git.io/vHuos
<openfpga-github> openfpga/master 396c094 Andrew Zonenberg: Added PTVCorner. Added initial timing functions to Greenpak4BitstreamEntity. No serialization yet.
<rqou> azonenberg: is label_names passed around just for debug printing?
<openfpga-github> [openfpga] rqou opened pull request #90: xbpar: Make a bunch of things const (master...rqou-xbpar-const) https://git.io/vHuoW
<azonenberg> rqou: i believe so but i'll have to check
<azonenberg> sorry i'm working on other stuff right now
<azonenberg> ok so re your doc PR
<azonenberg> where do we stand
<azonenberg> did you fix the things i mentioned earlier?
<azonenberg> I wanna get that out of the way
<rqou> shit
<azonenberg> Work on that while i review #90
<rqou> what was wrong with it other than a copypasta error?
<openfpga-github> openfpga/master c87f1bc Andrew Zonenberg: Added calibration device to gp4tchar. Added function to Greenpak4Device to print timing data (not yet implemented)
<openfpga-github> [openfpga] azonenberg pushed 1 new commit to master: https://git.io/vHuog
<azonenberg> i think just that
<azonenberg> fix that and i'll re-review
<openfpga-github> [openfpga] azonenberg closed pull request #90: xbpar: Make a bunch of things const (master...rqou-xbpar-const) https://git.io/vHuoW
<rqou> alright i updated the doc
<rqou> somehow it doesn't ping IRC though if you force-push to a PR
<whitequark> no it's just flaky
<rqou> ah ok
<rqou> oh btw random question
<openfpga-github> [openfpga] azonenberg pushed 2 new commits to master: https://git.io/vHuoD
<openfpga-github> openfpga/master 77474e2 Andrew Zonenberg: Merge pull request #88 from rqou/unused_attrib...
<openfpga-github> openfpga/master ae5de4a Robert Ou: doc: Documented UNUSED_DRIVE and UNUSED_PULL attributes...
<rqou> when i was running pdflatex manually to see how things look, how come a bunch of figures just have ?? instead of a number?
<rqou> this doesn't happen when I run make
<openfpga-github> [openfpga] azonenberg pushed 2 new commits to master: https://git.io/vHuoy
<openfpga-github> openfpga/master 3313caf Andrew Zonenberg: gp4tchar: Now print timing data at end (for debugging)
<openfpga-github> openfpga/master c01bb48 Andrew Zonenberg: Merge branch 'master' of github.com:azonenberg/openfpga
<azonenberg> rqou: did you do it twice?
<azonenberg> latex needs to make two passes over the doc
<rqou> no, why do i need to do that?
<rqou> wait what
<azonenberg> the first pass it figures out where all of the labels go
<rqou> how come i never needed to do that before?
<azonenberg> the second pass it resolves them based on the index it made the first time
<azonenberg> this is always the case
<azonenberg> when you involve bibtex it gets even more fun
<rqou> i've never encountered problems with this in the past
<azonenberg> um
<azonenberg> If you only have back-references
<rqou> i probably wasn't using auto-numbering
<azonenberg> i think it works one pass
<azonenberg> i forget if you only reference things after they're declared
<azonenberg> it might work one pass
<azonenberg> but to work reliably you need two passes for sure
<rqou> wait is this why random files in cwd can taint/confuse/break it?
<azonenberg> i think it's the .toc and .aux files
<azonenberg> that it uses
<azonenberg> but i am not super familiar with tex internals
<rqou> wtf
<rqou> why can't it figure out that it needs to reparse the document by itself?
<azonenberg> Good question
<azonenberg> probably because it's always been that way and nobody wants to try to fix it? :P
<azonenberg> i use tex b/c the output is better than any other tool
<rqou> and this is why i just give up and use wysiwyg text editors
<azonenberg> not because the ux is nice
eduardo__ has joined ##openfpga
jn__ has quit [Ping timeout: 260 seconds]
eduardo_ has quit [Ping timeout: 268 seconds]
<rqou> huh azonenberg you use merge-based git workflows?
<azonenberg> rqou: yeah, why not?
<rqou> i personally prefer rebase-based workflows
<azonenberg> i'd rather keep the lineage of the code obvious
<azonenberg> and not make it look like one giant commit
<rqou> you don't have to squash commits when rebasing
<azonenberg> if you dont, what's the benefit?
<rqou> i usually make a giant mess and then optionally use rebase to refactor into clean meaningful cummits
<rqou> *commits
<rqou> and since i'm rebasing anyways, might as well rebase onto master
<openfpga-github> [openfpga] azonenberg pushed 2 new commits to master: https://git.io/vHu6p
<openfpga-github> openfpga/master 0c9cd72 Andrew Zonenberg: Refactoring: Greenpak4BitstreamEntity::GetDescription() is now const
<openfpga-github> openfpga/master 5da42e2 Andrew Zonenberg: Continued work on printing of timing data
<azonenberg> i prefer to just work atomically and make small commits as i go
<rqou> azonenberg: um... what's your normal tab width?
<azonenberg> one \t is four columns
<azonenberg> Spaces shall not be used for indentation
<rqou> why do you use tabs btw?
<azonenberg> a) force of habit, that's how visual studio '98 did it
<azonenberg> Which is what i mostly learned on
<azonenberg> b) it makes more sense semantically
<azonenberg> one \t is one level of indentation
<azonenberg> you can render it however you want, although i find four columns makes more concise code than eight (you can have a few levels of indentation and still nice long lines)
<azonenberg> So i typically write things to look good with 4-column tabs
jn__ has joined ##openfpga
<openfpga-github> [yosys] azonenberg pushed 4 new commits to master: https://git.io/vHuML
<openfpga-github> yosys/master 0290b68 Clifford Wolf: Update ABC to hg rev efbf7f13ea9e
<openfpga-github> yosys/master e7a984a Clifford Wolf: Add dff2ff.v techmap file
<openfpga-github> yosys/master c365e33 Clifford Wolf: Fix AIGER back-end for multiple symbols per input/latch/output/property
<azonenberg> Ok, i have to figure out how to represent crossbar delays still
<azonenberg> But i'm getting there
<rqou> wait wtf i just checked my jenkins dashboard
<rqou> stuff be broken
<rqou> on both windows and mac
<openfpga-github> [openfpga] azonenberg pushed 1 new commit to master: https://git.io/vHuM3
<openfpga-github> openfpga/master ba0213d Andrew Zonenberg: Now collecting IOB characterization data and storing it in the IOB object. Not serializing yet
<azonenberg> Did i break something?
<azonenberg> i did notice the Ethernet test failing to PAR with some of the recent changes to the optimizer
<azonenberg> i think it was borderline fitting and now doesnt
<azonenberg> need to investigate why
<rqou> nah it's stupid platform shit that i don't care about at this moment: https://github.com/azonenberg/xptools/issues/1
<rqou> xptools isn't very cross-platform is it :P
<azonenberg> rqou: i wrote it years ago and it's possible that i missed an include file during refactoring?
<azonenberg> But now that there's an issue open i'll investigate
<azonenberg> (when i get a chance)
<azonenberg> This data is now being stored in the Greenpak4Device class (though not persisted to a file yet)
azonenberg_work has quit [Ping timeout: 240 seconds]
jn__ has quit [Ping timeout: 240 seconds]
jn__ has joined ##openfpga
pie_ has joined ##openfpga
<openfpga-github> [openfpga] azonenberg pushed 1 new commit to master: https://git.io/vHuQP
<openfpga-github> openfpga/master 1e2ebe3 Andrew Zonenberg: Minor message reformatting. Added file missing from last commit
Finnpixel has joined ##openfpga
egg|egg has quit [Quit: moo.]
oeuf has joined ##openfpga
pie_ has quit [Ping timeout: 268 seconds]
pie_ has joined ##openfpga
pie_ has quit [Ping timeout: 260 seconds]
gameredan has quit [Quit: leaving]
Zarutian has joined ##openfpga
Zarutian has quit [Read error: Connection reset by peer]
Zarutian has joined ##openfpga
wpwrak has quit [Read error: Connection reset by peer]
wpwrak has joined ##openfpga
pie_ has joined ##openfpga
azonenberg_work has joined ##openfpga
gameredan has joined ##openfpga
<azonenberg_work> rqou: around?
amclain has joined ##openfpga
azonenberg_work has quit [Ping timeout: 260 seconds]
azonenberg_work has joined ##openfpga
m_t has joined ##openfpga
m_w has joined ##openfpga
nrossi has quit [Quit: Connection closed for inactivity]
m_w has quit [Quit: leaving]
<rqou> azonenberg_work: i am now
<azonenberg_work> rqou: so i did more math, it looks like i can basically reduce most of the timing unknowns down to a single ratio between input and output buffer delays
<azonenberg_work> i'm gonna do a bit more experimenting and guesstimate it
<azonenberg_work> then try to do a more accurate assessment with probing
<azonenberg_work> I'm going to assume that the crossbar has zero sideways propagation delay
<azonenberg_work> more specifically
<azonenberg_work> delay from IP block output to crossbar matters
<azonenberg_work> Delay from crossbar to IP block input matters
<azonenberg_work> But the delay *through* the crossbar is constant for any given IP block
<azonenberg_work> so for example, say pin 3 to the crossbar is 1 ns and pin 4 to the crossbar is 1.5 ns
<azonenberg_work> then crossbar to pin 5 is 2 ns
<azonenberg_work> i'm going to assume that pin 3 to 5 is 3 and 4 to 5 is 3.5
<azonenberg_work> and that there's no skew due to horizontal wires in the crossbar
<azonenberg_work> This isn't 100% accurate but it will make the timing model much simpler and i think i can get good-enough data to err on the side of caution
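The simplified crossbar model above can be sketched as follows, under the stated assumption that there is no skew inside the crossbar: each path delay is the source's delay into the crossbar plus the sink's delay out of it. The class and method names are hypothetical, and the pin numbers and delays in the usage note are the ones from the example in the discussion.

```cpp
#include <cassert>
#include <map>

// Lumped model: no horizontal skew inside the crossbar, so a path delay is
// (source -> crossbar) + (crossbar -> sink). Delays are in ns.
class LumpedCrossbarSketch
{
public:
	void SetSourceDelay(int pin, double ns)
	{ m_toCrossbar[pin] = ns; }

	void SetSinkDelay(int pin, double ns)
	{ m_fromCrossbar[pin] = ns; }

	// Throws std::out_of_range if either endpoint has not been characterized.
	double PathDelay(int srcPin, int dstPin) const
	{ return m_toCrossbar.at(srcPin) + m_fromCrossbar.at(dstPin); }

private:
	std::map<int, double> m_toCrossbar;		// IP block output to crossbar
	std::map<int, double> m_fromCrossbar;	// crossbar to IP block input
};
```

With pin 3 at 1 ns and pin 4 at 1.5 ns into the crossbar, and 2 ns from the crossbar to pin 5, `PathDelay(3, 5)` gives 3 and `PathDelay(4, 5)` gives 3.5. The model needs one number per endpoint instead of one per source/sink pair.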
<rqou> hmm yeah i wonder whether some IP is less capable of driving the crossbar than other IP
<rqou> because the transistors are smaller or whatever
<azonenberg_work> well the other thing i will not model initially
<azonenberg_work> is fanout delay
<azonenberg_work> i.e. i assume pin 1 driving pin 2
<azonenberg_work> and pin 1 driving pin 3
<azonenberg_work> are the same delays as pin 1 driving pins 2 and 3
<azonenberg_work> i'll do some testing to confirm
<azonenberg_work> aaanyway if i do this
<azonenberg_work> i can model each IP block as a fixed end-to-end delay
<azonenberg_work> Lumping in the crossbar
<azonenberg_work> (which is basically what their datasheet does)
<azonenberg_work> Then i can model setup/hold timing for flipflops starting at the crossbar
<azonenberg_work> and lump any skew in the crossbar into that measurement
<azonenberg_work> at some point i want to try to characterize skew in the crossbar, if any
<azonenberg_work> as well as skew between the clocks in the left and right halves of the device
pie_ has quit [Ping timeout: 272 seconds]
m_t has quit [Quit: Leaving]
m_w has joined ##openfpga
DocScrutinizer05 has quit [Disconnected by services]
DocScrutinizer05 has joined ##openfpga
DocScrutinizer05 has quit [Remote host closed the connection]
rvense_ has joined ##openfpga
kristian1aul has joined ##openfpga
rvense has quit [Remote host closed the connection]
kristianpaul has quit [Remote host closed the connection]
DocScrutinizer05 has joined ##openfpga
azonenberg_work has quit [Ping timeout: 272 seconds]