freemint has quit [Remote host closed the connection]
freemint has joined ##openfpga
_whitelogger has joined ##openfpga
Thorn has joined ##openfpga
<tpw_rules>
ow i'm hurt by that rust pun
<tpw_rules>
oh i thought it was on there already
<tpw_rules>
anyway, it's a perfect application for my fpga HPS board
freemint has quit [Remote host closed the connection]
freemint has joined ##openfpga
<bluezinc>
azonenberg: I believe you are mistaken. I'd consider it much more likely to be either an XC7V585 or an XC7VX485 part in an 1157 package (20 GTX, 1 for USBTX, 1 for USBRX, 18 LA channels).
<bluezinc>
I don't see lecroy using a 901 package, because the ffg901 has no HP banks.
freemint has quit [Ping timeout: 260 seconds]
genii has quit [Quit: Morning comes early.... GO LEAFS GO!]
ZombieChicken has quit [Remote host closed the connection]
_whitelogger has joined ##openfpga
dh73 has quit [Quit: Leaving.]
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 272 seconds]
X-Scale` is now known as X-Scale
Bike has quit [Quit: leaving]
X-Scale` has joined ##openfpga
X-Scale has quit [Ping timeout: 258 seconds]
X-Scale` is now known as X-Scale
<azonenberg>
bluezinc: Hmm
<azonenberg>
bluezinc: but then how do they phase the channels to each other?
<azonenberg>
you can have all of the GTXes at a common multiple of the refclk, but the phase of the sampling clock vs the refclk could vary by up to an entire refclk cycle in discrete steps
<azonenberg>
also usb tx/rx are one gtx, each gtx contains a tx and rx
<azonenberg>
this is why android devices are moving to MTP
<whitequark>
um
<whitequark>
android devices haven't shared fat32 volumes for like five major releases
<azonenberg>
because usb mass storage has the same problems, you had to unmount the sd card from the phone OS in order to mount it on the host
<azonenberg>
ok, moved. Point remains
<whitequark>
strengthens it, really
<sensille>
sure, i'm aware of that
<sensille>
the question is, would it be useful to anyone but me?
<whitequark>
there's been people discussing this before, even in this channel i think
<whitequark>
and there are even prototype tools that do this for embedded dev
<whitequark>
might have seen commercial ones?
<whitequark>
bottom line: yes
<whitequark>
although i think you can go pretty far with just a jumper and a bunch of muxes
<whitequark>
which conveniently avoids having to read the cursed shit that counts for sd card protocol
<sensille>
hm. that might indeed be enough
<azonenberg_work>
whitequark: so speaking of cursed things
<azonenberg_work>
did i mention what i discovered about vivado the other day?
<azonenberg_work>
It *aggressively* caches synthesized IP netlists. And doesn't seem to have great cache invalidation
<azonenberg_work>
So it's possible to create a tree that compiles fine, and produces a working bitstream even when you reset the build and resynth/par your rtl
<azonenberg_work>
you add that state to git, go find some regressions, go back to a clean tree after a hard reset to an older revision
<azonenberg_work>
and that commit hash no longer compiles
<azonenberg_work>
actually no wait
<azonenberg_work>
it compiled fine
<azonenberg_work>
then i flushed the cache and it stopped compiling
<azonenberg_work>
because one of the IP source files wasn't in git, but vivado didn't actually check that the inputs were present and untouched before blindly using the cached netlist for P&R
<whitequark>
ouch
<azonenberg>
one of many reasons i avoid xilinx IP as much as i can
<azonenberg>
all i use is the ILA for my personal projects, and only for debugging
<azonenberg>
but this is a $sidegigclient project on a Zynq and there is no sane way to use a zynq without the IP integrator
<azonenberg>
why they can't just make the axi interconnect core a parameterized module that you instantiate and set a few synth parameters on is beyond me
<azonenberg>
why do all this code generation when a generate loop will do fine?
<azonenberg>
(if you've ever considered using a zynq in a project, don't :P)
<edbordin>
oh hey, it finished fuzzing :D
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
<edbordin>
We use a National Instruments product with a Zynq in it and they only expose the fpga via some LabView pay-to-play proprietary dumpster fire. This makes me feel better about that fact.
s_frit has quit [Remote host closed the connection]
s_frit has joined ##openfpga
<swetland>
azonenberg: what I ended up doing was writing a script to generate a wrapper exposing the various axi interfaces I needed, etc
<swetland>
which works if you're okay with writing all your own axi glue, etc
<TD-Linux>
sensille, another option is the Toshiba FlashAir or Transcend wifi cards
<sensille>
and a micro SD adapter
OmniMancer has joined ##openfpga
Asu has joined ##openfpga
rohitksingh has joined ##openfpga
Xiretza has joined ##openfpga
<OmniMancer>
daveshah: 16 global clocks is too many?
<daveshah>
This was only 8, but the Xilinx clock rules are very complicated and nextpnr doesn't handle them all. And I guessed 8 clocks for 300 logic cells probably means something is wrong
<daveshah>
I think the problem wasn't actually clocking but because some of these fanned out to logic too
<daveshah>
I think actually Yosys shouldn't have been promoting things that drove one FF clock and one LUT input to a global at all, but that's another issue
<whitequark>
i feel like clock promotion should be nextpnr's job
<whitequark>
but i'm ignorant of the problems involved in that, so i might be just wrong
<daveshah>
I think Yosys does it because ISE needs it
<whitequark>
what about ice40?
<whitequark>
or <insert any other arch>
<daveshah>
that's done by nextpnr
<daveshah>
ditto ecp5
<daveshah>
it's just synth_xilinx that does clock promotion
<daveshah>
which hitherto has been more focused on synthesis for vendor pnr than open pnr
<whitequark>
oh, sorry, brain not fully on yet. i was thinking of using DFFCEs on ice40
<omnitechnomancer>
How many global clocks does ECP5 have?
<daveshah>
ah
<mwk>
whitequark: yeah, it was done for ISE
<daveshah>
omnitechnomancer: 16 per quadrant
<whitequark>
which... actually should be done in nextpnr too ideally, but that's even harder
<omnitechnomancer>
ah cool, similar to Anlogic in another way :P
<daveshah>
I have been wondering about doing more physical optimisations in nextpnr at some point
<daveshah>
The start point would be retiming, but timing-based LUT repacking and DFF control set optimisation could be interesting to look at too
<whitequark>
dffmince is such a crude hack, and it hits pretty badly because of how slow ice40 routing is (I think because of that?)
azonenberg_work has quit [Ping timeout: 260 seconds]
<omnitechnomancer>
daveshah: what can the global nets drive besides clocks?
<daveshah>
8 of them can drive resets, 8 CE and 8 logic IIRC
<Marex>
daveshah: mithro: So I started looking at 5M40Z again, did the project chibi ever get anywhere ?
<Marex>
seems like that is the most suitable altera part to start with in the end
bgamari_ has joined ##openfpga
bgamari_ has quit [Ping timeout: 260 seconds]
freemint has quit [Ping timeout: 245 seconds]
bgamari_ has joined ##openfpga
dh73 has joined ##openfpga
mumptai has joined ##openfpga
<Marex>
hmmm, so the POF for 5M40Z is 7859 bytes , wow
<mithro>
Marex: you would have to ask rqou -- I think he got some stuff done but never got around to making it usable
<Marex>
mithro: I saw the chunks of python
<mithro>
Marex: generally the actual work to get a proof of concept is only a very small part towards making a usable toolchain others can use -- documentation, building community, etc is like 90% of the work
<Marex>
mithro: point being, 8 kiB bitstream is easier to analyze
<Marex>
and it's literally 24 copies of the same
<Marex>
some of which is not used, so even better
<mithro>
Marex: well that is good potential first target then I guess?
<Marex>
mithro: I was being dumb for attempting C-IV back then
<mithro>
I haven't seen rqou around at all lately, but it can't hurt to email him
<Marex>
mithro: so this 5M40Z is I believe the same die as 5M80Z, 5M160Z and 5M240Z , each having different amount of LEs (you can guess from the number between 5M and Z how many)
<Marex>
since one LAB has 10 LEs, the smallest part has 4 active LABs , the next one 8 active LABs etc
<Marex>
so I would guess, that 4 out of 24 LABs are active on the smallest die and the rest is just ... nothing
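[Editor's sketch] The LAB arithmetic above can be checked in a few lines. This is only a sketch under the assumptions stated in the conversation: 10 LEs per LAB, a shared 24-LAB die, and LE counts read off the part names. None of these figures are verified against a datasheet here.

```python
# Assumptions taken from the discussion, not from a datasheet.
LES_PER_LAB = 10
DIE_LABS = 24

# LE counts as implied by the part names (5M<LEs>Z).
parts = {"5M40Z": 40, "5M80Z": 80, "5M160Z": 160, "5M240Z": 240}

active_labs = {name: les // LES_PER_LAB for name, les in parts.items()}

for name, labs in active_labs.items():
    print(f"{name}: {labs} of {DIE_LABS} LABs active, {DIE_LABS - labs} unused")
```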
<Marex>
I would expect quartus sets most of the bitstream to 1 or some idle interconnect
<Marex>
just thinking out loud
<Marex>
it almost seems like the 5M40Z is software-limited to 4 LABs, the quartus PnR places LEs in four LABs, but in random locations in the chip
<Marex>
uh
_whitelogger has joined ##openfpga
<ZirconiumX>
Marex: still around?
<Marex>
ZirconiumX: yes ?
<ZirconiumX>
I'm working on the Cyclone V, so we can probably share a bunch of tips
<Marex>
ZirconiumX: we discussed in the past, I was looking into C/IV before rqou was even around :)
<Marex>
ZirconiumX: I didn't have time to finish that, but I have some notes still
<ZirconiumX>
Ah, fair
<Marex>
ZirconiumX: and yes, I know you work on CV
<ZirconiumX>
I got the ALM bits in a LAB figured out at least
<ZirconiumX>
~~that's my one achievement~~
<Marex>
ZirconiumX: also note that I have purely software background and FPGA is a hobby, so I might be completely wrong in the terminology department
<ZirconiumX>
I don't have the fuzzing process fully automated though, even though I should
<Marex>
ZirconiumX: I think the C-IV and older are much simpler, no ? They only have LUT4 per LE
<ZirconiumX>
Indeed, but I have no EP4C chips to test with :P
<Marex>
ZirconiumX: I don't have any 5M40Z either
<ZirconiumX>
What I *do* have is a semi-functioning Yosys synthesis frontend
<Marex>
but I have to wonder how I can fuzz this one automatically, back then I was doing some arcane diffing of the bitstream dumps, but I'm not sure that's the best approach today
<Marex>
also, is there a tool to map the bits, like what the x-ray had ?
<ZirconiumX>
My friend wrote a bitstream diffing tool called horrortable
<ZirconiumX>
It's how I got the LUT bits
<ZirconiumX>
It could probably be hacked to fit a LUT4 though
<Marex>
considering that the 5M40Z bitstream is 8 kiB, I could probably write my own
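[Editor's sketch] For a bitstream this small, a home-grown diffing tool can be very simple. The following is a minimal sketch, not horrortable or any existing tool: it XORs two equal-length dumps and lists the bit positions that differ. The function name and the sample bytes are made up for illustration.

```python
def diff_bits(a: bytes, b: bytes):
    """Return (byte_offset, bit_index) pairs where the two dumps differ."""
    if len(a) != len(b):
        raise ValueError("dumps must be the same length")
    changed = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # set bits mark positions that changed
        for bit in range(8):
            if delta & (1 << bit):
                changed.append((offset, bit))
    return changed

# Made-up sample data standing in for two bitstream dumps.
base = bytes([0xFF, 0xFE, 0xFF, 0xFF])
variant = bytes([0xFF, 0xFC, 0xFF, 0x7F])
changes = diff_bits(base, variant)
```

Feeding it a baseline design and a design with one feature changed narrows down which bits encode that feature.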
<ZirconiumX>
True, I suppose.
<Marex>
but there should be some way to coerce quartus into tweaking LUT content from command line, right ? quartus_cdb can I think dump such information ...
<Marex>
but there was something, I think one could've frobbed with the .qsf file to achieve that
<Marex>
that's what I did with the C/IV
<Marex>
hm, maybe I can generate the qsf altogether for this small part
<ZirconiumX>
Marex: Try quartus_cdb --vqm
<ZirconiumX>
That gives you a Verilog netlist for it
<Marex>
oh
<ZirconiumX>
That being said, you would want as little extra noise for it as possible
<Marex>
ZirconiumX: which you can do, by frobbing with the chip planner , which tells you how to "fixate" LUTs to specific cells , and then generate a QSF (or QPF?) with those statements , just slightly adjusted
<Marex>
I think there's even a statement which allows you to set specific LUT to specific mask
<ZirconiumX>
Marex: If you use VQM_FILE, you can instantiate cells directly
<Marex>
ZirconiumX: well that I didn't know :-)
<ZirconiumX>
I don't know the name of the Max V cell, but I'd guess it's something like maxv_lcell_comb
<Marex>
ZirconiumX: maxv_lcell
<ZirconiumX>
That works too
<Marex>
ZirconiumX: I will take a look ; so that allows me to basically plant a cell at the specific position in the bitstream and then synthesise the result ?
<Marex>
or rather, specific location in the floorplan...
<ZirconiumX>
Marex: to plant it at a specific point you'll need to use set_location_assignment in the .qsf
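[Editor's sketch] Generating the location constraints for a part this small could look roughly like the following. `set_location_assignment <location> -to <node>` is the QSF statement named above; the `LC_X{x}_Y{y}_N{n}` location naming, the coordinates, and the `lut_N` node names are assumptions for illustration and should be checked against what the Chip Planner reports for the actual device.

```python
def location(x, y, n):
    # Assumed location naming scheme; verify in the Chip Planner.
    return f"LC_X{x}_Y{y}_N{n}"

def qsf_location_lines(cells):
    """cells: iterable of (node_name, location_string) pairs."""
    return [f"set_location_assignment {loc} -to {name}" for name, loc in cells]

# Hypothetical example: pin ten cells into one LAB at X=1, Y=1.
cells = [(f"lut_{n}", location(1, 1, n)) for n in range(10)]
lines = qsf_location_lines(cells)
print("\n".join(lines))
```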
<ZirconiumX>
By the way, a tip I learned from mwk was that there's a more efficient way of extracting LUT bits
<ZirconiumX>
Compared to e.g. one-hot
<mwk>
meow?
* ZirconiumX
pets mwk
<mwk>
oh, we're talking fuzzers
* mwk
purrs
<Marex>
ZirconiumX: do tell ?
<Marex>
ZirconiumX: btw is the cell on C/V also 35bits x 8bits large ?
<Marex>
ZirconiumX: that's what it was on C/IV
<tnt>
daveshah: btw, I can confirm that the special IO works fine with new treillis and nextpnr (tested on actual hw).
<ZirconiumX>
So, a LUT4 has 2^4 = 16 combinations.
* mwk
should really write that algorithm up in some document some time
<ZirconiumX>
Let's take an all-zero LUT as a baseline control
<mwk>
maybe even with some of the hacks that I layered on top of it
<ZirconiumX>
Then, imagine if you arranged the binary digits of (0+1) to (15+1) in columns, so it looks like this
<ZirconiumX>
If you now produce 5 bitstreams, each with a LUT mask of one of the digit columns, you can then permute it to read the LUT bits off horizontally
<ZirconiumX>
And you've done it in 6 bitstreams instead of 17 for one-hot
<ZirconiumX>
This scales even better the more LUT bits you try to do at once
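[Editor's sketch] The binary-column trick described above can be simulated end to end. This is a sketch of the idea as explained in the conversation, not mwk's actual tooling: each LUT bit i gets the ID i+1 (so no ID is zero and every bit differs from the all-zero baseline in at least one run), and bitstream k uses a LUT mask with bit i set whenever digit k of that ID is 1. `fake_compile` is a made-up stand-in for the vendor tool, scattering LUT bits to hidden positions.

```python
import random

LUT_BITS = 16  # a LUT4 has 2**4 = 16 truth-table bits

def column_masks(lut_bits=LUT_BITS):
    """Return one LUT mask per binary digit of the IDs (i + 1)."""
    masks = []
    for k in range(lut_bits.bit_length()):  # IDs 1..16 need 5 digits
        mask = 0
        for i in range(lut_bits):
            if ((i + 1) >> k) & 1:
                mask |= 1 << i
        masks.append(mask)
    return masks

def fake_compile(mask, placement, size):
    """Stand-in for the vendor tool: LUT bit i lands at placement[i]."""
    bits = [0] * size
    for i, pos in enumerate(placement):
        bits[pos] = (mask >> i) & 1
    return bits

def recover_placement(baseline, streams, lut_bits=LUT_BITS):
    """Read each position's ID across the streams; ID - 1 is the LUT bit."""
    found = {}
    for pos in range(len(baseline)):
        ident = 0
        for k, stream in enumerate(streams):
            if stream[pos] != baseline[pos]:
                ident |= 1 << k
        if ident:
            found[ident - 1] = pos
    return [found[i] for i in range(lut_bits)]

random.seed(0)
size = 256
placement = random.sample(range(size), LUT_BITS)  # hidden from the solver
baseline = fake_compile(0, placement, size)
streams = [fake_compile(m, placement, size) for m in column_masks()]
recovered = recover_placement(baseline, streams)
```

Five masks plus the baseline give the six bitstreams mentioned above. The same encoding works for any set of independent binary features, and the number of runs grows only with the logarithm of the feature count.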
<Marex>
ZirconiumX: makes sense, although I figured out how to calculate the locations of the LUTs in the C/IV bitstream and then I did the whole thing in a couple of runs
<Marex>
ZirconiumX: the MAX V is even easier, there's 40 LEs in total here :-)
<Marex>
and the compilation run, well ... it's seconds
<ZirconiumX>
Again, I had a tool to do it for me :P
<Marex>
ZirconiumX: this must be automated
<mwk>
also, with a good enough batching tool, you can literally RE half the chip in one batch
<ZirconiumX>
<Marex> and the compilation run, well ... it's seconds <--- Quartus has never been that fast for me
<Marex>
ZirconiumX: on C/V, it was slow for me too
<ZirconiumX>
It's like two minutes a run for even a single LUT at a time
<mwk>
because if you think about it, you can apply this idea not just to lut bits, but to any sort of a binary feature
<Marex>
jupp, for CV, it's painfully, aggravatingly slow
<Marex>
although I only ever used CV with SoC
* ZirconiumX
should hire mwk to RE the Cyclone V at this rate
<ZirconiumX>
I am terrible at this >.>
<Marex>
ZirconiumX: how big is the LE cell on C/V ?
<Marex>
ZirconiumX: on C/IV it looks like 35bits x 8bits , from my notes
<ZirconiumX>
I don't have definite dimensions as such, but I know the approximate range of the LUT bits
<Marex>
ZirconiumX: there's interconnect somewhere in there too
<Marex>
more like , the output of a LE has some mux to attach to the interconnect row and column and LE-local , these things
<Marex>
it's usually on the sides of the LE
<ZirconiumX>
I know :P
<ZirconiumX>
83393 for an ALM is way bigger than that :P
<Marex>
ZirconiumX: so see that lab.txt , for me the LE looked like there were two, back-to-back
<daveshah>
tnt: good to know, thanks for testing
<tnt>
daveshah: does that pin behave like the other ? (like, does it have io registers and stuff like that ?)
<daveshah>
Yes, it does
<daveshah>
The main reason it is odd is that it is the only pin on the chip that doesn't have a 'B' side
Jybz has joined ##openfpga
ym has quit [Quit: Leaving]
<Marex>
ZirconiumX: btw, if I hexdump -vC the .pof file, there's an interesting pattern, a column of ...
<Marex>
ff fe ff ff ff fd ff ff .... the 0xfe and 0xfd are always in the same column
<Marex>
ZirconiumX: POF is the programmer object file, no?
<ZirconiumX>
(Programming Object File)
<ZirconiumX>
Yes, but RBF is *the bitstream itself*
<Marex>
I am not sure if there is RBF for CPLD
<Marex>
maybe cpf can help ?
<ZirconiumX>
Well, try running quartus_cpf on it and see
<ZirconiumX>
Yeah
<Marex>
ZirconiumX: nope, there's no SOF
<ZirconiumX>
I said RBF :P
<Marex>
ZirconiumX: you need SOF to generate RBF , no ?
<ZirconiumX>
No
<Marex>
ZirconiumX: so what is your input to the CPF then ?
Jybz has quit [Quit: Konversation terminated!]
<Marex>
"POF file is a header plus raw binary data. I think there is no such a document. Converting pof to rbf, you can know where is raw binary data area."