<tpb>
Title: Poor error message with two always blocks named the same · Issue #1940 · YosysHQ/yosys · GitHub (at github.com)
<ZirconiumX>
While I'm here, the code uses register initialisation inside always blocks
<ZirconiumX>
e.g. reg [3:0] state = 0;
<ZirconiumX>
How should I replace that?
ebb has quit [Remote host closed the connection]
ebb has joined #yosys
Ultrasauce has quit [Quit: No Ping reply in 180 seconds.]
Ultrasauce has joined #yosys
kuldeep has quit [Read error: Connection reset by peer]
kuldeep has joined #yosys
kuldeep has quit [Remote host closed the connection]
kuldeep has joined #yosys
<ZirconiumX>
daveshah: Could you send me a firmware.hex for attosoc? I want to try using attosoc with synth_intel_alm and Quartus and see how far I get
<daveshah>
all a horrible hack but at least it is doing something
emeb has joined #yosys
Vinalon has joined #yosys
<gtw>
haha nice choice of screenshot daveshah :)
<daveshah>
it's the MiST SNES core
<whitequark>
daveshah: whoa nice
<whitequark>
what's the performance like?
<daveshah>
It took about 20 minutes to get to that point
<whitequark>
flattened design?
<daveshah>
But it's a big and messy core
<daveshah>
No
<whitequark>
oh
<whitequark>
yeah that would do it
<whitequark>
non-flattened designs run orders of magnitude slower
<gtw>
Is this an appropriate place to ask prjtrellis questions, or is there somewhere better? I am wondering about adding a bunch more SVF options to ecppack...
<daveshah>
gtw: sure
<mwk>
gtw: it's ok, though there's also ##openfpga
<gtw>
daveshah: sure it's appropriate, or sure add more SVF?
<whitequark>
daveshah: if you don't flatten then two things happen
<gtw>
mwk: ok thanks :)
<whitequark>
first, there is no splitnets
<daveshah>
gtw: both
<whitequark>
second, you have comb dependencies between submodules
<Sarayan>
How do you add a port to a RTLIL::Module?
<whitequark>
both of which mean you probably have hundreds of delta cycles
<whitequark>
(I'm curious how many)
<daveshah>
ack
<daveshah>
how do I know?
<whitequark>
.step() returns that
<gtw>
daveshah: OK I will send a pull request once I get a round tuit :)
<daveshah>
gtw: thanks!
npe has joined #yosys
<Sarayan>
(adding a wire and setting its port_id blows up because it's not in the ports array)
<daveshah>
whitequark: looks like up to 62 delta cycles, but that is during startup so maybe more once more parts start running
<whitequark>
daveshah: i would expect at least a 10x speedup if you flatten
<whitequark>
quite possibly more
<Sarayan>
oh, looks like I have to call fixup_ports() at some mysterious point
<whitequark>
daveshah: oh and if you do -b 'cxxrtl -header' you can rebuild your bench code without rebuilding the generated code
<daveshah>
whitequark: thanks, absolutely flying with flatten as expected
<whitequark>
daveshah: curious what the speedup is
<Sarayan>
ok, I have the ports and internal variables translating, weee
<daveshah>
whitequark: took about 2 minutes to get to the point it took noflatten about 20 minutes
<daveshah>
so about 10x as expected
<whitequark>
nice
<whitequark>
how many delta cycles is it now?
<whitequark>
(both on posedge and negedge)
<whitequark>
also, does the flattened design have any feedback arcs, per the backend output?
<daveshah>
up to about 5 delta cycles
<daveshah>
i'll see what the backend output is
<whitequark>
5 delta cycles is reasonable
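
The delta-cycle counts discussed above (up to 62 unflattened, about 5 flattened) can be illustrated with a toy model. This is only a sketch and does not reflect how cxxrtl is actually implemented: it assumes each "module" is a combinational function evaluated in a fixed order, and the names `settle`, `make_chain` and `run` are hypothetical. When the evaluation order cannot follow the data flow, a changed value is only seen on the next pass, so more passes (delta cycles) are needed before the design settles.

```python
# Toy model of delta-cycle settling; NOT how cxxrtl is implemented.
# Each "module" is a combinational function evaluated in a fixed order.
# If a module reads a value that is produced later in that order, it only
# sees the new value on the next pass, so extra passes are needed.

def settle(stages, state, max_deltas=1000):
    """Re-evaluate all stages until nothing changes; return the pass count."""
    for deltas in range(1, max_deltas + 1):
        changed = False
        for stage in stages:
            for name, value in stage(state).items():
                if state.get(name) != value:
                    state[name] = value
                    changed = True
        if not changed:
            return deltas
    raise RuntimeError("design did not converge")


def make_chain(n):
    """Hypothetical chain of n stages: sig[i+1] = sig[i] + 1."""
    def link(i):
        return lambda s: {f"sig{i + 1}": s[f"sig{i}"] + 1}
    return [link(i) for i in range(n)]


def run(n, reverse):
    """Return (passes needed, final output) for a chain of n stages."""
    stages = make_chain(n)
    if reverse:
        stages = stages[::-1]  # worst-case evaluation order
    state = {f"sig{i}": 0 for i in range(n + 1)}
    state["sig0"] = 1  # input change that must propagate to the output
    return settle(stages, state), state[f"sig{n}"]


good_order = run(8, reverse=False)  # evaluated in data-flow order
bad_order = run(8, reverse=True)    # evaluated against data-flow order
print(good_order, bad_order)
```

Both runs reach the same final value; only the number of passes differs, which is the effect the flattening advice is aimed at.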
<daveshah>
I think there is even a generated clock deep inside
<whitequark>
ah, then my next suggestion wouldn't work
<whitequark>
(stop toggling the clock and instead set posedge_p_clk = true; directly, which lets you skip negedge cycles)
<daveshah>
Yes, one feedback arc around the PPU
<whitequark>
ok, yes, won't work then
<daveshah>
there is definitely stuff happening on the negedge too
<whitequark>
ahh
<whitequark>
this is really nice to hear that cxxrtl actually handles that well
<whitequark>
i've only really built it for fully synchronous single-clock nmigen designs
<whitequark>
well
<whitequark>
that was my task. i built it to handle literally any imaginable rtlil because why not
<daveshah>
Almost everything is on the same clock tbf
<daveshah>
But it still seems to do very well
<whitequark>
daveshah: how do you grab the images btw?
<daveshah>
whitequark: writing a csv file and processing it with PIL
<daveshah>
just using hsync falling edge = start new line and vsync falling edge = start new file
<whitequark>
oh, so you're basically sampling it at each pixel clock?
<daveshah>
Yeah, every other system clock which I think is one pixel clock
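
The capture approach described above (one sample per pixel clock written to a CSV file, hsync falling edge starts a new line, vsync falling edge starts a new frame) might look roughly like the sketch below. The column names `hsync`, `vsync`, `r`, `g`, `b` and the function `split_frames` are assumptions for illustration, not the actual script. It uses only the standard library; the resulting rows of RGB tuples could then be handed to an imaging library such as PIL.

```python
import csv
import io

# Hypothetical capture: one row per pixel-clock sample.
SAMPLE = """hsync,vsync,r,g,b
1,1,1,0,0
1,1,2,0,0
0,1,3,0,0
1,1,4,0,0
0,0,5,0,0
1,1,6,0,0
"""


def split_frames(csv_text):
    """Group pixel samples into frames of lines using sync falling edges."""
    frames, lines, line = [], [], []
    prev_h = prev_v = 1
    for row in csv.DictReader(io.StringIO(csv_text)):
        h, v = int(row["hsync"]), int(row["vsync"])
        if prev_h == 1 and h == 0:  # hsync falling edge: start a new line
            if line:
                lines.append(line)
            line = []
        if prev_v == 1 and v == 0:  # vsync falling edge: start a new frame
            if lines:
                frames.append(lines)
            lines = []
        line.append((int(row["r"]), int(row["g"]), int(row["b"])))
        prev_h, prev_v = h, v
    # Flush whatever is left at the end of the capture.
    if line:
        lines.append(line)
    if lines:
        frames.append(lines)
    return frames


frames = split_frames(SAMPLE)
print(len(frames), [len(l) for l in frames[0]])
```

This sketch keeps every sample, including those taken during blanking; a real script would probably crop to the active picture area.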
<whitequark>
do you think blackboxes would help you here?
<daveshah>
The video output is top level at the moment anyway, so it wouldn't make a big difference
<whitequark>
ah ok
<ZirconiumX>
I think it'd be at least a little more fun to see it render in SDL or whatever
<whitequark>
yeah but you don't necessarily need blackboxes for that
<daveshah>
ZipCPU did something like that with Verilator
<whitequark>
ZirconiumX: my idea for nmigen-soc is that it would come with peripherals you could drop into your design and they'd have simulation versions hooked up to the host system
<whitequark>
one of them could very well be SDL output
<daveshah>
That's a really nice idea
<daveshah>
Being able to do that for the CPU should give a good speedup too
<whitequark>
yes, lambdaconcept suggested it, based on the concept pioneered by litex
<whitequark>
we are actually quite close to it working
<whitequark>
two yosys PRs away from having all the knobs I need in nmigen for turnkey integration
<whitequark>
and i already have a working design for the first one
<daveshah>
Some parts of memory inference are even more cursed in VHDL than Verilog
<ZirconiumX>
It's *way* closer here
<whitequark>
daveshah: you represent the memory as one gigantic register and then read chunks of it, right?
<daveshah>
No, you can have arrays
<daveshah>
It's true dual port that gets weird as you start to get into shared variables, which were significantly changed in VHDL 08 so one pattern stopped working
<ZirconiumX>
10MHz less Fmax, 2 LABs more area. That's not too shabby.
<lambda>
ZirconiumX: wow, nice
<ZirconiumX>
On the other hand I'm very consciously giving it workloads without things like RAM or multipliers
<daveshah>
Hmm, that's very impressive
<daveshah>
Particularly in terms of area which Yosys/ABC usually do quite badly on
<whitequark>
nice
<ZirconiumX>
My hunch is that it's the LUT4s that ABC is producing
<lambda>
daveshah: from what I can tell shared variables were never a very good pattern, but somehow very common anyway. The "one process, two clock edges" approach should've always worked and is cleaner in a few ways
<ZirconiumX>
Because an ALM can fit two independent LUT4s and two LUT5s that share only two terms
<ZirconiumX>
I don't think it's that ABC is doing particularly well here, rather that Quartus can pack the result efficiently
<whitequark>
what if you mark the LUTs as keep?
<ZirconiumX>
I don't have e.g. WYSIWYG primitive resynthesis enabled, so it should be treated as keep
dys has joined #yosys
<ZirconiumX>
wq: Yeah, no difference to the Yosys results
<whitequark>
I see
az0re has joined #yosys
<ZirconiumX>
I can try enabling resynthesis, if you're curious
<cr1901_modern>
nice... I forgot there was a free SNES HDL core
az0re has quit [Ping timeout: 240 seconds]
<ZirconiumX>
Remind me again what BRAM transparency means?
<daveshah>
Transparent means that when you read from an address that is being written to in the same cycle as the read address arrived, the new data appears rather than the old
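
The definition above can be made concrete with a small behavioural model. This is a sketch under simplifying assumptions (one write port, one registered read port, a single clock); the class `SyncRam` and its arguments are hypothetical and not taken from Yosys or any vendor library.

```python
class SyncRam:
    """Toy synchronous RAM with one write port and one registered read port."""

    def __init__(self, depth, transparent):
        self.mem = [0] * depth
        self.transparent = transparent
        self.rdata = 0

    def clock(self, raddr, we=False, waddr=0, wdata=0):
        """Simulate one clock edge and return the new read data."""
        old = self.mem[raddr]  # contents before this edge's write
        if we:
            self.mem[waddr] = wdata
        # Transparent: a same-cycle write to raddr is visible on the read port.
        # Non-transparent: the read port sees the value from before the write.
        self.rdata = self.mem[raddr] if self.transparent else old
        return self.rdata


results = {}
for transparent in (False, True):
    ram = SyncRam(depth=4, transparent=transparent)
    ram.clock(raddr=0, we=True, waddr=2, wdata=0xAA)  # preload address 2
    # Read and write address 2 in the same cycle.
    results[transparent] = ram.clock(raddr=2, we=True, waddr=2, wdata=0x55)

print(results)
```

The two settings only differ when the read and write addresses collide in the same cycle; for all other accesses the model behaves identically.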
<ZirconiumX>
If it's configurable, should I pick it or not?
<daveshah>
It can be configurable in Yosys too
<ZirconiumX>
Doesn't that then produce a parameter to instantiate?
<daveshah>
Yes
<ZirconiumX>
Unfortunately memory_bram doesn't tell me what the parameter is called
<whitequark>
ZirconiumX: the manual lists the parameter
<ZirconiumX>
whitequark: Where? It's not in command-reference-manual.tex
<daveshah>
It looks like the parameters for memory_bram created cells are indeed undocumented
<whitequark>
hm
<whitequark>
did I forget to put them in? apologies
strobokopp has joined #yosys
<ZirconiumX>
It seems to be the RD_TRANSPARENT parameter?
<daveshah>
I think that is for $mem cells
<daveshah>
It is called TRANSPn for memory_bram created cells
<daveshah>
where n is the port
<daveshah>
*not the port, but the number given in the transp section in the config
<ZirconiumX>
transp 0 2 <-- this would have a TRANSP2?
<daveshah>
Yes
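
The naming rule just confirmed (the `n` in TRANSPn is the number written in the `transp` line, not the port index) can be sketched as a small helper. The function name is hypothetical, and the rule that only values above 1 are configurable (with 0 and 1 being fixed settings) is an assumption based on the memory_bram help text rather than something stated in this conversation.

```python
def configurable_transp_params(transp_line):
    """Return the TRANSPn parameter names implied by a 'transp' config line."""
    keyword, *values = transp_line.split()
    if keyword != "transp":
        raise ValueError("expected a 'transp' line")
    names = []
    for value in map(int, values):
        # Assumption: 0 and 1 are fixed settings; only values above 1
        # are configurable and therefore get a parameter on the cell.
        name = f"TRANSP{value}"
        if value > 1 and name not in names:
            names.append(name)
    return names


params = configurable_transp_params("transp 0 2")
print(params)
```

For the `transp 0 2` example this yields a single TRANSP2 parameter, matching the answer given above.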
<ZirconiumX>
What about if `clocks` is configurable?
<daveshah>
clocks isn't configurable
<daveshah>
values greater than 1 are just different clock signals
<daveshah>
with 0 being unclocked
<ZirconiumX>
Yeah, but an MLAB read port can either be sync or async
<daveshah>
Then you need two different BRAM entries
<ZirconiumX>
Okay
az0re has joined #yosys
Cerpin has quit [Quit: Lost terminal]
Cerpin has joined #yosys
emeb_mac has joined #yosys
npe has quit [Ping timeout: 256 seconds]
<ZirconiumX>
Quartus is giving me a headache
<ZirconiumX>
It lets you build a 32x2 LUTRAM (as the documentation says) but only sometimes
N2TOH has joined #yosys
jfcaron has quit [Ping timeout: 258 seconds]
<ZirconiumX>
daveshah: Having a problem mapping the ROM in attosoc; I have an async-read initialisable LUTRAM (that I'm using for testing) but memory_bram is giving up
<daveshah>
Yeah, the rom in attosoc doesn't map to bram
<daveshah>
It was a horrible pattern because at the time objcopy didn't write more than 8 bit inits
<daveshah>
And I cba to write a better way; at the time ecp5 bram wasn't even supported
<ZirconiumX>
Ugh; I need a testbench for BRAM init