<rjo>
sb0: why did you suggest 50 MHz XOs everywhere? Why not 200 MHz?
<sb0>
I suggested 50MHz only for the RTM FPGA
<sb0>
for two reasons: 1) Greg had already chosen 50MHz for the other XO, whose frequency is non-critical since we can easily put it through an FPGA PLL/MMCM
<sb0>
I didn't specify anything for that first XO. I don't know why Greg chose 50MHz. but I just wanted to avoid another BOM line.
<sb0>
2) we're going to run that link at about 1Gbps (due to IOSERDES limitations on the KU side) and the 7-series transceivers can only multiply the clock by 10, 20 or 40
<sb0>
(only 20/40 actually but you can divide by 2 in the IBUFDS_GTE2)
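
The multiplier argument above can be checked with a few lines of arithmetic. This is only a sketch: the multiplier set (20 or 40, plus an optional divide-by-2 on the reference input) is taken from the messages above rather than from a datasheet, and the function name is made up for illustration.

```python
def reachable_line_rates(ref_mhz, multipliers=(20, 40), allow_div2=True):
    """Line rates in Mbps reachable from a reference clock in MHz.

    Assumes the multipliers quoted in the chat (20 or 40) and an optional
    divide-by-2 of the reference clock. Hypothetical helper, not a vendor API.
    """
    dividers = (1, 2) if allow_div2 else (1,)
    rates = {ref_mhz * m // d for m in multipliers for d in dividers}
    return sorted(rates)


for ref in (50, 200):
    print(ref, "MHz ->", reachable_line_rates(ref), "Mbps")
```

Under these assumptions a 50 MHz reference reaches 1000 Mbps, while the lowest rate reachable from 200 MHz is 2000 Mbps.
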
<rjo>
sb0: higher frequencies generally enable higher PFD frequencies and better performance.
<rjo>
but ack the multiplication argument.
<sb0>
iirc 50MHz still allows you to reach the max VCO frequency
<sb0>
let me check. but we don't need a high performance PLL there anyway.
<sb0>
yeah, you can multiply up to 64 with a MMCM and 128 with a PLL
<sb0>
that's weird, I'm running DRTIO aux controller simulations with FullMemoryWE() and they work...
<sb0>
oh, right, I'm writing from the logic, not the simulation
<sb0>
well I'm not quite sure how your code is supposed to work
<sb0>
FullMemoryWE destroys the original memory object, that's what it's supposed to do
<rjo>
sb0: sure. just that this makes it impossible to access the memory directly.
<rjo>
sb0: how should we implement wide RTIO events? i.e. the 160 bit spline data for phaser2
<rjo>
and for the latency matching of the different rtio channels within one dac channel i can do it in software or with a bunch of registers in gateware. runtime version will slow it down a bit because of 64 bit wrestling, might complicate later register pinning of now.
<rjo>
sb0: is that few picosecond drift that joe seemed to report on simple-drtio a while back explained?
<sb0>
rjo, no it's not
<sb0>
rjo, how does this "latency matching" work?
<sb0>
rjo, what is the internal structure of the 160-bit events?
<sb0>
I guess they're made of several fields? maybe there can be a bunch of 32-bit registers, and only those that have been written get transferred over drtio?
<sb0>
then the MSBs of each 32-bit are cut before putting them into the FIFO
<sb0>
I suppose it could be OK to transfer the full 32bits over drtio
<sb0>
especially if the fields are large anyway
<sb0>
the advantage of this is the CPU doesn't have to do potentially slow struct packing
<sb0>
but if that's not an advantage, then the 32-bit regs can correspond to contiguous "bag of bits" data, not fields
<sb0>
this results in more efficient DRTIO packing
<sb0>
well, we probably want that for DMA
<rjo>
sb0: latency matching: if you set the frequency and the amplitude at the same (rtio) time, the effects should hit the dac at the same time.
<rjo>
i'd prefer to keep the rtio data opaque and not decompose into fields
<rjo>
they tend to all change.
<rjo>
variable length events (on the same rtio channel) would also be helpful. i.e. writing 16 bits amplitude (fixed, no spline interpolation) vs writing 16+32+48+64 bits for a full cubic spline. the RTIO FIFO would zero pad to full width at its input.
<rjo>
but what i need right now is a way to do that within the current tree, without drtio serialization.
<rjo>
i.e. changing the rtio runtime api.
<rjo>
we need full rate of full width rtio events. splitting into multiple rtio events and registers is a no-go.
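
The variable-length idea above (16 bits for a fixed amplitude, up to 16+32+48+64 bits for a full cubic spline, zero-padded to full width) could look roughly like this. It is a sketch only: the field order (order-0 coefficient in the LSBs) and the helper name are assumptions, not something the chat specifies.

```python
# Coefficient widths in bits for spline orders 0..3, as quoted in the chat.
WIDTHS = (16, 32, 48, 64)
FULL_WIDTH = sum(WIDTHS)  # 160 bits


def pack_spline(coeffs):
    """Pack 1 to 4 spline coefficients into one integer.

    Order 0 goes in the LSBs (an assumed layout). Coefficients that are
    not supplied stay zero, which models zero-padding to FULL_WIDTH.
    Returns (word, bits_written).
    """
    if not 1 <= len(coeffs) <= len(WIDTHS):
        raise ValueError("expected 1 to 4 coefficients")
    word = 0
    shift = 0
    for value, width in zip(coeffs, WIDTHS):
        # Masking truncates negative values to two's complement.
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word, shift


amplitude_only = pack_spline([0x1234])      # 16 bits written
full_cubic = pack_spline([1, 2, 3, 4])      # 160 bits written
```
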
<sb0>
rjo, isn't that already done by having rtio counter synchronization and sysref?
<rjo>
amplitude and frequency? how
<rjo>
?
<rjo>
as i explained before there are several rtio channels feeding one dac channel.
<sb0>
is that more than inserting the correct amount of pipeline registers into each channel after rtlink?
<sb0>
re. the registers, my "packed" proposal is similar to a wide 160-bit CSR
<rjo>
i'd put the delay registers on the data side. that's narrower.
<rjo>
or offset the counter in gateware (costs an adder).
<rjo>
or in software (costs cpu)
<sb0>
how large is the latency mismatch?
<rjo>
it's not much. maybe 50 16 bit registers per channel.
<sb0>
and how wide is the data?
<sb0>
pipeline registers are nice because they give good timing
<rjo>
delays ranging from 1 to 40
<sb0>
40!
<rjo>
it's two cordics versus one spline interpolator. yes 2*(16+3 guard)
<sb0>
i'd put the whole thing in gateware in any case, latency-matching on the CPU sounds clumsy and slow
<sb0>
well for such a deep pipeline I'd do TSC correction if that meets timing
<sb0>
or well
<sb0>
no.
<sb0>
pipeline registers
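
Latency matching with pipeline registers amounts to padding every path up to the slowest one. A minimal sketch, with path names and latencies that are purely hypothetical:

```python
def padding_depths(latencies):
    """Pipeline registers to append to each path so all match the slowest."""
    target = max(latencies.values())
    return {name: target - lat for name, lat in latencies.items()}


# Hypothetical per-path latencies in clock cycles, not measured values.
example = {"amplitude": 1, "frequency": 40, "phase": 22}
print(padding_depths(example))
```
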
<sb0>
Kintex Ultrascale has SRL64
<rjo>
do they have the same limitation as spartan6 (i.e. no reset)?
<rjo>
another thought i had. can we allow signals to be resetless? in addition to CDs.
<rjo>
in many cases it seems to me people are creating reset_less CDs just to get a register that has no reset.
<sb0>
I'd be fine with that
<sb0>
though I haven't seen "many" cases
<rjo>
where else do you need a resetless domain? reset counters, SRLs without reset values...?
<rjo>
and well the actual case where your hardware/design has no reset.
<sb0>
no reset and it actually only goes to 32
<sb0>
so you'd need two
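
A rough resource count for such a delay line, assuming one shift-register chain per data bit and 32 taps per primitive as corrected above. The helper name and the 16-bit example width are illustrative assumptions.

```python
import math


def srl_count(delay, data_width, srl_depth=32):
    """Number of SRL primitives for a fixed delay of `delay` cycles.

    Assumes one SRL chain per data bit and `srl_depth` taps per primitive.
    """
    per_bit = math.ceil(delay / srl_depth)
    return per_bit * data_width


print(srl_count(40, 16))  # a 40-cycle delay needs two SRLs per bit
```
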
<sb0>
the strobe signal may be implemented on FD
<rjo>
strobe?
<rjo>
about wide rtio: i can add a bunch more data registers to the interface to cover the max, add rtio_output(u64 t, u32 ch, u32 adr, u32 *data, u32 len)
<rjo>
e.g. 8 32-bit csrs to cover 256 bit max rtio data size
<rjo>
sound good for now?
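
On the software side, filling the proposed data registers means splitting a wide event into 32-bit words. A sketch of that step, assuming least-significant-word-first order (the chat does not specify one) and a hypothetical helper name:

```python
MAX_WORDS = 8  # eight 32-bit CSRs, i.e. 256 bits, as proposed in the chat


def to_csr_words(data, width_bits):
    """Split an unsigned integer into 32-bit words, least significant first.

    The word order is an assumption made for illustration.
    """
    if data < 0 or data >> width_bits:
        raise ValueError("data does not fit in width_bits")
    n_words = (width_bits + 31) // 32
    if n_words > MAX_WORDS:
        raise ValueError("event wider than the available CSRs")
    return [(data >> (32 * i)) & 0xFFFFFFFF for i in range(n_words)]


words = to_csr_words(0x1_0000_0002, 160)  # five words for a 160-bit event
```

The resulting list and its length would correspond to the `data` and `len` arguments of the proposed `rtio_output`.
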
<sb0>
yes, or CE if you wish
<sb0>
delay both CE (with FD, resettable) and data (with SRL), then register data with CE near the core
<rjo>
no need for CE on the output side of the spline interpolators.
<sb0>
you can do that, but if you need reset, then you can use this CE technique
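
The CE technique described above can be illustrated with a cycle-level software model: data goes through a shift register that has no reset, CE goes through resettable flip-flops, and the final register only captures data when the delayed CE is high. This is a plain Python sketch with made-up names, not gateware.

```python
class DelayedCE:
    """Data delayed by a resetless SRL, CE delayed by resettable FDs,
    output register enabled by the delayed CE."""

    def __init__(self, depth, garbage=0xDEAD):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        # SRL contents are unknown without a reset; model them as garbage.
        self.data_pipe = [garbage] * depth
        self.ce_pipe = [False] * depth
        self.out = 0

    def reset(self):
        # Only the CE flip-flops and the output register are reset.
        self.ce_pipe = [False] * self.depth
        self.out = 0

    def tick(self, data, ce):
        # Sample both delay-line outputs, then shift on the clock edge.
        d_out, ce_out = self.data_pipe[-1], self.ce_pipe[-1]
        if ce_out:
            self.out = d_out
        self.data_pipe = [data] + self.data_pipe[:-1]
        self.ce_pipe = [ce] + self.ce_pipe[:-1]
        return self.out


pipe = DelayedCE(depth=3)
outs = [pipe.tick(0x11, True)] + [pipe.tick(0, False) for _ in range(4)]
```

Because the output register never sees a CE that was not deliberately sent, the unknown initial SRL contents (and anything left in the SRL across a reset) never reach the output.
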