<sb0>
While we deeply believe in open-source hardware, we do not want to release a platform that we do not feel is ready. Part of the PULP platform is still in heavy development and is not in a state that would allow a world-wide distribution yet. For this reason, we started by releasing the most mature IPs, instantiated in the PULPino project.
<sb0>
hm
<sb0>
bah, i don't know, compile it for some fpga, try it and let us know
rohitksingh_work has quit [Quit: Leaving.]
<key2>
sb0: too bad it's axi oriented
rohitksingh_work has joined #m-labs
<sb0>
bah, try it. if it's the usual academic EDA project, it doesn't matter if it's axi or not because it's unusable in any case
<key2>
how so ?
<key2>
what makes it unusable
<key2>
(btw, they say they use it in asics)
<sb0>
the usual
<sb0>
bugs, bloat, low performance, sloppy design, etc.
<sb0>
same as opencores
<sb0>
all that means is they blew some money on a fab. what were the results?
<key2>
mmh digging
<key2>
i actually don't see the core itself
<sb0>
yeah, they use some NIH package manager for cores
<key2>
I've been told it was interesting because it has jtag
<key2>
a guy I know is working on a nanorv32 to which he also added jtag, and with a 20 pin jtag header he managed to get it running on openocd. while he is trying to get his company to release the code as open source, he told me this thing looked promising
<sb0>
adding jtag to misoc should be intern level stuff (at least with lm32)
<key2>
they got several versions done in 28nm by STM
<sb0>
again, all that means is they blew money
<sb0>
you can give a completely broken mask to STM (or maybe RTL, some do full-service) and some cash, and you'll also have a 28nm ASIC by STM
<sb0>
did the chip work?
<key2>
hahahah
<key2>
good question
<key2>
well as you say, only real way to know is to give it a try
<sb0>
rjo, ok for 2.0
kuldeep_ has quit [Ping timeout: 244 seconds]
sb0 has quit [Read error: Connection reset by peer]
sandeepkr__ has quit [Ping timeout: 265 seconds]
sb0 has joined #m-labs
sandeepkr has joined #m-labs
kuldeep_ has joined #m-labs
<rjo>
sb0: but non-RT would still be starved by RT in that scheme. i suspect it would be easier to just scramble everything after link init (we also need a way to detect de-init/de-sync) and then do guaranteed minimum allocations for RT and non-RT in the arbitrator.
<sb0>
if you guarantee a maximum RT packet length and a minimum inter-packet gap for RT, then there's a guaranteed non-RT bandwidth
<rjo>
sb0: make sure that they block the beam to > OD6 or so. otherwise it's still dangerous. but polycarbonate goggles usually do the trick.
<rjo>
sure. minimum gap achieves it. but if you have minimum gap, you are wasting bandwidth.
<rjo>
if you do minimum gap, then you could just do one scramble-everything, encode non-RT as regular packets and be done with it.
<sb0>
minimum allocation is also wasting bandwidth
<sb0>
for the same reason
<rjo>
i mean a minimum allocation guarantee (up to a max) with unused bandwidth going to the other type.
<sb0>
if non-RT traffic takes any bandwidth from RT traffic, then it's not strictly RT anymore
<rjo>
sure. as long as the delay incurred by RT traffic is bounded.
<sb0>
also, I think non-RT traffic should also be low-bandwidth. with the scheme I proposed, a maximum RT packet length of 256 characters, and one character interpacket gap, there's 38Mbps of non-RT traffic on a 5Gbps link
<sb0>
is that not more than enough to control a few TTLs with buttons?
<sb0>
bounded, yes, but then that requires flow control on the RT side
<sb0>
and generally more complex reasoning
<sb0>
*3.8Mbps
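[editor's note] The worst-case non-RT bandwidth quoted above can be checked with a short calculation. This is only a sketch: it assumes 10 line bits per 8b10b character, and it assumes each non-RT character carries 2 payload bits, which is an inference (it is the value that reproduces the corrected 3.8Mbps figure) and is not stated in the log. The function and parameter names are made up for illustration.

```python
def non_rt_bandwidth(line_rate_bps, max_rt_chars, gap_chars,
                     payload_bits_per_gap_char, line_bits_per_char=10):
    """Guaranteed non-RT payload rate when RT packets are at most
    max_rt_chars long and are separated by at least gap_chars characters
    that are reserved for non-RT traffic."""
    chars_per_second = line_rate_bps / line_bits_per_char
    # Worst case: back-to-back maximum-length RT packets.
    gap_fraction = gap_chars / (max_rt_chars + gap_chars)
    return chars_per_second * gap_fraction * payload_bits_per_gap_char


# Assumption: 2 payload bits per non-RT character (not confirmed by the log).
worst_case = non_rt_bandwidth(5e9, max_rt_chars=256, gap_chars=1,
                              payload_bits_per_gap_char=2)
# With no RT traffic at all, every character is available to non-RT.
idle_link = non_rt_bandwidth(5e9, max_rt_chars=0, gap_chars=1,
                             payload_bits_per_gap_char=2)

print(f"worst case: {worst_case / 1e6:.2f} Mbps")
print(f"idle link:  {idle_link / 1e9:.2f} Gbps")
```

Under the same assumptions, a link carrying no RT packets gives the non-RT channel the full character rate times 2 bits.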
<rjo>
we will need flow control anyway.
<sb0>
the less the better
<rjo>
we need _that_ kind of flow control anyway.
<rjo>
it's not less.
<sb0>
no we don't, if we can assume that for all clock cycles the link layer is always ready to accept a full set of words of RT traffic, it's one full nest of bugs that gets destroyed
<rjo>
we need flow control/arbitration on the core device side to arbitrate between dma feeding rtio events and non-dma. they will very likely need to happen at the same time.
<sb0>
that's flow control at another layer, yes we need that
<sb0>
but we don't need a nasty interaction between the two
<rjo>
i don't care so much about a percent of wasted bandwidth due to fixed allocation. but it seems useless to me to encode non-RT in commas. you can just attach one non-RT character to every RT packet and send "empty" RT packets (containing just the non-RT "piggy-back" character) when there are none.
<rjo>
what do we want to use non-RT traffic for? mon-inj, flashing, ...?
<sb0>
but then there is state to carry over from one character to the next, whereas with my scheme, you can immediately tell whether a given character is RT or non-RT
<rjo>
the characters would fall out of the packetizer. just assemble them afterwards.
<rjo>
*de-packetizer
<sb0>
I like keeping RT and non-RT traffic completely separate as soon as possible.
<sb0>
they have different protocols, different constraints
<sb0>
they will connect to different parts of the SoC
<rjo>
but you are mixing layers. you are mixing the framing/comma layer with the packet-type layer.
<sb0>
yes, moninj and flashing, both to comms CPU
<sb0>
I suppose we will not be sending any RT packets during flashing, so we get 1Gbps, i.e. plenty
<rjo>
if the traffic looks like this: SNRRRRRRSNSNSNSNSNSNRRRRSNRRRSNRRR with S: start-of-frame, N: non-RT, R: RT data, you can mux/demux very early.
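[editor's note] A minimal sketch of the early demux rjo describes, operating on the symbolic stream from the message above. It assumes every frame is exactly one S, then one N, then zero or more R characters; real characters would carry data, while here the letters only mark the character type. The function name `demux` is hypothetical.

```python
def demux(stream):
    """Split a symbolic character stream into frames.

    Returns a list of (non_rt_char, rt_chars) tuples, one per frame,
    where a frame is 'S', then one 'N', then zero or more 'R'.
    """
    frames = []
    i = 0
    while i < len(stream):
        if stream[i] != "S":
            raise ValueError(f"expected start-of-frame at position {i}")
        if i + 1 >= len(stream) or stream[i + 1] != "N":
            raise ValueError(f"expected non-RT character at position {i + 1}")
        j = i + 2
        # Everything up to the next start-of-frame is RT payload.
        while j < len(stream) and stream[j] == "R":
            j += 1
        frames.append((stream[i + 1], stream[i + 2:j]))
        i = j
    return frames


stream = "SNRRRRRRSNSNSNSNSNSNRRRRSNRRRSNRRR"
frames = demux(stream)
rt_lengths = [len(rt) for _, rt in frames]
print(len(frames), "frames, RT packet lengths:", rt_lengths)
```

Frames with an empty RT part are the "empty" packets that carry only the piggy-back character. The only state the receiver needs is whether the previous character was S.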
<sb0>
yes, it's a bit of a hack, but will that cause any problem?
<sb0>
plus the non-RT scrambler will fix the idle EMI issue
<rjo>
no need. just scramble it once in the end. why scramble twice?
<sb0>
scramblers are different for special and non-special characters
<rjo>
need to be?
<sb0>
there are 256 regular and 12 special characters
<sb0>
also: a non-RT scrambler with a short period gives out a lot of commas in the absence of traffic
<sb0>
so we can simply refrain from sending anything during link initialization, instead of having modes.
<rjo>
i know that there are different characters. but why do you want different scramblers? just scramble everything at the 10b level.
<larsc>
you wouldn't get valid symbols in that case, would you?
<rjo>
if the scrambling maps all the 10b symbols onto all the 10b symbols while maintaining the parity stuff, then yes.
<sb0>
that sounds like a complicated scrambler
<sb0>
whereas unconstrained scramblers are fairly straightforward (and do not use lots of FPGA resources)
<rjo>
your two scramblers would not be unconstrained. they would do the same. just within the K and non-K spaces.
<rjo>
why would the 10b scrambler use more?
<sb0>
nope, one scrambler has to output 3 bits per cycle, the other 8
<sb0>
(assuming I only use K28.y)
<rjo>
then you need to define what you mean by "constrained".
<sb0>
map 10b to 10b, maintains running disparity
<rjo>
yours are constraint. they map K28.y to K28.y and Dx.y to Dx.y respectively.
<rjo>
constrained.
<sb0>
put it before the 8b10b encoder, problem solved
<rjo>
what problem?
<sb0>
mapping K28 to K28 and D to D
<rjo>
that's a self-inflicted problem.
<larsc>
scrambling is probably not so much the issue, the issue is being able to descramble on the rx side
<sb0>
that problem has an easy solution (put the scrambler and encoder in the right order), unlike the problem of designing a scrambler that operates all on 10b symbols without subtly messing things up
<rjo>
yep. we'd need monitoring for align-complete, de-sync etc.
<sb0>
that already works in drtio_transceiver_demo, including 90% of a scrambler-descrambler
<sb0>
also mapping K28 to K28 and D to D makes debugging a bit easier
key2 has quit [Ping timeout: 240 seconds]
<sb0>
hm, how exactly does one make a fake comma with K28.7 plus another code?
<sb0>
I see the problem with series of K28.7 (running disparity makes comma alignment uncertain to 5 bits), but some sources claim there's also an issue when K28.7 is combined with something else
<rjo>
sb0: i think you could just use one 8b scrambler and leave it running continuously. for K28.y just take three bits of that scrambler.
<sb0>
rjo, yes, good idea
<rjo>
sb0: did you check whether any of the recent changes to artiq/dashboard should also be applied to artiq/browser?
<sb0>
or not... you need to feed the scrambled data back into the shift register
<sb0>
for multiplicative scramblers, which are nice because self-synchronizing
<rjo>
really? isn't it just doing data xor with an lfsr (that feeds back on itself)?
<sb0>
if you feed just lfsr, you make an additive scrambler and synchronizing the receiving end is tricky
<rjo>
ah. multiplicative.
<rjo>
but don't we need to be able to react to de-sync etc anyway?
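[editor's note] A bit-serial sketch of the multiplicative (self-synchronizing) scrambler discussed above. The polynomial x^58 + x^39 + 1 is borrowed from 64b/66b Ethernet purely as a well-known example; nothing in the log says DRTIO uses it. The point shown is that the scrambled bit, not the raw LFSR output, is shifted into the register, so a receiver starting from an arbitrary state recovers after 58 received bits.

```python
import random

TAP_A, TAP_B = 39, 58            # assumed polynomial: x^58 + x^39 + 1
STATE_MASK = (1 << TAP_B) - 1


def scramble(bits, state=0):
    """Multiplicative scrambler: the transmitted bit is fed back."""
    out = []
    for b in bits:
        s = b ^ ((state >> (TAP_A - 1)) & 1) ^ ((state >> (TAP_B - 1)) & 1)
        out.append(s)
        state = ((state << 1) | s) & STATE_MASK
    return out


def descramble(bits, state=0):
    """Matching descrambler: the register is filled with received bits,
    so it converges to the transmitter state without any handshake."""
    out = []
    for s in bits:
        b = s ^ ((state >> (TAP_A - 1)) & 1) ^ ((state >> (TAP_B - 1)) & 1)
        out.append(b)
        state = ((state << 1) | s) & STATE_MASK
    return out


rng = random.Random(0)
data = [rng.getrandbits(1) for _ in range(500)]
tx = scramble(data, state=rng.getrandbits(TAP_B))
# The receiver starts from an unrelated state.
rx = descramble(tx, state=rng.getrandbits(TAP_B))
print("recovered after sync:", rx[TAP_B:] == data[TAP_B:])
```

An additive scrambler would instead XOR the data with a free-running LFSR, and the receiver would need its LFSR seeded identically and kept in step, which is the synchronization difficulty sb0 mentions.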
<sb0>
yes, I'd do that with checking the contents of the packets, possibly with CRCs
<sb0>
doing anything at the transceiver level is inadvisable btw, there's at least one FPGA (Kintex-7) with a poorly designed CDR that can't tell you if it's locked
<rjo>
but then for the multiplicative scrambler on K28.y, maybe just feeding back on itself (for the other bits, during K28.y) would be sufficient...
<sb0>
those things will have to operate at high speeds, so I'd rather duplicate logic than fail timing due to complex combinatorial functions
<sb0>
but maybe
<rjo>
high speed?
<sb0>
either high speed, or with wide data
<sb0>
32-bit (for 5Gbps/125MHz) should still be reasonable though
<sb0>
pulling many bits from an LFSR-style construct in one cycle doesn't scale (the combinatorial functions explode rather quickly)
<rjo>
xorshift does that well.
<sb0>
xorshift?
rohitksingh_wor1 has joined #m-labs
<rjo>
i wrote it for redpid. it gives you good white noise.
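[editor's note] For readers unfamiliar with the term, here is a sketch of Marsaglia's 32-bit xorshift generator. The shift constants (13, 17, 5) are the commonly published ones; whether redpid uses this variant or these constants is not stated in the log, so treat them as an assumption.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(state):
    """One step of Marsaglia's xorshift32. state must be non-zero."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def xorshift_words(seed, count):
    """Return `count` successive 32-bit words starting from `seed`."""
    words = []
    state = seed
    for _ in range(count):
        state = xorshift32(state)
        words.append(state)
    return words


first_words = xorshift_words(seed=1, count=4)
print([hex(w) for w in first_words])
```

Each step yields a full 32-bit word, and because the update is three XORs with fixed shifts, every next-state bit is the XOR of at most eight current-state bits. That bounded fan-in is the property relevant to the concern about wide per-cycle LFSR outputs.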
rohitksingh_work has quit [Ping timeout: 276 seconds]