electronic_eel has quit [Ping timeout: 264 seconds]
electronic_eel_ has joined #scopehal
<azonenberg>
So i'm shifting gears a bit to MAXWELL firmware for a few days
<azonenberg>
just for a change of pace, then back to probe design
<azonenberg>
I was originally planning to do some function generator ui code for glscopeclient this weekend
<azonenberg>
then $work decided to do maintenance on the VPN so i couldn't get into the lab where the scope with an attached function generator lived...
<_whitenotifier-f>
[starshipraider] azonenberg pushed 2 commits to master [+7/-5/±0] https://git.io/Jknos
<_whitenotifier-f>
[starshipraider] azonenberg ac73439 - Renamed resistive-probe to akl-pt2
<_whitenotifier-f>
[starshipraider] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JknX2
<_whitenotifier-f>
[starshipraider] azonenberg 268062e - Continued work on CRC simulation
electronic_eel_ is now known as electronic_eel
<tnt>
azonenberg: I'm wondering if that can help crc([a,b,c,d]) == crc([0,0,0,d]) ^ crc([0,0,c,0]) ^ crc([0,b,0,0]) ^ crc([a,0,0,0]) ^ crc([0,0,0,0])
<tnt>
so you could compute several 'phases' in // and just combine them at the end.
<azonenberg>
tnt: That is actually the theory of a paper i'm reading
<azonenberg>
it splits a 128 bit datapath into four 32s then combines
<azonenberg>
i think that's probably the way to go. I think with some optimization and possibly bumping the fpga up to -3 speed i can pass timing on a 32 bit
<azonenberg>
But i need to play around a bit more... some recent implementation tweaks are suggesting the 128 bit implementation might be possible to smoosh down enough to make timing
<tnt>
Ok, I guess there aren't too many ways to do it and it seemed like the obvious one :)
miek has quit [Ping timeout: 246 seconds]
asy_ has quit [Ping timeout: 246 seconds]
asy_ has joined #scopehal
miek has joined #scopehal
<Degi>
Hmm, the XORing CRCs is interesting...
<tnt>
the beauty of linear math
<d1b2>
<OmniTechnoMancer> The crc of 4 0s is constant and can be done later right?
<d1b2>
<OmniTechnoMancer> and omitted if you have an even number of 4 way splits?
<tnt>
it's constant for sure. Not sure about the omission, I didn't work out exactly how the initial/final operations done in the CRC interacted.
<d1b2>
<OmniTechnoMancer> I guess it depends how you combine two of these packets together?
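A minimal sketch of tnt's identity, using Python's `zlib.crc32` (the Ethernet CRC-32, with its all-ones init and final complement) as a stand-in for the hardware CRC. The helper names `lane`, `xor_all` and `lane_crcs` are made up for illustration. For messages of equal length the CRC is affine over GF(2), so XORing an odd number of CRCs leaves exactly one copy of the init/final constant in place. That is why the four lanes need the all-zero term, while a three-lane split would not:

```python
import zlib

def lane(msg: bytes, i: int) -> bytes:
    """Return msg with every byte except index i replaced by zero."""
    return bytes(b if j == i else 0 for j, b in enumerate(msg))

def xor_all(values) -> int:
    """XOR a sequence of integers together."""
    acc = 0
    for v in values:
        acc ^= v
    return acc

def lane_crcs(msg: bytes) -> list:
    """CRC of each single-byte lane, every lane padded to len(msg)."""
    return [zlib.crc32(lane(msg, i)) for i in range(len(msg))]

msg4 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
zero4 = zlib.crc32(bytes(4))  # constant for a given length, can be precomputed

# Four lanes plus the all-zero term: five terms, so one constant survives.
combined4 = xor_all(lane_crcs(msg4)) ^ zero4
assert combined4 == zlib.crc32(msg4)

msg3 = bytes([0xAA, 0xBB, 0xCC])
# Three lanes is already an odd count, so no zero term is needed here.
combined3 = xor_all(lane_crcs(msg3))
assert combined3 == zlib.crc32(msg3)
```

Under this reading, the zero term is needed when the number of lanes is even and can be dropped when it is odd. How that carries over to chaining several 128-bit words of one packet is not covered by this sketch.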
<azonenberg>
to start, simplifying by using crc32-posix which is the same polynomial as ethernet but initialized to all 0s not 1s
<azonenberg>
in this case, crc(aa bb cc dd) = ~(crc(aa 00 00 00) ^ crc(00 bb 00 00) ^ crc(00 00 cc 00) ^ crc(00 00 00 dd) )
<azonenberg>
unclear where the complement comes in
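One likely source of the complement, sketched under the assumption that "crc32-posix" here means the usual catalogue parameters: polynomial 0x04C11DB7, MSB-first, init 0, final XOR 0xFFFFFFFF, and no appended length field. The function `crc32_msb` and the helper `lane` are hypothetical, not code from the MAXWELL firmware. With init 0 the raw register value is purely linear, but each of the four lane CRCs carries its own final XOR. Four of them cancel, so the combined value is missing the one complement that the full CRC has:

```python
POLY = 0x04C11DB7  # Ethernet / POSIX polynomial, MSB-first form
MASK = 0xFFFFFFFF

def crc32_msb(data: bytes, init: int = 0, xorout: int = MASK) -> int:
    """Bitwise MSB-first CRC-32 (assumed crc32-posix parameters by default)."""
    crc = init
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc ^ xorout

def lane(msg: bytes, i: int) -> bytes:
    """Return msg with every byte except index i replaced by zero."""
    return bytes(b if j == i else 0 for j, b in enumerate(msg))

msg = bytes([0xAA, 0xBB, 0xCC, 0xDD])

# XOR of the four lane CRCs, each including its own final complement.
lanes_xor = 0
for i in range(4):
    lanes_xor ^= crc32_msb(lane(msg, i))

# The four final XORs cancel pairwise, so one complement must be re-applied.
assert crc32_msb(msg) == lanes_xor ^ MASK

# Without the final XOR the raw CRC is linear and no complement appears.
raw_xor = 0
for i in range(4):
    raw_xor ^= crc32_msb(lane(msg, i), xorout=0)
assert crc32_msb(msg, xorout=0) == raw_xor

# With init 0, the CRC of four zero bytes is just the final XOR, so tnt's
# five-term form gives the same result as the explicit complement.
assert crc32_msb(bytes(4)) == MASK
```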
<bvernoux>
azonenberg, why not use a CRC32 lookup table?
<bvernoux>
it will compute CRC32 with one cycle per byte
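For reference, a sketch of the byte-at-a-time table approach bvernoux describes, written for the reflected Ethernet CRC-32 so it can be checked against `zlib.crc32`. The names `make_table`, `TABLE` and `crc32_table` are illustrative only. Each loop iteration consumes one byte, which corresponds to one cycle per byte in hardware:

```python
import zlib

def make_table() -> list:
    """Build the 256-entry table for the reflected polynomial 0xEDB88320."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

TABLE = make_table()

def crc32_table(data: bytes) -> int:
    """Ethernet-style CRC-32: init all ones, one table lookup per byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

# Cross-check against the library implementation.
assert crc32_table(b"123456789") == zlib.crc32(b"123456789")
```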
<azonenberg>
bvernoux: This is in response to my twitter thread trying to figure out a good way of doing 40 Gbps CRC32 with a 128-bit datapath at 312.5 MHz for 40GbE
<bvernoux>
ha ok
<azonenberg>
one cycle per byte is about 16 times too slow :p
<bvernoux>
yes clearly ;)
<bvernoux>
so the idea is to do a custom // algorithm ?
<bvernoux>
it will be interesting to check what ST does, they have a HW CRC32 ...
<azonenberg>
I'm exploring an algorithm in a couple of papers
<azonenberg>
which exploits the ability of crcs to be split and combined under some circumstances
<azonenberg>
i'm still working on figuring out the best chunk size and how to combine things etc
<azonenberg>
and how to handle the partial word at the end of a packet efficiently
<bvernoux>
yes interesting, and the expected speed is to do 4 bytes / cycle or more ?
<azonenberg>
I'm targeting 16 bytes per clock at 312.5 MHz
<azonenberg>
on a -2 kintex7
<bvernoux>
ha yes will be very nice
<azonenberg>
That's what it will take to process 40G line rate data
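The numbers quoted above can be checked with a few lines of arithmetic. The constant names are made up for this sketch. It compares the 128-bit datapath against the one-byte-per-cycle table approach at the same clock:

```python
CLOCK_HZ = 312_500_000   # 312.5 MHz datapath clock
BYTES_PER_CLOCK = 16     # 128-bit datapath

# Throughput of the 128-bit datapath in bits per second.
wide_bps = CLOCK_HZ * BYTES_PER_CLOCK * 8

# Throughput of a one-byte-per-cycle CRC at the same clock.
bytewise_bps = CLOCK_HZ * 1 * 8

# How many times too slow the bytewise approach is.
shortfall = wide_bps // bytewise_bps

print(wide_bps, bytewise_bps, shortfall)
```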
<bvernoux>
to be checked maybe it is only for the big version ;)
<bvernoux>
yes maybe the 2Tbps is only for Bus Width 4096 with 19902 LUT running at 509.13MHz ...
<bvernoux>
it is not very clear
<bvernoux>
for what you want you can choose the best. it will be interesting to test what they have done instead of reinventing the wheel
<bvernoux>
It seems to be open source
<bvernoux>
they speak about it in Readme
<bvernoux>
As far as we know, this is the first open source code covering the whole procedure of programming a single LUT
<bvernoux>
speaking about Reprogramming by HWICAP
<bvernoux>
their paper is here file:///C:/Users/Ben/AppData/Local/Temp/Low-Cost%20and%20Programmable%20CRC%20Implementation%20based%20on%20FPGA%20(Extended%20Version).pdf