<azonenberg>
Also having signal integrity problems i need to work on...
<azonenberg>
I didnt realize the buses were this fast
<azonenberg>
i might have to switch to something else instead of the LA pod i have now
<azonenberg>
or at least shorten my probes by a lot
<azonenberg>
I would love a solder-in AKL-PT1 on maxwell for this
<azonenberg>
lain: soooo i now have glscopeclient using 119.9 GB of RAM
<miek>
welp
<azonenberg>
Eleven waveforms, each 64M points on 7 channels
<azonenberg>
A total of 2.8 *seconds* of data at 250 Msps
<azonenberg>
That's 704M points per channel or nearly 5 G points of raw data
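For reference, the arithmetic behind those numbers (a minimal standalone C++ sketch, not code from glscopeclient):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        const uint64_t waveforms      = 11;
        const uint64_t points_per_wfm = 64ULL * 1000 * 1000;  // 64M points per waveform, per channel
        const uint64_t channels       = 7;
        const double   sample_rate    = 250e6;                // 250 Msps

        uint64_t per_channel = waveforms * points_per_wfm;    // 704M points per channel
        uint64_t total       = per_channel * channels;        // ~4.9G points of raw data

        printf("points per channel: %.0f M\n", per_channel / 1e6);
        printf("total raw points:   %.2f G\n", total / 1e9);

        // 11 waveforms of 256 ms each at 250 Msps -> ~2.8 s of data
        printf("capture length:     %.2f s\n", per_channel / sample_rate);
        return 0;
    }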
<azonenberg>
i'm now in the process of saving it to my NAS lol. The gig-e pipe has been saturated for quite a while
<azonenberg>
In case you were wondering why i was planning to build a 10G-attached ceph SAN :p
<azonenberg>
also, RLE at the capture side would greatly reduce the overhead, since the long idle gaps between transactions would take far less memory if samples weren't stored at regular intervals
<azonenberg>
I might actually implement that even for the e.g. lecroy driver at some point
<azonenberg>
only emit samples on the LA if the data has changed
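A minimal sketch of what that capture-side RLE could look like for one digital channel (generic C++, not scopehal's actual waveform classes; a real driver would fill whatever sparse representation the library uses):

    // Emit a (timestamp, value) pair only when the sample value changes,
    // instead of storing every regularly spaced sample.
    #include <cstdint>
    #include <vector>

    struct RleSample
    {
        uint64_t timestamp;  // in sample clocks since start of capture
        bool     value;
    };

    std::vector<RleSample> RleCompress(const std::vector<bool>& raw)
    {
        std::vector<RleSample> out;
        for(uint64_t i = 0; i < raw.size(); i++)
        {
            // Keep the first sample, then only samples that differ from the previous one
            if(out.empty() || raw[i] != out.back().value)
                out.push_back(RleSample{i, (bool)raw[i]});
        }
        return out;
    }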
<lain>
azonenberg: lol
<lain>
:D
<azonenberg>
(and it's still storing the file)
<azonenberg>
The save file is 80 GB
<lain>
now gzip it
<azonenberg>
lolno
<lain>
:D
<lain>
I wonder if lz4 would do any good, it's quite fast
<azonenberg>
I'm also wondering whether the original seagate SSD i was looking at is the best choice for this kind of work
<azonenberg>
Seagate XP1920LE30002
<azonenberg>
M.2 22110, 1920 GB, PCIe 3.0 x4, has power loss protection, but only rated for 0.3 DWPD
<azonenberg>
Which is, to be fair, still 576 GB/day of writes for 5 years
<azonenberg>
Also not suuuper fast, they claim sequential read/write of 2000 / 1200 MB/s
<azonenberg>
But it's cheap for the capacity, $200 at newegg
<azonenberg>
Something like the samsung 983 DCT would give me the same capacity and form factor but cost $412. For that price i get 0.8 DWPD endurance and 3000 / 1430 MB/s read/write
<azonenberg>
or if i go all out, the intel dc P3608 is $1060, 3 DWPD, 5000/2000 MB/s read/write, in a HHHL PCIe form factor
<azonenberg>
Assuming i have a single 10GbE pipe to the box, the aggregate bandwidth available from all four SSDs in one server will only be 1250 MB/s to the outside world so i feel like paying more for higher performance is unnecessary
<azonenberg>
And endurance probably isn't a HUGE concern because even if i'm writing lots of big scope captures they'll be spread across multiple drives
<azonenberg>
Ignoring write amplification issues (probably not a huge deal for massive linear file writes like storing waveform data), the seagate can do 576 GB/day of writes and if i split that across four drives (assuming 4 drives per node, 3x replication, 3 nodes)
<azonenberg>
my actual write capacity over 5 years is 2304 GB/day across the array
Degi has quit [Ping timeout: 246 seconds]
<azonenberg>
that's 28 of these giant datasets
<azonenberg>
each day, every day
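The same endurance arithmetic spelled out (a minimal sketch; ignores write amplification, as noted above):

    #include <cstdio>

    int main()
    {
        const double drive_tb        = 1.92;  // Seagate XP1920LE30002 capacity, TB
        const double dwpd            = 0.3;   // rated drive writes per day over the warranty period
        const double drives_per_node = 4;
        const double nodes           = 3;
        const double replication     = 3;     // ceph 3x replication

        double per_drive_gb_day = drive_tb * 1000 * dwpd;                      // 576 GB/day per drive
        double raw_gb_day       = per_drive_gb_day * drives_per_node * nodes;  // across all 12 OSDs
        double usable_gb_day    = raw_gb_day / replication;                    // client-visible writes

        printf("per drive:      %.0f GB/day\n", per_drive_gb_day);
        printf("whole array:    %.0f GB/day of client writes\n", usable_gb_day);
        printf("that's %.1f of the 80 GB datasets per day\n", usable_gb_day / 80);
        return 0;
    }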
<azonenberg>
lain: what do you think? i feel like there's little chance of me exceeding that
<sorear>
do you actually need long term storage of oscilloscope traces and by extension do you actually need to raid >0 them? is losing all of your traces going to cost you more than an hour of work?
<azonenberg>
sorear: so the issue is mainly, this isn't JUST for scope waveforms
<azonenberg>
My plan is to set up a single storage cluster to serve all of my needs
Degi has joined #scopehal
<azonenberg>
I'm going to have a bunch of ceph RBD block devices for virtual machine hard drives (replacing the hard wired M.2's in my current VM server which is getting full and also a bit old)
<azonenberg>
as well as a CephFS filesystem for storing home directories, scope traces, all of my photos and media files,
<azonenberg>
the plan is to consolidate all of my storage so each machine has a boot drive and a 1G/10G/40G (depending on throughput requirements) ethernet link to the ceph cluster
<azonenberg>
and the boot drive will be essentially disposable, if it fails i reimage and copy over a few dotfiles
<azonenberg>
the main storage array will be backed up nightly to my existing offsite backup server (6x 4TB spinning-rust in raid6)
<azonenberg>
But it's located in another city and restoring that much data over VPN to the other location would be super time consuming. So i want to minimize the chances of downtime
<azonenberg>
also, ceph's replication isn't just for drive failure tolerance. it can do scrubbing etc to ensure data integrity
<azonenberg>
and having multiple copies of data means reads can be serviced from any copy
<azonenberg>
so it's faster
<sorear>
not gonna try to netboot? :p
<azonenberg>
No
<azonenberg>
PXE is a giant pain in the butt to set up and maintain
<azonenberg>
I've done it
<kc8apf>
Ceph has a fairly high maintenance burden
<azonenberg>
kc8apf: oh?
<kc8apf>
And it really doesn't like small clusters
<azonenberg>
i'm still open to ideas. And from what folks are telling me, 3 nodes is the smallest reasonable cluster, but it should work fine at that size
<kc8apf>
I ran a 3 node setup for a while and it mostly worked
<azonenberg>
basically i want something that scales to more capacity and bandwidth than just a couple of drives with linux mdraid served over nfs
<azonenberg>
if not ceph it would be lustre or pvfs or something like that
<kc8apf>
Colocating the ceph control plane with an osd causes odd, hard to debug problems when the osd load gets high
<kc8apf>
Basically none of the existing options are particularly pleasant to use
<azonenberg>
What do you mean "colocating"
<azonenberg>
same drive or same cpu?
<kc8apf>
CPU
<azonenberg>
I figured if i had six cores and four OSDs i'd be OK
<kc8apf>
Certain events can cause a fairly high cpu load doing reconstruction or verification
<azonenberg>
and how high load are we talking? i'm only going to have 10GbE to each node
<azonenberg>
no 40/100G although my workstation will have 40G to the network core
<azonenberg>
my hope is to be able to hit 30G throughput from my workstation to the three ceph nodes
<kc8apf>
I had 1G and 1 osd per node
<azonenberg>
what kind of cpu?
<kc8apf>
You can get bandwidth from ceph but it requires some planning and careful reading of the tuning guides
<azonenberg>
My proposed build right now in a newegg wishlist has a 6 core 1.7 GHz skylake xeon (scalable bronze 3104)
<azonenberg>
and the way i see it is, almost anything i do with ceph is likely to outperform my current mdraid 2x 7200rpm 4tb NFS over gig-e
<azonenberg>
it's just a question of how much better i can make it
<kc8apf>
I was using fairly low end stuff. HDDs and older AMD boxes
<azonenberg>
was ram bandwidth/capacity an issue for you?
<azonenberg>
i'm looking at 24GB of 6 channel ddr4 2666 per node
<kc8apf>
Plan for 1-2GB of RAM per TB of storage
<kc8apf>
Ceph and bluestore like to cache a lot
<azonenberg>
Yeah. I'm looking at 4x 1.92 TB OSDs per node
<azonenberg>
so 8TB total and 24GB of RAM
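Checked against kc8apf's 1-2 GB per TB rule of thumb (a minimal sketch of the arithmetic):

    #include <cstdio>

    int main()
    {
        const double osd_tb         = 1.92;
        const double osds_per_node  = 4;
        const double planned_ram_gb = 24;

        double storage_tb = osd_tb * osds_per_node;  // ~7.7 TB of OSDs per node
        double ram_lo     = storage_tb * 1.0;        // 1 GB of RAM per TB
        double ram_hi     = storage_tb * 2.0;        // 2 GB of RAM per TB

        printf("storage per node: %.2f TB\n", storage_tb);
        printf("rule of thumb:    %.1f - %.1f GB RAM\n", ram_lo, ram_hi);
        printf("planned:          %.0f GB\n", planned_ram_gb);
        return 0;
    }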
<sorear>
osd?
<kc8apf>
OSD is the ceph per-disk process
<sorear>
ceph's high level design seems vastly preferable to every other network filesystem but some of the things I've heard about administering it are alarming
<kc8apf>
That's pretty much spot on
<azonenberg>
lol
<azonenberg>
well, i figure i'll give it a try and keep the old nfs server around for a little while
<azonenberg>
and if it proves to be too annoying i'll move all my data back to the old server
<azonenberg>
then wipe the ceph nodes and just run nfs on them
<azonenberg>
But i won't have budget for a while. I just bought the 4 GHz scope, and now that my wallet has recovered from that, i need to save up for some repairs around the house before i invest in more lab infrastructure
<azonenberg>
And i also have to upgrade sonnet still
<azonenberg>
oh, and build MAXWELL lol
azonenberg_work has quit [Ping timeout: 256 seconds]
azonenberg_work has joined #scopehal
maartenBE has quit [Ping timeout: 256 seconds]
maartenBE has joined #scopehal
<azonenberg>
Yaaaay
<azonenberg>
I just segfaulted glscopeclient with an 80GB dataset in RAM
<azonenberg>
I saved it to disk previously, thankfully, but loading it will take 10+ minutes
<azonenberg>
This sounds like a good excuse to implement the file load progress dialog i've wanted to have :p
<azonenberg>
After i fix another bug that is
<monochroma>
XD
<azonenberg>
Also i need that storage cluster sooner rather than later
<azonenberg>
at 30 Gbps, that would be a 20 second load time
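Rough transfer-time math for the 80 GB save file, assuming the network link is the only bottleneck (a minimal sketch, ignoring protocol overhead and disk speed):

    #include <cstdio>

    int main()
    {
        const double file_gb = 80.0;
        const double bits    = file_gb * 1e9 * 8;

        double t_1g  = bits / 1e9;   // current gig-e pipe to the NAS
        double t_30g = bits / 30e9;  // hoped-for aggregate throughput to the ceph nodes

        printf("1 Gbps:  %.0f s (~%.0f min)\n", t_1g, t_1g / 60);
        printf("30 Gbps: %.0f s\n", t_30g);
        return 0;
    }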
<electronic_eel>
azonenberg: so you are planning to have just the os on local disks connected to your workstations and the full homedir will be on ceph?
<electronic_eel>
I'm doing something similar, but with nfs. but when I introduced this, I ran into massive latency problems on the workstations: many programs tend to do lots of accesses to ~/.local and ~/.config - and all these small accesses always incur the full network latency
<electronic_eel>
so basically the affected programs became slow as hell
<sorear>
ceph has a cache/lease mechanism that's not pure yolo like nfs's
<electronic_eel>
what I ended up doing is introducing a cache partition which rsyncs .config and .local on login and syncs it back on a clean logout
<electronic_eel>
sorear: ah, good to hear. I'm running this setup for like 8+ years now, maybe the cache on nfs got better over the years, I haven't investigated
<_whitenotifier-f>
[scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZnQ
<azonenberg>
electronic_eel: so what i do right now is i have my homedir with .local and .config be stored on the OS drive
<azonenberg>
but /nfs4/home/azonenberg/ has all of my data
<azonenberg>
i consider it kinda my pseudo-home
<azonenberg>
/home/azonenberg/ has very little on it
<electronic_eel>
do you sometimes sync /home/az to your file server for backup?
<azonenberg>
i expect /ceph/home/azonenberg/ will also be used for pretty much all of my bulk data and /home/azonenberg will contain preference settings like i do now
<azonenberg>
No. I consider everything there fairly expendable
<azonenberg>
the preferences i care more about are things like browsing history
<azonenberg>
Which are stored in the web browsing VM
<azonenberg>
Which is not backed up, but is on raid1 on the xen server
<azonenberg>
and will eventually be stored on ceph
<electronic_eel>
hmm, setting up all the programs takes quite some time for me, so I really want it in the backup
<azonenberg>
Yeah makes sense
<azonenberg>
Copying it to a backup would not be unreasonable, or just running a clientside backup
<azonenberg>
Right now my NAS is the only machine that actually has backups on it, as everything i really care about lives there
<_whitenotifier-f>
[scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±4] https://git.io/JUZnx
<_whitenotifier-f>
[scopehal-apps] azonenberg 67ea814 - Fixed bug where clicking a trace in the protocol analyzer from a historical waveform moved to the new history point but did not properly reload cached waveforms for display
<electronic_eel>
do you compile on your /nfs4/home/azonenberg/?
<electronic_eel>
compiling also has a really noticeable latency for me, so I usually compile on a local disk
<azonenberg>
electronic_eel: yes i do
<azonenberg>
when i "make -j" kicad or glscopeclient i can max out all 32 cores without getting network bound already
<azonenberg>
i hide latency by just having another gcc instance steal the cpu while the first one is blocked lol
<electronic_eel>
hmm, strange. depending on file count the compile times can be twice to 10 times worse for me
<azonenberg>
last time i checked i could compile all of kicad in something like... 90 seconds?
<azonenberg>
on nfs
<azonenberg>
what's the latency from you to the nas? is it wifi?
<electronic_eel>
nonono
<electronic_eel>
10gbe over direct attach cable
<azonenberg>
I have something on the order of 100-150 μs latency to it
<azonenberg>
10Gbase-SR from me to the core switch, 1000base-SX from there to the edge switch, then 1000base-T from the edge switch to the NAS
<electronic_eel>
maybe it is the kernel version on the server that is getting old, it is running centos 7
<azonenberg>
My proposed ceph cluster will connect to the core switch via 10Gbase-SR from each node
<electronic_eel>
plan to migrate to centos 8 soon
<azonenberg>
and the connection to my desk will be upgraded to 40gbase-SR4
<azonenberg>
I have the cable in the walls already
<azonenberg>
but got screwed over by MPO connectors
<azonenberg>
turns out MPO keystone modules do not contain alignment pins
<azonenberg>
i.e. you cannot mate two female MPOs with one
<azonenberg>
you need a male on one cable
<azonenberg>
Aaaand all of my MTP/MPO patch cords have female on both sides
<azonenberg>
So now i have some MTP M-F 1.5m patch cords on the way from FS but they won't be here until october 1st or thereabouts
<azonenberg>
At which point i should have 40GbE to the core
<azonenberg>
I'm used to 10G stuff where all cables have male LC ends and all couplers contain the necessary alignment features to mate two male cables end to end
<electronic_eel>
ok, note to myself that I have to look this stuff up in detail before introducing 40g
<azonenberg>
There are also 3 different polarities for MPO/MTP connectors (to clarify, MTP is a brand-name connector which is compatible with the generic MPO but has some nice features)
<azonenberg>
or well for cables
<azonenberg>
A is straight through
<azonenberg>
B is L-R crossover, and C is pairwise crossover, which i think is pretty rare
<azonenberg>
My standard is to use type B cables for everything, which aligns with how normal LC cables are crossover
<azonenberg>
So if you have two patch cords and a plant cable, all type B, you end up with a net of one tx-rx inversion across the whole thing
<azonenberg>
which is what you want
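The polarity argument as a parity check (a minimal sketch that treats each type B segment as one tx-rx swap and ignores per-fiber positions within the MPO):

    #include <cstdio>

    int main()
    {
        const int segments = 3;  // patch cord + plant cable + patch cord, all type B

        bool crossed = false;
        for(int i = 0; i < segments; i++)
            crossed = !crossed;  // each type B (crossover) segment flips tx<->rx

        printf("%d type B segments: %s\n", segments,
            crossed ? "net crossover (tx meets rx, good)" : "straight through (tx into tx, bad)");
        return 0;
    }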
<azonenberg>
QSFP modules normally have male connections (alignment pins are the only thing that distinguishes a male from a female; the rest of the connector body is the same, and an external bracket is needed to mate an M to an F and hold them together)
<azonenberg>
So a QSFP contains both the bracket that the connector latches to as well as a male MPO compatible mating receptacle that you plug a female cable into
<azonenberg>
then your patch cords are normally all female at each end, and your plant cables are normally male at both ends
<azonenberg>
Which i didn't know, since i was used to plant and patch cords all being the same gender and having couplers between them
<azonenberg>
but it turns out MPO couplers are just the bracket and you need the cables to be gender compatible
<azonenberg>
There is apparently a tool that lets you insert pins into a female MTP connector; one of the improvements of name-brand MTP over generic MPO is that it allows changing the gender in the field
<azonenberg>
but this was an expensive cable and i dont want to risk screwing it up and i have no cheap cables to practice on, and the tool isnt cheap either
<azonenberg>
i only have two misgendered plant cables so i'm just going to get male ends on patch cords for this one link
<azonenberg>
and not make the mistake again
<azonenberg>
if this was a larger rollout i'd put pins on the plant cables to avoid problems
<electronic_eel>
hmm, is it common to have the MPO stuff in the patch panels and so on? that would mean a dedicated fibre installation just for this. why not use regular lc and a 4 lc to mpo patch cable?
<electronic_eel>
I usually want my wiring in the walls and so on be compatible for several generations of connections
<electronic_eel>
the cat-7 cables I used in the old office at work were used with 100 mbit first and then were upgraded to 1 gbe. could also be used with 2.5gbe or if I really wanted, 10gbase-t
<electronic_eel>
since splicing the fibre stuff tends to be more expensive than pulling & connecting cat7, I really want this to be usable for some years to come
<azonenberg>
regular LC for the plant cable you mean?
<azonenberg>
i was concerned about skew from manufacturing tolerances between cables
<azonenberg>
with a MPO you know every fiber in the cable is the same length
<electronic_eel>
"plant cable" is the one that runs in the walls, right?
<azonenberg>
yes
<azonenberg>
or rack to rack, or generally "infrastructure" vs a patch cord
<electronic_eel>
ok
<azonenberg>
I'm running 40G from my lab to my desk
<azonenberg>
there are three ways to do it
<azonenberg>
MPO plant cable, which is what i did
<azonenberg>
4x LC plant cable, which takes up more space on patch panels and in tray/conduit and might have skew concerns
<electronic_eel>
do the 4 lanes need to be phase stable and have tight length tolerances?
<azonenberg>
I'm not sure, i didnt look into it
<azonenberg>
but i havent found anybody talking about using four LC cables for 40G
<azonenberg>
i have no idea what the 40gbase-sr4 lane to lane skew budget is
<azonenberg>
all of the MPO to 4x LC breakouts i've seen were for 4x 10G
<azonenberg>
anyway, the final option which i did consider was using 1x LC plant cable and a CWDM QSFP+ that runs four wavelengths over a single fiber each way
<azonenberg>
But those are around $200 vs $30 per optic
<azonenberg>
MPO cables cost a bit more than LC cables but it was still cheaper than using WDM optics
<azonenberg>
that only really makes sense IMO if you have a major investment in existing plant cables that arent practical to reinstall
<electronic_eel>
yeah, you just do CWDM if you'd have to pull completely new cables otherwise
<azonenberg>
well i had originally looked into it because i couldnt find MPO keystones
<azonenberg>
i only was able to find a single vendor
<_whitenotifier-f>
[scopehal-apps] azonenberg pushed 1 commit to master [+2/-0/±3] https://git.io/JUZWj
<_whitenotifier-f>
[scopehal-apps] azonenberg 5474836 - Implemented progress dialog for waveform loading. Not displayed for saving yet. Fixes #165.
<azonenberg>
well this is nice, lol. I'm watching a youtube video on a browser running in a VM in my xen box, with sound over RTP and video over SSH+VNC, with no noticeable lag
<azonenberg>
*while* glscopeclient is saturating the gigabit pipe to my NAS loading an 80GB saved waveform dataset
<azonenberg>
About 1.2 Gbps inbound. I think this is the first time the 10G pipe from my desk to the rack has actually run at >1gbps for a sustained period of time
<azonenberg>
i've had trouble fully using the 10G pipe because i don't have enough things at the far side of the pipe that can keep up yet
<azonenberg>
the VMs rarely push more than a few hundred Mbps although they are on a 10G pipe, and the NAS is slooow
<azonenberg>
of course as soon as i build maxwell that will be a different story
<azonenberg>
ooh 1.4 Gbps sustained now
<_whitenotifier-f>
[scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZ81
<_whitenotifier-f>
[scopehal] azonenberg 0f093a6 - Accept both upper and lowercase "k" as prefix for "kilo"
juli965 has joined #scopehal
<_whitenotifier-f>
[scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JUZ41
<_whitenotifier-f>
[scopehal] azonenberg 6a554ef - SPIFlashDecoder: implemented 0x0b fast read
jn__ has joined #scopehal
<electronic_eel>
about 40gbase and using LC breakouts - this paper claims you can have 15 meters length difference between the lanes without hitting the limits
<electronic_eel>
so it seems to me that using lc for your infrastructure cabling and just having mpo to 4x lc breakout cables would be no problem
bvernoux has joined #scopehal
<azonenberg>
electronic_eel: assuming you have enough patch panel space, yes
<azonenberg>
that would be an option
<_whitenotifier-f>
[scopehal-apps] azonenberg opened issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f>
[scopehal-apps] azonenberg labeled issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f>
[scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±3] https://git.io/JUZNF
<_whitenotifier-f>
[scopehal-apps] azonenberg 19e8b03 - ProtocolAnalyzerWindow: now display per packet background colors. Fixes #167.
<_whitenotifier-f>
[scopehal-apps] azonenberg closed issue #167: Protocol analyzer: color code rows based on type of packet - https://git.io/JUZbu
<_whitenotifier-f>
[scopehal-apps] azonenberg opened issue #168: Add filter support to protocol analyzer to show/hide packets matching certain properties (f.ex hide status register polling) - https://git.io/JUZxC
<_whitenotifier-f>
[scopehal-apps] azonenberg labeled issue #168: Add filter support to protocol analyzer to show/hide packets matching certain properties (f.ex hide status register polling) - https://git.io/JUZxC
<bvernoux>
azonenberg, do you plan to fully replace cairo with OpenGL?
<azonenberg>
bvernoux: No. Non performance critical stuff like cursors, axis labels, etc will remain cairo for the indefinite future
<azonenberg>
but when you have tens of thousands of protocol decode packets in a view cairo is slow
<bvernoux>
ok
<azonenberg>
most likely what i will move to near-term is opengl for the colored outlines of protocol events
<bvernoux>
I imagine it is not a simple task to rewrite cairo stuff in OpenGL ...
<azonenberg>
then cairo for the text inside them
<azonenberg>
text in GL is a huge pain and i hide text when the box is too small to fit it
<bvernoux>
especially for text ...
<azonenberg>
So GL-accelerating the text seems unnecessary
<azonenberg>
but the boxes are drawn even when tiny
<azonenberg>
So i figure accelerate the boxes then software render text if it fits
<azonenberg>
all analog and digital waveform rendering is already done in shaders
<bvernoux>
nice
<bvernoux>
I have a friend who has done a lot of OpenGL stuff in the past and he told me too that text is a huge pain in GL ...
<bvernoux>
So I understand now why you mix cairo and GL
<azonenberg>
Yeah
<azonenberg>
cairo makes antialiasing etc really nice, it produces beautiful output, it's just slow
<azonenberg>
and with GL composited rendering is super easy
<azonenberg>
So i render some stuff with cairo and some in compute shaders (not using the GL graphics pipeline, just parallel compute)
<azonenberg>
splat them all out into textures then render a handful of actual GL triangles and use a fragment shader to merge them
<bvernoux>
need to test the Rigol version ;)
<bvernoux>
on my old DS1102E just to see how glscope works
<bvernoux>
IIRC it should be compatible even if the DS1102E is very slow to send data over USB
<_whitenotifier-f>
[scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JUZpZ
<_whitenotifier-f>
[scopehal-apps] azonenberg 11453b8 - ProtocolAnalyzerWindow: laid initial groundwork for display filters (see #168). No actual filtering is performed, but the m_visible bit now controls row visibility.