marcan changed the topic of #asahi-gpu to: Asahi Linux: porting Linux to Apple Silicon macs | GPU / 3D graphics stack black-box RE and development (NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
<jn__>
using the address of the parameter key, rather than its pointer value
azonenberg has joined #asahi-gpu
<azonenberg>
o/
<marcan>
\o
<azonenberg>
Wondering what the status of OpenGL is. Specifically how plausible it is to get GL_ARB_compute_shader in mesa working on the M1 GPU
<azonenberg>
Not an immediate priority but i have a friend getting an M1 mac who is interested in using glscopeclient on it
<azonenberg>
right now that's not possible
<marcan>
cc bloom (I think she's asleep though)
<marcan>
but my uneducated guess is that compute shaders are *easier* than 3D
<azonenberg>
the glscopeclient renderer is basically a software rasterizer in compute shaders that takes advantage of the specialized characteristics of waveform data
<azonenberg>
namely that points are sorted in strict left to right order and in many cases at regular intervals (I have optimizations i can ifdef for that to build two different versions of the shader)
<azonenberg>
The output is a monochrome fp32 texture that i then splat onto a full-viewport pair of triangles and use a fragment shader to tone map to colors
segher has joined #asahi-gpu
monochroma has joined #asahi-gpu
<marcan>
remember this gpu is TBDR; I wonder if that affects your optimizations
<azonenberg>
The glscopeclient renderer displays a total of three fullscreen "quads" of two triangles per render pass
<azonenberg>
all with a single texture and very simple shader
<azonenberg>
First one is Cairo for the grid, background gradient, etc
<azonenberg>
second is the compute shader generated waveform
<azonenberg>
third is Cairo again for axis labels, protocol decode annotations, and cursors
<azonenberg>
The waveform rendering itself is basically
<azonenberg>
spawn one compute thread for each pixel along the x axis
<azonenberg>
within each thread: loop over all samples whose timestamp is in that pixel
<azonenberg>
find min/max Y value, interpolating if the segment from that sample to the previous/next crosses outside the pixel
<azonenberg>
for y=min to max waveform[x][y] += alpha
<azonenberg>
i integrate in shared memory using a couple of threads, then splat out to global memory at the very end since that's slow
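[Editor's note: the per-column rasterization described above can be sketched roughly as follows. This is a serial Python illustration, not the actual glscopeclient shader. The function name `render_waveform`, its parameters, and the linear value-to-row mapping are all assumptions made for the example, and the shared-memory integration step is left out entirely.]

```python
import bisect

def render_waveform(times, values, width, height, t0, t1, v_min, v_max, alpha=0.1):
    """Accumulate a monochrome intensity image, one 'thread' per x pixel.

    Hypothetical sketch: `times` must be strictly increasing, and the image
    is indexed as image[x][y].
    """
    image = [[0.0] * height for _ in range(width)]
    dt = (t1 - t0) / width  # time span covered by one pixel column

    def to_row(v):
        # Map a sample value to a row index, clamped to the image.
        y = int((v - v_min) / (v_max - v_min) * (height - 1))
        return max(0, min(height - 1, y))

    def lerp(i, j, t):
        # Value of the straight segment from sample i to sample j at time t.
        frac = (t - times[i]) / (times[j] - times[i])
        return values[i] + frac * (values[j] - values[i])

    for x in range(width):  # on a GPU, each x would be its own compute thread
        left, right = t0 + x * dt, t0 + (x + 1) * dt
        lo = bisect.bisect_left(times, left)   # first sample at or after left edge
        hi = bisect.bisect_left(times, right)  # first sample at or after right edge
        ys = [values[i] for i in range(lo, hi)]
        # Interpolate where a segment crosses the left or right pixel edge.
        if 0 < lo < len(times):
            ys.append(lerp(lo - 1, lo, left))
        if 0 < hi < len(times):
            ys.append(lerp(hi - 1, hi, right))
        if not ys:
            continue  # no waveform data overlaps this column
        for y in range(to_row(min(ys)), to_row(max(ys)) + 1):
            image[x][y] += alpha
    return image
```

[Because the samples are sorted by time, each column can locate its sample range with a binary search instead of scanning the whole waveform, which is the property the description above relies on.]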
<marcan>
does it do sinc interpolation? :>
<azonenberg>
Not in the shader. there's a sinc filter you can apply to a waveform with arbitrary upsample rate if you want it
<azonenberg>
and then render the upsampled waveform with the shader
<azonenberg>
but it's just another DSP block you can put in the pipeline
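[Editor's note: as a rough illustration of what a sinc upsampling filter does, here is a minimal Python sketch. It is not glscopeclient's filter. The names `sinc_upsample` and `taps`, the integer upsampling factor, and the plain truncated (unwindowed) kernel are assumptions chosen to keep the example short.]

```python
import math

def sinc(x):
    # Normalized sinc: sin(pi x) / (pi x), with the removable singularity at 0.
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_upsample(samples, factor, taps=8):
    """Upsample by an integer factor using a truncated sinc kernel.

    Hypothetical sketch: each output point is a weighted sum of up to
    2 * taps + 1 neighbouring input samples.
    """
    n = len(samples)
    out = []
    for k in range(n * factor):
        t = k / factor               # output position in units of input samples
        centre = int(math.floor(t))  # nearest input sample at or before t
        acc = 0.0
        for i in range(max(0, centre - taps), min(n, centre + taps + 1)):
            acc += samples[i] * sinc(t - i)
        out.append(acc)
    return out
```

[At positions that coincide with an original sample, the kernel is 1 for that sample and approximately 0 for all others, so the original values are preserved up to floating-point error.]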
<azonenberg>
glscopeclient works a lot like gnuradio or VTK/ITK
<azonenberg>
signal sources (channels or imported files), sinks (displayed plots), and processing blocks between them in an arbitrary DAG
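[Editor's note: the source / processing block / sink arrangement described above can be sketched as a small dataflow graph. This is a generic Python illustration, not glscopeclient's real class hierarchy. The `Block` class and the example block names are hypothetical.]

```python
class Block:
    """One node in a processing DAG (hypothetical sketch).

    A source has no inputs; a filter or sink lists the blocks it reads from.
    """

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = list(inputs)

    def evaluate(self, cache=None):
        # Memoize per-evaluation so a block feeding several consumers
        # is computed only once.
        if cache is None:
            cache = {}
        if self.name not in cache:
            args = [b.evaluate(cache) for b in self.inputs]
            cache[self.name] = self.func(*args)
        return cache[self.name]


# A source feeding two filters, which are then combined by a final block.
source = Block("ch1", lambda: [1, 2, 3, 4])
gain = Block("gain", lambda w: [2 * v for v in w], [source])
offset = Block("offset", lambda w: [v + 1 for v in w], [source])
total = Block("sum", lambda a, b: [x + y for x, y in zip(a, b)], [gain, offset])

result = total.evaluate()
```

[Evaluating the final block pulls data through every upstream block it depends on, so the evaluation order falls out of the graph structure rather than being specified by hand.]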
<marcan>
ah, neat
lain has joined #asahi-gpu
<azonenberg>
Anyway, so right now it won't run on any apple platform unless it's an x86 that you dual boot linux on
<azonenberg>
Hence being interested in potential for it to work on M1 at least under linux
<azonenberg>
long term a community opengl 4.x implementation that runs under osx would be nice :p
choozy has joined #asahi-gpu
<bloom>
compute shaders will happen when they happen, easier than 3D but less fun :-p
<azonenberg>
Out of curiosity if you're not reversing any apple driver code or decapping the chip, how are you figuring out anything about the register interface?
<azonenberg>
poking registers at random seems likely to end badly and not tell you very much
<azonenberg>
i assumed most of this kind of stuff had to be done by finding someone in a suitably friendly jurisdiction to reverse the relevant tidbits then write a clean-room spec
<azonenberg>
but judging by the topic that's not the case
<azonenberg>
or is dynamic analysis fair game?
<azonenberg>
watching memory accesses via jtag or something while calling various APIs?
<jn__>
blackbox dynamic RE is the way, AFAIUI
<jn__>
watching library calls by apple's equivalent of LD_PRELOAD
<azonenberg>
the last time i was faced with such a problem it was a xilinx fpga bitstream
<azonenberg>
and my solution was to decap the chip and trace out the circuits :p
<azonenberg>
interestingly enough, under US law semiconductor RE is explicitly protected and legal
<azonenberg>
so it's less gray than anything software related
<azonenberg>
You're even explicitly allowed to use your findings to inform the design of a new chip as long as you aren't making a verbatim copy of the mask layout, or infringing on patents
<segher>
azonenberg: poking randomly is perfectly safe
<segher>
i have done it many times before. yes i know it can be disaster in theory, heh
<bloom>
segher: I figure poking randomly from unprivileged userspace in a machine with SIP/etc intact is fair game
<bloom>
And if something goes horribly wrong I can file a CVE, win-win ;-p
tomtastic has quit [Ping timeout: 265 seconds]
tomtastic has joined #asahi-gpu
dottedmag has quit [Quit: QUIT]
dottedmag has joined #asahi-gpu
<segher>
no, from unlimited mode
<segher>
you can map out where there is mmio, what blocks are aliased, etc.
<segher>
anything that can destroy hardware needs some specific sequence these days (since the 90's or so), not just one poke
<segher>
but yes you never know. take a risk :-)
<jix>
i did poke random MSRs on my ryzen to find the chicken bit to make rr work and I didn't even manage to crash my machine doing that... was much less exciting than I thought it would be ;)
<opticron>
heh...reminds me of a PCI card produced at a company I worked for a long time ago. there was a two-color LED on the back panel, the hardware was missing an interlock, and the driver had a bug that allowed the wrong transistors to be turned on at the same time during a race condition which blew out the LED trace and could cause a fire
<Yuzu>
that's a significant failure mode
<segher>
yes, anything involving external hardware (external to the chip) can do very bad things
<segher>
opticron: that is badly designed hardware of course, but :-)
<opticron>
yes
<opticron>
it was
<segher>
even just adding a dum resistor would help, heh
<segher>
dumb
<opticron>
if it was in the right place, yes
<segher>
well sure, heh
<segher>
but you should have something to limit current on otherwise possibly wide-open paths
<opticron>
I think there *was* a resistor there, but it just wasn't across both of the two possible power-to-ground paths
<opticron>
probably not across either of them, just next to the LED
<jix>
yeah, about that rr thing: some speculation that AMD does affects performance counters in a way that breaks rr, so you need to turn that off. https://hackmd.io/sH315lO2RuicY-SEt7ynGA?view has a write-up of all that (I didn't do much apart from making educated guesses about where to find the bit and trying those on my HW; others had already figured out what the issue is and that some chicken bit to disable it is