<rqou>
apparently it involves jumping to unmapped addresses and relying on a cpu internal register to hold old data values
<rqou>
how the heck was this developed? and how do emulators manage to even emulate this?
<pie_>
wow, old data caching exploit? damn
<pie_>
whats the context of this? im not familiar with it and cant watch the vid right now
<rqou>
the super mario world snes video game
<pie_>
desktop generic exploit or something?
<pie_>
oh
<pie_>
pretty cool
marcus_c has quit [Ping timeout: 240 seconds]
marcus_c has joined ##openfpga
maaku has quit [Quit: No Ping reply in 180 seconds.]
maaku has joined ##openfpga
azonenberg_hk has joined ##openfpga
<azonenberg_hk>
pie_: synth wouldnt be too hard
<azonenberg_hk>
you'd need PAR though
<azonenberg_hk>
Which would be interesting
<azonenberg_hk>
especially given the short wire length limitation (signals go away after ~16 blocks)
<azonenberg_hk>
and the fact that vias are diagonal and not vertical
<azonenberg_hk>
you need massive numbers of buffers in redstone vs other logic families
<rqou>
i thought modern asics already needed buffers for long wires
<rqou>
also, you actually play minecraft?
<azonenberg_hk>
Long wires
<azonenberg_hk>
Not sixteen lambdas
<azonenberg_hk>
:p
<rqou>
ok, true :P
<azonenberg_hk>
I havent played in a bit, but yeah
<rqou>
doing it with redpower is much easier
<azonenberg_hk>
My point is, the ratio of buffer to logic has to be a lot higher
<rqou>
redpower is iirc 256 blocks
<rqou>
idk about the clone
<azonenberg_hk>
oh theres a mod with slower degradation?
<rqou>
you haven't seen redpower?
<azonenberg_hk>
No
<rqou>
ignoring the drama part, it adds a large number of more advanced redstone tiles
<azonenberg_hk>
and if you wrote a place-and-route it shouldn't care once you implement an "add buffers every so often" constraint
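The "add buffers every so often" constraint could be sketched as a small post-routing pass. This is only a minimal sketch, assuming a net is a straight run of blocks and that a signal survives at most `max_run` blocks before it needs a repeater (15 is assumed here for vanilla redstone, the log says ~16). `repeater_positions` is a hypothetical helper, not part of any existing tool.

```python
def repeater_positions(wire_length, max_run=15):
    """Return block offsets along a straight wire where a repeater is placed.

    Assumption: a repeater occupies one block and restores full strength,
    and at most max_run plain wire blocks may sit between two drivers.
    """
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    positions = []
    pos = max_run  # first repeater sits right after max_run wire blocks
    while pos < wire_length:
        positions.append(pos)
        pos += max_run + 1  # skip the repeater block, then allow another full run
    return positions
```

With a slower-degrading wire the same pass applies unchanged, only with a larger `max_run`, which is why the router itself should not need to care.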
<azonenberg_hk>
oh?
<rqou>
and wires that degrade slower and go up walls
<azonenberg_hk>
and if i was going to make an fpga in redstone
<azonenberg_hk>
i'd do it in pure minecraft
<rqou>
my favorite pair of tiles in redpower are:
<rqou>
one tile that just lets two wires overlap in the h/v direction without touching
<azonenberg_hk>
just with a world edit or something to generate the geometry vs building it by hand :p
<azonenberg_hk>
and ooh
<rqou>
and another tile that when the h wire is off does not affect the v wire
<rqou>
but when the h wire is on the v wire is forced on
<rqou>
so these tiles trivially pack into very dense roms
<azonenberg_hk>
nice
<lain>
oooh yeah redpower
<azonenberg_hk>
I'll look at it
<rqou>
with h wires being word lines and v wires being bit lines
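The ROM described above can be modeled in a few lines. A minimal sketch, assuming one word line (h wire) per address and one bit line (v wire) per output bit; `read_rom` and the `rom` contents are made up for illustration and do not model any actual mod behavior such as timing.

```python
def read_rom(contents, address):
    """Model a wired-OR ROM built from the two tiles described in the log.

    contents[row][col] is True where the "h on forces v on" tile is placed
    and False where the plain crossover tile is placed.
    """
    word_lines = [row == address for row in range(len(contents))]
    width = len(contents[0])
    bit_lines = [False] * width
    for row, active in enumerate(word_lines):
        for col in range(width):
            # A coupling tile drives the bit line only when its word line is on.
            if active and contents[row][col]:
                bit_lines[col] = True
    return bit_lines


# Hypothetical 3-word by 4-bit ROM image.
rom = [
    [True, False, True, False],
    [False, False, True, True],
    [True, True, True, True],
]
```

Because each cell is a single tile, the density comes from the layout being nothing more than this boolean matrix.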
<azonenberg_hk>
could always make two back ends for the par
<azonenberg_hk>
What i was thinking is, wired-AND / wired-OR arrays with buffers every few cells
<lain>
doesn't that also have the ability to colour wires and have different colour wires in the same path, but they only connect with same-coloured wires?
<rqou>
we should make a backend for yosys and submit it on april 1st
<lain>
or was that a different mod
<rqou>
yes, it had that too iirc
<azonenberg_hk>
have vertical redstone pistons to set/clear bitstream bits i think
<lain>
did you see sethbling's atari emulator in minecraft?
<azonenberg_hk>
do a full on PLA
<rqou>
i just linked that a few hours ago
<rqou>
the cpu is done in command blocks
<lain>
oh
<lain>
that's what you're talking about :D
<lain>
just scrolled up
<rqou>
btw redpower was also rather notable at having an energy network that actually simulated ESL
<azonenberg_hk>
ooooh you know what would be fun
<azonenberg_hk>
lain: Make a coolrunner-2 bitstream compatible emulator in redstone
<rqou>
and then fix up the yosys xc2 backend?
<azonenberg_hk>
i need to make a coolrunner verilog model anyway to verify my toolchain against actual hardware
<azonenberg_hk>
and exactly
<rqou>
i'm down for doing this
<rqou>
submitting it on april 1st? :P
<azonenberg_hk>
Lets shoot for april fools '18
<azonenberg_hk>
i can't get it done this year
<azonenberg_hk>
not in 4 months with all i have going on
<rqou>
btw i've been a little quiet about this but i'm working on yet another attempt at a vhdl frontend
<azonenberg_hk>
for yosys?
<rqou>
yeah
<azonenberg_hk>
:)
<azonenberg_hk>
i dont use vhdl but having support would be nice
<rqou>
this would be like the third attempt :P
<azonenberg_hk>
if you want to make vhdl libraries for my greenpak tool that would be great
<rqou>
i was thinking to myself, "how hard could it be?"
<azonenberg_hk>
and docs etc
<rqou>
the ebnf for vhdl is godawful
<rqou>
it's also not normative
* lain
<3 vhdl
<lain>
I mean
<lain>
the language sucks
<rqou>
oh btw vhdl has brilliant features like this:
<rqou>
the lex rule for a character literal:
<rqou>
"'"[\x20-\x7E\xA0-\xFF]"'"
<lain>
but verilog is a little too magical for me
<rqou>
and yes, i'm sure
<lain>
vhdl is at least either going to do what you expect, or yell at you because it can't decide
<lain>
whereas verilog is more prone to "they probably meant <random logic>"
<rqou>
also, the regex character group for what is allowed in an identifier is [A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]
<rqou>
again, yes i'm really sure
<lain>
lol
<rqou>
in case you haven't figured out what's going on, vhdl explicitly requires iso 8859-1 latin-1
<lain>
how awful :D
<lain>
one day I'll finish hdl#
<rqou>
so e.g. 0xD7 and 0xF7 (multiply and divide symbols) are not considered letters
<rqou>
and are therefore not allowed in identifiers
<rqou>
same with 0xA0-0xBF
<rqou>
which are mostly considered "other special characters"
<azonenberg_hk>
rqou: and utf-8 isnt allowed either?
<rqou>
it's explicitly latin-1 only
<rqou>
and it matters because the case folding table is also in the LRM
<lain>
in C#, identifiers can be unicode, including emoji. in case you felt variables like i, j, and k weren't ambiguous enough...
<rqou>
because identifiers are case-insensitive
<azonenberg_hk>
rqou: that is one of my biggest objections to vhdl
<azonenberg_hk>
lain: int [poop] = 42;
<lain>
:D
<lain>
azonenberg_hk: perfectly legal
<rqou>
verilog just says "8-bit ascii" without specifying what that means
<rqou>
presumably including utf-8
<azonenberg_hk>
rqou: I see
<lain>
I mean, vhdl's charset choices are stupid, but at least it has a specification?
<lain>
:P
<azonenberg_hk>
Lol
<azonenberg_hk>
So
<azonenberg_hk>
Suppose we plan to eventually support a yosys vhdl back end on our toolchain
<azonenberg_hk>
Anybody here want to volunteer to be the maintainer for the greenpak vhdl libraries?
<rqou>
and also e.g. 0xC0 and 0xE0 are explicitly upper/lower case pairs
<azonenberg_hk>
(No work required until we have a functional synthesizer)
<rqou>
so if you don't mind your utf-8 being mysteriously corrupted and rejected, you can still use it :P
<azonenberg_hk>
well ok, no work other than reviewing the verilog library and finding anything that might not work well with vhdl
<azonenberg_hk>
:p
<lain>
haha
<rqou>
btw the spec also doesn't allow consecutive or trailing underscores in identifiers
<rqou>
i have no idea why not
<lain>
rqou: but do you plan on doing vhdl-2008?
<azonenberg_hk>
rqou: so foobar_ok is allowed but foobar__ok is not?
<azonenberg_hk>
weird
<rqou>
yeah, that's what the grammar rule says
<lain>
yeah the consecutive/trailing underscore thing is obnoxious. I ran into that when doing the hdl# -> vhdl codegen
<lain>
I was doing autogenerated stuff as __thing, but of course that explodes
<rqou>
leading underscore is also forbidden
<azonenberg_hk>
lol
<lain>
really?
<lain>
I thought it was only trailing
<azonenberg_hk>
i prefer C identifier semantics
<rqou>
basic_identifier ::= letter { [ underline ] letter_or_digit }
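The grammar rule and the Latin-1 letter class quoted earlier translate fairly directly into a regular expression. A minimal sketch, assuming the input has already been decoded from ISO 8859-1 into a Python string; `is_basic_identifier` is a hypothetical helper, and reserved words are not excluded here.

```python
import re

# Letter class quoted in the log (ISO 8859-1 code points, excluding 0xD7 and 0xF7).
LETTER = "A-Za-z\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff"

# basic_identifier ::= letter { [ underline ] letter_or_digit }
BASIC_IDENTIFIER = re.compile("[" + LETTER + "](?:_?[" + LETTER + "0-9])*")


def is_basic_identifier(text):
    """True if text matches the basic_identifier rule."""
    return BASIC_IDENTIFIER.fullmatch(text) is not None
```

The optional single underscore in front of each following character is what rules out leading, trailing, and consecutive underscores in one go.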
<lain>
is that from 2008 spec?
<rqou>
you can use the \foo\ extended identifiers though
<rqou>
significantly more sane
<azonenberg_hk>
cant start with a number, otherwise any identifier-legal char is allowed anywhere
<lain>
bah, I need to re-read the spec
<azonenberg_hk>
any number in any order
<rqou>
although extended ids are not in the same "namespace" as basic ids
<rqou>
so foo and \foo\ are different
<rqou>
also, 0x00-0x1F and 0x7f-0x9f are still forbidden
<rqou>
can utf-8 generate bytes in the C1 control range?
<rqou>
if not then you can use it in extended identifiers
<lain>
that would be good to know, it may simplify my codegen
<rqou>
i just checked, it can't
X-Scale has quit [Ping timeout: 265 seconds]
<rqou>
so you can use utf-8 in extended identifiers
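The question of whether UTF-8 produces bytes in the forbidden ranges can be checked mechanically. A minimal sketch, assuming a tool that reads the source byte-by-byte as Latin-1; `forbidden_bytes` is a hypothetical helper. Note that UTF-8 continuation bytes cover 0x80-0xBF, so this check does report bytes in 0x7F-0x9F for some characters, which suggests the conclusion above only holds for a subset of code points or for tools that decode UTF-8 themselves.

```python
def forbidden_bytes(text):
    """Return the UTF-8 bytes of text that fall in 0x00-0x1F or 0x7F-0x9F."""
    return [b for b in text.encode("utf-8") if b <= 0x1F or 0x7F <= b <= 0x9F]
```

For instance, U+00E9 encodes as C3 A9 and passes, while U+00C0 encodes as C3 80 and does not.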
<lain>
my codegen targets vhdl-2000 for compat reasons, which makes things a bit more annoying
<lain>
ah nice
<lain>
(but 2000 also has extended identifiers, so that seems fine)
<rqou>
except you can't use the C0 range (0x00-0x1F) or delete (0x7F)
<rqou>
but those aren't characters anyways :P
<rqou>
i actually have no idea what changed in 2000 vs 2008
<rqou>
i'm just working off of the 2008 lrm
<lain>
I've only read through the 2000 lrm
* azonenberg_hk
has only looked at the verilog LRM
<azonenberg_hk>
so cant say
<rqou>
do you need the 2008 lrm (as long as you don't leak it)?
<azonenberg_hk>
generate statements were added recently to verilog
<rqou>
if you leak it i probably get banhammered :P
<azonenberg_hk>
i forget if 2001 or 2008
<lain>
rqou: sure, I'll mark it as private in my fileserver so I don't try sharing it :P
<rqou>
"Authorized licensed use limited to: Univ of Calif Berkeley. Downloaded on XXXXX from IEEE Xplore. Restrictions apply."
<lain>
I've purchased quite a few specs
<lain>
but that one is mad overpriced :P
<rqou>
i can get all the ieee ones with varying degrees of "legit"-ness
X-Scale has joined ##openfpga
rqou_ has joined ##openfpga
<lain>
I think a lot of the 2008 changes were library-related but I'm not sure
<rqou_>
lain, azonenberg_hk: look at my hostname now :P
<lain>
there's things like unary logic operators, like you can "or some_vector" and it'll OR-reduce the vector
<lain>
rqou_: nice
<azonenberg_hk>
verilog has had that for ages
<azonenberg_hk>
:P
<lain>
I mean
<lain>
functions existed before, like you can trivially have or_reduce(vector)
<lain>
but now it's a builtin operator
rqou_ has quit [Client Quit]
<lain>
right, I should eat
<lain>
but /what/ should I eat
<lain>
am I reading this right? spaces are allowed in extended identifiers?
<lain>
so I could have \butts are pretty great\
<rqou>
that's how i read it yes
<lain>
:>
<lain>
one thing confuses me, I wonder if it's a typo...
<lain>
yes, it seems it was a typo in the 2000 spec
<lain>
it showed "\BUS \bus\" and said these are two different identifiers
<lain>
in the 2008 spec the same example shows "\BUS\ \bus\"
<lain>
yay errata.
<azonenberg_hk>
Lol
<azonenberg_hk>
So it's case sensitive?
<lain>
yes, they are
<azonenberg_hk>
in extended?
<rqou>
extended ids are case sensitive
<lain>
and seem to support utf8
<rqou>
you can have three distinct things: foo, \foo\, \FOO\
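The "three distinct things" rule amounts to a comparison key for identifiers. A minimal sketch, assuming extended identifiers are passed in with their enclosing backslashes and that Python's `str.lower` is an acceptable stand-in for the case folding table in the LRM; `identifier_key` is a hypothetical helper and doubled backslashes inside an extended identifier are not handled.

```python
def identifier_key(name):
    """Return a key such that two identifiers are the same iff their keys match.

    Basic identifiers fold case; extended identifiers are case sensitive
    and live in a separate namespace from basic ones.
    """
    if len(name) >= 2 and name.startswith("\\") and name.endswith("\\"):
        return ("extended", name[1:-1])
    return ("basic", name.lower())
```

A symbol table keyed this way keeps `foo` and `FOO` together while leaving `\foo\` and `\FOO\` as separate entries.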
<lain>
now I can eliminate all those stupid-ass transformations in the codegen
<lain>
well
<lain>
I better check that tools support extended identifiers first
<rqou>
ghdl does
<rqou>
haven't tried xst
<rqou>
it shouldn't be that difficult, afaik \ isn't used anywhere else in the grammar
<whitequark>
azonenberg_hk: thats by number of commits
<whitequark>
ahmedirfan1983 and wspeirs put in as many or more lines as you
<whitequark>
well almost as many
<azonenberg_hk>
whitequark: yes, i didnt go see how big the commits were
<azonenberg_hk>
Lines are not a great measurement either
<azonenberg_hk>
What surprised me more is, there were not a lot of people working on it on an ongoing basis
<azonenberg_hk>
i thought it was a bigger team
<azonenberg_hk>
like, all i did was write ~2 passes and a tech library
<whitequark>
HDL synthesis is even trickier than compilers...
<rqou>
how so?
<whitequark>
and not that many people are touching FPGAs in the first place
firebird_ has quit [Ping timeout: 252 seconds]
<whitequark>
rqou: there's approximately 500 tutorials on writing a compiler
<whitequark>
show me a tutorial on writing a HDL synthesis tool
<whitequark>
for one
<rqou>
but many aspects of hdl synthesis are similar to compilers
<whitequark>
its a niche tool and it doesnt surprise me at all that it receives so little attention
<whitequark>
well.... yes and no
<whitequark>
the middle end is completely different IMO
<whitequark>
the similarity ends at "both are using graphs a bunch"
<rqou>
but lots of algorithms apply to just the graphs
<whitequark>
there are some very high-level shared concepts, like having IR and passes
<rqou>
also, you're forgetting interesting tools like labview
<azonenberg_hk>
whitequark: well yosys is useful for asic too
<whitequark>
so? algorithms aren't the hard part of writing compilers
<azonenberg_hk>
but... not a lot of people touch that either :p
<whitequark>
when I need an algorithm I go download an ACM paper from sci-hub
<whitequark>
most of middle-end work is hunting down miscompilations and plumbing things together
<rqou>
ah that's the specific stuff you're referring to
<rqou>
yeah, that's completely different
<rqou>
but that's the most ad-hoc and informal part of both compilers and hdl synthesis
<whitequark>
this is the bulk of work that goes into a synthesis or compiler tool
<whitequark>
I don't care if the high-level concepts are sometimes similar, that doesnt help with day-to-day operation
<rqou>
yes, but i wouldn't say hdl is any harder
<rqou>
i would say they're about similar amounts of hard work
<whitequark>
I never said it's hard
<whitequark>
harder
<whitequark>
it's *trickier*
<whitequark>
because by now we have really hammered out writing a decent compiler
<whitequark>
for a typical modern arch
<whitequark>
I can do that while blindfolded ;p
<azonenberg_hk>
EDA is decades behind software dev in terms of maturity
<whitequark>
also no
<whitequark>
you can build on existing compiler tools
<whitequark>
for example: csmith
<whitequark>
for example: creduce
<whitequark>
for example: libclang
<whitequark>
etc
<whitequark>
in EDA you pretty much have to jumpstart the entire toolkit you need to debug it from scratch
<whitequark>
which clifford did, in fact
<azonenberg_hk>
look at it this way
<azonenberg_hk>
there are multiple f/oss c compilers
<azonenberg_hk>
of various complexity, for various arches
<azonenberg_hk>
You know how many decent open source hdl synthesizers there are?
<azonenberg_hk>
One
<whitequark>
look at it this way
<whitequark>
people are writing f/oss c compilers literally as a party trick
<whitequark>
(tcc)
<rqou>
you can probably do a party trick verilog synthesis too
firebird_ has joined ##openfpga
<rqou>
no guarantees about correctness
<rqou>
:P
<whitequark>
tcc can boot the linux kernel
<whitequark>
if your party trick verilog synthesis tool can synthesize antikernel
<rqou>
can it still?
<whitequark>
then i'd like to use it :p
<whitequark>
eh, I think you need to patch the sources a bit?
<whitequark>
but the point is that code generation is solid even if support for weird gcc extensions isn't necessarily
<rqou>
yeah hdl can get way more weird correctness bugs
<whitequark>
thats exactly what im talking about
<rqou>
my father's answer is "just don't use the features that you know might have bugs" :P
<whitequark>
so
<rqou>
granted he's been using hdl synthesis basically since it was first mainstream
<whitequark>
don't write hdl?
<rqou>
so he's seen all the bugs
<rqou>
hey, it's an improvement over schematics :P
<whitequark>
I've seen EDA tools that *weren't* an improvement over schematics
<whitequark>
but not messing with K-maps is priceless
<rqou>
but yes, you're right that he basically didn't take much advantage of the potential power of hdl tools
<rqou>
hey, he claimed that back when he first became director of hardware he had to fight to convince the engineers that hdl synthesis could actually be used at all
<rqou>
(using either Abel or AHDL at that time)
<rqou>
it's pretty scary to think that the company my father worked for that did telco-grade networking equipment compiled their fpga bitstreams on an overclocked gaming rig (minus the gpu)
<rqou>
(a lot of validation was done afterwards)
<azonenberg_hk>
lol
<azonenberg_hk>
oh joy
<rqou>
they had a timing violation slip through once
<rqou>
because they didn't set up the constraints properly to silence and/or catch the appropriate things
<rqou>
my father very angrily rejected the "i'll just compile it again" solution :P
<azonenberg_hk>
Lol
<rqou>
i convinced my course professor to accept a ~5ps timing violation
<whitequark>
lol
<rqou>
it was on iirc the dqs line of the ddr controller that we didn't write
<rqou>
it was caused because our crappy design didn't meet timing so i constrained the FFs on the chip until it did
<rqou>
and this congested the region of the chip enough that the dqs was forced to a longer path or something like that
<whitequark>
should've tried pseudoephedrine, I heard it's good for congestion
<rqou>
i thought it's just good for synthesizing meth? :P
<rqou>
does RU do meth much?
<azonenberg_hk>
Lol
<whitequark>
uhm, no, it's a really good nasal decongestant
<whitequark>
so good that someone published a paper on synthesizing pseudoephedrine *from* meth
<rqou>
wtf
<whitequark>
full of brilliant quotes like
<whitequark>
"A quick search of several neighborhoods of the United States revealed that while pseudoephedrine is difficult to obtain, N-methylamphetamine can be procured at almost any time on short notice and in quantities sufficient for synthesis of useful amounts of the desired material."
<rqou>
lol
<rqou>
so what drugs are popular in RU?
<rqou>
does HK still have anti-ketamine propaganda? (haven't been there in a while)
<whitequark>
RU does meth a lot but almost none of it comes from pseudoephedrine
<whitequark>
you can tell because it's a racemate and not the D enantiomer
<rqou>
interesting
<rqou>
what is the precursor then?
<whitequark>
in fact if you specifically need the D enantiomer for something, you're going to pay for it, because the way it's produced is by selective precipitation
<whitequark>
dunno, I was never interested
<whitequark>
rqou: you can always just go look at RAMP if you're curious...
<whitequark>
... or ask them a question or w/e
<rqou>
but you somehow now that most of it is racemic mixture?
<rqou>
*know
<rqou>
"The Russian Anonymous Marketplace or RAMP is a Russian language forum with users selling a variety of drugs on the Dark Web."
<rqou>
thanks wikipedia
<whitequark>
I needed [REDACTED] for [REDACTED] and I've learned that as a side effect
<whitequark>
I *really* dislike meth users ever since I had to refactor some Ruby code written by one
<whitequark>
that also might have had some cocaine in it, reportedly
<rqou>
wut
<whitequark>
what
<whitequark>
i worked at a startup
<whitequark>
what did you expect exactly
<rqou>
i guess that makes sense :P
<whitequark>
a startup that permitted remote work from, for example, Dominican Republic
<whitequark>
I have no idea what drugs are popular in RU, look up WHO statistics or something?
<lain>
man
<rqou>
there was a bunch of fearmongering about "krokodil" a while back
<whitequark>
fearmongering?
<whitequark>
nah that's what you do if you're dirt poor
<whitequark>
I believe that issue is still ongoing
<rqou>
there were things like "it turns you into a zombie!!!!1111oneone" :P
<lain>
I worked at a place doing pcb layout for a couple years. half the job was cleaning up after my predecessor who was fired for, among other things, doing a variety of drugs on the job...
<whitequark>
do you know about the HIV epidemic that's one of the worst in the world
<rqou>
in RU?
<lain>
pcb files in System32, pcb files in Program Files, pcb files everywhere but the fileserver...
<whitequark>
or a tuberculosis epidemic so bad WHO had to literally invent a new category for it, "XDR"
<rqou>
you mentioned that one
cr1901_modern1 is now known as cr1901_modern
<rqou>
"NOTE—If an access value is copied to a second variable and is then deallocated, the second variable is not set to null and thus references invalid storage."
<lain>
one time I was redoing the layout on a board and there's this sprawling arrangement of like 9 transistors for some sort of logic, I forget what exactly. anyway, they were a total mess, like he just randomly placed them. took up 1/4 the board space. turns out they arranged quite neatly together into a small corner of the board when you apply brain to it :P
<rqou>
"Although the spread of HIV has been stemmed in sub-Saharan Africa, in Russia the rate of HIV infection is rising 10 to 15 percent each year"
amclain has quit [Quit: Leaving]
<rqou>
wtf RU why are you moving _backwards_?
<cr1901_modern>
Putin? (/s?)
<lain>
I guess before being fired, he made some threats toward the boss... so on a friday, the boss sits him down in the conference room, puts his gun on the table, and says "don't come back monday. you no longer have a job here."
<whitequark>
lain: his.... gun?
<lain>
(as in, the dude was threatening to kill the boss)
<lain>
the boss was a bit nutty
<lain>
he wanted to make it clear that he kept a weapon on him at all times, even in the office
<lain>
he's not the sort of person you want to casually make death threats toward lol
<whitequark>
well if you hire people like that I *suppose* having a weapon on you at all times is rational
<lain>
oh he was not a rational being.
<rqou>
"In 1997, it made [opioid reduction] use illegal, punishable with up to 20 years in prison."
<lain>
if not for the boss, I probably would have kept that job a lot longer
<rqou>
wtf russia
<lain>
but he sorta slowly went insane
<lain>
it got bad
<lain>
the sound of a filing cabinet being kicked in became too regular
<azonenberg_hk>
Oh joy
<lain>
yeahhhhh
<azonenberg_hk>
meanwhile, the cofounder of my first startup turned out to be a pedophile
<azonenberg_hk>
So... crazy people abound
<lain>
aye
<azonenberg_hk>
So, on topic... i have an interesting problem re the DCMP/PWM/ADC stuff
<rqou>
file types are a native part of vhdl
<rqou>
not a $ function like verilog
<rqou>
i assume this is not required to be synthesizable?
<rqou>
so you can invoke external code at synthesis time
<rqou>
c lifford told me that as far as he knows yosys is the only synthesis tool that supports that
<whitequark>
huh
<whitequark>
I mean, that's not really synthesis
<whitequark>
er
<rqou>
you can use it to compute coefficients or whatever
<whitequark>
not really "synthesizable"
<whitequark>
$readmemb isn't really "synthesized" either IMO
<whitequark>
if we want the distinction to be meaningful, anyhow
<rqou>
hmm vhdl doesn't seem to specify what format the file io files should be in
<rqou>
there is TEXTIO for text files (duh)
<rqou>
but there's also the ability to serialize arbitrary types
<lain>
I've wondered about vhdl file io stuff, but been too lazy to look into it
<rqou>
unlike $readmemb which is afaik standardized
<azonenberg_hk>
Anyway
<azonenberg_hk>
So I have GP_CLKBUF which basically functions like a BUFG
<azonenberg_hk>
or more like a BUFIO i guess, since it's dedicated routing for some hard IP
<azonenberg_hk>
there's four of these for counters and related clocking and one for the SPI core
<azonenberg_hk>
This much is taken care of and implemented already
<azonenberg_hk>
The interesting bit is, there's a mux
<azonenberg_hk>
The inputs are the ring oscillator clock, RC oscillator clock, and outputs of CLKBUF_2 and CLKBUF_4
<azonenberg_hk>
Output of that mux drives the ADC and optionally DCMPs (alternatively DCMPs can be clocked by CLKBUF_1 directly)
<azonenberg_hk>
And it's bitstream programmed, not runtime selectable
<azonenberg_hk>
Do you think that belongs as a primitive? How would you even structure it?
doomlord has joined ##openfpga
<azonenberg_hk>
Would you make a 1-input 1-output primitive called something like GP_ALTCLKBUF and have a mux setting instead of normal fabric routing?
<azonenberg_hk>
Would you try to cram it into an alternate encoding of GP_CLKBUF?
<azonenberg_hk>
i.e. same primitive name and just change the bits under the hood?
<azonenberg_hk>
lain whitequark: ideas?
<azonenberg_hk>
easiest option to implement is probably to make a uniquely named primitive
<lain>
hm
<lain>
¯\_(ツ)_/¯
<azonenberg_hk>
I'm leaning toward making it a GP_CLKBUF but with a different bitstream encoding
<azonenberg_hk>
just to avoid exposing this complexity to the user
<azonenberg_hk>
aaand i found another error in the DCMP documentation
<azonenberg_hk>
yeeah i am going to have a long email for Nazar when I get home
<azonenberg_hk>
the queue of bugs is long now :p
<azonenberg_hk>
Multiple spots where the bitstream indexes in section 16 are totally wrong
<azonenberg_hk>
plus i think one in another section
scrts has quit [Ping timeout: 260 seconds]
<rqou>
according to ghdl z <= a'event'event; is forbidden but z <= a'quiet'event; is allowed
<rqou>
need to check if that is what the lrm says
<azonenberg_hk>
wait what?
<rqou>
'delayed, 'stable, 'quiet, and 'transaction are special hardcoded attributes
<rqou>
i haven't gotten to the part that explains how they work yet :P
scrts has joined ##openfpga
<rqou>
wow ghdl is a giant ad-hoc parser that somehow manages to parse most (all?) of the grammar
<azonenberg_hk>
lool
<rqou>
it's also done by one guy
<rqou>
and written in ada so nobody wants to deal with it :P
<rqou>
the top two contributors are both the same guy
<rqou>
with over 1.1 million lines and about 1200 commits
<rqou>
why do foss hdl-related projects all have really poor bus factors?
<rqou>
imho both ghdl and yosys have a bus factor of 1
<azonenberg_hk>
bus factor?
<rqou>
"The "bus factor" is the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel."
<rqou>
afaik this is allowed by the grammar but semantically invalid
firebird_ has quit [Ping timeout: 250 seconds]
<whitequark>
rqou: do you realize how much software depends on curl
<whitequark>
and depends on e.g. curl ssl validation working properly
<rqou>
wait really?
<whitequark>
curl might not be all that large but it's still critical infra
<whitequark>
yes
<rqou>
i thought it just downloaded files
<whitequark>
and i don't even yet include libcurl
<whitequark>
yes.
<whitequark>
package managers.
<whitequark>
how do you think they work.
<rqou>
i can do that by just sending a GET with a Host: header
<whitequark>
e.g. opam still relies on curl https:// for security
<rqou>
hmm i assumed that they used some http request library thing
<whitequark>
yes.
<rqou>
e.g. python has their own
<whitequark>
python doesn't need curl, yes
<rqou>
also, i thought package managers used gpg to verify packages?
<whitequark>
not really
<rqou>
debian doesn't even always download over https
<whitequark>
debian is an exception
<whitequark>
btw, that's a stupid idea anyway
<whitequark>
why? well, turns out debian signature validation code had a bug you could exploit by mitming http trivially
<whitequark>
if they used the https transport that wouldn't be a problem
<rqou>
wait it did?
<rqou>
what was the bug?
<whitequark>
most language-specific package managers rely exclusively or heavily on https because you can't make people use gpg under threats of death
<rqou>
also, gpg allows untrusted mirrors safely
<whitequark>
you can always use untrusted mirrors if you have centralized metadata
<whitequark>
(in case of opam I'm only talking about metadata)
<rqou>
but then you need your own sig verification
<rqou>
e.g. via gpg
<whitequark>
no, just compute a sha512
<whitequark>
not exactly rocket science
<rqou>
i mean, checking a cert isn't rocket science either
<rqou>
but Apps(TM) still can't get it right
<rqou>
(except pokemon go apparently)
<rqou>
:P
<whitequark>
no, checking a cert is pretty hard
<whitequark>
checking sha512 correctly is literally comparing two arrays of bytes
<whitequark>
(and remarkably people still manage to mess that up)
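Comparing two arrays of bytes can be sketched as follows. This is a minimal sketch, assuming the expected digest arrives as a hex string from metadata that was itself fetched over a trusted channel; `matches_sha512` is a hypothetical helper, while `hashlib.sha512` and `hmac.compare_digest` are standard library calls.

```python
import hashlib
import hmac


def matches_sha512(data, expected_hex):
    """Compare the SHA-512 of data against a hex digest from trusted metadata."""
    actual = hashlib.sha512(data).digest()
    try:
        expected = bytes.fromhex(expected_hex)
    except ValueError:
        return False  # malformed digest: reject rather than guess
    # Compare raw bytes, never strings that a language might coerce to numbers.
    return hmac.compare_digest(actual, expected)
```

Decoding to bytes first sidesteps the string-coercion and early-termination mistakes mentioned in the messages that follow.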
<rqou>
yeah, ask a php programmer to do that :P
<rqou>
"abcdef....00" == "abcdef....01" because they both got converted to floating point
<whitequark>
I know yes
<rqou>
or nintendo? :P
<rqou>
strncmp vs memcmp, what's the difference?
<rqou>
pkcs#1 padding, what's that?
<whitequark>
lol
<rqou>
you know that's how the wii originally got broken into right?
<rqou>
set the signature to all zeros
<rqou>
mess with some unused bytes until you get lucky and the real hash also starts with a zero
<rqou>
boom, signed
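The steps above can be simulated to see why a string comparison is fatal here. A minimal sketch, assuming SHA-1 as a stand-in hash, two spare bytes appended to the content, and an all-zero expected hash coming out of the forged signature; `strncmp_equal` and `find_padding` are hypothetical helpers that only mimic the described behavior.

```python
import hashlib


def strncmp_equal(a, b, n):
    """Mimic C strncmp(a, b, n) == 0: comparison stops at the first NUL byte."""
    for x, y in zip(a[:n], b[:n]):
        if x != y:
            return False
        if x == 0:
            return True  # both strings end here, the rest is never examined
    return True


def find_padding(payload):
    """Try values of two spare bytes until the hash of the content starts with 0x00."""
    for counter in range(65536):
        candidate = payload + counter.to_bytes(2, "big")
        if hashlib.sha1(candidate).digest()[0] == 0:
            return candidate
    raise RuntimeError("no suitable padding found")


forged_hash = bytes(20)  # what the all-zero signature is assumed to yield
content = find_padding(b"modified content")
real_hash = hashlib.sha1(content).digest()
```

Here `strncmp_equal(real_hash, forged_hash, 20)` accepts the forgery even though the two hashes differ, whereas a `memcmp`-style comparison of all 20 bytes would reject it.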
<rqou>
:P
<whitequark>
lol
<whitequark>
amazing
<rqou>
second only to That Litigious Company That Couldn't ECDSA Correctly
<whitequark>
that was playstation, right?
<rqou>
yes
<whitequark>
what geohot got sued for?
<rqou>
yes
<whitequark>
what was the bug anyway
<rqou>
sony's ecdsa incorrectly used the same k for multiple signatures
<rqou>
so multiple signatures had the same r value
<whitequark>
pffff
<rqou>
this allows computing the private key
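The algebra behind recovering the key from a repeated k is short. A minimal sketch, assuming the textbook ECDSA equation s = k^-1 (z + r*d) mod n and using the secp256k1 group order purely as a convenient prime modulus (not the curve from the incident); the curve arithmetic is skipped and all values are toy numbers. Requires Python 3.8+ for `pow(x, -1, n)`.

```python
# Order of the secp256k1 group, used only as a convenient large prime modulus.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def sign_s(z, r, k, d):
    """The s half of an ECDSA signature: s = k^-1 * (z + r*d) mod N."""
    return pow(k, -1, N) * (z + r * d) % N


def recover_private_key(z1, s1, z2, s2, r):
    """Recover d from two signatures that share the same nonce k (same r)."""
    k = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
    return (s1 * k - z1) * pow(r, -1, N) % N


# Toy values: r would normally be the x coordinate of k*G; any nonzero value
# works for the algebra, so the point multiplication is left out.
d, k, r = 0x1234567890ABCDEF, 0xCAFEBABE, 0xDEADBEEF
z1, z2 = 1111, 2222
s1, s2 = sign_s(z1, r, k, d), sign_s(z2, r, k, d)
```

Subtracting the two signature equations cancels the r*d term, which is why two messages signed with one k are enough.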
<rqou>
this is then coupled with an unchecked memcpy to exploit and dump the symmetric keys
<cr1901_modern>
"mess with some unused bytes until you get lucky and the real hash also starts with a zero" <-- wow
<cr1901_modern>
that
<rqou>
this happens with pretty good probability (close to 1)
<cr1901_modern>
's good. Very good!
<whitequark>
cr1901_modern: uhm
<whitequark>
i think that given that hashes are ~randomly distributed
<whitequark>
you need about 256 tries to get one
<whitequark>
on average
<rqou>
yeah, and iirc there were two unused bytes available
<cr1901_modern>
Still interesting
firebird_ has joined ##openfpga
<rqou>
btw the djb-crypto ed25519 specifically only defines deterministic signatures
<rqou>
for reduced sony-like footgun potential
<whitequark>
yeah, djb is awesome
<rqou>
in ed25519 r is computed as sha512(msg) dot G
<lain>
I forget who gave the talk, but someone gave a talk on crypto and was mentioning how non-obvious a lot of crypto is and they're like "... so you can say 'oh well so-and-so developer is an idiot for using the crypto lib wrong', but why did we write the lib to encourage abuse in the first place? why does it even allow such stupid things? we [cryptographers] have to take some responsibility, too..."
<rqou>
rather than random dot G
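The deterministic nonce can be sketched without any curve code. As far as RFC 8032 describes it, the hash input is not the message alone: it is prefixed with a secret value taken from the second half of the hashed private seed, so the shorthand above leaves that part out. A minimal sketch of just the scalar derivation; `ed25519_nonce` is a hypothetical helper and the multiplication by G is omitted.

```python
import hashlib

# Order of the Ed25519 base point subgroup.
L = 2**252 + 27742317777372353535851937790883648493


def ed25519_nonce(seed, message):
    """Derive the per-signature scalar r from the private seed and the message."""
    prefix = hashlib.sha512(seed).digest()[32:]  # second half of the hashed seed
    digest = hashlib.sha512(prefix + message).digest()
    return int.from_bytes(digest, "little") % L
```

The same seed and message always give the same r, and different messages give unrelated values, so there is no random number generator left to get wrong.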
<lain>
it might've been djb actually, now that I think about it
<rqou>
not the imperialviolet guy?
<lain>
hmm I dunno, it's been too long
<rqou>
aead ciphers are also great for reduced footgun potential
<lain>
yes
<rqou>
*cough* *cough* nintendo again
<whitequark>
lain: that has been djb definitely
<rqou>
iirc earlier nintendo savefiles were encrypted with aes-ctr
<whitequark>
he says that all the time
<whitequark>
rqou: .. did they start with 0 every time?
<rqou>
but somehow their aes-ctr didn't rotate keys/ivs and somehow repeated really quickly
<whitequark>
and did the first block always come out the same?
<whitequark>
oh
<rqou>
so it always came out the same
<lain>
XD
<rqou>
but it varied per cartridge iirc
<lain>
I have to wonder
<rqou>
so when people wanted to modify their pokemon
<rqou>
what they did was
<rqou>
in the pokemon storage boxes empty slots are stored as zeros
<rqou>
the pokemon storage boxes take up a good amount of the room in the save file
<rqou>
so you just save multiple times while rearranging your pokemon
<lain>
hahaha
<rqou>
and now you get a trivial known-plaintext attack
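The known-plaintext attack follows from how any stream mode works: ciphertext is plaintext XOR keystream. A minimal sketch, assuming the keystream is identical for every save on a cartridge; the `keystream` function is a hash-based stand-in and NOT AES-CTR, and the save contents are invented for illustration.

```python
import hashlib


def keystream(key, length):
    """Stand-in keystream generator (not AES-CTR); any repeated keystream behaves the same."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


key = b"per-cartridge secret"   # unknown to the attacker
empty_boxes = bytes(64)         # save region full of empty (zero) slots
full_boxes = b"PIKACHU!" * 8    # same region after moving pokemon in

ct_empty = xor(empty_boxes, keystream(key, 64))
ct_full = xor(full_boxes, keystream(key, 64))

# Encrypting zeros leaks the keystream, which then decrypts every other save.
leaked = ct_empty
recovered = xor(ct_full, leaked)
forged = xor(b"MEWTWO!!" * 8, leaked)  # attacker-chosen contents, re-encrypted
```

The attacker never learns `key`; the leaked keystream alone is enough to read and rewrite the region, which is the footgun an AEAD mode with unique nonces is meant to remove.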
<rqou>
anyways, i've for some reason heard that djb is hard to work with
<cr1901_modern>
He
<rqou>
but his papers seem very clear and understandable to me
<cr1901_modern>
's an Appelbaum apologist
<whitequark>
pffff
<whitequark>
that's not why he's hard to work with
<rqou>
as in, i can read his papers and feel like i could actually code an implementation correctly
<whitequark>
he's hard to work with because he's really arrogant, not without reason
<whitequark>
as is common with people who are really good at something
Bike has quit [Quit: delf]
<cr1901_modern>
Well yea, that's true as well. I wish I was that good at something that I could get away with being arrogant lol.
<rqou>
i've heard djb's code is full of UB
<rqou>
and that he doesn't care
<lain>
UB?
<whitequark>
rqou: he does care a lot
<cr1901_modern>
That is correct
<rqou>
undefined behavior
<lain>
ah
<whitequark>
rqou: but in a slightly different way than most
<whitequark>
what he wants is to fix C
<whitequark>
instead of fixing his code
<lain>
I have this problem.
<rqou>
but what if your C is going to go into a pic10? :P
<rqou>
or an 8051?
<rqou>
or a 6502?
<cr1901_modern>
non 32-bit systems don't exist in djb's world
<whitequark>
rqou: actually I think most of the exploited UB won't help you
<whitequark>
shit like signed overflow being undefined
<rqou>
oh that one is dumb
<whitequark>
they're all made to simplify passes like scalar evolution and permit autovectorization
<whitequark>
or rather all used
<whitequark>
moreover
<whitequark>
when you're compiling for a 8051 you can simply always do whole program optimization
<rqou>
i hear a lot of complaints about strict aliasing rules as well?
<whitequark>
this lets you just totally rule out whole classes of assumptions needed to generate fast code with separate compilation
<whitequark>
yes
<whitequark>
strict aliasing is Bad
<whitequark>
... but you need aliasing analysis to do shit like autovectorization and LICM
<whitequark>
I have this problem where I want LLVM's aliasing analysis to be *more* aggressive
<whitequark>
because I am in fact hitting a performance bottleneck that could be eliminated by that (but it's not in C)
<rqou>
i can see strict aliasing helping optimize for bankswitching architectures too
<whitequark>
ehhh
<whitequark>
disagree
<whitequark>
it doesn't matter when accessing static data
<rqou>
oh right, you can always cast to char *
<cr1901_modern>
His rant on C on Google groups was filled w/ "only my technical problems matter" attitude, of course mocking 8-bit micros and other archs that aren't RISCy or x86
<whitequark>
no, char is special-cased for strict aliasing
<whitequark>
in fact it's called "omnipotent char" in LLVM sources
<rqou>
yeah, that's what i mean
<whitequark>
8-bit micros don't matter
<whitequark>
face it
<rqou>
why not? chinese 6502/8051 clones are everywhere
<whitequark>
rqou: last time I've heard (five years ago) 8051 was in serious decline
<whitequark>
from someone who IIRC does RE of the sort
<whitequark>
not sure what's the case today
<whitequark>
but in any case we're talking about crypto
<rqou>
8051 smartcards?
<whitequark>
you're not doing crypto on a 8051 core used to make fixing silicon bugs cheaper
<whitequark>
doubt it
<cr1901_modern>
Most HLLs are a bad fit for 8-bit micros. They just can't do the multiplies fast enough
<cr1901_modern>
either*
<rqou>
8051 smartcards definitely exist, but they have hardware crypto cores
<whitequark>
I know they exist
<rqou>
afaik nxp's line is 8051
<rqou>
atmel's line is avr
<whitequark>
I doubt they exist in such an amount as to matter
<rqou>
um... sim cards?
<whitequark>
really? those still use 8-bit micros?
<whitequark>
with the amount of shit they do?
<whitequark>
I am really doubting this conclusion here, enough to get a few sim cards and decap
<rqou>
possibly? some of them are javacards
<whitequark>
afaik most are javacards
<whitequark>
(well, I was looking into that for RU)
<rqou>
the java stuff often runs on a crappy 8 bit uC
<rqou>
that has crypto cores for all real operations
<whitequark>
sure
<whitequark>
you aren't targeting it with C
<whitequark>
which is cr1901_modern's complaint
<rqou>
i guess i would phrase it that the number of unique programs is pretty small
<rqou>
even if the shipping volume is large
<whitequark>
none of those vendor compilers are C standard compliant anyway
<cr1901_modern>
My complaint isn't related to 8-bit micros doing crypto tbh, so I dropped the topic.
<whitequark>
so it's somewhat of a moot point
<whitequark>
of what the C standard says
<cr1901_modern>
I don't think C is appropriate for them anyway
<rqou>
lol
<rqou>
iirc parallax propeller has a gcc port
<rqou>
i wonder how crap that port is?
<rqou>
although propeller isn't an 8-bit arch
<rqou>
it's actually a super slow 32-bit arch
<rqou>
also, avr uses gcc as its compiler
<cr1901_modern>
whitequark: I don't think an 8-bit micro could do multiplies fast enough for something like HTTPS anyway. You'd get timeouts every time. But I've never tested it.
<rqou>
although imho avr is basically the only sane 8-bit arch
<rqou>
it looks like a normal risc, just with 8-bit registers
<rqou>
no bullshit with accumulators, zero pages, SFRs, etc, etc
<cr1901_modern>
I prob wouldn't enjoy writing AVR asm that much
<whitequark>
cr1901_modern: run a 8051 on a 600MHz CPU
<whitequark>
8051 code*
<whitequark>
problem solved
<whitequark>
anyway
<rqou>
avr actually has 16/32 registers
<rqou>
not something like 3
<whitequark>
you absolutely can do software-defined crypto on AVR if you have enough RAM
<whitequark>
the timeouts are very generous, 30-60s
<whitequark>
the problem is primarily memory, you can easily need tens of K of RAM
<rqou>
also, have you timed smartcards recently? they're sloooow
<rqou>
:P
<cr1901_modern>
That's... interesting.
<rqou>
easily a couple hundred ms just for one rsa op
<whitequark>
^
<whitequark>
I mean you can definitely spend a few seconds on RSA or DHE
<whitequark>
and then hundreds of milliseconds on AES
<whitequark>
not sure what the point is
<whitequark>
but you can
<whitequark>
you'll need an ATxmega though, with external SRAM intf
<rqou>
you sure you can't fit ed25519+chacha20+poly1305 on a normal atmega?
<cr1901_modern>
My serious answer to "what's the point" is "There's something invigorating about minimalism!"
<whitequark>
rqou: that stack can perhaps fit, yes
<whitequark>
but thats not what https normally uses
<whitequark>
using djb's crypto will be significantly easier
<rqou>
chacha20+poly1305 is allowed now
<rqou>
ed25519 still isn't yet
<whitequark>
actually nevermind, I know people who did that in fact, for verifying fw updates
<whitequark>
so you can definitely fit it in
* whitequark
shrugs
<rqou>
i don't know how slow secp256r1 is
<cr1901_modern>
Someone on hackaday recently made an 8-bit RISCy CPU with 32-bit addr space. I recall the idea was to minimize FPGA resource (less routing)
<rqou>
i especially don't know how to implement that safely
<whitequark>
brb (tell azonenberg I'm in MTR if he asks)
<rqou>
cr1901_modern: why? just use sb0's navre
<cr1901_modern>
Well, I keep forgetting about navre lol
<rqou>
only problem is that navre is gplv3
<rqou>
i should poke sb0 about that
<cr1901_modern>
That's not a problem for me
<rqou>
ianal but apparently gplv3 makes it impossible to use with proprietary ip cores at all
<whitequark>
I don't think you're going to expect concessions on that front
<rqou>
even things like the sdram controller
<whitequark>
mhm
<whitequark>
why don't you use the one from misoc?
<whitequark>
:p
<rqou>
a) haven't actually gotten to this point yet
<cr1901_modern>
B/c no navre port, and I somehow doubt sb0 would accept it anyway
<rqou>
b) it's iirc not as fast
<cr1901_modern>
he loathes 8-bit/retro stuff more than most ppl I know
<rqou>
REing the phaser blocks is somewhere on my TODO
<rqou>
whitequark: afaik gplv3 means that you can't use it even with things like foundry io pad cells
<rqou>
so afaik it basically can't go in an asic at all
<rqou>
unless your asic uses only open cells
<cr1901_modern>
rqou: I mean, you could ping sb0 to discuss it. It's not like he'll bite your head off. I think...
<rqou>
i'll get to it at some point
azonenberg_hk has joined ##openfpga
* cr1901_modern
takes off his rose-tinted 8-bit nostalgia glasses for a minute...
<cr1901_modern>
32-bit archs tend to be only HLL targets and I find reading assembly for them unpleasant. This (mostly) isn't true for the 8-bit archs that are still made today (legacy or new). I think portability between CPU archs is amazing and something to strive for. Just wish archs made today were more pleasant to work with.
<rqou>
i've done arm and x86 asm
<rqou>
it was ok
<rqou>
you're probably not better than the compiler though
<rqou>
also, AVR is an 8-bit arch that was designed to be a HLL target
<cr1901_modern>
you're probably not better than the compiler though <-- In the end, that's what I'm afraid of...
<rqou>
you can occasionally beat the compiler for avr
<rqou>
because sometimes the compiler is dumb
<azonenberg_hk>
Lol
<azonenberg_hk>
I outperform FPGA tools quite often :p
<rqou>
"mul" tends to produce much dumb on AVR
<cr1901_modern>
It's not like Verilog where it's simple to map in your head the equivalent logic gate idiom
<rqou>
because mul outputs to r0 and r1
<azonenberg_hk>
Like when ISE tries to place one block ram way off in a corner from everything else
<rqou>
but the c abi uses r0 as a zero register
<azonenberg_hk>
rqou: wtf
<azonenberg_hk>
why did they do that
<rqou>
so every time gcc emits a mul it then does a big register shuffle
<rqou>
even if there are muls to follow
<rqou>
azonenberg_hk: afaik it's because of limitations on what opcodes can take immediates
<azonenberg_hk>
rqou: zero registers are fine
<rqou>
a number of risc-y archs do that too
<azonenberg_hk>
but why have the mul instruction output to the zero reg
<rqou>
oh that, idk
<azonenberg_hk>
why not r1 and r2 or something
<rqou>
maybe an oops?
<cr1901_modern>
Also, does anyone else feel like besides ARM and x86, the only difference between CPU archs is the opcode encodings?
<azonenberg_hk>
i mean on mips, the zero reg isnt even possible to write to
<cr1901_modern>
There's a RISC-V to LM32 encoder!
<azonenberg_hk>
its hard wired to read 0
<rqou>
lol i actually failed one autograder check because of that
<azonenberg_hk>
rqou: meanwhile on mips1 with -fno-delayed-branch
<rqou>
:P
<azonenberg_hk>
gcc for a long time generated incorrect machine instructions
<rqou>
in my computer architecture class we had to build a mips-like cpu with zero register
<azonenberg_hk>
i.e. still using delay slots
<azonenberg_hk>
when doing a div
<rqou>
but i did it by making the reg file have a normal register for zero
<rqou>
and having logic that inhibited updating it if the register was zero
<rqou>
so i passed all the "overall" tests but there was one particular test that tested the regfile in isolation
<azonenberg_hk>
oh, lol thats not a problem
<rqou>
and it said that "hey, your regfile allows writing to zero"
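A rough behavioral model of the two approaches being contrasted here, assuming a 32-entry register file. The class names are made up for illustration. `Datapath` models the "plain register plus inhibit logic outside the regfile" design, and `HardwiredZeroRegFile` models a regfile that enforces the zero register itself.

```python
class RegFile:
    """32 plain registers; index 0 is an ordinary storage cell here."""

    def __init__(self):
        self.regs = [0] * 32

    def write(self, idx: int, value: int) -> None:
        self.regs[idx] = value & 0xFFFFFFFF

    def read(self, idx: int) -> int:
        return self.regs[idx]


class HardwiredZeroRegFile(RegFile):
    """Regfile that drops writes to register 0 on its own."""

    def write(self, idx: int, value: int) -> None:
        if idx != 0:
            super().write(idx, value)


class Datapath:
    """Gates the write enable outside the regfile (hypothetical wrapper)."""

    def __init__(self, regfile: RegFile):
        self.rf = regfile

    def writeback(self, rd: int, value: int) -> None:
        # Inhibit the update when the destination is register 0.
        if rd != 0:
            self.rf.write(rd, value)
```

Driven through `Datapath.writeback`, a plain `RegFile` never sees a write to index 0, so whole-CPU tests pass. A test that calls `RegFile.write(0, ...)` directly can still store a nonzero value, which is the kind of failure an isolated regfile check would flag.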
<azonenberg_hk>
i was going to guess a pipeline forwarding bug
<rqou>
no, it actually works correctly overall
<azonenberg_hk>
i.e. you can write to r0 then read from it and get a nonzero value the next clock
<azonenberg_hk>
but the actual writeback is nulled out
<rqou>
i didn't want to argue with the ta about one point
<azonenberg_hk>
Lol
<rqou>
this particular one didn't have a pipeline
<rqou>
so no forwarding
<azonenberg_hk>
I had a general rule of not taking grades less than i deserved
<azonenberg_hk>
if i screwed up, sure
<azonenberg_hk>
but if the grading was wrong, i'd contest it
<azonenberg_hk>
And i pretty much always won because i never contested a grade unless i had proof of an error
<azonenberg_hk>
like single-stepping the x86 assembly in question in gdb
<azonenberg_hk>
to prove to the ta that it did what i said it did even if he didn't understand the implementation :p
<rqou>
anyways, apparently the playstation is the king of mips footguns
<azonenberg_hk>
oh?
<cr1901_modern>
haha
* cr1901_modern
knows where this is going
<cr1901_modern>
it's beautiful
<rqou>
the ps1 has a "gte" (geometry engine thingy) as a coprocessor
<rqou>
it has its own pipeline
<rqou>
when you access it when it's not done the mips stalls
<rqou>
when you put such a stall in a delay slot
<rqou>
it blows up
<azonenberg_hk>
lol
<lain>
>delay slots
<rqou>
idk if it's even been RE'd what happens here
<rqou>
you can also put a branch in a branch delay slot
<azonenberg_hk>
Delay slots are a hack that works great for a 5-stage single-thread pipeline
<azonenberg_hk>
And basically nothing else
<rqou>
sh4? :P
<lain>
I'm no expert, but any time I hear "mips" and "delay slot" it's to talk about all the problems it causes
<lain>
I don't think I've ever heard anyone say those things together in a positive or even neutral way
<azonenberg_hk>
hyperthreading and superscalar etc basically die horribly around delay slots
<rqou>
sh4 is an in-order superscalar arch with delay slots
<azonenberg_hk>
you can either not support them and rely on the code being -fno-delayed-branch
<azonenberg_hk>
Or you can hack around it
<lain>
I had a handheld computer that ran sh4, like 14 years ago
<rqou>
sh4 didn't have to hack too much
<azonenberg_hk>
But in both cases it's a lot more work than it needs to be
<azonenberg_hk>
with mips, at least
<azonenberg_hk>
superscalar and hyperthreading basically means emulating delay slots
<rqou>
it was only dual issue so the thing in the delay slot just got shoved in the second pipeline
<azonenberg_hk>
even though the pipeline has no reason to have them
<azonenberg_hk>
And what if the second slot conflicts with the first instruction?
<azonenberg_hk>
uses the result or such
doomlord has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<rqou>
but the first instruction is a branch
<rqou>
which doesn't produce a result
<azonenberg_hk>
Hmm, true so i guess it cant
<rqou>
also, the sh-arch apparently saw the mips footguns
<rqou>
anything that affects the pc is explicitly forbidden in a delay slot
<azonenberg_hk>
anyway my experience is, superscalar and HT on mips is more trouble than it's worth
<rqou>
and will throw an exception
<cr1901_modern>
MIPS classic pipeline is an elegant design (minus delay slots) with an assembly language that's deeply unpleasant to write or read
<azonenberg_hk>
As a result i plan to go risc-v for my next cpu
<rqou>
sh2 is a more pragmatic pipeline
<azonenberg_hk>
since that's a very similar arch designed for efficient implementation
<rqou>
it's variable 3/5 stages
<azonenberg_hk>
no patents etc
<azonenberg_hk>
no delayslots
<azonenberg_hk>
should scale well to a deep superscalar pipeline
<rqou>
hmm i should release my crappy risc-v cpu
<cr1901_modern>
RISC-V is also unpleasant to read XD
<rqou>
or actually
<rqou>
i can't
<rqou>
i need to check with my project partner
<rqou>
you don't want my risc-v
<rqou>
it's a crappy 3 stage pipeline with tons and tons of hacks and duct tape applied
<azonenberg_hk>
Lol
<azonenberg_hk>
also grr i'm supposed to have met whitequark here half an hour ago but he hasnt shown up
<azonenberg_hk>
And doesnt have his phone so if he's waiting for me in the wrong place i have no way to reach him :p
<rqou>
he said he's on the MTR right now
<azonenberg_hk>
oh, ok
<azonenberg_hk>
So maybe just delayed in the station or something
<rqou>
you're aware of tomasulo's algorithm right?
<cr1901_modern>
rygorous makes an interesting point; there really aren't any good examples of a superscalar OOE pipeline that students can study, and that's simple enough to understand all the control signals required to make it work.
<cr1901_modern>
rqou: Yes, I never understood it. Too stupid.
<azonenberg_hk>
i dont think that "simple enough for easy study" and "out of order" are possible :p
<rqou>
hmm you had to specify OOE
<cr1901_modern>
azonenberg_hk: That's a problem IMHO
<azonenberg_hk>
you have to work your way up
<rqou>
otherwise i would say "azonenberg_hk's barrel processor"
<azonenberg_hk>
rqou: Do you consider that easy to understand?
<rqou>
inb4 "barrel processors are cheating"
<cr1901_modern>
that means it's inaccessible.
<rqou>
i actually haven't looked at it :P
<azonenberg_hk>
I mean it's smallish, about 9k lines of verilog
<azonenberg_hk>
not counting supporting ip cores like fifos etc
<azonenberg_hk>
that i used
<azonenberg_hk>
But, it's a nontrivial amount of code
<rqou>
btw at one point the parallax propeller 2 vaporware turned into a barrel processor
<cr1901_modern>
azonenberg_hk: The problem with working your way up is that OOE essentially states "remember all that nice MIPS stuff you learned? None of it applies! You have to unlearn it all"
<azonenberg_hk>
i think if i did it risc-v it'd be smaller since there's a bit less muxing involved
<azonenberg_hk>
Correct
<azonenberg_hk>
My next CPU will be slightly more advanced
<rqou>
>ELFLoader
<cr1901_modern>
azonenberg_hk: "You have to unlearn it all" Huh... it's just like Quantum Mechanics then
<cr1901_modern>
azonenberg_hk: So yes, I think it's a bit of a problem that there's a huge semantic gap between classic pipeline and superscalar OOE
<rqou>
oh it doesn't process relocations
<rqou>
you can't footgun yourself like sony did
<cr1901_modern>
If you can't make it simple, then just make it possible for a single person to grok it.
azonenberg_hk has quit [Ping timeout: 260 seconds]
azonenberg_hk has joined ##openfpga
<azonenberg_hk>
Back
<azonenberg_hk>
(19:20:20) rqou: >ELFLoader
<azonenberg_hk>
(19:20:44) azonenberg_hk: Still 2-way in-order superscalar, but better utilization (i.e. less conflicts where two insns cant issue due to functional unit dependencies)
<azonenberg_hk>
(19:20:50) azonenberg_hk: and full on hyperthreading vs barrel
<azonenberg_hk>
(19:20:58) azonenberg_hk: so you won't lose single-thread performance
<azonenberg_hk>
i didnt even microcode it, it's probably bigger than it needs to be
firebird_ has quit [Ping timeout: 258 seconds]
<rqou>
oh, your elf loader doesn't process relocations
<rqou>
so it's probably fine
<azonenberg_hk>
Yeah i dont do ASLR
<rqou>
sony managed to footgun themselves with this once
<azonenberg_hk>
so there was no point
<azonenberg_hk>
that's on the "would be nice" list
<azonenberg_hk>
but not a priority
<rqou>
the ps2 IOP IRX kernel does process ELF relocations
<rqou>
but it does them wrong
<rqou>
e.g. if you have a lui/ori pair, the IRX kernel needs you to have a pair of relocs
<rqou>
you can't share a lui
<rqou>
this has been locking them into gcc 3
<cr1901_modern>
rqou: I forgot to mention. I'm not aware if Tomasulo's Algorithm is still used. And as I recall, H&P doesn't go into detail about how to implement renaming
<cr1901_modern>
like it does w/ MIPS pipelining
<azonenberg_hk>
cr1901_modern: yeah this is the kind of stuff i havent looked into
<azonenberg_hk>
Superscalar is as far as i've gone so far
<azonenberg_hk>
and i do not do renaming right now
<azonenberg_hk>
if i did, i could probably get higher IPC while remaining in-order just by removing false dependencies
<rqou>
btw if anyone has infinite free time and actually likes retro-ish homebrew
<rqou>
would it be easier to re-bootstrap sony's ancient gcc or to make llvm support ps2?
<rqou>
there was an official sony ps2 linux sdk way back in the day
<cr1901_modern>
They did something similar with the virtual boy compiler
<rqou>
it used gcc3, and homebrew today still uses that
<azonenberg_hk>
rqou: there was one for the ps3 too
<rqou>
isn't spe upstream?
<cr1901_modern>
actually, that's an interesting CPU. It has bit string instructions.
<azonenberg_hk>
well the modern ps3s are locked down and dont support running other OSes anymore
<azonenberg_hk>
(thanks geohot :P)
<rqou>
but we have the ecdsa keys
<rqou>
i thought you can somehow pwn it using that?
<azonenberg_hk>
maybe?
<rqou>
or did sony rotate them
<azonenberg_hk>
i havent looked into it
<azonenberg_hk>
what i do know is, there is no longer an official linux sdk
<azonenberg_hk>
:p
<rqou>
but i thought the spe compiler made it upstream?
<azonenberg_hk>
Yes but for the ps3 in general
<rqou>
as well as the kernel port?
<azonenberg_hk>
not just the cell cpu
<rqou>
but you can probably just take the ppc64 kernel, sprinkle some "device tree magic" and get it to work, right?
<rqou>
possibly un-bitrotting some drivers?
<cr1901_modern>
azonenberg_hk: In your saratoga CPU, what does RS/RT mean in the context of the instruction decoder?
<azonenberg_hk>
cr1901_modern: those are the two register inputs for the instruction
<azonenberg_hk>
the general format of an operation is op rd, rs, rt
<azonenberg_hk>
or op rs, rt, immediate
<azonenberg_hk>
in which case rs is both input and output
<rqou>
i've traditionally seen it as op Rd, Rs, Rt (with capitals) :P
<cr1901_modern>
Ahh, and b/c there's two units, you can use both of them to process a single instruction specially
<cr1901_modern>
like syscall
<azonenberg_hk>
Correct
<cr1901_modern>
that's cool
<azonenberg_hk>
the actual execution of syscall is handled by unit 0
<rqou>
wait does saratoga have the weird kseg stuff?
<azonenberg_hk>
So when issuing that instruction i just add a wait state to unit 1
<azonenberg_hk>
and steal the register file ports
<azonenberg_hk>
rqou: no
<azonenberg_hk>
saratoga is not, and does not claim to be, mips
<azonenberg_hk>
it is compatible with appropriately configured mips-elf-gcc
<rqou>
ah, patents?
<azonenberg_hk>
No, research :p
<rqou>
or trademarks?
<azonenberg_hk>
Compiler compatibility, not ISA compatibility, was the design goal
<azonenberg_hk>
so for example no delay slots
<azonenberg_hk>
hardware message passing
<azonenberg_hk>
no interrupts
<azonenberg_hk>
I used the subset of the mips isa generated by gcc for running userland alu stuff
<rqou>
i see
<rqou>
the "easy" part :P
<azonenberg_hk>
Yeah
<azonenberg_hk>
Then all of the rest was custom designed
<azonenberg_hk>
the mmu is full custom (not very great, but it fit my API and could be replaced with a drop-in nicr version and not lose software compatibility)
<azonenberg_hk>
nice*
<rqou>
not like the folks over in #j-core where i had to poke them about weird edge cases regarding delay slots
<rqou>
they actually aim to be isa compatible
<azonenberg_hk>
Yeah
<rqou>
although as far as anyone can tell you don't necessarily want that
<azonenberg_hk>
I just didnt want to have to write a gcc back end
<rqou>
e.g. if you fault in a delay slot it saves the address of the branch target
<rqou>
so recovery is impossible
<cr1901_modern>
azonenberg_hk: Is it software or hardware page table walk? Not that I could design an MMU b/c I've never written a page fault handler
<rqou>
my version doesn't do that, but I haven't tested what breaks
<azonenberg_hk>
cr1901_modern: neither
<rqou>
mips was traditionally always software
<rqou>
wait neither?
<azonenberg_hk>
The page table is a 1:1 mapping of TLB entries to virtual addresses in a ~2MB per thread segment
<azonenberg_hk>
in the current prototype
<azonenberg_hk>
That's all the virtual address space you get
<rqou>
oh i see
<azonenberg_hk>
just a map of vaddr to phyaddr
<rqou>
afaik allwinner has an iommu that looks like that
<azonenberg_hk>
But here's the thing
<azonenberg_hk>
you the user never see the mmui
<azonenberg_hk>
mmu*
<azonenberg_hk>
or page table
talsit has left ##openfpga [##openfpga]
<azonenberg_hk>
you send OOB_OP_MMAP to the CPU management interface
<azonenberg_hk>
and it manipulates the data for you
<azonenberg_hk>
So you could replace this with a fancy multilevel page table
<azonenberg_hk>
and remain binary compatible
<azonenberg_hk>
the API just takes in a (vaddr, phyaddr, len, permissions) tuple
<azonenberg_hk>
and creates a mapping or returns an error
<cr1901_modern>
(6:37:11 AM) azonenberg_hk: The page table is a 1:1 mapping of TLB entries to virtual addresses in a ~2MB per thread segment I... don't understand this. Not everything's gonna *be* in the TLB?
<azonenberg_hk>
Yes it is
<rqou>
unless you decrease the page granularity?
<azonenberg_hk>
The page table is essentially an on-chip array
<rqou>
btw this is a fun "porting to windows" problem
<azonenberg_hk>
array[0] = phyaddr and perms of the first page in your vaddr space
<azonenberg_hk>
array[1] = phyaddr and perms of second page
<azonenberg_hk>
etc
<rqou>
windows has 64k page allocation granularity
<rqou>
linux/osx have 4k
<azonenberg_hk>
if you need more than 2MB of memory per thread you're out of luck
<azonenberg_hk>
Prototype woo :p
<cr1901_modern>
How big is the array?
<rqou>
according to raymond chen this is because of the alpha axp :P
<azonenberg_hk>
rqou: lol
* cr1901_modern
still is having trouble visualizing this
<azonenberg_hk>
cr1901_modern: The MMU is basically just an array lookup
<rqou>
apparently the alpha axp instead of having lui/ori had a load high and an add immediate
<rqou>
but the immediate was signed
<cr1901_modern>
The MSBs of the address will index into the array, right?
<rqou>
although in general the alpha axp is just made of footguns
* cr1901_modern
doesn't see the problem here
<azonenberg_hk>
cr1901_modern: it works great as long as the table is big enough
<azonenberg_hk>
In my prototype, it's fairly small
<azonenberg_hk>
So your max available vaddr space is small
<azonenberg_hk>
i think 256 pages or so
<azonenberg_hk>
rqou: on antikernel the default page size is 2KB
<azonenberg_hk>
because thats a small xilinx block ram
<azonenberg_hk>
next power of two up from an ethernet frame
<azonenberg_hk>
and generally an all-around convenient size
<cr1901_modern>
So low_bits is 11 bits in width?
<cr1901_modern>
And high_bits is 10 bits in width?
<azonenberg_hk>
if my mental math is right yeah
<azonenberg_hk>
well then the lowest 2 bits are ignored since word aligned addresses
<cr1901_modern>
Ahhh, fair.
<azonenberg_hk>
And i check that the highest bits are all zeroes (or some constant value, i forget what base address i used)
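The lookup described above can be sketched as a flat array indexed by the upper address bits. This assumes 2 KB pages (11 offset bits) and 10 index bits, which gives the ~2 MB segment mentioned earlier. The class, the permission constants, and the `mmap` method modeled on the (vaddr, phyaddr, len, permissions) tuple are hypothetical names. Per-thread tables and the word-alignment detail are left out.

```python
PAGE_BITS = 11                 # 2 KB pages
INDEX_BITS = 10                # 1024 entries -> 2 MB of virtual space
PAGE_SIZE = 1 << PAGE_BITS
NUM_PAGES = 1 << INDEX_BITS
PERM_R, PERM_W = 1, 2          # hypothetical permission bits


class FlatMMU:
    def __init__(self):
        # table[i] holds (physical page base, perms) for virtual page i.
        self.table = [None] * NUM_PAGES

    def mmap(self, vaddr: int, phyaddr: int, length: int, perms: int) -> bool:
        """Create a mapping, or return False on a bad request."""
        if vaddr % PAGE_SIZE or phyaddr % PAGE_SIZE or length <= 0:
            return False
        npages = (length + PAGE_SIZE - 1) // PAGE_SIZE
        first = vaddr >> PAGE_BITS
        if first + npages > NUM_PAGES:
            return False
        for i in range(npages):
            self.table[first + i] = (phyaddr + i * PAGE_SIZE, perms)
        return True

    def translate(self, vaddr: int, need: int) -> int:
        # The bits above index+offset must be zero in this simplified model.
        if vaddr >> (PAGE_BITS + INDEX_BITS):
            raise MemoryError("address outside the mapped segment")
        entry = self.table[vaddr >> PAGE_BITS]
        if entry is None or (entry[1] & need) != need:
            raise MemoryError("page fault")
        return entry[0] | (vaddr & (PAGE_SIZE - 1))
```

Because callers only ever go through `mmap`, the flat array could be swapped for a multilevel table without changing the interface, which is the binary-compatibility point made above.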
<azonenberg_hk>
rqou: what time did whitequark say he was on the mtr? lol
<cr1901_modern>
And every time a thread switches the hardware will load a new array?
<azonenberg_hk>
oh wait he's here
<azonenberg_hk>
also the high bits of the address are the thread id
<cr1901_modern>
(6:44:10 AM) azonenberg_hk: And i check that the highest bits are all zeroes (or some constant value, i forget what base address i used) <--- :?
<rqou>
lol apparently 2:51 my time
<rqou>
so almost an hour ago?
<rqou>
i really should go to bed...
azonenberg_hk has quit [Ping timeout: 260 seconds]
<pie_>
apparently i got highlighted but its gone off my scroll
<pie_>
btw hi folks
<cr1901_modern>
(6:40:47 AM) rqou: you can maybe see the problem here? :P <-- what was the problem?
<rqou>
page granularity tends to be tied with executable load address granularity
<rqou>
and relocation processing becomes difficult/impossible with a signed lower half
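A small sketch of why a signed lower half is awkward, assuming 32-bit addresses split into two 16-bit halves. The function names are illustrative. With a zero-extended low half (lui/ori style) the halves are independent. With a sign-extended low half, the high half has to absorb a carry whenever bit 15 of the address is set.

```python
def split_unsigned(addr: int):
    """lui/ori style: the low half is zero-extended, halves are independent."""
    return (addr >> 16) & 0xFFFF, addr & 0xFFFF


def split_signed(addr: int):
    """load-high + signed add: the low half is sign-extended."""
    lo = addr & 0xFFFF
    if lo >= 0x8000:
        lo -= 0x10000          # interpret the low 16 bits as signed
    hi = ((addr - lo) >> 16) & 0xFFFF   # high half compensates for the sign
    return hi, lo


def join_signed(hi: int, lo: int) -> int:
    return ((hi << 16) + lo) & 0xFFFFFFFF


# Moving an image by 4K can flip the sign of the low half...
before = split_signed(0x12347000)   # (0x1234, 0x7000)
after = split_signed(0x12348000)    # (0x1235, -0x8000)
# ...while moving it by a multiple of 64K leaves the low half alone.
moved_64k = split_signed(0x12347000 + 0x10000)   # (0x1235, 0x7000)
```

In the 4K case the high half changes even though the move itself has no bits above bit 15, so a loader patching the high instruction would need to know the matching low half. With 64K-aligned load addresses only the high half ever needs fixing up.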
<cr1901_modern>
so how does a signed add cause problems?
<cr1901_modern>
Bleh, I was distracted, so I missed some things
<cr1901_modern>
rqou: You said java cards are 8-bit and have crypto cores?
<rqou>
yeah, e.g. nxp and atmel
<rqou>
also, all smart card vendors have super confusing names for everything
<cr1901_modern>
nxp is an avr clone?
<rqou>
including product lines
<rqou>
no NXP is an 8051
<cr1901_modern>
oh, bleh lol
<rqou>
anyways, it's like bad naming makes them more secure or something :P
<rqou>
atmel uses an avr in their secure uC because, well, it's atmel :P
<cr1901_modern>
FWIW, I would not use C code for 6502.
<rqou>
sure, I can mostly agree
<rqou>
how good is sdcc?
<cr1901_modern>
poor
<cr1901_modern>
I mean, it's the easiest C compiler to understand, but... :P
<cr1901_modern>
If I were truly desperate to get a 6502 to talk via HTTPS, I would look at the generated assembly for a small crypto lib like bearSSL and then just impl the parts I need
<cr1901_modern>
So I "roll my own crypto" without actually rolling my own crypto :P
<cr1901_modern>
(Or I could use a crypto core. That works too)
<cr1901_modern>
rqou: Do both 8051 and AVR provide hardware acceleration for crypto?
<rqou>
no, only the special secure/smartcard ones do
<rqou>
although iirc some atxmegas had a aes core
pie_ has quit [Ping timeout: 268 seconds]
doomlord has joined ##openfpga
<felix_>
rqou: for the phaser block re maybe have a look at the xilinx patents (iirc i linked them some time ago), the phy generated by the mig and the unisim library *cough*
<felix_>
i'll probably be motivated to have a look at that too at the congress
woddy has joined ##openfpga
<lain>
there's a lot of verbose comments in the mig output
<felix_>
yep; from the combination of the three things above it's quite possible to write a spec for those blocks. clean-room re ftw! ;)
<lain>
yep
<felix_>
for the upper layers of the memory controller, i'd do a complete own reimplementation; the xilinx stuff is so much bloat
<lain>
yeah
<lain>
partner and I will be doing that soon on our own project
<lain>
REing phaser and replacing mig
<lain>
we'll probably use mig to get things rolling, but once we're done with the core code we definitely want to replace that stupid mig
<felix_>
the mig for the vc707 ate up 4.3% of that huge fpga. that's just insane...
<lain>
D:
<felix_>
so yeah, doing most of the initialization and recalibration stuff in a softcore would probably decrease the resource utilization
<felix_>
oh, a leon3 core (no cache or mmu) only used 1.6% of the fpga resources
doomlord has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
doomlord has joined ##openfpga
doomlord has quit [Client Quit]
<azonenberg_hk>
felix, lain: I plan to write a x32 DDR3 controller on top of the PHASERs for artix7 in a couple months
<azonenberg_hk>
So if you do RE anything and write specs i'd love to see
<felix_>
if i have the time to do some re in that area, i'll definitely talk about it here ;)
<azonenberg_hk>
:)
<azonenberg_hk>
Yeah, open replacements for vendor IP (even if compiled using the vendor toolchain) is definitely on topic
<azonenberg_hk>
In other news
<azonenberg_hk>
just had a fun meeting with whitequark about our pcb hdl project
<felix_>
well, it's re of the hardware blocks, so i considered it on topic ;)
<felix_>
nice
<azonenberg_hk>
Its going to be a radical paradigm shift from any other way of designing a pcb, lol
<azonenberg_hk>
the current thinking is to do a constraint solver based approach, lol
<azonenberg_hk>
I know it sounds silly
<felix_>
doesn't sound too silly to me
<azonenberg_hk>
But basically, you define constraints like "this pin can accept input voltages up to X" or "these two pins are an I2C bus"
<felix_>
sure, it's quite an unusual approach
<azonenberg_hk>
then the solver will group pins into banks
<azonenberg_hk>
assign bank io voltages
<azonenberg_hk>
and detail-place pins in the banks
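As a toy illustration of the bank-assignment step, here is a backtracking search that places signals into banks so that every bank ends up with a single IO voltage and no bank exceeds its pin count. The function name, the signal names, and the data layout are invented for this sketch. Bus grouping and detailed pin placement are not modeled, and the project may use an entirely different solver.

```python
def assign_banks(signals, banks):
    """signals: {name: io_voltage}; banks: {bank_name: pin_capacity}.

    Returns (signal -> bank, bank -> voltage), or None if no assignment exists.
    """
    names = sorted(signals)
    placement, bank_v = {}, {}
    used = {b: 0 for b in banks}

    def solve(i):
        if i == len(names):
            return True
        sig = names[i]
        v = signals[sig]
        for b in sorted(banks):
            if used[b] >= banks[b]:
                continue                  # bank is out of pins
            if bank_v.get(b, v) != v:
                continue                  # bank already runs at another voltage
            fresh = b not in bank_v
            bank_v[b] = v
            placement[sig] = b
            used[b] += 1
            if solve(i + 1):
                return True
            # Undo this choice and try the next bank.
            used[b] -= 1
            del placement[sig]
            if fresh:
                del bank_v[b]
        return False

    return (placement, bank_v) if solve(0) else None


signals = {"i2c_scl": 3.3, "i2c_sda": 3.3, "led": 3.3,
           "ddr_dq0": 1.5, "ddr_dq1": 1.5}
banks = {"bank0": 3, "bank1": 2}
result = assign_banks(signals, banks)
```

Here the first greedy choice (DDR pins in `bank0`) leaves no room for the three 3.3 V signals, so the search backtracks and puts the DDR pins in `bank1` instead.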
<azonenberg_hk>
as you lay out the pcb and route things you can make changes to ease routability
<azonenberg_hk>
and it'll do so around the existing pcb traces
<felix_>
sounds quite awesome tbh
<azonenberg_hk>
i.e. the solver wont touch a connection if it's been routed already
<azonenberg_hk>
then it'll also support group placement of bus objects
scrts has quit [Ping timeout: 246 seconds]
<azonenberg_hk>
like a ddr3 byte group
<azonenberg_hk>
or a spi bus
<azonenberg_hk>
so you can assign them to hard ip on a mcu etc
<azonenberg_hk>
obviously you will be able to constrain a pin/bus to a specific location for mechanical/floorplan reaspons
<azonenberg_hk>
reasons*
<azonenberg_hk>
and it'll route around it
<azonenberg_hk>
DRC is going to use similar setup as well
<azonenberg_hk>
including propagation of uncertainty
<azonenberg_hk>
like if you use 5% tolerance resistors in a voltage divider for a SMPS
<azonenberg_hk>
it'll consider the worst case combo of high/low resistance values to calculate a range of output voltages
<azonenberg_hk>
then check that against acceptable supply voltages for the chips
<azonenberg_hk>
and also use it to derive input/output voltage levels
<azonenberg_hk>
and DRC check that you arent feeding 3.3V to a 2.5V input etc
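The worst-case divider check described above might look roughly like this. It assumes the common feedback-divider relation Vout = Vref * (1 + Rtop/Rbottom) for the regulator. The reference voltage, resistor values, and supply limits are made-up numbers, and reference-voltage tolerance is ignored.

```python
from itertools import product


def divider_vout(vref: float, r_top: float, r_bottom: float) -> float:
    """Regulator output set by a feedback divider (assumed topology)."""
    return vref * (1 + r_top / r_bottom)


def vout_range(vref: float, r_top: float, r_bottom: float, tol: float):
    """Evaluate every high/low corner of the two resistors."""
    corners = [
        divider_vout(vref, r_top * (1 + st * tol), r_bottom * (1 + sb * tol))
        for st, sb in product((-1, 1), repeat=2)
    ]
    return min(corners), max(corners)


def supply_ok(vrange, vmin_ok: float, vmax_ok: float) -> bool:
    lo, hi = vrange
    return vmin_ok <= lo and hi <= vmax_ok


# Hypothetical 3.3 V rail: 0.8 V reference, 31.6k over 10k.
range_5pct = vout_range(0.8, 31_600, 10_000, 0.05)
range_1pct = vout_range(0.8, 31_600, 10_000, 0.01)

# Hypothetical chip that accepts 3.3 V +/- 5% (3.135 V to 3.465 V).
fails_with_5pct = not supply_ok(range_5pct, 3.135, 3.465)
passes_with_1pct = supply_ok(range_1pct, 3.135, 3.465)
```

With these numbers the 5% parts spread the rail to roughly 3.09 V to 3.59 V, which falls outside the chip's window, while 1% parts stay inside it. The same range could then be propagated to derive IO voltage levels for downstream checks.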
* felix_
likes that idea
scrts has joined ##openfpga
<azonenberg_hk>
Basically i'm trying to take all of the things that i do when designing a schematic in e.g. kicad, in a labor intensive and error-prone way
<azonenberg_hk>
and automate them
<azonenberg_hk>
whitequark is working on the eagle back end and i'm going to do kicad, we're collaborating on the core
SpaceCoaster has quit [Ping timeout: 245 seconds]
<azonenberg_hk>
There's not much code yet, i havent written any myself in fact
<azonenberg_hk>
The focus of our initial discussion was to figure out what to build :p
firebird_ has joined ##openfpga
maaku has quit [Quit: No Ping reply in 180 seconds.]
<felix_>
i'd even prefer verilog to the usual schematic stuff and you probably know that i dislike verilog ;)
maaku has joined ##openfpga
<azonenberg_hk>
Yes
<azonenberg_hk>
I tried verilog for pcb
<azonenberg_hk>
it was cool but didnt go far enough
<azonenberg_hk>
for things like pin swapping in particular it sucked
<felix_>
the swap fpga pins in the layout to make it better routable is actually something i liked about altium
<azonenberg_hk>
Yeah
<azonenberg_hk>
Integration with kicad is going to take some experimenting
<azonenberg_hk>
we may have to write some kind of plugin
<azonenberg_hk>
b/c a kicad_pcb file does not contain routing topology information
<azonenberg_hk>
it contains the ratsnest
<azonenberg_hk>
and the geometry
<azonenberg_hk>
but does not provide sufficient information to tell you "pins A and B are connected by traces, but C and D are not so that's OK to move when you pin swap"
pie_ has joined ##openfpga
maaku has quit [Quit: No Ping reply in 180 seconds.]
maaku has joined ##openfpga
amclain has joined ##openfpga
maaku_ has joined ##openfpga
maaku has quit [Ping timeout: 260 seconds]
Bike has joined ##openfpga
maaku_ has quit [Quit: No Ping reply in 180 seconds.]
maaku has joined ##openfpga
pie_ has quit [Changing host]
pie_ has joined ##openfpga
digshadow has quit [Quit: Leaving.]
mifune has joined ##openfpga
mifune has quit [Changing host]
mifune has joined ##openfpga
digshadow has joined ##openfpga
m_w has joined ##openfpga
cr1901_modern1 has joined ##openfpga
cr1901_modern has quit [Ping timeout: 250 seconds]