<GitHub195>
[artiq] whitequark commented on issue #877: If there is no response to ARP requests it means the core device has crashed. Please acquire a backtrace from the serial console next time this happens. https://github.com/m-labs/artiq/issues/877#issuecomment-354030512
<whitequark>
for some reason the core device cannot receive 1514 octet packets correctly
<sb0>
is it like that crazy ISP/Cloudflare (IIRC) bug you mentioned a while ago?
<whitequark>
sb0: WTF
<whitequark>
no
<whitequark>
windows is sending packets with invalid TCP checksum
<whitequark>
smoltcp is correctly rejecting them
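For context: the check smoltcp is applying here is the standard ones'-complement Internet checksum, computed over the IPv4 pseudo-header plus the TCP segment. A minimal Python sketch of that verification (not smoltcp's actual code, just the textbook algorithm) looks like this:

    def ones_complement_sum(data: bytes) -> int:
        """Sum 16-bit big-endian words with end-around carry."""
        if len(data) % 2:
            data += b"\x00"                           # pad odd length with a zero octet
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xffff) + (total >> 16)  # fold the carry back in
        return total

    def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
        """src_ip/dst_ip: 4-byte IPv4 addresses; segment: TCP header+payload
        exactly as captured, with the checksum field left in place."""
        pseudo = src_ip + dst_ip + bytes([0, 6]) + len(segment).to_bytes(2, "big")
        # a correctly checksummed segment makes the sum fold to 0xffff
        return ones_complement_sum(pseudo + segment) == 0xffff

A stack that sees this return False, as with the Windows-originated packets discussed here, simply drops the segment.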
<whitequark>
I misunderstood about the 1514 octets; it was just the test setup I used to replay the traces that had too small an MTU
<whitequark>
i... i'm confused at what to do
<whitequark>
is this a windows bug or what
<whitequark>
ohwait I have an idea
<GitHub78>
[artiq] sbourdeauducq commented on issue #878: This issue is about the compiler crashing; string formatting is not supported, and considering other projects that are funded and late (Sayma, DRTIO, etc.), we should not implement it for now. @klickverbot if you want it, please open a new issue. https://github.com/m-labs/artiq/issues/878#issuecomment-354047046
<davidc__>
bah, can't type. If you're capturing on the windows box, I recall something about eth cards with TCP offload screwing up the wireshark capture
<whitequark>
no, I'm capturing on the Linux hypervisor
<whitequark>
I know about TCP offload issues...
<davidc__>
sb0: BTW, I have a few hot cathode gauges (of the ZJ-28 type) to play with. You mentioned some ionpak board issues - anything I should fix/change before I send it off to fab?
<whitequark>
sb0: any idea if `ifconfig mtu` refers to the maximum *ethernet* frame size or maximum *ip* datagram size?
<whitequark>
windows is sending 1500 octet ip datagrams, wonder if that's an issue here
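For the record, on Linux the interface MTU set with `ifconfig`/`ip link` is the maximum IP datagram size; the Ethernet header is not counted, so a 1500-octet MTU yields the 1514-octet frames mentioned above:

    ETH_HEADER = 14          # dst MAC + src MAC + EtherType
    MTU = 1500               # maximum IP datagram size on the link
    assert MTU + ETH_HEADER == 1514   # frame length as seen in a pcap (FCS excluded)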
<sb0>
davidc__, yes, there is an errata list in the repository
<sb0>
davidc__, also I'll probably want to simplify the emission settings, i.e. have only 2 ranges and no fancy low-leakage commutation with diodes
<sb0>
davidc__, that being said, the current board layout can be reworked without major troubles - there are a lot of errata, but nothing difficult to execute
<davidc__>
sb0: Ok! I'll see what I can do with kicad. I usually use Altium, but I've been meaning to play more with kicad
<sb0>
davidc__, sending the gerber with the errors + manual rework is probably easier than dealing with the kicad UI
<davidc__>
sb0: hah, but that doesn
<davidc__>
er, doesn't result in a useful pull request for the repo
<sb0>
davidc__, do you want a version with some errors fixed?
<davidc__>
sb0: sure, if you have a WIP I'm happy to start from there
<sb0>
davidc__, there is also a lot of firmware stuff to do, if you want to make useful PRs. the latter is easier for me to manage.
<davidc__>
sb0: I'll see what I can do. I'm pretty comfortable with rust on non-embedded systems, and C on embedded systems, so I'll see what I can do for rust-on-embedded :)
<sb0>
davidc__, okay, emailed you the partially fixed version
<sb0>
davidc__, that being said: getting the final PCB version with all errors fixed is probably ~1 day of work for me. I've just been swamped with other things.
<sb0>
so you may want to focus on firmware anyway
<sb0>
davidc__, This message was blocked because its content presents a potential 552-5.7.0 security issue.
<davidc__>
oh thats excellent. I'll give you another address
<GitHub119>
[artiq] whitequark commented on issue #865: @cjbe This problem is hard to conclusively assign to any cause without seeing a core log at TRACE resolution. Please acquire such a log and I'll take a look. This does not seem to be related to smoltcp, but rather it is more likely a bad case in our allocator or something like that (since smoltcp's processing time per packet has a hard upper bound). https://github.com/m
<davidc__>
sb0: (so the gauge will automatically sequence on/off depending on the state of the vacuum system)
<sb0>
davidc__, you could send that over ethernet
<sb0>
that interlock input would need optoisolation...
<davidc__>
You could, but my existing system uses simple level signalling
<sb0>
unless you have medium vacuum it's *very* easy to throw off the reading from stray currents
<sb0>
what is your system for?
<davidc__>
sb0: upgrade for my SEM
<davidc__>
anyhow, I fully expect the SEM specific mods would have no place in the main repo, but I'd be happy to contribute back anything else that's general
<sb0>
are there mods other than this interlock?
<davidc__>
sb0: oh, uh, an interface to their custom 8-bit backplane. I have reg docs for what it's expecting, and a trivial iCE40 FPGA implementation of their backplane protocol
<sb0>
but again: I'd be careful about adding features to this board in any case; you don't want the current of whatever gadget you are connecting to return through the gauge collector
<davidc__>
sb0: yeah, I'm going to isolate the whole ionpak circuit from the SEM backplane
<sb0>
if you send that over ethernet you have galvanic isolation
<davidc__>
sb0: yes, agreed. But I'm not ready yet to rip-and-replace the rest of the SEM system with custom control electronics, so until I am, I need to live with it as it is :)
<sb0>
you can even use arduino ethernet shield or something like that for your interlock
<davidc__>
oh, I see what you mean. Use something at the other end of the cable.
<sb0>
yes
rohitksingh has joined #m-labs
<sb0>
rohitksingh, "OPTION_USE_TCM_DISABLE_IBUS" is a bit long
<sb0>
why not just "FEATURE_TCM"?
<sb0>
"TCM Populate Wishbone Slave Bus" is a strange comment
<sb0>
rohitksingh, why would the CPU do a burst read of the memory? (cpu_burst_i)
<sb0>
interfce -> interface
<rohitksingh>
sb0: thanks! I'll change the parameter name!
<rohitksingh>
I assume the non-pipelined versions (espresso and pronto espresso) don't do burst reads of instruction memory
<rohitksingh>
sb0: I'll fix the comments and typos . thanks!
<sb0>
rohitksingh, I don't understand the purpose of this "burst fetch"
<sb0>
it's a BRAM with 1-cycle latency, pipelinable, one should be able to insert that directly into the pipeline, without "bursts"
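To illustrate the point, here is a minimal Migen sketch of such a TCM (purely hypothetical module and signal names, not the mor1kx code): the fetch stage presents a word address every cycle and the instruction is available on the following cycle, with no burst signalling involved.

    from migen import Module, Signal, Memory

    class SimpleTCM(Module):
        # Hypothetical tightly-coupled instruction memory: synchronous BRAM
        # read port, 1-cycle latency, a new address accepted every cycle.
        def __init__(self, depth=4096, width=32):
            self.adr   = Signal(max=depth)   # word address from the fetch stage
            self.dat_r = Signal(width)       # instruction, valid one cycle later

            mem  = Memory(width, depth)
            port = mem.get_port()            # synchronous read -> inferred BRAM
            self.specials += mem
            self.specials += port

            self.comb += [
                port.adr.eq(self.adr),
                self.dat_r.eq(port.dat_r),
            ]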
<rohitksingh>
sb0: yeah, correct. There is a signal named `cpu_burst_i` coming from the processor's fetch module, which instructs it to fetch instructions on every clock cycle (1-cycle latency)
<rohitksingh>
currently it takes 3 cycles from request to handshake to start of another request
<sb0>
why? that doesn't make sense
<rohitksingh>
with BRAM we can do 1-cycle reads
<rohitksingh>
sb0: The reason is the current code itself
<rohitksingh>
it deasserts the ack after ack'ing for 1 cycle
<rohitksingh>
there is a TODO mentioned in the code to change it
<sb0>
I don't see why you'd need a "burst mode" at all on something like a BRAM
<sb0>
the cache is also something like a BRAM
<sb0>
burst modes are when you have >1 cycle of latency
<sb0>
so either mor1kx was designed in a stupid way, or there is something we don't understand; if it's the latter, I'd like to know what it is
<rohitksingh>
sb0: ok let me check if the other two variants can indeed do 1-cycle fetch or not. Otherwise `cpu_burst_i` would be required.
<rohitksingh>
just a sec
<rohitksingh>
sb0: the other variants have 'cpu_burst_i' tied low and can do 1-cycle fetch. Let me simulate with these 2 variants and check the simulation output. It might be only cappuccino's peculiarity
<sb0>
maybe this has to do with requesting a burst on the ibus wishbone interface? though i'd assume the cache itself would request a burst on its own to replace a line
<rohitksingh>
sb0: yup, only cappuccino requires `cpu_burst_i`. With espresso, it is taking 2 cycles per instruction fetch, which I can decrease to 1 cycle if I do not deassert ack after ack'ing. In cappuccino, `cpu_burst_i` controls the `cti` signal for the wishbone interface.
<rohitksingh>
other 2 variants have `cpu_burst` tied low
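For reference, these are the Wishbone B4 cycle type identifier encodings that such a burst signal would drive (values from the spec; the constant names below are only for illustration):

    WB_CTI_CLASSIC      = 0b000   # classic single cycle
    WB_CTI_CONST_BURST  = 0b001   # constant-address burst
    WB_CTI_INCR_BURST   = 0b010   # incrementing-address burst
    WB_CTI_END_OF_BURST = 0b111   # end of burst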
<rohitksingh>
I can fix the code to 1-cycle instruction access
<sb0>
rohitksingh, okay. so cpu_burst is just for having wishbone bursts in the absence of a cache? is that correct?
<sb0>
rohitksingh, why does the rest of the CPU core need to know about bursts?
<sb0>
isn't a cache refill always 1 line?
<rohitksingh>
sb0: What is happening is that, without cache, cappuccino takes more cycles to fetch an instruction than espresso or pronto-espresso. Actually this is more of an issue of cappuccino's implementation. The rest of the CPU shouldn't need to know about bursts as you have said
<rohitksingh>
yeah cache refill is one line, which is typically done in a single burst access of a bus
<sb0>
rohitksingh, okay. but with TCM the performance should be the same as with the cache (and a 100% hit rate), correct?
<sb0>
1 instruction per cycle?
<sb0>
whitequark, are you still using the kc705?
<rohitksingh>
sb0: yeah, only if we use TCM + the Espresso/Pronto-Espresso implementation. Cappuccino without cache will 1) tie `cpu_burst` to ground, so we cannot do bursts, and 2) take an extra cycle per instruction fetch
<sb0>
rohitksingh, is cappuccino always taking 2 cycles to access each instruction? even with the original cache?
<whitequark>
sb0: take it
<sb0>
I thought "cappucino" was supposed to be the high-perfoance implementation
<sb0>
it would surprise me if the cache were slow
<rohitksingh>
sb0: No. With caches enabled, it uses bursts to fetch instructions. Without caches, it takes 2 cycles, whatever we might try
<rohitksingh>
sb0: cappuccino was intended to be used with caches, so that makes a bit of sense
<rohitksingh>
sb0: In short: cappuccino supports 1-cycle instruction fetch only when the cache is enabled, otherwise not.
<sb0>
rohitksingh, so it uses bursts internal to the cache to fetch instructions?
<sb0>
strange
<sb0>
anyway the TCM should also support that, and have a throughput of 1 instruction per cycle
<rohitksingh>
sb0: yeah we can say so
<sb0>
it's quite a strange way of doing things though
<rohitksingh>
sb0: Yeah, I can modify it to support 1 instruction per cycle, for both cases (cappuccino-with-cache and cappuccino-without-cache/espresso/pronto-espresso)
<rohitksingh>
yes it is confusing indeed
<rohitksingh>
I'll discuss and confirm my results and deductions in the #openrisc channel, and add support for 1-cycle instruction access in the TCM
<rohitksingh>
sb0 / rjo / _florent_ : ping! I have a general question
<rohitksingh>
sb0: okay, thanks! would appreciate any help with the latest question too
<sb0>
rohitksingh, you can't with this protocol (unless you introduce combinatorial feedback that will break timing)
<sb0>
you need to introduce a different protocol with pipelining, or use bursts
<sb0>
well the comb feedback will not _always_ break timing, and may be acceptable in certain cases
<sb0>
so with pipelining the initiator would begin the next transaction before it has received data from the target
<sb0>
you need to turn the ack signal into two signals, "transaction accepted" and "data coming back"
<sb0>
if the latency is fixed, though, then you don't need such control signals and pipelining is rather straightforward
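A minimal Migen sketch of the split handshake sb0 describes, under the simplifying assumption of a fixed 1-cycle target like the TCM sketched earlier (all names are made up): the address phase and the data phase are separate, so the initiator can issue the next request while the previous instruction is still coming back.

    from migen import Module, Signal

    class PipelinedFetch(Module):
        # "req/accept" is the address phase ("transaction accepted"),
        # "valid" is the data phase ("data coming back") one cycle later.
        # With a fixed-latency target, accept can simply stay tied high.
        def __init__(self, tcm, adr_width=12, insn_width=32):
            self.req    = Signal()             # initiator presents an address
            self.adr    = Signal(adr_width)
            self.accept = Signal(reset=1)      # always ready for a 1-cycle BRAM
            self.valid  = Signal()             # instruction word is valid now
            self.insn   = Signal(insn_width)

            self.comb += [
                tcm.adr.eq(self.adr),
                self.insn.eq(tcm.dat_r),
            ]
            # data returns exactly one cycle after an accepted request
            self.sync += self.valid.eq(self.req & self.accept)

With variable latency, valid would instead be driven by the target itself, which is essentially the AXI-style valid/ready scheme _florent_ mentions below.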
<rohitksingh>
sb0: thanks! this clears up many things. One more question: do we need to modify the cpu core for this different protocol with pipelining, or add it between the cpu core and the tcm?
<_florent_>
rohitksingh: for example of protocols without this limitation, you can look at AXI (valid/ready)
<sb0>
rohitksingh, I don't know, haven't dug into mor1kx. how does the cache work?
<sb0>
I believe for TCM you can just copy whatever the cache does, just strip the bus interface and the cache miss handling code
<rohitksingh>
_florent_ : sure, thanks! I'll check out AXI
<rohitksingh>
sb0: I haven't looked at cache code in depth. let me do that.
<rohitksingh>
sb0: Is it allowed to modify the core cpu code, since with the current ibus interface/protocol it is not possible to reduce latency to 2 cycles per instruction?
<sb0>
rohitksingh, really? so right now the cpu takes 2 cycles per instruction because of the cache?
<rohitksingh>
sb0: no. With cache, cappuccino uses the `burst` signal with incrementing addresses to fetch 1 instruction per cycle (from the ibus into the cache). For random address fetches, cappuccino takes 3 cycles! Without cache, cappuccino again takes a minimum of 3 cycles per instruction. Now, the other two variants (espresso and pronto espresso), without cache, both take a minimum of 2 cycles per instruction fetch due to this protocol limitation. I'll have to check for whe
<sb0>
rohitksingh, ok, I think we only use cappuccino
<sb0>
quite surprising there would be such a penalty for random addresses
<sb0>
is this working well? seems to contain a few hacks according to the comments...
<rohitksingh>
sb0: this TCM module is not available for the other variants, and I'll have to take a look at its code to understand its intended purpose, how it works, and how we can use it
<sb0>
rohitksingh, how many cycles does a jump take on mor1kx?
<rohitksingh>
sb0: don't know, I'll have to check/test.
<rohitksingh>
that is going to be implementation dependent too
<sb0>
cappuccino, we're not using the others
<sb0>
but with 3 cycles just to get the instruction, which then has to make it through the rest of the pipeline, it sounds very slow. I'm surprised by this 3-cycle number.
<rohitksingh>
sb0: mor1kx does have a branch predictor. I will check out the misprediction penalty
<rohitksingh>
sb0: I have asked on the openrisc channel for confirmation of this 3-cycle number. No reply yet, but the code and simulation agree with it.
f4bug has joined #m-labs
rohitksingh1 has joined #m-labs
rohitksingh has quit [Read error: Connection reset by peer]
f4bug1 has joined #m-labs
f4bug has quit [Read error: Connection reset by peer]