More Complex Parts of Computational Redstone, e.g. Caches / Serial Memory and Pipelining

I feel like there isn't really a guide on how to implement these things; this harder stuff is usually learnt from generations of builders, or you only find the theory of it but not the actual hardware.
I, for one, don't know much about these in terms of hardware, but hopefully this post can help many newcomers with these things. Feel free to post here if you are experienced with them.

Just ask the builders that made them; iirc Stuwu should be able to explain this stuff well, and asking on Discord should work too.

remember I'm banned from dc

Yeah, but asking in game is probably better anyway; people can actually show you instead of just explaining the build through messages.

e: good point, but I thought Discourse might be better because game chat can disappear and isn't permanent if you don't have dc, and sometimes they might not be online to tell you.

I wrote a quick guide on pipelining in Discord once. Here's a copy-paste with a few edits (this is a long one):

In its most basic form, pipelining is really straightforward:
Put some repeaters on the lines between each step, and have a repeater point into each one such that it locks. Have a clock signal that's always high and drops when the clock "ticks". That's basically pipelining.
Instead of having your clock run so slow that your entire CPU can go from fetching the instruction from ROM to having executed it and written it back, you shorten the clock a lot and just latch the result after every step. Thus your clock runs as fast as your slowest step: the smaller you can make your steps, the faster your clock can run.
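
To make the latching idea concrete, here's a minimal cycle-by-cycle sketch in Python rather than redstone (the stage names and structure are made up purely for illustration): on every clock tick each stage's latch captures the previous stage's output, so several instructions are in flight at once.

```python
# Minimal sketch of pipeline latches: each stage register copies the
# previous stage's output on every clock tick, so one instruction sits
# in each stage at any given time. (Illustrative only, not real hardware.)
STAGES = ["fetch", "decode", "execute", "writeback"]

def run_pipeline(program, cycles):
    # One latch (register) in front of each stage; None = empty (a bubble).
    latches = {stage: None for stage in STAGES}
    pc = 0
    for cycle in range(cycles):
        # Clock "tick": shift every latch to the next stage, oldest first.
        for i in reversed(range(1, len(STAGES))):
            latches[STAGES[i]] = latches[STAGES[i - 1]]
        # Fetch a new instruction into the first latch if any are left.
        latches["fetch"] = program[pc] if pc < len(program) else None
        pc += 1
        print(f"cycle {cycle}: " + ", ".join(f"{s}={latches[s]}" for s in STAGES))

run_pipeline(["LDI R1 0011", "LDI R2 0100", "ADD R3 R1 R2", "SUB R4 R3 R2"], 7)
```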

The big thing that makes it more difficult is that you run into two potential issues: because you're fetching instructions before previous instructions are finished, you risk executing certain instructions incorrectly.

For one, let's say you have this instruction sequence (3-operand; the first operand is the target register):

LDI R1 0011
LDI R2 0100
ADD R3 R1 R2
SUB R4 R3 R2

In the pipelined setup, the ADD might not have finished yet when the SUB instruction is running. This causes the R3 read in the SUB to return 0000 instead of the result of the ADD, meaning you get a discrepancy in your data. These issues with data are called "hazards", caused by a "data dependency", and this specific case is a "read after write".

The way you solve this is called "argument forwarding" (operand forwarding): in its simplest terms, you put a MUX on the inputs of your ALU. If you notice that the write address of the previous instruction matches a read address of your current instruction, you take the output of the ALU directly instead of the register data. You still need to write the data back to the register as well, but this just "short circuits" the input so you don't have to wait for the writeback.
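
To make that forwarding MUX concrete, here's a rough Python sketch (the function and names are made up for illustration, not taken from any particular build): if the previous instruction's write address matches the register we're about to read, we take the ALU result instead of the stale register-file value.

```python
# Sketch of a forwarding ("argument forwarding") MUX. If the previous
# instruction writes the register we are about to read, take the ALU
# result directly instead of the stale register-file value.
def read_operand(reg, regfile, prev_write_reg, prev_alu_result):
    if prev_write_reg is not None and reg == prev_write_reg:
        return prev_alu_result          # forwarded value, skip the writeback wait
    return regfile[reg]                 # normal register-file read

# The RAW example from above: ADD R3 R1 R2 hasn't written back yet
# when SUB R4 R3 R2 reads its operands.
regfile = {"R1": 0b0011, "R2": 0b0100, "R3": 0b0000, "R4": 0b0000}
prev_write_reg, prev_alu_result = "R3", regfile["R1"] + regfile["R2"]  # ADD result

a = read_operand("R3", regfile, prev_write_reg, prev_alu_result)  # 7, not 0
b = read_operand("R2", regfile, prev_write_reg, prev_alu_result)  # 4
print(a - b)  # SUB result: 3
```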

The other issue is branching. Your branch instruction is likely based on flags, and the flags might not be set yet at the moment it's being decided whether it's going to branch or not.

The way to solve this is "branch prediction", where you try to predict which way the branch will go. There are many ways to do branch prediction. In its simplest terms, there's static branch prediction (i.e. you say you never branch or you always branch). Another option is tracking whether the last branch was taken or not, and assuming the same result is going to happen again (this works well for long-lasting loops in programs, or lots of base cases that are skipped in a row).
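
A minimal sketch of that second option, a "same as last time" (1-bit) predictor, in Python (the table and the branch PC are made up for illustration):

```python
# Sketch of the "assume the same outcome as last time" (1-bit) predictor.
# One bit of state per branch: was it taken last time?
last_outcome = {}   # branch PC -> True (taken) / False (not taken)

def predict(branch_pc):
    # Static fallback: predict "not taken" the first time we see a branch.
    return last_outcome.get(branch_pc, False)

def update(branch_pc, actually_taken):
    last_outcome[branch_pc] = actually_taken

# A loop branch that is taken 9 times and then falls through:
mispredictions = 0
for taken in [True] * 9 + [False]:
    if predict(0x20) != taken:
        mispredictions += 1
    update(0x20, taken)
print(mispredictions)  # 2: the first iteration and the final loop exit
```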

However, at some point you will mispredict, in which case you do something called "bubbling the pipeline". This really just means "clear all the latches and disregard any kind of output you might get". The issue with this is that you cause a massive stall, as you have to wait for your entire pipeline to get up and running again. You want to avoid this as much as possible by using proper branch prediction and writing your programs such that they work best with what your branch predictor would do.
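
Continuing the earlier sketch (same made-up latch structure, still just illustrative Python): "bubbling" on a mispredict amounts to clearing the latches of the stages holding wrong-path instructions and refetching from the correct target.

```python
# On a mispredict, the wrong-path instructions sitting in the younger
# stages are simply thrown away: clear those latches (leaving "bubbles")
# and restart fetching from the correct target.
def flush_after_mispredict(latches, correct_target):
    for stage in ("fetch", "decode"):   # stages holding wrong-path work
        latches[stage] = None           # None acts as a bubble / NOP
    return correct_target               # new PC to fetch from next cycle

latches = {"fetch": "wrong-path insn", "decode": "wrong-path insn",
           "execute": "branch", "writeback": None}
pc = flush_after_mispredict(latches, correct_target=0x40)
print(pc, latches)   # 64, with the first two stages cleared
```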

Why doesn't someone make a git repo where people can contribute texts like that, so even people who are banned from DC (:skull:) can contribute and read it?

e: but I still believe that this is probably the easiest method, as I can check whatever I want to check on ORE Discourse while having access to this user-friendly platform compared to GitHub, IMO.

Good explanation btw, I think I understand it a bit now.

IMO the actual latching of pipelining is quite simple to understand and make, but it's mainly the stall controls and timing stuff that make it quite challenging (if I'm correct).

Getting the timing correct is still the same problem as with any CPU. I mainly label timing as stage:tick.
So, C = clock, F = fetch, D = decode, E = execute, W = writeback. Redstone dust right after a latch, like the instruction register, would be D0. It starts with 0 because you can have -1 or -2 due to an early latch.

To be fair, you don't have to look at redstone-specific tutorials/explanations; since it's computational redstone, literally anything binary/hex/CPU-related would probably carry over to the redstone.

How is GitHub not user friendly? Everyone in the field of CS has an account, and you can create issues to state where you have questions, start discussions, AND have an easy-to-use way to share any kind of file you want.

well, idk how to operate GitHub and I probably won't be allowed

Speaking of which, how are you supposed to address or jump a 10-bit PC with only an 8-bit immediate?
Do you just use 2 bits of the opcode and a MUX to direct it?

  1. Shifted immediate (0-padding). Big jumps should generally align to a page or cache line anyway.
  2. What you are suggesting is fine too. Nobody is forcing you to use only 2^n opcodes.
  3. Relative jumps with 8 bits (I'd assume that covers about 95% of all jump distances). This reduces the pain somewhat, and you can use smaller immediates with sign extension if you need more bits for conditions (see the sketch below).

The only real trouble is if you want to do funky things like jump tables or saving the PC to a program stack in memory, rather than a hardware call stack.
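
If it helps, options 1 and 3 from the list above are just small bit tricks; here's a rough Python sketch assuming the 10-bit PC and 8-bit immediate from the question:

```python
# Two ways to reach a 10-bit PC with an 8-bit immediate.
PC_MASK = (1 << 10) - 1

def shifted_jump(imm8):
    # Option 1: zero-padded / shifted immediate. Targets must be aligned
    # to 4-instruction boundaries (the low 2 bits are always 0).
    return (imm8 << 2) & PC_MASK

def relative_jump(pc, imm8):
    # Option 3: sign-extend the 8-bit immediate and add it to the PC,
    # giving a branch range of -128..+127 around the current instruction.
    offset = imm8 - 256 if imm8 & 0x80 else imm8
    return (pc + offset) & PC_MASK

print(shifted_jump(0xFF))        # 1020 (only multiples of 4 are reachable)
print(relative_jump(512, 0xFE))  # 510  (offset of -2)
```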

What is the difference between this type of hex ROM:


And this type of hex ROM:

I've seen BBV2 use the bottom one, but everyone uses the top version for some reason, so I don't know what the point is.

Rzecz told me that the bottom version is shared memory or RAM, but I'm still confused because I've seen him use it in BBV2, which is a pure Harvard CPU.

The second one is not hex ROM but hex RAM, so yeah, that's apples to pears. It's just that for a data cache, you need hex RAM. It wouldn't store instructions, only data, so that's still pure Harvard.