What’s a thing you have made which demonstrates sufficient engineering knowledge?:
- Built 10 complete generations of Minecraft CPUs, each of which has improved upon the last in both program runtime and capabilities.
- Created an intermediate language called URCL with the help of a handful of members of the ORE community. For the first time, this has enabled high level code to be run in a practical way on custom low level ISAs, and it has allowed programs to be shared between CPUs with different ISAs.
- Written a B compiler which compiles to fairly well optimised URCL code which can then be ported over to a specific ISA relatively easily.
What engineering work went into designing this device?:
My latest CPU is the MPU6.0, which uses a custom version of the VLIW (Very Long Instruction Word) arch. The MPU6 is built upon everything that I have learned from previous generations and is my first CPU with a 5 tick clock speed.
The initial goals of the MPU6 were:
- Test the viability of a 5 tick clock speed. (A 5 tick clock speed which gave faster program runtimes than a 7 tick clock speed was possible on paper but had not been proven yet in MC)
- Test my custom 8 segment character set further.
- Test new user interface.
- Test a 4 stage waterfall pipeline (Fetch → Decode → Execute → Writeback)
- Test the new MMIO layout with a faster hardware multiplier, RNG and a 32 bit BCD converter.
- Test the new automatic forwarding system which is required to make a 5 tick clock speed give faster program runtimes than the 7 tick clock speeds of previous generations.
- Reduce the number of pistons used to increase reliability. (As pistons are hard to predict and have many bugs associated with them)
- Test the viability of using the character display to show numbers instead of a 7 segment display. (This was the first CPU I have built which didn’t have a 7 segment display)
- Refine the ISA further and test the new stall instruction.
- Have translations for URCL instructions and be able to run simple URCL programs.
- Run 8 bit BBP Pi faster than the MPU4 and output the answer in base 10.
- Be able to compact the code in the instruction ROM further, to be able to fit more complex programs in the same ROM space. (Larger ROMs are slower so compacting code is preferable)
- Run 32 bit Fibonacci and print the outputs in base 10.
And as a stretch goal:
- 16 bit BBP Pi in base 10.
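The reasoning behind the 5 tick goal and the forwarding goal can be checked with some rough arithmetic. The sketch below is a simplified model, not a measurement of the MPU6: it assumes every instruction takes one clock cycle plus some average number of stall cycles, and the names `runtime_ticks`, `break_even_stalls` and `stalls_per_instruction` are made up for illustration.

```python
def runtime_ticks(instructions, ticks_per_cycle, stalls_per_instruction=0.0):
    """Total redstone ticks to run a program under a simple stall model.

    Assumption: each instruction costs one cycle plus an average number
    of extra stall cycles caused by hazards that are not forwarded.
    """
    cycles = instructions * (1 + stalls_per_instruction)
    return cycles * ticks_per_cycle


def break_even_stalls(fast_ticks=5, slow_ticks=7):
    """Average stalls per instruction at which the faster clock stops winning."""
    return slow_ticks / fast_ticks - 1


if __name__ == "__main__":
    n = 1000  # hypothetical program length in instructions
    print("7 tick, no stalls:      ", runtime_ticks(n, 7))
    print("5 tick, no stalls:      ", runtime_ticks(n, 5))
    print("5 tick, 0.5 stalls/inst:", runtime_ticks(n, 5, 0.5))
    print("break-even stalls/inst: ", break_even_stalls())
```

Under this model the 5 tick clock only gives faster runtimes while the average stall count stays below roughly 0.4 per instruction, which is why removing stalls through automatic forwarding matters so much for this goal.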
Since the MPU5 had failed, the MPU6 had to take on all of the goals for the MPU5 as well as further goals such as the 5 tick clock speed. But despite the extra burdens which had been put on it, the MPU6 was very successful and was able to achieve far more than just the initial goals.
MPU6 achievements in order of accomplishment:
- Ran 8 bit Fibonacci at a 5 tick clock speed without any read before write hazards which proved the automatic forwarding logic worked.
- Ran 8 bit Fibonacci and printed each of the outputs on the character display in base 10.
- Ran division by repeated subtraction using the hardware counter to enable the main loop to be a single line. (So 75/5 = 15 took just 15 cycles ignoring setup time)
- Printed “Hello World” on the character display.
- Ran 8 bit Fibonacci written in URCL which printed each of the outputs on the character display in base 10.
- Ran 8 bit FizzBuzz which printed out each number in base 10 or Fizz/Buzz on the character display.
- Ran 8 bit FizzBuzz written in URCL which printed out each number in base 10 or Fizz/Buzz on the character display.
- Ran a faster version of 8 bit FizzBuzz written by sammyuri in URCL which printed out each number in base 10 or Fizz/Buzz on the character display. (This is the first time a program which was written by someone other than me had run successfully on any of my CPUs)
- Ran 8 bit Collatz written by sammyuri in URCL which printed out each step in base 10 on the character display.
- Ran 8 bit bubble sort written in URCL which sorted 5 numbers and printed out the initial and final list in base 10 on the character display.
- Ran recursive Fibonacci written in URCL. (This is the first truly recursive program I have run on my CPUs)
- Ran 32 bit Fibonacci and printed each of the outputs on the character display in base 10.
- Ran recursive Fibonacci written in B which was compiled to URCL using the compiler I wrote.
- Ran 8 bit BBP Pi and printed the output in base 10 on the display.
- Ran a Rickroll program which prints out the lyrics of the chorus onto the display.
- Ran 16 bit BBP Pi and printed the output in base 10 on the display.
- Printed user input onto the display. (This was the first time I have had working user input on a CPU)
- Ran a Hangman program. (This was the first interactive game I had ever run on my CPUs)
- Ran a moving pixel program which allows the user to move a pixel on the display using the UI.
- Ran a bouncing pixel program where the pixel bounces when it reaches the edge of the screen.
- Ran Bresenham’s line drawing algorithm written in B and compiled to URCL which was able to draw a line between any two points regardless of the gradient or the direction.
- Ran Ackermann written in B compiled to URCL. (This is the first non-primitive recursive program I have run on my CPUs)
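To make the repeated subtraction item above concrete, here is a minimal Python sketch of the same idea. It is not the MPU6 assembly: the function name is made up for illustration, and the `counter` variable only stands in for the hardware counter. On the MPU6 the whole loop body fits in a single line, so each pass of the loop corresponds to one cycle.

```python
def divide_by_repeated_subtraction(dividend, divisor):
    """Return (quotient, remainder) for unsigned integers.

    The number of loop iterations equals the quotient, which is why
    75 / 5 needs 15 passes of the loop.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if dividend < 0:
        raise ValueError("dividend must not be negative")
    remainder = dividend
    counter = 0  # stands in for the hardware counter
    while remainder >= divisor:
        remainder -= divisor  # subtract once per pass
        counter += 1          # count how many subtractions succeeded
    return counter, remainder


if __name__ == "__main__":
    quotient, remainder = divide_by_repeated_subtraction(75, 5)
    print(quotient, remainder)
```

The cost grows with the size of the quotient, so this approach only makes sense for small results like the 75/5 case.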
I was able to push the MPU6 much further than it had originally been designed to go because of how reliable and predictable the CPU is. The development of URCL plus the B compiler has also allowed much more complex programs to be run on the MPU6, such as full Bresenham’s and Ackermann, which wouldn’t have been possible to write in assembly.
The MPU6 is not without flaws, however. The tiny instruction ROM holds it back from accomplishing more, the 8 segment character set is difficult to read, and there is no screen buffer, which makes the Rickroll program almost impossible to read.
I have plans for the MPU7, which will build upon the MPU6 with a much larger initial scope and will hopefully be my first full computer instead of just a CPU. The computer will be called IRIS (Interchangeable Rapid Instruction System) and will be programmable using the UI, meaning the computer is fully controllable from the UI. It will feature paging, which will allow much more complex programs to be run without sacrificing much speed, and it will be capable of running an elementary operating system.
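For readers who have not seen it, "full" Bresenham’s means the version that handles every gradient and direction rather than just one octant. The Python sketch below shows the usual integer-only error-accumulation form of the algorithm. It is not the B source that ran on the MPU6, and the function name is only illustrative.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the integer points on the line from (x0, y0) to (x1, y1).

    Works for any gradient and direction using only addition,
    subtraction, comparison and doubling.
    """
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction along x
    sy = 1 if y0 < y1 else -1  # step direction along y
    err = dx + dy              # combined error term for both axes
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # error says a step along x is due
            err += dy
            x0 += sx
        if e2 <= dx:   # error says a step along y is due
            err += dx
            y0 += sy
    return points


if __name__ == "__main__":
    # A steep line drawn right to left and top to bottom.
    for point in bresenham_line(2, 5, 0, 0):
        print(point)
```

Because the loop needs no multiplication or division beyond a doubling, this form suits a small CPU, though it still takes enough branching and bookkeeping that writing it in a high level language is far more comfortable than hand-written assembly.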
Also, according to LordDecapo’s benchmark sheet: ISA Benchmarks - Google Sheets
The MPU6 is by far the fastest CPU on there, and for the fib function benchmark with N = 13 it is more than twice as fast as the second fastest CPU, which is LordDecapo’s PIZA CPU.
This is mostly due to the non-trivial custom VLIW arch that the MPU6 uses, which enables it to do multiple operations every clock cycle, combined with its 5 tick clock speed with no read before write hazards.
Image/s and/or video/s of the device:
Every program run on the MPU6 has been recorded and uploaded to YouTube:
- 8 bit Fibonacci on the MPU6.0
- 8 bit Fibonacci V2 on the MPU6.0
- 75/5 by Repeated Subtraction on the MPU6.0
- Hello World on the MPU6.0
- URCL 8 bit Fibonacci on the MPU6.0
- Fizzbuzz on the MPU6.0
- URCL Fizzbuzz on the MPU6.0
- URCL Fizzbuzz (by sammyuri) on the MPU6.0
- URCL Collatz (by sammyuri) on the MPU6.0
- URCL Bubble Sort on the MPU6.0
- URCL Recursive Fibonacci on the MPU6.0
- 32 bit Fibonacci on the MPU6.0
- Compiled Recursive Fibonacci MPU6
- 8 bit BBP Pi on the MPU6
- Super Cool Program on MPU6
- 16 bit BBP Pi on the MPU6
- User Input test on the MPU6
- Hangman on the MPU6
- Moving Pixel on the MPU6
- Bouncing Pixel on the MPU6
- Bresenhams Line Drawer written in B running on the MPU6
- Ackermann written in B running on the MPU6
The main MPU6 docs can be found here: MPU6.0 - Google Sheets
Note - most of the above links had to be obfuscated in order to get around the arbitrary “new accounts can only have 2 links in their posts” restriction.
Although I built the MPU6 in singleplayer, I have put a world edit schematic of it on my builder plot with the rest of my CPUs.
In the future, as this has been requested by a few people, I would like to do a more in depth video going into the specifics of the MPU6 and its arch as well as the challenges I encountered along the way. Perhaps even further in the future, I would like to do a Benny-style tutorial series where I build a simple but reliable CPU (so not a VLIW arch) and explain the timings and assembly language. Unlike Benny, I would fully build it before starting the series so that the series has an ending.
If you have further questions or want a live demonstration of the MPU6 or URCL or both, do say and I’ll see what I can arrange.