Course Details and Team Description
UNL CSCE 430/830 Computer Architecture Spring 2015 Course Project
Team Name - RISCy Business
- Shruti Daggumati (Team Lead)
- Corey Svehla
- Daniel Geschwender
- Austin Schmidt
Project Description
We are implementing a pipelined RISC processor to solve the longest common substring (LCS) problem. We use both hardware and software design to develop the processor by completing the following tasks:
- Task 1: Determine the RISC ISA
- Task 2: Pipeline the processor
- Task 3: Add a static branch predictor
- Task 4: Incorporate instruction and data caches
Project Milestones
|#||Milestone||Date||Status|
|1||Team Formation||Jan. 29, 2015||completed|
|2||Task 1||Feb. 22, 2015||completed|
|3||Task 2||Mar. 14, 2015||completed|
|4||Task 3||Mar. 28, 2015||completed|
|5||Task 4||Apr. 18, 2015||completed|
|6||Project designs & report submitted||Apr. 24, 2015||completed|
|7||Project demonstration||Apr. 23-24, 2015||completed|
Task 1
We first chose an algorithm for the LCS problem. Because brute force is easier to implement than dynamic programming, we wrote a rudimentary brute-force algorithm: we check every substring of string one to see whether it occurs in string two, and keep track of the longest match found so far.
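The brute-force idea can be sketched in a high-level language as follows. This is only an illustration in Python, not the team's assembly program; the function name `longest_common_substring` is a hypothetical choice, and the early exit in the inner loop is a small shortcut that the original assembly may or may not have used.

```python
def longest_common_substring(s1: str, s2: str) -> str:
    """Brute force: try substrings of s1 and test whether each occurs in s2."""
    best = ""
    for start in range(len(s1)):
        # Only substrings longer than the current best are worth checking.
        for end in range(start + len(best) + 1, len(s1) + 1):
            candidate = s1[start:end]
            if candidate in s2:
                best = candidate
            else:
                # Any longer substring from this start contains the failed
                # candidate, so it cannot occur in s2 either.
                break
    return best
```

For example, `longest_common_substring("abcde", "xbcdy")` returns `"bcd"`. When several matches share the maximum length, this sketch keeps the one that starts earliest in the first string.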
Using the following instructions, we implemented our algorithm in assembly:
Finally, we wrote an assembler in Java that translates the ASM file into the instruction and data memory files. We chose Java because all team members are comfortable with the language and its ArrayList and HashMap classes are straightforward to use. The assembler makes two passes: the first pass reads our LCS program in assembly and builds the symbol table; the second pass translates the instructions to hex values and writes two MIF files, one for instruction memory and one for data memory.
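A minimal sketch of the two-pass structure is shown below, written in Python for brevity rather than the Java the team used. Everything specific in it is an assumption made for illustration: the opcode table, the 16-bit word layout (4-bit opcode, 12-bit operand), the `#` comment syntax, and the `label:` convention are hypothetical and do not reflect the project's actual ISA. MIF file generation is left out.

```python
# Hypothetical opcodes for illustration only; the project's real ISA differs.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "BEQ": 0x3, "JMP": 0x4, "HALT": 0xF}

def strip_comment(line):
    """Drop anything after '#' and surrounding whitespace (assumed syntax)."""
    return line.split("#", 1)[0].strip()

def first_pass(lines):
    """Pass 1: record the instruction address of every label."""
    symbols, address = {}, 0
    for line in lines:
        line = strip_comment(line)
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = address  # label points at the next instruction
        else:
            address += 1
    return symbols

def second_pass(lines, symbols):
    """Pass 2: encode each instruction as a 4-digit hex word."""
    words = []
    for line in lines:
        line = strip_comment(line)
        if not line or line.endswith(":"):
            continue
        parts = line.split()
        operand = parts[1] if len(parts) > 1 else "0"
        # An operand is either a label from the symbol table or a literal.
        value = symbols[operand] if operand in symbols else int(operand, 0)
        word = (OPCODES[parts[0]] << 12) | (value & 0xFFF)
        words.append(f"{word:04X}")
    return words

def assemble(source):
    lines = source.splitlines()
    return second_pass(lines, first_pass(lines))
```

The forward reference is the reason for two passes: in a program such as `BEQ done` followed later by `done:`, the address of `done` is not known when the branch is first read, so the symbol table has to be complete before any instruction is encoded.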
In conclusion, we tailored the ISA to the longest common substring problem, wrote an LCS program, and assembled it with our own assembler.
Task 2
Data forwarding takes place in stage 4, the execute stage. Its purpose is to prevent data hazards while multiple instructions are in flight. The most recent data values are taken from the outputs of stages 4 and 5, and their write addresses are compared with the source addresses of the values arriving from stage 3. If the addresses match and the write flag is enabled, the forwarded value is treated as more current than the incoming one and is muxed into stage 4 for use in stages 4 and 5. This eliminates all data hazards for stages 4 and 5.
When branch prediction was added, much of the branch logic moved from stage 4 to stage 2, which created additional data hazards in the pipeline. A value modified in stage 5 is written in the same cycle in which stage 2 uses it, but a value modified in stage 4 is not, which creates a hazard. We resolve these hazards by stalling: stages 1 and 2 are frozen until the needed values from the rest of the pipeline are ready for use.
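The forwarding comparison and the stall condition can be modeled in software roughly as below. This is a behavioral sketch in Python, not the project's VHDL. The names `StageOutput`, `forward`, and `needs_stall` are hypothetical, and giving stage 4 priority over stage 5 when both match is an assumption (it is the usual choice, because stage 4 holds the younger instruction).

```python
from typing import NamedTuple

class StageOutput(NamedTuple):
    """Result leaving a pipeline stage (hypothetical field names)."""
    write_enable: bool  # the instruction will write a register
    dest: int           # register address to be written
    value: int          # data value to be written

def forward(src_reg, incoming_value, stage4_out, stage5_out):
    """Pick the most current value for one source operand entering stage 4."""
    # Assumption: stage 4 is checked first because it holds the newer result.
    for out in (stage4_out, stage5_out):
        if out.write_enable and out.dest == src_reg:
            return out.value
    return incoming_value  # no match: the value read earlier is still current

def needs_stall(branch_sources, pending_outputs):
    """True if a branch in stage 2 reads a register that is not yet available."""
    return any(out.write_enable and out.dest in branch_sources
               for out in pending_outputs)
```

In this model, `forward` plays the role of the mux in front of the execute stage, and `needs_stall` stands for the check that freezes stages 1 and 2 while a needed value is still in flight.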
Task 3
In this task, we added a static branch predictor to the pipeline to mitigate control hazards. We use always-not-taken prediction. It is not the most accurate choice, but given the timeline it was the fastest to implement. We moved the branch decision and branch address calculation to the ID stage and added performance counters to monitor prediction accuracy. Our LCS program uses three branch operations, and the branch logic has to account for the different calculation each one requires. With always-not-taken prediction, the next instruction fetched is always the next sequential instruction. When a branch misprediction is detected, we flush the IF/ID pipeline register and load the PC with the correct address.
In conclusion, we built a static branch predictor that uses the always-not-taken method.
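The behavior described for Task 3 can be sketched as a small software model. This is an illustrative Python sketch, not the VHDL design; the class name `AlwaysNotTakenPredictor`, its method names, and the assumption that the PC advances by one per instruction (word-addressed instruction memory) are all hypothetical.

```python
class AlwaysNotTakenPredictor:
    """Behavioral model of always-not-taken prediction with accuracy counters."""

    def __init__(self):
        self.branches = 0  # total branches resolved
        self.correct = 0   # branches that were in fact not taken

    def next_pc(self, pc, is_branch, taken, target):
        """Return (next_pc, flush) for the instruction resolved in the ID stage."""
        fall_through = pc + 1  # assumption: PC counts instructions, not bytes
        if not is_branch:
            return fall_through, False
        self.branches += 1
        if taken:
            # Misprediction: the sequentially fetched instruction in IF/ID
            # is flushed and the PC is redirected to the branch target.
            return target, True
        self.correct += 1
        return fall_through, False

    def accuracy(self):
        """Fraction of branches predicted correctly (0.0 if none seen)."""
        return self.correct / self.branches if self.branches else 0.0
```

Under this scheme every taken branch costs one flushed instruction, so the accuracy counter directly reflects how often the program's branches fall through.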
Task 4
Cache Design
Both caches (instruction and data) are designed as follows:
- 512 byte cache
- 2-way set associative
- 16 sets (4-bit index)
- 16 bytes/block (2 bits for word addressing)
- 2-bit tags
- LRU replacement policy
- No write allocation
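The parameters above are consistent with each other: 2 ways × 16 sets × 16 bytes per block gives 512 bytes. The sketch below models the lookup in Python under one loudly stated assumption: addresses are 8-bit word addresses laid out as 2 tag bits, 4 index bits, and 2 word-offset bits. That layout is inferred from the bit widths in the list and is not confirmed by the project. The names `split_address` and `TwoWayCache` are hypothetical, only tags are tracked (no data), and updating the LRU order on a write hit is also an assumption.

```python
def split_address(addr):
    """Split an assumed 8-bit word address into (tag, index, word offset)."""
    word = addr & 0x3           # 2 bits: word within the 4-word block
    index = (addr >> 2) & 0xF   # 4 bits: one of 16 sets
    tag = (addr >> 6) & 0x3     # 2 bits: tag
    return tag, index, word

class TwoWayCache:
    """Tag-only model of a 2-way set-associative cache with LRU replacement."""

    def __init__(self):
        # Each set lists its resident tags, least recently used first.
        self.sets = [[] for _ in range(16)]

    def _touch(self, ways, tag):
        ways.remove(tag)
        ways.append(tag)  # mark as most recently used

    def read(self, addr):
        """Return True on a hit; on a miss, load the block (evicting LRU)."""
        tag, index, _ = split_address(addr)
        ways = self.sets[index]
        if tag in ways:
            self._touch(ways, tag)
            return True
        if len(ways) == 2:
            ways.pop(0)  # evict the least recently used way
        ways.append(tag)
        return False

    def write(self, addr):
        """Return True on a hit; a write miss does not allocate a block."""
        tag, index, _ = split_address(addr)
        ways = self.sets[index]
        if tag in ways:
            self._touch(ways, tag)
            return True
        return False
```

For example, after reads to word addresses `0x00`, `0x40`, and `0x00` again, a read to `0x80` (same set, new tag) evicts the block holding `0x40`, because `0x00` was used more recently.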
Cache Performance
We added performance counters to monitor the cache hit rates. There are four counters: a hit counter and a total counter for each of the instruction and data caches. The hit counter increments on every cache hit, and the total counter increments on every cache access. The counters can be viewed on the output LEDs when the switches are set appropriately.
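The hit rate follows from each counter pair as hits divided by total accesses. A minimal Python sketch of one such pair is given below; the class name `CacheCounters` and its methods are hypothetical, and the hardware exposes the raw counts on LEDs rather than computing the ratio.

```python
class CacheCounters:
    """One hit/total counter pair, as kept for each cache."""

    def __init__(self):
        self.hits = 0
        self.total = 0

    def record(self, hit):
        """Count one cache access; also count a hit when hit is True."""
        self.total += 1
        if hit:
            self.hits += 1

    def hit_rate(self):
        """Hits divided by accesses (0.0 before any access)."""
        return self.hits / self.total if self.total else 0.0

# Two independent pairs give the four counters described above.
instruction_counters = CacheCounters()
data_counters = CacheCounters()
```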
Verification
The assembly file was verified with the SPIM tool, and we wrote code to test each instruction individually. Testing was done modularly using both software simulations and hardware tests. The software simulations were run in ModelSim, which provides an accurate view of the projected output, including timing delays due to hardware. Individual components of the project were tested before being added to ensure correct functionality. Once a component was added, further testing confirmed that it performed as expected. Connections to the board were tested with basic VHDL circuits to ensure the pin assignments were correct. The final LCS program was tested entirely in ModelSim before being verified on the FPGA board. All example inputs were tested for correctness, and the hit rates were compared against basic projections.
Meeting 1 (02/10/2015)
- Decided the instructions needed for the ISA
- Decided on the language we will use to implement our assembler: Java
- Decided on the instructions we need to determine the longest common substring.
Meeting 2 (02/16/2015)
- First stage of the 2-stage assembler is complete. The second stage is in progress.
- We are modifying the supplied processor to support the instructions for our ISA. Our ISA has been updated.
- Our assembly code is complete and verified with SPIM.
- Looked over the DE2_clock example in order to make use of the on-board LCD display.
Meeting 3 (02/23/2015)
- Completed the 2-stage assembler and decided how the data will be displayed on the LCD display. (Task 1 is complete.)
- Discussed partitioning the pipeline stages and creating pipeline registers for the processor data and control paths.
- Need to incorporate data hazard resolution mechanisms.
Meeting 4 (03/02/2015)
- Discussed implementation of Tasks 2-4.
- These next few tasks will be implemented simultaneously
- Task 2 is almost complete
- Data forwarding and data hazard detection units are implemented
- Pipeline registers in the processor data and control paths and the partitioning of pipeline stages still need to be implemented
- Discussed the idea of implementing branch prediction in the ID stage, since that's where we want to flush
- Possibly look at backward taken, forward not taken (BTFNT) branch prediction
Meeting 5 (03/09/2015)
- We will implement static "always-not-taken" branch prediction to start off. Since it is the easiest to implement, we will use it as the baseline for branch prediction.
- We know that "always-not-taken" prediction is not very effective for the LCS problem, since most of the time the branch is taken.
- Made corrections to the LCD display VHDL to correct erroneous behavior
- Finished code to test each instruction individually
Meeting 6 (03/16/2015)
- LCD display was scrapped due to timing and reset issues
- New display will be using the red LED pins on the board, the first half for location one and the second half for location two
Meeting 7 (03/23/2015)
- Spring Break
- Supposed to finish up Milestone 3
Meeting 8 (03/30/2015)
- Due to other deadlines (research papers), Milestone 3 was put on hold.
- Milestone 3 and 4 are still in progress
- Starting to write the final report
Meeting 9 (04/06/2015)
- Milestone 3 and 4 are almost done
- We still need to integrate
- We are in the process of writing the report
Meeting 10 (04/13/2015)
- Milestone 3 is done after much delay
- Milestone 4 is in the process of being completed
- The report is being written
- Refinements and integration are taking place
Meeting 11 (04/20/2015)
- Milestone 4 is complete
- Cleaning up our wiki page to make sure everything is explained and is in order
- Finishing the report
- Need to determine what date we will demonstrate our project on