CprE 583 Adaptive Computing Systems


Assignment 4 Distributed on: October 03, 2000

Due on: October 17, 2000

Reading: Mapping Literature.


Problem 1

Consider a four-stage pipeline where stage S1 is used in time steps 1 and 6, stage S2 is used in time steps 2 and 4, stage S3 is used in time step 3, and stage S4 is used in time steps 4 and 5. Draw the reservation table and answer the following questions.

  1. What are the forbidden latencies and the initial collision vector?
  2. Draw the state transition diagram for scheduling the pipeline.
  3. Determine the MAL associated with the shortest greedy cycle.
  4. Determine the pipeline throughput corresponding to the MAL.
  5. Determine the lower bound on the MAL for this pipeline. Have you obtained the optimal latency from the state diagram?
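
As a way to check hand-derived results for the questions above, the following minimal sketch computes forbidden latencies and a collision vector from a reservation table. It assumes the usual textbook definitions: a latency is forbidden if some stage is used at two time steps that far apart, and the collision vector is written C_m ... C_1 with C_i = 1 when latency i is forbidden. The function names and the dictionary layout are illustrative choices, not part of the assignment.

```python
from itertools import combinations


def forbidden_latencies(usage):
    """Return the sorted forbidden latencies of a reservation table.

    `usage` maps a stage name to the list of time steps in which that
    stage is busy (hypothetical input format chosen for this sketch).
    """
    forbidden = set()
    for steps in usage.values():
        # Any two marks in the same row collide at their distance.
        for early, late in combinations(sorted(steps), 2):
            forbidden.add(late - early)
    return sorted(forbidden)


def collision_vector(forbidden):
    """Return the bit string C_m ... C_1 (C_i = '1' if latency i is forbidden)."""
    if not forbidden:
        return ""
    m = max(forbidden)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


def mal_lower_bound(usage):
    """Lower bound on the MAL: the largest number of marks in any row."""
    return max(len(steps) for steps in usage.values())


if __name__ == "__main__":
    # Reservation table from Problem 1.
    table = {"S1": [1, 6], "S2": [2, 4], "S3": [3], "S4": [4, 5]}
    latencies = forbidden_latencies(table)
    print("forbidden latencies:", latencies)
    print("collision vector   :", collision_vector(latencies))
    print("MAL lower bound    :", mal_lower_bound(table))
```

The script only automates the bookkeeping; the state transition diagram and the greedy cycle still have to be derived from the collision vector.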

Problem 2

(1) Show a scheme that uses the fast carry logic of the Xilinx 4000 series to implement the 2's complement function of an 8-bit input.

(2) Show all the configuration data for an 8-bit adder/subtracter/incrementer/decrementer. You can photocopy the fast carry logic schematic in Figure 1 of Xilinx XAPP 013 and show the configuration bits on this schematic.

(3) Let us build an 8-bit ALU using Xilinx 4000 fast carry logic. The two inputs are named A[7:0] and B[7:0]. The output is C[7:0]. You should also produce a carry output and an overflow output. The operations to be supported are ADD, SUB, AND, and OR, selected by the operation codes 10, 11, 00, and 01, respectively. Draw a schematic using Xilinx Foundation Software Tools. Simulate your design to verify its correctness. Implement your design and report the relevant parameters of the design, such as cycle time, hardware used, etc.
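
When checking simulation results for part (3), a small behavioral reference model can be handy. The sketch below is one possible model, not a prescribed design. It assumes that SUB is computed as A + ~B + 1 (the same invert-and-carry-in idea as the 2's complement in part (1)), that the carry output is the raw carry out of bit 7, that overflow is signed (2's complement) overflow, and that both flags are 0 for AND and OR. The function name `alu8` and these flag conventions are assumptions; adjust them to match your own schematic.

```python
def alu8(a, b, op):
    """Reference model of the 8-bit ALU; returns (c, carry, overflow).

    Opcodes follow the assignment: 0b10 ADD, 0b11 SUB, 0b00 AND, 0b01 OR.
    Flag behavior is an assumption of this sketch (see the text above).
    """
    a &= 0xFF
    b &= 0xFF
    if op == 0b00:
        return a & b, 0, 0
    if op == 0b01:
        return a | b, 0, 0
    if op == 0b10:
        operand, carry_in = b, 0          # A + B
    elif op == 0b11:
        operand, carry_in = b ^ 0xFF, 1   # A + ~B + 1 = A - B
    else:
        raise ValueError("opcode must be a 2-bit value")

    total = a + operand + carry_in
    c = total & 0xFF
    carry = total >> 8
    # Signed overflow: both addends share a sign that differs from the result's.
    overflow = ((a ^ c) & (operand ^ c) & 0x80) >> 7
    return c, carry, overflow


if __name__ == "__main__":
    for a, b, op in [(0x7F, 0x01, 0b10), (0x05, 0x03, 0b11),
                     (0xF0, 0x3C, 0b00), (0xF0, 0x3C, 0b01)]:
        c, carry, overflow = alu8(a, b, op)
        print(f"op={op:02b} a={a:02X} b={b:02X} -> "
              f"c={c:02X} carry={carry} overflow={overflow}")
```

Comparing the model against the simulated schematic over all 2^18 input combinations is cheap and catches most wiring mistakes in the carry chain.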


Problem 3

(1) Compare the performance (total cycles and time to first output) of the PipeRench architecture in the following three cases: (a) it takes 1 cycle to load a configuration and 1 cycle to load a data element, and there are separate buses to load data and configurations; (b) it takes 4 cycles to load a configuration and 1 cycle to load a data element; (c) it takes 12 cycles to load a configuration and 2 cycles to load a data element. You can take the number of data elements as X, the number of stripes available as N, and the number of pipeline stages in an application as K.

(2) Suppose you were to propose another way to program configurations in the GARP architecture. What would be your approach? Pay attention to the convenience of specifying configurations in a higher-level language like C, assembly programming, microarchitecture changes, cache and memory misses, etc.