You are here: Foswiki>Main Web>SimpleScalar>ResearchTopics>ReconfigurableComputingReadingList>Callahan2000 (27 Mar 2011, ToanMai)Edit Attach

Adapting software pipelining for reconfigurable computing, in CASES

Introduction
The Garp Architecture
Basic Compilation Approach
- Loops
- Hyperblock Selection

Introduction

What
- An improvement for the compiler of Garp reconfigurable coprocessor.
  - Extension of techniques from software compilation (rather than high-level synthesis)
  - Uniform schema for pipelining the HW execution of a broad class of loops on coprocessor.
  - Based heavily on compilation techniques for VLIW architectures exploiting operation-level parallelism ~ adaptation of VLIW software pipelining techniques.
Why
- Apps often contain a mix of computation:
  - spatial computation associated with hardware
  - temporal computation associated with fetch-and-execute model of microprocessor.
  - -> motivation for hybrid architectures.
- As embedded apps grow in size, the productivity benefits of automatic compilation for general purpose computing also grow.
  - designers begin with automatic compilation
  - then focus efforts on improving loops specifically for their own hardware.

The Garp Architecture

2-d array of configurable logic blocks (CLBs)
- has 4 32-bit data buses & 1 32-bit address bus.
- while idle, processor can use the memory buses to load configurations or transfer data between processor regs & array regs
Novel feature: fixed clock frequency & simplified timing model
- Timing model enumerates exactly which combination of routing & computation delay in series can be traversed in 1 clock cycle or less
- example: a module can receive an operand over a short vertical bus & perform a carry chain computation within a single cycle.

Basic Compilation Approach

Garp compiler targets loops for acceleration using reconfigurable array.
- One thread of control hops back and forth between the microprocessor & the array
- When encountering a loop utilizing the array during the execution:
  - the correct configuration for that loop is loaded. That takes minimal time if it was the most recently used or resident in the configuration cache
  - Live values are transferred to the array
  - array is activated & microprocessor goes to sleep
- When loop exits:
  - microprocessor wakes up & determines which exit was taken & retries live variables (from the array) appropriate for that exit
  - continues software execution.

Loops

Garp-specific part of the compiler first recognizes natural loops in the control flow graph (CFG)
- Natural loop = strongly connected sub-graph of the CFG with exactly one entry block dominating all other blocks of the loop & back-reachable from them. Blocks inside a loop form its body.
- Recognizes FOR, WHILE loops or even backwards GOTO statements in the C code.
- All are treated as WHILE

Hyperblock Selection

From each loop, form a hyperblock:
- contains the commonly executed paths in the loop
- only hyperblock is implemented on the array

Topic revision: r3 - 27 Mar 2011, ToanMai

Main

Webs
Main
Sandbox
System

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback