
A Decade of Reconfigurable Computing: a Visionary Retrospective

Coarse-Grained Reconfigurable Architectures
Programming Coarse Grain Reconfigurable Architectures
Compilation Techniques
Parallel Computing vs. Reconfigurable
Conclusions

Coarse-Grained Reconfigurable Architectures

The area of Reconfigurable Computing mostly stresses the use of coarse-grained architectures:
- Fine-grained architectures are less efficient
  - huge routing area overhead
  - poor routability
- Coarse-grained architectures provide
  - operator-level configurable function blocks (CFBs) and word-level datapaths
  - powerful and very area-efficient datapath routing switches
  - a massive reduction of configuration memory, configuration time, and the complexity of the placement and routing problem.
Architectures:
- Mesh-based Architectures
- Architectures based on Linear Arrays
- Crossbar-based Architectures
Mesh-based Architectures
- PEs are arranged in a rectangular 2-D array
- Horizontal and vertical connections provide rich communication resources.
- The layout encourages nearest-neighbor (NN) links between adjacent PEs (to 4 sides or 8 sides).
- Longer lines of different lengths are added for connections spanning distances greater than 1.
- Overview of some Primarily Mesh-Based Architectures
  - Garp:
    - Host: a MIPS-II-like processor
    - Accelerator:
      - a 32-by-24 reconfigurable array (RA) of LUT-based 2-bit PEs
      - the basic unit is a row of 32 PEs, forming a reconfigurable ALU
      - used for specific loops or subroutines
    - Host and RA share the memory hierarchy
  - MorphoSys:
    - Host: a MIPS-like "TinyRISC" processor with extended instruction set
    - Accelerator:
      - a mesh-connected 8-by-8 RA
      - divided into 4 quadrants of 4-by-4 16-bit reconfigurable cells (RCs), each featuring an ALU, multiplier, shifter, and registers
    - A frame buffer for intermediate data/results
    - DMA Controller
    - extra DMA instructions on the host initiate data transfers between main memory and the frame buffer.
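
The nearest-neighbor links and the quadrant split described above can be sketched in a few lines of Python. This is only an illustration: the function names, the absence of wrap-around links at the array edges, and the row-major quadrant numbering are assumptions, not details taken from Garp or MorphoSys.

```python
def nn_neighbors(row, col, rows, cols, eight=False):
    """Return the coordinates of the PEs linked to (row, col) by NN connections.

    Assumption: the mesh has no wrap-around (torus) links at its edges.
    """
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # 4 sides
    if eight:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # plus diagonals: 8 sides
    neighbors = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        if 0 <= r < rows and 0 <= c < cols:
            neighbors.append((r, c))
    return neighbors


def quadrant(row, col, size=8):
    """Index (0..3, row-major; a hypothetical numbering) of the quadrant
    holding cell (row, col) in a size-by-size array split into 4 quadrants."""
    half = size // 2
    return (row // half) * 2 + (col // half)
```

A corner PE has only two 4-side neighbors, while an interior PE has eight neighbors once the diagonal links are included. In an 8-by-8 array each quadrant is a 4-by-4 block, which matches the MorphoSys layout noted above.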
Architectures based on Linear Arrays
- Based on one or several linear arrays with nearest-neighbor (NN) connections.
- They aim at mapping pipelines onto the array.
- If the pipelines have forks, additional routing resources are needed, such as longer lines spanning all or part of the array.
- Some example architectures:
  - RaPiD
    - Reconfigurable Pipelined Datapath (RaPiD)
      - speeds up highly regular, computation-intensive tasks by mapping deep pipelines onto a 1-D RA.
  - PipeRench
    - has an accelerator for pipelined applications
      - several reconfigurable pipeline stages
      - relies on:
        - fast partial dynamic pipeline reconfiguration
        - run-time scheduling of configuration streams and data streams
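
To make the idea of mapping a pipeline onto a linear array concrete, here is a minimal cycle-by-cycle simulation in Python. It is a sketch under stated assumptions, not a model of RaPiD or PipeRench: `run_pipeline` is a hypothetical helper, each stage has a single output register, and `None` marks an empty slot (a bubble), so the data stream itself must not contain `None`.

```python
def run_pipeline(stages, inputs):
    """Simulate a 1-D array in which stage i reads only the register of stage i-1.

    stages: list of one-argument functions, one per pipeline stage.
    inputs: iterable of data items (None is reserved for bubbles).
    """
    regs = [None] * len(stages)            # one output register per stage
    outputs = []
    # Feed the inputs, then feed bubbles until the array has drained.
    for x in list(inputs) + [None] * len(stages):
        if regs[-1] is not None:           # a result leaves the last stage
            outputs.append(regs[-1])
        # Update from the last stage to the first, so that every stage
        # consumes the value its left neighbor produced in the previous cycle.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stages[i](regs[i - 1])
        regs[0] = None if x is None else stages[0](x)
    return outputs


# Three stages computing (x + 1) * 2 - 3, one operator per stage.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results = run_pipeline(stages, [1, 2, 3])
```

Once the pipeline is full, every stage works on a different data item in the same cycle, which is where the speed-up of a deep pipeline comes from. A fork would require a stage to read a register other than its left neighbor's, which is why such pipelines need the longer routing lines mentioned above.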
Future Reconfigurable Architectures:
- sufficiently flexible Reconfigurable Architectures optimized for a particular application domain (e.g. wireless communication, image processing, multimedia, etc.)
- need development tools
  - architectures have a great impact on mapping tools
  - solutions:
    - simple, generic fabric architecture principles
    - development tools that themselves generate the architectures they can manage easily

Programming Coarse Grain Reconfigurable Architectures

Programming frameworks depend heavily on the structure and granularity of the architecture, and they differ by language level.
Assembler Programming
- can be compared to configuration code for FPGAs
Frameworks with FPGA-Style Mapping
Run-time Mapping

Retrospective

Three phases of silicon synthesis and application:

Phase                              Machine paradigm   Algorithms   Resources
(1) Hardware design                no                 fixed        fixed
(2) Microcontroller (von Neumann)  general            variable     fixed
(3) RA usage                       no                 variable     variable

(1) -> (2): a shift from net-list-based CAD (fixed algorithms) to RAM-based synthesis by compilation.
The third phase introduces reconfigurable hardware: RAM-based structural synthesis.

Compilation Techniques

Microprocessor/accelerator(s) symbiosis dominates emerging applications
- Sequential code is downloaded into the host's RAM.
- The accelerator is implemented by CAD.
- -> co-compilation is needed
A change of market structure
- accelerator implementation migrates from the IC vendor to the customer, who has no hardware designers available.
- -> a strong need for automatic compilation from high-level programming language sources onto RAs.
Software/Configware Partitioning & Compilation
- innovative compilers need to partition automatically, according to several criteria:
  - how much workload fits onto the given RA
  - optimal performance
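
As an illustration of the two criteria above, the following Python sketch assigns kernels to the RA greedily, by time saved per unit of RA area, until the RA is full. Everything here is hypothetical: the `Kernel` fields, the cost numbers, and the greedy heuristic itself, which is a simple stand-in and is neither guaranteed to be optimal nor the method of any particular compiler.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    ra_area: int      # hypothetical: number of PEs the kernel needs on the RA
    sw_time: float    # hypothetical: run time when executed on the host
    ra_time: float    # hypothetical: run time when executed on the RA


def partition(kernels, ra_capacity):
    """Greedy software/configware partitioning (a heuristic, not optimal)."""
    def gain_per_area(k):
        return (k.sw_time - k.ra_time) / k.ra_area

    # Only kernels that actually run faster on the RA are candidates.
    candidates = sorted(
        (k for k in kernels if k.ra_area > 0 and k.sw_time > k.ra_time),
        key=gain_per_area,
        reverse=True,
    )
    on_ra, used = [], 0
    for k in candidates:
        if used + k.ra_area <= ra_capacity:   # criterion 1: does it fit?
            on_ra.append(k.name)
            used += k.ra_area
    on_host = [k.name for k in kernels if k.name not in on_ra]
    return on_ra, on_host


kernels = [
    Kernel("fir", 16, 100.0, 10.0),
    Kernel("fft", 48, 200.0, 40.0),
    Kernel("ctrl", 8, 5.0, 20.0),    # slower on the RA, so it stays on the host
    Kernel("dct", 32, 90.0, 30.0),
]
on_ra, on_host = partition(kernels, ra_capacity=64)
```

With a capacity of 64 PEs, "fir" and "fft" fill the RA exactly, and "dct" stays on the host even though it would run faster on the RA. An exact formulation would treat this as a knapsack problem; the greedy ordering only approximates it.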

Parallel Computing vs. Reconfigurable

Rapidly shrinking supercomputing conferences signal a crisis of parallel computing
- For many application areas, process level parallelism yields only poor speed-up improvement per processor added.
- Dominating problem:
  - instruction-driven late binding of communication paths.
  - leads to massive communication switching overhead.
  - the "von Neumann" paradigm is not a communication paradigm.
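
The diminishing return per added processor can be illustrated with a toy cost model. The model is an assumption made for illustration only and does not come from the paper: compute time shrinks as `work / p`, while communication overhead grows linearly as `comm_cost * (p - 1)`.

```python
def speedup(p, work=1000.0, comm_cost=1.0):
    """Speed-up on p processors under the toy model
    T(p) = work / p + comm_cost * (p - 1)."""
    t_parallel = work / p + comm_cost * (p - 1)
    return work / t_parallel


def marginal_gain(p, work=1000.0, comm_cost=1.0):
    """Extra speed-up obtained by going from p to p + 1 processors."""
    return speedup(p + 1, work, comm_cost) - speedup(p, work, comm_cost)
```

Under these assumed parameters the gain from each added processor shrinks steadily, and beyond a certain processor count the communication term dominates, so adding processors lowers the overall speed-up.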

Conclusions

Reconfigurable platforms & applications
- heading from niche to mainstream
- bridging the gap between ASICs & microprocessors.
In the future:
- many system-level products without reconfigurability will not be competitive.
- Reconfigurable Architecture usage will be the key to keeping up the current pace of innovation beyond the limits of silicon.

Topic revision: r5 - 12 Apr 2011, JongeunLee

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.