Reducing communication overhead (1) — shared memory approach

May 26th, 2012

In a computer system with on-chip accelerators, such as a streaming co-processor or a reconfigurable hardware co-processor, it is vital to minimize communication overhead, lest that overhead cancel out any performance gain from the accelerator.

Our first-cut approach to this problem, tailored to the reconfigurable hardware co-processors that we’ve been working with, is an on-chip shared memory that is a little more intelligent than a plain scratch-pad memory but less so than a cache. We call it Configurable Range Memory (CRM). Here is the abstract of an article introducing its basic ideas.

Application-specific hardware and reconfigurable processors can dramatically speed up compute-intensive kernels of applications, offloading work from the main processor. To minimize the communication overhead in such a coprocessor approach, the two processors can share an on-chip memory, which each processor may treat as a scratchpad memory. However, this setup poses a significant challenge to the main processor, which now must manage data on the scratchpad explicitly, often resulting in superfluous data copying. This paper presents an enhancement to the scratchpad, called Configurable Range Memory (CRM), that can reduce the need for explicit management and thus reduce data copying and promote data reuse on the shared memory. Our experimental results using benchmarks from DSP and multimedia applications demonstrate that our CRM architecture can significantly reduce the communication overhead compared to the architecture without shared memory, while not requiring explicit data management.
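To make the copy overhead concrete, here is a toy Python model that counts how many words the main processor copies in each setup. It is only a sketch under assumptions: the abstract does not describe CRM's mechanism, so the `RangeMappedMemory` class, its `base`/`length` range, and both `run_*` functions are hypothetical names that guess at the idea from the name "Configurable Range Memory" and do not reproduce the paper's design.

```python
class RangeMappedMemory:
    """Hypothetical model: accesses inside a configured address range are
    served by the shared on-chip memory; all others go to main memory."""

    def __init__(self, base, length):
        self.base = base
        self.length = length
        self.on_chip = {}   # shared with the coprocessor
        self.main = {}      # ordinary off-chip memory

    def _store_for(self, addr):
        in_range = self.base <= addr < self.base + self.length
        return self.on_chip if in_range else self.main

    def write(self, addr, value):
        self._store_for(addr)[addr] = value

    def read(self, addr):
        return self._store_for(addr)[addr]


def run_with_plain_scratchpad(n):
    """Main processor explicitly copies data into and out of the scratchpad."""
    main_memory = {addr: addr for addr in range(n)}  # input produced in main memory
    scratchpad = {}
    words_copied = 0
    for addr in range(n):            # explicit copy-in
        scratchpad[addr] = main_memory[addr]
        words_copied += 1
    for addr in range(n):            # coprocessor kernel: double every word
        scratchpad[addr] *= 2
    for addr in range(n):            # explicit copy-out
        main_memory[addr] = scratchpad[addr]
        words_copied += 1
    return [main_memory[a] for a in range(n)], words_copied


def run_with_range_memory(n):
    """Main processor uses ordinary reads and writes; no explicit copies."""
    mem = RangeMappedMemory(base=0, length=n)
    for addr in range(n):            # input lands directly in the shared memory
        mem.write(addr, addr)
    for addr in range(n):            # coprocessor works on the shared memory in place
        mem.on_chip[addr] *= 2
    words_copied = 0                 # nothing was copied explicitly
    return [mem.read(a) for a in range(n)], words_copied


plain_result, plain_copies = run_with_plain_scratchpad(8)
range_result, range_copies = run_with_range_memory(8)
print(plain_copies, range_copies)
```

Both runs produce the same result, but in this model the plain scratchpad pays one copy per word in each direction while the range-mapped version pays none. Real hardware would also have to handle capacity limits and coherence, which the sketch ignores.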

 

Read the full text via this.
