Dynamo: A Transparent Dynamic Optimization System

Introduction

  • What?
    • Dynamo - a software dynamic optimization system
      • transparently improves the performance of a native instruction stream
  • Why?
    • Static compiler optimizations are becoming less effective
      • Software side:
        • software now often relies on dynamically linked libraries
      • Hardware side:
        • more complexity is offloaded to the software compiler: CISC -> RISC -> VLIW
      • -> a greater performance burden falls on the static compiler while static analysis faces more obstacles
      • -> leads to:
        • complex compiler software
        • modest performance gains on general-purpose apps
        • highly customized compilers for very narrow classes of apps
  • How?
    • COMPLEMENT not COMPETE with the compiler
    • Dynamo operates @ runtime
    • Interprets a native instruction stream that comes either from
      • a traditional optimizing compiler
      • code dynamically generated by an app (JIT...)
    • Opportunities for Dynamo depend on the source of the input.

Overview

  • Dynamo interprets the stream until a "hot" instruction sequence (trace) is identified
    • generates an optimized version of that trace (a fragment) and places it in the fragment cache
    • the next time the fragment's entry address is encountered -> execute the fragment from the cache (no further interpretation needed)
  • Flow of control:
    • starts by interpreting until a taken branch is encountered (A)
    • look up the branch target in the fragment cache (B)
      • if found -> jump to fragment in cache (F)
      • if not -> check start-of-trace condition (C)
        • what's start-of-trace?
          • loop headers
          • exits from previously identified hot traces
        • if yes -> increment the counter associated with that branch target address (D)
          • if counter > preset hot threshold (E)
            • enter code generation mode (G)
              • interpreted sequence is recorded in a hot trace buffer
              • check end-of-trace condition (H)
                • what's end-of-trace?
                  • backward taken branch
                • if yes
                  • hot trace buffer is optimized (I) into fragment
                    • what's fragment?
                      • single-entry, multi-exit, contiguous sequence of instructions
                    • if too long? -> truncated ->
                      • why?
                      • how?
                  • save to cache (J)
                    • index = app binary address of the start-of-trace instruction
                    • connect to other fragments if possible! -> minimize expensive fragment cache exits
        • if no -> back to normal interpretation
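
The control flow above can be sketched as a toy interpreter. This is only an illustration under stated assumptions, not Dynamo's implementation: the two-instruction ISA (`add`, `blt`), the `HOT_THRESHOLD` value, and all function names are hypothetical. A loop header is approximated as the target of a backward taken branch, the optimization step (I) is skipped, and a "fragment" is just the recorded path replayed with a guard on every step. The letters in the comments refer to the steps in the list.

```python
HOT_THRESHOLD = 50  # assumed value, chosen only for this sketch


def step(program, pc, regs):
    """Interpret one instruction; return (next_pc, branch_taken)."""
    op = program[pc]
    if op[0] == "add":                       # ("add", reg, imm)
        regs[op[1]] += op[2]
        return pc + 1, False
    if op[0] == "blt":                       # ("blt", reg, limit, target)
        if regs[op[1]] < op[2]:
            return op[3], True
        return pc + 1, False
    raise ValueError(f"unknown instruction {op!r}")


def execute_fragment(program, fragment, regs, stats):
    """(F) Replay a cached fragment; return the address at which it exits."""
    for pc, expected_next in fragment:
        next_pc, _ = step(program, pc, regs)
        stats["cached"] += 1
        if next_pc != expected_next:         # control left the recorded path
            break
    return next_pc


def record_trace(program, start, regs, cache, stats):
    """(G)-(J) Interpret from `start`, recording the path into the cache."""
    trace, pc = [], start
    while program[pc][0] != "halt":
        next_pc, taken = step(program, pc, regs)
        stats["interpreted"] += 1
        trace.append((pc, next_pc))
        end_of_trace = taken and next_pc <= pc   # (H) backward taken branch
        pc = next_pc
        if end_of_trace:
            break
    if trace:
        cache[start] = trace                 # (J) index = start-of-trace address
    return pc


def run(program, regs):
    cache, counters = {}, {}
    stats = {"interpreted": 0, "cached": 0}
    pc, lookup, trace_head = 0, False, False
    while program[pc][0] != "halt":
        if lookup and pc in cache:                       # (B) cache hit
            pc = execute_fragment(program, cache[pc], regs, stats)
            trace_head = True                            # fragment exit may start a trace
            continue
        if lookup and trace_head:                        # (C) start-of-trace
            counters[pc] = counters.get(pc, 0) + 1       # (D)
            if counters[pc] > HOT_THRESHOLD:             # (E)
                pc = record_trace(program, pc, regs, cache, stats)
                trace_head = False                       # lookup stays on for the new fragment
                continue
        next_pc, taken = step(program, pc, regs)         # (A) plain interpretation
        stats["interpreted"] += 1
        lookup = taken
        trace_head = taken and next_pc <= pc             # backward target ~ loop header
        pc = next_pc
    return stats


# A 100-iteration loop: i += 1; s += 2; if i < 100 goto 0
LOOP = [("add", "i", 1), ("add", "s", 2), ("blt", "i", 100, 0), ("halt",)]

if __name__ == "__main__":
    registers = {"i": 0, "s": 0}
    print(run(LOOP, registers), registers)
```

In this sketch the loop is interpreted until its header counter passes the threshold; the remaining iterations run from the cache. Because fragments are not linked here, every pass through the loop fragment returns to the dispatcher, which is the expensive cache exit that fragment linking is meant to avoid.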

Startup & Initialization

  • Dynamo
    • a user-mode DLL (shared lib)
    • entry point: dynamo_exec routine invoked by app
      • -> remainder of the app code is under Dynamo control
  • dynamo_exec
    • saves app's context (machine regs, stack env, etc.) to app-context
    • swaps the stack env to Dynamo's stack
      • -> no interference with the runtime stack of app
    • Interpreter (A) starts interpreting app code from return-pc using the context saved in app-context
    • The interpreter never returns to dynamo_exec
  • With Dynamo installed, a call to dynamo_start is needed prior to the jump to _start (the app's main entry point).
  • Dynamo maps & manages a separate area of memory:
    • contains all dynamically allocated objects in Dynamo code
    • access to this area is protected

Fragment Formation

  • Performance improvement opportunities:
    • redundancies that cross static program boundaries: procedure calls, returns, virtual function calls, indirect branches & dynamically linked function calls
    • instruction cache utilization:
      • frequently executing instructions are often non-contiguous in app binary
  • unit of optimization is a trace

Trace selection

  • Uses MRET (most recently executed tail) rather than a profile-based approach to speculate on which trace is hot
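
A minimal sketch of the MRET idea, assuming the execution is given as a stream of basic-block labels and the set of start-of-trace blocks is known in advance. The function name `select_mret` and its parameters are hypothetical. Only start-of-trace blocks carry counters; once a counter passes the threshold, the blocks executed immediately afterwards (the most recently executed tail) are taken as the hot trace, with no path or branch profiling. As a simplification, a trace ends when the next start-of-trace block is reached.

```python
def select_mret(stream, heads, threshold):
    """Return {head: trace} chosen by the most-recently-executed-tail rule."""
    counters = {}      # one counter per start-of-trace block only
    traces = {}
    recording = None   # head whose tail is currently being recorded
    tail = []
    for block in stream:
        if block in heads:
            if recording is not None:        # reached a head again: end of trace
                traces[recording] = tail
                recording = None
            if block not in traces:
                counters[block] = counters.get(block, 0) + 1
                if counters[block] > threshold:
                    recording, tail = block, []   # speculate: the next tail is hot
        if recording is not None:
            tail.append(block)
    return traces                            # an unfinished tail is discarded


# Loop body takes path A-B-D most of the time and A-C-D once.
stream = ["A", "B", "D"] * 3 + ["A", "C", "D"] + ["A", "B", "D"] * 3
print(select_mret(stream, heads={"A"}, threshold=2))
print(select_mret(stream, heads={"A"}, threshold=3))
```

With `threshold=3` the selected tail is the one pass through `A, C, D`, even though `A, B, D` is more frequent overall. This shows the speculative nature of the scheme: it bets that the path executed when the head becomes hot is itself hot, in exchange for keeping far fewer counters than a path profile would need.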
Topic revision: r13 - 09 Apr 2011, ToanMai