HPVM consists of a few relatively independent key components. The following list provides a high-level description of the major components.

  • Front ends that translate higher-level languages to HPVM’s representation. There are currently three main front ends:

    • Hetero-C++ Front End: for compiling applications written in Hetero-C++.

    • Keras and PyTorch Front Ends: for lowering Keras and PyTorch DNN models into HPVM-C.

  • Patched LLVM: provides HPVM IR and a compilation infrastructure, including clang and opt.

  • HPVM Target-specific Code Generation: passes that lower HPVM IR to LLVM IR for each target hardware (CPU, GPU, FPGA, CUDNN, etc.). The resulting modules are then compiled with the corresponding target back end (e.g. the LLVM X86 back end for host/CPU code) to generate the final target-specific binary.

  • HPVM Runtime: a runtime system that interfaces with device-specific OpenCL runtimes to set up and launch application components on GPUs and FPGAs, and that handles host-side control.

  • HPVM DFG Optimization Framework: includes DFG and non-DFG transformations that optimize the HPVM IR, as well as design space exploration that automatically tunes the application for a given target hardware (FPGA or GPU).

  • HPVM Tensor Extension Framework:

    • Predictive tuner: an autotuner library in Python that searches for approximation choices (configurations) offering the best performance gain while staying within a given loss budget for Quality of Service (QoS, such as accuracy).

    • HPVM profiler: an API in Python for measuring real performance of configurations.

    • Tensor runtime: a back end providing implementations of common tensor operators (such as convolution) that HPVM-C functions can be lowered to.
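The division of labor between the predictive tuner and the HPVM profiler can be sketched as a simple search loop: the tuner proposes configurations, the profiler measures them, and the fastest configuration within the QoS loss budget wins. The following is a hypothetical pure-Python illustration of that idea, not the actual HPVM tuner or profiler API; the function names, configuration names, and measured numbers are all assumptions.

```python
# Hypothetical sketch of configuration selection: pick the fastest
# approximation configuration whose QoS (e.g. accuracy) loss stays
# within a user-specified budget. Not the real HPVM tuner/profiler API.

def select_best_config(configs, profile, qos_loss, max_qos_loss):
    """configs: iterable of configuration names.
    profile(c) -> measured runtime in seconds (the profiler's role).
    qos_loss(c) -> QoS loss in percent (the tuner's QoS estimate)."""
    best, best_time = None, float("inf")
    for c in configs:
        if qos_loss(c) > max_qos_loss:
            continue  # violates the QoS constraint; discard
        t = profile(c)
        if t < best_time:
            best, best_time = c, t
    return best, best_time

# Toy data standing in for real measurements:
times = {"fp32": 1.00, "fp16": 0.55, "perforated": 0.40}
losses = {"fp32": 0.0, "fp16": 0.8, "perforated": 3.5}
best, t = select_best_config(times, times.get, losses.get, max_qos_loss=2.0)
print(best, t)  # "fp16" is the fastest configuration within the 2% budget
```

The real framework additionally predicts QoS to avoid measuring every configuration, but the selection criterion is the same: maximize speedup subject to a QoS loss bound.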
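To make the tensor-runtime component concrete, here is a minimal pure-Python sketch of the semantics of one such operator, a 2D convolution with valid padding and stride 1. The function name and interface are illustrative only; the actual runtime provides optimized device implementations rather than code like this.

```python
def conv2d(image, kernel):
    """Naive 2D convolution (valid padding, stride 1) over nested lists.
    Illustrates what a tensor-runtime convolution operator computes."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out

# 3x3 image convolved with a 2x2 averaging kernel -> 2x2 output
img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[0.25, 0.25], [0.25, 0.25]]
print(conv2d(img, k))  # [[3.0, 4.0], [6.0, 7.0]]
```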

More details on each component can be found in the corresponding files below.