Finding Root Causes of Floating Point Error with Herbgrind
Floating-point arithmetic plays a central role in science, engineering, and finance by enabling developers to approximate real arithmetic. To address numerical issues in large floating-point applications, developers must identify root causes, which is difficult because floating-point errors are generally non-local, non-compositional, and non-uniform. This paper presents Herbgrind, a tool to help developers identify and address root causes in numerical code written in low-level C/C++ and Fortran. Herbgrind dynamically tracks dependencies between operations and program outputs to avoid false positives and abstracts erroneous computations to a simplified program fragment whose improvement can reduce output error. We perform several case studies applying Herbgrind to large, expert-crafted numerical programs and show that it scales to applications spanning hundreds of thousands of lines, correctly handling the low-level details of modern floating point hardware and mathematical libraries, and tracking error across function boundaries and through the heap.
PDF Abstract