Skip to content

ab13dd: L-infinity norm translation bugs causing 8.8% error and Linux crashes #10

@jamestjsp

Description

@jamestjsp

Summary

The C translation of ab13dd (L-infinity norm computation) had multiple bugs not caught by the HTML documentation example test. These caused:

  • 8.8% numerical error for MIMO systems (4.65 vs expected 4.28)
  • Crashes on Linux due to heap buffer overflow

Reproduction

# MIMO system where bug manifested
n, m, p = 3, 2, 2
a = np.array([
    [-1.017041847539126, -0.224182952826418,  0.042538079249249],
    [-0.310374015319095, -0.516461581407780, -0.119195790221750],
    [-1.452723568727942,  1.799586083710209, -1.491935830615152]
], order='F')
# ... (full test in test_ab13dd.py::test_ab13dd_mimo_matlab_reference)

# MATLAB reference: hinfnorm(ss(A,B,C,D)) = 4.2768
# C translation returned: 4.65 (8.8% error)

Root Causes

1. Tolerance formula for imaginary-axis eigenvalue detection

Wrong:

if (tm <= toler * (fabs(ev_imag) + one))

Correct (Fortran):

f64 tol_thresh = toler * sqrt(hundrd + wmax);
if (tm <= tol_thresh)

2. Loop iteration off-by-one

Wrong:

while (iter < MAXIT)

Correct:

while (iter <= MAXIT)

3. Algorithm structure

The C code called ab13dx directly for each eigenvalue instead of:

  1. Collecting all candidate frequencies
  2. Sorting and deduplicating
  3. Computing midpoints between consecutive frequencies
  4. Evaluating norm at midpoints

4. MB03XD BALANC parameter

Wrong: "N" (no balancing)
Correct: "B" (both permute and scale)

5. Hamiltonian scaling

Wrong: Divide by gamma²
Correct: Divide by gamma

6. MB03XD SCALE array buffer overflow (Linux crash)

When BALANC='B', mb03xd requires a SCALE array of size N.
Wrong:

f64 scale_val = zero;  // single scalar
mb03xd("B", ..., &scale_val, ...);

Correct:

f64 *scale_arr = &dwork[iwrk2];  // array of size N
iwrk2 += n;
mb03xd("B", ..., scale_arr, ...);

Why Tests Didn't Catch This

The HTML documentation example uses a SISO system (m=1, p=1) with specific structure that happened to work correctly. The bugs only manifested with:

  • MIMO systems (m>1 or p>1) requiring iterative bisection
  • Systems where the tolerance formula affected eigenvalue classification
  • Linux memory layout exposing the buffer overflow

Fix

Lessons Learned

  1. HTML doc examples are insufficient for testing numerical algorithms
  2. MIMO test cases with known MATLAB/reference values are essential
  3. ASAN testing on Linux catches memory issues that macOS tolerates
  4. When translating BALANC/balancing options, verify array size requirements

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions