Mode of the Signal Envelope

One thing that struck me as clever about the HHT was the use of splines fitted through the minima and maxima of a given harmonic. In effect this defines the envelope of the series for that harmonic (level of decomposition). A posteriori, the mean or mode should be more or less equivalent to the average of the two envelope splines. Interesting!

This is a very appropriate way to model the mean within the context of mean reversion (i.e. oscillations around the mode within an envelope). Instead of trying to model the mean directly as a stochastic process, why not model the envelope? This is more appropriate, as the envelope fits directly into our view of mean reversion.

Version 1
I used a regressor to estimate the mean and connected the minima and maxima with a spline for the envelope. The approach has issues (for example, what sort of bias does the mean regressor have with respect to the data?). Some of the issues are visible below:

Picture 1

Version 2
I took a different approach, estimating the inflection points with a regressing “oscillator” (in green) and determining the mid-points between minima and maxima to produce a spline representing the mode (blue). So far it looks good. Edge cases, consolidation, and jumps still need to be considered:

Picture 2
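
A minimal sketch of the mid-point idea, under some loud assumptions: turning points are read directly off the first difference of the series rather than from the regressing oscillator described above, time is a unit-step index, and `midpoint_mode` and `local_extrema` are hypothetical names. Values outside the first and last mid-point are left as NaN, since the edge cases are exactly the part not yet worked out.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def local_extrema(y):
    """Indices of turning points (maxima and minima), in time order."""
    d = np.sign(np.diff(y))
    # a sign change in the first difference marks a turning point;
    # flat plateaus (zero difference) are ignored in this sketch
    return np.where(d[:-1] * d[1:] < 0)[0] + 1

def midpoint_mode(y):
    """Spline through the mid-points of successive turning points."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    idx = local_extrema(y)
    if len(idx) < 3:
        raise ValueError("need at least three turning points")
    # mid-point in both time and level between each adjacent max/min pair
    tm = (t[idx[:-1]] + t[idx[1:]]) / 2.0
    ym = (y[idx[:-1]] + y[idx[1:]]) / 2.0
    spline = CubicSpline(tm, ym, bc_type="natural", extrapolate=False)
    return spline(t), idx   # NaN outside [tm[0], tm[-1]]
```

On a sine wave riding on a linear trend this recovers the trend closely between the first and last mid-points; it says nothing yet about consolidation or jumps.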

More on this later.

Filed under mean, regression, signal-processing, statistics, technical-analysis

Intrinsic Mode Function (Basis Decomposition)

Norden Huang gave a very interesting talk at CERN a few months ago, “A New Method for Non-linear and Non-stationary Time Series Analysis: The Hilbert Spectral Analysis”. I found this through Max Dama’s blog (thanks).

Huang proposes a new approach to signal decomposition for non-linear, non-stationary signals. In the presentation he walks through the issues with Fourier, wavelet, and Poincaré analysis. The issue with each of these approaches is that it either misses the dynamics in the time domain (Fourier, Poincaré) or misses features in the frequency domain (wavelet). He goes on to show that, though Hilbert space analysis presents both the time and frequency domains, the results are skewed by non-stationarity.

My Approach
Before I get into Huang’s approach, let me detail the approach I have been using.   I had explored wavelets a few years ago and realized their limitations.   Wavelets with orthogonal bases lose features in the timeseries due to the 2^n partitioning of the signal.   Features that are aligned at the center of the 2^n partitions are captured most accurately and those on the fringes with diminished accuracy (or not at all).   I designed an empirical basis in response to this:

let X <- <signal vector>
let residual <- X
for (n,rho) in <successive values approximating frequencies 2^n>
{
    # compute data derived basis function
    basis[n] <- <penalized least squares spline> (residual, rho)
    # compute residual
    residual <- residual - basis[n]
}

The above uses a penalized least squares spline as a (near-orthogonal) basis function, decomposing the signal at successive frequencies.   The decomposition has little cross-correlation so works out to be near orthogonal for the most part.
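
The loop above can be sketched in Python as follows. To keep it short I have swapped in a Whittaker smoother (a discrete penalized least-squares fit with a second-difference penalty) in place of the penalized least squares spline, so this is an analogue of the method rather than the method itself. The names `whittaker_smooth` and `decompose` and the penalty values passed in `lams` are illustrative assumptions, and the dense solve is only sensible for short series.

```python
import numpy as np

def whittaker_smooth(y, lam):
    """Minimise sum((y - z)^2) + lam * sum((second difference of z)^2)."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference operator
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)

def decompose(x, lams):
    """Peel off one smooth component per penalty, heaviest smoothing first."""
    residual = np.asarray(x, dtype=float).copy()
    bases = []
    for lam in lams:
        basis = whittaker_smooth(residual, lam)   # data-derived basis function
        bases.append(basis)
        residual = residual - basis               # pass the remainder to the next level
    return bases, residual
```

By construction the bases plus the final residual add back up to the original signal, for example with `bases, residual = decompose(x, [1e6, 1e3, 1.0])`.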

Basis functions in my approach:

Picture 5

Various stages of recomposition:

Picture 3

The basis functions are each optimal least-squares fits and have very little in the way of artifacts. The downside of this basis is non-locality, in that earlier parts of the signal influence the basis function in later parts (usually in a non-intrusive way), and it is expensive to compute at high frequency.

HHT (Hilbert-Huang Transform)
Huang’s approach is quite clever and, importantly, parsimonious. The approach was motivated by the issues one gets with the Hilbert transform when dealing with a non-stationary timeseries. What if we could create an empirical basis function that creates bases that are centered around the maxima/minima and represent the mode? The algorithm is as follows:

let X <- <signal vector>
decomp[1] <- X
for (i in 2:<maximum bases for this signal>)
{
    # determine local maxima and minima in current component
    maxima <- <locate maxima> (decomp[i-1])
    minima <- <locate minima> (decomp[i-1])

    # compute splines through maxima & minima
    maxspline <- spline (maxima)
    minspline <- spline (minima)

    # compute basis function
    mean[i] <- (minspline + maxspline) / 2
    decomp[i] <- decomp[i-1] - mean[i]
}
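
A hedged Python rendering of the loop above, following the pseudocode as written rather than any reference EMD implementation: each pass subtracts the mean of the two envelope splines, and there is no stopping criterion or boundary treatment. `envelope_mean`, `sift`, and `max_steps` are hypothetical names, time is a unit-step index, and the envelopes are natural cubic splines via SciPy's `CubicSpline`.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def envelope_mean(y):
    """Mean of natural cubic splines through the local maxima and minima of y."""
    t = np.arange(len(y), dtype=float)
    inner = np.arange(1, len(y) - 1)
    maxima = inner[(y[inner] > y[inner - 1]) & (y[inner] >= y[inner + 1])]
    minima = inner[(y[inner] < y[inner - 1]) & (y[inner] <= y[inner + 1])]
    if len(maxima) < 2 or len(minima) < 2:
        return None   # too few extrema to build both envelopes
    # splines are extrapolated beyond the first/last extremum (boundary artifacts)
    maxspline = CubicSpline(t[maxima], y[maxima], bc_type="natural")(t)
    minspline = CubicSpline(t[minima], y[minima], bc_type="natural")(t)
    return (minspline + maxspline) / 2.0

def sift(x, max_steps=8):
    """Repeatedly subtract the envelope mean, as in the pseudocode."""
    decomp = [np.asarray(x, dtype=float)]
    means = []
    for _ in range(max_steps):
        m = envelope_mean(decomp[-1])
        if m is None:
            break
        means.append(m)
        decomp.append(decomp[-1] - m)
    return decomp, means
```

The last element of `decomp` plus the sum of `means` reproduces the input, which is a quick sanity check on any variation of the loop.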

Basis functions in Huang’s approach:

Picture 6

Various stages of recomposition with Huang’s approach:

Picture 4

The resulting basis functions have some interesting properties: they are centered around the origin, with a long-term mean approaching zero, and they have locality. HHT is especially good at capturing non-stationary high-frequency oscillations with fidelity.

There are, however, some noticeable artifacts due to the natural-spline approach (notice the exaggerated perturbations in the curve near abrupt price movements). This approach fares less well with abrupt shifts and at timeseries boundaries than the approach I have been using.

There are some aspects of this approach, though, that may make this useful in analysing the mean.  More on this later.

Filed under mean, signal-processing

The “Mean”, take 2

Thinking about the prior post, I am not satisfied with the approach, in that it does not explicitly quantify properties of the mean and of price relative to the mean.

Let’s examine various properties the system should represent:

  1. Integral of Y[t] - μ[t] should be close to 0
    This implies that the mean’s course is balanced between time/distance spent below the mean and above the mean.
  2. Average max amplitude should meet a target amplitude
    We want a predictable mean reversion process. One approach to this is ensuring that the mean is such that it allows for some average max deviation. We will be working with an evolving distribution of amplitudes to modify the behavior of the mean.
  3. Should be smooth and continuous with the exception of jumps
    That is, we minimize the integral of μ''(t)^2 in some ratio with the other constraints. Alternatively we could require an AR(p) process to provide continuity with prior μ(t) observations.
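
To make the three properties concrete, here is a small sketch that measures each of them for a candidate mean. Everything in it is an assumption of mine for illustration: the name `mean_diagnostics`, the unit time step, the definition of an "excursion" as a run of same-signed deviations, and the use of squared second differences as a discrete stand-in for the integral of μ''(t)^2.

```python
import numpy as np

def mean_diagnostics(y, mu):
    """Return (balance, average max amplitude, roughness) for a candidate mean mu."""
    mu = np.asarray(mu, dtype=float)
    dev = np.asarray(y, dtype=float) - mu
    # 1. discrete integral of Y[t] - mu[t]; should be close to 0
    balance = float(dev.sum())
    # 2. split the deviation into excursions at sign changes, take the peak of each
    sign = np.sign(dev)
    cuts = np.where(sign[1:] != sign[:-1])[0] + 1
    amps = [e.max() for e in np.split(np.abs(dev), cuts) if e.max() > 0]
    avg_max_amp = float(np.mean(amps)) if amps else 0.0
    # 3. roughness: sum of squared second differences of mu
    roughness = float(np.sum(np.diff(mu, n=2) ** 2))
    return balance, avg_max_amp, roughness
```

A fitting procedure could then penalize `balance`, the gap between `avg_max_amp` and the target amplitude, and `roughness` in some ratio; how to weight them is left open here.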

A-posteriori Approach
This is a relatively simple problem to solve after-the-fact, where we find a regressor f(t) that meets the above requirements.   The tricky aspect is in the observation of minima and maxima of the price difference in a way that we can integrate into a system of equations.

Assuming we have the regressor f(t), and a function Ea(f) that evaluates the average amplitude over the regressor, we can express this as:

Picture 3

The problem with this is that, though it is optimal for the data set over which it is evaluated, it is unlikely to be optimal relative to future values of Yi. Observe how the regressors differ (the red fitted to the original data series and the green fitted with an additional hour of data):

Picture 2

The initial regressor (red) is no longer optimal given another hour of data. The green regressor now represents the optimum. This analysis points in the direction of determining an “online” estimate of the mean which works probabilistically.

Online Approach
I do not yet have a concrete solution in mind, so what follows is a train of thought:

  • use an evolving stochastic cubic system with constraints to guide coefficient processes (what constraints from above and how)
  • alternatively, model the mean on an autoregressive process with innovations in proportion to a running variance estimate. Equating variance and level duration, we arrive at a formulation for the amount of “innovation”, and therefore the deviation from the mean, that is allowed.
  • Another variant of the above approach is to adjust the AR coefficients to respond to changes in volatility.
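
As one possible reading of the autoregressive idea in the list above (not a worked-out solution), the sketch below updates the mean with an AR(1)-style step toward the latest price and caps that step in proportion to a running standard deviation of the deviations. The function name `online_mean` and the constants `phi`, `var_decay`, and `k` are illustrative assumptions, not values derived from anything in this post.

```python
import numpy as np

def online_mean(prices, phi=0.98, var_decay=0.97, k=0.1):
    """AR(1)-style running mean whose innovation is capped by a running std estimate."""
    prices = np.asarray(prices, dtype=float)
    mu = np.empty(len(prices))
    mu[0] = prices[0]
    var = 0.0
    for t in range(1, len(prices)):
        err = prices[t] - mu[t - 1]
        # exponentially weighted running variance of price relative to the mean
        var = var_decay * var + (1.0 - var_decay) * err * err
        limit = k * np.sqrt(var)   # largest move of the mean allowed this step
        step = np.clip((1.0 - phi) * err, -limit, limit)
        mu[t] = mu[t - 1] + step
    return mu
```

Making `phi` itself a function of the running variance would be a sketch of the third bullet; I have left it constant here to keep the behavior easy to reason about.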

I’ll update as the ideas mature …

Filed under mean, signal-processing, statistics, stochatistic