statistics

Calculating average Ksn between knickpoints

Posted on

A blog reader recently asked whether it is possible to obtain average gradients (or Ksn) between two knickpoints. That’s a brilliant question because river steepness between knickpoints will provide clues about changes in fault movements or tectonic uplift.

Read the rest of this entry »

Mean basin ksn and smoothing madness

Posted on

This post is written by Richard Ott, PostDoc at the GFZ Potsdam. Richard’s research focuses on tectonic geomorphology, but he has also been working on diverse topics bringing in his expertise in geomorphometry, cosmogenic nucleides and other analytical and numerical techniques. Also see Richard’s website: https://richardott.weebly.com/

Ksn is one of the standard topographic metrics used in tectonic geomorphology. Based on the stream-power model it is typically interpreted as a proxy for rock uplift, provided steady state topography. Often, we are interested in the mean ksn of a drainage basin. Especially, when we measured catchment erosion rates (e.g. from 10Be), ksn is a predictor candidate that potentially explains much of the variability in between different catchments. This sounds like a very simple and straightforward task. In this blog post I’ll try to convince you that – as with most things in life – the devil is in the detail…

Read the rest of this entry »

Ksn anomaly plot with confidence bounds (or how to distinguish knickzones from knicknoise)

Posted on Updated on

In response to my previous post, Philippe Steer from the University of Rennes came up with an idea that I’d like to explore a bit more here. In brief, I posted a code and figure that showed an anomaly plot of Ksn and Philippe suggested to have confidence bounds around the mean so that one could identify Ksn values that go beyond the noise inherent in data of longitudinal river profiles. To this end, such analysis would be needed if you want to objectively distinguish knickzones from the noise.

Read the rest of this entry »

Point patterns on stream networks (1)

Posted on

Our new paper on point patterns on stream networks just got published in Earth Surface Processes and Landforms and is open-access (thanks to the DEAL agreement). In this post, I’d like to talk about what the new class PPS is and what it is good for.

Read the rest of this entry »

Bayesian analysis of blog usage

Posted on Updated on

The TopoToolbox blog has now been online since October 2014. Since then, the blog had ~22k visitors. With a total of 8.7k, 2017 was the year with the most visitors. Yet, monthly statistics show quite some variability throughout the years. The blog tends to have fewer visitors in December and during the summer months July and August than during the remaining months. In addition, there are strong fluctuations between usage statistics of months that have seen many posts and those when I had no time to post.  This variability superimposes the general trend. We may thus ask the question, whether we truely experience an increase in visitor counts.

To answer this question, I took monthly visitor counts during the last two years (unfortunately, I can only access the last two years). I saved them in an Excel file which I then imported to MATLAB. My aim was then to do a regression analysis to be able to identify possible trends. Since the dependent data are counts, a standard least squares regression is clearly not advisable. Rather, count data are modelled using Poisson statistics. I thus chose a generalized linear regression model with counts as a Poisson response, and time as independent variable. I used ‘log’ as a link function so that the response variable is Y~Poisson(exp(b1 + b2(X))).

The function fitglm (which is part of the Statistics and Machine Learning Toolbox) is perfectly suited to solve this regression problem. However, the function takes a frequentist approach which I didn’t want to follow here. Rather, I wanted to learn how to tackle this problem with Bayesian inference. Gladly, MATLAB increasingly features tools for Bayesian analysis such as the Hamiltonian Monte Carlo (HMC) sampler. The only thing I needed was some guidance of how to derive the prior distributions and how to calculate the posterior probability density function. For this, I found this paper by Doss and Narasimhan quite useful.

Writing tools for Bayesian analysis is not easy and the code quite difficult to debug. But finally, I managed to get my analysis working, and to return following figure:

blogstats.png
Bayesian analysis of monthly blog statistics.

Clearly, the posterior distribution of beta_1, the slope of the regression, does not include zero so I can assume with high certainty, that we truly observe a trend to higher visitor numbers. Can this finding be extrapolated? Well, I don’t know. Just keep on visiting this blog frequently, and we’ll see.

If you are interested in the technical details, here is the function:

function bayesianpoissonregress(t,y)

%Bayesianpoissonregression
%
% t --> datetime vector (equally spaced)
% y --> vector with counts

t = t(:);
y = y(:);

% Numeric month indices
X = (1:size(t,1))';
% add offset to design matrix
X = [ones(size(X,1),1) X];

% link function
gamma = @(beta) exp(beta(:)'*X')';

% Prior distributions of the parameters are gaussian
a = [0 0]; % prior means
b = [20 20]; % prior std

d = a./b + sum(X.*y);

%% MCMC sampling
hmc = hmcSampler(@(beta) logpdf(beta),[0 0],'UseNumericalGradient',true);
% Estimate max. a posteriori
map = estimateMAP(hmc);
% tune sampler
hmc = tuneSampler(hmc,'StartPoint',map);
% draw chain
chain = drawSamples(hmc,'StartPoint',map);


%% Plot results
figure
subplot(1,2,1)
[~,ax] = plotmatrix(chain);
ylabel(ax(1,1),'\beta_0');
ylabel(ax(2,1),'\beta_1');
xlabel(ax(2,1),'\beta_0');
xlabel(ax(2,2),'\beta_1');

subplot(1,2,2)
% random set of parameters from chain
sam = randperm(size(chain,1),100);
sam = chain(sam,:);

hold on
for r = 1:size(sam,1)
    chat = gamma(sam(r,:)); 
    plot(t,chat,'color',[.7 .7 .7]); 
end
plot(t,y,'--s')
ylabel('Visitors')
box on
% -- end of function

%% Log pdf
function p = logpdf(beta)
% posterior distribution according to 
% http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.9684&rep=rep1&type=pdf
% (Eq. 2.2)

% posterior
p = - sum(1./(b.^2) .* beta.^2) + sum(d .* beta) - sum(gamma(beta));

end
end