From Logistic Regression in SciKit-Learn to Deep Learning with TensorFlow – A fraud detection case study – Part I

08/05/2018 ~ Matthias Groncki ~ 7 Comments

Its quite a long time since my last post. It has been a busy time for me and a lot of things changed. One obvious change was the overdue change of my blog title from Ipython to Jupyter notebooks. Also has my work life shifted over the time from a Quant to a Data Scientist role. The most recent change is that relocated to Bangkok this year. I am still working in the finance industry and now lead a small Data Science team in the Operational Risk area.

After getting used to the new life here in South East Asia I’ve decided to continue my blog. But there will be slight change in the topics, I plan to write more about Machine Learning and Deep Learning in Python and R and less about pricing and XVAs. But I will have also the chance to look into some pricing model validation in my new role, so there is the chance that there will be some quant related postings coming as well.

In the next three coming posts, we will see how to build a fraud detection (classification) system with TensorFlow. We will start to build a logistic regression classifier in SciKit-Learn (sklearn). In the next step will build a logistic regression classifier in TensorFlow from scratch. In the 3rd post we will add a hidden layer to our logistic regression and build a neural network.

You can find the complete source code on GitHub or on kaggle.

For this example we use public available real world data set. You can find the data on kaggle. The datasets contains transactions made by credit cards in September 2013 by european cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. It contains only numerical input variables which are the result of a PCA transformation. Due to confidentiality issues, there are no more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA.

Install requirements

Easiest way is to install the Anaconda Python distribution from Anaconda Cloud. Windows, Mac and Linux is supported. I use the Python 3.6 64 bit Version. Most of the required packages come already with the basic installation. To install Keras and TensorFlow open the Anaconda prompt (shell) and install the missing packages via conda (package manager, similar to apt-get in ubuntu):

conda install -c conda-forge keras tensorflow

Fraud detection with logistic regression in Scikit-Learn

First we load a required libraries and functions

 This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import roc_curve, roc_auc_score, classification_report, accuracy_score, confusion_matrix 
import seaborn as sns
import matplotlib.pyplot as plt

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

Pandas is used for loading the data and a powerful libraries for data wrangling. If you are not familiar with pandas check out the tutorials on the pandas project website. numpy is the underlying numerical library for pandas and scikit-learn. seaborn and matplotlib are used for visualisation.

Load and visualize the data

First we load the data and try to get an overview of the data.

We load the csv-file with the command read_csv and store it as data-frame in our memory.

credit_card = pd.read_csv('../input/creditcard.csv')

Next we try to get an overview of the fraud vs non-fraud distribution, we going to use the seaborn countplot function to produce bar chart.

f, ax = plt.subplots(figsize=(7, 5))
sns.countplot(x='Class', data=credit_card)
_ = plt.title('# Fraud vs NonFraud')
_ = plt.xlabel('Class (1==Fraud)')

We can not even see the bar chart of the fraud cases. As we can see we have mostly non-fraudulent transactions. Such a problem is also called inbalanced class problem.

99.8% of all transactions are non-fraudulent. The easiest classifier would always predict no fraud and would be in almost all cases correct. Such classifier would have a very high accuracy but is quite useless.

For such an inbalanced classes we could use over or undersampling methods to try to balance the classes (see inbalance-learn for example: https://github.com/scikit-learn-contrib/imbalanced-learn), but this out of the scope of todays post. We will come back to this in a later post.

As accuracy is not very informative in this case, the AUC (Aera under the curve) a better metric to assess the model quality. The AUC score is in a two class classification class equal to the probability that our classifier will detect a fraudulent transaction given one fraudulent and genuine transaction to choice from. Guessing would have a probability of 50%.

We create now the feature matrix X and the result vector y. We drop the column Class from the data frame and store it in a new data frame X and we select the column Class as our vector y.

X = credit_card.drop(columns='Class', axis=1)
y = credit_card.Class.values

Due to the construction of the dataset (PCA transformed features, which minimizes the correlation between factors), we dont have any highly correlated features. Multicolinearity could cause problems in a logisitc regression.

To test for multicolinearity one could look into the correlation matrix (works only for non categorical features, which we do today) or run partial regressions and compare the standard errors or use pseudo-R^2 values and calculate Variance-Inflation-Factors.


corr = X.corr()

mask = np.zeros_like(corr, dtype=np.bool)
mask[np.triu_indices_from(mask)] = True

cmap = sns.diverging_palette(220, 10, as_cmap=True)

# Draw the heatmap with the mask and correct aspect ratio
f, ax = plt.subplots(figsize=(11, 9))
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})

Short reminder of Logistic Regression

In Logisitic Regression the logits (logs of the odds) are assumed to be a linear function of the features

$L=\log(\frac{P(Y=1)}{1-P(Y=1)}) = \beta_0 + \sum_{i=1}^n \beta_i X_i.$

Solving this equatation for $p=P(Y=1)$ yields to

$p = \frac{\exp(L)}{1-\exp(L)}.$

The parameters $\beta_i$ can be derived by Maximum Likelihood Estimation (MLE). The likelihood for a given $m$ observation $Y_j$ is

$lkl = \prod_{j=1}^m p^{Y_j}(1-p)^{1-Y_j}.$

To find the maximum of the likelihood is equivalent to the minimize the negative logarithm of the likelihood (loglikelihood).

$-llkh = -\sum_{j=1}^m Y_j \log(p) + (1-Y_j) \log(1-p),$

which is numerical more stable. The log-likelihood function has the same form as the cross-entropy error function for a discrete case.

So finding the maximum likelihood estimator is the same problem as minimizing the average cross entropy error function.

In SciKit-Learn uses by default a coordinate descent algorithm to find the minimum of L2 regularized version of the loss function (see. http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression).

The main difference between L1 (Lasso) and L2 (Ridge) regulaziation is, that the L1 prefer a sparse solution (the higher the regulazation parameter the more parameter will be zero) while L2 enforce small parameter values.

Train the model

Training and test set

First we split our data set into a train and a validation set by using the function train_test_split.

np.random.seed(42)
X_train, X_test, y_train, y_test = train_test_split(X, y)

Model definition

scaler = StandardScaler()
lr = LogisticRegression()
model1 = Pipeline([('standardize', scaler),
                    ('log_reg', lr)])
model1.fit(X_train, y_train)

As preperation we standardize our features to have zero mean and a unit standard deviation. The convergence of gradient descent algorithm are better. We use the class StandardScaler. The class StandardScaler has the method fit_transform() which learn the mean $\mu_i$ and standard deviation $\sigma_i$ of each feature $i$ and return a standardized version $\frac{x_i - \mu_i}{\sigma}$ . We learn the mean and sd on the training data. We can apply the same standardization on the test set with the function transform().

The logistic regression is implemented in the class LogisticRegression, we will use for now the default parameterization. The model can be fit using the function fit(). After fitting the model can be used to make predicitons predict() or return the estimated the class probabilities predict_proba().

We combine both steps into a Pipeline. The pipline performs both steps automatically. When we call the method fit() of the pipeline, it will invoke the method fit_and_transform() for all but the last step and the method fit() of the last step, which is equivalent to lr.fit(scaler.fit_transform(X_train), y_train)

If we invoke the method predict() of the pipeline its equvivalent to lr.predict(scaler.transform(X_train)).

Training score and Test score

confusion_matrix() returns the confusion matrix, C where $C_{0,0}$ are the true negatives (TN) and $C_{0,1}$ the false positives (FP) and vice-versa for the positives in the 2nd row. We use the function accurary_score() to calculate the accuracy our models on the train and test data. We see that the accuracy is quite high (99,9%) which is expected in such an unbalanced class problem. With the method roc_auc_score()can we get the area under the receiver-operator-curve (AUC) for our simple model.

y_train_hat = model1.predict(X_train)
y_train_hat_probs = model1.predict_proba(X_train)[:,1]
train_accuracy = accuracy_score(y_train, y_train_hat)*100
train_auc_roc = roc_auc_score(y_train, y_train_hat_probs)*100
print('Confusion matrix:\n', confusion_matrix(y_train, y_train_hat))
print('Training accuracy: %.4f %%' % train_accuracy)
print('Training AUC: %.4f %%' % train_auc_roc)

Confusion matrix:
 [[213200     26]
 [   137    242]]
Training accuracy: 99.9237 %
Training AUC: 98.0664 %

y_test_hat = model1.predict(X_test)
y_test_hat_probs = model1.predict_proba(X_test)[:,1]
test_accuracy = accuracy_score(y_test, y_test_hat)*100
test_auc_roc = roc_auc_score(y_test, y_test_hat_probs)*100
print('Confusion matrix:\n', confusion_matrix(y_test, y_test_hat))
print('Training accuracy: %.4f %%' % test_accuracy)
print('Training AUC: %.4f %%' % test_auc_roc)

Confusion matrix:
 [[71077    12]
 [   45    68]]
Training accuracy: 99.9199 %
Training AUC: 97.4810 %

Our model is able to detect 68 fraudulent transactions out of 113 (recall of 60%) and produce 12 false positives (<0.02%) on the test data.

To visualize the Receiver-Operator-Curve we use the function roc_curve. The method returns the true positive rate (recall) and the false positive rate (probability for a false alarm) for a bunch of different thresholds. This curve shows the trade-off between recall (detect fraud) and false alarm probability.

If we classifiy all transaction as fraud, we would have a recall of 100% but also the highest false alarm rate possible (100%). The naive way to minimize the false alarm probability is to classify all transaction as genuine. **


fpr, tpr, thresholds = roc_curve(y_test, y_test_hat_probs, drop_intermediate=True)

f, ax = plt.subplots(figsize=(9, 6))
_ = plt.plot(fpr, tpr, [0,1], [0, 1])
_ = plt.title('AUC ROC')
_ = plt.xlabel('False positive rate')
_ = plt.ylabel('True positive rate')
plt.style.use('seaborn')

plt.savefig('auc_roc.png', dpi=600)

Our model classify all transaction with a fraud probability => 50% as fraud. If we choose the threshold higher, we could reach a lower false positive rate but we would also miss more fraudulent transactions. If we choose the thredhold lower we can catch more fraud but need to investigate more false positives.

Depending on the costs for each error, it make sense to select another threshold.

If we set the threshold to 90% the recall decrease from 60% to 45%. while the false positve rate is the same. We can see that our model assign some non-fraudulent a very high probability to be fraud.

y_hat_90 = (y_test_hat_probs > 0.90 )*1
print('Confusion matrix:\n', confusion_matrix(y_test, y_hat_90))
print(classification_report(y_test, y_hat_90, digits=6))

If we set the threshold down to 10%, we can detect around 75% of all fraud case but almost double our false positive rate (now 25 false alarms)

Confusion matrix:
 [[71064    25]
 [   25    88]]
             precision    recall  f1-score   support

          0     0.9996    0.9996    0.9996     71089
          1     0.7788    0.7788    0.7788       113

avg / total     0.9993    0.9993    0.9993     71202

Where to go from here?

We just scratched the surface of sklearn and logistic regression. For example we could spent much more time with the

feature selection / engineering (which is a bit hard without any background information about the features),
we could try techniques to counter the data inbalance and
we could use cross-validation to fine tune the hyperparameters or
try a different regularization (Lasso/Elastic Net) or
try a different optimizer (stochastic gradient descent or mini-batch sgd)
adjust class weights to adjust the decision boundary (make missed frauds more expansive in the loss function)
and finally we could try different classifer models in sklearn like decision trees, random forrests, knn, naive bayes or support vector machines.

But for now we will stop here and we will implement in the next part the logisitc regression model with stochastic gradient descent in TensorFlow and then extend it to a neural net and we will come back to these points at a later time. But in the mean time feel free to play with the notebook and try to change the parameter and see how the model will change.

So long…

Exposure Simulation Part III / CVA for Bermudan Swaptions

26/06/201626/06/2016 ~ Matthias Groncki ~ 1 Comment

In this post we are going to simulate exposures of a bermudan swaption (with physical settlement) through an backward induction (aka American Monte-Carlo) method. These exposure simulations are often needed at the counterparty credit risk management. We will use this exposures to calculate the Credit Value Adjustment (CVA). One could easily extend this notebook to calculate the PFE or other xVAs. The implementation is based on the paper “Backward Induction for Future Values” by Dr. Antonov et al. .

We assume that the value of a derivate and the default probability of the counterpart are independent (no wrong-way risk). For our example our netting set consists of only one Bermudan swaption (but it can be easily extended). For simplicity we assume a flat yield termstructure.

In our example we consider a 5Y Euribor 6M payer swap at 3 percent. The bermudan swaption allows us to enter this swap on each fixed rate payment date.

Just as a short reminder the unilateral CVA of a derivate is given by:

$CVA = (1-R) \int_0^Tdf(t)EE(t)dPD(t),$

with recovery rate R, portfolio maturity T (the latest maturity of all deals in the netting set), discount factor df(t) at time t, the expected exposure EE(t) of the netting set at time t and the default probability PD(t).

We can approximate the integral by the discrete sum

$CVA = (1-R) \sum_{i=1}^n df(t_i)EE(t_i)(PD(t_{i-1})-PD(t_i)).$

In one of my previous posts we calculated the CVA of a plain vanilla swap. We performed the following steps:

Generate a timegrid T
Generate N paths of the underlying market values which influences the value of the derivate
For each point in time calculate the positive expected exposure
Approximate the integral

As in the plain vanilla case we will use a short rate process and assume its already calibrated.

The first two steps are exactly the same as in the plain vanilla case.

But how can we calculate the expected exposure of the swaption?

In the T-forward measure the expected exposure is given by:

$EE(t) = P(0,T) E^T[\frac{1}{P(t,T)} \max(V(t), 0)]$

with future value of the bermudan swaption $V(t)$ at time $t$ . The npv of a physical settled swaption can be negative at time t if the option has been exercised before time t and the underlying swap has a negative npv at time t.

For each simulated path we need to calculate the npv of the swaption $V(t_i,x_i)$ conditional the state $x_i$ at time $t_i$.

In the previous post we saw a method how to calculate the npv of a bermudan swaption. But for the calculation of the npv we used a Monte Carlo Simulation. This would result again in a nested Monte-Carlo-Simulation which is, for obvious reasons, not desirable.

If we have a look on the future value, we see that is very much the same as the continuation value in the bermudan swaption pricing problem.

But instead of calculate the continuation values only one exercises date we calculate it now on each time in our grid. We use a the same regression based approximation.

We approximate the continuation value through some function of current state of the stochastic process x_i:

$V(t_i, x_i) \approx f(x_i)$ .

A common choice for the function is of the form:

$f(x) = a_0 + a_1 g_1(x)+ a_2 g_2(x) + \dots a_n g_n(x)$

with $g_i (x)$ a polynom of the degree $i$ .

The coefficients of this function f are estimated by the ordinary least square error method.

We can almost reuse the code from the previous post, only a few extensions are needed.

We need to add more times to our time grid (we add a few more points in time after the maturity to have some nicer plots):

date_grid = [today + ql.Period(i, ql.Months) for i in range(0,66)] + calldates + fixing_dates

When we exercise the option we will enter the underlying swap. Therefore we need to update the swaption npvs to the underlying swap npv for all points in time after the exercise time on each grid:

if t in callTimes:
cont_value = np.maximum(cont_value_hat, exercise_values)
swaption_npvs[cont_value_hat < exercise_values, i:] = swap_npvs[cont_value_hat < exercise_values, i:].copy()

For our example we can observe the following simulated exposures:

If we compare it with the exposures of the underlying swap, we can see that the some exposure paths coincide. This is the case when the swaption has been exercised. In some of this cases we observe also negative npvs after exercising the option.

For the CVA calculation we need the positive expected exposure, which can be easily calculated:

swap_npvs[swap_npvs<0] = 0
swaption_npvs[swaption_npvs<0] = 0
EE_swaption = np.mean(swaption_npvs, axis=0)
EE_swap = np.mean(swap_npvs, axis=0)

Given the default probability of the counterparty as a default termstructure we can now calculate the CVA for the bermudan swaption:

# Calculation of the default probs
defaultProb_vec = np.vectorize(pd_curve.defaultProbability)
dPD = defaultProb_vec(time_grid[:-1], time_grid[1:])

# Calculation of the CVA
recovery = 0.4
CVA = (1-recovery) * np.sum(EE_swaption[1:] * dPD)

As usual the complete notebook is available here.

Exposure Simulation / CVA and PFE for multi-callable swaps Part II

29/05/201629/05/2016 ~ Matthias Groncki ~ 3 Comments

Hey Everyone,

today I want to continue my last post and show you today how to calculate the NPV of a bermudan swaption through Monte-Carlo simulation. The techniques I will show you in this post can be easily extended to simulate exposure paths of a multi callable swap. This exposure simulations can be used for a xVA calculation like the Credit Value Adjustment (CVA) or Potential Future Exposure (PFE).

A physical settled Bermudan swaption is a path dependent derivate. The option has multiple exercise dates and the swaption holder has the right to enter the underlying swap at any of this exercise dates.

On each of the exercise dates the holder have to decide whether it is optimal to exercise the option now or continue to hold the option and may exercise it on a later date.

Lets consider a bermudan swaption with two exercise dates.

At the latest exercise date T the payoff is the well known european swaption payoff

$V(T) = \max(S(T), 0)$ ,

where $S(T)$ is the npv of the underlying swap at time t.

At the first exercise date $t_i \le T$ the npv of the swaption given by

$V(t_i) = \max(S(t_i), N(t_i) E [ \frac{V(T)} {N(T)} | F_{t_i}]$ .

and so to the NPV at time 0

$V(0) = N(0) E[\frac{V(t_i)}{N(t_i)} | F_0]$ .

One way to solve problem is performing a Monte-Carlo-Simulation. But a naive Monte Carlo approach would require a nested Monte-Carlo Simulation on each path to calculate the continuation value
$C(t_i ) = N(t_i) E[\frac{V(T)}{N(T)} | F_{t_i} ]$ at time $t_i$ .

Lets say we use 100.000 samples in our simulation, so a bermudan swaption with two exercise dates would require 100.000 x 100.000 samples. Which is very time consuming and grows exponential with the number of exercise dates.

Instead of calculate the continuation value also with a Monte-Carlo simulation we will use a approximation. We use the approximation and algorithm developed by Longstaff and Schwarz.

We approximate the continuation value through some function of some state variable (e.g swap rate, short rate, etc…) $x_i$ :

$C(t_i) \approx f(x_i)$ .

A common choice for the function is of the form:

$f(x) = a_0 + a_1 g_1(x)+ a_2 g_2(x) + \dots a_n g_n(x)$

with $g_i (x)$ a polynom of the degree $i$ .

Lets choice $g_i (x)= x^{i-1}$ and $n=5$ for our example.

The coefficients of this function f are estimated by the ordinary least square error method. Thats why some people call this method also the OLS Monte-Carlo Method.

So how does the the algorithm look like?

We would simulate n paths of our stochastic process (in our case the short rate process) and go backward through time (backward induction).

We start at the last exercise date $T$ and calculate for each path the deflated terminal value of the swaption $V_T$ (exactly like we did in the european case).

Then we go back into time to the next exercise date and approximate the needed continuation value by our ordinary least square estimate.

Our choice for the state variable is the state variable of the short rate process itself.

We choice our coefficient so that the square error of $f(x_i) - V(T)$ is minimized, where $x_i$ is a vector of all simulated states over all paths and $V(T)$ the corresponding vector of the calculated npvs at time T from the previous step.

Now we use this coefficients to calculate the continuation value on each path by inserting the state of the current path into the formula.

On path j the deflated value of the swaption will be

$V_{i,j} = \frac{1}{N(t-i, x_{i,j})} max(S(t_i, x_{i,j}), N(t_i, x_{i,j}) C(x_{i,j})$

where $x_{i,j}$ is the the state of the process at time i on path j.

The npv at time 0 is then

$V(0) = N(0) \frac{1}{n} \sum_j=0^n V_{i,j}$ .

Some remarks:

One could also perform a 2nd MC Simulation (forward in time). First estimate the needed coefficients with a smaller numbers of paths. After train the model with this small set generate a larger sets of paths and go straight forward in time and use the approximation of the continuation value of the first run. On each path you only need to go forward until the holder will exercise the option.
The exercise time by this approach is only approximatively optimal, since its uses a approximation of the continuation value. Therefore our price will be an (asymptotic) lower bound of the real price.
The choice of appropriate functions g_i and the degree n is not obvious and can be tricky (see below).

Now lets have a look how this algorithm could be implemented in Python and Quantlib.

We use the notebook from my previous post as our starting point. We use the same yield curves, model (Gaussian short rate model) and the same underlying swap. The underlying 5y swap starts in 1Y.

We setup a bermudan swaption with 2 exercise dates, the first exercise date is on the start date of the swap and the 2nd date on the start date of the 2nd fixed leg accrual period.

Lets add both dates in our list of exercise dates:

calldates = [ settlementDate, 
              euribor6m.valueDate(ql.Date(5,4,2017))
            ]

We need to evaluate the value of the underlying swap at both exercise dates on each simulated path. To do that we introduce two new auxiliary functions. The first gives us all future (relative to an arbitrary evaluation time t) payment times and amounts from the fixed rate leg of the swap:

def getFixedLeg(swap, t):
    """
    returns all future payment times and amount of the fixed leg of the underlying swap

    Parameter:
        swap (ql.Swap)
        t (float) 

    Return:
        (np.array, np.array) (times, amounts)

    """
    fixed_leg = swap.leg(0)
    n = len(fixed_leg)
    fixed_times=[]
    fixed_amounts=[]
    npv = 0
    for i in range(n):
        cf = fixed_leg[i]
        t_i = timeFromReference(cf.date())
        if t_i > t:
            fixed_times.append(t_i)
            fixed_amounts.append(cf.amount())
    return np.array(fixed_times), np.array(fixed_amounts)

The 2nd will give us all future payment times, accrual start and end times, notionals, gearings and day count fractions of the floating leg. We need all this information to estimate the fixing of the float leg.

def getFloatingLeg(swap, t):

    float_leg = swap.leg(1)
    n = len(float_leg)
    float_times = []
    float_dcf = []
    accrual_start_time = []
    accrual_end_time = []
    nominals = []
    for i in range(n):
        # convert base classiborstart_idx Cashflow to
        # FloatingRateCoupon
        cf = ql.as_floating_rate_coupon(float_leg[i])
        value_date = cf.referencePeriodStart()
        t_fix_i = timeFromReference(value_date)
        t_i = timeFromReference(cf.date()) 
        if t_fix_i >= t:
            iborIndex = cf.index()

            index_mat = cf.referencePeriodEnd()
            # year fraction
            float_dcf.append(cf.accrualPeriod())
            # calculate the start and end time
            accrual_start_time.append(t_fix_i)
            accrual_end_time.append(timeFromReference(index_mat))
            # payment time
            float_times.append(t_i)
            # nominals 
            nominals.append(cf.nominal())
    return np.array(float_times), np.array(float_dcf), np.array(accrual_start_time), np.array(accrual_end_time), np.array(nominals)

With these two function we can evaluate the the underlying swap given the time and state of the process:

def swapPathNPV(swap, t):
    fixed_times, fixed_amounts = getFixedLeg(swap, t)
    float_times, float_dcf, accrual_start_time, accrual_end_time, nominals = getFloatingLeg(swap, t)
    df_times = np.concatenate([fixed_times, 
                           accrual_start_time, 
                           accrual_end_time, 
                           float_times])
    df_times = np.unique(df_times)
    # Store indices of fix leg payment times in 
    # the df_times array
    fix_idx = np.in1d(df_times, fixed_times, True)
    fix_idx = fix_idx.nonzero()
    # Indices of the floating leg payment times 
    # in the df_times array
    float_idx = np.in1d(df_times, float_times, True)
    float_idx = float_idx.nonzero()
    # Indices of the accrual start and end time
    # in the df_times array
    accrual_start_idx = np.in1d(df_times, accrual_start_time, True)
    accrual_start_idx = accrual_start_idx.nonzero()
    accrual_end_idx = np.in1d(df_times, accrual_end_time, True)
    accrual_end_idx = accrual_end_idx.nonzero()
    # Calculate NPV
    def calc(x_t):
        discount = np.vectorize(lambda T: model.zerobond(T, t, x_t))
        dfs = discount(df_times)
        # Calculate fixed leg npv
        fix_leg_npv = np.sum(fixed_amounts * dfs[fix_idx])
        # Estimate the index fixings
        index_fixings = (dfs[accrual_start_idx] / dfs[accrual_end_idx] - 1) 
        index_fixings /= float_dcf
        # Calculate the floating leg npv
        float_leg_npv = np.sum(nominals * index_fixings * float_dcf * dfs[float_idx])
        npv = float_leg_npv - fix_leg_npv
        return npv
    return calc

This functions returns us a pricing function, which takes the current state of the process to price calculate the NPV of the swap at time t. The technique we use here is called closure function. The inner function is aware of the underlying swap and the ‘fixed’ time t and can access all variables defined in the outer function.

Remark on the pricing function
This is a very simple function. Its not possible to calculate the correct NPV of the swap for a time t between to fixing times of the floating leg. The current floating period will be ignored. So use that function only to evaluate the swap NPV only on fixing dates. We will extend this function to be capable to evaluate the NPV on any time t in the next post.

Use of the function at time t=0

The generation of the time grid hasn’t changed from the last time. It just consists of three points.

Also the generation of our sample path is the same as last time. After we generate our path we calculate the deflated payoffs of our swap at time T:

pricer = np.vectorize(swapPathNPV(swap, time_grid[-1]))
cont_value = pricer(y[:,-1]) / numeraires[:,-1]
cont_value[cont_value < 0] = 0

First we generate a vectorised function of the pricing function and use it on the array of our sample paths y and then apply the maximum function on the result.

In the next step we go one step back in time and calculate the deflated exercise value of the swaption at that time:

pricer = np.vectorize(swapPathNPV(swap, time_grid[-2]))
exercise_values = pricer(y[:,-2]) / numeraires[:,-2]
exercise_values[exercise_values < 0] = 0

Now we estimate the coefficients of continuation value function. We use the library statsmodels and fit an OLS model to the data.

states = y[:, -2]
Y = np.column_stack((states, states**2, states**3, states**4))
Y = sm.add_constant(Y)
ols = sm.OLS(cont_value, Y)
ols_result = ols.fit()

With this coefficients we can now calculate the continuation value on each path, given the state:

cont_value_hat = np.sum(ols_result.params * Y, axis=1)

The deflated value of the swaption at the first exercise is the maximum out of exercise value and continuation value:

npv_amc = np.maximum(cont_value_hat, exercise_values)

The npv at time 0 is the mean of the simulated deflated npvs at the first exercise date times the value of the numeraire at time 0:

npv_amc = np.mean(npv_amc) * numeraires[0,0]

To check the quality of our regression function $f$ we can have a look on a scatter plot:

As we can see the regression function doesn’t seems to fit that good to the left tail. So we could either increase the degree of our function, try other polynomial function, change to another state variable or try piecewise regression functions.

As usual you can download the source code from my github account or find it on nbViewer.

In the next post we are going to use this regression based approach to generate exposure paths for a multi callable swap.

I hope you enjoy the post. Till next time.

Exposure simulation / PFE and CVA for multi-callable swaps / Bermudan swaptions… Part 1 of 3

02/05/201502/05/2015 ~ Matthias Groncki ~ 3 Comments

In my previous posts we have seen a Monte-Carlo method to generate market scenarios and calculate the expected exposure, potential future exposure and credit value adjustment for a netting set of plain vanilla swaps. In the next three posts we will add multi-callable swaps (Bermudan swaptions) to the netting set.

Roadmap to multi callable products

In the first part we will see how to use a Monte-Carlo simulation for a single-callable swap (European Swaption) pricing. We don’t worry about the model calibration in this posts. This post is intend to fix the used notation and provide a simple example about basic Monte-Carlo pricing. The techniques presented here will be used and extend in the following two posts.

In the second part we develop a regression based approach (aka backward induction method or American Monte-Carlo or Longstaff-Schwartz method) to calculate the npv of a Bermudan swaption. Further we will calibrate the Gaussian short rate model to fit to a set of market prices of European swaptions.

At this point we will be able to calculate the npv of a multi-callable swap but the Monte-Carlo pricing is not suitable for an exposure simulation. Since a nested pricing simulation in a exposure simulation is very time consuming.

Therefore we will modify the American Monte-Carlo method in the third part and make it useable for our exposure simulation. The approach in the third post will follow the method presented by Drs. Alexandre Antonov, Serguei Issakov and Serguei Mechkov in their research paper ‘Backward Induction for Future Values’ and the methodology presented in the book ‘Modelling, Pricing, and Hedging Counterparty Credit Exposure: A Technical Guide’ by Giovanni Cesari et al.

But for now let’s start with something easy: European swaptions.

European swaption pricing

Since my first post we have been living in a single curve world and our model parameter have been being exogenous. To make things even more easy we have been using a flat yield curve. For now we don’t leave this comfortable world and apply the same setting for this example.

How to create a swaption with QuantLib?

An European payer/receiver swaption with physical delivery is an option that allows the option holder at option expiry to enter a payer/receiver swap. The rate paid/received on the fixed leg equals the strike of the swaption.

Given a plain vanilla swap, one can create an European swaption in the QuantLib with very few lines of code. All we need is the expiry date and the settlement type (cash settled or physical delivery).

def makeSwaption(swap, callDates, settlement):
    if len(callDates) == 1:
        exercise = ql.EuropeanExercise(callDates[0])
    else:
        exercise = ql.BermudanExercise(callDates)
    return ql.Swaption(swap, exercise, settlement)

settlementDate = today + ql.Period("1Y")

swaps = [makeSwap(settlementDate,
                  ql.Period("5Y"),
                  1e6,
                  0.03047096,
                  euribor6m)
        ]

calldates = [euribor6m.fixingDate(settlementDate)]
 
swaptions = [makeSwaption(swap, 
                          calldates, 
                          ql.Settlement.Physical) 
             for swap, fd in swaps]

Monte-Carlo pricing

At option expiry $T_E$ the npv of the swaption is $V(T_E) = \max(Swap(T_E), 0.0)$ with $Swap(T_E)$ donating the value of the underlying swap at expiry.

In the Gaussian short rate model under the T-forward measure the zerobond with maturity in T years is our numeraire $N(t)$ . Under the usual conditions and using the T-forward measure we can calculate the npv at time 0 by

$V(0) = N(0) E[\frac{V(T_E)}{N(T_E)} | F_0].$

In our model the numeraire itself is not deterministic so we have to simulate it too.

The Monte-Carlo pricing will consist of three steps

– generate M paths of the short rate process and
– evaluate the swap npv $V_i$ and calculate the numeraire price $N_i$ at option expiry $T_E$ for each path $i=0\dots,M-1$
– and finally approximate the expected value by $\frac{1}{M} \sum_{i=0}^{M-1} \max(\frac{V_i}{N_i},0.0)$ .

Implementation

Instead of using the QuantLib swap pricer we will do the path pricing in Python. Therefore we need to extract the needed information from the instrument.

We convert all dates into times (in years from today). We use the day count convention Act/365.

mcDC = yts.dayCounter()

def timeFromReferenceFactory(daycounter, ref):
    """
    returns a function, that calculate the time in years
    from a the reference date *ref* to date *dat* 
    with respect to the given DayCountConvention *daycounter*
    
    Parameter:
        dayCounter (ql.DayCounter)
        ref (ql.Date)
        
    Return:
    
        f(np.array(ql.Date)) -> np.array(float)
    """
    def impl(dat):
        return daycounter.yearFraction(ref, dat)
    return np.vectorize(impl)

timeFromReference = timeFromReferenceFactory(mcDC, today)

In the first step we extract all fixed leg cashflows and payment dates to numpy arrays.

That are all information we need to calculate the fixed leg npv on a path. We calculate the discount factors for each payment time and multiply the cashflow array with the discount factors array element-wise. The sum of this result gives us the fixed leg npv.

fixed_leg = swap.leg(0)
n = len(fixed_leg)
fixed_times = np.zeros(n)
fixed_amounts = np.zeros(n)
for i in range(n):
    cf = fixed_leg[i]
    fixed_times[i] = timeFromReference(cf.date())
    fixed_amounts[i] = cf.amount()

For the floating leg npv we extract all payment, accrual period start and end dates. We assume that the index start and end dates coincide with the accruals start and end dates and that all periods are regular. With this information we can estimate all floating cashflows by estimating the index fixing through

$fixing(t_s) = (\frac{df(t_s)}{df(t_e)}-1) \frac{1}{dcf_{idx}(t_s,t_e)},$

with the discount factor $df(t$ ) at time $t$ , the year fraction $dcf_{idx}$ between accrual start time $t_s$ and accrual end time $t_e$ using the index day count convention.

float_leg = swap.leg(1)
n = len(float_leg)
float_times = np.zeros(n)
float_dcf = np.zeros(n)
accrual_start_time = np.zeros(n)
accrual_end_time = np.zeros(n)
nominals = np.zeros(n)
for i in range(n):
    # convert base classiborstart_idx Cashflow to
    # FloatingRateCoupon
    cf = ql.as_floating_rate_coupon(float_leg[i])
    iborIndex = cf.index()
    value_date = cf.referencePeriodStart()
    index_mat = cf.referencePeriodEnd()
    # year fraction
    float_dcf[i] = cf.accrualPeriod()
    # calculate the start and end time
    accrual_start_time[i] = timeFromReference(value_date)
    accrual_end_time[i] = timeFromReference(index_mat)
    # payment time
    float_times[i] = timeFromReference(cf.date())
    # nominals 
    nominals[i] = cf.nominal()

We could extend this about gearings and index spreads, but we set the gearing to be one and the spread to be zero.

To calculate the swap npv we need the discount factors for all future payment times (fixed and floating leg), accrual period start and end dates. We store all times together in one array. To get the discount factors we apply the method zeroBond of the GSR model on this array element-wise.

# Store all times for which we need a discount factor in one array
df_times = np.concatenate([fixed_times,
				       ibor_start_time,
				       ibor_end_time,
				       float_times])
df_times = np.unique(df_times)

# Store indices of fix leg payment times in 
# the df_times array
fix_idx = np.in1d(df_times, fixed_times, True)
fix_idx = fix_idx.nonzero()
# Indices of the floating leg payment times 
# in the df_times array
float_idx = np.in1d(df_times, float_times, True)
float_idx = float_idx.nonzero()
# Indices of the accrual start and end time
# in the df_times array
accrual_start_idx = np.in1d(df_times, ibor_start_time, True)
accrual_start_idx = accrual_start_idx.nonzero()
accrual_end_idx = np.in1d(df_times, ibor_end_time, True)
accrual_end_idx = accrual_end_idx.nonzero()

Our pricing algorithm for the underlying swap is:

# Calculate all discount factors
discount = np.vectorize(lambda T: model.zerobond(T, t, x_t))
dfs = discount(df_times)
# Calculate fixed leg npv
fix_leg_npv = np.sum(fixed_amounts * dfs[fix_idx])
# Estimate the index fixings
index_fixings = (dfs[accrual_start_idx] / dfs[accrual_end_idx] - 1) 
index_fixings /= float_dcf
# Calculate the floating leg npv
float_leg_npv = np.sum(nominals * index_fixings * float_dcf * dfs[float_idx])
npv = float_leg_npv - fix_leg_npv

Our time grid for the simulation consists of two points, today and option expiry.

The path generation is very similar like the one in the previous posts, but this time we not only simulate the underlying process but also the numeraires, and we calculate all needed discount factors on a path.

M = 100000
m = len(time_grid)
x = np.zeros((M, m))
y = np.zeros((M, m))
numeraires = np.zeros((M,m))
dfs = np.zeros((M, m, len(df_times)))

for n in range(0,M):
    numeraires[n, 0] = model.numeraire(0, 0)
    
for n in range(0,M):
    dWs = generator.nextSequence().value()
    for i in range(1, len(time_grid)):
        t0 = time_grid[i-1]
        t1 = time_grid[i]
        e = process.expectation(t0, 
                                x[n,i-1], 
                                dt[i-1])
        std = process.stdDeviation(t0,
                                   x[n,i-1],
                                   dt[i-1])
        x[n,i] = e + dWs[i-1] * std 
        e_0_0 = process.expectation(0,0,t1)
        std_0_0 = process.stdDeviation(0,0,t1)
        y[n,i] = (x[n,i] - e_0_0) / std_0_0
        df = np.vectorize(lambda T : model.zerobond(T, t1, y[n,i]))
        numeraires[n ,i] = model.numeraire(t1, y[n, i])
        dfs[n,i] = df(df_times)

Given the matrix of numeraires and discount factors we can calculate the npv on the path very fast using numpy arrays.

index_fixings = dfs[:,-1, accrual_start_idx][:,0,:] / dfs[:, -1, accrual_end_idx][:,0,:] - 1
index_fixings /= float_dcf
floatLeg_npv = np.sum(index_fixings * float_dcf * dfs[:,-1, float_idx][:,0,:] * nominals, 
                     axis = 1) 
fixedLeg_npv = np.sum(fixed_amounts * dfs[:, -1, fix_idx][:,0,:], axis=1)
npv = (floatLeg_npv - fixedLeg_npv)
# Apply payoff function 
npv[npv < 0] = 0
# Deflate NPV
npv = npv / numeraires[:,-1] 
npv = np.mean(npv) * numeraires[0,0]

Some remarks

To extract the information from the swap we use the method leg. This method is not a part of the QuantLib 1.5, but you could clone my QuantLib fork on GitHub (branch: SwigSwapExtension) and build the Swig binding yourself. I also send a pull request to Luigi. Maybe it will be part of the official QuantLib at a later time.

In the real world there are quotes for European swaptions in terms of implied volatility available and one would like use a model that is consistent with the market quotes. This is done by model calibration (choice the model parameter so that the model give the same premium for the quoted swaptions). Of cause one could use the Monte-Carlo pricing to calibrate the model, but this would be very time consuming process. The Gaussian short rate model provide some faster and very convenient routines for that. In the next part we will see how to calibrate the model and use the calibrated model to price Bermudan swaptions.

As usual you can download the notebook on nbviewer or GitHub.

Stay tuned for the next part coming soon…

Expected Exposure and PFE simulation with QuantLib and Python

08/04/201516/04/2015 ~ Matthias Groncki ~ 8 Comments

In this post I will show how to use the Python bindings of the QuantLib library to calculate the expected exposure (EE) for a netting set of interest rate swaps in a IPython notebook. The technique I will present is very simple and works out of the box with standard QuantLib instruments and models. I will use a forward Monte Carlo Simulation to generate future market scenarios out of one-factor gaussian short rate model and evaluate the NPV of all swaps in the netting set under each scenario. The source code to this post (ExpectedExposureSimulation.ipynb) can be found in my repository IPythonScripts on GitHub or at nbviewer .

Methodology

First we define a time grid. On each date/time in our grid we want to calculate the expected exposure. For each date in our time grid we will simulate N states of the market and for each of these states we will calculate the NPV all of instruments in our portfolio / netting set. This results in N x (size of the netting set) simulated paths of NPVs. These paths can be used for EE, CVA (Credit value adjustment) or PFE (potential future exposure) calculations. In the next step will we will floor each path at zero. This give the exposure of the portfolio on a path at each time. The expected exposure is given by the average of all paths: The total number of NPV evaluations is (size of time grid) x (size of portfolio) x N. For a big portfolio and a very dense time grid it can be very time consuming task even if the single pricing is done pretty fast.

Assumption made in this example

For simplicity we restrict the portfolio to plain vanilla interest rate swaps in one currency. Further we assume that we live in a “single curve” world. We will use the same yield curve for discounting and forwarding. No spreads between the different tenor curves neither CSA discounting are taken into account. For the swap pricing we will need future states of the yield curve. In our setup we assume the the development of the yield curve follow an one factor Hull-White model. At the moment we make no assumption on how it is calibrated and assume its already calibrated. In our setting we will simulate N paths of the short rate following the Hull-White dynamics. At each time on each path the yield curve depend only on the state of our short rate process. We will use QuantLib functionalities to simulate the market states and perform the swap pricing on each path. The calculation of the expected exposure will be done in Python.

Technical Implementation

1. Setup of the market state at time zero (today)

rate = ql.SimpleQuote(0.03)
rate_handle = ql.QuoteHandle(rate)
dc = ql.Actual365Fixed()
yts = ql.FlatForward(today, rate_handle, dc)
yts.enableExtrapolation()
hyts = ql.RelinkableYieldTermStructureHandle(yts)
t0_curve = ql.YieldTermStructureHandle(yts)
euribor6m = ql.Euribor6M(hyts)

As mentioned above we live in a single curve world, we use a flat yield curve as discount and forward curve. During the Monte Carlo Simulation we will relink the Handle to the yieldTermStrucutre htys to our simulated yield curve. The original curve is stored in yts and the handle t0_curve.

2. Setup portfolio / netting set

# Setup a dummy portfolio with two Swaps
def makeSwap(start, maturity, nominal, fixedRate, index, typ=ql.VanillaSwap.Payer):
    &quot;&quot;&quot;
    creates a plain vanilla swap with fixedLegTenor 1Y
    
    parameter:
        
        start (ql.Date) : Start Date
        
        maturity (ql.Period) : SwapTenor
        
        nominal (float) : Nominal
        
        fixedRate (float) : rate paid on fixed leg
        
        index (ql.IborIndex) : Index
        
    return: tuple(ql.Swap, list&lt;Dates&gt;) Swap and all fixing dates
    
        
    &quot;&quot;&quot;
    end = ql.TARGET().advance(start, maturity)
    fixedLegTenor = ql.Period(&quot;1y&quot;)
    fixedLegBDC = ql.ModifiedFollowing
    fixedLegDC = ql.Thirty360(ql.Thirty360.BondBasis)
    spread = 0.0
    fixedSchedule = ql.Schedule(start,
                                end, 
                                fixedLegTenor, 
                                index.fixingCalendar(), 
                                fixedLegBDC,
                                fixedLegBDC, 
                                ql.DateGeneration.Backward,
                                False)
    floatSchedule = ql.Schedule(start,
                                end,
                                index.tenor(),
                                index.fixingCalendar(),
                                index.businessDayConvention(),
                                index.businessDayConvention(),
                                ql.DateGeneration.Backward,
                                False)
    swap = ql.VanillaSwap(typ, 
                          nominal,
                          fixedSchedule,
                          fixedRate,
                          fixedLegDC,
                          floatSchedule,
                          index,
                          spread,
                          index.dayCounter())
    return swap, [index.fixingDate(x) for x in floatSchedule][:-1]

The method makeSwap create a new QuantLib plain vanilla swap (see my previous post). We use this method to setup a netting set with two swaps:

portfolio = [makeSwap(today + ql.Period(&quot;2d&quot;),
                      ql.Period(&quot;5Y&quot;),
                      1e6,
                      0.03,
                      euribor6m),
             makeSwap(today + ql.Period(&quot;2d&quot;),
                      ql.Period(&quot;4Y&quot;),
                      5e5,
                      0.03,
                      euribor6m,
                      ql.VanillaSwap.Receiver),
            ]

Our netting set consists of two swaps, one receiver and one payer swap. Both swaps differ also in notional and time to maturity. Finally we create a pricing engine and link each swap in our portfolio with it.

engine = ql.DiscountingSwapEngine(hyts)

for deal, fixingDates in portfolio:
 deal.setPricingEngine(engine)
 print(deal.NPV())

In our Monte Carlo Simulation we can relink the handle hyts and use the same pricing engine. So we don’t need to create new pricing engines or relink the the deals to a new engine. We just need to call the method NPV of the instruments after relinking the yield term structure handle.

3. Monte-Carlo-Simulation of the “market”

We select a weekly time grid, including all fixing days of the portfolio. To generate the future yield curves we are using the GSR model and process of the QuantLib.

volas = [ql.QuoteHandle(ql.SimpleQuote(0.0075)),
ql.QuoteHandle(ql.SimpleQuote(0.0075))]
meanRev = [ql.QuoteHandle(ql.SimpleQuote(0.02))]
model = ql.Gsr(t0_curve, [today+100], volas, meanRev, 16.)
process = model.stateProcess()

The GSR model allows the mean reversion and the volatility to be piecewise constant. In our case here both parameter are set constant. For a more detailed view on the GSR model have a look on the C++ examples “Gaussian1dModels” in the QuantLib or here. Given a time t_0 and state x(t_0) of the process we know the conditional transition density for x(t_1) for t_1 > t_0. Therefore we don’t need to discretize the process between the evaluation dates. As a random number generator we are using the Mersenne Twister.

#%%timeit
# Generate N paths
N = 1500
x = np.zeros((N, len(time_grid)))
y = np.zeros((N, len(time_grid)))
pillars = np.array([0.0, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
zero_bonds = np.zeros((N, len(time_grid), 12))
for j in range(12):
    zero_bonds[:, 0, j] = model.zerobond(pillars[j],
                                         0,
                                         0)
for n in range(0,N):
    dWs = generator.nextSequence().value()
    for i in range(1, len(time_grid)):
        t0 = time_grid[i-1]
        t1 = time_grid[i]
        x[n,i] = process.expectation(t0, 
                                     x[n,i-1], 
                                     dt[i-1]) + dWs[i-1] * process.stdDeviation(t0,
                                              x[n,i-1],
                                              dt[i-1])
        y[n,i] = (x[n,i] - process.expectation(0,0,t1)) / process.stdDeviation(0,0,t1)
        for j in range(12):
            zero_bonds[n, i, j] = model.zerobond(t1+pillars[j],
                                                 t1,
                                                 y[n, i])

We also save the zero bonds prices on each scenario for a set of maturities (6M, 1Y,…,10Y). We use this prices as discount factors for our scenario yield curve.

4. Pricing on path & netting

On each date t and on each path p we will evaluate the netting set. First we build a new yield curve using the scenario discount factors from the step before.

date = date_grid[t]
ql.Settings.instance().setEvaluationDate(date)
ycDates = [date, 
           date + ql.Period(6, ql.Months)] 
ycDates += [date + ql.Period(i,ql.Years) for i in range(1,11)]
yc = ql.DiscountCurve(ycDates, 
                      zero_bonds[p, t, :], 
                      ql.Actual365Fixed())
yc.enableExtrapolation()
hyts.linkTo(yc)

After relinking the yield termstructure handle is the revaluation of the portfolio is straight forward. We just need to take fixing dates into account and store the fixings otherwise the pricing will fail.

for i in range(len(portfolio)):
    npv_cube[p, t, i] = portfolio[i][0].NPV()

5. Calculation EE and PFE

After populating the cube of fair values (1st dimension is simulation number, 2nd dimension is the time and 3rd dimension is the deal) we can calculate the expected exposure and the potential future exposure.

# Calculate the portfolio npv by netting all NPV
portfolio_npv = np.sum(npv_cube,axis=2)
# Calculate exposure
E = portfolio_npv.copy()
E[E&amp;amp;amp;lt;0]=0
EE = np.sum(E, axis=0)/N

With PFE(t) we mean the 95 % quantile of the empirical distribution of the portfolio exposure at time t.

PFE_curve = np.apply_along_axis(lambda x: np.sort(x)[0.95*N],0, E)
MPFE = np.max(PFE_curve)

Conclusion

With very few lines of code you can build a simulation engine for exposure profiles for a portfolio of plain vanilla swaps. These paths allows us to calculate the expected exposure or potential future exposure. Of course is the problem set in real world applications much more complex, we haven’t covered calibration (historical or market implied) or multi currencies / multi asset classes yet. And its getting even more complicated if you have multi-callable products in your portfolio.

But nevertheless I hope you have enjoyed reading this little tutorial and got an first insight into exposure simulation with QuantLib and Python. In one of my next post I will may extend this example about CVA calculation.

So long!

A brief introduction to the QuantLib in Python…

04/04/201508/04/2015 ~ Matthias Groncki ~ 7 Comments

QuantLib is an open-source framework for quantitative finance written in C++. There is an active community who develop and extend the library. QuantLib covers a wide range of financial instruments and markets like IR, FX and Equities and provide pricing engines and models, optimization algorithm, a Monte-Carlo framework, business day conventions, day count conventions, holidays calendars and even more… and grows continuously.

There are existing interfaces to different language like Python, Ruby, Java and more via swig. In the following I will describe how to build the QuantLib for Python.

In the following description I assume that you have already installed Python with IPython including all dependencies, numpy, scipy, pandas and matplotlib. A good starting point are distributions like Anaconda or WinPython (for Windows only). These distribution are shipped with many more useful libraries and provide an easy way to install more packages. Its worth to check them out, if haven’t tried them yet.

For windows you can find pre-build executables of the QuantLib at Christoph Gohlke’s website. If you want to have the most recent version you should clone the current master branch from GitHub and build the library and the python package yourself.

To build QuantLib from sources under Windows you need the correct Visual Studio depending on the Python version you are targeting. For Python 3.3 you need Visual Studio 2010 and for Python 2.7 you need Visual Studio 2008. Additional dependencies are the boost library and swig.

I will not cover how to install the dependencies but you can find binary installer for swig under windows on the swig project website and there also exists precompiled binaries for the boost library on source forge. Add the path to swig command to your PATH environment variable.

The following steps are valid for Python 2.7 and 3.3:

0. Clone the git repository either with using git:

cd c:\path\to\quantlib
git clone https://github.com/lballabio/quantlib .

or download as a zipped file from GitHub.

1. Open the QuantLib_vcXX.sln and build it in “Release” or “Release static runtime” configuration. For more details check the install documentation on the QuantLib project site. You will find the solution under

c:\path\to\quantlib\QuantLib

2. Open command line window and set required environment variables

SET QL_DIR=c:\path\to\quantlib\QuantLib
SET INCLUDE=c:\path\to\boost;%INCLUDE%
SET LIB=c:\path\to\boost\lib;%LIB%

3. Create the interface code with swig

cd c:\path\to\quantlib\QuantLib-SWIG\python
python setup.py wrap

4. Build and install the package

python setup.py build
python setup.py install

5. Test your install

python setup.py test

If you use Visual Studio 2010 and target Python 2.7 you will encounter the error message “Unable to find vcversall.bat“. The following “hack” works for me:

Set an additional environment variable at step 2:

SET VS90COMNTOOLS=%VS100COMNTOOLS%

Under unix you can also find some pre-build packages (e.g. for Ubuntu) in your package-manager or you build it from sources yourself with the usual configure, make and make install chain (see here and for Mac OS X here). After you build the library you can continue with step 3 above. Be sure you have installed swig before. Just download the source from the swig project and install via the usual unix install commands.

After we have managed to install QuantLib Python package I want to show you how you can use the QuantLib C++ classes from Python. You can download the following simple example from my repository “IPythonScripts” on GitHub.

In the first example we will see how to setup an interest rate swap and use a discounting pricing engine to calculate the NPV and the fair rate for this swap.

First we need to import the QuantLib package, since we want also produce some plots we import numpy and matplotlib.

import numpy as np
import matplotlib.pyplot as plt
import QuantLib as ql

In the next step we will setup the yield curve (aka yieldtermstructure). For simplicity we will use an flat forward curve. I will use the same curve for discounting and calculation of the forwards. QuantLib supports multi curve setups, as I will maybe show in a later post or you can see in the examples how to setup more realistic yield curves.


# Set Evaluation Date
today = ql.Date(31,3,2015)
ql.Settings.instance().setEvaluationDate(today)
# Setup the yield termstructure
rate = ql.SimpleQuote(0.03)
rate_handle = ql.QuoteHandle(rate)
dc = ql.Actual365Fixed()
disc_curve = ql.FlatForward(today, rate_handle, dc)
disc_curve.enableExtrapolation()
hyts = ql.YieldTermStructureHandle(disc_curve)

The yieldTermStructure object provides an method which gives us the discount factor for a particular date (QuantLib.Date object) or time in years (with 0 = evaluationDate). This method is called discount. I am using the numpy method vectorize to apply this function on arrays or list of times and then generate a plot of the discount curve.

discount = np.vectorize(hyts.discount)
tg = np.arange(0,10,1./12.)
plt.plot(tg, discount(tg))
plt.xlabel(&amp;quot;time&amp;quot;)
plt.ylabel(&amp;quot;discount factor&amp;quot;)
plt.title(&amp;quot;Flat Forward Curve&amp;quot;)

In the next step we will setup up a plain vanilla EURIBOR 6M Swap with maturity in 10 years. Therefore we generate an index and using the handle to our yield curve as forward curve and two schedules, one for the fixed rate leg with annual payments and one for the float leg with semi annual payments.

start = ql.TARGET().advance(today, ql.Period(&quot;2D&quot;))

end = ql.TARGET().advance(start, ql.Period(&quot;10Y&quot;))

nominal = 1e7

typ = ql.VanillaSwap.Payer

fixRate = 0.03

fixedLegTenor = ql.Period(&quot;1y&quot;)

fixedLegBDC = ql.ModifiedFollowing

fixedLegDC = ql.Thirty360(ql.Thirty360.BondBasis)

index = ql.Euribor6M(ql.YieldTermStructureHandle(disc_curve))

spread = 0.0

fixedSchedule = ql.Schedule(start, end, fixedLegTenor, index.fixingCalendar(), fixedLegBDC, fixedLegBDC, ql.DateGeneration.Backward, False)
floatSchedule = ql.Schedule(start, end, index.tenor(), index.fixingCalendar(), index.businessDayConvention(), index.businessDayConvention(), ql.DateGeneration.Backward, False)

swap = ql.VanillaSwap(typ, nominal, fixedSchedule, fixRate, fixedLegDC, floatSchedule, index, spread, index.dayCounter())

The last step before we can calculate the NPV we need a pricing engine. We are going to use the discountingSwapEngine. As the name suggests, it will discount all future payments to the evaluation date and calculate the difference between the present values of the two legs.

engine = ql.DiscountingSwapEngine(ql.YieldTermStructureHandle(disc_curve))
swap.setPricingEngine(engine)

print(swap.NPV())
print(swap.fairRate())

That was a very brief demonstration of the QuantLib. The examples which are shipped with the QuantLib give a much more detailed insight. So you may have a look on these.

Jupyter notebooks – a Swiss Army Knife for Quants

A blog about quantitative finance, data science in fraud detection, machine and deep learning by Matthias Groncki

Python