Signature Verification with deep learning / transfer learning using Keras and KNIME

In the previous posts we applied traditional machine learning methods and deep learning in Python and KNIME to detect credit card fraud. In this post we will see how to use a pretrained deep neural network to classify images of offline signatures into genuine and forged ones. A network like this could support experts in fighting cheque fraud. We will use the VGG16 network architecture pretrained on ImageNet. The technique we are going to apply is called transfer learning; it allows us to use a deep network even if, as in our case, we have only limited data. The idea for this post is based on the paper 'Offline Signature Verification with Convolutional Neural Networks' by Gabe Alvarez, Blue Sheffer and Morgan Bryant. We combine native KNIME nodes for the data preparation and extend the workflow with Python code in some nodes, using Keras to design the network and perform the transfer learning.

We will use the same data source for our training set: the signature collection of the ICDAR 2011 Signature Verification Competition (SigComp2011), which contains offline and online signature samples. The offline dataset comprises PNG images scanned at 400 dpi in RGB color. The data can be downloaded from http://www.iapr-tc11.org/mediawiki/index.php/ICDAR_2011_Signature_Verification_Competition_(SigComp2011).

Data Preparation

(Example signature image from the dataset: 0206012_02.png)

The offline signature images all have different sizes, so in the first step we resize them to a common size of 150×150 and divide all pixel values by 255 to scale the features into the [0,1] range.
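For readers who prefer to see this outside of KNIME, a minimal sketch of the same preprocessing in plain Python could look like this (using Pillow and NumPy; the folder path is just an illustrative assumption):

```python
import glob
import numpy as np
from PIL import Image

# Assumption: the extracted SigComp2011 offline training images live in this folder.
IMAGE_DIR = "data/sigcomp2011/trainingSet"

images = []
for path in sorted(glob.glob(IMAGE_DIR + "/*.png")):
    img = Image.open(path).convert("RGB")   # make sure we have 3 colour channels
    img = img.resize((150, 150))            # same size for every signature
    images.append(np.asarray(img, dtype="float32") / 255.0)  # scale to [0, 1]

X = np.stack(images)  # shape: (n_images, 150, 150, 3)
```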

With the List Files node we filter all images (PNGs) in our training set folder.


In the next step (Rule Engine node) we mark the first 123 images as forged (fraudulent) and the remaining images as genuine signatures.


In the following metanode we read all images from the files using the Image Reader (Table) node, then standardise the features and resize all images.


In the next step we perform an 80/20 split into training and test data and apply a data augmentation step (using the Keras ImageDataGenerator to create more training data), which finalises our data preparation. We will cover the data augmentation in one of the coming posts.
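As a rough idea of what the augmentation step does, here is a sketch using the Keras ImageDataGenerator; the actual augmentation parameters of the workflow are not listed in this post, so the values below are purely illustrative (X_train and y_train are the arrays from the 80/20 split):

```python
from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings -- the workflow's real values may differ.
datagen = ImageDataGenerator(
    rotation_range=10,        # small random rotations
    width_shift_range=0.1,    # random horizontal shifts
    height_shift_range=0.1,   # random vertical shifts
    zoom_range=0.1,           # random zoom
)

# flow() yields batches of randomly transformed copies of the training images.
batches = datagen.flow(X_train, y_train, batch_size=32)
x_batch, y_batch = next(batches)
```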

Deep Learning network

We will load the weights of a VGG16 network pretrained on the ImageNet classification task using Keras Applications, strip off the last layers, replace them with our new dense layers, and then train and fine-tune the parameters on our data. Instead of training the complete network, which would require a lot more data and training time, we use the pretrained weights and leverage the previous experience of the network; we only learn the last layers for our classification task.


Step 1: Load the pretrained network

We use the DL Python Network Creator node and need to write a few lines of Python code to load the network. We are using Keras, which will automatically download the weights.
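Those lines look roughly like this (a sketch; the name of the node's output variable, output_network, is an assumption):

```python
from keras.applications import VGG16

# Download the ImageNet weights for VGG16 without the fully connected top layers,
# using our 150x150 RGB input size.
vgg = VGG16(weights="imagenet", include_top=False, input_shape=(150, 150, 3))

# The DL Python Network Creator node expects the model in its output variable
# (assumed here to be called output_network).
output_network = vgg
```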


Step 2: Replace Top Layers and freeze weights

In the next step we modify the network: we add three dense layers with dropout and ReLU activations and a sigmoid activation at the end, and we freeze the weights of the original VGG16 layers. Again we need to write a few lines of Python code.
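A sketch of what these lines look like (the layer sizes and dropout rates are illustrative, since the exact values were only visible in the screenshot of the node):

```python
from keras.models import Model
from keras.layers import Flatten, Dense, Dropout

# Freeze the weights of the original VGG16 convolutional base.
for layer in vgg.layers:
    layer.trainable = False

# New top: dense layers with ReLU and dropout, and a sigmoid output
# for genuine vs. forged. Layer sizes and dropout rates are illustrative.
x = Flatten()(vgg.output)
x = Dense(256, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(128, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(64, activation="relu")(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation="sigmoid")(x)

model = Model(inputs=vgg.input, outputs=predictions)
```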


Step 3: Train Top Layers

Then we train the new layers for 5 epochs with the Keras Network Learner node.
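In plain Keras terms, what the Keras Network Learner node does at this stage corresponds roughly to the following; the optimiser, learning rate and batch size are illustrative assumptions:

```python
from keras.optimizers import Adam

model.compile(optimizer=Adam(lr=1e-4),       # illustrative learning rate
              loss="binary_crossentropy",    # genuine vs. forged
              metrics=["accuracy"])

# Only the new top layers are trainable at this point; the VGG16 base stays frozen.
model.fit(X_train, y_train, epochs=5, batch_size=32)  # batch size is illustrative
```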


Steps 4 and 5: Unfreeze and fine-tune

We modify the resulting network, unfreeze the last three layers of the VGG16 base to fine-tune their pre-learned weights, and train the network for another 10 epochs.
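A sketch of this fine-tuning step; exactly which layers are unfrozen and which learning rate is used were only shown in the screenshots, so the values below are illustrative:

```python
from keras.optimizers import Adam

# Unfreeze the last three layers of the VGG16 base so their weights can be fine-tuned.
for layer in vgg.layers[-3:]:
    layer.trainable = True

# Recompile after changing the trainable flags, with a smaller learning rate.
model.compile(optimizer=Adam(lr=1e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.fit(X_train, y_train, epochs=10, batch_size=32)
```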


Apply Network and Test Results

In the last step we apply the network to the test data, convert the predicted probability into a class (p > 0.5 -> forged) and calculate the confusion matrix and the ROC curve with its AUC.
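Outside of KNIME, the same evaluation could be sketched with scikit-learn like this (X_test and y_test are the held-out 20%):

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

# Predicted probabilities for the test images; p > 0.5 means "forged".
probs = model.predict(X_test).ravel()
y_pred = (probs > 0.5).astype(int)

print(confusion_matrix(y_test, y_pred))   # rows: true class, columns: predicted class
print("AUC:", roc_auc_score(y_test, probs))
```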


Confusion Matrix:


ROC Curve (AUC):


The results look very impressive, actually a bit too good. We are probably overfitting, since the test data may contain signatures (genuine and forged) from the same reference authors as the training data (there are only 10 reference authors in the complete training set). The authors of the paper noted that the performance of their network is very good on signatures of persons whose signatures have been seen during training, but that on unseen authors it is only a little better than a naive baseline. In one of the next posts we will check the network performance on new, unseen signatures and try to train and test the model on the SigComp2009 data, which contains signatures from 100 authors; we will also look at the data augmentation in detail and maybe compare this network to some other network architectures.

So long…

 

 

 

 

Fooling Around with KNIME cont’d: Deep Learning

In my previous post I wrote about my first experiences with KNIME, and we implemented three classical supervised machine learning models to detect credit card fraud. In the meantime I found out that the newest version of KNIME (at this time 3.6) also supports the deep learning frameworks TensorFlow and Keras. So I thought let's revisit our deep learning model for fraud detection and try to implement it in KNIME using Keras without writing a single line of Python code.

Install the required packages

The requirement is that you have Python with TensorFlow and Keras installed on your machine (you can install them with pip, or with conda if you are using the Anaconda distribution). Then you need to install the Keras integration extensions in KNIME; you can follow the official tutorial at https://www.knime.com/deeplearning/keras.

Workflow in KNIME

The first part of the workflow is quite similar to the previous one.


We load the data, remove the time column, split the data into train, validation and test sets, and normalise the features.

The configuration of the Column Filter node is quite straightforward: we specify the columns we want to include and exclude (no big surprise here).
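For comparison, the same preparation could be sketched in a few lines of Python; the file name creditcard.csv (the usual name of the Kaggle credit card fraud dataset) and the split ratios are assumptions here:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("creditcard.csv")        # assumed file name
X = df.drop(columns=["Time", "Class"])    # remove the time column, keep the 29 features
y = df["Class"]                           # 1 = fraud, 0 = non-fraud

# Split into train, validation and test sets (ratios are illustrative).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=42)

# Normalise the features (fit on the training data only).
scaler = StandardScaler().fit(X_train)
X_train, X_val, X_test = (scaler.transform(X_train),
                          scaler.transform(X_val),
                          scaler.transform(X_test))
```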


Design of the deep network

For building our very simple three-layer network we need three new node types: the Keras Input Layer node, the Keras Dense Layer node and the Keras Dropout Layer node.


We start with the input layer, where we have to specify the dimensionality of our input (in our case 29 features); we can also specify the batch size here.


The next layer is a dense (fully connected) layer. Here we can specify the number of nodes and the activation function.


After the dense layer we apply dropout with a rate of 20%; the configuration here is also quite straightforward.


We then add another dense layer with 50 nodes, another dropout layer and the final layer with one node and a sigmoid activation function (binary classification: fraud or non-fraud).
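The whole network is built here with KNIME nodes only, but for reference the equivalent plain-Keras definition looks roughly like this; the size of the first dense layer was only visible in the screenshots, so the 100 units below are an assumption:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential([
    Dense(100, activation="relu", input_shape=(29,)),  # assumed size of the first dense layer
    Dropout(0.2),                                      # 20% dropout
    Dense(50, activation="relu"),
    Dropout(0.2),
    Dense(1, activation="sigmoid"),                    # fraud / non-fraud
])
```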

The last layer of the network, the training data and the validation set are the inputs to the Keras Network Learner node.

 


 

We set the input and the target variable, and choose the loss function, the optimisation method and the number of epochs.
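In code, this configuration corresponds roughly to the following; the optimiser, the number of epochs and the batch size are illustrative assumptions:

```python
model.compile(optimizer="adam",               # optimisation method (illustrative)
              loss="binary_crossentropy",     # loss for the binary fraud target
              metrics=["accuracy"])

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=20,                          # illustrative number of epochs
          batch_size=256)                     # illustrative batch size
```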


We can specify our own loss function if we want or need to.


We can also select an early stopping strategy.


With the settings above, training stops if the validation loss does not decrease by more than 0.001 for 5 consecutive epochs.
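In Keras code this early stopping strategy would correspond to something like the following sketch:

```python
from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_loss",   # watch the validation loss
                           min_delta=0.001,      # "no improvement" = less than 0.001 decrease
                           patience=5)           # stop after 5 epochs without improvement

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=100,                            # upper bound; early stopping may end sooner
          batch_size=256,
          callbacks=[early_stop])
```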

During training we can monitor the learning progress in real time.


The trained model and the test data are the inputs for the DL Network Executor node, which uses the trained network to classify the test set.


The results are fed into a ROC Curve node to assess the model quality.


And here is the complete workflow.


Conclusion

It was very easy and fast to implement our previous model in KNIME without writing a single line of code. Nevertheless, the user still needs to understand the concepts of deep learning in order to build the network and understand the node configurations. I really liked the real-time training monitor. In my view KNIME is a tool which can help to democratize data science within an organisation. Analysts who cannot code in Python or R get access to very good deep learning libraries for their data analytics without the burden of learning a new programming language; they can focus on understanding the underlying concepts of deep learning, understanding their data, choosing the right model for the data, and understanding the drawbacks and limitations of their approach, instead of spending hours learning to program in a new language. Of course you need to spend time learning KNIME, which for some people may be easier than learning to program.

I plan to spend some more time with KNIME in the future, and I want to find out how to reuse parts of a workflow in new workflows (like the data preparation, which is almost exactly the same as in the previous example) and how to move a model into production. I will report about it in some later posts.

So long…