A Self-Organizing Map (SOM) is a type of unsupervised neural network. SOMs are used
to reduce the dimensionality of a dataset and to visualize it.
In this project, I combined a SOM with an Artificial Neural Network (ANN) to build a fraud detection model.
We don't want our customers to be cheated, so we use supervised learning to protect them in the future. Fraud has to be stopped before it happens.
Self-Organizing Maps
The math behind the SOM algorithm is fairly simple, which is one reason there are many
variations of it. SOMs also have several advantages:
1. SOMs preserve the topology of the input set.
2. SOMs reveal correlations that are not easily identified.
3. SOMs group (cluster) data without supervision.
4. No target vector, so no backpropagation.
5. No lateral connections between output nodes.
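
To make the points above concrete, here is a minimal sketch of one SOM training loop in plain Python. It is not the code used in this project (that relied on MiniSom); the function names, the Gaussian neighborhood, and the fixed learning rate and sigma are assumptions chosen for illustration. Real implementations usually decay both values over time.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return (row, col) of the node whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_step(weights, sample, learning_rate, sigma):
    """Pull the winning node and its grid neighbors toward sample (in place)."""
    bmu_r, bmu_c = best_matching_unit(weights, sample)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            # Nodes close to the winner on the grid are updated more strongly.
            grid_dist_sq = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
            influence = math.exp(-grid_dist_sq / (2 * sigma ** 2))
            for i in range(len(w)):
                w[i] += learning_rate * influence * (sample[i] - w[i])


def train_som(data, rows, cols, iterations, learning_rate=0.5, sigma=1.0, seed=0):
    """Train a rows x cols map on data using randomly drawn samples."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for _ in range(iterations):
        train_step(weights, rng.choice(data), learning_rate, sigma)
    return weights
```

Note that the update only needs the input sample itself: there is no target vector and no error to backpropagate, and the output nodes influence each other only through their positions on the grid.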
Fraud Detection
The dataset in this case has no fraud label. It contains only customer features, and the last
column records whether the customer's application was approved.
SOMs Model
Feature scaling is very important in deep learning, and SOMs are no exception because they compare data points by distance. I normalized the features before training.
SOMs are not included in the scikit-learn library, so I used a package called MiniSom.
I only needed to define the map size (10 x 10), the input length (the number of features), sigma (the neighborhood radius),
and the learning rate.
After training the model, I visualized it with pylab. One main
advantage of SOMs is that they isolate outliers. In the fraud detection process,
the people who are likely to be cheated are the outliers. I only needed to identify the white
boxes in the plot and mark the customers mapped to them as vulnerable.
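
In this project the white boxes were picked by eye. As a sketch of how the same step could be done in code, the helper below flags every sample whose winning node has a high value in the distance map. It assumes the plot shows MiniSom's normalized distance map (for example from som.distance_map()), where values near 1 appear white, and that winners come from som.winner(x) for each sample. The function name and the 0.9 threshold are hypothetical.

```python
def flag_outliers(distance_map, winners, threshold=0.9):
    """Return indices of samples whose winning node exceeds the threshold.

    distance_map: 2D grid of values in [0, 1], one value per map node.
    winners: one (row, col) pair per sample, in the same order as the data.
    """
    return [i for i, (r, c) in enumerate(winners)
            if distance_map[r][c] > threshold]


# Tiny hand-made example: only node (1, 0) is above the threshold.
demo_map = [[0.1, 0.2],
            [0.95, 0.3]]
demo_winners = [(0, 0), (1, 0), (0, 1), (1, 0)]
print(flag_outliers(demo_map, demo_winners))  # [1, 3]
```

The returned indices can then be used to set the label column for those customers in the original dataset.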
Artificial Neural Network
After adding these vulnerable customers to the original dataset as the label, I could
build a supervised learning model with an Artificial Neural Network.
Data preprocessing is the key. Data balancing and feature scaling are the most important steps.
In the end, the model reached 97% accuracy.
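
The balancing and scaling steps mentioned above can be sketched in plain Python as follows. The write-up does not say which balancing method was used, so random oversampling of the minority class is an assumption here, as are the function names and the choice of standardization for scaling.

```python
import random
import statistics


def oversample_minority(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    if len(by_class) != 2:
        raise ValueError("expected exactly two classes")
    minority, majority = sorted(by_class, key=lambda k: len(by_class[k]))
    extra = len(by_class[majority]) - len(by_class[minority])
    new_rows = list(rows) + rng.choices(by_class[minority], k=extra)
    new_labels = list(labels) + [minority] * extra
    return new_rows, new_labels


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
labels = [0, 0, 0, 1]  # 1 marks a vulnerable customer
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))  # 3 3
features = standardize(balanced_rows)
```

In a real pipeline, the scaling statistics would normally be computed on the training split only and then applied to the test split.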
Software used: Python, PyCharm
Packages used: TensorFlow, Keras, scikit-learn, MiniSom, pylab