Fraud-Detection

Fraud Detection in Financial Transactions

🚀 Project Overview

This project builds a machine learning system that detects fraudulent transactions for a financial company. The goal is to flag fraud proactively and to turn the model's findings into insights for prevention strategies.

Key highlights:


📊 Dataset Description


🔎 Exploratory Data Analysis (EDA)

Class Imbalance

The dataset is highly imbalanced: only 0.13% of transactions are fraudulent, so the imbalance must be handled explicitly during modeling.
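
To see why a 0.13% fraud rate needs special handling, consider a trivial classifier that labels every transaction as legitimate. The sketch below uses a hypothetical sample of 100,000 transactions (the sample size is an assumption, only the 0.13% rate comes from the analysis above) and shows that accuracy looks excellent while no fraud is caught.

```python
# Hypothetical sample size; only the 0.13% fraud rate comes from the EDA above.
n_total = 100_000
n_fraud = round(n_total * 0.0013)   # fraudulent transactions in the sample
n_legit = n_total - n_fraud         # legitimate transactions in the sample

# A trivial "model" that predicts "legitimate" for every transaction.
true_positives = 0          # no fraud is ever flagged
false_negatives = n_fraud   # every fraudulent transaction is missed
true_negatives = n_legit    # every legitimate transaction is labelled correctly

accuracy = (true_positives + true_negatives) / n_total
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.4f}")
print(f"recall   = {recall:.4f}")
```

Accuracy alone is therefore a poor yardstick here; recall and precision on the fraud class say much more about a model's usefulness.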

Transaction Type Distribution

CASH_OUT and TRANSFER transactions are more likely to involve fraud than other transaction types.

Amount Distribution

Transaction Amount Distribution

Most transactions are low-value, but fraudulent transactions tend to involve higher amounts.

Sender Balance Distribution

Old Balance Origin Distribution

Fraud tends to occur more often in sender accounts with higher balances.

Feature Correlation

Correlation Heatmap

Highly correlated features (oldbalanceOrg vs newbalanceOrig, oldbalanceDest vs newbalanceDest) were handled to avoid multicollinearity.
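
The text above does not say how the correlated pairs were handled. One common option, sketched here as an assumption rather than the project's actual method, is to compute pairwise Pearson correlations and keep only one feature from each highly correlated pair. The values and the 0.9 threshold are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| <= threshold against every column already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Made-up values for illustration only; not taken from the project's dataset.
columns = {
    "oldbalanceOrg":  [1000.0, 5000.0, 200.0, 80000.0, 1500.0],
    "newbalanceOrig": [900.0, 4200.0, 150.0, 79000.0, 1400.0],
    "amount":         [5000.0, 800.0, 900.0, 1000.0, 100.0],
}

kept = drop_correlated(columns)
print(kept)
```

In this toy data newbalanceOrig is dropped because it moves almost in lockstep with oldbalanceOrg. Note that tree ensembles such as Random Forest tolerate correlated inputs better than Logistic Regression, so the step matters most for the baseline model.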


🏗️ Model Building

We trained two models:

  1. Logistic Regression (baseline)
  2. Random Forest (final model)

Class imbalance was handled using class weights.
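
As a rough illustration of what class weighting does, the sketch below computes "balanced" weights, where each class is weighted by n_samples / (n_classes * class_count). This is the same heuristic scikit-learn applies for class_weight="balanced"; whether the project used that setting or custom weights is not stated here, and the label counts are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical labels: 130 fraud cases (1) in 100,000 rows, i.e. a 0.13% fraud rate.
labels = [1] * 130 + [0] * 99_870
weights = balanced_class_weights(labels)

print(f"fraud weight      = {weights[1]:.2f}")
print(f"legitimate weight = {weights[0]:.4f}")
```

With these weights, misclassifying one fraudulent transaction costs the model several hundred times more than misclassifying one legitimate transaction, which counteracts the imbalance during training.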


📈 Model Evaluation

Logistic Regression Confusion Matrix

Random Forest Confusion Matrix
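
For reading either confusion matrix, the fraud-class metrics can be derived from its four cells. The sketch below shows the arithmetic; the counts are placeholders chosen for illustration and are not the project's results.

```python
def classification_metrics(tn, fp, fn, tp):
    """Precision, recall and F1 for the positive (fraud) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Placeholder counts for illustration only; not taken from the project's matrices.
metrics = classification_metrics(tn=99_770, fp=100, fn=30, tp=100)

for name, value in metrics.items():
    print(f"{name:9s} = {value:.4f}")
```

False negatives (missed fraud) lower recall, while false positives (legitimate transactions flagged) lower precision, so the two models are best compared on both.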


🔑 Feature Importance

Feature Importance (Random Forest)

Top predictors of fraud:

  1. oldbalanceOrg
  2. amount
  3. type_CASH_OUT
  4. type_TRANSFER
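
The names type_CASH_OUT and type_TRANSFER suggest that the categorical type column was one-hot encoded before training. A minimal sketch of that encoding follows, assuming a row-per-dictionary layout; the helper one_hot, the sample rows, and the PAYMENT category are hypothetical and used only for illustration.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded

# Made-up rows; PAYMENT is an assumed extra category.
rows = [
    {"amount": 9000.0, "type": "CASH_OUT"},
    {"amount": 120.0, "type": "PAYMENT"},
    {"amount": 50000.0, "type": "TRANSFER"},
]

encoded = one_hot(rows, "type")
for row in encoded:
    print(row)
```

Each indicator column then receives its own importance score, which is why individual transaction types appear in the ranking above.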


📊 Model Comparison

Model Comparison Table


🛡️ Fraud Prevention Insights

Based on model analysis:

  1. Monitor high-value CASH_OUT and TRANSFER transactions.
  2. Flag accounts with large sender balances performing suspicious transfers.
  3. Implement real-time transaction monitoring.
  4. Set thresholds using model scores to minimize false negatives.
  5. Use multi-layer fraud detection: combine rule-based checks with ML model predictions.
  6. Track unusual patterns in new or inactive accounts.
  7. Continuously monitor model performance and drift post-deployment.
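
Points 4 and 5 above can be sketched together: pick the highest score threshold that still reaches a target recall, then flag a transaction if either a simple rule or the model score fires. The function names, the sample scores, and the amount limit below are assumptions for illustration, not values from the project.

```python
def threshold_for_recall(scores, labels, target_recall):
    """Return the highest threshold whose recall is at least target_recall.

    A transaction counts as flagged when its score >= threshold.
    """
    n_fraud = sum(labels)
    if n_fraud == 0:
        raise ValueError("need at least one fraud label")
    # Try candidate thresholds from the highest observed score downwards.
    for threshold in sorted(set(scores), reverse=True):
        caught = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        if caught / n_fraud >= target_recall:
            return threshold
    raise ValueError("target_recall must be at most 1.0")

def flag_transaction(score, tx_type, amount, threshold, amount_limit):
    """Layered check: a rule on type and amount, or the model score."""
    rule_hit = tx_type in {"CASH_OUT", "TRANSFER"} and amount >= amount_limit
    return rule_hit or score >= threshold

# Made-up validation scores and labels (1 = fraud).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 0, 1, 0, 1, 0]

loose = threshold_for_recall(scores, labels, target_recall=0.6)
strict = threshold_for_recall(scores, labels, target_recall=1.0)
print(loose, strict)

# Hypothetical amount limit of 200,000 for the rule layer.
print(flag_transaction(0.2, "TRANSFER", 250_000, threshold=loose, amount_limit=200_000))
print(flag_transaction(0.2, "PAYMENT", 250_000, threshold=loose, amount_limit=200_000))
```

Lowering the threshold to catch every fraud case also flags more legitimate transactions, so the target recall is a business decision about how many false alarms are acceptable.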


⚡ Key Takeaways

🛠️ Getting Started / Requirements

To run this project locally, you need Python (3.9 or later is recommended) and the libraries listed in requirements.txt.

1. Clone the repository

  git clone https://github.com/<your-username>/Fraud-Detection-ML.git
  cd Fraud-Detection-ML

2. Install required packages

  pip install -r requirements.txt

3. Launch Jupyter Notebook

  jupyter notebook Fraud_Detection_Transaction_Analysis.ipynb