Real-Time Credit Card Fraud Detection Using Spark

Real-time credit card fraud detection using Spark Streaming, Spark ML, Kafka, Cassandra and Airflow

Last updated 2022-01-10 | 4.2

What you'll learn

- Students will be able to build an end-to-end big data project using Spark
- Kafka
- Cassandra
- Scala and Java

Requirements

* Spark Streaming
* Spark ML
* Kafka
* Cassandra
* A programming IDE such as IntelliJ or Eclipse
* Java
* Scala

Description

Real-time credit card fraud detection is implemented using Spark, Kafka and Cassandra.

Spark ML Pipeline stages such as String Indexer, One Hot Encoder and Vector Assembler are used for pre-processing.
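To make the three pre-processing stages concrete, here is a minimal plain-Java sketch of the idea behind them: index a categorical column, one-hot encode the index, and concatenate the result with a numeric column. It is not the Spark ML API and not the course's code. The class name, the category values ("grocery", "travel", "fuel") and the amount 42.5 are all made up for illustration. Unlike Spark's OneHotEncoder, which drops the last category by default, this sketch keeps every category.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of index -> one-hot -> assemble; hypothetical, not the Spark ML API. */
class PreprocessSketch {

    /** Assigns each distinct category an index, most frequent first (ties broken alphabetically). */
    static Map<String, Integer> fitIndexer(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        List<String> labels = new ArrayList<>(counts.keySet());
        labels.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            index.put(labels.get(i), i);
        }
        return index;
    }

    /** Turns a category index into a 0/1 vector of the given size with a single 1. */
    static double[] oneHot(int index, int size) {
        double[] v = new double[size];
        v[index] = 1.0;
        return v;
    }

    /** Concatenates several feature arrays into one feature vector. */
    static double[] assemble(double[]... parts) {
        int n = 0;
        for (double[] p : parts) {
            n += p.length;
        }
        double[] out = new double[n];
        int pos = 0;
        for (double[] p : parts) {
            System.arraycopy(p, 0, out, pos, p.length);
            pos += p.length;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical merchant categories for four transactions.
        List<String> categories = Arrays.asList("grocery", "travel", "grocery", "fuel");
        Map<String, Integer> indexer = fitIndexer(categories);
        double[] encoded = oneHot(indexer.get("travel"), indexer.size());
        // Hypothetical numeric feature (transaction amount) appended to the encoded category.
        double[] features = assemble(encoded, new double[] {42.5});
        System.out.println(Arrays.toString(features));
    }
}
```

In a Spark ML Pipeline the same three steps are declared as stages and fitted on a DataFrame rather than called by hand.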

The machine learning model is created using the Random Forest algorithm.

Data balancing is done using the K-means algorithm.
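This page does not say how K-means is applied to balance the data. One common approach, assumed here, is to cluster the majority class (non-fraud transactions) into as many clusters as there are fraud transactions and keep only the cluster centers as the non-fraud sample. The plain-Java sketch below illustrates that assumption with a small deterministic K-means. The class name, the data points and the iteration count are hypothetical.

```java
import java.util.Arrays;

/** Sketch: undersample the majority class by replacing it with K-means centroids (assumed approach). */
class KMeansBalanceSketch {

    /** Runs Lloyd's algorithm; centers start at the first k points, so k must not exceed points.length. */
    static double[][] centroids(double[][] points, int k, int iterations) {
        int dim = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: add each point to the running sum of its nearest center.
            for (double[] p : points) {
                int nearest = nearest(centers, p);
                counts[nearest]++;
                for (int d = 0; d < dim; d++) {
                    sums[nearest][d] += p[d];
                }
            }
            // Update step: move each center to the mean of its assigned points.
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centers;
    }

    /** Returns the index of the center closest to p by squared Euclidean distance. */
    static int nearest(double[][] centers, double[] p) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centers[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical feature vectors for six non-fraud transactions.
        double[][] nonFraud = { {1, 1}, {9, 9}, {1, 2}, {2, 1}, {9, 8}, {8, 9} };
        int fraudCount = 2; // hypothetical number of fraud transactions
        double[][] balanced = centroids(nonFraud, fraudCount, 10);
        System.out.println(Arrays.deepToString(balanced));
    }
}
```

The result has exactly as many non-fraud rows as there are fraud rows, so a classifier trained on the combined set sees both classes equally often.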

The Spark Streaming job is integrated with Kafka and Cassandra.

Exactly-once semantics are achieved using custom offset management in Spark Streaming.
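The general idea behind custom offset management can be sketched without Spark or Kafka: save each result together with the offset of the record that produced it, and on restart skip anything below the saved offset. The plain-Java sketch below uses in-memory maps in place of a database, and a synchronized method in place of whatever atomic write the real store provides. The class, method names and transaction ids are hypothetical, and the course's actual implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of exactly-once output via stored offsets; in-memory maps stand in for a database. */
class OffsetStoreSketch {
    private final Map<Integer, Long> nextOffset = new HashMap<>();   // partition -> next offset to process
    private final Map<String, Double> predictions = new HashMap<>(); // transaction id -> prediction
    private int writes = 0;

    /** Applies one record unless its offset was already processed, then advances the stored offset. */
    synchronized boolean process(int partition, long offset, String transactionId, double prediction) {
        long expected = nextOffset.getOrDefault(partition, 0L);
        if (offset < expected) {
            return false; // record replayed after a restart: skip it
        }
        // Upsert keyed by transaction id, so a retried write overwrites instead of duplicating.
        predictions.put(transactionId, prediction);
        writes++;
        // The offset is saved in the same step as the result.
        nextOffset.put(partition, offset + 1);
        return true;
    }

    /** Offset the consumer should resume from after a restart. */
    synchronized long resumeFrom(int partition) {
        return nextOffset.getOrDefault(partition, 0L);
    }

    synchronized int writeCount() {
        return writes;
    }

    synchronized int storedCount() {
        return predictions.size();
    }

    public static void main(String[] args) {
        OffsetStoreSketch store = new OffsetStoreSketch();
        store.process(0, 0L, "tx-1", 0.0);
        store.process(0, 1L, "tx-2", 1.0);
        // Simulated restart: the same two records are delivered again and ignored.
        store.process(0, 0L, "tx-1", 0.0);
        store.process(0, 1L, "tx-2", 1.0);
        System.out.println("writes=" + store.writeCount() + ", resumeFrom=" + store.resumeFrom(0));
    }
}
```

The design point is that the result and the offset must be stored so that neither can be saved without the other. If they are written separately, a crash between the two writes causes either a duplicate or a lost record.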

The Airflow automation framework is used to automate Spark jobs on a Spark Standalone cluster.

Who this course is for:

  • Data Scientists, Data Engineers, Software Engineers, Managers, Architects, Computer Science Engineering Students

Course content

3 sections • 30 lectures

Course Objective Preview 01:56

This video gives the complete objective of this course.

About Me Preview 01:01

About Me

Introduction Agenda Preview 01:17

The agenda of this section is described in this video.


Download the following documents

1. Creditcard Fraud detection.pdf
      The complete project is explained and demonstrated in this document.

2. Demonstration.pdf
      This document contains only a demonstration and instructions to run the project.

Prerequisites Preview 01:50

Technologies you need to know before taking this course

Components Preview 02:00

All the technology components that are required to implement this project

Introduction to Spark Preview 03:22

Brief introduction to Apache Spark

Introduction to Kafka Preview 04:02

Introduction to Apache Kafka

Introduction to Cassandra Preview 03:20

Brief Introduction to Apache Cassandra

Real-time Fraud Detection Architecture Preview 03:08

This video explains the architecture of the real-time fraud detection project.


Agenda Preview 01:19

Agenda of this section.

Install VirtualBox and Image Preview 07:39

Learn how to install VirtualBox and import the Ubuntu image.

Code Setup Preview 08:14

Download the code from GitHub and import it into IntelliJ.

Start Servers Preview 08:24

Start the ZooKeeper server, Kafka server, dashboard web server and Cassandra server. Create the database and tables in Cassandra.

Run Spark Jobs from IntelliJ Preview 12:47

Learn how to run Spark jobs from IntelliJ.

Clean Up Preview 05:41

Learn how to stop the servers, stop the Spark jobs and clean up the project.

Airflow Configuration Preview 09:20

Learn how to configure Apache Airflow with a MySQL database. Start and stop the Airflow web server and the Airflow scheduler.

Build Project and Start Servers Preview 09:52

Learn how to build the Maven project and start the ZooKeeper, Kafka, Cassandra and Spark servers.

Airflow Automation Preview 18:07

Automate the real-time fraud detection Spark jobs on a Spark Standalone cluster using Apache Airflow.

Agenda Preview 02:29

Agenda of this section.

Project Structure Preview 13:12

Structure of the fraud detection project in IntelliJ

Spark Initial Import Job Preview 04:39

Implementation of the Spark job that reads credit card data from the file system and saves it to the Cassandra database

Spark ML Job Preview 09:47

Implementation of the Spark machine learning job that trains a model on credit card transaction data

Spark Streaming Initialization and Consumption Preview 08:13

Implementation of the Spark Streaming job: initializing the job and consuming credit card transactions from Kafka.

Spark Streaming Processing and Prediction Preview 03:46

Processing credit card transactions in real time. Predicting in real time whether a transaction is fraudulent.

Spark Streaming Exactly Once Semantics Preview 06:34

Saving predictions to Cassandra from Spark. Achieving exactly-once semantics in Spark Streaming.

Spark Streaming Graceful Shutdown Preview 03:08

Kafka Producer Preview 05:01

Fraud Alert Dashboard Preview 03:10

Airflow Automation Preview 05:45

Course Conclusion Preview 01:17

Course Conclusion