Data Science Project Planning

Fundamental Concepts for Beginners

Last updated 2022-01-10 | 4.7

What you'll learn

Fundamental concepts underlying core planning activities that are critical for a data science project's success.
PLEASE NOTE: This course will not cover technical topics like programming, statistics and algorithms.

Requirements

* Willingness to look beyond the technical aspects and learn about the crucial planning activities involved in a data science project.
* Familiarity with high school level mathematics

Description

The success of any project depends heavily on how well it has been planned. Data science projects are no exception.

A large number of data science projects in industrial settings fail to meet expectations because of a lack of proper planning at their inception stage.

This course will provide an overview of the core planning activities that are critical to the success of any data science project.

We will discuss the concepts underlying Business Problem Definition, Data Science Problem Definition, Situation Assessment, and Scheduling Tasks and Deliveries.

The concepts learned will help the students in:

A) Framing the business problem 

B) Getting buy-in from the stakeholders 

C) Identifying an appropriate data science solution that can solve the business problem

D) Defining success criteria and metrics to evaluate the key project deliverables, namely models, the data flow pipeline and documentation

E) Assessing the prevailing situation impacting the project, for example the availability of data and resources, risks, estimated costs and perceived benefits

F) Preparing delivery schedules that provide customers with valuable, actionable insights early and in continuous increments

G) Understanding the desired team attributes and communication needs



Who this course is for:

  • Managers or Leads who are going to plan their first data science project in a real-life business environment
  • Members of a data science team who want to build awareness of the crucial planning activities required to make their project successful
  • Senior Executives requiring a bird's-eye view of the activities involved in planning a data science project

Course content

7 sections • 66 lectures

Course Preview 03:03

Welcome 00:54

Context 03:16

Sets the context of this course by briefly explaining what data science is and why we need it.

Data Science Project - Challenges 01:59

Main challenges that lead to data science project failures

Data Science Project Planning - An Overview 02:37

A brief introduction to the core activities of data science project planning and the essential components of a project plan.

Introduction

Introduction 00:42

Business Problem Definition - An Overview 03:47

Importance of business problem definition; consequences of poor business problem definition; advantages of well-defined problems; tasks involved in business problem definition.

Understanding the Business Problem 05:22

Identifying types of business problems; examples of business problems

Stakeholder Analysis - I 05:59

This lecture will describe how to identify project stakeholders and gather their inputs.

Stakeholder Analysis - II 08:42

This lecture will describe how to document the inputs obtained from stakeholders and assess each stakeholder's influence and interest in the project.

Review of Previous Work 02:20

Provides pointers to sources of information about previous work done to solve the business problem. Also explains why one should review the previous work.

Framing the Business Problem 07:50

This lecture is about creating a Business Problem Statement and getting stakeholder buy-in for it.

Review Questions

This assignment will check your understanding of the tasks that need to be done to define a business problem.

Introduction 01:28

Data Science Project Lifecycle - An Overview of CRISP-DM 07:50

Introduction to the various phases of CRISP-DM, a popular data science project life cycle based on the scientific method for solving problems

Data Science Problem Formulation – An Overview 02:32

A brief mention of the steps involved in formulating a data science problem

Data Science Problem Type - Classification 08:05

Conceptual understanding of the classification problem

Data Science Problem Type - Regression 03:51

Conceptual understanding of the regression problem

Data Science Problem Type - Clustering 04:28

Conceptual understanding of the clustering problem

Data Science Problem Type - Anomaly Detection 02:32

Conceptual understanding of the anomaly detection problem

Data Science Problem Type - Association 02:35

Conceptual understanding of the association problem

Data Science Problem Type - Recommendation 03:10

Conceptual understanding of the recommendation problem

Summary of Data Science Problem Types 02:04

All six data science problem types discussed in the previous lectures, summarized in tabular form with additional examples, plus a single-page visual recap.

Setting Project Goals 05:03

A brief discussion of the three areas that data science project goals should focus on: model development, establishing the data flow pipeline, and documentation

Specifying Project Success Criteria – An Overview 06:07

This overview lecture will briefly mention:

a) the metrics used to evaluate model quality, deployment and monitoring

b) the metrics used to evaluate the efficiency and effectiveness of the data flow pipeline

c) checkpoints to evaluate documentation quality

Review Questions

This assignment will mainly check your understanding of CRISP-DM, data science problem types and goal-setting considerations for data science projects.

Evaluation Metrics for Classification Models 08:37

Definitions of Accuracy, Precision, Recall and F1 Score, with examples
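For readers who want to see the arithmetic, here is a minimal sketch of these four metrics for a binary problem. The function name and the confusion-matrix counts are made up for illustration and are not taken from the course.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. The counts used below are illustrative only.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 100 cases, 60 of them actually positive.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision asks how many predicted positives were right, recall asks how many actual positives were found, and F1 is their harmonic mean.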

Evaluation Metrics for Anomaly Detection Models 10:28

Definitions of metrics: Precision, Recall, F1 Score; Accuracy Paradox

Evaluation Metrics for Regression Models 04:03

Metrics discussed - Root Mean Square Error (RMSE), Coefficient of Determination (R-Squared)
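As a rough illustration of these two metrics, the sketch below computes RMSE and R-squared using only the standard library. The function names and the sample values are assumptions made for this example.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared residual."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Illustrative values only.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 9]
print(f"RMSE: {rmse(actual, predicted):.4f}")
print(f"R-squared: {r_squared(actual, predicted):.4f}")
```

RMSE is expressed in the units of the target variable, while R-squared is unitless and describes the share of variance the model explains.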

Evaluation Metrics for Clustering Models - I: Internal Evaluation 10:28

Dunn Index & Silhouette Coefficient

Evaluation Metrics for Clustering Models - II: External Evaluation 03:38

Rand Index & Jaccard Index
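One common way to compute both external indices is to count pairs of points on which the clustering and the reference labels agree. The sketch below assumes that pair-counting formulation; the helper names and labels are illustrative.

```python
from itertools import combinations


def pair_counts(labels_true, labels_pred):
    """Count point pairs by agreement between reference and predicted labels.

    Returns (ss, sd, ds, dd):
      ss = same group in both, sd = same in reference only,
      ds = same in prediction only, dd = different in both.
    """
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            ss += 1
        elif same_true:
            sd += 1
        elif same_pred:
            ds += 1
        else:
            dd += 1
    return ss, sd, ds, dd


def rand_index(labels_true, labels_pred):
    """Fraction of pairs on which the two labelings agree."""
    ss, sd, ds, dd = pair_counts(labels_true, labels_pred)
    return (ss + dd) / (ss + sd + ds + dd)


def jaccard_index(labels_true, labels_pred):
    """Like the Rand index, but ignores pairs separated in both labelings."""
    ss, sd, ds, _ = pair_counts(labels_true, labels_pred)
    return ss / (ss + sd + ds)


# Illustrative labels for four points.
reference = [0, 0, 1, 1]
clustering = [0, 0, 1, 0]
print(rand_index(reference, clustering))     # 0.5
print(jaccard_index(reference, clustering))  # 0.25
```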

Evaluation Metrics for Association Models 04:30

Support, Confidence, Lift
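A small sketch of how these three measures relate for a single rule "antecedent implies consequent". The transactions, item names and function names are invented for illustration.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the full rule divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to how common the consequent is on its own."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
rule_lift = lift(baskets, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3), round(rule_lift, 3))
```

A lift below 1, as in this toy data, suggests the two items appear together less often than they would if they were independent.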

Evaluation Metrics for Recommendation Models - I 09:24

Metrics to measure Prediction Error - Mean Absolute Error (MAE), Root Mean Square Error (RMSE)

Metric to measure Relevance - Mean Average Precision (MAP)
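The sketch below illustrates MAE for predicted ratings and one common variant of MAP, in which each user's average precision is divided by that user's number of relevant items. Other normalizations exist, and the names and data here are assumptions made for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def average_precision(recommended, relevant):
    """Average of precision@k over the ranks k where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(recommendations, relevant_items):
    """Mean of per-user average precision."""
    scores = [average_precision(recs, rel)
              for recs, rel in zip(recommendations, relevant_items)]
    return sum(scores) / len(scores)


# Illustrative data for two hypothetical users.
mae = mean_absolute_error([4, 3, 5], [3.5, 3, 4])
map_score = mean_average_precision(
    [["a", "b", "c", "d"], ["x", "y", "z"]],
    [{"a", "c"}, {"y"}],
)
print(f"MAE: {mae:.3f}, MAP: {map_score:.3f}")
```

MAE and RMSE judge how close predicted ratings are, whereas MAP rewards placing relevant items near the top of the ranked list.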

Evaluation Metrics for Recommendation - II 07:28

Metrics to measure Diversity, Coverage and Serendipity

Review Questions

The questions in this assignment will test the understanding of metrics that evaluate different data science models

Model Deployment Criteria and Metrics 03:44

Multivariate Testing; Effect Size; Statistical Power

Model Monitoring Metrics 06:36

Population Stability Index (PSI); Resource Consumption; Cost-Effectiveness
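As an illustration of the first of these metrics, PSI is commonly computed by comparing the share of records per bin at training time with the share observed in production. The sketch assumes pre-binned proportions with no empty bins; the function name and numbers are hypothetical.

```python
import math


def population_stability_index(expected, actual):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected).

    `expected` and `actual` are per-bin proportions (each list sums to 1).
    This sketch assumes every bin is non-empty in both distributions.
    """
    if len(expected) != len(actual):
        raise ValueError("Both distributions need the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        if e <= 0 or a <= 0:
            raise ValueError("Empty bins are not handled in this sketch")
        psi += (a - e) * math.log(a / e)
    return psi


# Illustrative proportions: training-time baseline vs. recent production data.
baseline = [0.25, 0.25, 0.25, 0.25]
recent = [0.20, 0.30, 0.25, 0.25]
psi_value = population_stability_index(baseline, recent)
print(f"PSI: {psi_value:.4f}")
```

Identical distributions give a PSI of 0, and larger values indicate a bigger shift in the population being scored. Any alert threshold is a project-specific choice.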

Data Flow Pipeline Metrics 07:11

Availability; Latency; Throughput; Integrity; Scalability; Security; Privacy

Documentation Criteria 13:51

Project Documents; Main Contents; Structure; Writing Style; Visuals

Review Questions

The questions in this assignment will test the understanding of metrics and evaluation criteria for deployment and monitoring of models, data flow pipeline and documentation

Introduction 00:40

Situation Assessment - An Overview 03:18

Enumerates the factors that should be assessed while setting goals and planning a data science project

Team Composition 05:57

Skills needed, Team Roles & Team Attributes

Resource Assessment 08:25

Assessment of available Data, Knowledge and Computing resources

Project Requirements, Assumptions & Constraints 05:50

Different kinds of requirements the project may have to satisfy; assumptions underlying the project plan; constraints under which the project may have to operate.

Review Questions

The questions in this assignment will test the understanding of the following factors that impact a data science project - Team composition, Resources, Requirements, Assumptions & Constraints

Risk Assessment 12:15

General project risks; data related risks; risk assessment criteria; mitigation and contingency planning

Terminology 02:17

Need for a glossary of terms

Costs and Benefits 03:27

Factors to consider while conducting cost/benefit analysis

Review Questions

The questions in this assignment will test the concepts underlying risk assessment, the glossary of terms and cost-benefit analysis.

Introduction 00:51

Scheduling - I 07:51

Overview of key deliverables, activities and team roles for each phase in the CRISP-DM lifecycle

Scheduling - II 14:10

Discusses the attributes of an effective schedule

Review Questions

The questions in this assignment will test the understanding of scheduling activities and deliverables of various data science project lifecycle phases and the attributes of an effective schedule

Introduction 00:37

Emerging Methods for Executing Data Science Projects 04:10

A brief overview of how emerging methods extend CRISP-DM through an Agile approach

Microsoft Team Data Science Process (TDSP) 08:38

Overview of the key components of the Team Data Science Process

Agile Data Science 2.0 12:05

Overview of the Agile Data Science Manifesto and the main topics discussed in the book Agile Data Science 2.0 by Russell Jurney

Review Questions

The questions in this assignment will test the conceptual understanding of the emerging methods for data science project execution - Team Data Science Process and Agile Data Science 2.0

Introduction 00:26

Recap of Key Points 07:52

A quick summary of the main contents of this course.

Project Plan Review - Checkpoints 03:02

List of questions, from both the business and the data science problem perspectives, that need to be satisfactorily answered by the project plan before proceeding further

Preparing a Project Plan

This assignment is intended to provide some practice in preparing a project plan based on all the key concepts learnt in this course.

Closing Remarks 03:46

Important points to bear in mind while planning a data science project.

Congratulations and Thanks 00:36