Data Science Course

The EduPristine Data Science course is designed to help meet the growing demand for data scientists, who are skilled in a unique blend of science, art and business. Students will learn to combine tools and techniques from statistics, computer science, data visualization and the social sciences to solve problems using data.

Globally and in the US, companies need 190,000 data scientists.

McKinsey Global Report

The data science course is designed and delivered by experienced faculty and data science professionals who teach live classes at the EduPristine campus. EduPristine is known for classroom training and project-based learning, and this data science training is no exception. Unlike a typical classroom learning experience, these classes are delivered using standard practices. EduPristine blends live, face-to-face classes with top-notch platform facilities.

The registration fee for the Tableau Desktop certification is not included in this course.

About the Data Science Program - Many corporations have dramatically increased investments in their "digital enterprises" in the past few years. It has been estimated that by 2020, IT departments will be monitoring 50 times more data than they are today. This tidal wave of data is driving unprecedented demand for those with the skills required to manage these very large data sets and turn them into a competitive advantage.

These professionals are skilled in automating the collection and analysis of data and in using exploratory techniques to discover previously hidden insights that can profoundly impact the success of any business.

  • Extensive Classroom Training

    Get trained by topic experts with interactive learning in small batches

  • Online Live Instructor-Based Training

    Revisit concepts through live online sessions.

  • Assignments & cases

    Work on real time cases from different domains.

  • Complimentary Course

    1. Java Essentials for Hadoop, Python and UNIX session
    2. Basic Stats video recordings

  • Lab Practical

    200 Hrs - Virtual Lab practice (SAS Language - valid for 1 year)
    3 Months - Cloud lab access to work on Hadoop Platform

  • Online Materials

    This course serves as an introduction to the interdisciplinary and emerging field of data science. Students will learn to combine tools and techniques from statistics, computer science, data visualization and the social sciences to solve problems using data.

  • Home Assignment & Online Live Discussion Module

    Work on home assignments and discuss them with faculty every Friday in live online mode.

  • 24x7 Online Access

    Access to Course Material (Unlocked Excel Models, Presentations, etc.)

  • Doubt Solving by Experts

    Write to us and get your doubts solved by our experts within 2 business days. You can also initiate a discussion by posting on the active forums.

  • Online Content

    Download the study notes to supplement the subject-wise video tutorials and webinar recordings.

  • Certificate

    A reference to get ahead in your career. At the end of the course, you will receive a Certificate of Completion.
    1. Business Analytics Module
    2. Big data & Hadoop Module
    3. Data Science Module

  • Real-world Case Studies

    Get the best training in analytics by understanding real world problems and scenarios

  • Unlimited Download Access

    Download all the material anytime during your 1-year subscription and use it for future reference.

Course Structure

Readings | Practical Implementation
  1. Introduction to Hadoop/Spark
  2. A good data scientist's toolkit
  3. How modern Big Data technologies and tools address the following problems:
    • Volume is large - Batch Analytics
    • Velocity is high - Real-Time Analytics
    • Variety in data - Unstructured or semi-structured data
    • Any non-functional parameters such as cost, reliability and fault tolerance
Getting started with the fundamentals of Hadoop/Spark and setting a base to align them with batch and real-time analytics
Readings | Practical Implementation
  1. Getting started with the fundamentals of programming
  2. Python for data processing
  3. Unix for CLI commands - Getting familiar with Unix and the CLI is the first priority
  4. MapReduce concept and understanding
  5. SQL for Hive
Getting started with the fundamentals of programming: Python for data processing and Unix for CLI commands
Readings | Practical Implementation
  1. Cluster Specification & Hadoop Configuration
  2. Basic Linux and HDFS commands
  3. Command Line Interface
  4. Hadoop File Systems
  5. Data Flow
  6. Become familiar with cloud environment
  7. Set-up development environment
Introduction to big data storage and structured data ingestion: touching base on parallel programming on scalable machines
Readings | Practical Implementation
  1. Parallel programming on scalable machines: MapReduce
  2. Mastering Key-Value Pairs: Case Study
  3. Distributed Computing using MapReduce
  4. The Execution Framework, Concept of Partitioners
Understanding MapReduce basics and MapReduce types and formats
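The key-value flow behind the MapReduce topics above can be illustrated without a cluster. The following is a minimal single-process sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big insight", "data drives insight"]
    print(word_count(sample))
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data across the network; here everything runs in one process so the key-value idea stays visible.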
Readings | Practical Implementation
  1. Importing Large Objects, Performing Exports, Exports - A Deeper Look
Introduction to Database Imports, Working with Imported Data
Readings | Practical Implementation
  1. Data warehousing, management and querying on Hadoop: Hive
  2. Web Interface for analyzing data: Hadoop User Experience (HUE)
  3. Querying Data
  4. User Defined Functions
  5. Custom Map/Reduce in Hive
Getting started with data warehousing, management and querying on Hadoop: Hive and the web interface for analyzing data, Hadoop User Experience (HUE)
Readings | Practical Implementation
  1. Data Flow ETL Scripting Language: Pig
  2. Installing and Running Pig, Grunt
  3. Pig's Data Model, Pig Latin
  4. Developing & Testing Pig Latin Scripts
  5. Writing Evaluation
Building the fundamentals for data warehousing, management and querying on Hadoop: Hive and the web interface for analyzing data, Hadoop User Experience (HUE)
Readings | Practical Implementation
  1. Recap of Hadoop
  2. Opportunity for in-memory computing
  3. Spark Ecosystem
  4. Time comparisons with MapReduce
  5. Spark Architecture
  6. Spark Context
  7. Resilient Distributed Dataset
How to run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk, using Spark
Readings | Practical Implementation
  1. Lightning-Fast In-Memory Cluster Computing: Spark
  2. Batch Processing Historical Data: Log Analysis; Ecommerce Industry
  3. Interactive and Batch Mode production
  4. Transformations
  5. Spark Program Life Cycle
  6. Closures, Accumulators and Broadcast variables
  7. Project: Log Analysis (Batch processing with Spark)
Understanding log analysis with Spark and Python through a business case study, to gain hands-on experience
Readings | Practical Implementation
  1. Workflow Management Tool: Oozie
  2. Oozie Workflow, Actions and Control flow
  3. Sqoop Action
  4. Hive Action
  5. HDFS Action
Introduction to the workflow management tool with hands-on examples
Readings | Practical Implementation
  1. Random Read and Write Access, OLAP, NoSQL
  2. Database: HBase
  3. Client API - Advanced Features
  4. Client API - Administrative Features
  5. Available Client, Architecture
  6. MapReduce Integration, Advanced Usage
  7. Advanced Indexing
Introduction to the fundamentals of random read and write access, OLAP, and the NoSQL database HBase
Readings | Practical Implementation
  1. MIS Reporting and ELT on Hadoop: Retail Domain
  2. We will provide data sets on which participants will work as part of the project:
    • Load data into MySQL
    • Retail Data Analysis with Pig
    • Sqoop data into HDFS
    • Retail Data Analysis with Hive
Given a retail business scenario, this provides a run-through of MIS reporting and ELT on Hadoop
Readings | Practical Implementation
  1. Customer 360 & Genome: Banking sector
  2. We will provide data sets on which participants will work as part of the project:
    • Data Creation
    • MySQL Data Ingestion
    • Sqoop Daily data to MySQL
    • HBase table Creation
Given a banking sector scenario, this provides a run-through of Customer 360 & Genome
Readings | Practical Implementation
  1. Using Flume, Kafka, Spark Streaming and Batch Processing Using Hive & Impala
A run-through of structured data ingestion and semi-structured processing
Readings | Practical Implementation
A session on exam preparation, the exam pattern and the important topics to be discussed
Readings | Practical Implementation
Online - 2 hours
Readings | Practical Implementation
  1. Population vs. Sample
  2. Types of Data Variables and Summarizing
  3. Central Tendency and Spread/Variability
  4. Data Collection and Data Dictionary
  5. Probability and Random Variables
  6. Probability Distribution: Discrete and Continuous Distributions
  7. Central Limit Theorem
  8. Hypothesis Testing
Introduction to data and statistics, with insight into the concepts of distribution and hypothesis testing
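The Central Limit Theorem listed above can be checked with a small simulation. This is a minimal sketch using only the Python standard library; the sample size, the number of trials, the seed and the choice of a Uniform(0, 1) population are assumptions made for illustration.

```python
import random
import statistics


def sample_means(n, trials, seed=0):
    """Draw `trials` samples of size `n` from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    means = sample_means(n=30, trials=2000)
    # Theory: the means centre on 0.5 with spread sqrt(1/12) / sqrt(n).
    expected_sd = (1 / 12) ** 0.5 / 30 ** 0.5
    print("mean of sample means:", statistics.fmean(means))
    print("sd of sample means:  ", statistics.stdev(means))
    print("theoretical sd:      ", expected_sd)
```

Although each individual draw is uniform, the sample means cluster around 0.5 with a spread close to the theoretical value, which is the behaviour the theorem predicts.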
Readings | Practical Implementation
  1. Correlation and Regression
  2. Multivariate Linear Regression Theory
  3. Bivariate Analysis
  4. ANOVA (Analysis of Variance)
  5. Identify and quantify the factors responsible for loss amount for an auto insurance company
    • Model Misspecifications
    • Economic meaning of a Regression Model
    • Bivariate Analysis
Understanding correlation, regression and ANOVA through a multivariate linear regression case study
Readings | Practical Implementation
  1. Identifying problems in fitting linear regression on data having a "Binary Response" variable
  2. Generalized Linear Modeling (GLMs)
  3. Logistic Regression Theory/Case
    • Fitting the regression using SAS language
    • Lift/Gains chart and Gini coefficient
    • K-S stat
  4. Identify bank customers who will most likely default in making the payment on balance due
Working through a multivariate logistic regression case study: identifying problems in fitting linear regression on data having a "Binary Response" variable, and Generalized Linear Modeling (GLMs)
Readings | Practical Implementation
  1. Models of time series
  2. The Box-Jenkins model building process: identifying the ARIMA model
    • Forecasting future sales based on historical data for an automobile company
  3. Identify bank customers who will most likely default in making the payment on balance due
Models of time series, the Box-Jenkins model building process - ARIMA Modeling
Readings | Practical Implementation
  1. Affinity analysis to understand purchase behavior
  2. Understanding the Apriori algorithm
  3. Analysis of observational datasets to find unforeseen relationships
  4. Analysis of output results to plan store layout, promotions and recommendations
Optimization of expenses using Market Mix Modeling
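The affinity analysis topics above rest on two measures, support and confidence. The following is a minimal sketch, assuming a toy list of baskets invented for illustration; it computes the two measures for a single rule rather than implementing the full Apriori algorithm.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    if not baskets:
        return 0.0
    items = set(items)
    return sum(items <= set(basket) for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Support of the combined itemset relative to the antecedent alone."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0
    combined = set(antecedent) | set(consequent)
    return support(baskets, combined) / base


# Hypothetical transactions, one set of purchased items per basket.
baskets = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]

if __name__ == "__main__":
    print("support(bread, butter):", support(baskets, {"bread", "butter"}))
    print("confidence(bread -> butter):", confidence(baskets, {"bread"}, {"butter"}))
```

Apriori builds on these measures by only extending itemsets whose support already meets a chosen threshold, which keeps the search over item combinations manageable.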
Readings | Practical Implementation
  1. Getting an insight into Data Mining and Decision Trees
  2. Introduction and practical application of CHAID analysis
  3. Introduction and practical application of CART
  4. Understand the usage of clustering
  5. Getting an insight into various clustering methods
  6. Hands-on with the K-means clustering algorithm
Understanding CHAID and CART analysis and linking them with K-means clustering
Readings | Practical Implementation
  1. Develop a scoring algorithm
  2. Email Score
    • Open rate
    • Click rate
    • Unsubscribe rate
  3. Rank campaigns according to Email Score
Air Traffic Control for Emails
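One way to read the email scoring steps above is as a weighted combination of the three rates. The sketch below is an assumption made for illustration only: the weights, the `Campaign` fields and the sample figures are hypothetical and are not the course's scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    sent: int
    opened: int
    clicked: int
    unsubscribed: int


def email_score(c, w_open=1.0, w_click=2.0, w_unsub=5.0):
    """Hypothetical score: reward opens and clicks, penalise unsubscribes."""
    if c.sent == 0:
        return 0.0
    open_rate = c.opened / c.sent
    click_rate = c.clicked / c.sent
    unsub_rate = c.unsubscribed / c.sent
    return w_open * open_rate + w_click * click_rate - w_unsub * unsub_rate


def rank_campaigns(campaigns):
    """Return campaigns ordered from highest to lowest score."""
    return sorted(campaigns, key=email_score, reverse=True)


if __name__ == "__main__":
    # Made-up campaign figures.
    campaigns = [
        Campaign("B", sent=1000, opened=400, clicked=20, unsubscribed=40),
        Campaign("A", sent=1000, opened=300, clicked=50, unsubscribed=5),
    ]
    for c in rank_campaigns(campaigns):
        print(c.name, email_score(c))
```

Keeping the weights as parameters makes it easy to test how sensitive the ranking is to the penalty placed on unsubscribes.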
Readings | Practical Implementation
Cross Sell Model: Propensity to cross-sell health insurance products to general insurance customers.
Market Mix Modeling: Optimization of the promotion expense using market mix modeling.
Churn Analytics: Developing a churn model to gauge the propensity of attrition among the loyal and profitable customer segment.
Email Optimization - Ecommerce Industry: Developing a system that ensures that the correct campaign reaches the relevant customers with a suitable frequency to further enhance the level of engagement across all email campaigns.
Customer Lifetime Value Analysis: Predicting customer survival along with profitability to model the lifetime value of each customer.
Telecom Model to Estimate Bill: Building a model that can suggest the right tariff plan based on the estimated bill amount.
Sentiment Analysis: The process of detecting the contextual polarity of text to find whether a piece of writing is positive, negative or neutral.
Readings | Practical Implementation
  1. The Visualization Design Methodology
  2. The Data Visualization Process
  3. Working with Single Data Sources
  4. Using Calculations in Tableau
  5. Using Multiple Data Sources
An introduction to various data visualization techniques, later tying them back to various scenarios
Readings | Practical Implementation
Solving problem statements using Apache Hive and R
Real-Time Analytics, Unstructured Data Ingestion
Readings | Practical Implementation
  1. Ridge Regression
    • Cost functions
    • Ridge regression equation
    • Application of Ridge regression
  2. LASSO Regression
    • Cost functions
    • Lasso regression equation
    • Application of Lasso regression
Introduction to machine learning with hands-on practice in Ridge and LASSO regression
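For reference, the cost functions named in the Ridge and LASSO topics above are usually written as follows. The notation (n observations, p predictors, coefficients β, penalty weight λ) is an assumption chosen here, since the outline itself does not fix any symbols.

```latex
% Assumed notation: n observations, p predictors, penalty weight \lambda \ge 0.
\[
J_{\text{ridge}}(\beta) = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2
  + \lambda \sum_{j=1}^{p} \beta_j^2
\]
\[
J_{\text{lasso}}(\beta) = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
% Ridge has a closed-form solution when X and y are centred (no intercept column):
\[
\hat{\beta}_{\text{ridge}} = (X^{\top}X + \lambda I)^{-1} X^{\top} y
\]
```

With λ = 0 both cost functions reduce to ordinary least squares. The absolute-value penalty in LASSO can shrink some coefficients exactly to zero, which ridge generally does not.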
Readings | Practical Implementation
  1. Count Regression
    • Poisson Regression
    • Negative Binomial Regression
    • Zero-Inflated Regressions
  2. Survivor Analytics
    • Time-to-event Data
    • Survival Analysis
    • Comparing Survival Curves
  3. Goodness of fit
  4. Model Comparisons
Getting an understanding of count regression and survivor analytics
Readings | Practical Implementation
  1. Random Forest
    • Hyper parameters of Random Forest
    • Fine-tuning Random Forest and calculating its cost function
  2. Neural Network, Back Propagation, Back Propagation Intuition, Gradient Checking
    • The Perceptron learning
    • The back propagation learning
    • Recurrent neural networks
    • Feed Forward Neural Network
Insight into deep machine learning through random forests and neural networks
Readings | Practical Implementation
  1. Cloud and Platform as a Service, such as APIs from various cloud providers
  2. APIs related to NLP (Natural Language Processing)
  3. Development Environment Setup
  4. Developing a hands-on solution
  5. More such APIs in advanced analytics and AI, such as object recognition and mood detection from images, in the area of cognitive computing
Integrating advanced analytics using ML/AI APIs with machine learning

Schedule
Location | Start Date | Batch Type | Class Timing
Benefits
  • At the end of these data science classes, students will be able to:
    • Work through a data science project end to end, from analyzing a dataset to visualizing and communicating your data analysis.

    • Through working on the class project, you will be exposed to and understand the skills that are needed to become a data scientist yourself.

    • Identify, obtain, and transform a data set to make it suitable to produce statistical evidence communicated in written form.

    • Build models based on new data types, experimental design, and statistical inference.

A Data Scientist, IT earns an average salary of INR 620,244 per annum.

We sincerely appreciate the flexibility of teaching and customized guidance that the institute provided each of us. The intensity of the programme prepares one for high pressure situations. We are very grateful for the very valuable training and assistance provided to us by EDUPRISTINE.

Saloni Sharma EOL-E&P, Mumbai
Who should do this?

Individuals with a bachelor's degree in engineering, science, maths/statistics, finance, computer science, accounting or marketing who are intrigued by statistical and analytical practices may excel in this field.

  • Basic Statistics methods used in business performance measures
  • Strong interest in data science
  • Hands-on experience on Core Java & Unix
  • Good analytical skills to grasp and apply the concepts in Hadoop


