## MU MSC DATASCIENCE SEMIII REVISED SYLLABUS 20211 1 Syllabus Mumbai University by munotes

## Page 2

Copy to:

1. The Deputy Registrar, Academic Authorities Meetings and Services (AAMS),
2. The Deputy Registrar, College Affiliations & Development Department (CAD),
3. The Deputy Registrar, Admissions, Enrolment, Eligibility and Migration Department (AEM),
4. The Deputy Registrar, Research Administration & Promotion Cell (RAPC),
5. The Deputy Registrar, Executive Authorities Section (EA),
6. The Deputy Registrar, PRO, Fort, (Publication Section),
7. The Deputy Registrar, (Special Cell),
8. The Deputy Registrar, Fort/ Vidyanagari Administration Department (FAD) (VAD), Record Section,
9. The Director, Institute of Distance and Open Learning (IDOL Admin), Vidyanagari,

They are requested to treat this as action taken report on the concerned resolution adopted by the Academic Council referred to in the above circular and that no separate Action Taken Report will be sent in this connection.

1. P.A to Hon’ble Vice-Chancellor,
2. P.A to Pro-Vice-Chancellor,
3. P.A to Registrar,
4. All Deans of all Faculties,
5. P.A to Finance & Account Officers, (F.& A.O),
6. P.A to Director, Board of Examinations and Evaluation,
7. P.A to Director, Innovation, Incubation and Linkages,
8. P.A to Director, Board of Lifelong Learning and Extension (BLLE),
9. The Director, Dept. of Information and Communication Technology (DICT) (CCF & UCC), Vidyanagari,
10. The Director of Board of Student Development,
11. The Director, Department of Students Welfare (DSD),
12. All Deputy Registrars, Examination House,
13. The Deputy Registrars, Finance & Accounts Section,
14. The Assistant Registrar, Administrative sub-Campus Thane,
15. The Assistant Registrar, School of Engg. & Applied Sciences, Kalyan,
16. The Assistant Registrar, Ratnagiri sub-centre, Ratnagiri,
17. The Assistant Registrar, Constituent Colleges Unit,
18. BUCTU,
19. The Receptionist,
20. The Telephone Operator,
21. The Secretary MUASA

for information.

## Page 3

AC – 29/06/2021

Item No. - 6.39

UNIVERSITY OF MUMBAI

Syllabus for the

Program: M.Sc. Semester-I and Semester-II CBCS (REVISED)

Course: M.Sc. Computer Science with Specialization in Data Science

(Choice Based and Credit System with effect from the academic year 2021-22)

## Page 4

UNIVERSITY OF MUMBAI

Syllabus for Approval

| Sr. No. | Heading | Particulars |
|---|---|---|
| 1. | Title of the Course | Master in Computer Science with Specialization in Data Science |
| 2. | Eligibility for Admission | A candidate with minimum 50% score in Graduation can appear for entrance examination through which the admission merit list will be generated. |
| 3. | Passing Marks | 40% |
| 4. | Ordinances / Regulations (if any) | As applicable for all M.Sc. Courses |
| 5. | Number of years / Semesters | Two years – Four Semesters |
| 6. | Level | P.G. / U.G. / Diploma / Certificate (Strike out which is not applicable) |
| 7. | Pattern | Yearly / Semester, Choice Based (Strike out which is not applicable) |
| 8. | Status | New / Revised |
| 9. | To be implemented from Academic year | From the Academic Year 2021 – 2022 |

Date: 28/06/2021

Dr. Jagdish Bakal, BoS Chairperson in Computer Science

Dr. Anuradha Majumdar, Dean, Science and Technology

## Page 5

PROGRAMME OUTCOME

1. Students will attain proficiency with statistical analysis of data.

2. Students will execute statistical analyses with professional statistical software.

3. Students will gain skills in data management.

4. Students will develop the ability to build and assess data-based models.

5. Students will apply data science concepts and methods to solve problems in real-world contexts and will communicate these solutions effectively.

PROGRAMME SPECIFIC OUTCOMES (PSOs)

On completion of M.Sc. Data Science programme, students will be able:

PO_01: To become a skilled Data Scientist in industry, academia, or government.

PO_02: To use specialised software tools for data storage, analysis and visualization.

PO_03: To independently carry out research/investigation to solve practical problems.

PO_04: To gain problem-solving ability: to assess social issues (ethical, financial, management, analytical and scientific analysis) and engineering problems.

PO_05: To have a clear understanding of professional and ethical responsibility.

PO_06: To collaborate virtually.

PO_07: To have critical thinking and innovative skills.

PO_08: To translate vast data into abstract concepts and to understand database reasoning.

## Page 6

PROGRAMME STRUCTURE

Semester – I

| Course Code | Course Title | Credits |
|---|---|---|
| PSDS101 | Programming Paradigms | 4 |
| PSDS102 | Database Technologies | 4 |
| PSDS103 | Fundamentals of Data Science | 4 |
| PSDS104 | Statistical Methods for Data Science | 4 |
| PSDS1P1 | Programming Paradigms Practical | 2 |
| PSDS1P2 | Database Technologies Practical | 2 |
| PSDS1P3 | Fundamentals of Data Science Practical | 2 |
| PSDS1P4 | Statistical Methods for Data Science Practical | 2 |
| | Total Credits | 24 |

Semester – II

| Course Code | Course Title | Credits |
|---|---|---|
| PSDS201 | Artificial Intelligence and Machine Learning | 4 |
| PSDS202 | Soft Computing | 4 |
| PSDS203 | Algorithms for Data Science | 4 |
| PSDS204 | Optimization Techniques | 4 |
| PSDS2P1 | Artificial Intelligence and Machine Learning Practical | 2 |
| PSDS2P2 | Soft Computing Practical | 2 |
| PSDS2P3 | Algorithms for Data Science Practical | 2 |
| PSDS2P4 | Optimization Techniques Practical | 2 |
| | Total Credits | 24 |

## Page 7

DETAILED SYLLABUS FOR SEMESTER - I & SEMESTER - II

Semester – I

Programming Paradigms

M.Sc. (Data Science) Semester – I

Course Name: Programming Paradigms

Course Code: PSDS101

Periods per week (1 Period is 60 minutes): 4

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Internal | -- | 40 |

Course Objectives:

To understand the basic building blocks of programming languages.

To learn and understand various programming paradigms.

| Unit | Details | Lectures |
|---|---|---|
| I | Foundations: Language design, why to study programming languages, compilation and interpretation, programming environments. Programming language syntax: Specifying syntax: regular expressions and context-free grammars (Tokens and regular expressions, Context-free grammars, Derivations and parse trees), Scanning (Generating a finite automaton, Scanner code, Table-driven scanning, Lexical errors, Pragmas), Parsing (Recursive descent, Writing an LL(1) grammar, Table-driven top-down parsing, Bottom-up parsing, Syntax errors) | 12 |
| II | OBJECT ORIENTATION. Basic concepts: objects, classes, methods, overloading methods, messages; inheritance: overriding methods, single inheritance, multiple inheritance; interfaces, encapsulation, polymorphism. | 12 |
| III | FUNCTIONAL PROGRAMMING. Definition of a function: domain and range, total and partial functions, strict functions. Recursion, Referential transparency, Side effects of functions | 12 |
| IV | LOGIC PROGRAMMING. Basic constructs, Facts: queries, existential queries, conjunctive queries and rules. Definition and semantics of a logic program, Recursive programming: Computational model of logic programming, Goal reduction, Negation in logic programming | 12 |
| V | SCRIPTING LANGUAGE. What is a scripting language, Problem domains (Shell languages, Text processing and report generation, Mathematics and statistics, General purpose scripting, Extension languages), Scripting the World Wide Web (CGI scripts, Embedded server-side scripts, Client-side scripts, Java Applets, XSLT) | 12 |
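Illustrative sketch for the Unit I topics of scanning and recursive-descent parsing. It is not part of the prescribed syllabus; the small expression grammar and all names (`tokenize`, `Parser`, `evaluate`) are assumptions chosen only to show how each grammar rule can map to one parsing function.

```python
import re

# Hypothetical grammar, chosen for illustration only:
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'

TOKEN_PATTERN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Scanner: split the input into number and operator tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse and evaluate an arithmetic expression in one pass."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

For example, `evaluate("2 + 3 * (4 - 1)")` returns `11`, because the `term` rule binds multiplication more tightly than the `expr` rule binds addition.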

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 1. | Programming Language Pragmatics | Michael Scott | Morgan Kaufmann | 4th Edition | 2015 |
| 2. | Haskell: The Craft of Functional Programming | Thompson, Simon | Addison-Wesley Professional | 2nd Edition | 2011 |
| 3. | Foundations of Programming Languages Design & Implementation | Roosta Seyed | Cengage Learning | 3rd Edition | 2003 |
| 4. | Programming Languages: Concepts and Constructs | Sethi Ravi | Pearson Education | 3rd Edition | 2000 |

Programming Paradigms Practical

M.Sc. (Data Science) Semester – I

Course Name: Programming Paradigms Practical

Course Code: PSDS1P1

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
|---|---|---|
| Practical Examination | 2 | 50 |
| Internal | -- | -- |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.

Course Outcomes:

To explore a range of modern programming languages and programming techniques.

To select appropriate software development tools for given application environments.

## Page 9

Database Technologies

M.Sc. (Data Science) Semester – I

Course Name: Database Technologies

Course Code: PSDS102

Periods per week (1 Period is 60 minutes): 4

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Internal | -- | 40 |

Course Objectives:

The objective of the course is to present an introduction to database management systems, with an emphasis on how to organize, maintain and retrieve information from a DBMS efficiently and effectively.

| Unit | Details | Lectures |
|---|---|---|
| I | Database Concepts: Why Databases?, Data versus Information, Introducing the Database, Why Database Design Is Important, Evolution of File System Data Processing, Problems with File System Data Processing, Database Systems. Data Models: Data Modeling and Data Models, The Importance of Data Models, Data Model Basic Building Blocks, Business Rules, The Evolution of Data Models, Degrees of Data Abstraction. The Relational Database Model: A Logical View of Data, Keys, Integrity Rules, Relational Algebra, The Data Dictionary and the System Catalog, Relationships within the Relational Database, Data Redundancy Revisited. Entity Relationship (ER) Modeling: The Entity Relationship Model, Developing an ER Diagram, Database Design Challenges: Conflicting Goals | 12 |
| II | Advanced Data Modelling: The Extended Entity Relationship Model, Entity Clustering, Design Cases: Learning Flexible Database Design. Normalization of Database Tables: Database Tables and Normalization, The Need for Normalization, The Normalization Process, Improving the Design. Introduction to Structured Query Language (SQL): Introduction to SQL, Basic SELECT Queries, SELECT Statement Options, FROM Clause Options, ORDER BY Clause Options, WHERE Clause Options, Aggregate Processing, Subqueries, SQL Functions, Relational Set Operators, Crafting SELECT Queries. Advanced SQL: Data Definition Commands, Creating Table Structures, Altering Table Structures, Data Manipulation Commands, Virtual Tables: Creating a View, Sequences, Procedural SQL, Embedded SQL. Transaction Management and Concurrency Control: What Is a Transaction?, Concurrency Control, Concurrency Control with Locking Methods, Concurrency Control with Time Stamping Methods, Concurrency Control with Optimistic Methods, ANSI Levels of Transaction Isolation, Database Recovery Management | 12 |
| III | Three Database Revolutions: Early Database Systems, The First Database Revolution, The Second Database Revolution, The Third Database Revolution. Google, Big Data, and Hadoop: The Big Data Revolution, Google: Pioneer of Big Data, Hadoop: Open-Source Google Stack. Sharding, Amazon, and the Birth of NoSQL: Scaling Web 2.0, Amazon’s Dynamo. Document Databases: XML and XML Databases, JSON Document Databases | 12 |
| IV | Tables are Not Your Friends: Graph Databases: What is a Graph?, RDBMS Patterns for Graphs, RDF and SPARQL, Property Graphs and Neo4j, Gremlin, Graph Database Internals, Graph Compute Engines. Column Databases: Data Warehousing Schemas, The Columnar Alternative, Sybase IQ, C-Store, and Vertica, Column Database Architectures. The End of Disk? SSD and In-Memory Databases: The End of Disk?, In-Memory Databases, Berkeley Analytics Data Stack and Spark. Distributed Database Patterns: Distributed Relational Databases, Nonrelational Distributed Databases, MongoDB Sharding and Replication, HBase, Cassandra. Consistency Models: Types of Consistency, Consistency in MongoDB, HBase Consistency, Cassandra Consistency | 12 |
| V | Data Models and Storage: Data Models, Storage. Languages and Programming Interfaces: SQL, NoSQL APIs, The Return of SQL. Databases of the Future: The Revolution Revisited, Counterrevolutionaries, Can We have it All?, Meanwhile, Back at Oracle HQ, Other Convergent Databases, Disruptive Database Technologies | 12 |
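Illustrative sketch for the Unit II topics of data definition commands, WHERE clause options and aggregate processing. It is not part of the prescribed syllabus; the `student` table, its rows and the threshold of 40 are hypothetical, and Python's standard-library `sqlite3` module stands in for whichever DBMS a course actually uses.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical student table."""
    connection = sqlite3.connect(":memory:")
    # Data definition: create the table structure.
    connection.execute(
        "CREATE TABLE student ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " course TEXT NOT NULL,"
        " marks INTEGER NOT NULL)"
    )
    # Data manipulation: insert invented rows for illustration.
    rows = [
        ("Asha", "PSDS101", 72),
        ("Ravi", "PSDS101", 58),
        ("Meera", "PSDS102", 64),
        ("John", "PSDS102", 35),
    ]
    connection.executemany(
        "INSERT INTO student (name, course, marks) VALUES (?, ?, ?)", rows
    )
    connection.commit()
    return connection


def average_marks_by_course(connection, minimum_marks=40):
    """Filter with WHERE, aggregate with AVG, and sort with ORDER BY."""
    query = (
        "SELECT course, AVG(marks) AS avg_marks"
        " FROM student"
        " WHERE marks >= ?"
        " GROUP BY course"
        " ORDER BY course"
    )
    return connection.execute(query, (minimum_marks,)).fetchall()
```

Calling `average_marks_by_course(build_demo_db())` gives one row per course; the row with 35 marks is excluded by the WHERE clause before the average is computed.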

## Page 11

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 1 | Database Systems: Design, Implementation & Management | Carlos Coronel, Steven Morris | Cengage | 13th | 2018 |
| 2 | Next Generation Databases | Guy Harrison | Apress | 1st | 2015 |
| 3 | Advanced Database Technology and Design | Mario Piattini, Oscar Díaz | Artech House | 1st | 2000 |

Database Technologies Practical

M.Sc. (Data Science) Semester – I

Course Name: Database Technologies Practical

Course Code: PSDS1P2

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
|---|---|---|
| Practical Examination | 2 | 50 |
| Internal | -- | -- |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.

Course Outcomes:

Upon successful completion of this course, students should be able to:

Describe the fundamental elements of relational database management systems.

Explain the basic concepts of relational data model, entity-relationship model, relational database design, relational algebra and SQL.

Design ER-models to represent simple database application scenarios.

Convert the ER-model to relational tables, populate relational database and formulate SQL queries on data.

Improve the database design by normalization.

## Page 12

Fundamentals of Data Science

M.Sc. (Data Science) Semester – I

Course Name: Fundamentals of Data Science

Course Code: PSDS103

Periods per week (1 Period is 60 minutes): 4

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Internal | -- | 40 |

Course Objectives:

To provide a strong foundation for data science and its application in related areas, and to understand the underlying core concepts and emerging technologies in data science.

| Unit | Details | Lectures |
|---|---|---|
| I | Introduction to Data Science: What is Data? Kinds of data: e.g. static, spatial, temporal, text, media. Introduction to high level programming language + Integrated Development Environment (IDE). Describing data: Exploratory Data Analysis (EDA) + Data Visualization - Summaries, aggregation, smoothing, distributions. Data sources: e.g. relational databases, web/API, streaming. Data collection: e.g. sampling, design (observational vs experimental) and its impact on visualization, modeling and generalizability of results | 12 |
| II | Data analysis/modeling: Question/problem formation along with EDA; Introduction to estimation and inference (testing and confidence intervals) including simulation and resampling; Scope of inference; Assessment and selection e.g. training and testing sets. Data Curation, Management and Organization-I: Query languages and operations to specify and transform data (e.g. projection, selection, join, aggregate/group, summarize). Structured/schema based systems as users and acquirers of data: Relational (SQL) databases, APIs and programmatic access, indexing; XML and XPath, APIs for accessing and querying structured data contained therein | 12 |
| III | Data Curation, Management and Organization-I: Semi-structured systems as users and acquirers of data: Access through APIs yielding JSON to be parsed and structured. Unstructured systems in the acquisition and structuring of data: Web Scraping; Text/string parsing/processing to give structure. Data Curation, Management and Organization-II: Security and ethical considerations in relation to authenticating and authorizing access to data on remote systems. Software development tools (e.g. github, version control) | 12 |
| IV | Data Curation, Management and Organization-II: Large scale data systems: Paradigms for distributed data storage; Practical access to example systems (e.g. MongoDB, HBase, NoSQL systems); Amazon Web Services (AWS) provides public data sets in Landsat, genomics, multimedia. Introduction to Statistical Models: Simple Linear Regression, Multiple Linear Regression, Logistic Regression, Review of hypothesis testing, confidence intervals, etc. Estimation e.g. likelihood principle, Bayes | 12 |
| V | Introduction to Statistical Models: Linear models: Regression theory i.e. least-squares: Introduction to estimation principles; Multiple regression (Transformations, model selection; Interactions, indicator variables, ANOVA); Generalized linear models e.g. logistic, etc. Alternatives to classical regression e.g. trees, smoothing/splines. Introduction to model selection: Regularization, bias/variance tradeoff e.g. parsimony, AIC, BIC; Cross validation. Ridge regressions and penalized regression e.g. LASSO | 12 |
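Illustrative sketch for the least-squares regression theory listed in Units IV and V. It is not part of the prescribed syllabus; the function name and the sample data in the note are assumptions. For simple linear regression the least-squares estimates are slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x), which the code computes directly.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sxx: spread of x around its mean; Sxy: co-variation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x
```

With the hypothetical points (1, 3), (2, 5), (3, 7), (4, 9), which lie exactly on y = 1 + 2x, the fit recovers an intercept of 1 and a slope of 2.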

## Page 14

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 1 | Hands-On Programming with R | Garrett Grolemund | O'Reilly | 1st | 2014 |
| 2 | Doing Data Science | Rachel Schutt, Cathy O’Neil | O'Reilly Media | 1st | 2013 |
| 3 | An Introduction to Statistical Learning with Applications in R | Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani | Springer US | 2nd | 2021 |
| 4 | Applied Predictive Modelling | M. Kuhn, K. Johnson | Springer New York | 3rd | 2019 |
| 5 | Mastering Machine Learning with R | Cory Lesmeister | Packt Publishing | 2nd | 2015 |

Fundamentals of Data Science Practical

M.Sc. (Data Science) Semester – I

Course Name: Fundamentals of Data Science Practical

Course Code: PSDS1P3

Periods per week (1 Period is 60 minutes): 4 lectures

Credits: 2

| Evaluation System | Hours | Marks |
|---|---|---|
| Practical Examination | 2 | 50 |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.

Course Outcomes:

The students will be able to independently carry out research/investigation to solve practical problems.

The students should be able to understand and comprehend the problem, and should be able to define a suitable statistical method to be adopted.

## Page 15

Statistical Methods for Data Science

M.Sc. (Data Science) Semester – I

Course Name: Statistical Methods for Data Science

Course Code: PSDS104

Periods per week (1 Period is 60 minutes): 4 lectures

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Theory Internal | -- | 40 |

Prerequisites: Knowledge of statistics and mathematical concepts

Course Objectives:

1. To present the mathematical, statistical and computational challenges of building neural networks

2. To study the concepts of deep learning

3. To enable the students to know deep learning techniques to support real-time applications

| Unit | Details | Lectures |
|---|---|---|
| I | Introduction to Applied Statistics: The Nature of Statistics and Inference, What is “Big Data”?, Statistical Modelling, Statistical Significance Testing and Error Rates, Simple Example of Inference Using a Coin, Statistics Is for Messy Situations, Type I versus Type II Errors, Point Estimates and Confidence Intervals, Variable Types, Sample Size, Statistical Power, and Statistical Significance, The Verdict on Significance Testing, Training versus Test Data. | 12 |
| II | Computational Statistics: Vectors and Matrices, The Inverse of a Matrix, Eigenvalues and Eigenvectors. Means, Correlations, Counts: Drawing Inferences: Computing z and Related Scores, Statistical Tests, Plotting Normal Distributions, Correlation Coefficients, Evaluating Pearson’s r for Statistical Significance, Spearman’s Rho: A Nonparametric Alternative to Pearson, Tests of Mean Differences, t-Tests for One Sample, Two-Sample t-Test, Paired-Samples t-Test, Categorical Data, Binomial Test, Categorical Data Having More Than Two Possibilities. | 12 |
| III | Power Analysis and Sample Size Estimation: Power for t-Tests, Power for One-Way ANOVA, Power for Correlations. Analysis of Variance: Fixed Effects, Random Effects, Mixed Models, Introducing the Analysis of Variance (ANOVA), Performing the ANOVA, Random Effects ANOVA and Mixed Models, One-Way Random Effects ANOVA. Simple and Multiple Linear Regression: Simple Linear Regression, Multiple Regression Analysis, Hierarchical Regression, How Forward Regression Works | 12 |
| IV | Logistic Regression and the Generalized Linear Model: Logistic Regression, Predicting Probabilities, Multiple Logistic Regression, Training Error Rate Versus Test Error Rate. Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis: Multivariate Tests of Significance, Example of MANOVA, Outliers, Homogeneity of Covariance Matrices, Linear Discriminant Function Analysis, Theory of Discriminant Analysis, Predicting Group Membership, Visualizing Separation | 12 |
| V | Principal Component Analysis: Principal Component Analysis Versus Factor Analysis, Properties of Principal Components, Component Scores, How Many Components to Keep?, Exploratory Factor Analysis, Common Factor Analysis Model, Factor Analysis Versus Principal Component Analysis on the Same, Initial Eigenvalues in Factor Analysis, Rotation in Exploratory Factor Analysis, Estimation in Factor Analysis. Cluster Analysis: k-Means Cluster Analysis, Minimizing Criteria, Example of k-Means Clustering, Hierarchical Cluster Analysis, Why Clustering Is Inherently Subjective, Nonparametric Tests, Mann–Whitney U Test, Kruskal–Wallis Test, Nonparametric Test for Paired Comparisons and Repeated | 12 |
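Illustrative sketch for the Unit II topic "t-Tests for One Sample". It is not part of the prescribed syllabus; the function name and the sample values in the note are assumptions. The statistic is t = (sample mean - mu0) / (s / sqrt(n)) with n - 1 degrees of freedom, where s is the sample standard deviation. Turning t into a p-value needs a t-distribution table or a statistics library, which is deliberately left out here.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")

    mean = statistics.mean(sample)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sd = statistics.stdev(sample)
    if sd == 0:
        raise ValueError("sample has zero variance; t is undefined")

    standard_error = sd / math.sqrt(n)
    t_statistic = (mean - mu0) / standard_error
    return t_statistic, n - 1
```

For the hypothetical sample `[2, 4, 4, 4, 5, 5, 7, 9]` the mean is 5, so testing against `mu0 = 5` gives t = 0, while `mu0 = 4` gives a positive t with 7 degrees of freedom.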

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 01 | Univariate, Bivariate, and Multivariate Statistics Using R | Daniel J. Denis | Wiley | 1st | 2020 |
| 02 | Practical Data Science | Andreas François Vermeulen | APress | 1st | 2018 |
| 03 | Data Science from Scratch: First Principles with Python | Joel Grus | Shroff Publishers | 1st | 2017 |
| 04 | Experimental Design in Data Science with Least Resources | N C Das | Shroff Publishers | 1st | 2018 |

## Page 17

Statistical Methods for Data Science Practical

M.Sc. (Data Science) Semester – I

Course Name: Statistical Methods for Data Science Practical

Course Code: PSDS1P4

Periods per week (1 Period is 60 minutes): 4 lectures

Credits: 2

| Evaluation System | Hours | Marks |
|---|---|---|
| Practical Examination | 2 | 40 |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.

Course Outcomes:

At the end of successful completion of the course the student will be able to:

Describe basics of mathematical foundation that will help the learner to understand the concepts of Deep Learning.

Understand and describe model of deep learning.

Design and implement various deep supervised learning architectures for text & image data.

Design and implement various deep learning models and architectures.

Apply various deep learning techniques to design efficient algorithms for real-world applications.

## Page 18

SEMESTER-II

Artificial Intelligence and Machine Learning

M.Sc. (Data Science) Semester – II

Course Name: Artificial Intelligence and Machine Learning

Course Code: PSDS201

Periods per week (1 Period is 60 minutes): 4 lectures

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Theory Internal | -- | 40 |

Prerequisites: Knowledge of Algorithms and mathematical foundation

Course Objectives:

To provide the foundations for AI problem-solving techniques and knowledge representation formalisms

Understanding human learning aspects.

Understanding primitives in the learning process by computer.

Understanding the nature of problems solved with Machine Learning

| Unit | Details | Lectures |
|---|---|---|
| I | Introduction to AI: The AI problems, AI technique, philosophy and development of Artificial intelligence. Minimax algorithm, alpha-beta pruning, stochastic games, Constraint-satisfaction problems. Knowledge and Reasoning: Logical agents, Propositional logic, First-order logic, Inference in FoL: forward chaining, backward chaining, resolution, Knowledge representation: Frames, Ontologies, Semantic web and RDF. | 12 |
| II | Introduction to PROLOG: Facts and predicates, data types, goal finding, backtracking, simple object, compound objects, use of cut and fail predicates, recursion, lists, simple input/output, dynamic database. Machine Learning: Machine learning, Examples of Machine Learning Problems, Structure of Learning, Learning versus Designing, Training versus Testing, Characteristics of Machine learning tasks, Predictive and descriptive tasks, Machine learning Models: Geometric Models, Logical Models, Probabilistic Models. Features: Feature types, Feature Construction and Transformation, Feature Selection | 12 |
| III | Classification and Regression: Classification: Binary Classification- Assessing Classification performance, Class probability Estimation, Assessing class probability Estimates, Multiclass Classification. Regression: Assessing performance of Regression- Error measures, Overfitting- Catalysts for Overfitting, Case study of Polynomial Regression. Theory of Generalization: Effective number of hypothesis, Bounding the Growth function, VC Dimensions, Regularization theory. | 12 |
| IV | Linear Models: Least Squares method, Multivariate Linear Regression, Regularized Regression, Using Least Square regression for Classification. Perceptron, Support Vector Machines, Soft Margin SVM, Obtaining probabilities from Linear classifiers, Kernel methods for non-Linearity. Logic Based and Algebraic Model: Distance Based Models: Neighbours and Examples, Nearest Neighbours Classification, Distance based clustering-K means Algorithm, Hierarchical clustering | 12 |
| V | Rule Based Models: Rule learning for subgroup discovery, Association rule mining. Tree Based Models: Decision Trees, Ranking and Probability estimation Trees, Regression trees, Clustering Trees. Probabilistic Model: Normal Distribution and Its Geometric Interpretations, Naïve Bayes Classifier, Discriminative learning with Maximum likelihood, Probabilistic Models with Hidden variables: Expectation-Maximization Methods, Gaussian Mixtures, and Compression based Models. | 12 |

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 01 | Artificial Intelligence | Elaine Rich, Kevin Knight | Tata McGraw Hill | 3rd | 2017 |
| 02 | Machine Learning: The Art and Science of Algorithms that Make Sense of Data | Peter Flach | Cambridge University Press | 1st | 2012 |
| 03 | Introduction to Statistical Machine Learning with Applications in R | Hastie, Tibshirani, Friedman | Springer | 2nd | 2012 |
| 04 | Introduction to Machine Learning | Ethem Alpaydin | PHI | 2nd | 2013 |

## Page 20

Artificial Intelligence and Machine Learning Practical

M.Sc. (Data Science) Semester – II

Course Name: Artificial Intelligence and Machine Learning Practical

Course Code: PSDS2P1

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
|---|---|---|
| Practical Examination | 2 | 50 |
| Internal | -- | -- |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.

Course Outcomes:

Understand the key issues and concepts in Artificial Intelligence.

Acquire knowledge of classification and regression techniques, so that the learner can apply these skills to generate knowledge from data using the prescribed techniques.

Understand and implement the techniques for extracting knowledge using machine learning methods.

Achieve adequate perspectives of big data analytics in various applications like recommender systems, social media applications etc.

Understand the statistical approach related to machine learning. The learner will also apply the algorithms to a real-world problem, optimize the models learned and report on the expected accuracy that can be achieved by applying the models.

Soft Computing

M.Sc. (Data Science) Semester – II

Course Name: Soft Computing

Course Code: PSDS202

Periods per week (1 Period is 60 minutes): 4 lectures

Credits: 4

| Evaluation System | Hours | Marks |
|---|---|---|
| Theory Examination | 2½ | 60 |
| Theory Internal | -- | 40 |

Course Objectives:

To introduce soft computing concepts such as fuzzy logic, neural networks and genetic algorithms, of which Artificial Intelligence is the parent branch.

To show how these techniques can be used to solve problems efficiently.

## Page 21

| Unit | Details | Lectures |
|---|---|---|
| I | Artificial Neural Network: Fundamental concepts, Evolution of neural network, basic model of Artificial Neural Network, Important terminologies, McCulloch-Pitts neuron, linear separability, Hebb network. Supervised Learning Network: Perceptron networks, Adaline, Madaline, Backpropagation network, Radial Basis Function, Time Delay Network, Functional Link Networks, Tree Neural Network. | 12 |
| II | Unsupervised Learning Networks: Fixed weight competitive nets, Kohonen self-organizing feature maps, learning vector quantization, counterpropagation networks, adaptive resonance theory networks. Associative Memory Networks: Training algorithm for pattern association, Autoassociative memory network, heteroassociative memory network, bi-directional associative memory, Hopfield networks, iterative autoassociative memory networks, temporal associative memory networks. | 12 |
| III | Special Networks: Simulated annealing, Boltzmann machine, Gaussian Machine, Cauchy Machine, Probabilistic neural net, cascade correlation network, cognitron network, neo-cognitron network, cellular neural network, optical neural network. Third Generation Neural Networks: Spiking Neural networks, convolutional neural networks, deep learning neural networks, extreme learning machine model. | 12 |
| IV | Introduction to Fuzzy Logic, Classical sets, Fuzzy sets, Classical Relations and Fuzzy Relations: Cartesian Product of relation, classical relation, fuzzy relations, tolerance and equivalence relations, non-iterative fuzzy sets. Membership Function: features of the membership functions, fuzzification and methods of membership value assignments. Defuzzification: Lambda-cuts for fuzzy sets, Lambda-cuts for fuzzy relations, Defuzzification methods. Fuzzy Arithmetic and Fuzzy measures: fuzzy arithmetic, fuzzy measures, measures of fuzziness, fuzzy integrals. | 12 |
| V | Genetic Algorithm: Biological Background, Traditional optimization and search techniques, genetic algorithm and search space, genetic algorithm vs. traditional algorithms, basic terminologies, simple genetic algorithm, general genetic algorithm, operators in genetic algorithm, stopping condition for genetic algorithm flow, constraints in genetic algorithm, problem solving using genetic algorithm, the schema theorem, classification of genetic algorithm, Holland classifier systems, genetic programming, advantages and limitations and applications of genetic algorithm | 12 |
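Illustrative sketch for the Unit I topics "Perceptron networks" and "linear separability". It is not part of the prescribed syllabus; the function names, the step activation with threshold 0, and the logical AND training set are assumptions. The perceptron rule adds `learning_rate * (target - output) * input` to each weight whenever a sample is misclassified, and it converges here because AND is linearly separable.

```python
def step(activation):
    """Binary step activation: fire only when the weighted sum is positive."""
    return 1 if activation > 0 else 0


def predict(weights, bias, inputs):
    """Compute the perceptron output for one input vector."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return step(activation)


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Train with the perceptron learning rule; stop after an error-free epoch."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            delta = target - predict(weights, bias, inputs)
            if delta != 0:
                errors += 1
                # Move the decision boundary toward the misclassified sample.
                weights = [
                    w + learning_rate * delta * x for w, x in zip(weights, inputs)
                ]
                bias += learning_rate * delta
        if errors == 0:
            break
    return weights, bias


# Hypothetical training set: the logical AND function.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

After `weights, bias = train_perceptron(AND_SAMPLES)`, `predict(weights, bias, (1, 1))` returns 1 and the other three inputs return 0. The same rule would not converge on XOR, which is not linearly separable.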

## Page 22

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
|---|---|---|---|---|---|
| 1. | Artificial Intelligence and Soft Computing | Anandita Das Battacharya | SPD | 3rd | 2018 |
| 2. | Principles of Soft Computing | S.N. Sivanandam, S.N. Deepa | Wiley | 3rd | 2019 |
| 3. | Neuro-Fuzzy and Soft Computing | J.S.R. Jang, C.T. Sun and E. Mizutani | Prentice Hall of India | 1st | 2004 |
| 4. | Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis & Applications | S. Rajasekaran, G. A. Vijayalakshami | Prentice Hall of India | 1st | 2004 |
| 5. | Fuzzy Logic with Engineering Applications | Timothy J. Ross | McGraw-Hill | 1st | 1997 |
| 6. | Genetic Algorithms: Search, Optimization and Machine Learning | David E. Goldberg | Addison Wesley | 1st | 1989 |
| 7. | Introduction to AI and Expert System | Dan W. Patterson | Prentice Hall of India | 2nd | 2009 |

Soft Computing Practical

M. Sc (Data Science) Semester – II

Course Name: Soft Computing Practical

Course Code: PSDS2P2

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
| --- | --- | --- |
| Practical Examination | 2 | 50 |
| Internal | -- | -- |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.
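As an illustration only, and not part of the prescribed list, a practical on the Unit IV topic "Lambda-cuts for fuzzy sets" might look like the sketch below. It assumes a discrete fuzzy set stored as a Python dictionary of membership grades; the function name `lambda_cut` and the sample grades are hypothetical.

```python
def lambda_cut(fuzzy_set, lam, strong=False):
    """Return the crisp set of elements whose membership grade is >= lam.

    With strong=True the comparison is strict (> lam), giving the strong cut.
    `fuzzy_set` maps each element to a membership grade in [0, 1].
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    if strong:
        return {x for x, mu in fuzzy_set.items() if mu > lam}
    return {x for x, mu in fuzzy_set.items() if mu >= lam}


# Hypothetical membership grades, chosen only for this example.
A = {"a": 0.2, "b": 0.5, "c": 0.8, "d": 1.0}

cut_05 = lambda_cut(A, 0.5)                      # elements with grade >= 0.5
strong_cut_05 = lambda_cut(A, 0.5, strong=True)  # elements with grade > 0.5
print(sorted(cut_05), sorted(strong_cut_05))
```

Raising the threshold can only shrink the resulting crisp set, which is a quick sanity check when experimenting with other values of lambda.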

## Page 23

Course Outcome:

Identify and describe soft computing techniques and their roles in building intelligent machines

Recognize the feasibility of applying a soft computing methodology for a particular problem

Apply fuzzy logic and reasoning to handle uncertainty and solve engineering problems, and apply neural networks to classification and regression problems

Apply genetic algorithms to combinatorial optimization problems

Evaluate and compare solutions by various soft computing approaches for a given problem.

Algorithms for Data Science

M. Sc (Data Science) Semester – II

Course Name: Algorithms for Data Science

Course Code: PSDS203

Periods per week (1 Period is 60 minutes): 4

Credits: 4

| Evaluation System | Hours | Marks |
| --- | --- | --- |
| Theory Examination | 2½ | 60 |
| Internal | -- | 40 |

Course Objectives:

The course is aimed at:

focussing on the principles of data reduction and core algorithms for analysing the data of data science

providing many opportunities to develop and improve programming skills

applying algorithms to real-world data sets

imparting design thinking capability to build big-data

| Unit | Details | Lectures |
| --- | --- | --- |
| I | Introduction: What Is Data Science?, Diabetes in America, Authors of the Federalist Papers, Forecasting NASDAQ Stock Prices, Algorithms, Python, R, Terminology and Notation. Data Mapping and Data Dictionaries: Data Reduction, Political Contributions, Dictionaries, Tutorial: Big Contributors, Data Reduction, Election Cycle Contributions, Similarity Measures, Computing Similarity. Scalable Algorithms and Associative Statistics: Introduction, Associative Statistics, Univariate Observations, Functions, Histogram Construction, Multivariate Data, Computing the Correlation Matrix, Linear Regression, Computing β. | 12 |
| II | Hadoop and MapReduce: Introduction, The Hadoop Ecosystem, Medicare Payments, The Command Line Environment, Programming a MapReduce Algorithm, Using Amazon Web Services. Data Visualization: Introduction, Principles of Data Visualization, Making Good Choices, Harnessing the Machine. | 12 |
| III | Linear Regression Methods: Introduction, The Linear Regression Model, Introduction to R, Large Data Sets and R, Factors, Analysis of Residuals. Healthcare Analytics: Introduction, The Behavioral Risk Factor Surveillance System, Diabetes Prevalence and Incidence, Predicting At-Risk Individuals, Identifying At-Risk Individuals, Unusual Demographic Attribute Vectors, Building Neighborhood Sets. | 12 |
| IV | Cluster Analysis: Introduction, Hierarchical Agglomerative Clustering, Comparison of States, Hierarchical Clustering of States, The k-Means Algorithm. k-Nearest Neighbor Prediction Functions: Introduction, Notation and Terminology, Distance Metrics, The k-Nearest Neighbor Prediction Function, Exponentially Weighted k-Nearest Neighbors, Digit Recognition, Accuracy Assessment, k-Nearest Neighbor Regression, Forecasting the S&P 500, Forecasting by Pattern Recognition, Cross-Validation. The Multinomial Naïve Bayes Prediction Function: Introduction, The Federalist Papers, The Multinomial Naïve Bayes Prediction Function, Reducing the Federalist Papers, Predicting Authorship of the Disputed Federalist Papers, Customer Segmentation. | 12 |
| V | Forecasting: Introduction, Working with Time, Analytical Methods, Computing ρτ, Drift and Forecasting, Holt-Winters Exponential Forecasting, Regression-Based Forecasting of Stock Prices, Time-Varying Regression Estimators. Real-time Analytics: Introduction, Forecasting with a NASDAQ Quotation Stream, Forecasting the Apple Inc. Stream, The Twitter Streaming API, Sentiment Analysis, Sentiment Analysis of Hashtag Groups. | 12 |

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
| --- | --- | --- | --- | --- | --- |
| 1 | Algorithms for Data Science | Brian Steele, John Chandler, Swarna Reddy | Springer | 1st | 2016 |
| 2 | Data Science Algorithms in a Week | David Natingga | Packt Publishing | 1st | 2017 |
| 3 | Data Science: Theories, Models, Algorithms and Analytics | Sanjiv Ranjan Das | S.R. Das | 1st | 2017 |

## Page 25

Algorithms for Data Science Practical

M. Sc (Data Science) Semester II

Course Name: Algorithms for Data Science Practical

Course Code: PSDS2P3

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
| --- | --- | --- |
| Practical Examination | 2 | 50 |
| Internal | -- | |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.
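As an illustration only, and not part of the prescribed list, a practical on the Unit IV topic "The k-Nearest Neighbor Prediction Function" might look like the sketch below. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy training data are hypothetical and are not taken from the reference books.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort the training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common(1) returns the label with the highest vote count.
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups labelled "A" and "B".
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

pred_near_a = knn_predict(train, (1.1, 1.0), k=3)
pred_near_b = knn_predict(train, (5.1, 5.0), k=3)
print(pred_near_a, pred_near_b)
```

An odd value of k avoids tied votes in two-class problems; the exponentially weighted variant listed in the syllabus would replace the equal votes with distance-dependent weights.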

Course Outcomes:

At the end of the course the student should be able to:

Understand fundamentals of data science

Apply data visualisation in big-data analytics

Apply Hadoop and the map-reduce algorithm to big data

Apply different algorithms to data sets

Perform real-time analytics

Optimization Techniques

M. Sc (Data Science) Semester – II

Course Name: Optimization Techniques

Course Code: PSDS204

Periods per week (1 Period is 60 minutes): 4

Credits: 4

| Evaluation System | Hours | Marks |
| --- | --- | --- |
| Theory Examination | 2½ | 60 |
| Internal | -- | 40 |

Course Objectives:

To familiarize the students with some basic concepts of optimization techniques and approaches.

To formulate a real-world problem as a mathematical programming model.

To develop model formulations and applications that are used in solving decision problems.

To solve specialized linear programming problems such as the transportation and assignment problems.

## Page 26

| Unit | Details | Lectures |
| --- | --- | --- |
| I | Mathematical Foundations: Functions and Continuity, Review of Calculus, Vectors, Matrix Algebra, Eigenvalues and Eigenvectors, Optimization and Optimality, General Formulation of Optimization Problems. Algorithms, Complexity, and Convexity: What Is an Algorithm?, Order Notations, Convergence Rate, Computational Complexity, Convexity, Stochastic Nature in Algorithms. | 12 |
| II | Optimization: Unconstrained Optimization, Gradient-Based Methods, Gradient-Free Nelder–Mead Method. Constrained Optimization: Mathematical Formulation, Lagrange Multipliers, Slack Variables, Generalized Reduced Gradient Method, KKT Conditions, Penalty Method. Optimization Techniques: Approximation Methods: BFGS Method, Trust-Region Method, Sequential Quadratic Programming, Convex Optimization, Equality Constrained Optimization, Barrier Functions, Interior-Point Methods, Stochastic and Robust Optimization. | 12 |
| III | Linear Programming: Introduction, Simplex Method, Worked Example by Simplex Method, Interior-Point Method for LP. Integer Programming: Integer Linear Programming, LP Relaxation, Branch and Bound, Mixed Integer Programming, Applications of LP, IP, and MIP. Regression and Regularization: Sample Mean and Variance, Regression Analysis, Nonlinear Least Squares, Over-fitting and Information Criteria, Regularization and Lasso Method, Logistic Regression, Principal Component Analysis. | 12 |
| IV | Machine Learning Algorithms: Data Mining, Data Mining for Big Data, Artificial Neural Networks, Support Vector Machines, Deep Learning. Queueing Theory and Simulation: Introduction, Arrival Model, Service Model, Basic Queueing Model, Little’s Law, Queue Management and Optimization. Multiobjective Optimization: Introduction, Pareto Front and Pareto Optimality, Choice and Challenges, Transformation to Single Objective Optimization, The ε-Constraint Method, Evolutionary Approaches. | 12 |
| V | Constraint-Handling Techniques: Introduction and Overview, Method of Lagrange Multipliers, Barrier Function Method, Penalty Method, Equality Constraints via Tolerance, Feasibility Criteria, Stochastic Ranking, Multiobjective Constraint-Handling and Ranking. Evolutionary Algorithms: Evolutionary Computation, Evolutionary Strategy, Genetic Algorithms, Simulated Annealing, Differential Evolution. Nature-Inspired Algorithms: Introduction to SI, Ant and Bee Algorithms, Particle Swarm Optimization, Firefly Algorithm, Cuckoo Search, Bat Algorithm, Flower Pollination Algorithm, Other Algorithms. | 12 |

Books and References:

| Sr. No. | Title | Author/s | Publisher | Edition | Year |
| --- | --- | --- | --- | --- | --- |
| 1 | Optimization Techniques and Applications with Examples | Xin-She Yang | Wiley | 3rd | 2018 |
| 2 | Optimization Techniques | A.K. Malik, S.K. Yadav, S.R. Yadav | I.K. International Publishing House | 1st | 2012 |
| 3 | Optimization methods: from theory to design | Marco Cavazzuti | Springer | 1st | 2012 |
| 4 | Optimization Techniques | Chander Mohan, Kusum Deep | New Age International | 1st | 2009 |

Optimization Techniques Practical

M. Sc (Data Science) Semester II

Course Name: Optimization Techniques Practical

Course Code: PSDS2P4

Periods per week (1 Period is 60 minutes): 4

Credits: 2

| Evaluation System | Hours | Marks |
| --- | --- | --- |
| Practical Examination | 2 | 50 |
| Internal | -- | |

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm, covering the entire syllabus.
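As an illustration only, and not part of the prescribed list, a practical on the Unit II topic "Gradient-Based Methods" might look like the sketch below. It assumes fixed-step gradient descent on a small convex quadratic; the function names, the test function, and the step size are hypothetical choices made for the example.

```python
def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Minimise a function by fixed-step gradient descent.

    `grad` returns the gradient as a list; iteration stops when the
    gradient norm falls below `tol` or after `max_iter` steps.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        # Move against the gradient by a fixed step length.
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def grad_f(p):
    """Gradient of f(x, y) = (x - 3)^2 + 2 (y + 1)^2, minimised at (3, -1)."""
    x, y = p
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


minimum = gradient_descent(grad_f, [0.0, 0.0])
print(minimum)  # close to [3.0, -1.0]
```

A fixed step works here because the quadratic is well conditioned; on harder problems a line search or one of the quasi-Newton methods listed in the syllabus (such as BFGS) is usually preferred.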

Course Outcomes:

The learner will be able to:

Apply operations research techniques such as linear programming to industrial optimization problems.

Solve allocation problems using various OR methods.

## Page 28

Understand the characteristics of different types of decision-making environments and the appropriate decision-making approaches and tools to be used in each type.

Recognize competitive forces in the marketplace and develop appropriate reactions based on existing constraints and resources.

Evaluation Scheme

Internal Evaluation (40 Marks)

The internal assessment marks shall be awarded as follows:

1. 30 marks (any one of the following):

a. Written Test or

b. SWAYAM (Advanced Course) of minimum 20 hours and certification exam completed or

c. NPTEL (Advanced Course) of minimum 20 hours and certification exam completed or

d. Valid International Certifications (Prometric, Pearson, Certiport, Coursera, Udemy and the like)

e. Marks for one certification shall be awarded for one course only. For four courses, the students will have to complete four certifications.

2. 10 marks: Class participation, question-answer sessions during lectures, discussions

Suggested format of the question paper of 30 marks for the internal written test.

Q1. Attempt any two of the following: 16

a.

b.

c.

d.

Q2. Attempt any two of the following: 14

a.

b.

c.

d.

## Page 29

External Examination: (60 marks)

To be conducted by the University as per other M.Sc. programmes

All questions are compulsory

Q1 (Based on Unit 1) Attempt any two of the following: 12

a.

b.

c.

d.

Q2 (Based on Unit 2) Attempt any two of the following: 12

Q3 (Based on Unit 3) Attempt any two of the following: 12

Q4 (Based on Unit 4) Attempt any two of the following: 12

Q5 (Based on Unit 5) Attempt any two of the following: 12

Practical Evaluation (50 marks)

To be conducted by the University as per other M.Sc. programmes

A certified copy of the journal is essential to appear for the practical examination.

1. Practical Question 1: 20

2. Practical Question 2: 20

3. Journal: 5

4. Viva Voce: 5

OR

1. Practical Question: 40

2. Journal: 5

3. Viva Voce: 5