MU MSC DATASCIENCE SEMIII REVISED SYALLBUS 20211 1 Syllabus Mumbai University




Copy to : -
1. The Deputy Registrar, Academic Authorities Meetings and Services
(AAMS),
2. The Deputy Registrar, College Affiliations & Development
Department (CAD),
3. The Deputy Registrar, Admissions, Enrolment, Eligibility and
Migration Department (AEM),
4. The Deputy Registrar, Research Administration & Promotion Cell
(RAPC),
5. The Deputy Registrar, Executive Authorities Section (EA),
6. The Deputy Registrar, PRO, Fort, (Publication Section),
7. The Deputy Registrar, (Special Cell),
8. The Deputy Registrar, Fort/ Vidyanagari Administration Department
(FAD) (VAD), Record Section,
9. The Director, Institute of Distance and Open Learning (IDOL Admin),
Vidyanagari,
They are requested to treat this as the action taken report on the concerned
resolution adopted by the Academic Council referred to in the above circular
and that no separate Action Taken Report will be sent in this connection.

1. P.A to Hon’ble Vice-Chancellor,
2. P.A to Pro-Vice-Chancellor,
3. P.A to Registrar,
4. All Deans of all Faculties,
5. P.A to Finance & Account Officers, (F.& A.O),
6. P.A to Director, Board of Examinations and Evaluation,
7. P.A to Director, Innovation, Incubation and Linkages,
8. P.A to Director, Board of Lifelong Learning and Extension (BLLE),
9. The Director, Dept. of Information and Communication Technology
(DICT) (CCF & UCC), Vidyanagari,
10. The Director of Board of Student Development,
11. The Director, Department of Students Welfare (DSD),
12. All Deputy Registrar, Examination House,
13. The Deputy Registrars, Finance & Accounts Section,
14. The Assistant Registrar, Administrative sub-Campus Thane,
15. The Assistant Registrar, School of Engg. & Applied Sciences, Kalyan,
16. The Assistant Registrar, Ratnagiri sub-centre, Ratnagiri,
17. The Assistant Registrar, Constituent Colleges Unit,
18. BUCTU,
19. The Receptionist,
20. The Telephone Operator,
21. The Secretary MUASA

for information.



AC – 29/06/2021
Item No. - 6.39




UNIVERSITY OF MUMBAI













Syllabus

For the

Program: M.Sc. Semester-I and Semester-II CBCS

(REVISED)

Course: M.Sc. Computer Science with
Specialization in Data Science



(Choice Based and Credit System with effect from the
academic year 2021-22)


UNIVERSITY OF MUMBAI








Syllabus for Approval
Sr. No. | Heading | Particulars
1. | Title of the Course | Master in Computer Science with Specialization in Data Science
2. | Eligibility for Admission | A candidate with a minimum 50% score in Graduation can appear for the entrance examination, through which the admission merit list will be generated.
3. | Passing Marks | 40%
4. | Ordinances / Regulations (if any) | As applicable for all M.Sc. Courses
5. | Number of years / Semesters | Two years – Four Semesters
6. | Level | P.G. / U.G. / Diploma / Certificate (Strike out which is not applicable)
7. | Pattern | Yearly / Semester, Choice Based (Strike out which is not applicable)
8. | Status | New / Revised
9. | To be implemented from Academic year | From the Academic Year 2021 – 2022

Date: 28/06/2021



Dr. Jagdish Bakal Dr. Anuradha Majumdar

BoS Chairperson in Computer Science Dean, Science and Technology


PROGRAMME OUTCOME

1. Students will attain proficiency with statistical analysis of data.

2. Students will execute statistical analyses with professional statistical software.

3. Students will gain skills in data management.

4. Students will develop the ability to build and assess data-based models.

5. Students will apply data science concepts and methods to solve problems in real-world
contexts and will communicate these solutions effectively.


PROGRAMME SPECIFIC OUTCOMES (PSOs)

On completion of the M.Sc. Data Science programme, students will be able:

PO_01: To become a skilled Data Scientist in industry, academia, or government.
PO_02: To use specialised software tools for data storage, analysis and visualization.
PO_03: To independently carry out research/investigation to solve practical problems.
PO_04: To gain problem-solving ability: to assess social issues (ethical, financial, management,
analytical and scientific analysis) and engineering problems.
PO_05: To have a clear understanding of professional and ethical responsibility.
PO_06: To collaborate virtually.
PO_07: To have critical thinking and innovative skills.
PO_08: To translate vast data into abstract concepts and to understand database reasoning.




PROGRAMME STRUCTURE

Semester – I
Course Code | Course Title | Credits
PSDS101 | Programming Paradigms | 4
PSDS102 | Database Technologies | 4
PSDS103 | Fundamentals of Data Science | 4
PSDS104 | Statistical Methods for Data Science | 4
PSDS1P1 | Programming Paradigms Practical | 2
PSDS1P2 | Database Technologies Practical | 2
PSDS1P3 | Fundamentals of Data Science Practical | 2
PSDS1P4 | Statistical Methods for Data Science Practical | 2
Total Credits | 24


Semester – II
Course Code | Course Title | Credits
PSDS201 | Artificial Intelligence and Machine Learning | 4
PSDS202 | Soft Computing | 4
PSDS203 | Algorithms for Data Science | 4
PSDS204 | Optimization Techniques | 4
PSDS2P1 | Artificial Intelligence and Machine Learning Practical | 2
PSDS2P2 | Soft Computing Practical | 2
PSDS2P3 | Algorithms for Data Science Practical | 2
PSDS2P4 | Optimization Techniques Practical | 2
Total Credits | 24


DETAILED SYLLABUS FOR SEMESTER - I & SEMESTER - II
Semester – I

Programming Paradigms

M.Sc (Data Science) Semester – I
Course Name: Programming Paradigms Course Code: PSDS101
Periods per week (1 Period is 60 minutes) 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Internal -- 40

Course Objectives:
• To understand the basic building blocks of programming languages.
• To learn and understand various programming paradigms.

Unit Details Lectures
I Foundations: Language design, why to study programming languages, compilation and interpretation, programming environments.
Programming language syntax: Specifying syntax: regular expressions and context-free grammars (Tokens and regular expressions, Context-free grammars, Derivations and parse trees), Scanning (Generating a finite automaton, Scanner code, Table-driven scanning, Lexical errors, Pragmas), Parsing (Recursive descent, Writing an LL(1) grammar, Table-driven top-down parsing, Bottom-up parsing, Syntax errors)
12
II OBJECT ORIENTATION
Basic concepts: objects, classes, methods, overloading methods, messages. Inheritance: overriding methods, single inheritance, multiple inheritance. Interfaces, encapsulation, polymorphism.
12
III FUNCTIONAL PROGRAMMING
Definition of a function: domain and range, total and partial functions, strict functions. Recursion, Referential transparency, Side effects of functions
12
IV LOGIC PROGRAMMING
Basic constructs, Facts: queries, existential queries, conjunctive queries and rules. Definition and semantics of a logic program, Recursive programming: Computational model of logic programming, Goal reduction, Negation in logic programming
12
V SCRIPTING LANGUAGE
What is a scripting language, Problem domain (Shell languages, Text processing and report generation, Mathematics and statistics, General purpose scripting, Extension languages), Scripting the World Wide Web (CGI scripts, Embedded server-side script, client-side script, Java Applets, XSLT)
12



Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1. | Programming Language Pragmatics | Michael Scott | Morgan Kaufmann | 4th Edition | 2015
2. | Haskell: The Craft of Functional Programming | Simon Thompson | Addison-Wesley Professional | 2nd Edition | 2011
3. | Foundations of Programming Languages: Design & Implementation | Seyed Roosta | Cengage Learning | 3rd Edition | 2003
4. | Programming Languages: Concepts and Constructs | Ravi Sethi | Pearson Education | 3rd Edition | 2000



Programming Paradigms Practical



M. Sc. (Data Science) Semester – I
Course Name: Programming Paradigms Practical Course Code: PSDS1P1
Periods per week (1 Period is 60 minutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal -- --

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:
• To explore a range of modern programming languages and programming techniques.
• To select appropriate software development tools for given application environments.


Database Technologies

M.Sc (Data Science) Semester – I
Course Name: Database Technologies Course Code: PSDS102
Periods per week (1 Period is 60 minutes) 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Internal -- 40

Course Objectives:
The objective of the course is to present an introduction to database management systems, with
an emphasis on how to organize, maintain and retrieve information from a DBMS efficiently and
effectively.

Unit Details Lectures
I Database Concepts: Why Databases?, Data versus Information, Introducing the Database, Why Database Design Is Important, Evolution of File System Data Processing, Problems with File System Data Processing, Database Systems
Data Models: Data Modeling and Data Models, The Importance of Data Models, Data Model Basic Building Blocks, Business Rules, The Evolution of Data Models, Degrees of Data Abstraction
The Relational Database Model: A Logical View of Data, Keys, Integrity Rules, Relational Algebra, The Data Dictionary and the System Catalog, Relationships within the Relational Database, Data Redundancy Revisited
Entity Relationship (ER) Modeling: The Entity Relationship Model, Developing an ER Diagram, Database Design Challenges: Conflicting Goals
12
II Advanced Data Modelling: The Extended Entity Relationship Model, Entity Clustering, Design Cases: Learning Flexible Database Design
Normalization of Database Tables: Database Tables and Normalization, The Need for Normalization, The Normalization Process, Improving the Design
Introduction to Structured Query Language (SQL): Introduction to SQL, Basic SELECT Queries, SELECT Statement Options, FROM Clause Options, ORDER BY Clause Options, WHERE Clause Options, Aggregate Processing, Subqueries, SQL Functions, Relational Set Operators, Crafting SELECT Queries
Advanced SQL: Data Definition Commands, Creating Table Structures, Altering Table Structures, Data Manipulation Commands, Virtual Tables: Creating a View, Sequences, Procedural SQL, Embedded SQL
Transaction Management and Concurrency Control: What Is a Transaction?, Concurrency Control, Concurrency Control with Locking Methods, Concurrency Control with Time Stamping Methods, Concurrency Control with Optimistic Methods, ANSI Levels of Transaction Isolation, Database Recovery Management
12
III Three Database Revolutions: Early Database Systems, The First Database Revolution, The Second Database Revolution, The Third Database Revolution
Google, Big Data, and Hadoop: The Big Data Revolution, Google: Pioneer of Big Data, Hadoop: Open-Source Google Stack
Sharding, Amazon, and the Birth of NoSQL: Scaling Web 2.0, Amazon’s Dynamo
Document Databases: XML and XML Databases, JSON Document Databases
12
IV Tables are Not Your Friends: Graph Databases: What is a Graph?, RDBMS Patterns for Graphs, RDF and SPARQL, Property Graphs and Neo4j, Gremlin, Graph Database Internals, Graph Compute Engines
Column Databases: Data Warehousing Schemas, The Columnar Alternative, Sybase IQ, C-Store, and Vertica, Column Database Architectures
The End of Disk? SSD and In-Memory Databases: The End of Disk?, In-Memory Databases, Berkeley Analytics Data Stack and Spark
Distributed Database Patterns: Distributed Relational Databases, Nonrelational Distributed Databases, MongoDB Sharding and Replication, HBase, Cassandra
Consistency Models: Types of Consistency, Consistency in MongoDB, HBase Consistency, Cassandra Consistency
12
V Data Models and Storage: Data Models, Storage
Languages and Programming Interfaces: SQL, NoSQL APIs, The Return of SQL
Databases of the Future: The Revolution Revisited, Counterrevolutionaries, Can We Have It All?, Meanwhile, Back at Oracle HQ, Other Convergent Databases, Disruptive Database Technologies
12





Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1 | Database Systems: Design, Implementation, & Management | Carlos Coronel, Steven Morris | Cengage | 13th | 2018
2 | Next Generation Databases | Guy Harrison | Apress | 1st | 2015
3 | Advanced Database Technology and Design | Mario Piattini, Oscar Díaz | Artech House | 1st | 2000


Database Technologies Practical

M. Sc. (Data Science) Semester – I
Course Name: Database Technologies Practical Course Code: PSDS1P2
Periods per week (1 Period is 60 minutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal -- --

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:

Upon successful completion of this course, students should be able to:

• Describe the fundamental elements of relational database management systems
• Explain the basic concepts of the relational data model, entity-relationship model, relational
database design, relational algebra and SQL
• Design ER-models to represent simple database application scenarios
• Convert the ER-model to relational tables, populate a relational database and formulate SQL
queries on data.
• Improve the database design by normalization.




Fundamentals of Data Science

M.Sc (Data Science) Semester – I
Course Name: Fundamentals of Data Science Course Code: PSDS103
Periods per week (1 Period is 60 minutes) 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Internal -- 40


Course Objectives:
To provide a strong foundation for data science and its applications in related areas, and to
understand the underlying core concepts and emerging technologies in data science.


Unit Details Lectures
I Introduction to Data Science:
• What is Data? Kinds of data: e.g. static, spatial, temporal, text, media
• Introduction to high level programming language + Integrated Development Environment (IDE)
• Describing data: Exploratory Data Analysis (EDA) + Data Visualization - Summaries, aggregation, smoothing, distributions
• Data sources: e.g. relational databases, web/API, streaming
• Data collection: e.g. sampling, design (observational vs experimental) and its impact on visualization, modeling and generalizability of results
12
II Data analysis/modeling:
o Question/problem formation along with EDA
o Introduction to estimation and inference (testing and confidence intervals) including simulation and resampling
o Scope of inference
o Assessment and selection e.g. training and testing sets

Data Curation, Management and Organization-I
• Query languages and operations to specify and transform data (e.g. projection, selection, join, aggregate/group, summarize)
• Structured/schema based systems as users and acquirers of data
o Relational (SQL) databases, APIs and programmatic access, indexing
o XML and XPath, APIs for accessing and querying structured data contained therein
12




III Data Curation, Management and Organization-I
• Semi-structured systems as users and acquirers of data
o Access through APIs yielding JSON to be parsed and structured
• Unstructured systems in the acquisition and structuring of data
o Web Scraping
o Text/string parsing/processing to give structure

Data Curation, Management and Organization-II
• Security and ethical considerations in relation to authenticating and authorizing access to data on remote systems
• Software development tools (e.g. GitHub, version control)
12
IV Data Curation, Management and Organization-II
• Large scale data systems
o Paradigms for distributed data storage
o Practical access to example systems (e.g. MongoDB, HBase, NoSQL systems)
o Amazon Web Services (AWS) provides public data sets in Landsat, genomics, multimedia
Introduction to Statistical Models
• Simple Linear Regression
• Multiple Linear Regression
• Logistic Regression
• Review of hypothesis testing, confidence intervals, etc.
• Estimation e.g. likelihood principle, Bayes
12
V Introduction to Statistical Models
• Linear models
o Regression theory i.e. least-squares: Introduction to estimation principles
o Multiple regression
▪ Transformations, model selection
▪ Interactions, indicator variables, ANOVA
o Generalized linear models e.g. logistic, etc.
▪ Alternatives to classical regression e.g. trees, smoothing/splines
• Introduction to model selection
o Regularization, bias/variance tradeoff e.g. parsimony, AIC, BIC
o Cross validation
o Ridge regression and penalized regression e.g. LASSO
12





Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1 | Hands-On Programming with R | Garrett Grolemund | O'Reilly | 1st | 2014
2 | Doing Data Science | Rachel Schutt, Cathy O’Neil | O'Reilly Media | 1st | 2013
3 | An Introduction to Statistical Learning with Applications in R | Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani | Springer US | 2nd | 2021
4 | Applied Predictive Modelling | M. Kuhn, K. Johnson | Springer New York | 3rd | 2019
5 | Mastering Machine Learning with R | Cory Lesmeister | Packt Publishing | 2nd | 2015



Fundamentals of Data Science Practical

M. Sc (Data Science) Semester – I
Course Name: Fundamentals of Data Science Practical Course Code: PSDS1P3
Periods per week
1 Period is 60 minutes Lectures 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50


Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.


Course Outcomes:
• The students will be able to independently carry out research/investigation to solve
practical problems
• The students should be able to understand and comprehend the problem, and should be
able to define a suitable statistical method to be adopted.


Statistical Methods for Data Science


M. Sc (Data Science) Semester – I
Course Name: Statistical Methods for Data Science Course Code: PSDS104
Periods per week
1 Period is 60 minutes Lectures 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Theory Internal -- 40


Pre requisites Knowledge of statistics and mathematical concepts

Course Objectives:

1. To present the mathematical, statistical and computational challenges of building neural
networks
2. To study the concepts of deep learning
3. To enable the students to know deep learning techniques to support real-time applications



Unit Details Lectures
I Introduction to Applied Statistics: The Nature of Statistics and Inference, What is “Big Data”?, Statistical Modelling, Statistical Significance Testing and Error Rates, Simple Example of Inference Using a Coin, Statistics Is for Messy Situations, Type I versus Type II Errors, Point Estimates and Confidence Intervals, Variable Types, Sample Size, Statistical Power, and Statistical Significance, The Verdict on Significance Testing, Training versus Test Data.
12
II Computational Statistics: Vectors and Matrices, The Inverse of a Matrix, Eigenvalues and Eigenvectors

Means, Correlations, Counts: Drawing Inferences: Computing z and Related Scores, Statistical Tests, Plotting Normal Distributions, Correlation Coefficients, Evaluating Pearson’s r for Statistical Significance, Spearman’s Rho: A Nonparametric Alternative to Pearson, Tests of Mean Differences, t-Tests for One Sample, Two-Sample t-Test, Paired-Samples t-Test, Categorical Data, Binomial Test, Categorical Data Having More Than Two Possibilities.
12
III Power Analysis and Sample Size Estimation: Power for t-Tests, Power for One-Way ANOVA, Power for Correlations.

Analysis of Variance: Fixed Effects, Random Effects, Mixed Models, Introducing the Analysis of Variance (ANOVA), Performing the ANOVA, Random Effects ANOVA and Mixed Models, One-Way Random Effects ANOVA
Simple and Multiple Linear Regression: Simple Linear Regression, Multiple Regression Analysis, Hierarchical Regression, How Forward Regression Works
12
IV Logistic Regression and the Generalized Linear Model: Logistic Regression, Predicting Probabilities, Multiple Logistic Regression, Training Error Rate Versus Test Error Rate.

Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis: Multivariate Tests of Significance, Example of MANOVA, Outliers, Homogeneity of Covariance Matrices, Linear Discriminant Function Analysis, Theory of Discriminant Analysis, Predicting Group Membership, Visualizing Separation
12
V Principal Component Analysis: Principal Component Analysis Versus Factor Analysis, Properties of Principal Components, Component Scores, How Many Components to Keep?, Exploratory Factor Analysis, Common Factor Analysis Model, Factor Analysis Versus Principal Component Analysis on the Same, Initial Eigenvalues in Factor Analysis, Rotation in Exploratory Factor Analysis, Estimation in Factor Analysis

Cluster Analysis: k-Means Cluster Analysis, Minimizing Criteria, Example of k-Means Clustering, Hierarchical Cluster Analysis, Why Clustering Is Inherently Subjective, Nonparametric Tests, Mann–Whitney U Test, Kruskal–Wallis Test, Nonparametric Test for Paired Comparisons and Repeated
12
Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
01 | Univariate, Bivariate, and Multivariate Statistics Using R | Daniel J. Denis | Wiley | 1st | 2020
02 | Practical Data Science | Andreas François Vermeulen | APress | 1st | 2018
03 | Data Science from Scratch: First Principles with Python | Joel Grus | Shroff Publishers | 1st | 2017
04 | Experimental Design in Data Science with Least Resources | N C Das | Shroff Publishers | 1st | 2018


Statistical Methods for Data Science Practical

M. Sc (Data Science) Semester – I
Course Name: Statistical Methods for Data Science Practical Course Code: PSDS1P4
Periods per week
1 Period is 60 minutes Lectures 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 40


Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:

On successful completion of the course, the student will be able to:


• Describe the basics of the mathematical foundation that will help the learner to understand the
concepts of Deep Learning.
• Understand and describe models of deep learning
• Design and implement various deep supervised learning architectures for text & image
data.
• Design and implement various deep learning models and architectures.
• Apply various deep learning techniques to design efficient algorithms for real-world
applications.


SEMESTER-II

Artificial Intelligence and Machine Learning

M. Sc (Data Science) Semester – II
Course Name: Artificial Intelligence and Machine Learning Course Code: PSDS201
Periods per week
1 Period is 60 minutes Lectures 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Theory Internal -- 40


Pre requisites Knowledge of Algorithms and mathematical foundation

Course Objectives:
• To provide the foundations for AI problem-solving techniques and knowledge
representation formalisms
• Understanding human learning aspects.
• Understanding primitives in the learning process by computer.
• Understanding the nature of problems solved with Machine Learning

Unit Details Lectures
I Introduction to AI:
The AI problems, AI technique, philosophy and development of Artificial Intelligence.
Minimax algorithm, alpha-beta pruning, stochastic games, Constraint-satisfaction problems.
Knowledge and Reasoning: Logical agents, Propositional logic, First-order logic, Inference in FoL: forward chaining, backward chaining, resolution, Knowledge representation: Frames, Ontologies, Semantic web and RDF.
12
II Introduction to PROLOG: Facts and predicates, data types, goal finding, backtracking, simple object, compound objects, use of cut and fail predicates, recursion, lists, simple input/output, dynamic database.
Machine Learning: Machine learning, Examples of Machine Learning Problems, Structure of Learning, Learning versus Designing, Training versus Testing, Characteristics of Machine learning tasks, Predictive and descriptive tasks, Machine learning Models: Geometric Models, Logical Models, Probabilistic Models. Features: Feature types, Feature Construction and Transformation, Feature Selection
12
III Classification and Regression:
Classification: Binary Classification - Assessing Classification performance, Class probability Estimation - Assessing class probability Estimates, Multiclass Classification.
Regression: Assessing performance of Regression - Error measures, Overfitting - Catalysts for Overfitting, Case study of Polynomial Regression.
Theory of Generalization: Effective number of hypotheses, Bounding the Growth function, VC Dimensions, Regularization theory.
12
IV Linear Models:
Least Squares method, Multivariate Linear Regression, Regularized Regression, Using Least Squares regression for Classification. Perceptron, Support Vector Machines, Soft Margin SVM, Obtaining probabilities from Linear classifiers, Kernel methods for non-Linearity.
Logic Based and Algebraic Model:
Distance Based Models: Neighbours and Examples, Nearest Neighbours Classification, Distance based clustering - K-means Algorithm, Hierarchical clustering
12
V Rule Based Models: Rule learning for subgroup discovery, Association rule mining.
Tree Based Models: Decision Trees, Ranking and Probability estimation Trees, Regression trees, Clustering Trees.
Probabilistic Model:
Normal Distribution and Its Geometric Interpretations, Naïve Bayes Classifier, Discriminative learning with Maximum likelihood, Probabilistic Models with Hidden variables: Expectation-Maximization Methods, Gaussian Mixtures, and Compression based Models.
12
Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
01 | Artificial Intelligence | Elaine Rich, Kevin Knight | Tata McGraw Hill | 3rd | 2017
02 | Machine Learning: The Art and Science of Algorithms that Make Sense of Data | Peter Flach | Cambridge University Press | 1st | 2012
03 | Introduction to Statistical Machine Learning with Applications in R | Hastie, Tibshirani, Friedman | Springer | 2nd | 2012
04 | Introduction to Machine Learning | Ethem Alpaydin | PHI | 2nd | 2013


Artificial Intelligence and Machine Learning Practical

M. Sc (Data Science) Semester – II
Course Name: Artificial Intelligence and Machine Learning Practical Course Code: PSDS2P1
Periods per week (1 Period is 60 minutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal -- -

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:
• Understand the key issues and concepts in Artificial Intelligence.
• Acquire knowledge about classification and regression techniques, where a learner
will be able to explore his skill to generate database knowledge using the prescribed
techniques.
• Understand and implement the techniques for extracting knowledge using machine
learning methods.
• Achieve adequate perspectives of big data analytics in various applications like
recommender systems, social media applications etc.
• Understand the statistical approach related to machine learning. He will also apply the
algorithms to a real-world problem, optimize the models learned and report on the
expected accuracy that can be achieved by applying the models.

Soft Computing

M. Sc (Data Science) Semester – II
Course Name: Soft Computing Course Code: PSDS202
Periods per week
1 Period is 60 minutes Lectures 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Theory Internal -- 40

Course Objectives:
• To introduce soft computing concepts such as fuzzy logic, neural networks and genetic algorithms,
of which Artificial Intelligence is the parent branch.
• To show how these techniques can be applied to solve problems efficiently.



Unit Details Lectures
I Artificial Neural Network: Fundamental concepts, Evolution of neural networks, basic model of Artificial Neural Network, Important terminologies, McCulloch-Pitts neuron, linear separability, Hebb network

Supervised Learning Network: Perceptron networks, Adaline, Madaline, Backpropagation network, Radial Basis Function, Time Delay Network, Functional Link Networks, Tree Neural Network.
12
II Unsupervised Learning Networks: Fixed weight competitive nets, Kohonen self-organizing feature maps, learning vector quantization, counterpropagation networks, adaptive resonance theory networks.

Associative Memory Networks: Training algorithm for pattern association, autoassociative memory network, heteroassociative memory network, bi-directional associative memory, Hopfield networks, iterative autoassociative memory networks, temporal associative memory networks.
12
III Special Networks: Simulated annealing, Boltzmann machine, Gaussian Machine, Cauchy Machine, Probabilistic neural net, cascade correlation network, cognitron network, neocognitron network, cellular neural network, optical neural network
Third Generation Neural Networks:
Spiking Neural networks, convolutional neural networks, deep learning neural networks, extreme learning machine model.
12
IV Introduction to Fuzzy Logic, Classical sets, Fuzzy sets, Classical Relations and Fuzzy Relations:
Cartesian Product of relation, classical relation, fuzzy relations, tolerance and equivalence relations, noninteractive fuzzy sets.
Membership Function: features of the membership functions, fuzzification and methods of membership value assignments.
Defuzzification: Lambda-cuts for fuzzy sets, Lambda-cuts for fuzzy relations, Defuzzification methods.
Fuzzy Arithmetic and Fuzzy measures: fuzzy arithmetic, fuzzy measures, measures of fuzziness, fuzzy integrals.
12
V Genetic Algorithm: Biological Background, Traditional optimization and search techniques, genetic algorithm and search space, genetic algorithm vs. traditional algorithms, basic terminologies, simple genetic algorithm, general genetic algorithm, operators in genetic algorithm, stopping condition for genetic algorithm flow, constraints in genetic algorithm, problem solving using genetic algorithm, the schema theorem, classification of genetic algorithm, Holland classifier systems, genetic programming, advantages and limitations and applications of genetic algorithm
12








Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1. | Artificial Intelligence and Soft Computing | Anandita Das Battacharya | SPD | 3rd | 2018
2. | Principles of Soft Computing | S.N.Sivanandam, S.N.Deepa | Wiley | 3rd | 2019
3. | Neuro-Fuzzy and Soft Computing | J.S.R.Jang, C.T.Sun and E.Mizutani | Prentice Hall of India | 1st | 2004
4. | Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis & Applications | S.Rajasekaran, G. A. Vijayalakshami | Prentice Hall of India | 1st | 2004
5. | Fuzzy Logic with Engineering Applications | Timothy J.Ross | McGraw-Hill | 1st | 1997
6. | Genetic Algorithms: Search, Optimization and Machine Learning | David E.Goldberg | Addison Wesley | 1st | 1989
7. | Introduction to AI and Expert System | Dan W. Patterson | Prentice Hall of India | 2nd | 2009



Soft Computing Practical

M. Sc (Data Sci ence) Semest er – II
Course Name:Soft Compu ting Practical Course Code:PSDS2P2
Periods p er week (1 Pe riod is 60 m inutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal -- -

Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcome:
• Identify and describe soft computing techniques and their roles in building intelligent
machines
• Recognize the feasibility of applying a soft computing methodology for a particular
problem
• Apply fuzzy logic and reasoning to handle uncertainty and solve engineering problems
and also apply neural networks for classification and regression problems
• Apply genetic algorithms to combinatorial optimization problems
• Evaluate and compare solutions by various soft computing approaches for a given
problem.



Algorithms for Data Science

M. Sc (Data Science) Semester – II
Course Name: Algorithms for Data Science Course Code: PSDS203
Periods per week (1 Period is 60 minutes) 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Internal -- 40

Course Objectives:

The course is aimed at:
• focusing on the principles of data reduction and core algorithms for analysing the data of
data science
• providing many opportunities to develop and improve programming skills
• applying algorithms to real-world data sets
• imparting design thinking capability to build big-data

Unit Details Lectures
I Introduction: What Is Data Science?, Diabetes in America, Authors of the
Federalist Papers, Forecasting NASDAQ Stock Prices, Algorithms,
Python, R, Terminology and Notation
Data Mapping and Data Dictionaries: Data Reduction, Political
Contributions, Dictionaries, Tutorial: Big Contributors, Data Reduction,
Election Cycle Contributions, Similarity Measures, Computing Similarity
Scalable Algorithms and Associative Statistics: Introduction, Associative
Statistics, Univariate Observations, Functions, Histogram Construction,
Multivariate Data, Computing the Correlation Matrix, Linear Regression,
Computing β 12
II Hadoop and MapReduce: Introduction, The Hadoop Ecosystem,
Medicare Payments, The Command Line Environment, Programming a
MapReduce Algorithm, Using Amazon Web Services
Data Visualization: Introduction, Principles of Data Visualization,
Making Good Choices, Harnessing the Machine 12
III Linear Regression Methods: Introduction, The Linear Regression Model,
Introduction to R, Large Data Sets and R, Factors, Analysis of Residuals
Healthcare Analytics: Introduction, The Behavioral Risk Factor
Surveillance System, Diabetes Prevalence and Incidence, Predicting
At-Risk Individuals, Identifying At-Risk Individuals, Unusual Demographic
Attribute Vectors, Building Neighborhood Sets 12
IV Cluster Analysis: Introduction, Hierarchical Agglomerative Clustering,
Comparison of States, Hierarchical Clustering of States, The k-Means
Algorithm
k-Nearest Neighbor Prediction Functions: Introduction, Notation and
Terminology, Distance Metrics, The k-Nearest Neighbor Prediction
Function, Exponentially Weighted k-Nearest Neighbors, Digit
Recognition, Accuracy Assessment, k-Nearest Neighbor Regression,
Forecasting the S&P 500, Forecasting by Pattern Recognition,
Cross-Validation
The Multinomial Naïve Bayes Prediction Function: Introduction, The
Federalist Papers, The Multinomial Naïve Bayes Prediction Function,
Reducing the Federalist Papers, Predicting Authorship of the Disputed
Federalist Papers, Customer Segmentation 12
V Forecasting: Introduction, Working with Time, Analytical Methods,
Computing ρτ, Drift and Forecasting, Holt-Winters Exponential
Forecasting, Regression-Based Forecasting of Stock Prices,
Time-Varying Regression Estimators
Real-time Analytics: Introduction, Forecasting with a NASDAQ
Quotation Stream, Forecasting the Apple Inc. Stream, The Twitter
Streaming API, Sentiment Analysis, Sentiment Analysis of Hashtag
Groups 12

Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1 | Algorithms for Data Science | Brian Steele, John Chandler, Swarna Reddy | Springer | 1st | 2016
2 | Data Science Algorithms in a Week | David Natingga | Packt Publishing | 1st | 2017
3 | Data Science: Theories, Models, Algorithms and Analytics | Sanjiv Ranjan Das | S.R. Das | 1st | 2017

Algorithms for Data Science Practical

M. Sc (Data Science) Semester – II
Course Name: Algorithms for Data Science Practical Course Code: PSDS2P3
Periods per week (1 Period is 60 minutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal --



Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:
At the end of the course the student should be able to:
• Understand fundamentals of data science
• Apply data visualisation in big-data analytics
• Apply Hadoop and MapReduce algorithms to big data
• Apply different algorithms to data sets
• Perform real-time analytics

Optimization Techniques

M. Sc (Data Science) Semester – II
Course Name: Optimization Techniques Course Code: PSDS204
Periods per week (1 Period is 60 minutes) 4
Credits 4
Hours Marks
Evaluation System Theory Examination 2½ 60
Internal -- 40

Course Objectives:

• To familiarize the students with some basic concepts of optimization techniques and
approaches.
• To formulate a real-world problem as a mathematical programming model.
• To develop model formulations and applications that are used in solving decision problems.
• To solve specialized linear programming problems like the transportation and assignment
problems.

Unit Details Lectures
I Mathematical Foundations: Functions and Continuity, Review of
Calculus, Vectors, Matrix Algebra, Eigenvalues and Eigenvectors,
Optimization and Optimality, General Formulation of Optimization
Problems
Algorithms, Complexity, and Convexity: What Is an Algorithm?, Order
Notations, Convergence Rate, Computational Complexity, Convexity,
Stochastic Nature in Algorithms 12
II Optimization: Unconstrained Optimization, Gradient-Based Methods,
Gradient-Free Nelder–Mead Method
Constrained Optimization: Mathematical Formulation, Lagrange
Multipliers, Slack Variables, Generalized Reduced Gradient Method,
KKT Conditions, Penalty Method
Optimization Techniques: Approximation Methods: BFGS Method,
Trust-Region Method, Sequential Quadratic Programming, Convex
Optimization, Equality Constrained Optimization, Barrier Functions,
Interior-Point Methods, Stochastic and Robust Optimization 12
III Linear Programming: Introduction, Simplex Method, Worked Example by
Simplex Method, Interior-Point Method for LP
Integer Programming: Integer Linear Programming, LP Relaxation,
Branch and Bound, Mixed Integer Programming, Applications of LP, IP,
and MIP
Regression and Regularization: Sample Mean and Variance, Regression
Analysis, Nonlinear Least Squares, Over-fitting and Information Criteria,
Regularization and Lasso Method, Logistic Regression, Principal
Component Analysis 12
IV Machine Learning Algorithms: Data Mining, Data Mining for Big Data,
Artificial Neural Networks, Support Vector Machines, Deep Learning
Queueing Theory and Simulation: Introduction, Arrival Model, Service
Model, Basic Queueing Model, Little’s Law, Queue Management and
Optimization
Multiobjective Optimization: Introduction, Pareto Front and Pareto
Optimality, Choice and Challenges, Transformation to Single Objective
Optimization, The ε-Constraint Method, Evolutionary Approaches 12
V Constraint-Handling Techniques: Introduction and Overview, Method of
Lagrange Multipliers, Barrier Function Method, Penalty Method, Equality
Constraints via Tolerance, Feasibility Criteria, Stochastic Ranking,
Multiobjective Constraint-Handling and Ranking
Evolutionary Algorithms: Evolutionary Computation, Evolutionary
Strategy, Genetic Algorithms, Simulated Annealing, Differential
Evolution
Nature-Inspired Algorithms: Introduction to SI, Ant and Bee Algorithms,
Particle Swarm Optimization, Firefly Algorithm, Cuckoo Search, Bat
Algorithm, Flower Pollination Algorithm, Other Algorithms 12



Books and References:
Sr. No. | Title | Author/s | Publisher | Edition | Year
1 | Optimization Techniques and Applications with Examples | Xin-She Yang | Wiley | 3rd | 2018
2 | Optimization Techniques | A.K. Malik, S.K. Yadav, S.R. Yadav | I.K. International Publishing House | 1st | 2012
3 | Optimization methods: from theory to design | Marco Cavazzuti | Springer | 1st | 2012
4 | Optimization Techniques | Chander Mohan, Kusum Deep | New Age International | 1st | 2009

Optimization Techniques Practical

M. Sc (Data Science) Semester – II
Course Name: Optimization Techniques Practical Course Code: PSDS2P4
Periods per week (1 Period is 60 minutes) 4
Credits 2
Hours Marks
Evaluation System Practical Examination 2 50
Internal --


Practical:

Perform a minimum of ten practicals based on the basic concepts of each programming paradigm,
covering the entire syllabus.

Course Outcomes:
Learner will be able to
• Apply operations research techniques like linear programming to industrial
optimization problems.
• Solve allocation problems using various OR methods.
• Understand the characteristics of different types of decision-making environments and the
appropriate decision-making approaches and tools to be used in each type.
• Recognize competitive forces in the marketplace and develop appropriate reactions based on
existing constraints and resources.

Evaluation Scheme

Internal Evaluation (40 Marks)

The internal assessment marks shall be awarded as follows:
1. 30 marks (Any one of the following):
a. Written Test or
b. SWAYAM (Advanced Course) of minimum 20 hours and certification exam
completed or
c. NPTEL (Advanced Course) of minimum 20 hours and certification exam
completed or
d. Valid International Certifications (Prometric, Pearson, Certiport, Coursera,
Udemy and the like)
e. Marks for one certification shall be awarded for one course only. For four courses, the
students will have to complete four certifications.
2. 10 marks: Class participation, Question answer sessions during lectures, Discussions

Suggested format of Question paper of 30 marks for the Internal written test.

Q1. Attempt any two of the following: 16
a.
b.
c.
d.

Q2. Attempt any two of the following: 14
a.
b.
c.
d.


External Examination: (60 marks)
To be conducted by University as per other M.Sc. Programmes
All questions are compulsory
Q1 (Based on Unit 1) Attempt any two of the following: 12
a.
b.
c.
d.

Q2 (Based on Unit 2) Attempt any two of the following: 12
Q3 (Based on Unit 3) Attempt any two of the following: 12
Q4 (Based on Unit 4) Attempt any two of the following: 12
Q5 (Based on Unit 5) Attempt any two of the following: 12

Practical Evaluation (50 marks)
To be conducted by University as per other M.Sc. Programmes

A certified copy of the journal is essential to appear for the practical examination.

1. Practical Question 1 20
2. Practical Question 2 20
3. Journal 5
4. Viva Voce 5

OR

1. Practical Question 40
2. Journal 5
3. Viva Voce 5