Curriculum

This page lists the courses I’ve taken throughout my journey of continuous self-education, from my PhD years onward. The list is not intended to cover a university-level program. However, I hope it may serve as a useful guideline to broaden or deepen your learning, or to bridge any gaps, on the way to proficiency in Machine Learning, Natural Language Processing, and Computer Science fundamentals (in Python).

While the number of subjects and courses might be overwhelming, it’s practical and entirely doable to start by taking 1-2 courses from each subject that focus on different aspects (e.g., theory vs. practice) or areas.

Table of Contents

  1. NLP
  2. ML and Learning Theory
  3. Neural Networks/NNs/DL
  4. Math for ML
  5. Probability Theory
  6. Statistics/Statistical Inference
  7. Python
  8. Data Science Practice
    1. Git
    2. Kaggle
  9. Python for Data Science: Pandas/NumPy/IPython
  10. Computer Science/Algorithms and Data Structures
  11. Scientific Paper Writing
  12. Other Useful Subjects

NLP

Traditional NLP Algorithms

Book by Dan Jurafsky and James H. Martin Speech and Language Processing (2020, in progress)

Deeplearning.ai@Coursera Natural Language Processing Specialization, Courses 1 & 2

CMU 11-711 “Algorithms for NLP”

  • program, tasks, book recommendations (NO videos!)

DL in NLP

Book by Yoav Goldberg, A Primer on Neural Network Models for Natural Language Processing (2015)

Stanford CS224n: Natural Language Processing with Deep Learning

CMU CS 11-747 Neural Networks for NLP –more recent theory

Deeplearning.ai@Coursera Natural Language Processing Specialization, Courses 3 & 4

ML and Learning Theory

Book by Hal Daumé III, A Course in Machine Learning (2017)

Caltech Learning from Data by Prof. Yaser Abu-Mostafa –fundamental/theoretical

  • program, tasks, and videos

Cornell CS4780 Machine Learning by Prof. Kilian Weinberger –this course and the Caltech one complement each other perfectly for the fundamentals of traditional ML and Learning Theory

Stanford@Coursera Machine Learning by Andrew Ng –very popular, less in-depth theory

Johns Hopkins University@Coursera Data Science: Statistics and Machine Learning Specialization by Brian Caffo –applied data science, in R!

Neural Networks/NNs/DL

Book Deep Learning (2017) by Ian Goodfellow, Yoshua Bengio, Aaron Courville

Book Dive Into Deep Learning (2020) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola

CMU 11-785 Intro to Deep Learning by Bhiksha Raj –fundamental

DeepLearning.ai@Coursera Deep Learning Specialization

PyTorch

Book Deep Learning with PyTorch (2020) by Eli Stevens and Thomas Viehmann

Math for ML

These courses are more advanced and aimed at in-depth understanding.

Imperial College London@Coursera Mathematics for Machine Learning Specialization

  • program, tasks, and videos

MIT OCW Matrix Methods in Data Analysis, Signal Processing, and Machine Learning –very fundamental (lots of math!)

  • program, tasks, and videos

Probability Theory (Math for Statistics and Learning Theory)

MIT OCW Probabilistic Systems Analysis and Applied Probability by Prof. John Tsitsiklis

MIT@EdX Probability - The Science of Uncertainty and Data by Prof. John Tsitsiklis –same as above but might be updated; starts only on certain dates and is not available outside running sessions!

  • program, tasks, and videos

METU Probability And Random Variables by Prof. Elif Uysal –faster paced than MIT; does NOT cover HMMs, processes, or the intro to statistical inference

  • program and videos (NO tasks)

Statistics/Statistical Inference

Book NIST/SEMATECH e-Handbook of Statistical Methods

Book (selected chapters) Handbook of Biological Statistics by John H. McDonald: hypothesis testing chapter

Book An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics)

Book The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) –an all-time classic

MIT OCW Statistics for Applications (renamed to Fundamentals of Statistics) by Philippe Rigollet –fundamental; only after mastering Probability Theory

  • program, tasks, and videos

MIT@EdX Fundamentals of Statistics by Philippe Rigollet –same as above but might be updated; starts only on certain dates and is not available outside running sessions!

  • program, tasks, and videos

Johns Hopkins University@Coursera Advanced Statistics for Data Science Specialization by Brian Caffo –in R

  • program, tasks, and videos

Python

Google@Coursera Crash Course on Python –the bare basics

MIT@EdX Introduction to Computer Science and Programming Using Python –more comprehensive; starts only on certain dates and is not available outside running sessions!

MIT@EdX XSeries Computational Thinking using Python –note that this is a paid course ($)

Python Practice

Python on HackerRank

Data Science Practice

Git

Git & GitHub Tutorial for Beginners by The Net Ninja –or anything similar; there are plenty online

Kaggle

Introductory tasks:

  • https://www.kaggle.com/c/titanic
  • https://www.kaggle.com/c/house-prices-advanced-regression-techniques

Educative.io Grokking Data Science: Chapter 4. End-to-End Machine Learning Project –a walk through a Kaggle competition

Python for Data Science: Pandas/NumPy/IPython

Book Python for Data Analysis. Data Wrangling with Pandas, NumPy, and IPython.

Book High Performance Python, 2nd Edition by Micha Gorelick, Ian Ozsvald (2020)

Coursera Applied Data Science with Python Specialization

Coursera Pandas Python Library for Beginners/Intermediate in Data Science

Harvard@EdX Using Python for Research –covers NumPy, Scikit-learn

Educative.io From Python to Numpy –1 month free; further subscription for $

Computer Science/Algorithms and Data Structures

Preparation for typical programming interview questions.

Book Algorithms, 4th Edition (2020) by Robert Sedgewick and Kevin Wayne –in Java

Coursera Algorithms I, II by Robert Sedgewick and Kevin Wayne (authors of the Algorithms book) –in Java

MIPT Algorithms and Data Structures in Python 3 (Алгоритмы и структуры данных на Python 3) by Timofei Khiryanov –in Russian! My personal favorite

MIT OCW 6.006 Introduction to Algorithms by Prof. Erik Demaine –arguably another of the best courses on algorithms and data structures

Udacity https://www.udacity.com/course/intro-to-theoretical-computer-science--cs313

Harvard@EdX CS50’s Introduction to Computer Science

Scientific Paper Writing

École Polytechnique@Coursera How to Write and Publish a Scientific Paper (Project-Centered Course)

Tsinghua University@EdX Writing, Presenting and Submitting Scientific Papers in English

Other Useful Subjects

The following subjects range from fundamental ones, typically taught in a Bachelor’s degree program but useful to refresh or revisit, to more in-depth graduate-level courses that may be useful for applied NLP only to a certain extent and only in selected chapters.

  • Linear Algebra
  • Partial Differential Equations
  • Measure-Theoretic Probability
  • Convex Optimization
  • Statistical Inference –basically, similar to some of the fundamental readings on ML theory suggested above
  • Discrete Mathematics
  • Scientific Computing/Numerical Analysis
  • Data Structures and Algorithms
  • Software Design Paradigms in Python and C++
  • Stochastic Calculus
  • Stochastic Optimization
  • Managing/Analyzing Large Data Sets
  • Parallel/Distributed Computing
  • Deep Learning (start with Intro to Deep Learning by Bhiksha Raj and then decide where to advance)
  • Reinforcement Learning
  • One domain/practical project course focused on modeling/algorithms
  • One domain/practical project course focused on big data (Stanford’s CS246: Mining Massive Data Sets, videos available here)

Contact

linkedin