This page lists the courses I’ve taken throughout my journey of continuous self-education, from my PhD years onward. The list is not intended to cover a university-level program. However, I hope it can serve as a useful guideline for broadening or deepening your knowledge, or bridging gaps in it, on the way to proficiency in Machine Learning, Natural Language Processing, and Computer Science fundamentals (in Python).
While the number of subjects and courses might be overwhelming, it’s practical and entirely doable to start by taking 1-2 courses from each subject that focus on different aspects (e.g., theory vs. practice) or areas.
Table of Contents
- NLP
- ML and Learning Theory
- Neural Networks/NNs/DL
- Math for ML
- Probability Theory
- Statistics/Statistical Inference
- Python
- Data Science Practice
- Python for Data Science: Pandas/NumPy/IPython
- Computer Science/Algorithms and Data Structures
- Scientific Paper Writing
- Other Useful Subjects
NLP
Traditional NLP Algorithms
Book by Dan Jurafsky and James H. Martin Speech and Language Processing (2020, in progress)
DeepLearning.ai@Coursera Natural Language Processing Specialization, Courses 1 & 2
CMU 11-711 “Algorithms for NLP”
- program, tasks, books recommendation (NO videos!)
DL in NLP
Book A Primer on Neural Network Models for Natural Language Processing (2015) by Yoav Goldberg
Stanford CS224n: Natural Language Processing with Deep Learning
CMU CS 11-747 Neural Networks for NLP –covers more recent theory
DeepLearning.ai@Coursera Natural Language Processing Specialization, Courses 3 & 4
ML and Learning Theory
Book by Hal Daumé III, A Course in Machine Learning (2017)
Caltech Learning from Data by Prof. Yaser Abu-Mostafa –fundamental/theoretical
- program, tasks, and videos
Cornell CS4780 Machine Learning by Prof. Kilian Weinberger –this course and the Caltech one complement each other perfectly for the fundamentals of traditional ML & Learning Theory
Stanford@Coursera Machine Learning by Andrew Ng –very popular, less in-depth theory
Johns Hopkins University@Coursera Data Science: Statistics and Machine Learning Specialization by Brian Caffo –in R!!! applied data science
Neural Networks/NNs/DL
Book Deep Learning (2016) by Ian Goodfellow, Yoshua Bengio, Aaron Courville
Book Dive Into Deep Learning (2020) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
CMU 11-785 Intro to Deep Learning by Bhiksha Raj –fundamental
- program, tasks, videos, books
- videos Fall 2019 (actually, the 2020 course run might be better, as the classes were led from home and the instructor used the time more freely)
DeepLearning.ai@Coursera Deep Learning Specialization
PyTorch
Book Deep Learning with PyTorch (2020) by Eli Stevens and Thomas Viehmann
Math for ML
These are more advanced courses, for in-depth understanding.
Imperial College London@Coursera Mathematics for Machine Learning Specialization
- program, tasks, and videos
MIT OCW Matrix Methods in Data Analysis, Signal Processing, and Machine Learning –very fundamental (lots of math!)
- program, tasks, and videos
Probability Theory (Math for Statistics and Learning Theory)
MIT OCW Probabilistic Systems Analysis and Applied Probability by Prof. John Tsitsiklis
MIT@EdX Probability - The Science of Uncertainty and Data by Prof. John Tsitsiklis –same as above but might be updated; only starts on certain dates and is not available outside running sessions!
- program, tasks, and videos
METU Probability And Random Variables by Prof. Elif Uysal –faster paced than MIT; NO HMM, Processes, Intro to Stat Inference
- program and videos (NO tasks)
Statistics/Statistical Inference
Book NIST/SEMATECH e-Handbook of Statistical Methods
Book (selected chapters) Handbook of Biological Statistics by John H. McDonald: hypothesis testing chapter
Book An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics)
Book The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) –an all-time classic
MIT OCW Statistics for Applications (renamed to Fundamentals of Statistics) by Philippe Rigollet –fundamental; only after mastering Probability Theory
- program, tasks, and videos
MIT@EdX Fundamentals of Statistics by Philippe Rigollet –same as above but might be updated; only starts on certain dates and is not available outside running sessions!
- program, tasks, and videos
Johns Hopkins University@Coursera Advanced Statistics for Data Science Specialization by Brian Caffo –in R
- program, tasks, and videos
Python
Google@Coursera Crash Course on Python –the bare basics
MIT@EdX Introduction to Computer Science and Programming Using Python –more comprehensive; only starts on certain dates and is not available outside running sessions!
MIT@EdX XSeries Computational Thinking using Python –!!! it’s a paid course $
Python Practice
Data Science Practice
Git
Git & GitHub Tutorial for Beginners by The Net Ninja –or anything similar; there are plenty online
Kaggle
Introductory tasks:
- https://www.kaggle.com/c/titanic
- https://www.kaggle.com/c/house-prices-advanced-regression-techniques
Educative.io Grokking Data Science: Chapter 4. End-to-End Machine Learning Project –a walkthrough of a Kaggle competition
Python for Data Science: Pandas/NumPy/IPython
Book Python for Data Analysis. Data Wrangling with Pandas, NumPy, and IPython.
Book High Performance Python, 2nd Edition by Micha Gorelick, Ian Ozsvald (2020)
Coursera Applied Data Science with Python Specialization
Coursera Pandas Python Library for Beginners/Intermediate in Data Science
Harvard@EdX Using Python for Research –covers NumPy, Scikit-learn
Educative.io From Python to Numpy –1 month free; further subscription for $
Computer Science/Algorithms and Data Structures
Preparation for typical programming interview questions.
Book Algorithms, 4th Edition (2011) by Robert Sedgewick and Kevin Wayne –in Java
Coursera Algorithms I, II by Robert Sedgewick and Kevin Wayne (authors of the Algorithms book) –in Java
MIPT Algorithms and Data Structures in Python 3 (Алгоритмы и структуры данных на Python 3) by Timofei Khiryanov –in Russian! My personal favorite
- additional practice
MIT OCW 6.006 Introduction to Algorithms by Prof. Erik Demaine –arguably another of the best courses on algorithms & data structures
Udacity https://www.udacity.com/course/intro-to-theoretical-computer-science--cs313
Harvard@EdX CS50’s Introduction to Computer Science
Scientific Paper Writing
École Polytechnique@Coursera How to Write and Publish a Scientific Paper (Project-Centered Course)
Tsinghua University@EdX Writing, Presenting and Submitting Scientific Papers in English
Other Useful Subjects
The following subjects range from fundamental ones, typically taught in a Bachelor’s degree program but useful to refresh or revisit, to more in-depth graduate-level courses that may be useful for applied NLP only to a certain extent and only in selected chapters.
- Linear Algebra
- Partial Differential Equations
- Measure-Theoretic Probability
- Convex Optimization
- Statistical Inference –basically, similar to some of the fundamental readings on ML theory suggested above
- Discrete Mathematics
- Scientific Computing/Numerical Analysis
- Data Structures and Algorithms
- Software Design Paradigms in Python and C++
- Stochastic Calculus
- Stochastic Optimization
- Managing/Analyzing Large Data Sets
- Parallel/Distributed Computing
- Deep Learning (start with Intro to Deep Learning by Bhiksha Raj and then decide where to advance)
- Reinforcement Learning
- One domain/practical project course focused on modeling/algorithms
- One domain/practical project course focused on big data (Stanford’s CS246: Mining Massive Data Sets, videos available here)