A curated list of resources for machine learning and deep learning that I found useful.
Last updated: 11/2017
- The Elements of Statistical Learning: Book by the authors of LASSO, gradient boosting, MARS, etc. Machine learning from a statistical viewpoint. Free pdf available online.
- An Introduction to Statistical Learning: An introductory version of the above book, by mostly the same authors.
- Data Analysis Using Regression and Multilevel/Hierarchical Models: Introductory book by Andrew Gelman. Focuses on data analysis, inference, and explanation of your data using linear regression models. Very accessible.
- Machine Learning: A Probabilistic Perspective: Machine learning from a mostly Bayesian viewpoint. Covers a very wide range of ML algorithms in depth.
- Pattern Recognition and Machine Learning: Classic Bayesian ML textbook.
- Bayesian Data Analysis: Bible of Bayesian analysis by Andrew Gelman.
- Convex Optimization: Bible of convex optimization by Stephen Boyd. Free pdf available.
- Deep Learning Book: Written by top deep learning researchers. Includes practical guidelines, theoretical justifications and advanced materials on recent research. Does not cover deep reinforcement learning. Free pdf available.
- Machine Learning (Coursera): The best-known introductory ML online course, by Andrew Ng. Very accessible and light on math.
- Stanford CS229 - Machine Learning (video): Andrew Ng's course at Stanford; covers the deeper math and theory of ML. Handouts available on the website.
- Stanford CS246 - Mining Massive Datasets (video, book): Practical data mining methods, such as MapReduce, PageRank, and recommender systems. Handouts available on the website.
- Stanford Statistical Learning: Course based on the "Introduction to Statistical Learning" book, taught by its authors.
- Stanford Convex Optimization (video): Taught by Stephen Boyd himself.
- Scikit-learn Documentation: Very comprehensive documentation; covers many common ML algorithms and has a lot of practical examples.
- Rules of Machine Learning: Practical advice for industrial ML from a Google engineer.
- Uber's ML Platform: Design of Uber's Machine Learning platform; covers many aspects of how to create, deploy, and verify an ML product in industry.
- Machine Learning Tech Debt: Good article by Google on the potential tech debt you might accumulate in building ML systems.
- Practical advice for data analysis: Advice on doing large-scale data analysis in industry by a Google data scientist.
- Stanford CS 231n - Convolutional Neural Networks for Visual Recognition (video): By Fei-Fei Li and Andrej Karpathy. Deep learning basics, convolutional nets, and computer vision applications.
- Stanford CS 224n - NLP with Deep Learning (video): By Chris Manning and Richard Socher. Word2vec, recurrent networks, machine translation, and other NLP applications.
- Theories of Deep Learning (Stanford STATS 385)
- Berkeley CS 294 - Deep Reinforcement Learning (video)
- Neural Networks (Coursera): By Geoffrey Hinton, godfather of modern deep learning.
- Oxford Machine Learning (video): By Nando de Freitas. Starts from basic ML and dives into deep learning.
- Neural networks - Université de Sherbrooke (video): By Hugo Larochelle.
- 2016 Deep Learning Summer School at Montreal
- 2017 Deep Learning Summer School at Montreal
- Deep Learning Tutorials: Tutorial from Yoshua Bengio's prior course.
- Unsupervised Feature Learning and Deep Learning: Tutorial from Andrew Ng's prior deep learning course.
- Tensorflow Tutorial
- Most Cited Deep Learning Papers: A list of important deep learning papers.
- Deep Learning Papers Reading Roadmap: Another good list of deep learning papers.
- Arxiv Sanity: Andrej Karpathy's tool to help you find good papers on arxiv.
- Reddit r/MachineLearning: Actually a place where people discuss good recent research. Paper authors sometimes respond to comments as well.
- Columbia Advanced Machine Learning Seminar: Blog posts on the papers discussed in the seminar.
- numpy + scipy: Fast vector and matrix operations, linear algebra, optimization, sparse matrices.
- scikit-learn: Most popular ML library in Python.
- pandas: Data wrangling and analysis.
- Spark: Data processing and analysis for large-scale data.
- statsmodels: Statistical models and tests.
- cvxpy: Convex optimization.
- stan and pymc: Bayesian modeling and inference.
- opencv and scikit-image: Computer vision and image analysis.
- nltk and spacy: Natural language processing. spacy is newer and faster.
- matplotlib: Most popular Python plotting library.
- seaborn: Wrapper of matplotlib to make it look nicer and to provide additional statistical graphs.
- plotly: Interactive graphs.
- xgboost: Most popular and well-tested GBM package.
- lightgbm: A newer library by Microsoft. Reported to be 5-10x faster than xgboost in its default mode.
- catboost: Another new library, by Yandex. Handles categorical features natively and claims to be more accurate than prior libraries.
- theano: One of the early deep learning libraries, widely used in academia.
- torch: Another early library popular in academia. It is in Lua instead of Python.
- caffe: Popular library for conv nets. Has a lot of pretrained models.
- tensorflow: Backed by Google, arguably the most popular library now. API is quite similar to theano.
- caffe2: Successor of caffe by Facebook.
- pytorch: Brings torch to Python, also by Facebook.
- mxnet: An open-source framework (Apache incubator) backed by Amazon.
- cntk: Deep learning framework by Microsoft.
- keras: Provides a high-level deep learning API that runs on top of tensorflow, theano, or cntk. Very user friendly. Now officially supported in tensorflow.