100 Algorithms for Machine Learning and Data Engineers

Think of this as the one-stop shop, dictionary, or directory for your machine learning algorithms. In this post, you'll find up to 100 algorithms, along with infographics (where available) to help you know when to use each one.

Scikit-Learn Algorithm Cheat-Sheet

First up are 25 algorithms, all of which appear on the Scikit-Learn cheat sheet below. Clicking the image takes you to an interactive version of the same graphic. I suggest bookmarking that site, since it makes it easy to remember the algorithms and when best to use each one.

Included in this infographic are:

  1. SVC
  2. Ensemble Classifiers
  3. Naive Bayes
  4. KNeighbors Classifier (kNN)
  5. Kernel Approximation
  6. Linear SVC
  7. SGD Classifier
  8. SGD Regressor
  9. Elastic Net
  10. Lasso
  11. SVR(kernel='rbf')
  12. Ensemble Regressors
  13. Ridge Regression
  14. SVR(kernel='linear')
  15. Spectral Clustering
  16. GMM
  17. KMeans
  18. MiniBatch KMeans
  19. MeanShift
  20. VBGMM
  21. Randomized PCA
  22. Isomap
  23. Spectral Embedding
  24. LLE
  25. Kernel Approximation
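To give a feel for how the cheat sheet guides you from a problem to an algorithm, here is a rough sketch of its decision flow in plain Python. Treat it as an illustration only: the function name, its parameters, and the goal labels are made up for this post, and the sample-size cutoffs (50, 10,000, and 100,000) are the ones I recall from the chart, so check them against the interactive graphic before relying on them.

```python
def suggest_estimators(n_samples, goal, labeled=False, text_data=False,
                       n_categories_known=False, few_features_important=False):
    """Return algorithm names to try, in order, loosely following the cheat sheet.

    Hypothetical helper: the name, arguments, and thresholds are assumptions
    made for illustration, not part of scikit-learn.
    goal is one of "category", "quantity", or "exploring".
    """
    if n_samples < 50:
        return ["get more data"]

    if goal == "category":
        if labeled:  # classification branch
            if n_samples >= 100_000:
                return ["SGD Classifier", "Kernel Approximation"]
            if text_data:
                return ["Linear SVC", "Naive Bayes"]
            return ["Linear SVC", "KNeighbors Classifier",
                    "SVC", "Ensemble Classifiers"]
        # no labels: clustering branch
        if n_categories_known:
            if n_samples < 10_000:
                return ["KMeans", "Spectral Clustering", "GMM"]
            return ["MiniBatch KMeans"]
        if n_samples < 10_000:
            return ["MeanShift", "VBGMM"]
        return []  # the chart offers no suggestion here

    if goal == "quantity":  # regression branch
        if n_samples >= 100_000:
            return ["SGD Regressor"]
        if few_features_important:
            return ["Lasso", "Elastic Net"]
        return ["Ridge Regression", "SVR(kernel='linear')",
                "SVR(kernel='rbf')", "Ensemble Regressors"]

    if goal == "exploring":  # dimensionality reduction branch
        if n_samples < 10_000:
            return ["Randomized PCA", "Isomap", "Spectral Embedding", "LLE"]
        return ["Randomized PCA", "Kernel Approximation"]

    raise ValueError(f"unknown goal: {goal!r}")


# Example: a labeled, non-text dataset with 5,000 rows
print(suggest_estimators(5_000, "category", labeled=True))
```

Each returned list is ordered the way the chart reads: start with the first entry and move down the list only if it doesn't work for your data.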

SAS: The Machine Learning Algorithm Cheat Sheet

You can also find many of the same algorithms on SAS's machine learning cheat sheet, but it includes 15 that are different, which I have listed below. The SAS website (click the image) also gives great descriptions of how, when, and why to use each algorithm.

  1. Gradient Boosting Tree
  2. Random Forest
  3. Neural Network
  4. k-modes
  5. Hierarchical
  6. DBSCAN
  7. Gaussian Mixture Model
  8. Latent Dirichlet Allocation
  9. Principal Component Analysis
  10. Singular Value Decomposition
  11. Linear SVM
  12. Kernel SVM
  13. Decision Tree
  14. Logistic Regression
  15. Linear Regression

Microsoft Azure Machine Learning Algorithm Cheat Sheet

Microsoft Azure's cheat sheet is the simplest of the three by far, yet Microsoft still managed to pack a ton of information into it. Microsoft also made the sheet available to download. You can find the next set of algorithms below, grouped by task.

Anomaly Detection
  1. One-class SVM
  2. PCA-based Anomaly Detection
Regression
  1. Ordinal Regression
  2. Poisson Regression
  3. Fast Forest Quantile Regression
  4. Bayesian Linear Regression
  5. Neural Network Regression
  6. Decision Forest Regression
  7. Boosted Decision Tree Regression
Multiclass Classification
  1. Multiclass Logistic Regression
  2. Multiclass Neural Network
  3. Multiclass Decision Forest
  4. Multiclass Decision Jungle
  5. One-vs-All Multiclass
Two-Class Classification
  1. Two-Class SVM
  2. Two-Class Averaged Perceptron
  3. Two-Class Logistic Regression
  4. Two-Class Bayes Point Machine
  5. Two-Class Decision Forest
  6. Two-Class Boosted Decision Tree
  7. Two-Class Decision Jungle
  8. Two-Class Locally Deep SVM
  9. Two-Class Neural Network

A special thanks to the Data Science Interns (Rahim, Rabeez, Tooba, Hunaid, Arslan, and Tarun) at Data Science Dojo who helped me put all this together.


This is a companion discussion topic for the original entry at https://blog.datasciencedojo.com/p/7769e39b-fd09-4b6a-b864-eef53aa6d664/