Originally published at: https://tutorials.datasciencedojo.com/r-programming-machine-learning/
The R programming language is growing rapidly in popularity and is widely adopted across industries. This popularity is due, in part, to R’s huge collection of open-source machine learning algorithms. If you are a data scientist working with R, the caret package (short for [C]lassification [A]nd [RE]gression [T]raining) is a must-have tool in your toolbelt. The package provides capabilities that are useful at every stage of the data science project lifecycle. Most important of all, it provides a common interface for training, tuning, and evaluating more than 200 machine learning algorithms. Not surprisingly, caret is a surefire way to speed up your work as a data scientist!
In this presentation, Dave Langer will provide an introduction to this package. The presentation focuses on using caret to implement some of the most common tasks of the data science project lifecycle and shows how to incorporate it into your daily work.
Viewers will learn how to:
• Create stratified random samples of data useful for training machine learning models.
• Train machine learning models using caret’s common interface.
• Use caret’s features for cross-validation and hyperparameter tuning.
• Scale caret through multi-core, parallel training.
• Increase their knowledge of caret’s many other features.
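The first four tasks in the list above can be sketched in a few lines of caret. The example below is a minimal illustration, not the code from the presentation. It assumes the built-in iris data set, a decision tree (method = "rpart"), 5-fold cross-validation, and a small, arbitrary grid of cp values. All variable names are made up for this sketch, and the parallel step only runs if the doParallel package happens to be installed.

```r
library(caret)

set.seed(42)  # make the random split and the CV folds reproducible

# 1. Stratified random sample: createDataPartition() samples within each
#    level of the outcome, so class proportions are preserved in the split.
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# 2. Optional multi-core training: register a parallel backend.
#    The choice of 2 worker processes is an arbitrary assumption.
use_parallel <- requireNamespace("doParallel", quietly = TRUE)
if (use_parallel) {
  cl <- parallel::makePSOCKcluster(2)
  doParallel::registerDoParallel(cl)
}

# 3. Cross-validation and hyperparameter tuning: 5-fold CV over a small
#    grid of complexity-parameter (cp) values for the tree.
ctrl <- trainControl(method = "cv", number = 5)
grid <- expand.grid(cp = c(0.001, 0.01, 0.1))

# 4. Common interface: train() takes the model as a string in `method`.
fit <- train(
  Species ~ .,
  data      = train_set,
  method    = "rpart",
  trControl = ctrl,
  tuneGrid  = grid
)

# Shut the workers down and fall back to sequential execution.
if (use_parallel) {
  parallel::stopCluster(cl)
  foreach::registerDoSEQ()
}

# 5. Evaluate the tuned model on the held-out rows.
preds <- predict(fit, newdata = test_set)
print(fit$bestTune)
print(confusionMatrix(preds, test_set$Species))
```

Changing the `method` string to another model that caret supports (and adjusting the tuning grid to that model's parameters) leaves the rest of the workflow unchanged, which is the practical benefit of the common interface.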
R code and accompanying dataset can be found here