Introduction to dplyr - Setup & Data Preparation

Originally published at:

dplyr is a great tool for data manipulation. It makes your data analysis process a lot more efficient. Even better, it’s fairly simple to learn and start applying immediately to your work! Oftentimes, with just a few elegant lines of code, your data becomes that much easier to dissect and analyze. For these reasons, it is an essential and foundational skill to master for any aspiring data scientist. In part 1 we cover how to get set up with R, load the wine data set, and install the ggplot2 and dplyr packages.

You may be surprised how much a few easy-to-learn functions can speed up the data analysis process. That is certainly the case with dplyr. In this series, we will teach you how to use this incredibly useful package to munge data, demonstrating with a Kaggle dataset on wine ratings.

Items needed:
R Programming Language

dplyr Package:

ggplot2 Package:
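
Once R is installed, the setup described above can be sketched roughly as follows. This is a minimal example, not necessarily the exact steps used in the video: the CRAN mirror URL is one common choice, and the file name `winemag-data.csv` is a hypothetical placeholder for wherever you saved the Kaggle wine ratings file.

```r
# Install dplyr and ggplot2 only if they are not already available.
# The mirror below is an assumption; any CRAN mirror should work.
for (pkg in c("dplyr", "ggplot2")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
}

# Attach the packages for the current session.
library(dplyr)
library(ggplot2)

# Hypothetical path: replace with the location of your downloaded
# Kaggle wine ratings CSV.
wine_path <- "winemag-data.csv"

if (file.exists(wine_path)) {
  # Read the data and take a quick look at its columns and types.
  wine <- read.csv(wine_path, stringsAsFactors = FALSE)
  glimpse(wine)
} else {
  message("Wine data file not found at: ", wine_path)
}
```

The `requireNamespace()` check keeps the script safe to re-run, since packages are downloaded only the first time.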

Be sure to also check our accompanying blog post here.

Never used R? Watch our series on: Introduction to R