Intersect & Setdiff - Combining Datasets in dplyr

Originally published at: https://tutorials.datasciencedojo.com/intro-to-dplyr-combining-datasets/

We introduce two functions, intersect and setdiff, that make it easy to find overlapping and distinct values across two data sources. intersect returns the elements the two sources share, and setdiff returns the elements found in the first source but not the second, making it easy to spot commonalities and differences. After watching this video, you’ll be better equipped to tackle large datasets and pinpoint how much they overlap.
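As a rough illustration of how the two functions behave, here is a minimal sketch. The data frames store_a and store_b and their id column are hypothetical, made up for this example and not taken from the video.

```r
library(dplyr)

# Hypothetical example data: customer IDs seen at two stores
store_a <- data.frame(id = c(1, 2, 3, 4))
store_b <- data.frame(id = c(3, 4, 5))

# Rows that appear in both data frames
both <- intersect(store_a, store_b)

# Rows in store_a that do not appear in store_b
only_a <- setdiff(store_a, store_b)

# Argument order matters for setdiff: rows in store_b not in store_a
only_b <- setdiff(store_b, store_a)

print(both)
print(only_a)
print(only_b)
```

Note that intersect gives the same set of rows whichever data frame comes first, while setdiff does not, so it is worth checking both directions when comparing two sources. Both data frames need the same columns for these set operations to work.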

dplyr is a great tool for data manipulation. It makes your data analysis process a lot more efficient. Even better, it’s fairly simple to learn and start applying immediately to your work! Often, just a few elegant lines of code make your data much easier to dissect and analyze. For these reasons, it is an essential skill for any aspiring data scientist.

To get set up with dplyr, watch our first tutorial.

Be sure to also check out our accompanying blog post here.
