N-grams - Text Analytics with R

Originally published at: https://tutorials.datasciencedojo.com/text-analytics-r-n-grams/

In this N-grams installment, we:

• Validate the effectiveness of TF-IDF in improving model accuracy.
• Introduce the concept of N-grams as an extension of the bag-of-words model that accounts for word order.
• Discuss the trade-offs involved in using N-grams and how Text Analytics suffers from the “Curse of Dimensionality”.
• Illustrate how quickly Text Analytics can strain the limits of your computer hardware.
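
To make the N-gram idea above concrete, here is a minimal sketch in base R. It is not the code from the series: the helper `make_ngrams` and the two toy sentences are hypothetical, chosen only to show how bigrams keep word order that single words lose, and how adding them enlarges the vocabulary.

```r
# Hypothetical helper (not from the original series): build word n-grams
# from a character vector of tokens, using base R only.
make_ngrams <- function(tokens, n) {
  # A document shorter than n tokens has no n-grams.
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(
    starts,
    function(i) paste(tokens[i:(i + n - 1)], collapse = "_"),
    character(1)
  )
}

# Toy documents, invented for illustration.
docs <- c("the movie was not good", "the movie was good")
tokens <- strsplit(tolower(docs), " ", fixed = TRUE)

unigrams <- lapply(tokens, make_ngrams, n = 1)
bigrams  <- lapply(tokens, make_ngrams, n = 2)

# Each unique term becomes one column of a document-term matrix.
unigram_vocab <- unique(unlist(unigrams))
bigram_vocab  <- unique(unlist(bigrams))
combined_vocab <- c(unigram_vocab, bigram_vocab)

print(bigrams[[1]])
print(length(unigram_vocab))
print(length(combined_vocab))
```

In this toy case the bigram `not_good` separates the first sentence from the second, which a plain bag of words can only do through the lone token `not`. The cost is extra columns: the combined vocabulary is larger than the unigram vocabulary, and on a real corpus the number of distinct bigrams typically grows much faster than the number of distinct words, which is where the dimensionality and hardware concerns come from.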

The Kaggle dataset can be found here

The data and R code used in this series are available here
