Filtering Words from a Series

This thread focuses on filtering a series of words down to those that contain at least 2 vowels. To learn how to filter values from a series in general, see the thread "Filtering out values from a series".

1. Using a regular expression:

  • A regular expression (also known as regex or regexp) is a sequence of characters that defines a search pattern. Regular expressions are used to match and manipulate text.
  • They consist of a combination of special characters and literals, which define a pattern to search for within a string.
  • For this task, a pattern is defined to find at least 2 vowels in a word and is then used with a list comprehension to keep the matching words.
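
The regex approach described above can be sketched as follows. This is a minimal illustration, not the thread's original code: the sample words, the variable names, and the exact pattern are assumptions, and "y" is not treated as a vowel.

```python
import re

import pandas as pd

# Hypothetical sample data (not from the original thread).
words = pd.Series(["Apple", "sky", "banana", "tree", "cat", "rhythm"])

# A vowel, then any characters, then another vowel:
# this matches any word with at least 2 vowels.
pattern = re.compile(r"[aeiou].*[aeiou]", flags=re.IGNORECASE)

# Keep only the words in which the pattern is found.
regex_result = [word for word in words if pattern.search(word)]
print(regex_result)  # ['Apple', 'banana', 'tree']
```

The result here is a plain Python list; wrapping it in pd.Series(regex_result) would turn it back into a series if needed.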

2. Using Python's "count()" method:

  • Python’s string count() method can be used inside a list comprehension to count the vowels in each word; a word is kept in the resulting list if its count is at least 2.
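
A possible version of the count() approach is shown below. The sample words and the helper count_vowels() are hypothetical names introduced for illustration, and words are lowercased first on the assumption that the match should be case-insensitive.

```python
import pandas as pd

# Hypothetical sample data (not from the original thread).
words = pd.Series(["Apple", "sky", "banana", "tree", "cat", "rhythm"])

VOWELS = "aeiou"


def count_vowels(word):
    """Return the total number of vowels in word, ignoring case."""
    lowered = word.lower()
    # str.count() counts the occurrences of one vowel at a time.
    return sum(lowered.count(vowel) for vowel in VOWELS)


# Keep each word whose vowel count is at least 2.
count_result = [word for word in words if count_vowels(word) >= 2]
print(count_result)  # ['Apple', 'banana', 'tree']
```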

3. Using a boolean array:

  • In this method, we create a boolean array by counting the number of vowels in each word: an entry is True if the word has at least 2 vowels and False otherwise.
  • The array is then used as a mask to keep only the words from the series where the value is True.
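
The masking idea can be sketched like this. The use of a NumPy array for the mask and the sample data are assumptions made for this example; a plain list of booleans or a boolean Series would work as a mask in the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the original thread).
words = pd.Series(["Apple", "sky", "banana", "tree", "cat", "rhythm"])

# One boolean per word: True if the word has at least 2 vowels.
mask = np.array(
    [sum(word.lower().count(vowel) for vowel in "aeiou") >= 2 for word in words]
)

# Boolean indexing keeps only the rows where the mask is True.
mask_result = words[mask]
print(mask_result.tolist())  # ['Apple', 'banana', 'tree']
```

Unlike the list-based approaches, the result is still a series and keeps the original index labels of the selected words.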

4. Using "apply()" with a custom function:

  • Here, we define a custom function that returns True for a word if its vowel count is at least 2.
  • We then use the apply() method to apply this function to the series; entries where the condition was not true become NaN and are dropped using the dropna() method.
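
One way this could look is sketched below. The function name has_two_vowels() and the sample data are hypothetical, and the use of Series.where() to turn non-matching entries into NaN before dropna() is an assumption, since the thread does not show how the NaN values are produced.

```python
import pandas as pd

# Hypothetical sample data (not from the original thread).
words = pd.Series(["Apple", "sky", "banana", "tree", "cat", "rhythm"])


def has_two_vowels(word):
    """Return True if word contains at least 2 vowels, ignoring case."""
    return sum(word.lower().count(vowel) for vowel in "aeiou") >= 2


# apply() calls the function once per word and returns a boolean series.
flags = words.apply(has_two_vowels)

# Assumed step: where() replaces entries whose flag is False with NaN,
# and dropna() then removes those entries.
apply_result = words.where(flags).dropna()
print(apply_result.tolist())  # ['Apple', 'banana', 'tree']
```

Indexing directly with words[flags] would give the same words without the intermediate NaN values.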