How can you find the unique elements in two different Pandas Series in Python?

mubashir_rizvi · February 21, 2023, 6:03pm

This question is similar to my previous one, "Efficiently Filter Values from a Pandas Series". This time, however, I want to find (or filter out) the unique values present in two Series objects. Please suggest different methods and techniques for doing this.

safiaa.02 · April 17, 2023, 12:29pm

You can use the set difference() method, which computes the difference between two sets: it returns a set containing the elements present in the first set but not in the second. For this technique to work, we first have to convert our Series objects into sets using the set() constructor.

In this technique, difference() is first called on set1 with set2 as its argument to get the items found only in set1, and then on set2 with set1 as its argument to get the items found only in set2.
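The steps described above could look roughly like the sketch below. The sample data and the names `s1`, `s2`, `only_in_s1`, and `only_in_s2` are assumptions made for illustration; `set1` and `set2` follow the names used in the post. Note that converting to a set discards duplicates, ordering, and the Series index.

```python
import pandas as pd

# Hypothetical sample data; the original post's series are not shown.
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7])

# Convert each Series to a set of plain Python values.
set1 = set(s1.tolist())
set2 = set(s2.tolist())

# Elements present in one set but not in the other.
only_in_s1 = set1.difference(set2)
only_in_s2 = set2.difference(set1)

print(sorted(only_in_s1))  # [1, 2, 3]
print(sorted(only_in_s2))  # [6, 7]
```

If a Series is needed afterwards, the result can be wrapped again, for example with `pd.Series(sorted(only_in_s1))`.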

safa · April 18, 2023, 6:36pm

Hey @mubashir_rizvi, from your query I understand that you want to find the unique values in your data and filter them. For this, you can use boolean indexing with the isin() method, which checks whether each element of a Pandas DataFrame or Series is contained in a sequence of values; in your case, that sequence is the other series. It returns a boolean mask (True/False) indicating which values are in the sequence passed to the method.

In the code, isin() is applied separately to each series. The key point is that the mask is negated (using ~), so values shared between the series become False and the unique ones become True, which lets us keep only the True (unique) values.
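A minimal sketch of this approach, assuming two small integer series (the data and the variable names are illustrative, not taken from the original post):

```python
import pandas as pd

# Hypothetical sample data.
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7])

# isin() marks values that also appear in the other series;
# ~ negates the mask so that True means "not in the other series".
mask1 = ~s1.isin(s2)
mask2 = ~s2.isin(s1)

# Boolean indexing keeps only the rows where the mask is True.
unique_to_s1 = s1[mask1]
unique_to_s2 = s2[mask2]

print(unique_to_s1.tolist())  # [1, 2, 3]
print(unique_to_s2.tolist())  # [6, 7]
```

Unlike the set-based approach, the results here are still Series, so the original index labels and any repeated values within a series are preserved.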

sabih · April 18, 2023, 7:53pm

You can use the set difference() method, which gets the set difference between two sets. It returns a set containing elements present in the first set but not in the second set. For this technique to work, we first have to convert our series objects into set objects using the set() constructor.

In this technique, the difference() method is applied first to set1 with regard to set2 to get unique items present in set1, and the same is done for set2 with regard to set1 to get unique items from set2.

nimrah · April 20, 2023, 12:00pm

Hey @mubashir_rizvi, in order to accomplish your objective, you can use the Pandas functions pd.concat() and drop_duplicates(keep=False).

The pd.concat() function concatenates (joins) two or more Pandas objects along a particular axis (row or column) into a single Pandas object. In this example, we will join two Series objects.
The drop_duplicates(keep=False) method drops every value that is duplicated, including its first occurrence. It returns a new object (a Series here, since we concatenated Series) containing only the values that occur exactly once.
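A minimal sketch of the two steps, using made-up sample data; `ignore_index=True` is an optional choice made here so that the combined series gets a fresh index:

```python
import pandas as pd

# Hypothetical sample data.
s1 = pd.Series([1, 2, 3, 4, 5])
s2 = pd.Series([4, 5, 6, 7])

# Stack the two series into one.
combined = pd.concat([s1, s2], ignore_index=True)

# keep=False removes every value that appears more than once,
# leaving only the values found in exactly one place.
unique_values = combined.drop_duplicates(keep=False)

print(unique_values.tolist())  # [1, 2, 3, 6, 7]
```

One caveat: a value that is repeated within a single series is also dropped by `keep=False`, even if it never appears in the other series. Calling `drop_duplicates()` on each series before concatenating is one way to avoid that.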