In this thread, you’ll learn how to find the unique values between two series, such as series1 and series2, i.e. the values that appear in one series but not in the other. There are several ways to achieve this. If you want to learn more about what a Series is and how to create one, have a look at the Building Pandas series with several datatypes thread. To learn how to filter out values from one series that are absent in another, see the Filtering out values from a series thread.
The techniques for finding unique elements between two series are:
1. Using set "difference()" method both ways:
- The difference() method in Python is used to get the set difference between two sets. It returns a set containing the elements present in the first set but not in the second set.
- For this technique to work, we first have to convert our series into sets using the set() constructor.
In this technique, the difference() method is first applied to set1 with respect to set2 to get the items unique to set1, and then to set2 with respect to set1 to get the items unique to set2.
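The two-way difference described above can be sketched as follows. The sample values are made up for illustration, and the variable names unique_to_series1 and unique_to_series2 are assumptions, not names from the original thread.

```python
import pandas as pd

# Hypothetical sample data; series1 and series2 follow the names used in the text.
series1 = pd.Series([1, 2, 3, 4, 5])
series2 = pd.Series([4, 5, 6, 7, 8])

# Convert each Series to a set so that set operations become available.
set1 = set(series1)
set2 = set(series2)

# Elements present in set1 but not in set2, and the other way around.
unique_to_series1 = set1.difference(set2)
unique_to_series2 = set2.difference(set1)

# Sets are unordered, so sort before printing for a stable display.
print(sorted(unique_to_series1))
print(sorted(unique_to_series2))
```

Note that converting to a set discards duplicates, ordering, and the index of the original series, so this approach suits cases where only the distinct values matter.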
2. Using set "symmetric_difference()" method:
- The symmetric_difference() method is a set operation available in Python that returns a new set containing all the elements that are unique to each set, i.e. elements that are in either of the sets but not in both.
- Since symmetric_difference() is a set operation, the series are first converted into sets with the set() constructor, and the operation is then applied to those sets.
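A minimal sketch of this technique, using made-up sample values (the name unique_values is an assumption for illustration):

```python
import pandas as pd

# Hypothetical sample data; series1 and series2 follow the names used in the text.
series1 = pd.Series([1, 2, 3, 4, 5])
series2 = pd.Series([4, 5, 6, 7, 8])

# Elements that are in exactly one of the two sets.
unique_values = set(series1).symmetric_difference(set(series2))

# Sets are unordered, so sort before printing for a stable display.
print(sorted(unique_values))
```

The ^ operator on two sets gives the same result as symmetric_difference(). Unlike the two-way difference, the result does not tell you which series each value came from.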
3. Using boolean indexing with "isin()" method:
- The isin() method is used to check whether each element in a Pandas DataFrame or Series is contained in a sequence of values, which in our case is another series.
- It returns a boolean mask (True/False) indicating which values are in the sequence of values passed to the method.
In this technique, the method is applied separately to each series to get the unique values from both. The key point is that the mask is negated (using ~), so values that match between the series get False and the unique ones get True, which lets us keep only the True values.
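The negated-mask filtering can be sketched like this. The sample values and the names mask1, mask2, unique_to_series1 and unique_to_series2 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; series1 and series2 follow the names used in the text.
series1 = pd.Series([1, 2, 3, 4, 5])
series2 = pd.Series([4, 5, 6, 7, 8])

# isin() marks values found in the other series; ~ flips the mask
# so that True now marks values NOT found in the other series.
mask1 = ~series1.isin(series2)
mask2 = ~series2.isin(series1)

# Boolean indexing keeps only the rows where the mask is True.
unique_to_series1 = series1[mask1]
unique_to_series2 = series2[mask2]

print(unique_to_series1.tolist())
print(unique_to_series2.tolist())
```

Because the result is still a Series, the original index labels and order are preserved, and a value repeated within one series stays repeated in the output.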
4. Using "pd.concat()" with the "drop_duplicates()" method:
- The pd.concat() function is used to concatenate or join two or more Pandas objects along a particular axis (row or column) into a single Pandas object. Here, it joins two Series objects.
- The drop_duplicates(keep=False) method drops every value that occurs more than once, including its first occurrence. Applied to the concatenated Series, it returns a new Series containing only the values that appear exactly once.
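A short sketch of this technique with made-up sample values (the names combined and unique_values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data; series1 and series2 follow the names used in the text.
series1 = pd.Series([1, 2, 3, 4, 5])
series2 = pd.Series([4, 5, 6, 7, 8])

# Stack the two Series end to end into one Series.
combined = pd.concat([series1, series2])

# keep=False removes every occurrence of a duplicated value,
# leaving only values that appear exactly once in the combined Series.
unique_values = combined.drop_duplicates(keep=False)

print(unique_values.tolist())
```

One caveat: a value that is repeated within a single series is also dropped, even if it never appears in the other series, so this approach assumes each series has no internal duplicates. The result also keeps the original index labels from both inputs; passing ignore_index=True to pd.concat() gives a fresh index instead.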