NumPy provides a powerful array object, called ndarray, which stores homogeneous data in a multi-dimensional array. In some cases, we may need to extract the unique rows from a two-dimensional NumPy array. In this article, we will discuss three approaches for doing so, which are mentioned below:
1. Using “np.unique()” function:
- Imports the NumPy library and creates a 2D NumPy array.
- Applies the np.unique() function to the input array, specifying the axis as 0 so that whole rows are compared.
- The function returns the sorted unique rows of the array.
- The resulting array is printed to the console.
- The code returns an array with only the unique rows of the input array, sorted in ascending (lexicographic) order.
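The steps above can be sketched as follows. This is a minimal illustration rather than the article's original listing; the sample array and the names arr and unique_rows are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical sample data: rows 0 and 2 are duplicates, as are rows 1 and 3.
arr = np.array([[3, 4],
                [1, 2],
                [3, 4],
                [1, 2],
                [5, 6]])

# axis=0 makes np.unique compare whole rows instead of individual elements.
# The result is sorted lexicographically, starting from the first column.
unique_rows = np.unique(arr, axis=0)

print(unique_rows)
# [[1 2]
#  [3 4]
#  [5 6]]
```

Without the axis argument, np.unique() flattens the input and returns the unique scalar values rather than the unique rows.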
2. Using “set()” and “map()” functions:
- Imports the NumPy library and creates a 2D NumPy array.
- Converts each row of the array into a tuple using the map() function and stores the result in a list. Tuples are needed because NumPy rows are not hashable and cannot be placed in a set directly.
- The code then extracts the unique tuples from the list using the set() function.
- Finally, the unique tuples are converted back to a NumPy array and printed to the console.
- Here, your output won’t be sorted.
- This code returns an array with only the unique rows of the input array, in arbitrary order, by converting the rows into tuples and using the set() function.
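A minimal sketch of this approach might look like the following. The sample array and the variable names are assumptions made for illustration, not the article's original code.

```python
import numpy as np

# Hypothetical sample data containing duplicate rows.
arr = np.array([[3, 4],
                [1, 2],
                [3, 4],
                [1, 2],
                [5, 6]])

# Rows are unhashable arrays, so convert each one to a hashable tuple first.
row_tuples = list(map(tuple, arr))

# A set keeps one copy of each tuple but does not guarantee any order.
unique_tuples = set(row_tuples)

# Convert the unique tuples back to a 2D NumPy array.
unique_rows = np.array(list(unique_tuples))

print(unique_rows)  # three unique rows; their order may vary
```

Because sets are unordered, the row order of the result should not be relied on. If a stable order matters, one of the other two approaches is likely a better fit.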
3. Using “pandas DataFrame”:
- Imports the NumPy and pandas libraries and creates a 2D NumPy array.
- Then converts the array to a pandas DataFrame using the pd.DataFrame() function.
- The code drops the duplicate rows of the DataFrame using the drop_duplicates() method and stores the result in a new DataFrame.
- Finally, the unique DataFrame is converted back to a NumPy array and printed to the console.
- Even here, your output won’t be sorted.
- This code returns an array with only the unique rows of the input array, in the order they first appear, by converting the array to a pandas DataFrame and using the drop_duplicates() method.
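The pandas-based steps can be sketched as below. This is an illustrative example under assumed sample data, and to_numpy() is one possible way to convert the DataFrame back to an array; the original article may have used a different conversion.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing duplicate rows.
arr = np.array([[3, 4],
                [1, 2],
                [3, 4],
                [1, 2],
                [5, 6]])

# Wrap the array in a DataFrame so that each array row becomes a DataFrame row.
df = pd.DataFrame(arr)

# drop_duplicates() keeps the first occurrence of each row by default,
# so the original order of first appearance is preserved.
unique_df = df.drop_duplicates()

# Convert the de-duplicated DataFrame back to a NumPy array.
unique_rows = unique_df.to_numpy()

print(unique_rows)
# [[3 4]
#  [1 2]
#  [5 6]]
```

This approach adds a pandas dependency, but it may be convenient when the data is already being handled as a DataFrame or when the order of first appearance needs to be kept.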