In this output, two elements overlap between consecutive rows. I want to learn how to create these overlapping row-strided DataFrames, as well as non-overlapping ones such as:
   col1  col2  col3
0     1     2     3
1     4     5     6
2     7     8     9
3    10     -     -
If anyone could share code and methods for creating both types of DataFrame (overlapping and non-overlapping) from a Series, with an example, it would help me greatly.
Non-overlapping strides mean that rows share no common values: each value of the series appears in exactly one row.
In the loop, we iterate over the length of the series with a step size equal to the stride length. Each stride is a slice of the original series converted to a list, starting at the current index and ending at the current index plus the stride length. This produces a list of non-overlapping strides.
Finally, pd.DataFrame() builds a DataFrame with one row per stride, with each stride's values as the columns.
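The loop described above can be sketched roughly as follows. This is a minimal illustration, not the original snippet: the example data (1 to 10), the variable names `series`, `stride_len` and `strides`, and the `col1`, `col2`, ... column names are all assumptions chosen to match the table in the question.

```python
import pandas as pd

# Assumed example data: the values 1..10, as in the question's table
series = pd.Series(range(1, 11))
stride_len = 3  # number of values per row

strides = []
# Step through the series in jumps of stride_len
for i in range(0, len(series), stride_len):
    # Slice stride_len values and store them as a plain list
    strides.append(series.iloc[i:i + stride_len].tolist())

# One row per stride; a shorter final stride is padded with NaN
df = pd.DataFrame(strides)
df.columns = [f"col{j + 1}" for j in range(df.shape[1])]
print(df)
```

Converting each slice with `.tolist()` matters here: passing the Series slices themselves would make pandas align them on their index labels rather than on position.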
If you want, you can rename the resulting columns with a line similar to the one used in methods 1 and 2.
This is a simple way to create non-overlapping strides.
Also, if the number of values in your series is not a multiple of stride_len, the final row will contain NaN values. You can see this in the last row of the above code snippet.
Hey @mubashir_rizvi, you can also do this with a NumPy function to create overlapping strides.
This method is the most flexible if you want to create overlapping strides.
However, a disadvantage of this method is that if the elements of the strided data can't be arranged evenly into rows and columns, you would have to drop some rows, because the NumPy function creates (len(series) - stride_len + 1) rows.
Also, remember that the value of overlap must be less than stride_len.
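As a rough sketch of the overlapping case, here is one way it could look with `np.lib.stride_tricks.as_strided()`. The original snippet is not shown in this thread, so the example data and the names `overlap`, `step` and `n_rows` are my assumptions. In this version the row count is computed so that only complete rows are produced and the view never reads past the end of the data.

```python
import numpy as np
import pandas as pd

# Assumed example data: the values 1..10
series = pd.Series(range(1, 11))
stride_len = 3   # values per row
overlap = 2      # values shared by consecutive rows; must be < stride_len
step = stride_len - overlap  # how far each row is shifted from the previous one

values = np.ascontiguousarray(series.to_numpy())
# Number of complete rows that fit without reading past the end of the data
n_rows = (len(values) - stride_len) // step + 1
itemsize = values.strides[0]  # bytes between neighbouring elements

view = np.lib.stride_tricks.as_strided(
    values,
    shape=(n_rows, stride_len),
    strides=(step * itemsize, itemsize),  # row jump, column jump (in bytes)
    writeable=False,                      # guard against accidental writes
)

# Copy so the DataFrame does not share memory with the strided view
df = pd.DataFrame(view.copy(), columns=[f"col{j + 1}" for j in range(stride_len)])
print(df)
```

With `overlap = 2` and `stride_len = 3` the step is 1, so this gives len(series) - stride_len + 1 = 8 rows, from `1 2 3` to `8 9 10`. A smaller overlap gives fewer rows, and any trailing values that do not fill a complete row are left out.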
Hey @mubashir_rizvi, you can use this method to create non-overlapping strides. In this method, the shape and strides of the data are calculated differently and passed as arguments to np.lib.stride_tricks.as_strided() to create a view of the original data. The code then builds a new DataFrame from that view with pd.DataFrame(), with one row per stride and each stride's values as the columns. For example:
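A minimal sketch of the non-overlapping variant, under a few assumptions of mine: the example data is 1 to 10, and the data is padded with NaN up to a multiple of `stride_len` before the view is built, so that the last row matches the table in the question. The padding step and the names `padded` and `n_rows` are not from the original post.

```python
import numpy as np
import pandas as pd

# Assumed example data: the values 1..10
series = pd.Series(range(1, 11))
stride_len = 3  # values per row

# Use floats so that missing trailing values can be stored as NaN
values = series.to_numpy(dtype=float)
n_rows = -(-len(values) // stride_len)  # ceiling division
padded = np.full(n_rows * stride_len, np.nan)
padded[:len(values)] = values

itemsize = padded.strides[0]  # bytes between neighbouring elements
view = np.lib.stride_tricks.as_strided(
    padded,
    shape=(n_rows, stride_len),
    # Each row starts a full stride_len elements after the previous one
    strides=(stride_len * itemsize, itemsize),
    writeable=False,
)

df = pd.DataFrame(view.copy(), columns=[f"col{j + 1}" for j in range(stride_len)])
print(df)
```

Because the row jump equals the row width, no value is repeated. For this non-overlapping case, `padded.reshape(n_rows, stride_len)` gives the same result and is harder to get wrong, since `as_strided()` does no bounds checking.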