If you have read any of my previous posts, you'll know I don't call myself a data scientist. I would call myself a data enthusiast. Being in marketing, I make almost every business decision based on what the data is telling me.
Having said that, I ran into a great tutorial series about time series in Python. It's meant for intermediate to advanced learners, but I found it incredibly easy to follow along (even if I didn't understand some of the concepts/techniques).
Here are the packages used in the tutorials:
pandas
matplotlib
statsmodels
statistics
This tutorial is taught in Python. If you are more comfortable with R, the presenter has shared both the R and Python code in the script repository.
Part 1: Read and Transform Your Data
In Part 1, you will learn how to read and index your data for time series, check that the data meets the requirements or assumptions for time series modeling, and transform your data to ensure it meets those requirements.
The next two parts both start right where Part 1 left off. Neither has much of an introduction beyond a really short review of what was covered in the previous section(s). If you aren't completing the tutorials back to back, make sure to go back and review first.
Part 2: ARIMA Modeling and Forecasting in Python
Part 2 has you building an ARIMA model using the statsmodels package and predicting N timestamps into the future. You will also look at the Autocorrelation Function (ACF) plot and Partial Autocorrelation Function (PACF) plot to determine the terms in your time series model.
Part 3: Evaluating Time Series Forecasts
In the final part of the (time) series, you'll evaluate predictions using mean absolute error and Python's statistics and matplotlib packages.
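Mean absolute error is simply the average of the absolute differences between what happened and what the model predicted. A minimal sketch using the statistics module follows; the numbers are invented for illustration, and the helper names mean_absolute_error and plot_forecast are my own, not from the tutorial.

```python
import statistics


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return statistics.mean(abs(a - p) for a, p in zip(actual, predicted))


def plot_forecast(actual, predicted):
    """Overlay actual and forecast values on one chart and return the figure."""
    # Imported here so the error metric above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(actual, label="actual")
    ax.plot(predicted, label="forecast")
    ax.set_xlabel("time step")
    ax.set_ylabel("value")
    ax.legend()
    return fig


# Invented numbers, purely for illustration.
actual = [112.0, 118.0, 132.0, 129.0]
predicted = [110.0, 120.0, 128.0, 133.0]

# Absolute errors are 2, 2, 4 and 4, so the mean is 3.0.
mae = mean_absolute_error(actual, predicted)
print(f"Mean absolute error: {mae}")
```

Calling plot_forecast(actual, predicted) and then matplotlib's plt.show() draws both lines on one chart, which is a quick way to see where the forecast drifts from the actual values.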
At the end of the video, the presenter challenges you to improve on the model she walks you through. I don't have the data science know-how to improve it, but maybe you do! I encourage you to add your improvements to the Discussion. Who knows, maybe I'll contact you to collaborate on a follow-up blog post :)
This is a companion discussion topic for the original entry at https://blog.datasciencedojo.com/p/1f0d33c7-94da-4618-9a52-84ac1546d443/