XGBoost For Time Series Forecasting: Don’t Use It Blindly

Published Jun 25, 2023 by Michael Grogan

When modelling a time series with a model such as ARIMA, we pay careful attention to factors such as seasonality and trend, as well as the appropriate time periods to use.

However, when it comes to using a machine learning model such as XGBoost to forecast a time series, all common sense seems to go out the window. Rather, we simply load the data into the model in a black-box fashion and expect it to magically give us accurate output.

A little-known secret of time series analysis: not all time series can be forecast, no matter how good the model. Attempting to do so can often lead to spurious or misleading forecasts.

To illustrate this point, let us see how XGBoost (specifically XGBRegressor) varies when it comes to forecasting 1) electricity consumption patterns for the Dublin City Council Civic Offices, Ireland and 2) quarterly condo sales for the Manhattan Valley.

Learn More

Are you interested in learning more about how machine learning models can be used to analyse time series data more effectively?

Book a free 30 minute consultation with me on Calendly.

How XGBRegressor Forecasts Time Series

XGBRegressor uses a number of gradient-boosted trees (set via n_estimators in the model) to predict the value of a dependent variable. It does this by combining decision trees, which are individually weak learners, into a single strong learner.

When forecasting a time series, the model uses what is known as a lookback period to forecast a number of steps forward. For instance, if a lookback period of 1 is used, then X_train (the independent variables) consists of lagged values of the time series, regressed against the series at time t (Y_train), in order to forecast future values.
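This sliding-window framing can be sketched in code. The `create_dataset` helper below is an assumption on my part (the notebook later calls a function of that name without showing its body); it is one common way of building lagged features:

```python
import numpy as np

def create_dataset(series, lookback):
    """Frame a 1-D series as a supervised learning problem:
    each row of X holds `lookback` consecutive values, and the
    corresponding entry of Y is the value that follows them."""
    X, Y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        Y.append(series[i + lookback])
    return np.array(X), np.array(Y)

# With lookback=1, each row of X is simply the previous observation
series = np.array([10.0, 12.0, 11.0, 13.0, 14.0])
X, Y = create_dataset(series, lookback=1)
# X is [[10.], [12.], [11.], [13.]] and Y is [12., 11., 13., 14.]
```

Each training pair is therefore "the last `lookback` values" mapped to "the next value", which is what lets a regression model like XGBRegressor be applied to a time series at all.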

Forecasting Electricity Consumption

Let’s see how this works using the example of electricity consumption forecasting.

[Plot: electricity consumption for the Dublin City Council Civic Offices (Jupyter Notebook output)]

The dataset in question is available from data.gov.ie. From this graph, we can see that a short-term seasonal factor may be present in the data, given the significant fluctuations in consumption that occur on a regular basis.

Let’s use an autocorrelation function to investigate further.

[Plot: autocorrelation function of the consumption series (Jupyter Notebook output)]

From this autocorrelation function, it is apparent that there is a strong correlation every 7 lags. Intuitively, this makes sense because we would expect that for a commercial building, consumption would peak on a weekday (most likely Monday), with consumption dropping at the weekends.

When forecasting such a time series with XGBRegressor, this means that a value of 7 can be used as the lookback period.
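The train/test split itself is not shown in the notebook excerpts, but the R session later in the article trains on observations 1–544 and tests on 545–680, so the Python side presumably splits the same way. A hedged sketch (the variable names and the stand-in data are assumptions):

```python
import numpy as np

# Stand-in for the consumption series; the real notebook loads
# the data.gov.ie dataset here instead.
consumption = np.arange(680, dtype=float)

# Match the split used in the R session: the first 544 points
# for training, the remaining 136 for testing.
train, test = consumption[:544], consumption[544:]
```

Splitting before windowing matters: the test windows must only ever contain values the model was not trained on.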

# Lookback period
lookback = 7
X_train, Y_train = create_dataset(train, lookback)
X_test, Y_test = create_dataset(test, lookback)

The model is run on the training data and the predictions are made:

from xgboost import XGBRegressor
model = XGBRegressor(objective='reg:squarederror', n_estimators=1000)
model.fit(X_train, Y_train)
testpred = model.predict(X_test)

The predictions and test data are reshaped for analysis:

Y_test = Y_test.reshape(-1, 1)
testpred = testpred.reshape(-1, 1)

Let’s calculate the RMSE and compare it to the test mean (the lower the value of the former compared to the latter, the better).

>>> import numpy as np
>>> from math import sqrt
>>> from sklearn.metrics import mean_squared_error
>>> test_mse = mean_squared_error(Y_test, testpred)
>>> rmse = sqrt(test_mse)
>>> print('RMSE: %f' % rmse)
RMSE: 437.935136

>>> np.mean(Y_test)
3895.140625

We see that the RMSE is quite low compared to the mean (roughly 11% of it), which means that XGBoost did quite a good job of predicting the values of the test set.
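The 11% figure is simply the RMSE expressed as a fraction of the test mean, which can be checked directly:

```python
rmse = 437.935136        # RMSE from the output above
test_mean = 3895.140625  # np.mean(Y_test) from the output above

ratio = rmse / test_mean
print(f"RMSE is {ratio:.1%} of the test mean")  # prints "RMSE is 11.2% of the test mean"
```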

XGBRegressor vs. ARIMA

Having used the XGBRegressor to generate a forecast, how does it compare to a forecast using a standard ARIMA model?

When conducting an analysis in R, auto.arima was used to determine the ideal model configuration using the training data:

> fitarima<-auto.arima(mydata$Kilowatts[1:544], trace=TRUE, test="kpss", ic="bic")

 Fitting models using approximations to speed things up...

 ARIMA(2,1,2) with drift         : 8736.153
 ARIMA(0,1,0) with drift         : 9047.789
 ARIMA(1,1,0) with drift         : 9052.507
 ARIMA(0,1,1) with drift         : 9035.972
 ARIMA(0,1,0)                    : 9041.492
 ARIMA(1,1,2) with drift         : 8745.916
 ARIMA(2,1,1) with drift         : 8744.459
 ARIMA(3,1,2) with drift         : 8712.164
 ARIMA(3,1,1) with drift         : 8711.061
 ARIMA(3,1,0) with drift         : 8929.878
 ARIMA(4,1,1) with drift         : 8523.678
 ARIMA(4,1,0) with drift         : 8808.873
 ARIMA(5,1,1) with drift         : 8334.125
 ARIMA(5,1,0) with drift         : 8455.541
 ARIMA(5,1,2) with drift         : Inf
 ARIMA(4,1,2) with drift         : Inf
 ARIMA(5,1,1)                    : 8327.831
 ARIMA(4,1,1)                    : 8517.408
 ARIMA(5,1,0)                    : 8449.245
 ARIMA(5,1,2)                    : Inf
 ARIMA(4,1,0)                    : 8802.576
 ARIMA(4,1,2)                    : Inf

 Now re-fitting the best model(s) without approximations...

 ARIMA(5,1,1)                    : 8348.318

 Best model: ARIMA(5,1,1)

auto.arima thus selects a model configuration of ARIMA(5,1,1):

[Plot: ARIMA(5,1,1) fit (generated by author in R)]

This model is used to make predictions — which are then compared to the test set. Here are the results.

> forecastedvalues=forecast(fitarima,h=136)
> test=mydata$Kilowatts[545:680]
> library(Metrics)
> rmse(forecastedvalues$mean, test)
[1] 895.6124
> mean(test)
[1] 3905.879

We can see that the RMSE came in at 895 as compared to the test mean of 3,905.

In contrast, the RMSE for the XGBRegressor model came in at 437 as compared to a test mean of 3,895.

In this instance, the XGBRegressor model outperformed ARIMA and was a good choice!

How well will XGBRegressor perform when forecasting condo sales for the Manhattan Valley? Let’s find out.

Forecasting Manhattan Valley Condo Sales

In the above example, we evidently had a weekly seasonal factor, and this meant that an appropriate lookback period could be used to make a forecast.

However, there are many time series that do not have a seasonal factor. This makes it more difficult for any type of model to forecast such a time series — the lack of periodic fluctuations in the series causes significant issues in this regard.

Here is a visual overview of quarterly condo sales in the Manhattan Valley from 2003 to 2015. The data was sourced from NYC Open Data: sale prices for Condos — Elevator Apartments across the Manhattan Valley, aggregated by quarter.

[Plot: quarterly Manhattan Valley condo sales, 2003–2015 (Jupyter Notebook output)]

From the above, we can see that there are certain quarters where sales tend to reach a peak — but there does not seem to be a regular frequency by which this occurs.

Again, let’s look at an autocorrelation function.

[Plot: autocorrelation function of the condo sales series (Jupyter Notebook output)]

From the autocorrelation, it looks as though there are small peaks in correlations every 9 lags — but these lie within the shaded region of the autocorrelation function and thus are not statistically significant.
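The shaded region in a standard ACF plot is the approximate 95% confidence band under a white-noise null, roughly ±1.96/√N. A minimal sketch of that check, computing the sample autocorrelations with numpy rather than a plotting library (the random series here is a stand-in for the condo sales data):

```python
import numpy as np

def acf(series, nlags):
    """Sample autocorrelation at lags 0..nlags."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(nlags + 1)])

rng = np.random.default_rng(0)
series = rng.normal(size=52)         # e.g. 52 quarterly observations
r = acf(series, nlags=12)

# Approximate 95% band under the white-noise null: a lag is only
# "significant" if its correlation falls outside this band.
bound = 1.96 / np.sqrt(len(series))
significant = np.abs(r[1:]) > bound  # lag 0 is always exactly 1
```

A peak inside the band, like the 9-lag bumps described above, is indistinguishable from noise, which is why it should not be read as seasonality.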

What if we tried to forecast quarterly sales using a lookback period of 9 for the XGBRegressor model?

The same model as in the previous example is specified:

from xgboost import XGBRegressor
model = XGBRegressor(objective='reg:squarederror', n_estimators=1000)
model.fit(X_train, Y_train)
testpred = model.predict(X_test)

The predictions and test data are reshaped for analysis:

Y_test = Y_test.reshape(-1, 1)
testpred = testpred.reshape(-1, 1)

Now, let’s calculate the RMSE and compare it to the mean value calculated across the test set:

>>> test_mse = mean_squared_error(Y_test, testpred)
>>> rmse = sqrt(test_mse)
>>> print('RMSE: %f' % rmse)
RMSE: 24508264.696280

>>> np.mean(Y_test)
47829860.5

We can see that in this instance, the RMSE is quite sizable — accounting for 50% of the mean value as calculated across the test set.

This indicates that the model does not have much predictive power in forecasting quarterly total sales of Manhattan Valley condos.

Given that no seasonality seems to be present, how about if we shorten the lookback period? Let’s try a lookback period of 1, whereby only the immediate previous value is used.

>>> test_mse = mean_squared_error(Y_test, testpred)
>>> rmse = sqrt(test_mse)
>>> print('RMSE: %f' % rmse)
RMSE: 21323954.883488

>>> np.mean(Y_test)
35266600.64285714

The size of the mean across the test set has decreased, since there are now more values included in the test set as a result of a lower lookback period. This has smoothed out the effects of the peaks in sales somewhat. However, we see that the size of the RMSE has not decreased that much, and the size of the error now accounts for over 60% of the total size of the mean.
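The change in sample count follows directly from the windowing: a sliding window of length `lookback` over a series of N points yields N − lookback (X, Y) pairs, so a shorter lookback leaves more of the series available as test samples. A quick illustration (the series length here is chosen for illustration, not taken from the dataset):

```python
def n_windows(series_len, lookback):
    # A sliding window of length `lookback` over `series_len`
    # points produces this many (X, Y) pairs.
    return series_len - lookback

test_len = 23  # illustrative length for a quarterly test split
print(n_windows(test_len, lookback=9))  # prints 14
print(n_windows(test_len, lookback=1))  # prints 22
```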

Therefore, using XGBRegressor (even with varying lookback periods) has not done a good job at forecasting non-seasonal data.

Conclusion

There are many types of time series that are simply too volatile or otherwise not suited to being forecast outright. However, all too often, machine learning models like XGBoost are treated in a plug-and-play manner, whereby the data is fed into the model without any consideration of whether the data itself is suitable for analysis.

Of course, there are instances where XGBoost can be useful for time series forecasting and can outperform more traditional time series models such as ARIMA — as we saw when forecasting electricity consumption.

Therefore, the main takeaway of this article is that whether you are using an XGBoost model — or any model for that matter — ensure that the time series itself is first analysed on its own merits. This means determining the overall trend and whether a seasonal pattern is present.

The allure of XGBoost is that one can potentially use the model to forecast a time series without having to understand the technical components of that time series. As we have seen, this is not the case.

Understanding the time series data is paramount; only then is it worth running a number of models on that data to determine which ones perform best.

Many thanks for your time, and any questions or feedback are greatly appreciated. You can also find the GitHub repository which illustrates the use of ARIMA and XGBRegressor to forecast electricity consumption patterns here.

