This project forecasts, on each day leading up to the 2024 U.S. presidential election, who will win. Predictions are made using this polling data from 538, along with 538's polling averages for swing states, found on this page.
Below you'll find today's forecast compared to the baseline and to 538's predictions, followed by previous days' forecasts and information about my methodology.
See below for reflections!
| Forecast | Baseline | 538 |
| --- | --- | --- |
| Trump: 268 (48.0%) AZ GA NC NV | Trump: 268 (49.8%) AZ GA NC NV | Trump (50.3% chance) |
| Harris: 270 (48.4%) MI WI PA | Harris: 270 (47.8%) MI WI PA | Harris (49.5% chance) |
My forecast and the baseline assume both candidates win their "safe" states. The formatting is:
Candidate: # of electoral votes (popular vote share) swing states won
538's forecast shows the candidate's % chance of winning the election.
Model
See below for a description of the hyperparameters shown in the lower left-hand corner.
This is my final forecast for the election. Here are a couple of thoughts:
The annotated percentages are 538's forecasts of each candidate's likelihood of winning, and the stars indicate which candidate 538 considers more likely to win. You can see the history of 538's predictions graphed on their website.
As mentioned above, my method assumes each candidate will win their "safe" states, or the non-swing states they are favored to win. You can find more information about which states each candidate is expected to win at 538.
To predict who would win which swing states, I calculated each candidate's share of the two-party popular vote (counting only votes for Trump and Harris), then allocated that share of the swing states' combined electoral votes to each candidate. I awarded states one at a time, starting with the states where, according to 538, the candidate's lead was largest. Once there were not enough allocated votes left to cover the next state, I gave that state to the candidate anyway if their remaining votes were at least half of that state's electoral votes.
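Here is a sketch of that allocation rule in Python. The shares and leads below are made-up placeholders, not my actual inputs; in the real model they come from the daily smoothed averages and 538's swing-state polling averages.

```python
# Sketch of the swing-state allocation rule described above.
SWING_EVS = {"PA": 19, "GA": 16, "NC": 16, "MI": 15, "AZ": 11, "WI": 10, "NV": 6}

def allocate_swing_states(share, leads):
    """share: candidate's two-party popular-vote share (0-1).
    leads: {state: candidate's polling lead there, in points}.
    Returns the list of swing states awarded to the candidate."""
    budget = share * sum(SWING_EVS.values())  # candidate's slice of the 93 swing EVs
    won = []
    # Award states in order of the candidate's lead, largest first.
    for state in sorted(leads, key=leads.get, reverse=True):
        evs = SWING_EVS[state]
        if budget >= evs:
            won.append(state)
            budget -= evs
        else:
            # Not enough left for a full state: award it anyway if the
            # remainder covers at least half its electoral votes, then stop.
            if budget >= evs / 2:
                won.append(state)
            break
    return won

# Hypothetical example: a 49.8% two-party share with these (made-up) leads
print(allocate_swing_states(0.498, {"AZ": 2.1, "GA": 1.5, "NC": 1.2, "NV": 0.4}))
# -> ['AZ', 'GA', 'NC', 'NV']
```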
Although this is a very simple algorithm, the results are not likely to differ too much from the actual outcome. The average sample size of the polls used is about 1,600, giving an estimated margin of error of 2.5%, and all of the leads in the swing states fall well within that margin.
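For reference, that 2.5% comes from the usual back-of-the-envelope 95% margin-of-error approximation for a sample of size \(n\):

\[ \text{MOE} \approx \frac{1}{\sqrt{n}} = \frac{1}{\sqrt{1600}} = 0.025 = 2.5\%. \]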
In computing the daily averages I did not use all of the data available. To begin with, I only used polls taken after Joe Biden dropped out of the race, and only results for Donald Trump and Kamala Harris. I made a few other restrictions on which polls got factored into my daily averages:
I only used polls of likely voters. "Likely" is subjective and depends on the poll. However, as this article from the New York Times explains, it makes no sense to use polls of all registered voters or of all adults, since the outcome is only going to be determined by those who actually vote.
As recent history has shown, there is often a discrepancy between who wins the popular vote and who wins the election. To keep my popular vote predictions consistent with my swing state predictions, I only used national and swing state polls. I left out all polls from non-swing states.
Finally, 538 rates every poll with a "POLLSCORE". You can read about their methodology for determining this metric here. In short, a negative POLLSCORE indicates a more reliable poll, so I only factored in polls whose POLLSCORE was negative.
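Put together, the filtering step might look roughly like the sketch below. It assumes a pandas DataFrame loaded from 538's polling CSV with columns along the lines of `population`, `end_date`, `candidate_name`, `state`, and `pollscore`; the exact file and column names are assumptions here, not something I'm asserting about 538's schema.

```python
import pandas as pd

# Load 538's raw presidential polling data (file and column names assumed).
polls = pd.read_csv("president_polls.csv", parse_dates=["end_date"])

BIDEN_DROPOUT = "2024-07-21"  # polls ending before this date are excluded
SWING_STATES = {"Arizona", "Georgia", "Michigan", "Nevada",
                "North Carolina", "Pennsylvania", "Wisconsin"}

filtered = polls[
    (polls["end_date"] > BIDEN_DROPOUT)                            # post-dropout only
    & polls["candidate_name"].isin(["Donald Trump", "Kamala Harris"])
    & (polls["population"] == "lv")                                # likely voters only
    & (polls["state"].isna() | polls["state"].isin(SWING_STATES))  # national or swing-state
    & (polls["pollscore"] < 0)                                     # reliable pollsters only
]
```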
For a time series with a trend (both candidates seem to be trending up as Election Day approaches), double exponential smoothing appeared to be the most natural choice for forecasting. Recall that the double exponential smoothing model forecasts \(h\) steps ahead of time \(t\) as

\[ \hat{y}_{t+h} = \ell_t + h\, b_t, \]

where the level \( \ell_t \) and slope \( b_t \) are updated from each observation \( y_t \) by

\[ \ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1}), \qquad b_t = \beta (\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1}, \]

with smoothing hyperparameters \( 0 < \alpha, \beta < 1 \).
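In code, the recursion is only a few lines. This is a minimal from-scratch sketch (not my exact implementation), with one common choice of initialization:

```python
def double_exponential_smoothing(y, alpha, beta, horizon=1):
    """Holt's double exponential smoothing.
    y: sequence of observations (e.g., a candidate's daily polling average).
    Returns the forecast `horizon` steps past the last observation."""
    level, slope = y[0], y[1] - y[0]  # one common initialization choice
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + slope)
        slope = beta * (level - prev_level) + (1 - beta) * slope
    return level + horizon * slope

# Hypothetical daily popular-vote shares for one candidate:
series = [47.2, 47.4, 47.3, 47.6, 47.8, 47.7, 48.0]
print(double_exponential_smoothing(series, alpha=0.5, beta=0.3))
```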
Originally I didn't think exponential smoothing would be an appropriate election forecasting method, since my intuition said new predictions shouldn't really rely on polls from that far in the past, and exponential smoothing uses all previous observations to make predictions. On the other hand, I wanted the slope term from double exponential smoothing, which takes trends into account. So my plan was to combine a weighted rolling average over the past 7 days with the double exponential smoothing slope term.
However, when trying to figure out how to modify the model, I realized that the weights already decrease exponentially. If I make my hyperparameters \( \alpha \) and \( \beta \) (one pair per candidate) large enough, the model effectively takes only the past week or so of observations into account, while the rest contribute negligibly. This seemed much simpler.
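To make "negligible" concrete, ignore the slope term for a moment, so the level behaves like simple exponential smoothing. Unrolling the level update gives \( \ell_t = \alpha \sum_{k \ge 0} (1-\alpha)^k y_{t-k} \) (plus a vanishing initial term), so an observation \(k\) days old carries weight \( \alpha(1-\alpha)^k \). With \( \alpha = 0.5 \), for example, a week-old poll contributes only \( 0.5 \times 0.5^7 \approx 0.4\% \) of the current level.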
I ran a grid search to tune my hyperparameters, optimizing the mean absolute error (MAE) of the forecast on the holdout sets relative to the MAE of the naïve forecast on the training set over an equivalent horizon. I used all my polling data as training data, with five cross-validation splits and holdout sets of 7 days each. I did it this way because I thought all the data would be useful in building the model; it didn't make sense to reserve the last few days as a test set. Surprisingly, after running the grid search the hyperparameters all turned out to be small, meaning the better forecasts come from taking all previous data into account!
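A rough sketch of that tuning loop follows, reusing the `double_exponential_smoothing` helper sketched above. The way the splits are carved, the scaling by the training set's naïve-forecast error (as in mean absolute scaled error), and the synthetic series are all illustrative assumptions, not my exact code.

```python
import itertools
import numpy as np

def relative_mae(series, alpha, beta, n_splits=5, horizon=7):
    """MAE of the Holt forecast over rolling 7-day holdouts, scaled by
    the in-sample MAE of the naive 7-day-ahead forecast."""
    scores = []
    for i in range(n_splits):
        cut = len(series) - (n_splits - i) * horizon
        train, test = series[:cut], series[cut:cut + horizon]
        forecast = [double_exponential_smoothing(train, alpha, beta, horizon=h)
                    for h in range(1, horizon + 1)]
        mae = np.mean(np.abs(np.array(test) - forecast))
        # Naive h-step-ahead forecast on the training set: y_hat_t = y_{t-h}
        naive_mae = np.mean(np.abs(np.array(train[horizon:]) - train[:-horizon]))
        scores.append(mae / naive_mae)
    return np.mean(scores)

# Synthetic daily averages with a slight upward trend, to make this runnable:
rng = np.random.default_rng(0)
series = list(47 + 0.01 * np.arange(120) + rng.normal(0, 0.2, 120))

grid = np.round(np.arange(0.05, 1.0, 0.05), 2)
best = min(itertools.product(grid, grid),
           key=lambda ab: relative_mae(series, *ab))
print("alpha, beta =", best)
```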