Author Topic: 538 Model Megathread  (Read 84820 times)
Mr. Morden
Atlas Legend
*****
Posts: 44,066
United States


« on: June 29, 2016, 07:55:51 PM »

Clinton's closest states (polls only):

Arizona
North Carolina
Colorado
Ohio
Iowa
Florida <-- Tipping point
Virginia
New Hampshire
Nevada
Pennsylvania

Wouldn't that order make Virginia the tipping point?
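The mechanics behind a list like this are easy to sketch: order the states from the leader's strongest to weakest, accumulate electoral votes on top of the safe base, and whichever state pushes the running total past 270 is the tipping point. The safe-EV base and state ordering below are invented for illustration (not 538's actual numbers); note that with this particular base, Florida rather than Virginia ends up tipping.

```python
# Sketch of a tipping-point calculation. The 217 safe EVs and the
# ordering of the close states are hypothetical, purely to show how
# the bookkeeping works.

def tipping_point(base_evs, states_by_margin):
    """states_by_margin: list of (state, ev), leader's strongest first."""
    total = base_evs  # EVs from states safer than any listed
    for state, ev in states_by_margin:
        total += ev
        if total >= 270:
            return state
    return None

# Hypothetical: Clinton holds 217 safe EVs; her closest states listed
# strongest-for-her first (the reverse of the list in the post above).
close_states = [
    ("Pennsylvania", 20), ("Nevada", 6), ("New Hampshire", 4),
    ("Virginia", 13), ("Florida", 29), ("Iowa", 6),
    ("Ohio", 18), ("Colorado", 9), ("North Carolina", 15),
    ("Arizona", 11),
]
print(tipping_point(217, close_states))  # Florida (217+20+6+4+13 = 260, +29 = 289)
```

With a different safe-EV base the tipping point shifts, which is why the answer depends on more than the ordering alone.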
« Reply #1 on: July 25, 2016, 07:23:57 PM »

Serious question - why does the Nowcast do a trendline? That doesn't make sense to me. If Clinton leads in Nevada now (as she does per their poll model) why would the Nowcast not predict her as ahead?

Maybe I'm misunderstanding something, but the latest 5 it's showing have Trump up in 3 and Clinton up in 1. The most recent is Clinton +4, but it gets adjusted to a tie based on other factors (e.g. national polling, covariance with other states). So I guess it doesn't surprise me.

If you look at the Nevada Nowcast it has a polling average of Clinton +1.5%.

Adjustment for some of their stuff brings it to Clinton +1.2%.

THEN they adjust for "trend" which swings it by 4.5% in Trump's favour and gives it as Trump +3.4%.

I don't get what trend they're adjusting for if it's "election held today". The net swing is roughly the same as in the forecasts for November 8th.

They are looking at how the national polls have changed since those Nevada state polls were conducted, and assuming Nevada has gone to the right at the same pace since then as the rest of the country (even though we don't have recent Nevada polls to confirm this).

Yeah, now-cast doesn't mean you take all the polls as is at the time they were taken.  It means you take the polls and adjust them for any trends in national polls that have happened since they were taken, to see what they would look like if taken right now.
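The trend adjustment being described amounts to shifting a stale state poll by however much the national race has moved since the poll was taken. A minimal sketch, with invented margins roughly the size of the Nevada numbers quoted above:

```python
# Illustrative trend adjustment: not 538's actual formula, just the
# idea described above. Margins are in points; positive = Clinton lead.

def trend_adjust(state_margin_then, national_then, national_now):
    national_shift = national_now - national_then
    return state_margin_then + national_shift

# A state average of Clinton +1.2 taken when Clinton led nationally by
# 4.0; the national race has since moved to Trump +0.5:
print(round(trend_adjust(1.2, 4.0, -0.5), 1))  # -3.3, i.e. Trump +3.3
```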
« Reply #2 on: July 25, 2016, 08:08:12 PM »

Quote:
I thought about that but then why are the trends not a lot stronger for the November forecasts?

If a particular poll in a given state was taken a month ago, and Trump has gained nationally since then, then you adjust the poll in a Trump-ward direction to figure out what the margin would be if the poll was taken now.  But there's no reason to assume that same trend keeps going in the same direction between now and November.  We have no idea what the trend is going to be between now and November.

So I'm not sure on the details of how the now-cast number differs from the polls-only number.  I guess there's some difference between the predictive power of a poll taken immediately before an election and one taken months beforehand, and the model reflects that, but I don't know the guts of it.  I'm assuming that one facet of it would be that now-cast places heavy weight on polls taken within the past week as opposed to polls taken a month ago, because if the election was held today, then polls taken within the past few days would be highly predictive.  However, if the election is three months away, then the weighting on more recent polls isn't as heavy.  Polls taken three months before an election might not be hugely more predictive than polls taken four months before an election.
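That weighting intuition can be sketched with an exponential decay on poll age whose half-life tightens as election day approaches. To be clear, the half-life formula below is invented for illustration; 538 hasn't published their weighting in this form.

```python
# Toy poll-weighting scheme: a poll's weight decays with its age, and
# the decay is much steeper when the election is close. The half-life
# rule (days_to_election / 4, floored at 3 days) is a made-up parameter.

def poll_weight(poll_age_days, days_to_election):
    half_life = max(3.0, days_to_election / 4.0)
    return 0.5 ** (poll_age_days / half_life)

# A month-old poll is nearly worthless on election eve...
print(round(poll_weight(30, days_to_election=1), 3))
# ...but still carries substantial weight three months out.
print(round(poll_weight(30, days_to_election=90), 3))
```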
« Reply #3 on: August 30, 2016, 08:06:13 PM »

Iowa seems strange to me. 50/50 in nowcast, but in Polls Only Hillary is at 60 and in Polls Plus Hillary is at 53. Seems very strange to say if the election were held today it is a total tossup, yet, somehow JUST USING POLLS, Hillary has a 60% shot on election day??? thoughts?
The Polls Only model is not aggressive. On election day you generally don't give any weight at all to polls older than, say, 5-6 days. Simplified.
I'm not following. Polls Only model is based on ONLY POLLS. Are you saying Polls Only uses different polls than Nowcast? I would think if polls show if election were TODAY the results are 50/50, but if we wait 2 months, Hillary is more likely to win, something doesn't really make sense. I could deal with 53% or something like that, but 60% is pretty significantly different, no?

The polls being used are the same, but the weights they’re given are different.  If the election were today, then polls that are just coming out would be weighted much more heavily than those from two weeks ago.  Whereas if the election is 10 weeks from now, then, well, there’s no reason to weight polls from 10 weeks before the election *that* much more than polls 12 weeks before the election.
« Reply #4 on: September 07, 2016, 01:23:26 PM »

It would be rather dishonest for 538 to continually tweak the model over the course of the campaign to "correct" predicted state margins that "seem implausible".  If you do that, you're basically replacing a model with your gut.  Just leave the model as is, let it run, and if there are any huge misses on election day, you can work to improve your model for the next election.
« Reply #5 on: September 25, 2016, 11:04:53 AM »

Odd... you'd think they'd have scripts to automate those things.

How would such a script work?  Search through the PDF of the poll release for the Clinton and Trump numbers?  When every pollster will have poll releases that look different?  It's a lot easier to just type in the numbers manually.
« Reply #6 on: September 30, 2016, 07:16:30 PM »

I don't think the swings have been "insane" at all.  This is a pretty normal election as far as poll changes have gone.  2012 was an outlier for being so stable.

As far as this race goes, in the 538 polls-only model, there's been some drift of a few points back and forth around a Clinton lead of about 3 points.  But, newsflash: when the national popular vote margin changes by 2 points, it means that more than one or two states change from being favored by one party to the other.  That's how the electoral college works.  There are these close states that we call "swing states", and it doesn't take much movement in the national margin to switch them from one candidate to the other.
« Reply #7 on: September 30, 2016, 11:07:37 PM »

I feel like some folks here are convinced they "know" what Clinton's victory margin is going to be.  Like she's going to win by 4 points or 5 points or whatever.  And then when the polls match that number, it confirms their prior assumptions, whereas when the polls don't match that number, it's just a temporary blip, or some weird phenomenon with polling crosstabs or something like that.  They think that the 538 model should always "know" that the true margin is Clinton by 4 or 5 points or whatever, even though it's based on polls, and the polls don't always show that kind of margin.

But obviously if the polls move from Clinton +1 to Clinton +4 or whatever it is now, then the probabilities in the model will change more than a little.  What else are people expecting to happen?
« Reply #8 on: October 01, 2016, 12:49:38 AM »

My point is that I would expect a decent amount of movement in the polls-only and especially the now-cast models. I would think that the polls-plus model, however, would be a bit more static, especially in the probabilities it assigns to Trump or Clinton winning.

Polls plus is still mostly based on polls though.  There are some "fundamentals" added in, but this close to the election, I don't think they count for very much.  In terms of predictive power based on quantitative observations, polls are a much stronger predictor of how the election is going to turn out than anything else out there.  And so, when the polls go from a tie to a 4 points Clinton lead or whatever it is now, it's going to make a big change in the probabilities.

Now, granted, maybe we can predict certain changes in the polls, and therefore be "smarter" than the model.  For example, we could predict that Clinton would have been more likely to gain in the polls from the debate than Trump, sure.  But how is a mathematical model supposed to "know" things like that?  Should we plug Clinton's and Trump's IQs into the model as extra parameters?
« Reply #9 on: October 10, 2016, 04:55:02 PM »
« Edited: October 10, 2016, 04:58:26 PM by Mr. Morden »

The now-cast relies solely on polls, and not at all on voting history or demographics*, correct?  In that case, of course it's going to give some wonky results for states where there are very few polls, and the polls that do exist there are of poor quality.  I don't see why that's a big problem though.  Who really cares how good the model is at predicting Rhode Island?  The swing states are a lot more important.

In any case, there seems to be some confusion that keeps coming up here about probabilities.  "Is the probability of X correct or not?" is a question that doesn't even make any sense in a vacuum.  The probability depends on the set of information that you're considering.  The model is saying: XX% of the time that the polls say this, we would see Candidate Y win.  That's pretty much it, since there's very little non-polling information that goes into it.  But because we have access to additional information, like voting history, candidate quality, recent scandals, etc., that the model doesn't have, the potential exists for us to "beat the models".

It's like in a football game, you could have a point where the commentator says "28% of the time when a team is down by this much with this much time left, with this field position, they win."  That can be a useful stat to have.  The viewer instinctively knows that the team in question is the underdog, given the score, the clock, and the field position, but translating that into a probability on the fly isn't something that a normal person can do.  But that doesn't mean that you should treat that 28% as gospel.  If you think team quality, weather or other factors are also important, then you can mentally shift that 28% up or down in your head to get a more "realistic" assessment of the probability.  But that doesn't mean that the initial probability given based on just a few observables isn't a useful thing to know.

Same thing with 538's model.  If you think there's additional information that shifts the probabilities higher or lower, that isn't included in the model, then good for you.  You're free to mentally shift the probabilities up or down.  But that doesn't make the model estimates useless.

*Except to the extent that demographics are involved in computing the correlations between different states' votes, but that only takes you so far.
« Reply #10 on: October 10, 2016, 05:05:09 PM »

There is a way to check to see if he's right or not; if events that he says has x% chance of happening actually do end up happening x% of the time.

His predictions have ended up being quite accurate in that regard.

This is not measurable.  It's impossible to verify whether (for example) a 92% chance of Obama winning in 2012 was accurate.  (I think the final number was close to that.)  We have a sample of ONE.  To get close to checking whether 92% was accurate, you'd need to rerun the 2012 election dozens of times, which is obviously impossible.

"Checking" whether the probability of a single event was "correct" or not is nonsensical.  The point is that if you run the same model on many different elections, then you should be able to check if the model is any good by seeing if 92% favorites do indeed win 92% of the time.  The problem is, there are only so many presidential elections to look at since the advent of polling, so you're checking with small number statistics.  You can overcome that problem by looking at each individual state separately, but the states are correlated with each other, so it's a bit messy.

I assume you can also do it for Senate races, which are presumably going to be less correlated.
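The check described here is standard calibration: bucket predictions by their stated probability and compare against the observed win rate in each bucket. A minimal sketch with made-up data:

```python
# Calibration table: for each probability bucket (50-60%, 60-70%, ...),
# what fraction of the predicted favorites actually won? Sample data
# below is invented.
from collections import defaultdict

def calibration_table(predictions):
    """predictions: list of (stated_prob_of_win, actually_won)."""
    buckets = defaultdict(lambda: [0, 0])  # bucket -> [wins, total]
    for prob, won in predictions:
        b = min(int(prob * 10), 9)  # 0.5-0.6 -> bucket 5, etc.
        buckets[b][0] += int(won)
        buckets[b][1] += 1
    return {b: wins / total for b, (wins, total) in sorted(buckets.items())}

# Toy data: four ~90% favorites, three win; two ~50-50 races, one wins.
sample = [(0.92, True), (0.91, True), (0.95, True), (0.93, False),
          (0.51, True), (0.55, False)]
print(calibration_table(sample))  # {5: 0.5, 9: 0.75}
```

A well-calibrated model's observed win rates should track the bucket midpoints, up to sampling noise; with correlated states, the effective sample is smaller than the raw count of races.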
« Reply #11 on: October 10, 2016, 05:27:16 PM »

All right, that's a fair point as far as it goes; but as you noted, the number of Presidential elections since such models existed is far too small a sample to be useful.

The election doesn't have to have happened "since such models existed".  You can go back to elections from long before your model existed, and apply the model and see how it performs, as long as you have the polling data from those old elections.  But you're still stuck using elections since *polling* existed.  And more than that, elections for which large numbers of state polls existed.  And I guess that takes you back a few decades, but not more than that.  So yes, a limited sample.

To be clear, going back to past elections to see how the polls predict the final outcome is how the models were constructed in the first place.  However, since you're again stuck with a limited sample, you have the potential for over-fitting.
« Reply #12 on: October 10, 2016, 10:08:43 PM »

For me the biggest surprise on that map is West Virginia.

That's probably because polls tend to underestimate eventual margins of victory, thus why most Safe Trump states will trend D, according to that map, and most of the trend R states are Clinton states.

The third party vote being bigger this time could also decrease the margin of victory slightly in blowout states.  E.g., suppose you have a state where Trump would otherwise beat Clinton 60%-40%.  If 10% of both candidates' supporters shifted to Johnson+Stein, then it becomes 54%-36%, moving it from a 20 point margin to an 18 point margin.
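The arithmetic in that example, spelled out:

```python
# Shift an equal fraction of each major candidate's support to third
# parties and see how the margin narrows. Numbers from the example above.

def shift_to_third_party(trump, clinton, fraction):
    t = trump * (1 - fraction)
    c = clinton * (1 - fraction)
    return t, c, t - c

t, c, margin = shift_to_third_party(60.0, 40.0, 0.10)
print(round(t, 1), round(c, 1), round(margin, 1))  # 54.0 36.0 18.0
```

The margin shrinks by exactly the shifted fraction (20 points times 0.9 = 18), since both candidates lose proportionally.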
« Reply #13 on: October 17, 2016, 09:02:40 PM »

Tipping point state is now PA (in polls only) and MN (in polls plus and nowcast).
« Reply #14 on: October 18, 2016, 01:19:30 PM »

Quote:
Tipping point state is now PA (in polls only) and MN (in polls plus and nowcast).
And now with the latest updates, MN is the tipping point in all three models.  Might not last very long though, as it's pretty much tied with PA.
« Reply #15 on: October 23, 2016, 01:54:20 PM »

Wasn't really sure where I wanted to put this, but the Princeton Election Consortium now has their odds of a Clinton victory at 97% in their random drift projection and at 99% in their Bayesian projection, which is the first time I've seen 99% pop up in their numbers. They have the EVs at 336-202.

EDIT: In fact, while I'm here, here's everyone's projections as collected by NYT



Am I correct in my recollection that one thing that sets the 538 model apart is the comparatively high degree of correlation between the states in their model?  That is of course going to make a huge difference.
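A toy simulation shows why the degree of inter-state correlation moves the headline probability so much: model each state's margin as its own noise plus a shared national error term, so that when the shared term is large, states miss together and the tails fatten. All parameters below are invented for illustration, not anyone's actual model.

```python
# Correlated vs. near-independent state errors in a Monte Carlo EV count.
# Five toy mega-states, each leader +2 and worth 90 EVs, so 3 of 5
# states are needed for 270.
import random

def win_prob(n_sims, state_leads, shared_sd, state_sd, seed=1):
    random.seed(seed)
    wins = 0
    for _ in range(n_sims):
        national_err = random.gauss(0, shared_sd)  # hits every state at once
        evs = sum(ev for lead, ev in state_leads
                  if lead + national_err + random.gauss(0, state_sd) > 0)
        if evs >= 270:
            wins += 1
    return wins / n_sims

states = [(2.0, 90)] * 5
print("mostly shared error:", win_prob(20000, states, shared_sd=3.0, state_sd=1.0))
print("mostly independent: ", win_prob(20000, states, shared_sd=0.1, state_sd=3.2))
```

With nearly independent errors the favorite's leads in five separate states are five separate chances to survive a miss; with a big shared error, one bad national miss flips all of them at once, so the underdog's probability comes out meaningfully higher.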
« Reply #16 on: October 25, 2016, 04:05:38 PM »

I just realized that another model that Nate could use would include early-voting turnout/expectations alongside the factors included in Polls-Plus, something like Polls-Premium?

I would certainly like to see how elements such as the extremely positive turnout in states like FL, TX, and AZ would pull things one way or another. What do you guys think?

Everything in the model (both polls and the "poll plus" variables) is based on the predictive value of the parameters as established by previous presidential elections.  Has early voting really been around long enough to tell you how much predictive power those things have on the election outcome?  You have to be able to say "X% of the time where this variable says Y, the leading candidate wins."  But if early voting has only been around for one or two presidential elections, then you don't have a large enough sample to be able to say anything.
« Reply #17 on: October 28, 2016, 09:32:04 AM »

nate just said that all of trump's gains come from johnson...he is literally bleeding left and right.


https://twitter.com/NateSilver538/status/791980369562181632

"Literally" bleeding left and right, huh?  "Literally"....

Sounds painful.
« Reply #18 on: October 29, 2016, 09:09:15 AM »

Out of curiosity, I looked up which states polls-only is currently projecting will be won by one candidate or the other with less than 50% of the vote.  I came up with these 13 states:

UT: Trump 35.6
AZ: Trump 46.6
IA: both candidates at 46.6
OH: Trump 46.9
NM: Clinton 47.0
AK: Trump 47.3
NV: Clinton 47.5
NC: Clinton 48.5
CO: Clinton 48.7
FL: Clinton 48.8
GA: Trump 48.9
NH: Clinton 49.0
MI: Clinton 49.6

6 of the 13 are in the West.  I guess that's where Johnson (+McMullin) is strongest.
« Reply #19 on: October 31, 2016, 08:14:14 AM »

Quote:
6 of the 13 are in the West.  I guess that's where Johnson (+McMullin) is strongest.

Definitely needed to look at Nate Silver's projections to come up with this. Yep. Thanks for the insightful analysis.

You're welcome.
« Reply #20 on: October 31, 2016, 11:17:37 AM »

If you use HuffPo 3-way and limit the polls to nonpartisan pollsters and likely voters the current numbers from their trend lines are Clinton 45.1%, Trump 40.7%, Johnson 5.1%. For the 2-way it gives 45.8% to 42.4% (moderate smoothing)
Why would you do any of this?

This in response to posts on both sides complaining about inclusion of partisan pollsters. HuffPo lets me filter them out so I did this as a point of comparison.

By further comparison that same exercise today has Clinton 45.2%, Trump 42.5%, Johnson 4.9% in the 3-way and Clinton 45.4% to 42.8%  in the 2-way. Clinton's share in both is essentially unchanged from my post 4 days ago.

HuffPo lets you do that?  How?

At the top of the page, click on "customize this chart", and you can select which pollsters you want to include, as well as how much smoothing to use.
« Reply #21 on: October 31, 2016, 11:46:06 AM »

Incidentally, I hadn't noticed previously that HuffPo also allows you to view the polling trendlines among just Dems, just Republicans, and just Indies.

I just compared them, and they showed what I'd been noticing anecdotally in the individual polls: Trump isn't gaining ground among Republicans.  His gains have come from Independents instead.  As the 3rd party vote share has declined, Clinton's gained ground with Dems and Trump has gained ground with Indies.  But Republicans have remained flat, with Trump pulling ~80% of them, compared to Clinton getting ~86% of Dems.  Since Oct. 2, for example, this is the change in support from each party group, using moderate smoothing:

Dems
Clinton +2.9
Trump -1.1
Johnson -0.3

GOP
Clinton no change
Trump -0.2
Johnson -0.9

Indies
Clinton -0.7
Trump +1.9
Johnson -1.5

Link to the charts:

Democrats

Republicans

Independents
« Reply #22 on: November 01, 2016, 02:47:52 PM »

It doesn't make any sense to only rate the accuracy of the various models on the basis of the predictions they're making on election day itself.  The models also make predictions a week before election day, a month before election day, four months before election day, etc.  The model that's most accurate on election day itself might not be the most accurate a month out.  And I think it's more interesting to look at accuracy from a month out or a week out.  After all, if you've already reached election day, then how accurate your model is is no longer a very interesting question, as you're going to know the results in a few hours anyway.
« Reply #23 on: November 01, 2016, 03:36:37 PM »

Quote:
This is one thing I've been wondering a lot about. I think somebody needs to develop a methodology for measuring time-based performance of these models. I've thought that maybe one metric would be showing how long before the election the model decided a correct outcome was more likely than not and didn't shift back. That has the disadvantage of treating anything over 50% as a "call," but it would be illustrative, I think. If one model wound up saying Florida was 50.3% for Obama, but half a day earlier had said it was 50.2% for Romney, then that's not a very valuable call. But if it had said for two weeks that it was likelier than not to go for Obama, we could have some more confidence in it as a predictive model.

The obvious way to measure the accuracy of the models as a function of time is to take, at each time interval, all of the races that the model thinks one candidate is favored in 50-60% of the time and check if the favorite does indeed win 50-60% of the time, then the same where the model has someone as a 60-70% favorite, a 70-80% favorite, etc.  Do the probabilities at a week out match the outcomes?  Or a month out?  Or whatever time interval you want.
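The "how long was it continuously right" metric from the quoted post is also easy to operationalize: scan a race's daily win-probability series backwards from election day and count how many days the eventual winner was favored without interruption. A sketch, with a made-up series:

```python
# Count how many days before the election the model continuously
# favored the eventual winner (probability > 0.5 with no flip-backs).

def days_continuously_correct(daily_winner_probs):
    """daily_winner_probs: the winner's daily win probability, oldest first."""
    days = 0
    for p in reversed(daily_winner_probs):
        if p > 0.5:
            days += 1
        else:
            break
    return days

# Toy series: the model flips to the winner 3 days out and stays there.
series = [0.48, 0.52, 0.49, 0.51, 0.56, 0.60]
print(days_continuously_correct(series))  # 3
```

As the quoted post notes, this treats a 50.3% lean as a "call", so it's best read alongside a calibration check rather than instead of one.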
« Reply #24 on: November 02, 2016, 01:41:18 PM »

What is the uncertainty of the model, given the # of simulations they run?  That is, each time they enter new polls, they apparently run 10,000 simulations based on the latest #s, and that produces (among other things) an overall projected vote margin and win probability.  But let's say they then ran *another* 10,000 simulations with the same input #s but a different random seed?  How different would the results be?  Because if they'd be different, then in theory you could put in a favorable poll for one candidate and it would end up "helping" the other candidate, just because of simulation noise.  But I'm assuming that 10,000 is enough for the simulation noise to be small?
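There's a standard back-of-envelope answer to the simulation-noise question: a win probability estimated from N independent simulations is a binomial proportion, with standard error sqrt(p(1-p)/N).

```python
# Monte Carlo noise on an estimated win probability from n_sims
# independent simulation runs.
import math

def mc_standard_error(p, n_sims):
    return math.sqrt(p * (1 - p) / n_sims)

# At p = 0.85 with 10,000 simulations the noise is about a third of a
# percentage point, so a real poll's effect should normally dominate it.
print(round(mc_standard_error(0.85, 10_000), 4))  # 0.0036
```

So yes, 10,000 runs keeps the seed-to-seed jitter small, though a genuinely tiny poll effect (a few hundredths of a point) could still be swamped by it.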