Saturday, November 5, 2011

Polling: NC1, NC2, NC3, and Precinct-level Post-Strat

NC3pre4
Estimated Obama Approval by precinct in the North Carolina 3rd Congressional District

One thing we've been trying to do is push the limits of what can be done with a traditional poll of a district. This is a map of estimated Obama Approval by precinct, based on a multi-level Bayesian spatial model that took into account past election results, the 2008 Presidential Exit Poll, and the poll of 750 Registered Voters we did of the district. The precinct estimates should be accurate to about six points, which is pretty incredible considering that there are only a couple respondents per precinct.

Here's our first round of public polls: North Carolina's 1st Congressional District: [Obama approval, Congressperson Approval]. Click here for a detailed poll report with cross-tabs.

President Obama depended on strong support from the heavily African-American 1st Congressional District of North Carolina to eke out a win in the state in 2008. Since he retains the support of the black voters in the district, the President is still on a good track to repeat that showing in this district.

He leads Mitt Romney by 28 points, by a 60-32 margin, thanks to his pretty strong 56-41 approval rating. He also leads Rick Perry by a 60-33 margin, but remember all of these polls were taken in mid-October as test subjects for our new polling outfit, so considering Gov. Perry has since taken a nose-dive in his favorability numbers, he would probably do even worse today.

It is striking that even in this heavily Democratic district Governor Beverly Perdue has a negative approval rating at 37-44, but she still leads former Charlotte Mayor Pat McCrory by a 47-36 margin. The discrepancy stems from the fact that African-American voters approve of her by just 44% to 29% disapproval, but still vote for her over McCrory by a 66-16 margin.

Senator Kay Hagan does better than Governor Perdue, but worse than President Obama-- mostly because she's less well known than the President and thus many supporters of President Obama are hesitant to back her-- she gets only 70% of voters who approve of President Obama. But she still leads Congresswoman Renee Ellmers by a margin of 49-30.

Congressman G.K. Butterfield is moderately well-liked in the district, with an approval rating of 37% to 33% disapproval.




North Carolina's 2nd District: [Obama approval, Congressperson Approval]. Click here for a detailed poll report with cross-tabs.

North Carolina's old 2nd Congressional District is the classic example of a swing district with a slight Republican tilt: It was carried by former President George W. Bush by convincing margins twice, but bolted for President Obama by a narrow margin in the 2008 Presidential Election, allowing him to win the state of North Carolina. Its incumbent Democratic Congressman Bob Etheridge was ousted by an even narrower margin by Republican Renee Ellmers after a personal scandal in which he attacked two teenagers tracking him with a video camera.

Probably the most interesting finding in this poll: Fmr. Congressman Etheridge has regained some of his former popularity, leading Congresswoman Ellmers 46-40 in a hypothetical rematch. However, considering the district was made more strongly Republican in redistricting, this lead probably wouldn't exist under the new boundaries.

President Obama sports a negative 47-50 job approval rating, but leads Governor Romney by a 7-point margin-- 48-41. It is notable that here, as well, almost none of the undecided voters approve of President Obama's job performance, so once these voters come home, Romney should be in a narrow lead here.

Governor Perdue sports an atrocious 29-53 approval rating and trails likely Republican nominee Pat McCrory by an 11-point margin-- 36-47. With these numbers, reelection seems close to out of reach for her.

Senator Hagan leads Congresswoman Ellmers by one point in Ellmers' own district-- 42 to 41. Still, Ellmers is not terribly disliked by her district-- she has a 37 percent to 31 percent approval rating-- which is better than a lot of politicians these days.

47% of voters in this district want President Obama to become more conservative, while 18% want him to become more liberal, and 35% are comfortable with his current positioning.

North Carolina's 3rd District: [Obama approval, Congressperson Approval]. Click here for a detailed poll report with cross-tabs.


The third Congressional District of North Carolina is a classic Southern district: It is made up of some African-American voters, but mostly of ancestrally Democratic, conservative Whites. This is perfectly illustrated by the fact that Democrats have a 12-point Party-ID gap over Republicans here (38% of respondents identify themselves as Democrats, 37% as Independents, and just 26% as Republicans)-- but at the same time 50% of respondents say they're Conservatives, with just 16% identifying themselves as liberals.

President Obama is very unpopular in this district. He has an approval rating of 36 to 59 percent, trails Governor Romney by 12 points (37-49), and that's only going to get worse as people who still haven't formed an opinion on Romney, but disapprove of the President's job performance come home.

Governor Perdue has an approval rating of 28-54 here-- which is notable, since her home base of New Bern is located in this seat. She trails Pat McCrory by 11 points, 33-44. Winning this district in 2008 even while President Obama lost it heavily against Senator McCain was a cornerstone for Gov. Perdue's victory in 2008. She doesn't seem likely to be able to repeat that.

Senator Hagan trails Congresswoman Ellmers by three points here-- 37-40, but with 23% of voters undecided this race is very much in flux. Congressman Walter Jones is decently popular at a 39-28 approval rating, but primary polling we'll be releasing over the next few days shows that he has at least something to worry about in the Republican primary, if a strong challenger emerges.


Our polls are of registered voters who've voted at least once in the last four years, and have been weighted by Age, Race, and Gender.

MRP, IRT, and Polling

For the past three years, this blog has mainly focused on election forecasting and poll aggregation. Honestly speaking, we seem to be pretty good at that. But after having predicted US House district results with an average error of 2.5 points in 2010, there honestly isn't much room for improvement for finding out who is going to win the election.

Our final 2010 race forecasts graphed against election results.

That's why I'm really excited about newer statistical techniques that will allow us to answer more subtle and interesting questions about the electorate.




Estimated Obama Approval in the first half of 2011, derived from weekly PPP polling over six months (n~20,000). Blue is more Democratic, Red is Republican, Green indicates groups for which there wasn't enough data to reliably estimate opinion. Color scale is different for Black voters to attenuate Age and State differences (California is at 75% approval).

This map was created with a technique called MRP, which uses multi-level logistic regression in order to produce sub-group estimates that are much more accurate than traditional naive disaggregation of the data into various crosstabs.
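The post-stratification half of MRP can be sketched roughly as follows. This is a minimal illustration, not the model used for the map: it assumes the multi-level regression has already produced an approval estimate for each demographic cell, and every cell estimate and population count below is made up.

```python
# Post-stratification step of MRP (sketch). In a real analysis the cell-level
# estimates would come from a multi-level logistic regression; the numbers
# here are hypothetical placeholders.

def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of per-cell opinion estimates."""
    total = sum(cell_counts[cell] for cell in cell_estimates)
    weighted = sum(cell_estimates[cell] * cell_counts[cell] for cell in cell_estimates)
    return weighted / total

# Hypothetical cells for one state: (age group, race) -> estimated approval
cell_estimates = {
    ("18-29", "white"): 0.45,
    ("18-29", "black"): 0.90,
    ("65+", "white"): 0.35,
    ("65+", "black"): 0.85,
}

# Hypothetical census counts for the same cells
cell_counts = {
    ("18-29", "white"): 300,
    ("18-29", "black"): 100,
    ("65+", "white"): 500,
    ("65+", "black"): 100,
}

state_estimate = poststratify(cell_estimates, cell_counts)
print(f"Post-stratified approval: {state_estimate:.3f}")
```

The point of the weighting is that the state-level number reflects how common each group is in the population rather than how many respondents from that group happened to be in the poll.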


Probability of supporting Jack Davis (I), Jane Corwin (R), or Kathy Hochul (D) as a function of IRT-derived ideology for the NY-26th Special Election

To produce this graph, we used a Bayesian IRT model to estimate inherent respondent ideology based off of their survey answers, much like political scientists estimate legislators' ideological beliefs from their votes in Congress.
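To give a feel for what an IRT model assumes, here is a sketch of the response curve for a single yes/no survey item. The two-parameter logistic (2PL) form is an assumption made for illustration; the exact parameterization behind the graph is not spelled out here, and the three-candidate question would need a multinomial version.

```python
import math

def response_probability(ideology, discrimination, difficulty):
    """2PL item response function: P(answering 'yes') given latent ideology.

    discrimination: how sharply the item separates respondents (hypothetical name)
    difficulty: the ideology value at which the probability is exactly 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ideology - difficulty)))

# A hypothetical item that splits respondents around ideology = 0
for ideology in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = response_probability(ideology, discrimination=1.5, difficulty=0.0)
    print(f"ideology={ideology:+.1f}  P(yes)={p:.2f}")
```

Fitting the model means estimating each respondent's ideology together with each item's parameters from the full set of survey answers.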

Understandably, it's pretty hard to get pollsters to provide the raw individual level polling data that is needed for this kind of analysis. So we decided to build our own data by building our own pollster. We're aiming to bring the kind of sophisticated statistical techniques that previously were only seen in academic journals into the realm of timely analysis. Check back soon to see our first public polls. 

Tuesday, June 21, 2011

Women Representatives are about as Liberal as their districts

Yglesias writes

As measured by DW-NOMINATE scores, Republican men and Republican women are the same. But Democratic women are much more liberal than their male counterparts:




The next step that it would be interesting to see is whether this is just a coincidence in which no women Democrats represent the kind of districts that tend to elect moderate Democrats. You could compare everyone’s DW-NOMINATE score to their district’s PVI and see if there’s a gender gap in representation.
This is easy enough to check:

Estimated ideology among Representatives in the 111th Congress vs PVI (Average of Kerry and Obama vote). Ideology estimated from Jackman's pscl package in R, which has some nerdy advantages over DW-NOMINATE.
Now for a regression.

Once you adjust for district partisanship, it doesn't seem that Women representatives are any more liberal than would be expected.
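The logic of that adjustment can be sketched with a toy example. This is not the regression behind the result above: the six members are invented, ideology is assumed to be scored so that lower means more liberal, and the gender gap is measured on residuals from a one-variable fit rather than with a gender term in a multiple regression.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical members: (district Democratic vote share, ideology, is_woman)
members = [
    (0.40, 0.20, False), (0.50, 0.00, False), (0.60, -0.20, False),
    (0.70, -0.40, True), (0.80, -0.60, True), (0.55, -0.10, True),
]
pvi = [m[0] for m in members]
ideology = [m[1] for m in members]

# Step 1: explain ideology with district partisanship alone
slope, intercept = fit_line(pvi, ideology)
residuals = [y - (intercept + slope * x) for x, y in zip(pvi, ideology)]

def gap(values):
    """Mean for women minus mean for men."""
    women = [v for v, m in zip(values, members) if m[2]]
    men = [v for v, m in zip(values, members) if not m[2]]
    return mean(women) - mean(men)

# Step 2: compare the gender gap before and after the adjustment
raw_gap = gap(ideology)
adjusted_gap = gap(residuals)
print(f"raw gap: {raw_gap:+.3f}, adjusted gap: {adjusted_gap:+.3f}")
```

In this made-up data the women look more liberal only because they represent more Democratic districts; once partisanship is accounted for, the gap disappears.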

Tuesday, May 17, 2011

Generic Ballot Polling Review

With renewed media focus on House Generic Ballot polling, it's a good idea to see how things have shifted since the election:

The past two months of Generic Ballot Polling. For a full table, click here



In a familiar pattern, Rasmussen polling strongly diverges from everyone else. And the relative frequency of Rasmussen vs non-Rasmussen polling at any given point swamps any actual change of opinion in any naive poll filtering method. 

Large Statistically significant house-effects

In order to get an idea about how many seats the Democratic party can expect to win from a generic ballot result, we do a simple univariate regression relating overall Democratic vote with the percentage of seats won by Democrats in presidential elections since 1948. This is somewhat crude, but because we don't know what the districts will look like in 2012, more sophisticated analysis isn't really possible. 




In order to tease out pollster differences and the LV-RV disparity, we've used our 2010 poll filter that adjusts for house-effects.
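As a much-simplified sketch of what a house-effect adjustment does (the actual filter is a Bayesian model that also tracks opinion over time), one can treat each pollster's house effect as its average deviation from the consensus and subtract it. The pollster names and numbers are hypothetical, and the sketch assumes opinion is flat over the period and that the consensus itself is unbiased.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical generic-ballot polls: (pollster, Democratic vote share)
polls = [
    ("Pollster A", 0.47),
    ("Pollster A", 0.45),
    ("Pollster B", 0.51),
    ("Pollster B", 0.53),
    ("Pollster C", 0.50),
    ("Pollster C", 0.48),
]

def house_effects(polls):
    """Each pollster's mean deviation from the average of pollster averages."""
    by_pollster = defaultdict(list)
    for pollster, share in polls:
        by_pollster[pollster].append(share)
    pollster_means = {p: mean(shares) for p, shares in by_pollster.items()}
    # Average over pollsters, not polls, so a prolific pollster cannot dominate
    consensus = mean(list(pollster_means.values()))
    return {p: m - consensus for p, m in pollster_means.items()}

effects = house_effects(polls)
adjusted = [(pollster, share - effects[pollster]) for pollster, share in polls]

for pollster, effect in sorted(effects.items()):
    print(f"{pollster}: house effect {effect:+.3f}")
```

Averaging over pollsters rather than over individual polls is what keeps the estimate from moving just because one firm releases more often than the others.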

House-effect adjusted estimates of Democratic Vote-Share over time in the Generic Ballot among Registered Voters with one standard deviation confidence intervals shown.

That's a registered voter estimate. Republicans almost always perform better among likely voters than registered voters, and the extent of this difference is often called the "Enthusiasm Gap". 
Discrepancy between Likely and Registered polls in the last four national elections, estimated using Stochastic Democracy's Bayesian DLM model with House-Effects in a previous post

Using this, we can construct estimates conditional on different turnout scenarios:


This is all consistent with a very close race for control of the House of Representatives in 2012. While this analysis shows a very slight edge for the Democrats, they face a disadvantage due to large Republican gains in state legislatures that will affect voting law and redistricting, and due to incumbency advantage working in the Republicans' favor after their strong performance in 2010. Intrade puts the Democrats' chances at 42%, which seems reasonable.




Tuesday, May 3, 2011

Preliminary Canada Post-Mortem

With all ridings now fully reporting, there's now enough data to pick apart our model and see how it did. 

How our model did in comparison to ThreeHundredEight.com, the primary Canadian Forecaster. We actually produced probability distributions instead of numbers, and the distributions for the Bloc and NDP were heavily skewed:

Pre-Election probability distributions for each party's seat count.

Our pre-election probability distribution for the number of Bloc Seats. 

Our pre-election forecasts graphed against Election Results (Not including Saanich--Gulf Islands). 


Table showing deviation of election results from polling by province. 


Table showing average absolute error of our forecasts by party and by province. Our predictions were best in Quebec and worst in Atlantic Canada.

But our model did more than just produce forecasts. We put a lot of effort into making our model probabilistic so as to produce accurate confidence intervals:


And it seems to have paid off! 89.6% of election results fell within our 90% confidence intervals, 94% in our 95% confidence intervals, and 98.9% in our 99% confidence intervals. So when we were wrong, it seems we were exactly as wrong as we said we would be!
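A calibration check of this kind is simple to compute. The sketch below uses five invented riding results and intervals purely to show the calculation; it is not the data behind the percentages above.

```python
def coverage(results, intervals):
    """Share of actual results that fall inside their forecast intervals."""
    hits = sum(low <= actual <= high
               for actual, (low, high) in zip(results, intervals))
    return hits / len(results)

# Hypothetical riding-level vote shares and their 90% forecast intervals
results = [0.42, 0.31, 0.55, 0.18, 0.47]
intervals_90 = [(0.38, 0.46), (0.25, 0.33), (0.50, 0.58), (0.20, 0.28), (0.40, 0.52)]

observed = coverage(results, intervals_90)
print(f"Nominal 90% intervals covered {observed:.0%} of results")
```

A well-calibrated model should show observed coverage close to the nominal level once there are enough ridings; coverage far above the nominal level would mean the intervals are too wide, and far below would mean they are too narrow.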

Mean, Median, Mode

Summary statistics from the Bloc Québécois pre-election seat forecast distribution. Seems like a good Stats 101 example.
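For readers who want the Stats 101 version, here is how the three statistics separate on a right-skewed distribution. The seat counts are invented draws, not our actual Bloc forecast.

```python
from statistics import mean, median, mode

# Hypothetical simulated seat counts with a long right tail
seat_draws = [2, 3, 3, 3, 4, 5, 7, 10, 15, 38]

summary = {
    "mean": mean(seat_draws),      # pulled upward by the long tail
    "median": median(seat_draws),  # the middle of the ordered draws
    "mode": mode(seat_draws),      # the single most common outcome
}
print(summary)
```

With a long right tail the ordering is mode, then median, then mean, which is why a single "expected seats" number can look nothing like the most likely outcome.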

Monday, May 2, 2011

Results are in!

The Conservatives have won a majority. Otherwise, our projections seem to be right on the money, and considerably better than the forecasts at Threehundredeight.com and Sexton at Fivethirtyeight at the New York Times.

The Conservatives did about 4-5 points better than polls showed, and while I need to look at the numbers in more detail, it does seem to be consistent with the idea that roughly 15% of Liberal voters who told pollsters that their second choice was the Conservative party...voted for Conservatives.

http://www.cbc.ca/news/politics/canadavotes2011/#/284 is a good guide. 

New Canada Forecasts

Before I get to explaining the forecasts, which are quite enthusiastic about the chances of the NDP, it's important to stress how dependent they are on the strange situation in Quebec.

Graph showing polls in Quebec over time

Right now, polls have consistently shown the collapse of the Bloc, and based on regional polling we can expect them to receive about 23% of the vote. If that is indeed what happens, then the forecasts below are going to hold up very well.

The problem though, is that individual riding polls taken in the past week have shown individual Bloc candidates doing about 8 points better than you would expect based off of regional polls. This discrepancy is reminiscent of the 2010 election, where there were plenty of Southern Democrats who told local pollsters they would vote Democratic due to popular candidates and good campaigns and National pollsters that they would vote Republican based off of national issues. In 2010, it turned out that they were mainly telling the truth to national pollsters, but there have been elections where the opposite was true.

Based purely off of trends inferred from local polls, you'd expect the Bloc to win around 42 seats, entirely at the NDP's expense. Based off of regional polls, they'd get around 7 seats on average.

Outside of Quebec, things should hold up fine. Riding level estimates are available here and here.

Note the Seat distribution of the Bloc. Due to the wide uncertainty regarding Quebec, the Bloc could theoretically win almost every Riding in Quebec. They're just not particularly likely to pick up any particular seat. Uncertainty with regards to the Bloc comes at the expense of the NDP, which is why their distribution is so skewed.







Thursday, April 28, 2011

Useless Graph of the Day


Calculated from Simon Jackman's excellent pscl package in R. Unfortunately, the 35th Parliament was the most recent one I could find with a machine readable roll call matrix (If anyone can send me something more recent...). The obvious outliers are due to by-elections and party defections.

Roughly speaking, these graphs are created by pscl by extracting latent dimensions that best describe variation of voting amongst legislators.

Voting seems to be two-dimensional, corresponding roughly to "Pro-government vs anti-government" and "Quebec Separatism". This is in contrast to the United States, where voting is strongly one dimensional on the basis of ideology. Ideology doesn't seem to be extractable from voting records in Canada, as the NDP appears between the Right and the Liberals, probably due to strong control over the agenda by the Liberals who were careful not to call votes that would differentiate leftist parties.

It's also odd that party unity is so much weaker amongst Liberals than the opposition. It'd be interesting to see if that continued as time went on.

Tuesday, April 26, 2011

Canadian Federal Election Forecasts




Forecasts for the May 2nd Canadian Federal Election. Click here for a larger version.

Probability Distributions for each Party's Seat Count

Opinion over time with 70% confidence intervals. Click here for a larger version.

Canadian Seat and Vote counts by region

These forecasts were made with Stochastic Democracy's Bayesian Canadian Election Model. It works by first estimating Public Opinion in each of Canada's provinces based off of regional polls, and then applying uniform swing on a regional basis. The model takes into account all polls that were released before yesterday. Individual district-level vote predictions are available here, probability estimates are available here.
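The uniform swing step can be sketched as follows. This shows only the deterministic core, without any of the uncertainty the full model carries, and the regional and riding numbers are hypothetical.

```python
def regional_swing(current_poll, last_regional_result):
    """Change in each party's regional share since the last election."""
    return {party: current_poll[party] - last_regional_result[party]
            for party in current_poll}

def apply_uniform_swing(riding_last_result, swing):
    """Shift each party's previous riding share by its regional swing."""
    return {party: share + swing.get(party, 0.0)
            for party, share in riding_last_result.items()}

# Hypothetical numbers for one region and one riding inside it
last_regional = {"CPC": 0.22, "LPC": 0.24, "NDP": 0.12, "BQ": 0.38}
current_poll = {"CPC": 0.18, "LPC": 0.15, "NDP": 0.40, "BQ": 0.23}
riding_last = {"CPC": 0.20, "LPC": 0.20, "NDP": 0.15, "BQ": 0.45}

swing = regional_swing(current_poll, last_regional)
projection = apply_uniform_swing(riding_last, swing)
projected_winner = max(projection, key=projection.get)
print(projected_winner, {p: round(s, 2) for p, s in projection.items()})
```

Note that a large enough swing can push a small party's projected share below zero in a weak riding, so a real implementation needs some rule for clipping and renormalizing.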

This is similar to what has been done by ThreeHundredEight, who has generously provided his extensive Canadian poll dataset. The advantage here is that this model is fully probabilistic, which allows rigorous answers to questions like "What's the probability that the Conservatives will win a majority of seats?" (30%), or "What's the probability that the NDP will end up with more seats than the Liberals?" (51%).
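With a probabilistic model, questions like these reduce to counting simulated outcomes. The sketch below assumes the model's output is a list of simulated seat counts; the eight draws are invented and far too few for a real estimate.

```python
MAJORITY = 155  # seats needed for a majority in the 308-seat House of Commons

def probability(draws, event):
    """Fraction of simulated elections in which the event occurs."""
    return sum(1 for draw in draws if event(draw)) / len(draws)

# Hypothetical simulated seat counts, one dict per simulated election
draws = [
    {"CPC": 143, "NDP": 95, "LPC": 60},
    {"CPC": 151, "NDP": 70, "LPC": 75},
    {"CPC": 158, "NDP": 80, "LPC": 55},
    {"CPC": 149, "NDP": 60, "LPC": 85},
    {"CPC": 146, "NDP": 100, "LPC": 50},
    {"CPC": 160, "NDP": 65, "LPC": 70},
    {"CPC": 152, "NDP": 90, "LPC": 52},
    {"CPC": 147, "NDP": 72, "LPC": 80},
]

p_majority = probability(draws, lambda d: d["CPC"] >= MAJORITY)
p_ndp_ahead = probability(draws, lambda d: d["NDP"] > d["LPC"])
print(f"P(Conservative majority) = {p_majority:.2f}")
print(f"P(NDP ahead of Liberals) = {p_ndp_ahead:.2f}")
```

Because every question is answered from the same set of draws, the answers stay consistent with each other, which a table of point estimates cannot guarantee.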

As a quick primer for American Readers, Canada has five major political parties: The Conservatives, The Liberals, The NDP (Can be loosely thought of as a Social Democratic party to the Liberals left), The Bloc Québécois (Quebec Separatists), and the Greens.

Because Canada elects its Parliament based on Plurality voting in Geographical districts, Conservatives have been able to control government with less than 40% of the vote due to a fractured left-wing. Meanwhile, the Bloc Québécois have been able to obtain a disproportionate number of seats due to their concentrated support in...Quebec. The Greens tend to receive six to ten percent of the vote nationally, but have consistently failed to concentrate their vote to win and hold a single seat.

The campaign was mostly uneventful until about a week ago, when the NDP unexpectedly started to surge around the country. Our model now predicts that the NDP is a very slight favorite to overtake the Liberals as the second largest party in Canada and to overtake the Bloc Québécois as the largest party in Quebec.

The big question mark is whether this surge will persist into the election next week, and to what extent left-wing voters will vote tactically, and if they do, who they will vote tactically for. 

Wednesday, March 16, 2011

Miami Mayor Recall Election driven mostly by Republicans

In a lopsided result that shocked most advisors, 88% of Miami voters voted to recall Mayor Carlos Alvarez (R- Miami). Some Democrats suggested parallels with the likely upcoming recall elections of Republican State Senators in Wisconsin over controversial anti-Union legislation. Republicans were quick to point out that Alvarez's recall campaign centered around the mayor's property tax increases and perceived concessions to county employee unions.

Alvarez recall vote by precinct. Alvarez did not win a single precinct with more than two voters.

Vote for Republican Senate Candidate Marco Rubio in 2010 vs Yes vote by precinct. Statistically significant but not particularly strong relationship. 

The question is to what extent this recall was driven by Democratic or Republican disapproval. Nothing passes with 88% of the vote without bipartisan approval, but a deeper look at the data reveals a disproportionately Republican electorate.

Turnout for the 2011 Mayoral Recall Election by precinct

2010 Senate results by precinct, falling mostly on ethnic lines. Note spatial similarity to the previous map.

A graph confirming the seeming relation between the last two maps. Precincts where Rubio performed strongly had much better turnout relative to 2010 than Democratic precincts.
Election results under different years' precinct turnout levels.

These factors combined to produce an incredibly Republican electorate relative to 2010, a year that had by far the best Conservative turnout in a decade. Roughly 50% of votes in the special election came from precincts where Rubio received more than 55% of the vote; these same precincts made up 37% of the votes in 2010. If precincts had turned out in 2008 at 2011 levels, Obama would have nearly lost the county, traditionally a Democratic stronghold.
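The counterfactual works by holding each precinct's vote split fixed and swapping in another election's turnout. Here is a minimal sketch with two invented precincts; it assumes preferences within a precinct would not change with turnout, which is a strong simplification.

```python
def county_share(precincts, turnout_key):
    """County-wide candidate share when precincts are weighted by a given turnout."""
    total_votes = sum(p[turnout_key] for p in precincts)
    candidate_votes = sum(p["obama_share"] * p[turnout_key] for p in precincts)
    return candidate_votes / total_votes

# Hypothetical precincts: 2008 Obama share plus turnout in two elections
precincts = [
    {"obama_share": 0.70, "turnout_2008": 1000, "turnout_2011": 200},
    {"obama_share": 0.40, "turnout_2008": 800, "turnout_2011": 400},
]

actual_2008 = county_share(precincts, "turnout_2008")
reweighted = county_share(precincts, "turnout_2011")
print(f"2008 result: {actual_2008:.3f}, at 2011 turnout: {reweighted:.3f}")
```

In this toy example the Democratic precinct drops from a majority of the vote to a third of it, and the county-wide margin shrinks accordingly.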

This isn't bad news for Wisconsin Democrats; they seem to be doing fine. But if they prevail, it's safe to say that their electorate won't look anything like this one.

Tuesday, March 8, 2011

Let's Gerrymander!


It's been 18 months since Stochastic Democracy wrote about Tobler's Law and its applications for Gerrymandering.

In short: Democrats tend to be clustered together, while Republicans are spread out. This creates a large and natural pro-Republican bias when States try to adopt seemingly 'fair' maps based on aesthetically pleasing, 'compact' districts.

But it seems to be intuitive that in a Democratic system a truly fair redistricting would be one that reflects the will of the voters- one in which the share of districts a party gets closely matches its share of votes.

Stochastic Democracy therefore determined the 'fair' share of Congressional districts Democrats should receive in any state by calculating the Weighted 2010 Congressional Averages by Voting Eligible Population- therefore getting rid of both the impact of illegal immigrants (who get counted in the Census but don't actually get to vote) and under-age residents.

We then apply a universal swing- so that the average Congressional Democrat received 50% of the vote. If the districts were fair, we would expect Democrats to win roughly the same percentage of seats in a state that they win in terms of votes.

We can use this to approximate the 'fair' amount of Democratic, Swing and Republican seats there should be in a state as follows:

The number of safe Democratic seats should be the percentage of votes that Democratic candidates would receive in a neutral national year, minus 5% , multiplied by the number of seats the state has, rounded to the nearest integer.

The number of safe Republican seats should be the percentage of votes that Republican candidates receive in a neutral year, minus 5%, multiplied by the number of seats in the state, rounded to the nearest integer.

The number of swing seats should be the total number of seats minus the sum of Democratic and Republican seats.
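The three rules above can be written out as a short function. This sketch assumes a two-party vote, so the Republican share is one minus the Democratic share, and the example state is hypothetical. Python's built-in round sends exact halves to the nearest even integer, which may differ from the rounding used for the table.

```python
def fair_seats(dem_share, seats, margin=0.05):
    """Return (safe Democratic, swing, safe Republican) seat counts.

    dem_share: Democratic share of the vote in a neutral year
    seats: number of Congressional districts in the state
    margin: the 5% subtracted from each party's share before allocating seats
    """
    rep_share = 1.0 - dem_share  # two-party assumption
    safe_dem = round((dem_share - margin) * seats)
    safe_rep = round((rep_share - margin) * seats)
    swing = seats - safe_dem - safe_rep
    return safe_dem, swing, safe_rep

# Hypothetical state: 10 seats, Democrats at 55% in a neutral year
safe_dem, swing, safe_rep = fair_seats(0.55, 10)
print(f"{safe_dem} safe Democratic, {swing} swing, {safe_rep} safe Republican")
```

Subtracting the margin from both parties is what creates the swing seats: roughly ten percent of a state's districts end up unallocated to either side.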

Finally, the following table also shows how many seats are expected to elect Democrats under the boundaries of the 112th Congress, in a neutral year. Keep that in mind when comparing it to the first three columns- the first three columns use the number of seats a state will include in the next decade, the last column uses the number of seats a state was assigned for the last decade.

table gerrymander


It turns out that it is possible in most states to create Gerrymanders that follow these rules. In most cases, they are much cleaner than the current lines.

Compare, for instance, the current Maryland map (6-2 Democratic) with our proposed 5-2-1 map.
current maryland map
Maryland 5-2-1 gerrymander

Even some of the more daunting proposals are possible to draw.
Here, for example, is a wonderful map of New York State by Swing State Project user Johnny Longtorso that creates 8 GOP-leaning and 3 swing districts.


Or, just as interesting, a compact Alabama map by SSP user roguemapper with three districts that were won by Barack Obama in 2008- and that should be easily winnable for Southern Democrats. Of course, this map ignores the VRA requirement for an African-American majority seat. But wouldn't African-Americans in Alabama be better represented in Congress by three Democrats with a majority-Black primary electorate than by just one African-American Representative from a very heavily African-American district?


Granted, in some states it seems impossible to redistrict fairly- those are the states where votes are distributed very evenly throughout the state. It won't be possible, for instance, to carve out a GOP seat in Maine, where John McCain won just one tiny county, or three Republican seats in Massachusetts, where Pres. Obama's weakest county still gave him 53% of the vote. Fair representation being impossible in some states under our current system is, in fact, a troubling sign for our Democracy and should be a rallying cry for Election Reform.

But as long as we operate under the current rules, every effort should be made to ensure fair representation for every voter in the country, Democrat or Republican.

Thursday, March 3, 2011

Preliminary 2012 Senate and Governor projections

We at Stochastic Democracy are proud to announce preliminary forecasts for selected 2011 Gubernatorial, 2012 Gubernatorial and 2012 Senatorial contests- namely the 24 out of 48 races that have already been polled. We expect to expand our forecasts in this area within the next few weeks, but for now our predictions are little more than a smoothed average of available polling data.

Still, it's useful to have all of this data in one centralized place. All of the polls were done with Registered Voters, and so we have provided adjustments for different turnout scenarios in the same style as our Presidential forecasts.
2011 Gubernatorial Elections

2011 governor full

The 2011 West Virginia Gubernatorial Race is fully open at this point. Well, that's maybe the wrong way to put it- West Virginians have clear preferences depending on if Congresswoman Shelley Moore-Capito (WV-02, R) gets in the race. If she does, she's a clear favorite, if not, then the leading Democratic candidates (SoS Tennant and Acting Governor Tomblin) are heavily favored.

2012 Gubernatorial Elections
2012 governor full

If North Carolina Republican Pat McCrory decides to run for Governor, the race leans strongly in his favor. Incumbent Missouri Democrat Jay Nixon is favored to win his race against probable Republican nominee Lt. Gov. Peter Kinder at this point.


2012 Senate Elections


2012 senate full

This is a lot of data to digest and we won't attempt to interpret it right now. But depending on 2012 voter turnout and who wins the primaries, Republicans can either narrowly miss picking up the Senate or narrowly take it over. Informative!