In the 2020 Democratic presidential primaries and caucuses in Iowa, New Hampshire, South Carolina, Alabama, Colorado, Minnesota, Missouri and Arizona, AP VoteCast is based on interviews with a random sample of registered voters eligible to vote in the Democratic primary election or caucus, drawn from state voter files. In California, Massachusetts, North Carolina, Texas, Virginia, Michigan, Ohio, Florida and Illinois, these probability-based interviews are combined with self-identified registered voters selected from nonprobability online panels. Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected from state voter files can be contacted by phone and mail, and have the opportunity to take the survey by phone or online. AP VoteCast features interviews with both voters and nonvoters.
Interviews begin six days before the day of the election or caucus, and are conducted until polls close. Respondents who complete the survey during the first four days of the field period and consent to be re-contacted in the final three days of the field period are sent a follow-up survey via email, text message or phone call that includes vote choice. For respondents who complete both waves of the survey, their most recent vote choice is used for estimation.
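The most-recent-response rule can be illustrated with a minimal Python sketch. The function name and its two parameters are hypothetical, chosen for illustration; they are not AP VoteCast's actual data schema.

```python
from typing import Optional


def vote_choice_for_estimation(wave1: Optional[str],
                               wave2: Optional[str]) -> Optional[str]:
    """Pick the vote choice used for estimation.

    wave1: choice given in the initial interview (None if not given).
    wave2: choice given in the follow-up survey (None if the respondent
           did not complete a follow-up).
    """
    # Prefer the follow-up answer when both waves were completed;
    # otherwise fall back to the only answer available.
    return wave2 if wave2 is not None else wave1
```

A respondent who named one candidate in the first wave and a different one in the follow-up would be counted under the follow-up answer.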
Sampling & Data Collection
In each state, AP VoteCast consists of probability-based interviews conducted online and by telephone with between 1,750 and 2,000 voters, as well as 500 to 900 non-voters. In some states, the survey will also include nonprobability interviews with 600 to 2,000 self-identified registered voters recruited by Lucid or Dynata. RelevantID is used to screen out duplicate entries in the event a registered voter is recruited by both nonprobability sample providers.
NORC obtains the sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses for all records and phone numbers for some records, allowing potential respondents to be contacted via mail and landline and mobile telephone. In addition, NORC attempts to match sampled records to a registered voter database maintained by L2, which provides additional phone numbers.
The sample is further defined based on the type of primary election each state conducts. In a “closed” primary, only voters registered with a party may cast a ballot in that party’s primary. In a “semi-closed” primary, unaffiliated voters and voters registered with a party may cast a ballot in that party’s primary. In an “open” primary, any registered voter may cast a ballot in a party primary.
- Arizona and Florida are “closed” primary states and the sample is limited to voters registered as Democrats.
- California, Colorado, Massachusetts, New Hampshire and North Carolina are “semi-closed” and the sample is limited to voters registered as Democrats and unaffiliated voters.
- Alabama, Illinois, Michigan, Minnesota, Mississippi, Missouri, Ohio, South Carolina, Texas, and Virginia are “open” primary states and the sample consists of all registered voters. The same sample design is used for the Iowa Democratic caucuses.
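The three sample definitions above can be summarized as a simple eligibility rule. The sketch below is illustrative only: the party labels ("DEM", "UNAFFILIATED") and the function name are assumptions, not fields from the actual voter file.

```python
def in_sample_frame(primary_type: str, party: str) -> bool:
    """Return True if a registered voter with this party registration
    falls inside the sample for a Democratic primary of the given type.

    primary_type: "closed", "semi-closed" or "open".
    party: hypothetical registration label, e.g. "DEM", "REP", "UNAFFILIATED".
    """
    if primary_type == "closed":
        # Only voters registered as Democrats.
        return party == "DEM"
    if primary_type == "semi-closed":
        # Registered Democrats plus unaffiliated voters.
        return party in {"DEM", "UNAFFILIATED"}
    if primary_type == "open":
        # All registered voters, regardless of party.
        return True
    raise ValueError(f"unknown primary type: {primary_type!r}")
```

Under this rule, an unaffiliated voter is in the sample in a semi-closed or open state but not in a closed one.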
All probability sample records are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number to complete the interview with an NORC interviewer. Postcards are addressed by name to the sampled registered voter if that individual is under age 35; postcards are addressed to “registered voter” in all other cases. A subset of non-respondents may be dialed to complete the survey by telephone. Telephone interviews are conducted with the adult who answers the phone, following confirmation of registered voter status in the state.
The margin of sampling error, including the design effect, will be approximately plus or minus 3.4 percentage points for voters in states with 2,000 probability interviews with voters and approximately plus or minus 3.7 percentage points for voters in states with 1,750 probability interviews with voters. As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.
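To see how sample size and design effect combine into figures like these, the sketch below uses the textbook margin-of-error formula for a proportion. It assumes a 95 percent confidence level (z = 1.96) and the most conservative proportion (p = 0.5); neither assumption, nor the design effect itself, is stated in the text above, so the design effect shown is back-calculated for illustration only.

```python
import math

Z_95 = 1.96  # assumed 95% confidence level (not stated in the source)


def margin_of_error(n: int, deff: float, p: float = 0.5, z: float = Z_95) -> float:
    """Margin of sampling error in percentage points for n interviews,
    inflated by the design effect deff."""
    return 100 * z * math.sqrt(deff * p * (1 - p) / n)


def implied_design_effect(moe_points: float, n: int,
                          p: float = 0.5, z: float = Z_95) -> float:
    """Design effect implied by a published margin of error and sample size."""
    return (moe_points / 100 / z) ** 2 * n / (p * (1 - p))


if __name__ == "__main__":
    for n, moe in [(2000, 3.4), (1750, 3.7)]:
        print(n, round(implied_design_effect(moe, n), 2))
```

Under these assumptions, both published figures are consistent with a design effect of roughly 2.4 to 2.5, meaning the weighted sample carries about as much information as a simple random sample around 40 percent of its size.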
Weighting
AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample when available, and refines estimates at a subregional level within each state. Each of the state surveys is weighted separately.
First, weights are constructed separately for the probability sample and the nonprobability sample (when available) for each state survey. These weights are adjusted to population totals to correct for demographic imbalances of the responding sample compared to the target population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file, and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age and race/ethnicity.
The sample is further weighted based on the type of primary election each state conducts.
- In states where only registered members of each party can participate, the sample is weighted to the population of registered voters of the Democratic Party.
- In states where only registered party members or unaffiliated voters can participate, the sample is weighted to the population of registered voters of the Democratic Party and unaffiliated voters who are not registered to any other party.
- In states where any registered voter can participate, the sample is weighted to the population of registered voters.
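The first-step adjustment to population totals can be sketched with raking (iterative proportional fitting), a common way to align survey weights with known demographic margins. The text above does not name the algorithm actually used, so raking is an assumption here, and the variables, categories and totals are invented for illustration.

```python
def rake(respondents, base_weights, targets, iterations=50):
    """Adjust weights so weighted totals match population totals on each variable.

    respondents: list of dicts, e.g. {"age": "under_45", "sex": "F"}.
    base_weights: starting weight for each respondent.
    targets: {variable: {category: population_total}}.
    """
    weights = list(base_weights)
    for _ in range(iterations):
        for var, totals in targets.items():
            # Current weighted total in each category of this variable.
            current = {}
            for r, w in zip(respondents, weights):
                current[r[var]] = current.get(r[var], 0.0) + w
            # Scale every respondent so the category totals hit the targets.
            weights = [w * totals[r[var]] / current[r[var]]
                       for r, w in zip(respondents, weights)]
    return weights


# Illustrative data only: four respondents and made-up population totals.
sample = [
    {"age": "under_45", "sex": "F"},
    {"age": "under_45", "sex": "M"},
    {"age": "45_plus", "sex": "F"},
    {"age": "45_plus", "sex": "M"},
]
population = {
    "age": {"under_45": 40.0, "45_plus": 60.0},
    "sex": {"F": 55.0, "M": 45.0},
}
raked = rake(sample, [1.0, 1.0, 1.0, 1.0], population)
```

Changing the target population by primary type, as in the list above, would amount to swapping in a different set of totals.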
Second, for the states that have a nonprobability component, the nonprobability sample respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as ideology, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on the probability sample estimates.
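As a rough illustration of calibrating to probability-sample benchmarks, the sketch below rescales nonprobability weights so that their weighted distribution on a single variable (ideology) matches the probability sample's. The actual calibration model and variable set are not described in the text above; the one-variable post-stratification, function names and data are all assumptions.

```python
from collections import defaultdict


def weighted_shares(categories, weights):
    """Weighted share of each category (shares sum to 1)."""
    totals = defaultdict(float)
    for c, w in zip(categories, weights):
        totals[c] += w
    grand_total = sum(totals.values())
    return {c: t / grand_total for c, t in totals.items()}


def calibrate(nonprob_categories, nonprob_weights, benchmark_shares):
    """Rescale nonprobability weights to reproduce the benchmark shares.

    The sum of the weights is unchanged; only their distribution shifts.
    """
    current = weighted_shares(nonprob_categories, nonprob_weights)
    return [w * benchmark_shares[c] / current[c]
            for c, w in zip(nonprob_categories, nonprob_weights)]


# Illustrative data only.
prob_ideology = ["liberal", "moderate", "moderate", "conservative"]
prob_weights = [1.0, 1.0, 1.0, 1.0]
benchmarks = weighted_shares(prob_ideology, prob_weights)

nonprob_ideology = ["liberal", "liberal", "moderate", "conservative"]
calibrated = calibrate(nonprob_ideology, [1.0, 1.0, 1.0, 1.0], benchmarks)
```

In this toy case the nonprobability sample over-represents liberals, so their weights are halved and the lone moderate's weight is doubled.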
Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability sample and the calibrated nonprobability sample (if available), and then uses a small area model to improve the estimate within subregions of a state.
Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10 to 20 subregions within each state. For the Iowa Caucus, the survey results are weighted to the first alignment of caucus-goers, which reflects the preference of voters as they arrived at their caucus site.
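One simple way to picture this final step is cell-level post-stratification: within each subregion, respondents who report voting for a candidate are scaled so that their combined weight equals that candidate's actual vote total there. The text above does not specify the procedure, so this is a sketch under that assumption, with hypothetical field names and invented counts.

```python
from collections import defaultdict


def weight_to_vote_count(respondents, actual_votes):
    """Return new weights matching actual votes per (subregion, candidate).

    respondents: list of dicts with "subregion", "candidate" and "weight".
    actual_votes: {(subregion, candidate): vote_count}.
    """
    # Weighted survey total in each (subregion, candidate) cell.
    weighted = defaultdict(float)
    for r in respondents:
        weighted[(r["subregion"], r["candidate"])] += r["weight"]

    # Scale each respondent by actual votes / weighted survey total for the cell.
    new_weights = []
    for r in respondents:
        cell = (r["subregion"], r["candidate"])
        new_weights.append(r["weight"] * actual_votes[cell] / weighted[cell])
    return new_weights


# Illustrative data only.
voters = [
    {"subregion": "north", "candidate": "A", "weight": 1.0},
    {"subregion": "north", "candidate": "A", "weight": 1.0},
    {"subregion": "north", "candidate": "B", "weight": 2.0},
    {"subregion": "south", "candidate": "A", "weight": 1.0},
]
results = {("north", "A"): 300, ("north", "B"): 100, ("south", "A"): 50}
final_weights = weight_to_vote_count(voters, results)
```

After this adjustment, weighted candidate totals in every subregion reproduce the certified counts, while the relative weights of respondents within a cell are preserved.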
Learn more about our methodology in the 2018 midterm elections here.