The New Abnormal: The Cicchetti Declaration in Texas & Friends v. Swing States
Texas filed a lawsuit in the Supreme Court against four other states (Michigan, Wisconsin, Pennsylvania, and Georgia). Others have already weighed in on how unserious a lawsuit this apparently is.
But I want to have a look at something that is a bit more approachable, which is the statistics opinion that Texas Attorney General Ken Paxton relied upon in crafting the lawsuit. It makes some remarkable claims:
The probability of former Vice President Biden winning the popular vote in the
four Defendant States—Georgia, Michigan, Pennsylvania, and Wisconsin—
independently given President Trump’s early lead in those States as of 3 a.m.
on November 4, 2020, is less than one in a quadrillion, or 1 in
1,000,000,000,000,000. For former Vice President Biden to win these four
States collectively, the odds of that event happening decrease to less than one
in a quadrillion to the fourth power (i.e., 1 in 1,000,000,000,000,000^4). See Decl.
of Charles J. Cicchetti, Ph.D. (“Cicchetti Decl.”) at ¶¶ 14-21, 30-31 (App. 4a-7a,
9a).
So I went looking for the declaration of Charles J. Cicchetti, Ph.D. And I didn’t find it at first, because Paxton’s text is in the PDF as both text and image, and when text search failed to find it, I assumed it was in a separate document. No, it’s there, just as ten image-only PDF pages.
So step one for me was to convert the image to text. I am currently at a Windows 10 machine, so a search on OCR of images led to a free app in the Microsoft app store. I installed the ‘Photo to OCR’ app and started using it. It appears to do a creditable job, but I haven’t yet delved into the results closely. I was able to screenshot one-half page at a time, open it in the app, copy out the OCR text to another editor, advance the PDF, and do it again.
I will put the text I got that way here next for others who want to have a look. Apart from the page numbers I added in square brackets, the text is essentially what the OCR program produced. I make no guarantee about accuracy. This is unchecked, and the OCR process loses formatting like table layouts. If you want to use this anywhere that accuracy matters, be sure to check it yourself against the original PDF.
Declaration of Charles J. Cicchetti, Ph.D.
I, Charles J. Cicchetti, declare and state as follows:
1.
I am a resident of the State of California. Since 2016, I have been an independent
contractor and work as a Managing Director at Berkeley Research Group, Inc. The views
expressed are my own and do not reflect the views of any entities with which I am affiliated. I
have personal knowledge of the matters set forth below and could and would testify competently
to them if called upon to do so.
Professional Background
2.
I am an economist with a BA from Colorado College (1965) and a Ph.D. from Rutgers
University (1969), and three years of Post Graduate Research in applied economics and
econometrics at Resources For the Future (RFF). I was formally trained statistics and
econometrics and accepted as an expert witness in civil proceedings. I have been engaged to
design surveys, draw random samples, and analyze and test data for significance, and I have
conducted epidemiology analysis using logit models to determine the significance of relative
odds of outcomes and relative risk. I have also been tasked with evaluating the work of other
experts on the data and methods used and to detect and opine on bias, particularly missing
variable bias.
3.
I have testified in civil, arbitration, and administrative proceedings as an expert witness
hundreds of times since my first appearance in 1967. Much of this work involved data analysis
and interpretation, sampling, and survey design.
4.
I began my professional career after completing my academic and postdoctoral studies at
the University of Wisconsin, Madison, from 1972 to 1985, where I eventually became a tenured
Professor of Economics and Environmental Studies. During this period, I also served in other
capacities, including an early role as the first economist for the Environmental Defense Fund
(EDF), Director of the Wisconsin Energy Office, Special Advisor to the Governor of Wisconsin,
and Chair of the Wisconsin Public Service Commission. I had grants from EDF, the Ford
Foundation, National Science Foundation, and the Planning and Conservation Fund (California).
5.
From 1987 to 1990, I was the Deputy Director of the Energy and Environmental Policy
Center at the John F. Kennedy School of Government at Harvard University. I have taught at the[Page 1 — 1a]
University of Southern California (USC) part-time since 1991, and from 1998 to 2006 I held the
Jeffrey J. and Paula Miller Chair in Government, Business and the Economy.
6.
I worked for and founded a number of consulting firms specializing in applied economics
and econometrics. I currently own Cicchetti Associates, Inc., and I am a member of Berkeley
Research Group. I have written more than twenty books and monographs and many peer
reviewed articles. A true and correct copy of my c. v. is attached as Exhibit 1.
Assignment
7.
I was asked to analyze some of the validity and credibility of the 2020 presidential
election in key battleground states. I analyzed two things that seem to raise doubts about the
outcome. First, I analyzed the differences in the county votes of former Secretary of State Hillary
Clinton (Clinton) compared to former Vice President Joseph Biden (Biden). Second, many
Americans went to sleep election night with President Donald Trump (Trump) winning key
battleground states, only to learn the next day that Biden surged ahead. Therefore, I compared
and tested the significance of the change in tabulated ballots earlier in the reporting to
subsequent tabulations. For both comparisons I determined the likelihood that the samples of the
outcomes for the two Democrat candidates and two tabulation periods were similar and
randomly drawn from the same population. I used a standard statistical test in this comparison.
8.
I was also asked to compare rejection of ballots in 2016 to 2020 in Georgia. I analyzed
data for mail-in ballots and their rejection rates for the two elections. I use this comparison to
estimate how the election outcome would be affected if the rejection rate in 2020 was similar to
2016 when there were many fewer absentee mail-in and other early ballots. The increase in
voters using early ballots in Georgia for the first time would likely cause errors that would
decrease acceptance relative to rejections. Furthermore, the time between the two presidential
elections is short enough that significant changes as discussed in this declaration could not be
due to underlying changes in demographic factors. It is important to determine if there were
instances of opening absentee ballots before election day commenced, which was not permitted.
The specific procedures for acceptance/rejection are important because the Settlement reached in
Georgia, identified in the complaint, agreed to require three registrars to reject a defective ballot.
This change alone would increase acceptance, and likely caused lower rejection rates.[Page 2 — 2a]
9.
I was asked to analyze absentee ballots in Wayne County, Michigan to determine if the
reporting satisfied the requirements for tabulating and reporting ballots. I found that Detroit
precincts do not provide information on voter registration. These same precincts in Detroit do
not report balanced tabulations as required. These failures make it impossible to determine if the
ballots tabulated are valid.
I. Z-Scores For Georgia1
A. Comparing Clinton in 2016 to Biden in 2020 in Georgia
10.
In 2016, Trump won Georgia with 51.0% of the vote compared to Clinton’s 45.9% with
more than 211,000 votes separating them. In 2016, Clinton received 1,877,963 votes and Trump
received 2,089,104. In 2020, Biden’s tabulated votes (2,474,507) were much greater than
Clinton’s in 2016. Trump’s votes also increased to 2,461,837. The Biden and Trump percentages
of the tabulations were 49.5% and 49.3%, respectively.
11.
I tested the hypothesis that the performance of the two Democrat candidates were
statistically similar by comparing Clinton to Biden. I use a Z-statistic or score, which measures
the number of standard deviations the observation is above the mean value of the comparison
being made. I compare the total votes of each candidate, in two elections and test the hypothesis
that other things being the same they would have an equal number of votes.2 I estimate the
variance by multiplying the mean times the probability of the candidate not getting a vote. The
hypothesis is tested using a Z-score which is the difference between the two candidates’ mean
values divided by the square root of the sum of their respective variances. I use the calculated Z
score to determine the p-value, which is the probability of finding a test result at least as extreme
as the actual results observed. First, I determine the Z-score comparing the number of votes
Clinton received in 2016 to the number of votes Biden received in 2020. The Z-score is 396.3.
This value corresponds to a confidence that I can reject the hypothesis many times more than one
in a quadrillion times3 that the two outcomes were similar.
1 Unless otherwise noted, the data used for Georgia are from the Secretary of State in Georgia.
2 The mean of a binomial distribution is defined as the probability of candidate getting a vote times the number of
votes cast.
3 A quadrillion is 1 followed by 15 zeros. Z equal to 10 would reject with a confidence of one in a septillion, or one
followed by 24 zeros, which would be a billion quadrillion, or a trillion, trillion. As Z increases, the number of zeros
increases exponentially. A Z of 396.3 is a chance in 1 in almost an infinite number or outcomes of finding the two
results being from the same population, here Georgia voters preferring a Democrat in 2016 being the same as in
2020.[Page 3 — 3a]
12.
Second, since more ballots were cast I performed an additional hypothesis test of the
similarity of the Clinton and Biden vote percentages to remove the effect of the difference in the
increase number of votes that Biden received relative to Clinton. The estimated Z-score is less
because I removed the influence of differences in the number of ballots tabulated in the two
elections. I continue to find with very great confidence that I can reject the hypothesis that the
percentages of the votes Clinton and Biden achieved in the respective elections are similar. The
estimated Z-score is 108.7. The confidence for rejecting the hypothesis remains many times
more than one in a quadrillion.4
13.
There are many possible reasons why people vote for different candidates. However, I
find the increase of Biden over Clinton is statistically incredible if the outcomes were based on
similar populations of voters supporting the two Democrat candidates. The statistical differences
are so great, this raises important questions about changes in how ballots were accepted in 2020
when they would be found to be invalid and rejected in prior elections.
B. Comparing Early and Subsequent Tabulations for Georgia5
14. At 3:10 AM EST on November 4 the Georgia reported tabulations were 51.09% for
Trump and 48.91% for Biden (eliminating third-party candidates). The total votes reported for
the two major candidates were 4,662,328. On November 18 at 2 PM EST, the reported
percentages were Trump 49.86% and Biden at 50.14%. The Biden advantage over Trump in the
final tabulations reported was less than 14,000 votes, or 0.28%. For this turnaround to occur, the
subsequent additional “late” ballots totaling 268,204 votes (5.4% of the votes reported on
November 18) had to split 71.60% for Biden and 28.40% for Trump. The two periods report
shifts in the percentage favoring Trump from 51.09% to 49.86%, which is a percentage
difference of 1.23%.
15. The Georgia reversal in the outcome raises questions because the votes tabulated in the
two time periods could not be random samples from the same population of all votes cast. I use
a Z-score to test if the votes from the two samples are statistically similar. I estimate a Z-score
4 The estimated confidence is actually about 1 in 1 with 2,568 zeros.
5 The data on the tabulations for early balloting compared to the final tabulations come from the same source for the
different time periods and the five battleground states that I analyzed. The source used was:
https://www.270towin.com/2020-election-results-live/. These are provided by time, date, and state.[Page 4 — 4a]
of 1,891.6 There is a one in many more than quadrillions of chances that these two tabulation
periods are randomly drawn from the same population. Therefore, the reported tabulations in the
early and subsequent periods could not remotely plausibly be random samples from the same
population of all Georgia ballots tabulated. This result was not expected because the tabulations
reported at 3 AM EST represented almost 95% of the final tally, which makes a finding of
similarity for random selections likely and not statistically implausible.
16. Put another way, for the outcome to change, the additional ballots counted would need to
be much different than the earlier sample tabulated. Location and types of ballots in the
subsequent counts had, in effect, to be from entirely different populations, the early and
subsequent periods, and not random selections from the same population. These very different
tabulations also suggest the strong need to determine why the outcome changed. I am aware of
anecdotal statements from election night that some Democrat strongholds were yet to be
tabulated. There was also some speculation that the yet-to-be counted ballots were likely
absentee mail-in ballots. Either could cause the latter ballots to be non-randomly different than
the nearly 95% of ballots counted by 3AM EST, but I am not aware of any actual data supporting
that either of these events occurred. However, given the closeness of the vote in Georgia, 12,670
votes, further investigation and audits should be pursued before finalizing the outcome.
II. Z-Scores for Other Battleground States
17.
I analyzed three additional battleground states, Michigan, Pennsylvania, and Wisconsin. I
reviewed similar matters related to Clinton/Biden differences and early tabulated results and
outcome reversals. The states all had significant increases in early ballots compared to 2016.
This is shown in Table 1 for Georgia and the other three battleground states that I analyzed in
some detail.
Table 1: Early Ballots and Percent Increases Between 2016 and 2020
State
Georgia
Michigan
2016
1,277,405
2020
1,924,838
2020/2016
162.30%
243.60%
866.60%
233.10%
Pennsylvania 288,996
Wisconsin
825,620
6 This would be 1 divided by more than 775,000 zeros.[Page 5 — 5a]
18.
I calculated the same Z-scores for Biden and Clinton total ballots and their respective
percentage of the votes for the four states. These data were Secretary of State certified
tabulations. I analyzed data from what I understand to be a non-partisan neutral source,
270toWin, to compare tabulations when balloting was reported as halted in Georgia discussed
above, and Michigan, Pennsylvania, and Wisconsin states at about 3 AM on November 4, 2020. I
compared this to the data from other time periods from the same source to avoid any reporting
differences. The final tabulations for the two leading candidates that I used in this comparison
are tabulations reported November 18, 2020 at 2PM EST.
19.
Table 2 shows the Z-scores for Georgia discussed above and the other three states.
Table 2: Z Scores Battleground States
                Biden & Clinton   Biden & Clinton   Early
                Votes             Percentage        to Later
Georgia         396.3             108.7             1891
Pennsylvania    290.4             90.7              736
Wisconsin       198.5             77                1271
Michigan        333.1             107.4             586
20.
I reject the hypothesis that the Biden and Clinton votes are similar with great confidence
many times greater than one in a quadrillion in all four states. Similarly, I reject the hypothesis
that the Biden and Clinton percentage of the two leading candidates’ votes are similar with
confidence exceeding many times one in a quadrillion. In fact, the confidence I reject the
similarity in these comparisons with the probability of incorrectly rejecting such hypotheses is
equal to about one divided by one with a thousand or more zeros. Further, when all four
battleground states have the same Clinton to Biden difference, the probability of such a
collective outcome is lower by an exponential factor of four, i.e., the improbability of that
collective outcome effectively raises the odds of all four having the same result to the fourth
power. The probability of there being no meaningful difference in voter preferences for Clinton
and Biden would be approximately one divided by one with about a trillion zeros.
21.
The degree of confidence is even greater for rejecting the hypothesis that the early
morning after election tabulations and the subsequent tabulations were drawn from the same
population of all voters. For example, the Z-score for Michigan is the lowest of the four states[Page 6 — 6a]
shown. The degree of confidence for rejecting the Michigan hypothesis has a one in one with
74,593 zeros. Georgia had tabulated about 95% of the ballots cast by 3 AM EST. The
comparable initial period tabulations in Pennsylvania, Wisconsin, and Michigan were 75%, 89%,
and 69%. These are large enough to expect comparable percentages and vote margins for
random selections of ballots to tabulate early and later. Again, the chance of this happening in
all four states collectively is even far more improbable, and would be about one divided by about
one with a quadrillion zeros.
III. Comparing 2016 Rejection Rates to 2020 Rejection Rates in Georgia
22.
In 2016, the rejection rate for mail-in absentee ballots in Georgia was 6.42%.
2016 Mail-in Absentee Ballots
2016 Mail-in Volume               213,033
2016 Mail-in Ballots Rejected      13,677
2016 Mail-in Rejection Rate         6.42%
23.
In 2020, many more mail-in absentee ballots were tabulated in Georgia, while the
rejection rate dropped to less than 0.37%.
2020 Mail-in Absentee Ballots
2020 Mail in Volume             1,316,943
2020 Mail in Ballots Rejected       4,786
2020 Mail in Rejection Rate       0.3634%
24.
There were 1,316,943 absentee mail-in ballots submitted in Georgia in 2020. The Biden
and Trump combined absentees mail-in ballots equaled 1,300,886. There were 4,786 absentee
ballots rejected in 2020. This is a rejection rate of 0.3634% out of all the absentee mail-in ballots
tabulated. This is much smaller than the number of absentee ballots rejected in 2016, when
13,677 absentee mail-in ballots were rejected out of 213,033 submitted. The 2016 rejection rate
was 6.42%, which is more than seventeen times greater than 2020. This decrease in rejection
rates is very unexpected, since there was more than a six-fold increase in absentee ballot use.
25.
If the rejection rate of mailed-in absentee ballots remained the same in 2020 as it was in
2016, there would be 83,517 fewer tabulated ballots for Biden and Trump in 2020. The
Secretary of State’s certified absentee mail-in ballots for the two major party candidates were[Page 7 — 7a]
split 34.681% for Trump and 65.319% for Biden. If the higher 2016 rejection rate was applied to
the much greater 1,300,886 ballots, and the Biden and Trump shares of rejected ballots was the
same as for all absentee mail-in ballots for the two major party candidates, this would decrease
Trump votes by 28,965 and Biden votes by 54,552, which is a net gain for Trump of 25,587
votes.
26.
The net gain for Trump would be more than the tabulated ballots needed to overcome the
Biden advantage of 12,670 votes. Trump would win by 12,917 votes.
IV. Incomplete Ballots and Non-Reporting in Michigan
27.
I analyzed the absentee ballot data for Wayne County, Michigan, at the precinct level. I
found that 174,384 absentee ballots out of 566,694 absentee ballots tabulated (about 30.8%) were
counted without a registration number for precincts in the City of Detroit starting with Absentee
Vote County Board 1 (ACVB 1) through (ACVB 134). In Wayne County, Biden won 68.4% of
the ballots tabulated.
28.
If this same rate was applied to these votes without a registration number, this would
cause Biden to lose about 119,300 votes and Trump’s comparable loss with 30.3% of the
tabulated vote would be about 52,800 votes. This would be a net gain of about 66,500 votes for
Trump in one county if votes without a voter registration were not counted. If the percent voting
for Biden was greater, the net gain for Trump would be higher. This seems likely since the
precincts were all from Detroit that included absentee ballots without registration identification
in their tabulation.
29. Michigan requires precincts to balance their reported tabulations. William C. Hartmann
and Monica Palmer (Chairperson) are two of the four members of the Board of Canvassers for
Wayne County. They signed affidavits (attached to my declaration) attesting they would not
certify Wayne County’s vote because about 70% of Detroit’s 134 AVCB precincts were not
balanced. This means the numbers reported must match the votes tabulated and ballots could be
misplaced and unexplained mismatches. Given the number of ballots tabulated without a
registration and the number of precincts that are not balanced, there is a need for more complete
investigations and audits.[Page 8 — 8a]
V. Summary
30.
I examine two reasons why further investigation of the vote tally in Georgia, Michigan,
Pennsylvania, and Wisconsin is needed given what, in my opinion, are extremely improbable
results in the 2020 election for president. First, Biden outperformed Clinton in both total votes
and percentage of the final votes in all four states. Second, Trump led in the voting tabulated
before about 3 AM the morning after in all four battlegrounds states. When the additional ballots
were added, Biden passed Trump in all four states. Battleground states are, by definition,
expected to be close to a 50/50 proposition or coin toss. Biden’s collective win in all four of
those battleground states were with percentage margins that far exceed Clinton’s vote results. I
find this statistically to be extremely improbable. In my opinion, this difference in the Clinton
and Biden performance warrants further investigation of the vote tally particularly in large
metropolitan counties within and adjacent to the urban centers in Atlanta, Philadelphia,
Pittsburgh, Detroit and Milwaukee.
31. Data from two different years or in two different time periods for random coin tossing
would not be expected to be much different than 50 heads and 50 tails. If there were differences,
this would suggest something not expected in a fair coin toss game was affecting the outcome.
This could be a defect in the coin or the tossing procedures. Discovering differences would with
high probabilities require more analyses and investigations to determine what happened and
why. In my analysis, I found that the odds of the Clinton/Biden and early versus later tabulations
randomly happening in one state are astronomical, and in all four simultaneously occurring
nearly incomprehensible. Accordingly, all four battleground states should be thoroughly
analyzed, investigated and audited to determine whether the outcome of the vote is accurate. In
my opinion, the outcome of Biden winning in all these four states is so statistically improbable,
that it is not possible to dismiss fraud and biased changes in the way ballots were processed,
validated and tabulated. If the efforts to uncover mistakes and violations are completed, I would
not be surprised that there could be a reversal in the outcome of Biden winning in some or all of
these four battleground states.
32.
I found in Georgia that the rejection rates for absentee ballots in 2020 were much less
than in 2016. This is surprising since so many more voters (more than six times as many) used
absentee mail-in ballots in 2020 compared to 2016. I found that if the previous 6.42% rejection
rate of absentee mail-in ballots in 2016 applied in 2020, there would be about 83,500 fewer votes[Page 9 — 9a]
for the two major party candidates. I estimate that if the same split of all absentee mail-in ballots
for Trump and Biden was applied to the difference in the votes corresponding to the 2016
rejection rate that Trump would have fewer ballots rejected for a net gain in the margin of more
than 25,500 votes, and win the Georgia presidential election by nearly 13,000 votes.
33. The statistical differences that I found in Georgia strongly point to the necessity of
reviewing all ballots to make certain the sharp decrease in rejections and/or curing were accurate
and legally permitted.
34.
I analyzed absentee ballots in Wayne County, Michigan. I found 174,384 ballots in
Detroit were not matched to registered voters. I further read the Affidavit of two of the four
members of the Canvassing Board and learned that about 70% of the Detroit precincts did not
balance the votes tabulated as they are required to do so. Both findings strongly support my
opinion that the vote tally is materially inaccurate and warrant an investigation and audit of these
results.7 I am not an attorney, and this is not intended to be a legal opinion.
[Page 10 — 10a]
I will be adding further content here as I begin going through Cicchetti’s work, but I will post this immediately so anyone who wanted some form of searchable text of it can have that right now.
Ok, I will start in on the analysis. Since this is all about what seems to me to be overly-litigious people all worked up over apparent delusions, let me just say here for the record that all of what you are reading here is my personal opinion. People make mistakes all the time. People who seem to be swayed by confirmation bias or motivated reasoning in particular have a difficult time avoiding pitfalls, and I certainly know that I try to watch out for that sort of thing tripping me up.
First up, I want to make sure I have the claims that Paxton purports Cicchetti’s declaration supports:
- Biden’s win in each of the swing states being sued had p<1e-15.
- Biden’s win for all four of the swing states being sued had p<1e-60.
- Biden’s win for all four of the swing states being sued had p<1e-75 when compared to historical election data.
- If Georgia had the same rate of rejected absentee ballots as in 2016, Trump would have a net gain in votes of 25,587 votes.
- 30.8% of Wayne County, MI absentee ballots were counted without a registration number, and invalidating those would give Trump a win over Biden.
Let’s see what is under the hood of the Cicchetti statistics machine and whether both Cicchetti and Paxton have carried this through correctly.
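As a first look under the hood, the arithmetic behind the Georgia rejection-rate claim can be reproduced from the figures in Items 25 and 26 of the declaration. The sketch below is mine, not Cicchetti’s: the function and constant names are made up for illustration, and the rounding at each step is an assumption chosen because it reproduces his stated numbers.

```python
# Figures as stated in the declaration (Items 25 and 26); none are independently verified.
TWO_PARTY_ABSENTEE_2020 = 1_300_886  # Biden + Trump absentee mail-in ballots tabulated
REJECTION_RATE_2016 = 0.0642         # 2016 mail-in rejection rate
TRUMP_SHARE = 0.34681                # Trump share of two-party absentee ballots
BIDEN_MARGIN = 12_670                # Biden's stated lead in Georgia


def georgia_rejection_scenario():
    """Reproduce the declaration's hypothetical (hypothetical helper, not his code)."""
    # Assumption: the full 2016 rate is applied to the already-tabulated total.
    removed = round(TWO_PARTY_ABSENTEE_2020 * REJECTION_RATE_2016)
    trump_loss = round(removed * TRUMP_SHARE)
    biden_loss = removed - trump_loss
    net_gain = biden_loss - trump_loss
    return removed, trump_loss, biden_loss, net_gain, net_gain - BIDEN_MARGIN


print(georgia_rejection_scenario())
# (83517, 28965, 54552, 25587, 12917)
```

The 83,517 figure only comes out if the full 6.42% rate is applied to the 1,300,886 ballots that were already tabulated. Reproducing the arithmetic says nothing about whether the premise behind it (that 2020 rejections should have matched 2016) is reasonable.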
I will note that I’ve been scooped on Twitter. Cicchetti develops a statistical analysis of whether voter populations were the same between Clinton’s 2016 election results and Biden’s 2020 election results, and then between early counts in swing states and later counts in swing states. For those, his statistics indicate that the null hypothesis (that the populations were the same, or alternatively, that the results in both were drawn from the same distribution) is rejected, with a p-value in each of less than 1e-15.
Finding that the voters who gave Biden the popular-vote win in those states in 2020 differed from the voters who failed to give Clinton a win there in 2016 is a vastly different result from an assertion that Biden’s *chance* of winning equals the p-value of a test of whether Biden’s 2020 win came from the same voter distribution as Clinton’s 2016 loss.
From Cicchetti’s Item 11, we find:
I tested the hypothesis that the performance of the two Democrat candidates were statistically similar by comparing Clinton to Biden. I use a Z-statistic or score, which measures the number of standard deviations the observation is above the mean value of the comparison being made. I compare the total votes of each candidate, in two elections and test the hypothesis that other things being the same they would have an equal number of votes.
So what Cicchetti actually tested and determined, with great statistical confidence, was that in Georgia Biden’s win was *different* from Clinton’s loss.
Think about that a moment.
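To make concrete what that test measures, here is a minimal sketch of the calculation as I read Item 11, using the Georgia vote totals and rounded percentages from Item 10. The function name is hypothetical, and treating each candidate’s variance as votes times (1 - share) is my interpretation of his description, not something taken from his worksheets.

```python
import math


def z_score(votes_a, share_a, votes_b, share_b):
    """Difference in vote totals over the root of the summed binomial-style variances."""
    # Variance per Item 11: the mean (the vote total) times the
    # probability of the candidate not getting a vote.
    var_a = votes_a * (1.0 - share_a)
    var_b = votes_b * (1.0 - share_b)
    return (votes_b - votes_a) / math.sqrt(var_a + var_b)


# Clinton 2016 versus Biden 2020 in Georgia, figures as stated in Item 10.
z = z_score(1_877_963, 0.459, 2_474_507, 0.495)
print(round(z, 1))  # 396.3
```

Under those assumptions the sketch lands on the 396.3 in the declaration. All the number says is that 2,474,507 is a very different count from 1,877,963, which nobody disputes.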
Much the same thing happens with Cicchetti’s analysis of early and late ballot counts in Georgia. I’ll quote Item 16 again here:
16. Put another way, for the outcome to change, the additional ballots counted would need to be much different than the earlier sample tabulated. Location and types of ballots in the subsequent counts had, in effect, to be from entirely different populations, the early and subsequent periods, and not random selections from the same population. These very different tabulations also suggest the strong need to determine why the outcome changed. I am aware of anecdotal statements from election night that some Democrat strongholds were yet to be tabulated. There was also some speculation that the yet-to-be counted ballots were likely absentee mail-in ballots. Either could cause the latter ballots to be non-randomly different than the nearly 95% of ballots counted by 3AM EST, but I am not aware of any actual data supporting that either of these events occurred. However, given the closeness of the vote in Georgia, 12,670 votes, further investigation and audits should be pursued before finalizing the outcome.
There are some things to note here. The statistical test is for assessing a difference in two distributions; his claim that ‘entirely different’ populations would be needed is both unexamined and untested by Cicchetti, in part because he doesn’t bother to define ‘entirely different’. With that sentence, Cicchetti departed from anything with any claim to statistical grounding. Sure, one can readily grant ‘different’ populations for the sake of argument (that was the point of his test), but the ‘entirely’ modifier on top of ‘different’ has no basis in anything. The rest of the paragraph continues with admissions that he is unaware of any actual data on why the distributions might differ, such as, say, the documented statements of one presidential candidate sending out mixed messages about whether those voting for him should vote via absentee ballot or in person on election day, and finishes with editorial commentary that is also not grounded in any statistical analysis. But it again boils down to Cicchetti finding, with great statistical confidence, that the distribution of early vote counts differed from that of late vote counts.
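The Georgia numbers in Item 14 are ordinary weighted-average arithmetic, which a short sketch can check. The function name is hypothetical and the inputs are the rounded figures from the declaration, so the result is only as precise as those.

```python
def final_shares(early_total, early_trump_share, late_total, late_biden_share):
    """Combine an early and a late batch of two-party votes into final shares."""
    early_trump = early_total * early_trump_share
    early_biden = early_total - early_trump
    late_biden = late_total * late_biden_share
    late_trump = late_total - late_biden
    total = early_total + late_total
    return (early_trump + late_trump) / total, (early_biden + late_biden) / total


# Item 14: 4,662,328 early votes at 51.09% Trump; 268,204 late votes at 71.60% Biden.
trump, biden = final_shares(4_662_328, 0.5109, 268_204, 0.7160)
print(f"Trump {trump:.2%}, Biden {biden:.2%}")  # Trump 49.86%, Biden 50.14%
```

A late batch of about 5% of the ballots that leans heavily one way is enough to move a 51/49 split to 49.9/50.1. The arithmetic shows the two batches differ in composition; it has nothing to say about why.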
Then there is the issue of Biden wins across the four swing states in question. Cicchetti takes this up in his Item 20 discussion:
20. I reject the hypothesis that the Biden and Clinton votes are similar with great confidence many times greater than one in a quadrillion in all four states. Similarly, I reject the hypothesis that the Biden and Clinton percentage of the two leading candidates’ votes are similar with confidence exceeding many times one in a quadrillion. In fact, the confidence I reject the similarity in these comparisons with the probability of incorrectly rejecting such hypotheses is equal to about one divided by one with a thousand or more zeros. Further, when all four battleground states have the same Clinton to Biden difference, the probability of such a collective outcome is lower by an exponential factor of four, i.e., the improbability of that collective outcome effectively raises the odds of all four having the same result to the fourth power. The probability of there being no meaningful difference in voter preferences for Clinton and Biden would be approximately one divided by one with about a trillion zeros.
The basic problem with Cicchetti’s claim here is that it is nonsensical. The p-value of a test for a difference in one thing is no part of any claim about the probability that one finds, or does not find, a difference in other, independent tests. One rock-bottom approach to statistics is enumerating possible outcomes. Either Biden’s performance was different from Clinton’s performance in each state, or it wasn’t. Given a pair of possible outcomes per state, and treating those outcomes as equally likely, the smallest probability that one can possibly reach for any combination across the four states is a mere one in sixteen, that is, 1 in 2^4. Either Biden’s win was different from Clinton’s loss in each state, or it wasn’t.
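The enumeration is small enough to write out. This is only an illustration of the counting argument: the labels are mine, and weighting every combination equally is an assumption of the sketch, not a claim about real elections.

```python
from itertools import product

STATES = ("Georgia", "Michigan", "Pennsylvania", "Wisconsin")

# Each state either shows a Clinton/Biden difference or it does not.
outcomes = list(product(("different", "same"), repeat=len(STATES)))
all_different = ("different",) * len(STATES)

print(len(outcomes))                                  # 16
print(outcomes.count(all_different) / len(outcomes))  # 0.0625
```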
Both of Paxton’s statistical claims #1 and #2 are simply mischaracterizations of what Cicchetti claimed. Cicchetti never analyzed the probability of a Biden win in any state; what he actually analyzed was whether ways in which Biden won were from the same distribution as Clinton’s 2016 loss, and whether the counting process timing of processing particular types of votes was the same over the counting period. The response of any rational agent encountering Cicchetti’s analysis on these points should be a big, “And so?” That he found differences has no bearing on the result or the legitimacy of the result. For myself, I rather expect that a winning campaign in an election likely should have differences from some previous losing campaign. And a candidate does not have any say in how a state chooses to order its ballot-counting, nor when they release information on those counts. Well, unless the candidate’s name is ‘Brian Kemp’.
Then there’s the Paxton stats claim #3, that yet another, larger exponent can be put on the probability of a Biden win across four swing states. Does Cicchetti have something for that? Not that I have found. It appears that Paxton used a ‘5’ where a ‘4’ supplied by Cicchetti should have gone. One thing about fortunately directional errors such as this one by Paxton is that as one encounters an ensemble of such, it becomes less credible to posit that they all simply fall into the realm of error, and more credible to posit that the person making what are initially deemed errors is instead deliberately misleading the reader. And the same critique of Cicchetti’s analysis applies here whether Paxton erred or lied that ‘5’ into existence: there’s no basis for the claim that the p-values of determining a difference are multiplicative across independent trials. To approach this another way, suppose that in two trials, one found terrific support for a difference, say between a losing 2016 campaign and a winning 2020 campaign at p<1e-15, but in the second trial, one came up with p<0.9, something that no one would claim to show a significant difference and no reason to reject the null hypothesis. Yet under Cicchetti’s interpretation, the joint probability of occurrences of differences across the two trials would be p=9e-16, an improvement over either of the single trials despite one trial showing no evidence of a difference.
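The two-trial thought experiment above can be checked with a few lines of arithmetic. This is only a sketch of the multiplication that Cicchetti's reading implies, not a valid way to combine test results; the function name `naive_product` and the variable names are mine, and the two p-values are the hypothetical ones from the paragraph above.

```python
import math

def naive_product(p_values):
    """Multiply p-values together, as Cicchetti's reading implies.

    This is NOT a legitimate method for combining independent tests;
    it is here only to show what the multiplication does.
    """
    return math.prod(p_values)

# Hypothetical values from the thought experiment, not real data.
p_strong = 1e-15  # trial 1: overwhelming evidence of a difference
p_weak = 0.9      # trial 2: no evidence of a difference at all

joint = naive_product([p_strong, p_weak])

print(f"naive product: {joint:.1e}")
print("smaller than either input:", joint < min(p_strong, p_weak))
```

Because every p-value is at most 1, the product can never come out larger than the smallest input. Tacking on more tests, even ones that show nothing, can only make the headline number look more extreme.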
The first three statistical claims made by Paxton are all misinterpretations of what Cicchetti said, compounded by misinterpretations in Cicchetti’s own analysis. What Cicchetti’s analysis shows is that winning campaigns are different from losing campaigns. There is nothing in the stats Cicchetti applies to those claims that touches on the probability of a win, other than that the Biden win is not premised on the same distribution of voting as the Clinton loss, which makes all kinds of sense.
Concerning Paxton’s claim that an adjustment to Georgia’s absentee ballot count is both needed and would give Trump the win in Georgia, Cicchetti makes an assumption that the 2016 absentee ballot rejection rate is appropriate to apply in 2020. But Cicchetti gives no basis to prefer the 2016 rejection rate, other than it happened one time before.
24. There were 1,316,943 absentee mail-in ballots submitted in Georgia in 2020. The Biden and Trump combined absentees mail-in ballots equaled 1,300,886. There were 4,786 absentee ballots rejected in 2020. This is a rejection rate of 0.3634% out of all the absentee mail-in ballots tabulated. This is much smaller than the number of absentee ballots rejected in 2016, when 13,677 absentee mail-in ballots were rejected out of 213,033 submitted. The 2016 rejection rate was 6.42%, which is more than seventeen times greater than 2020. This decrease in rejection rates is very unexpected, since there was more than a six-fold increase in absentee ballot use.
The main thing to note here is that there is no basis for the claim of the final sentence, that a decrease in rejection rate was ‘unexpected’. There is no analysis even attempted to support that claim. That statement is a claim about the distribution of absentee ballot rejection rates, and Cicchetti provides only two numbers, then assumes one is wrong and one is right on no grounds whatsoever. Given that there is no basis given by Cicchetti that the 2016 rejection rate ought to be applied to 2020 ballots, his further calculations on numbers of ballots that might be shifted such that Trump could score a win are irrelevant, and Paxton’s claim based on that is vitiated.
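For anyone who wants to check the arithmetic in Cicchetti's Item 24, here is a short sketch that recomputes the rates from the four counts he quotes. The variable names are mine; the counts are taken as given from the declaration and are not independently verified.

```python
# Counts as quoted in Item 24 of the declaration (not independently verified).
submitted_2020 = 1_316_943
rejected_2020 = 4_786
submitted_2016 = 213_033
rejected_2016 = 13_677

rate_2020 = rejected_2020 / submitted_2020
rate_2016 = rejected_2016 / submitted_2016

rate_ratio = rate_2016 / rate_2020            # how many times higher 2016 was
volume_ratio = submitted_2020 / submitted_2016  # growth in absentee ballot use

print(f"2020 rejection rate: {rate_2020:.4%}")
print(f"2016 rejection rate: {rate_2016:.2%}")
print(f"2016 rate / 2020 rate: {rate_ratio:.1f}")
print(f"2020 volume / 2016 volume: {volume_ratio:.1f}")
```

The numbers reproduce, but that is all they do. Two observed rates say nothing about which one, if either, is the rate to expect, which is the gap in Cicchetti's argument noted above.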
Finally, there is Paxton’s statistics claim about votes in Michigan.
Cicchetti’s discussion of these votes:
27. I analyzed the absentee ballot data for Wayne County, Michigan, at the precinct level. I found that 174,384 absentee ballots out of 566,694 absentee ballots tabulated (about 30.8%) were counted without a registration number for precincts in the City of Detroit starting with Absentee Vote County Board 1 (ACVB 1) through (ACVB 134). In Wayne County, Biden won 68.4% of the ballots tabulated.
28. If this same rate was applied to these votes without a registration number, this would cause Biden to lose about 119,300 votes and Trump’s comparable loss with 30.3% of the tabulated vote would be about 52,800 votes. This would be a net gain of about 66,500 votes for Trump in one county if votes without a voter registration were not counted. If the percent voting for Biden was greater, the net gain for Trump would be higher. This seems likely since the precincts were all from Detroit that included absentee ballots without registration identification in their tabulation.
It is not explicitly stated, but it appears that Cicchetti is assuming that some 174,384 votes would be invalidated, and that invalidation would be imputed in a particular way.
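The imputation in Items 27 and 28 can be reconstructed as a sketch. The assumption that the flagged ballots split in the same proportions as the Wayne County totals is Cicchetti's, not mine, and the variable names are my own labels for the figures he quotes.

```python
# Figures as quoted in Items 27 and 28 of the declaration.
flagged_ballots = 174_384    # absentee ballots counted without a registration number
tabulated_ballots = 566_694  # absentee ballots tabulated
biden_share = 0.684          # Biden's share of ballots tabulated in Wayne County
trump_share = 0.303          # Trump's share, per Item 28

flagged_fraction = flagged_ballots / tabulated_ballots

# Cicchetti's implied step: throw out every flagged ballot and
# split the loss by the county-wide shares.
biden_loss = flagged_ballots * biden_share
trump_loss = flagged_ballots * trump_share
net_gain_trump = biden_loss - trump_loss

print(f"flagged fraction: {flagged_fraction:.1%}")
print(f"Biden loss: {biden_loss:,.0f}")
print(f"Trump loss: {trump_loss:,.0f}")
print(f"net gain for Trump: {net_gain_trump:,.0f}")
```

The unrounded net comes out slightly under the "about 66,500" in the declaration, which looks like the result of subtracting the two already-rounded losses. Everything here hinges on the choice of shares: plug in a different distribution and the net shift changes with it.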
This is where Cicchetti demonstrates some quite remarkable inconsistency. Cicchetti is willing here to accept as accurate a distribution of absentee ballot votes from Wayne County, whose absentee ballots were largely tallied late in the ballot counting process, while his earlier calculations of probabilities asserted that the true state of affairs was no difference in distribution between early- and late-counted ballots in Michigan. So here Cicchetti quite conveniently adopts a stance that holds a distribution to be accurate and the one to be applied when he apparently believes it leads to a desired result, while earlier in making other arguments he held that any assertion that a different distribution applied meant there was reason to distrust the results. If he were consistent, he would have to drop one or two of the claimed results: either the claim that the election results for Michigan were suspicious because a Biden win was different from a Clinton loss, along with the related claim that it was suspicious that early- and late-counted votes had different distributions, or this claim that a Trump win could be reverse-engineered by asserting that a different distribution was actually the true distribution, and thus appropriate to use here rather than the aggregate Michigan vote distribution. If the aggregate everything-was-the-same-over-time-across-the-state distribution that Cicchetti assumed in his earlier analyses were used, of course, there would hardly be any net shift in the result at all.
Paxton also relies on Cicchetti for a claim about precincts not reporting balanced results in their vote totals, but since Cicchetti develops no statistical analysis on that point, it just shares the same general respect as a dead otter and we could simply pass over it. As a matter of interest here, that liberal rag Forbes has an article on some of these issues, including the out-of-balance precincts. (I wonder why I have Philip Glass music as an ear-worm now…) Forbes notes:
Detroit has reported only 357 mismatched votes across all out-of-balance precincts, Mayor Mike Duggan said, which is less than 0.2% of the total ballots cast citywide.
Some 28% of Detroit’s more than 600 precincts and absentee counting boards are out of balance without explanation, the Michigan Secretary of State’s office told Forbes, but the Detroit Free Press reports most of these precincts were off by fewer than four votes.
A general takeaway from the Forbes article is that elections are run by humans, human error is normal, and this particular sort of error is far too tiny to have an impact on the direction the election result takes.
One of the aphorisms I recall keenly is that one should identify causes that are sufficient to produce the effect one wishes to tie them to. Cicchetti apparently thinks it makes a practical difference to the election if 357 votes from out-of-balance precincts are joined up to the 174,384 votes he tagged as suspect under the missing registration number issue.
In summary, Paxton appears to have gotten a chunk of what Cicchetti wrote wrong, changed an exponent without basis, and relied on parts of Cicchetti’s statement that had no statistical basis. There is not any part of what Paxton relied on Cicchetti for that is not invalidated or put into some doubt when one scrutinizes it.
And, once again, that’s in my opinion.