
By Gary Warner

Conspiracy theorists and Trump’s most extreme #MAGA supporters are claiming to have proof that votes were changed from Trump to Biden while other votes were discarded. When President Trump tweeted that 2.7 million Trump votes had been deleted nationwide, he was retweeted more than 180,000 times and his tweet was liked more than 600,000 times.

[Image: President Trump’s tweet]

We can contrast this with the joint statement of the Election Infrastructure Government Coordinating Council Executive Committee, shared by the DHS Cybersecurity & Infrastructure Security Agency, the government agency assigned to look out for election fraud: “There is no evidence that any voting system deleted or lost votes, changed votes, or was in any way compromised.”

So where do the numbers mentioned by President Trump come from and why are they not evidence of widespread fraud? The source of these numbers is a set of tables being shared by Q-Anon types that have figures such as these:

[Image: example table of vote totals]

The numbers seem far too precise to be entirely made up. Analysts in the UAB Computer Forensics Research Lab who specialize in disinformation campaigns helped me dig in to find the source of the numbers. The analyst behind these numbers is a member of the website “thedonald[.]win” who goes by the user name “PedeInspector.”

[Image: PedeInspector’s post]

In his thread, labeled “Happening!”, he explains his methodology for calculating the vote switch and lost vote numbers, which have been widely shared, including by President Trump himself. His “proof” that his methodology is correct is that his data shows publicly acknowledged “glitches” such as the Antrim County, Michigan “glitch,” which was later confirmed to be human error.

To allow others to confirm his work, PedeInspector has posted a copy of all of the vote count data that he used for his analysis, along with his Python program, on the website “workupload[.]com.”

Here’s an example of the data used in the analysis:

  • {"vote_shares":{"trumpd":0.578,"bidenj":0.401},"votes":573857,"timestamp":"2020-11-04T01:51:26Z"}
  • {"vote_shares":{"trumpd":0.568,"bidenj":0.406},"votes":574417,"timestamp":"2020-11-04T01:51:52Z"}

In these records, at a certain timestamp, there were 573,857 votes that had been counted in Michigan, with Trump receiving 57.8% of the votes and Biden receiving 40.1% of the votes, which would work out to 331,689 votes for Trump and 230,116 votes for Biden.

Twenty-six seconds later, there were 574,417 counted votes, which means 560 votes were counted during those 26 seconds. But now, Trump is listed as receiving 56.8% of the vote and Biden as receiving 40.6% of the vote. That would mean Trump now had 326,268 votes (5,421 FEWER votes) and Biden now had 233,213 votes (3,097 MORE votes). If we assume that all 560 new votes went to Biden, that would mean that Biden has 2,537 more votes than he should, or so it would seem.

PedeInspector’s interpretation of these numbers is then that Biden “stole” 2,537 votes from Trump and that 2,884 votes were “lost.”

PedeInspector’s methodology was to extract each timestamped data record for each state from his source, which expressed the current state of the vote in the format listed above, and to calculate the number of votes assigned to Biden and to Trump at every timestamp. Whenever Trump’s “next timestamp” votes were fewer than his “previous timestamp” votes, he calculated how many votes had been “switched” or “lost.”
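The procedure described above can be sketched in a few lines of Python. This is a reconstruction from the description in this article, not PedeInspector’s actual program. The function names `implied_votes` and `tally_drops` are hypothetical, shares are stored as integer tenths of a percent to avoid floating-point noise, and the split into “switched” and “lost” is an assumption that follows the Michigan example.

```python
def implied_votes(total_votes, share_tenths_pct):
    """Votes implied by a rounded share; 578 means 57.8%."""
    return total_votes * share_tenths_pct // 1000


def tally_drops(records):
    """Sum apparent 'switched' and 'lost' Trump votes between consecutive
    records, counting only intervals where his implied total goes down."""
    switched = lost = 0
    for prev, cur in zip(records, records[1:]):
        trump_drop = (implied_votes(prev["votes"], prev["trumpd"])
                      - implied_votes(cur["votes"], cur["trumpd"]))
        if trump_drop <= 0:
            continue  # implied total did not fall, so the interval is ignored
        biden_gain = (implied_votes(cur["votes"], cur["bidenj"])
                      - implied_votes(prev["votes"], prev["bidenj"]))
        new_votes = cur["votes"] - prev["votes"]
        # Assumption: Biden's gain beyond the newly counted votes is read as
        # "switched" (kept within 0..trump_drop); the rest is read as "lost".
        moved = min(max(biden_gain - new_votes, 0), trump_drop)
        switched += moved
        lost += trump_drop - moved
    return switched, lost


# The two Michigan records quoted in this article.
michigan = [
    {"trumpd": 578, "bidenj": 401, "votes": 573857,
     "timestamp": "2020-11-04T01:51:26Z"},
    {"trumpd": 568, "bidenj": 406, "votes": 574417,
     "timestamp": "2020-11-04T01:51:52Z"},
]
print(tally_drops(michigan))  # (2537, 2884)
```

Note that every figure here is derived from a percentage rounded to one decimal place, so the “votes” being compared are estimates rather than counts.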

Our first assessment was to see whether it was possible that PedeInspector had altered the data after acquiring it from the original source. He has not. We downloaded PedeInspector’s files and also acquired the data from his source and compared each timestamped record to see if there were any discrepancies. With the exception that some data has been ADDED (in the form of newer timestamped records) to the original source files since being downloaded, no discrepancies were found. For example, with the Michigan votes files, the PedeInspector copy of the file has 534 timestamped vote_shares records, ending with 2020-11-09T23:04:46Z. The New York Times copy of the file has 538 timestamped vote_shares records, ending with 2020-11-12T22:03:58Z.
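A comparison of this kind can be sketched as follows. This is an illustration only, not the script we used; `compare_copies` is a hypothetical name, and the third record in the sample data is made up to stand in for a record added upstream after the download.

```python
def compare_copies(local, upstream):
    """Compare two lists of timestamped records.

    Returns (changed, missing, added):
      changed - timestamps in both copies whose contents differ
      missing - timestamps in the local copy that are absent upstream
      added   - timestamps that appear only in the upstream copy
    """
    local_by_ts = {r["timestamp"]: r for r in local}
    upstream_by_ts = {r["timestamp"]: r for r in upstream}
    shared = local_by_ts.keys() & upstream_by_ts.keys()
    changed = sorted(ts for ts in shared
                     if local_by_ts[ts] != upstream_by_ts[ts])
    missing = sorted(local_by_ts.keys() - upstream_by_ts.keys())
    added = sorted(upstream_by_ts.keys() - local_by_ts.keys())
    return changed, missing, added


local_copy = [
    {"timestamp": "2020-11-04T01:51:26Z", "votes": 573857},
    {"timestamp": "2020-11-04T01:51:52Z", "votes": 574417},
]
# Made-up newer record, for demonstration only.
upstream_copy = local_copy + [
    {"timestamp": "2020-11-04T01:52:30Z", "votes": 575000},
]
changed, missing, added = compare_copies(local_copy, upstream_copy)
print(changed, missing, added)
```

An unaltered copy shows up as empty `changed` and `missing` lists, with any newer records listed under `added`.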

What? The New York Times?

Yes, the data used by PedeInspector was downloaded from the Times. The URLs provided by PedeInspector as his sources look like this example, as well as this example.

Each state has a similar file there; just replace “Michigan” or “Pennsylvania” with the name of the state whose data you wish to review. We assumed this data would quickly be made unavailable, but surprisingly, it is still live days after being disclosed as the source by PedeInspector.

However, despite being the source of the data file, The New York Times is not the source of the data. The full format of each record includes an additional field, “eevp_source”: “edison”. On their “Election Polling Page,” Edison boasts that they provide vote tabulation for ABC News, CBS News, CNN and NBC News. Edison’s executive vice president, Rob Farbman, has already spoken to the accuracy of their numbers when debunking the “HAMMER” conspiracy theory, pointing out that a change in the Armstrong County, Pennsylvania votes was “simple human error.” But conspiracy theorists want to believe and have tried to use their own data analysis of Edison numbers to show that there were not a handful of changes, but instead thousands of changes.

Rounding Errors and Math is Hard

Several other members of the “thedonald[.]win” website have pointed out issues with PedeInspector’s analysis. The greatest problem is that the data files being used for PedeInspector’s analysis were intended to be ingested to shade a state on a map and not to do vote-by-vote analysis. To help me make sense of the numbers, and the competing versions of the analysis programs, I was pleasantly surprised to have five student programmers in my lab who jumped on an early Saturday morning Discord call to help me out.

As one of my programmers explained the issue:

“You know how dividing 4 by 3, you get 1.33333333… and it goes on forever? Is 1.333 the same number as 1.33333 repeating? They’re similar, but in cutting the number off, you lose precision. 1.333 is not 1.333333 repeating. For most purposes, this is fine. When you need a number to be exactly right, such as when handling money or a vote count, it’s dangerous to lose precision. These small losses in precision can add up over time, making significant differences in the final result.”
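The point can be made concrete with a small sketch using Python’s standard `fractions` module. The multiplier of 3,000,000 is an arbitrary number chosen for illustration and does not come from the election data.

```python
from fractions import Fraction

exact = Fraction(4, 3)            # 1.3333... kept exactly
truncated = Fraction(1333, 1000)  # the same value cut off at 1.333

scale = 3_000_000
error = exact * scale - truncated * scale
print(error)  # 1000
```

Cutting the number off after three decimal places looks harmless, but once it is multiplied by a number in the millions the result is off by a thousand.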

Here’s an example from the Pennsylvania data. At timestamp 2020-11-08T18:36:39Z, there had been 6,756,903 votes counted. In the next reporting period, 2020-11-08T19:58:46Z, there had been 6,758,279 votes counted. President Trump is said to have 49.1% of the first number and 49.0% of the second number.

The Edison data used by both PedeInspector and VisualScience, the author of a competing analysis program, is precise only to one-tenth of one percent. With about 6.76 million votes counted, one-tenth of one percent is roughly 6,760 votes. The data was never intended to be used in this way! In the table below, the change in Trump’s count is easily accounted for by the reported percentage moving from 49.1% to 49.0%, which is likely the result of rounding.

[Image: table of the Pennsylvania vote counts discussed above]
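To see how much room the rounding leaves, the following sketch computes the range of whole-number vote counts that would be reported as a given percentage. It assumes the shares were rounded to the nearest 0.1% with ties rounding up, which the data source does not document, and `count_range` is a hypothetical helper name.

```python
def count_range(total_votes, share_tenths_pct):
    """Smallest and largest integer counts that round to the given share.

    share_tenths_pct is the share in tenths of a percent (491 means 49.1%).
    Assumes rounding to the nearest 0.1%, ties up, so the true share s
    satisfies share - 0.05% <= s < share + 0.05%.
    """
    low_num = total_votes * (2 * share_tenths_pct - 1)
    high_num = total_votes * (2 * share_tenths_pct + 1)
    lowest = -(-low_num // 2000)       # ceiling division
    highest = (high_num - 1) // 2000   # largest integer strictly below bound
    return lowest, highest


# The two Pennsylvania reports discussed above.
first = count_range(6_756_903, 491)   # 49.1% of 6,756,903
second = count_range(6_758_279, 490)  # 49.0% of 6,758,279

# If the ranges overlap, the same true count fits both reports,
# so the data does not show that any votes went missing.
overlap = max(first[0], second[0]) <= min(first[1], second[1])
print(first, second, overlap)
```

Under that rounding assumption the two ranges overlap, so a Trump total that never decreased is fully consistent with both reports, even though multiplying the rounded percentages by the totals suggests a drop of about 6,083 votes.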

Both the original PedeInspector Python program and the new VisualScience Python program show their bias in their code. PedeInspector calls his program “fraudcatch” and ONLY searches for examples where the candidates lose votes. VisualScience calls his main routine “newFindFraud” and again only searches for places where votes are lost. It is in the nature of rounding numbers that sometimes the number will round down and sometimes the number will round up.
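The effect of searching only for losses can be shown with a small simulation. The data is synthetic and deliberately simple (it is not election data), and the helper names are hypothetical: a candidate’s true count rises at every update, yet a losses-only search over the rounded percentages still “finds” missing votes.

```python
def rounded_share(candidate_votes, total_votes):
    """Share in tenths of a percent, rounded to the nearest (ties up)."""
    return (2000 * candidate_votes + total_votes) // (2 * total_votes)


def implied(total_votes, share_tenths_pct):
    """Votes implied by a rounded share; 550 means 55.0%."""
    return total_votes * share_tenths_pct // 1000


# Synthetic count: each update adds 500 votes, 200 of them for the candidate,
# so the candidate's true count never decreases.
true_counts, totals = [], []
candidate, total = 550_000, 1_000_000
for _ in range(200):
    true_counts.append(candidate)
    totals.append(total)
    candidate += 200
    total += 500

estimates = [implied(t, rounded_share(c, t))
             for c, t in zip(true_counts, totals)]
deltas = [b - a for a, b in zip(estimates, estimates[1:])]

drops_only = sum(-d for d in deltas if d < 0)  # losses-only search
net_change = sum(deltas)                       # gains and losses together
true_change = true_counts[-1] - true_counts[0]
print(drops_only, net_change, true_change)
```

Counting gains and losses together lands close to the true change, while summing only the drops produces a large phantom “loss” out of nothing but rounding.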

PedeInspector selected only the lines of data where President Trump “rounded down” and then added all of these numbers together, claiming that 900,000 votes had “disappeared.” His dramatic findings are being thrown around all over the Internet, but his own retractions are hardly mentioned. PedeInspector has repeatedly tried to point out that he does not know the accuracy of the data, did not have permission to use the data, and that he has made corrections to his analysis that are being ignored.

[Image: PedeInspector’s retraction]

[Image: PedeInspector’s second retraction]

Another amateur researcher at TheDonald pointed out that the raw data, in integer form, for every precinct is also available on The New York Times website. It is much more difficult to find, because the file names include timestamps to the thousandth of a second, such as this file, which is an example of a Pennsylvania file giving integer votes cast precinct by precinct and county by county. For each “state-level” summary data entry, there is solid precinct-level data that could be used to prove that in the vast majority of instances, rounding, rather than neglect, error or deceit, was the cause of the apparent discrepancies. To locate this file, the analyst wrote a program that tested for the presence of file names by checking to see if there was a file whose name ended in “.000Z.json” then “.001Z.json” then “.002Z.json,” basically “rolling the numbers” until he found the actual raw data for a single instance of the Pennsylvania vote reporting.
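The “rolling the numbers” search can be sketched as below. This is not the analyst’s program: the file-name prefix is hypothetical, and the existence check is a stand-in callable, so no network requests are made.

```python
def candidate_names(prefix):
    """Yield prefix + '.000Z.json' through prefix + '.999Z.json'."""
    for millis in range(1000):
        yield f"{prefix}.{millis:03d}Z.json"


def find_existing(prefix, exists):
    """Return the first candidate name for which exists(name) is true.

    `exists` is any callable that reports whether a file is present.
    Returns None if no candidate matches.
    """
    for name in candidate_names(prefix):
        if exists(name):
            return name
    return None


# Hypothetical prefix and a stand-in check, for demonstration only.
published = {"example-2020-11-08T18:36:39.417Z.json"}
found = find_existing("example-2020-11-08T18:36:39", published.__contains__)
print(found)
```

With only a thousand possibilities per second of timestamp, trying every suffix in order is enough to locate a file whose exact name is unknown.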

To finalize this election and silence disinformation based on this data, it is urgent that Edison Research and/or The New York Times help explain the seeming discrepancies found by “PedeInspector” so that we can end the speculation and welcome and support the Biden Transition Team. This is the last “hanging chad” of the 2020 election, and it desperately needs to be addressed. We hope that in our comments here we have provided enough guidance for that process to be finalized and for the public to realize that the most dramatic claim of voter fraud is not supported by the numbers and is even doubted by its originator.

Gary Warner is the Director of Research in Computer Forensics at the University of Alabama at Birmingham (UAB).