
A Pattern-Recognition Critique of Dr. Shiva Ayyadurai

Election fraud in Michigan? Nope, just a huckster

Naim Kabir

--

Dr. Shiva Ayyadurai is doubling down on a video that I poked holes in a few days ago. In it, he claimed that Joe Biden stole more than 60,000 votes in Michigan. His analysis involved poor mathematics, and folks like Matt Parker of StandupMaths drew similar conclusions.

Charles Stewart III, a Stanford Ph.D. and the director of MIT’s Election Data and Science Lab, happens to agree with us.

In this new video, Ayyadurai dismisses math-based criticisms by saying that detecting election fraud is “not a math problem, but a pattern-recognition problem.”

Luckily, pattern recognition is my main discipline and the basis of my professional career: it started at the University of Pennsylvania, where I learned how to detect seizures from ECoG data in BE521; it developed during SBIR and STTR grant research for the Navy at Commonwealth Computer Research, Inc.; and it came into full form at a fintech start-up focused on enhancing employee financial health.

So. We can talk about what he gets wrong here.

Here’s the video so you can look at the raw material.

First, though, I want to just note some inconsistencies, because it doesn’t take an expert in pattern recognition to recognize when someone declares one thing and then says something completely contradictory. Let’s go for it.

Inconsistencies

Composite Curves: Are They Normal or Bad?

Recall that in the last video, the kinked curve in Ayyadurai’s graphs was a reason for alarm. He pointed at the bend in the curve and said: aha, this must be where the voting machines’ anti-Trump algorithm kicked in! A clear alarm signal!

31:52 — “For some reason, right at 20%, it looks like his vote count starts getting linearly reduced.”

He goes as far as to say that some kind of transfer function or ramp function is at work (39:05), an algorithm buried deep in Diebold-esque vote-tabulation machines.

Yet in the new video, he says that combined flat and sloped curves are the surefire feature of a “normal” election.

Which one is it? You can’t have it both ways.

56:16

Wayne County: was there cheating or not?

This is my favorite graph in all of Ayyadurai’s presentations. It’s the slide of Wayne County, which he flashes again in his new video:

59:00

He shows a flattish trend-line, and the slide still says there was “no cheating.” But this certainly isn’t a “normal” composite curve, so by his rule above there must have been cheating, no?

That’s not even the best part, though. In this county where there was “no algorithm detected”, I’ve circled in yellow some points that cannot possibly exist. Remember that Dr. Shiva is plotting (% split-ticket Trump votes - % straight-ticket Republican votes) on the Y-axis, and % straight-ticket Republican votes on the X-axis.

If you look, there are points at approximately (X: 4%, Y: -20%) and (X: 2%, Y: -10%). Those data points would imply that Trump got negative split-ticket votes at those precincts: e.g., if -20 = SplitTrump% - 4, then SplitTrump% must be -16.

That would be a surefire piece of evidence that there was cheating, since you can’t have negative votes — but of all the counties he plots for us, this is the only one where there was “no cheating algorithm detected”.
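To make the arithmetic concrete, here’s a minimal sketch in Python of the impossibility check, assuming the axis definitions above (the point coordinates are rough eyeball estimates from his slide):

```python
# Per Dr. Shiva's axes:
#   x = % straight-ticket Republican votes
#   y = % split-ticket Trump votes - % straight-ticket Republican votes
# So the split-ticket Trump percentage is recoverable as y + x,
# and it can never legitimately be negative.

points = [(4, -20), (2, -10)]  # rough coordinates of the circled points

for x, y in points:
    split_trump = y + x
    if split_trump < 0:
        print(f"(X: {x}%, Y: {y}%) implies {split_trump}% split-ticket Trump votes: impossible")
```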

At this point I’m like 30% sure he’s just trolling, and that this is him at peak comedy. But really it just means you can’t trust his plots, because he’s clearly showing data that can’t exist.

Are flat lines good? Or are they signs of fraud?

In the last video, Dr. Shiva Ayyadurai set the expectation that a flat line in his graphs should be a signal of normalcy. He plotted it in the case where sentiment towards Trump was about even with the Republican Party:

22:07

In response to a comment from Mitt Romney saying we should expect poor sentiment towards Trump from lifelong Republicans, Ayyadurai then posits this “normal” graph:

44:45 — “For example, Mitt Romney would — and he has said this — ‘the pattern makes sense,’ he would say, ‘since Republicans stay Republican but no longer like Trump, this could be true, right?’” Ayyadurai goes on to say: “Well if that were true, what would you see? You would see this!”

This graph should be alarming, because it also has an impossible point on it. On the far left you have one at approximately (X: 5%, Y: -8%), which would imply it’s “normal” to observe a -3% split-ticket voting percentage for Trump (the same split = Y + X check from above flags it).

Anywho, besides being inconsistent with itself, it’s inconsistent with what he says in his new video — where he expects a composite curve in the usual case.

56:16

Feature Engineering

Let’s talk a little bit about feature engineering. In this video, Ayyadurai touches a lot on how important it is to extract information-dense bits of signal from your inputs, and this is correct. The majority of a professional machine learning engineer’s job involves milling raw data into derived features that correlate well with the final inferred output; in Ayyadurai’s case, that’s “Election Fraud” or “No Election Fraud”.

Ayyadurai’s defense against Matt Parker’s criticism, that subtracting two percentages doesn’t have any meaning, is this: the quantity doesn’t have to mean anything mathematically, it just needs to be a good correlate for distinguishing “Election Fraud” from “No Election Fraud”. He says this is more of an art than it is mathematics.
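But “a good correlate” is itself a mathematical claim, and a testable one. Here’s a minimal sketch of the kind of check that claim would require; the data and labels are entirely hypothetical, since in reality you’d need verified fraud/no-fraud labels to validate any feature against:

```python
import numpy as np

# Hypothetical example: one candidate feature value per county, plus a
# ground-truth label (1 = fraud, 0 = no fraud) from, say, hand recounts.
feature = np.array([0.9, 1.1, 0.8, 2.3, 2.5, 2.1])
labels = np.array([0, 0, 0, 1, 1, 1])

# Point-biserial correlation: does the feature actually separate the classes?
r = np.corrcoef(feature, labels)[0, 1]
print(f"feature/label correlation: {r:.2f}")
```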

It’s true that feature engineering takes creativity, intuition, and artfulness, but it always comes back down to mathematics — and we’ll see how in a second.

But first let’s get a sense of what Ayyadurai considers important features:

He mentions just two features here (his Y-axis quantity, and the straight-ticket Republican percentage that goes into it), but that’s enough to critique. Some criticisms:

Features shouldn’t be correlated with each other

Generally, the features you use for a machine learning model shouldn’t be correlated with each other. How much it matters depends on the downstream statistical model (some, like decision trees and their ensemble variants, won’t care much), but at worst correlated features will screw up your models, and at best they just won’t be helpful.

It’s particularly harmful for interpreting multivariate linear regression models, where you assume there is no multicollinearity.

In this case, one of Ayyadurai’s features (feature 2) is used as a term inside another (feature 1), meaning the two are destined to be correlated. This is that big ol’ mean ol’ math criticism Shiva doesn’t like, coming back to haunt him.
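You can watch that correlation fall out by construction with a few lines of code. A minimal sketch with synthetic precinct numbers (the exact values are made up; only the structure matters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic precincts: feature 2 is the straight-ticket Republican percentage.
straight_rep = rng.uniform(5, 60, size=500)

# Suppose split-ticket Trump support tracks overall partisanship, imperfectly:
split_trump = 0.5 * straight_rep + rng.normal(0, 3, size=500)

# Feature 1 contains feature 2 as a term...
feature_1 = split_trump - straight_rep

# ...so the two are correlated by construction.
print(np.corrcoef(feature_1, straight_rep)[0, 1])  # strongly negative (about -0.9)
```

That built-in negative correlation, incidentally, is enough on its own to produce a downward-sloping plot of the kind Ayyadurai treats as suspicious.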

Shiva mentions clustering data in his presentation as well. Correlated features can cause problems for clustering, since they warp and skew your feature-space so that things aren’t as nicely separable. It’d be like plotting points in a Cartesian space where the axes aren’t at right angles to each other, and then trying to distinguish clusters of points.

Speaking of clustering, though —

What are you even going to cluster?

In the stated research aims of Ayyadurai’s project, clustering is Aim 2.

Clustering is where you take your set of data points, plot them in feature-space (the space where each feature is one of your axes), and split them into distinct groups, where each group gets its own label. In Ayyadurai’s problem, each label should presumably be “Election Fraud” or “No Election Fraud”.

Below is an example of some points in feature-space being clustered into three classes by the K-means algorithm.

Clustering via K-Means. Source: Chire, Wikipedia
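If you want to see what well-behaved clustering looks like in practice, here’s a minimal K-means sketch using scikit-learn on synthetic data, in the spirit of the figure above (three blobs, three clusters; none of this is Ayyadurai’s data):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D feature-space with three well-separated groups.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# K-means partitions the points into k clusters around learned centers.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(kmeans.cluster_centers_)  # the three learned cluster centers
print(kmeans.labels_[:10])      # cluster assignments for the first ten points
```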

But here’s what Ayyadurai’s plots look like in his feature-space:

What clusters would you even make here, and what would they mean?

A deeper issue: the points in his feature-space are precincts. What he’s proposing is that his features are able to distinguish between fraud and no-fraud at the precinct level, but in every single one of his talks he mentions fraud being a property of a county, not a precinct.

Which leads me to—

Ayyadurai doesn’t know what a feature is

For a lad who spent the first 20 minutes of the video fawning over his own accolades in pattern recognition, he sure does seem to be confused.

A feature is a property of an object that lets you discriminate that object into one of several classes (or regress it onto a continuous value). In Ayyadurai’s case, that object is a county: he’s always talked about counties as having fraudulent elections or not, and has never claimed he’s detecting fraud at individual precincts.

This leads me to believe that what he calls a “signal” is actually his feature, and what he calls a “feature” is a sub-component used to derive that feature.

So really, his feature is the polynomial degree of the curve he’s plotting, on a county-by-county basis.

A degree of 1 (a line) is bad, and a degree greater than 1 is good, according to this slide. I don’t know if he just mixed up terms because he was running on low sleep, but I suspect he was desperately trying to save face by playing off his choice of (% split-ticket Trump votes - % straight-ticket Republican votes) as an “artful feature choice” rather than a mistake (or deceit) he and his team made.
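If the polynomial degree of each county’s curve really is his feature, extracting it would be straightforward. A sketch, with made-up data and a hypothetical stopping rule (keep raising the degree until the fit stops improving meaningfully):

```python
import numpy as np

def curve_degree(x, y, max_degree=3, tol=0.05):
    """Return the lowest polynomial degree that fits the points well."""
    prev_err = None
    for degree in range(1, max_degree + 1):
        coeffs = np.polyfit(x, y, degree)
        rmse = np.sqrt(np.mean((y - np.polyval(coeffs, x)) ** 2))
        # Stop once an extra term no longer meaningfully reduces the error.
        if prev_err is not None and (prev_err - rmse) < tol * prev_err:
            return degree - 1
        prev_err = rmse
    return max_degree

# Made-up precinct points for one hypothetical county: roughly a line.
rng = np.random.default_rng(1)
x = np.linspace(0, 60, 40)
y = -0.4 * x + rng.normal(0, 1, size=40)
print(curve_degree(x, y))  # prints 1: a line, i.e. "bad" by his rule
```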

So there you have it: a small taste of a pattern-recognition person’s critique of Dr. Shiva Ayyadurai, MIT Ph.D., B.S.E.E., M.S.V.S., M.S.E.E.

A recommendation: you probably shouldn’t trust this guy, since he seems to be using techno-babble to push an agenda. It is your American right to put political pressure on your Secretary of State to ensure that we have free and fair elections via audits, recounts, hand-counts, etc., but lending your support to scam artists like this, who just stir up conflict via disinformation, is not the way to do it.

--

Naim Kabir

Engineer. Focused on experimentation, causal inference, and good software design. naimkabir.com