
The headline of an article the other day said it all – we now have information on 800,000 people in our terrorist databases. That means we now have information of one sort or another on more people than live in San Francisco. We have “big data,” as the people would say who pretend to know something about it. Big Data, they often claim, will solve the problem. To my mind, we have a big search, analysis and distribution problem, and despite “big data” claims of prowess, connecting the dots before a terrorist strikes is never going to be an easy thing.

I have a lot of biases about Big Data collection, storage, sorting, analysis and distribution. A lot has gone into the front end – the collection and storage part. There is good money there. Not as much good is being done on the sorting, analysis and distribution side. The software for that is infinitely more complex – and is not yet able to read minds.

On the former, we are collecting and storing data at unbelievable rates. The data is now filling Indiana Jones-like warehouses. Lines of racks and servers stretch out in acres and acres of facilities. Huge air conditioner units rumble and drain power to keep the servers from melting like the Wicked Witch of the West. And organizations from the National Security Agency to Amazon are spending billions to build more.

There Is Gold In Them Thar Data

So what are you collecting all this stuff for unless you can use it? And it is being used. Corporations are trying to track our individual spending to better focus their sales efforts. Other corporations use it to track us from employment, credit records or dating services. Smart candidates use various databases to ferret out and turn out voters who will support them.

However, the business of terror is a different matter. Businesses and candidates can afford to be a little off in their demographics. If you are tracking someone who is trying to kill – the tipping point of motivation to decision – then it needs to be more precise. And Silicon Valley claims aside, it is not there yet – nor will it ever be.

What we learned from the Boston bombings has been true of Big Data and terrorists all along. We do a wonderful job of collecting information, and we do a good job of tying it together at the National Counterterrorism Center – set up in the wake of 9/11 to collect all the dots. Yet, the question remains: what dots are there, what dots matter, and to whom do you give the processed dots when you are through? Oh, by the way, can those dots give you real intent?

Garbage In, Garbage Out

There is an old expression that the best analysis cannot overcome bad data – garbage in, garbage out. We do not have bad data necessarily; we just have a tremendous amount of it, and it is incomplete. Figuring out what cupcakes someone buys is far different than determining whether hanging out on a certain website will lead to radicalization; or whether saying certain provocative things on Facebook means they are going to fill a pressure cooker with nails and blow it up; or even if traveling to a given area of the world means something beyond a trip. This behavior indicates but it does not determine.

Few places beyond fiction are ever going to yield the statement: I am radicalized and have a nail bomb I will blow up at 3 PM in downtown Boston on Boylston Street. That is the dream of intelligence officers and law enforcement, but it is not reality, no matter how much data you have. Oh, by the way, are the guys you are looking at part of the database?

Then we come to the distribution problem. If we have put together the data that indicates someone might be radicalized enough to kill, how do we get it to the right places? Does it have classified information we cannot fully share? If so, do we need to “dumb it down?” A lot of effort goes into that determination, and it can’t be done automatically. Us poor, flawed humans using judgment are always the final determinant.

America also has a lot of distribution points. The Brits have about 50 constabularies covering their law enforcement needs. We have 17,600 state, local, and tribal law enforcement authorities. They are dutifully connected by a system called the Homeland Security Information Network (HSIN). A lot of data travels over that system. With some irony, there are nearly 800,000 law enforcement officers in the United States. It is a rough business this data sharing, and the best analysis can get lost. Or worse, the system gets viewed as having too much data not tactically oriented enough to do a first responder on the beat any good.

Risk Management Is the Name of the Game

In the post-Boston world, we are currently embarking on the usual set of who knew what and when and why they failed. You will hear the question, “why were the dots not connected?” You will hear blame for “stove-piping information” and difficulty with sharing systems that overwhelm or do not provide enough or the right kind of information.

I think it is time we grow up. We can perfect the information gathering, storing, searching, analysis and distribution systems to the Nth degree. That will get you 95 percent of the way. You will likely do a great job of stopping the vast majority of planners and hopefuls – which we have done with great success over the past 12 years.

But all the data gathering systems in the world are not going to see into the mind of an individual to the moment of decision to kill. That is a reality of our modern age. We can minimize risk, but we will never totally eliminate it.