True Crime

I was talking with a friend the other day about some of the blog entries, and he had an interesting remark:  “People love to read the police blotter.  You could have a section devoted entirely to crimes against data.”

He’s right, but having committed a few “crimes” myself, I want to be a little careful.  As Linus van Pelt said to Lucy, when she presented him with a ten-foot scroll listing suggestions for personal improvement: “These aren’t faults! These are character traits!”

We can get carried away.  Most analysts do a very decent job and try to ascertain reality as best they can. There’s no “crime” there, regardless of approach.

But I’m willing to apply a #truecrime tag to deliberate distortion or data fraud.

The most egregious distortion?   It’s hard to go wrong with data cherry-picking. For example, cite only the cases in which a trained, armed person has stopped a terrorist attack, then conclude that the more weapons we have in our hands, the better. Or get a “better” model by tossing out selected data, for no other reason than that the data we removed were a pain in the ass.  It’s the same principle, if not the same stage.

Real data fraud – cooking the books, false transactions, “augmenting” time cards?  Not much to discuss, really – if we can use analysis to detect this, I’m all in favor.

Some other things might be sub-optimal practice, but to me the criterion for “true crime” is deliberate and willful misrepresentation. On that point I agree with my friend: we should take time to call out the perpetrators.

The political candidates are an obvious choice.  But then, what else would there be time to write about?   I think I’ll read, just like everyone else.

