An interesting post by Zachary Lipton makes an excellent point: that much of the public information on data technology (he speaks particularly to Artificial Intelligence technologies such as deep learning) borders on uninformed cultism.
Agreed. However, the technical, consulting, and academic communities can shoulder a good deal of the blame for the problem of analytics cults. What these cults have in common is an adulation of complexity. Tools of complexity and opacity are often seen as cool and intrinsically useful. I’ll grant the coolness, but I’ll also debate the intrinsic utility.
Complex methods like simulation and artificial intelligence have their place, and in very skilled hands can bring value. But even in expert hands, there is an obvious issue with complex techniques: they are difficult even for practitioners to understand, and their interaction with complex data can be even more challenging to understand. As a result, our audiences may distrust the resulting outcomes, particularly when those outcomes are unexpected or unpopular. And lack of trust is the immediate precursor to a lack of adoption.
But it’s adoption of results that matters! Results without adoption are not worth having. When the only adopted analytics outcomes are what our audience has previously anticipated, it’s arguable that the analytics program itself is not worth having – it’s merely an expensive way to confirm existing biases.
Analytics practitioners may see any lack of trust and adoption in complex techniques as irrational – but I disagree. Audiences intuitively realize that opaque methodologies can be subtly – albeit unintentionally – manipulated to reflect various biases. When our method lacks transparency, an unpopular outcome may be seen as resulting from the bad attitude of a technical expert. The other side of this coin is confirmation bias – because we want to please our sponsors, we may unintentionally steer complex methods toward what sponsors expect. Not only is that easy to do; it can actually be difficult to avoid. For example, once we encounter a result that “makes sense,” it is natural to stop truly challenging and testing a complex approach. Investigative inertia can wind up favoring pre-existing notions, without anyone explicitly doing anything wrong.
I learned about adoption of results the hard way. After giving a valid but opaque argument that challenged an accepted scientific notion, my reward was to (literally) be yelled at by a roomful of angry scientists. That was certainly a sub-optimal moment, but I also realized they were right. A black-box algorithm effectively separates a new fact from its data, so that the algorithm plus its result no longer offer actual proof. From that moment, I’ve advocated for transparency as a feature of good analytics work. When we have to convince anyone of our result – and if we don’t, why don’t we? – transparency enables explanation. Transparency is certainly not beyond the reach of complex simulation and artificial intelligence techniques, but it’s often a separate analysis activity.
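That separate transparency activity can be as modest as probing an opaque model and fitting an interpretable approximation to its behavior. Here is a minimal sketch of the idea in Python – the `black_box` function is a hypothetical stand-in for whatever opaque model (simulation, AI system) we cannot inspect directly, and the surrogate is a plain least-squares line:

```python
# Sketch: explaining an opaque model with a transparent surrogate.
# "black_box" is a hypothetical stand-in for an opaque model; in real
# work it would be a trained AI model or a complex simulation.

def black_box(x):
    """Pretend we cannot see inside this function."""
    return 3.0 * x + 2.0

# Probe the black box over a range of inputs.
xs = [i / 10.0 for i in range(-50, 51)]
ys = [black_box(x) for x in xs]

# Fit a transparent surrogate: an ordinary least-squares line
# summarizing what the black box does over the probed range.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(f"surrogate: y ~ {slope:.2f} * x + {intercept:.2f}")
```

The surrogate is not the model; it is an explanation of the model over the region we probed, which is exactly the kind of artifact a skeptical audience can check for themselves.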
Over-eager advocates of complex analytics and data methods – ranging from big data to Metropolis Monte Carlo to elaborate artificial intelligence protocols – often haven’t had to explain a truly new result to a genuinely skeptical audience. But that is also the ideal in analytics value – to discern genuinely new facts from data, and convince our customers that this New Thing is both valid, and worth their attention and action.
The costs of complexity are lack of understanding, leading to lack of trust, leading to a bias towards what is already known. These things can be overcome, but only at additional effort and cost. It’s one thing to use a complex method to return what people expect. It’s quite another thing – often more challenging and rewarding – to craft a simple and transparent method that convinces people of something unexpected. Even when the result makes people initially unhappy, ultimately everyone will know it was a good moment. It’s why we’re taking all that trouble in the first place.