I have friends who tell me that one of my favorite words is suboptimal. They could be right – I do like how suboptimal sounds, even if I don’t like what suboptimal means. Speaking of which, it’s suboptimal that optimization problems, which are highly important in analytics, are difficult to solve well, and perhaps even more difficult to specify completely.
We don’t seem to hear as much about optimization as we do about exploratory and predictive analytics, but optimization is often where the analytical rubber meets the decision-making road. It’s the real endgame for many analytics processes.
Say you’ve concocted and validated a lovely and accurate model of employee value (not cost!) for your organization, as a function of experience, training, time on work, and other factors.
It’s entirely cool, a paragon of analytical prowess. So you take your new, cool thing to management, where they will presumably bestow accolades, plaudits, and maybe even enhanced compensation for your admirable work. And after your presentation, what do they say? Sorry, this model may explain value, but we don’t see much value in what you currently have.
Miss Manners politeness-points aside, management has a point. They can’t use the employee-value model for much, other than to look at a single employee and, with some degree of certainty, see how much that employee is worth. Their decision-making question is different: given our budget, current employees, potential hires, available training options, health insurance plans, and compensation packages, how do we maximize the current human value of our workforce? That’s an optimization question, and the model is only an input to it.
That’s only the beginning of the complications. Like many other questions, the optimization question might not be obvious: should we be asking how to maximize the value of our workforce over the next year? Or should we be asking how to optimize a value metric that includes personal employee interests as well as value to our organization, for altruistic or practical reasons? Not only is the model not a full answer to management’s question, the decision question itself may not be decided.
It gets better. After thinking about all of the constraints such as budget that we can catalog, we realize that contingencies could easily introduce additional constraints – the budget might not really be fixed, labor markets could tighten, key personnel could leave or retire. Perhaps we just didn’t think of all the constraints.
Of course, we’ll still need to arrive at an optimal solution, a process that can be anything from routine to outlandishly challenging.
I am not dredging this up to suggest optimization problems are beyond resolution. They can be resolved. I’m bringing this up to support the idea that we too readily, and too often, brush aside the complexities of stating and solving optimization problems. The most common result if we’re not careful? We optimize something, but it’s the wrong thing.
Optimization matters, because when we’re deciding something, we are really optimizing something. Predictive models will relate an interesting outcome to a set of inputs, but when we’re deciding what to do, we want a best (or at least better) outcome, and need to understand what inputs will achieve that end. Which health care plan? How much vacation per employee? Should we hire experience or train our current workforce?
Optimization matters, but good optimization is not a cakewalk.
Objectives are often ambiguous: it can be more difficult to know what to optimize than we would like to admit. In my example, “value” may be defined by a function, but even with the precision of a function, our definition can be arbitrary. In health care policy, do we optimize for the cost of health care, or for patient outcome? The question of what to optimize is frequently unresolved. I touched on this in “Questions of Diversity” – if we don’t know the question to be answered, we should follow each open question to its conclusion, assess whether the answers to the different optimization questions are consistent or different, and start from there in a decision-making process.
That’s going to be a lot of work! Indeed it is. The alternative is to assume one question is the right one without any basis, and the consequence is to arrive at conclusions that derive essentially from assumptions. When we follow multiple questions to their conclusion we may arrive at a spread of differing conclusions – meaning that we don’t have enough problem definition to arrive at one correct decision. That may be irksome, but it’s also real.
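To make “follow each open question to its conclusion” a little more concrete, here is a minimal sketch in Python. The plan names, the numbers, and the two objectives are all made up for illustration; the only point is the final check of whether the different questions lead to the same answer.

```python
# Hypothetical candidate plans with invented scores, for illustration only.
plans = {
    "plan_a": {"cost": 100.0, "outcome": 0.70},
    "plan_b": {"cost": 140.0, "outcome": 0.85},
    "plan_c": {"cost": 120.0, "outcome": 0.80},
}

# Each open question becomes a score to minimize.
# Maximizing outcome is written as minimizing its negative.
objectives = {
    "minimize cost": lambda plan: plan["cost"],
    "maximize outcome": lambda plan: -plan["outcome"],
}


def best_plan_per_objective(plans, objectives):
    """Follow each objective to its conclusion: the plan it would pick."""
    return {
        name: min(plans, key=lambda key: score(plans[key]))
        for name, score in objectives.items()
    }


winners = best_plan_per_objective(plans, objectives)

# If every objective picks the same plan, the ambiguity is harmless.
# If not, the problem definition is not yet enough to settle the decision.
consistent = len(set(winners.values())) == 1

print(winners)
print("objectives agree:", consistent)
```

With these invented numbers the two objectives pick different plans, which is the irksome-but-real situation described above.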
Selecting objectives is one challenge. In addition, relevant constraints can be elusive. Innocuous “oh, by the way” constraints can have a devastating effect on an optimization outcome. I once worked on a materials optimization problem in which – after over a year – the client prioritized a national security interest in the outcome. Did that matter? Oh, very definitely: now the atomic elements we were allowed to use for our material were restricted to what was readily available in designated “friendly” nations. It was a little like telling the chef that they will be making their desserts tonight without the aid of sweeteners, after they’ve broken the eggs and beaten the flour.
It’s easy enough to give an idea of an optimization problem, and quite a different thing to specify it to the point of reliable decision support. We might casually say that we want to minimize operating costs, but let’s not tell that to our linear programming package: the response will be to shut everything down – after all, that minimizes costs, doesn’t it?
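A toy version of that shutdown answer can be sketched in a few lines of Python. Everything here is an assumption for illustration: the two shifts, their hourly costs, their capacity limits, and the idea of a required number of machine-hours. The greedy fill is exact only because this toy problem has a single requirement; it is not a general linear programming solver.

```python
def cheapest_schedule(cost_per_hour, max_hours, required_hours=0.0):
    """Minimize total cost subject to 0 <= hours[s] <= max_hours[s]
    and sum(hours) >= required_hours.

    Filling the cheapest shift first is exact for this one-requirement
    problem; a real problem would go to a proper LP solver.
    """
    if required_hours > sum(max_hours.values()):
        raise ValueError("infeasible: not enough capacity")
    hours = {shift: 0.0 for shift in cost_per_hour}
    remaining = required_hours
    for shift in sorted(cost_per_hour, key=cost_per_hour.get):
        run = min(max_hours[shift], remaining)
        hours[shift] = run
        remaining -= run
    total = sum(cost_per_hour[s] * hours[s] for s in hours)
    return hours, total


# Hypothetical numbers: night electricity is cheaper than day electricity.
cost_per_hour = {"day": 30.0, "night": 18.0}
max_hours = {"day": 12.0, "night": 8.0}

# "Minimize operating costs" with nothing else specified.
no_demand = cheapest_schedule(cost_per_hour, max_hours)

# The same objective, plus a requirement to actually produce something.
with_demand = cheapest_schedule(cost_per_hour, max_hours, required_hours=14.0)

print(no_demand)    # every shift at zero hours: the shutdown answer
print(with_demand)  # night shift filled first, day shift covers the rest
```

The objective is identical in both calls. Only the added requirement turns “run nothing” into a schedule anyone could use.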
So what’s the trick? Brothers and sisters, if there is a trick I don’t know it – optimization problems are challenging to specify, often challenging to solve, but critically important. All reasonable objectives should be pursued, and all reasonable constraints should be introduced. And then, we still may not get things right on the first try. The best defense against suboptimal optimization may be to examine our answers for sensitivity to constraints or objective function parameters. It is the under-constrained solution that will often be the most reliable, and binding constraints merit special scrutiny. A good dose of “what if” can also be helpful; having the infrastructure to quickly recast the formulation of our problem also helps.
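One simple form of “what if” is to nudge each constraint and re-solve. The sketch below does that for a made-up two-shift scheduling problem; the shifts, costs, limits, and required hours are all hypothetical, and the cheapest-first fill stands in for a real solver only because the toy problem has a single requirement.

```python
def min_cost(cost_per_hour, max_hours, required_hours):
    """Cheapest-first fill; exact for this single-requirement toy problem."""
    remaining, total = required_hours, 0.0
    for shift in sorted(cost_per_hour, key=cost_per_hour.get):
        run = min(max_hours[shift], remaining)
        total += cost_per_hour[shift] * run
        remaining -= run
    if remaining > 1e-9:
        raise ValueError("infeasible: not enough capacity")
    return total


# Hypothetical numbers, for illustration only.
cost_per_hour = {"day": 30.0, "night": 18.0}
max_hours = {"day": 12.0, "night": 8.0}
required = 14.0

base = min_cost(cost_per_hour, max_hours, required)

# What if each capacity limit were relaxed by one hour?
capacity_sensitivity = {}
for shift in max_hours:
    relaxed = dict(max_hours, **{shift: max_hours[shift] + 1.0})
    capacity_sensitivity[shift] = min_cost(cost_per_hour, relaxed, required) - base

# What if the requirement grew by one hour?
demand_sensitivity = min_cost(cost_per_hour, max_hours, required + 1.0) - base

print("base cost:", base)
print("cost change per extra hour of capacity:", capacity_sensitivity)
print("cost change per extra required hour:", demand_sensitivity)
```

In this sketch, relaxing the day limit changes nothing, because that limit is slack. Relaxing the night limit lowers the cost, because that limit is binding. A binding constraint is the one to double-check: if it is wrong, the answer is wrong with it.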
After this long litany of potential issues and watch-outs, it might seem that constraints, in particular, are a worry – not only do they limit what we can accomplish, it can be hard to know if we have specified every relevant one. Life, it seems, would be better without them. Well, that may be, but constraints are a little like friction – we might wish we didn’t have them, but without friction there wouldn’t be traction, and without traction machines would not work, cars would not go, and people would not walk. Constraints are really what make the optimization problem – they are the difference between an optimization answer that says “minimize costs by closing down your company,” and an answer that says “minimize costs by running more machines at night when electricity costs are lower.”
Ironically, constraints can be the best friend an analyst ever had, because when an optimization problem is kept in view from the beginning of our process, its constraints can eliminate a great deal of tedious – or even infeasible – model-building. If training employees is not an option, we don’t need a model that includes training. If the only elements I can use in my material are silicon and germanium, I don’t need a model at all to say what semiconductor we’re going to use. For all of the challenges associated with posing and solving optimization problems, keeping the endgame of optimization in mind can be the difference between a successful analytics outcome and no outcome, particularly when information for a predictive model is limited. After all, a job is simpler when you don’t have to do it in the first place, right?