There Is A Name For That

As a gold standard for nomenclatural exactitude, it’s hard to beat the systematic IUPAC nomenclature for organic compounds, which establishes a one-to-one mapping between a compound’s name and its molecular structure. Names for complex chemical structures are correspondingly complex – for example, the IUPAC name for “chlorophyll a” is Magnesium [methyl (3S,4S,21R)-14-ethyl-4,8,13,18-tetramethyl-20-oxo-3-(3-oxo-3-{[(2E,7R,11R)-3,7,11,15-tetramethyl-2-hexadecen-1-yl]oxy}propyl)-9-vinyl-21-phorbinecarboxylatato(2-)-κ2N,N’]. Nonetheless, if you can make, buy, or think of a compound, there is an IUPAC name for it, all set and ready to go. It’s quite impressive.

Unambiguous nomenclatures like the IUPAC system are unusual. Crafting a good name is often not straightforward, but good names are crucial to ordinary discourse and data work alike – names indicate what we’re discussing, and provide the subject of any data question we’re going to ask. A good name parallels a working definition, separating what something is from what it is not. In addition, an ideal name is compact and easily comprehended – even IUPAC doesn’t always do well on that score.

Pithy but exact names are elusive, even for everyday concepts. The Merriam-Webster definition of a car, “a vehicle that has four wheels and an engine and that is used for carrying passengers on roads,” is reasonable, and will usually get the job done. But there are moving vehicles whose essential car-hood is uncertain, including three-wheeled cars/trikes, car/pickup combinations, flying cars, and the spectacular Mondrian mobile, which deserves to be in a category all its own.

In databases and ordinary discourse, a name is often all we have to tell us what something means. So when we lack a precise nomenclature, we will inevitably start comparing what is not truly comparable. In discussion this results in unresolved arguments; in data work, the results are problematic outcomes of unknown quality (and possibly arguments, too). I’ve witnessed “costs” that are compared without considering different times, or currency values, or the method of valuation, and ditto for “values.” What is the cost of my house? The county auditor’s site says one thing, which is different from what I paid, which is different from the present value of what I paid, which is different from what I would ask, which is different from its selling price. Even “dates” can be subtly different and compared invalidly. In one case, I found a contractual in-force time starting from when a customer paid their first bill. The bill date was labeled “start_date,” but the label didn’t indicate which start date. While the problem was understandable, it was also costly for the team involved.
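To make the house example concrete for data work, here is a minimal Python sketch in which a “cost” has to carry its qualifiers before it can be compared. Everything in it is an assumption made for illustration: the `Cost` and `Valuation` names, the four valuation categories, the rule that two figures are comparable only when currency, valuation method, and year all match, and the dollar amounts, which are invented.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List


class Valuation(Enum):
    """Which kind of 'cost' a figure is (hypothetical categories)."""
    ASSESSED = "assessed"              # e.g. a county auditor's figure
    PURCHASE_PRICE = "purchase_price"  # what was actually paid
    ASKING_PRICE = "asking_price"      # what the owner would ask
    SALE_PRICE = "sale_price"          # what it eventually sells for


@dataclass(frozen=True)
class Cost:
    """An amount together with the qualifiers a bare 'cost' column omits."""
    amount: float
    currency: str
    as_of: date
    valuation: Valuation


def mismatches(a: Cost, b: Cost) -> List[str]:
    """List the qualifiers on which two costs disagree."""
    reasons = []
    if a.currency != b.currency:
        reasons.append("currency")
    if a.valuation != b.valuation:
        reasons.append("valuation")
    # Assumed rule: figures from different years are not directly comparable.
    if a.as_of.year != b.as_of.year:
        reasons.append("as_of year")
    return reasons


def difference(a: Cost, b: Cost) -> float:
    """Subtract two costs, refusing when they are not the same kind of thing."""
    problems = mismatches(a, b)
    if problems:
        raise ValueError("not comparable: " + ", ".join(problems))
    return a.amount - b.amount


# Invented figures, for illustration only.
assessed = Cost(250000.0, "USD", date(2015, 1, 1), Valuation.ASSESSED)
paid = Cost(200000.0, "USD", date(2005, 6, 1), Valuation.PURCHASE_PRICE)
print(mismatches(assessed, paid))  # ['valuation', 'as_of year']
```

Calling `difference(assessed, paid)` raises a `ValueError` instead of quietly returning a number. The same idea applies to an ambiguous date column: a name such as `first_payment_date` (again hypothetical) says which start date is meant, where a bare `start_date` does not.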

In cases like “costs”, “values”, and “start dates” the names are not specific enough – several specific concepts had the same labeling. Names can also be overreaching, as when the name of a numeric measure implies equivalence to a more general concept. As an example, the body-mass index (BMI) is useful in assessing a healthy body weight, but it is not a complete metric for assessing “obesity.” The BMI is numerical and therefore convenient for analytical work, but convenience does not make it an obesity metric. (I have explained this to at least one medical professional when I was declared to be, but definitely was not, obese.)

Given the importance of naming, I support almost any effort to craft good naming conventions. On the other hand, I find myself becoming impatient with willfully ambiguous nomenclature, whether the motivation is political, economic, or something else – the resulting confusion impedes both social and technical discourse. As one example, consider “clean energy.” After reading several conflicting essays telling me what “clean” means, I concluded that for practical purposes, the term means little. I personally wouldn’t sign on for the definition that energy is clean if, like electricity, it doesn’t pollute when I use it. In 2015 about 66% of US electric power was generated from hydrocarbons (at roughly 50% efficiency). That’s clean? And then we have political labels like “conservative” and “liberal.” Advocates, having tussled over these terms, appear to have stripped them of much of their meaning. I hope it was worth it…

In a recent column, Philip Galanes discussed how someone should be addressed when the form (Sir/Ma’am/Ms/Mr/Miss) might prove awkward – a variant of the nomenclature question. And his recommendation? Don’t use the label at all. Although his subject was social situations, I thought his suggestion was excellent, and we might apply it generally. From my perspective, it’s always OK to inquire what a label means and, when encountering a general label, to ask whether it might be ambiguous. And when the label is ambiguous, don’t use it.

Of course, if the ambiguity is intentional, such questions may not be answered with alacrity. Once in a doctor’s office I was handed a list of my “current medications,” which included everything I had ever taken (or perhaps even thought of). I assumed it was a mistake – perhaps a software glitch – and I thought I would helpfully mention the problem. When I did, the staff person I spoke with was rather offended – and I was then treated to a lecture explaining that the word “current” had been added to avoid patient confusion.

Well, here’s an alert: there is nothing more confusing than ambiguous nomenclature. When the “what” of a question or discussion is ambiguous, the answers will be too. I’ve long wondered how many of the arguments we witness – whether technical or social – entail people believing they are discussing the same thing, when they are really discussing two different things. I’m not sure, but good names certainly lead to better data questions and data outcomes, and very likely to better general discourse too.

