The Chandelier Problem
As a heat-averse British Columbian, the hot sun of Houston was not the most startling thing to me at my third major data science research conference. At the 2017 SIAM International Conference on Data Mining in Houston, Texas, I attended a panel on why doctors don’t apply machine learning and other such methods. Initially, I assumed it would just summarize familiar complaints, such as:
Ignorance: doctors are not familiar with the benefits of technology in general;
Traditionalism: the tendency never to change anything that works well enough;
Neophytism: doctors don’t know which data science solutions are relevant to the problems in their area of expertise.
Similar issues may occur in your own field, where innovation seems stifled by a lack of interest or familiarity. Unfortunately for my preconceptions, the actual panel was far more nuanced. It wasn’t simply that doctors are ignorant tribesmen from an isolated culture, burdened by a lack of rationality and lacking the faith to convert to the wonders of data science; it was a far deeper criticism of data science as a field. I would summarize their points as follows:
Patient ignorance: The algorithm ignores individual facts about a patient, such as their personal history or their demeanor during an appointment, and cannot tailor its approach to them individually.
Anti-science: The model is developed while ignoring the biology underlying the human body.
Trust: There isn’t enough validation that the approach actually works or generalizes to other populations and edge cases. A prediction may rest on faulty assumptions and cannot explain those assumptions convincingly.
All of these are very real concerns, not simple surface-level deflections; they go to the central question of data science itself. Early in my career, I underestimated how data science can manifest as cutting away the whole idea of science. Precise, correct modeling is replaced by ruthless empiricism: the end result, good prediction or bad, is all that matters, regardless of how that prediction is made or its real-world effects. All the “fluff” of real-world psychology, of human interaction, of complex phenomena is stripped away and reduced to a prediction.
Just like the first missionaries, effective conversion is about integration. Academia itself is full of research for the sake of research: of designing a chandelier too specific to ever be applied to different circumstances, or a poor fit for the decor and space it hangs in. It is self-gratification for the authors rather than service to any real company or stakeholder. People don’t ask the fundamental questions of “Who cares?” or “Is this useful?”
Some funny and not-so-funny examples of over-engineering:
The Juicero product rocks and is one of my favourite things in life: taking an idea and spending a hundred times more money to do something you can already do sufficiently well by hand.
Theranos is interesting as a case of focusing on the end product without making sure it works: of chasing the dream of an idea without having the foundation to actually achieve it. It ignored the sheer engineering difficulty of detecting illness from a few drops of blood while scamming investors out of billions of dollars in funding.
The book Three Cups of Deceit covers a person who developed a charity to build schools in Afghanistan, in places where the schools wouldn’t be used or even survive the winter.
A lot of Silicon Valley seems to be like that: check out my blockchain app, or my app to solve homelessness that requires all the homeless to have cellphones.
I’ve been asking those questions of “why” in my own work and volunteering: “Is this just going to be another over-designed chandelier, inconsistent with actual stakeholder needs?” The answer is usually yes, until a lot more work is done.
Technology for the sake of technology is an ouroboros devouring itself: creating a problem just because it can be solved, putting the cart before the horse and calling it innovation rather than a waste of time. I founded ROM2 Consulting to try to address such issues, especially in the volunteering space: not simply asking “What technology is relevant?” but “What technology is actually feasible to implement, apply, and sustain?” I hope this consulting service leads to more purposeful applications of data science.