Of Bad Science and Solid Science

When Spruance Del Curtin had agreed to work on Dr. Doorne’s project, he’d been under the impression that he’d be getting to do something interesting and extraordinary. He’d been at it for three days now, and so far all he was doing was checking the validity of datasets. Talk about tedious.

He looked up at the ceiling and rolled his eyes, wishing he could Instagram it. But Dr. Doorne had been quite clear about the matter — one whisper about this project, and he was out. No questions asked, no excuses accepted, just a quick boot to the behind.

And speaking of Dr. Doorne, she would have to pick that moment to walk into her office. “Good morning, Mr. Del Curtin.” Although she kept her tone conversational, Sprue could tell she was not pleased.

Still, he’d best show no sign of annoyance, nothing that could appear “defensive.” A polite greeting, a pleasant question about how things were going for her.

But she was not going to be deflected by the outward gestures of politeness. “You seem rather unenthusiastic today. I had been under the impression that you were excited about this project.”

“Well, I was.” Sprue studied her, trying not to narrow his eyes too much in the process and look disreputable. “But I thought it was going to be something a lot more interesting than going through reams and reams of data. I mean, I know data sanitization is important, but does it have to be so boring?”

Dr. Doorne pulled out the second chair and sat down. “A lot of things that are worth doing are boring. For instance, Tycho Brahe spent years accumulating celestial observations that were as accurate as he could make them with the instruments available to him. I would imagine that meant a lot of boring nights in a chilly observatory. And when Johannes Kepler used Tycho’s data to work out the elliptical nature of orbits, I can assure you that meant hours upon hours of tedious hand calculations, every one of which needed to be done perfectly, which would mean doing them multiple times and making sure they matched.”

Sprue understood that concept — modern statistical packages made heavy use of such processes as regressions to minimize error when it couldn’t be eliminated. It was also why you kept backups, and worked only on copies of your data, not the original.

“OK, got that. But why are you having me go through all this data,” he gestured at the columns of figures on the monitor in front of him, “with no idea of what any of it is about? That’s what’s making it super-tedious.”

Was that a smile curling Dr. Doorne’s lips ever so slightly? “Remember what we talked about in our first class about bias and lying with statistics?”

Sprue had not forgotten that day. Dr. Doorne had given them what seemed like a dozen examples of statistics done badly, with dire warnings about the fate that would befall any professional scientist who committed those statistical sins.

“Then you understand why we need to be careful that we don’t end up cherry-picking what fits our hypothesis, or otherwise seeing what we expect to find.”

Sprue considered the concept. “So it’s kind of like double-blind tests in medicine?”

Yes, that was a definite smile. “Exactly. We want the data verified and sanitized by someone who understands the general principles of the process, but not enough about the purpose of this specific dataset to bias the process.”

Which meant this was something super-important. Now all he could do was get the job done as best he could and hope he’d be let in on what was going on as things progressed.