Sampling Error


“To be successful in its search for objective truth, science must identify, combine, and explain all viewpoints.” That sentiment, inscribed on a large plaque in the lobby of WICO’s Boulder field office, guided yesterday’s study and evaluation of the protocols that test communities use to report local conditions and the results of applying parts of the global strategy, reports that are then used to refine it. I was also admittedly biased by my belief that the strategy’s probability of success would be greatly increased by: (1) encouraging a mindset of creation instead of remediation; and (2) using people’s innate reactions to an environment as a means of assessing its health.

I started by applying a lesson learned the hard way as a journalist: context is everything. The context in this case was the selection and definition of the test communities themselves. Based on my earlier discussions with other members of the Quality Assurance team, the communities (or “test environments,” as the team often called them) were established or enlisted because of their proximity to the observation stations used in the biosphere assessment, stations that had been sited on the basis of ecosystem characteristics.

The “regions” used in the global strategy were defined much later as random samples that, like pieces of a jigsaw puzzle, would together provide a meaningful picture of how a set of global variables is distributed around the world. I was reminded of how strange they looked on a map: some extremely large, others relatively small, and not all of them contiguous. Rico Sanguini, the data analyst, had explained when I first saw the map that territorial connection wasn’t nearly as important as the relationships of the main global variables, such as ecological resources.

With the help of Innes Johnson, one of the technicians at the field office, I constructed a map showing the locations of the test communities along with the region boundaries. It was immediately obvious to my untrained eye that the communities could not be representative of the regions. Johnson told me she had noticed the same thing, and had been told that the apparent disparity didn’t matter because the environmental considerations were most important anyway and the biosphere assessment had proved that they were adequately represented.

I decided to ask Sally for clarification since she was likely the original source of the explanation. “I can’t help you with that, Will,” she responded.

“You can’t or you won’t?” I pressed, surprised by the refusal.

“Within the scope of assuring the best chance of success, your concern is not worth pursuing. I appreciate your diligence, though.”

It was the first time in our relationship that I felt genuinely irritated with her. “You’re too busy. I get it,” I said, and decided to find my own answer.

There are three test communities in the Rocky Mountain area, each in a different ecosystem type or “biome.” WICO classifies the entire area as belonging to just one region, which encompasses five states. Worldwide, there are 12 biomes for 300 regions, represented by 120 test communities. Two things stood out from these numbers: the original number of regions was based on the number of existing test communities, which averaged about ten per biome; and, with fewer communities than regions, the Rocky Mountain area should have one test community at most.

Following my original instructions, I studied the reporting protocols and a sample of recent reports with an eye for the types of information being collected and sent. While the protocols were nearly identical, I noticed a clear bias in the reports, both in terms of environmental observations and the health and welfare of the residents. Someone smarter than me would need to do the analysis, but it looked like the types of environments and access to cities were playing huge roles in determining what the residents were choosing to report and how they were framing it.

From yesterday through part of today I discussed my concerns with Maura, the field office personnel, and the rest of our team. By the end, we agreed that there is a significant problem with the whole process. Sally was no more forthcoming with them than she was with me, which suggested a potentially larger issue.

Reality Check

The quote about science on the plaque and Will’s biases are all mine. 

I am not a journalist. I first learned “context is everything” from history professors in college, who insisted that reports of events should first be interpreted based on the experience of the originators. At the time I was majoring in physics, which focuses on understanding the fundamentals of how everything in nature relates to everything else in deterministic and non-deterministic ways, which together define the context for what we all experience. My career as a test engineer reinforced the lesson, as I was forced to identify and characterize the influences of measurement equipment and environmental conditions on test results. Success in my next career as a technical writer (with similarities to journalism) depended on keeping goals, audiences, and the biases of sources and audiences in mind when writing documentation, as well as understanding the real-world subjects and the many potential influences on them. Now, as a freestyle researcher and writer who creates more than I report, I must thoroughly internalize context to ensure consistency and quality in what I share with the world.

References to the biosphere assessment are loosely based on my experience in systems integration and testing of environmental variables. The number of biomes is real, and my derivation of the number of regions is based on actual population estimates.

