PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
February 2017
SECTORINSIGHT:.com
Education and Professional Development in the Geospatial Information Science and Technology Community
By Pierre Goovaerts, Chief Scientist at BioMedware, Inc.
On the Danger of Biased Spatial Sampling: The Case of Drinking Water Lead Levels in Flint, Michigan
A common practice in geospatial science is the reliance on sampling to
characterize populations that are too large to be measured exhaustively;
in other words, a measurement cannot be collected at every location in
the spatial domain. When designing sampling schemes, one should keep in
mind the objectives of the study (i.e., which questions we are trying to
answer), as well as the definition of the population to be studied.
Attention to sampling design is particularly critical when analyzing data
that were collected by others and are sparsely documented. The analyst
should thus question whether that sample is representative of the
underlying population and explore ways to correct statistics whenever
biased or preferential sampling is suspected.
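
As an illustration of what correcting a statistic can look like, the sketch below applies post-stratification: each stratum's sample rate is weighted by that stratum's share of the population rather than by its share of the sample. The text does not name a specific correction method at this point, so the choice of technique is an assumption, and the strata names, counts, and the function `poststratified_rate` are hypothetical.

```python
def poststratified_rate(pop_counts, sample_counts, sample_exceed):
    """Weight each stratum's sample rate by its share of the population.

    All three arguments are dicts keyed by stratum label (hypothetical inputs):
      pop_counts     number of population units per stratum
      sample_counts  number of sampled units per stratum
      sample_exceed  number of sampled units exceeding a threshold per stratum
    """
    pop_total = sum(pop_counts.values())
    estimate = 0.0
    for stratum, n_pop in pop_counts.items():
        n = sample_counts.get(stratum, 0)
        if n == 0:
            # A stratum with no samples cannot be corrected by reweighting.
            raise ValueError(f"no samples in stratum {stratum!r}")
        stratum_rate = sample_exceed.get(stratum, 0) / n
        estimate += (n_pop / pop_total) * stratum_rate
    return estimate


# Made-up numbers: the "old" stratum is 60% of the population
# but only 20% of the sample.
pop_counts = {"old": 600, "new": 400}
sample_counts = {"old": 20, "new": 80}
sample_exceed = {"old": 10, "new": 8}

naive = sum(sample_exceed.values()) / sum(sample_counts.values())
adjusted = poststratified_rate(pop_counts, sample_counts, sample_exceed)
print(f"naive rate:    {naive:.2f}")
print(f"adjusted rate: {adjusted:.2f}")
```

With these invented counts the naive rate understates the reweighted one because the stratum with the higher rate is under-sampled. Reweighting only helps when every stratum has at least some samples and the stratifying variable captures the source of the bias.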
The drinking water contamination crisis in Flint, Michigan, was a painful
reminder that biased monitoring and sampling procedures are easy ways to
falsify facts. Indeed, the delay in reporting high levels of lead in
Flint drinking water was partially caused by the biased selection of
sampling sites. Flint’s water testing from late 2014 missed the bulk of
the city’s lead pipe network, instead targeting properties on the eastern
and western fringes of the city which, in some cases, were a long way
from any apparent source of lead. Even more troubling was the news that
such practices are used by other public water systems throughout the
country!
Since Flint returned to its pre-crisis source of drinking water on
October 16, 2015, close to 25,000 water samples have been collected and
tested for lead and copper in more than 10,000 residences. The majority
of these samples (80%) were collected through voluntary, homeowner-driven
sampling, whereby concerned citizens picked up free testing kits
available to residents at local water distribution centers and conducted
the sampling on their own. This type of crowdsourcing was supplemented by
State-controlled monitoring which, in a first phase called the sentinel
program, aimed to determine the general health of the distribution system
and to track changes in lead concentrations over time. Once again,
samples were collected by homeowners, although after training by a
sentinel team. State officials relied on the latter data to demonstrate
the steady improvement in water quality (i.e., a declining percentage of
measurements above the EPA action level of 15 μg/L) since the source
water switch. A recent analysis of the remaining 80% of the data
revealed, however, an actual increase in the proportion of lead levels
above 15 μg/L, averaging at some point twice the percentages reported by
the sentinel program¹. Despite the lack of control on the selection of
non-sentinel sites (crowdsourcing has been labelled as “non-probability”
sampling), one should expect both sampling programs to share the same
objective of characterizing water lead levels in Flint’s housing stock in
general. A legitimate question is thus whether any of these results can
be trusted and whether sampling bias could be the culprit for such
opposite trends.

¹ Goovaerts, P. 2016. The drinking water contamination crisis in Flint:
Modeling temporal trends of lead level since returning to Detroit Water
System. Science of the Total Environment,
² Hanna-Attisha, M.; LaChance, J.; Sadler, R. C.; Champney Schnepp, A.
(2016). Elevated Blood Lead Levels in Children Associated With the Flint
Drinking Water Crisis: A Spatial Analysis of Risk and Public Health
Response. Am. J. Public Health 106, 283−290.

Photogrammetric Engineering & Remote Sensing
Vol. 83, No. 2, February 2017, pp. 77–78.
0099-1112/17/77–78
© 2017 American Society for Photogrammetry and Remote Sensing
doi: 10.14358/PERS.83.2.77
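
The two trends contrasted above (sentinel versus voluntary sampling) rest on one simple statistic: the percentage of measurements above the 15 μg/L action level, tracked per program and per time period. A minimal sketch of that tabulation follows; the records are invented for illustration, and the tuple layout and the function name `exceedance_by_group` are assumptions, not the format of the actual Flint datasets.

```python
from collections import defaultdict

ACTION_LEVEL_UG_L = 15.0  # EPA action level for lead cited in the text


def exceedance_by_group(samples):
    """Return {(program, month): percent of samples strictly above the action level}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [n_above, n_total]
    for program, month, lead_ug_l in samples:
        tally = counts[(program, month)]
        tally[0] += lead_ug_l > ACTION_LEVEL_UG_L  # True counts as 1
        tally[1] += 1
    return {key: 100.0 * above / total for key, (above, total) in counts.items()}


# Made-up illustrative records: (program, month, lead concentration in ug/L)
samples = [
    ("sentinel", "2016-02", 3.0),
    ("sentinel", "2016-02", 18.0),
    ("sentinel", "2016-02", 1.0),
    ("sentinel", "2016-02", 7.0),
    ("voluntary", "2016-02", 22.0),
    ("voluntary", "2016-02", 4.0),
    ("voluntary", "2016-02", 30.0),
    ("voluntary", "2016-02", 9.0),
]

rates = exceedance_by_group(samples)
for (program, month), pct in sorted(rates.items()):
    print(f"{program:9s} {month}: {pct:.1f}% above {ACTION_LEVEL_UG_L} ug/L")
```

Computing the same statistic separately for each program is what makes diverging trends visible; it does not by itself say which program, if either, is representative.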
In this particular case, the set of all 51,045 residential tax parcels
located within the City of Flint is viewed as the population of interest.
Lead in drinking water mainly comes from lead fixtures and pipes present
within old houses (premise plumbing), in addition to lead service lines
(LSL) bringing water from the street water main to the property. A
representative sample would thus be expected to reproduce the main
housing and neighborhood characteristics suspected to influence water
lead levels, such as presence of LSLs, construction year, or census-tract
poverty level. Measurements should also be uniformly distributed within
the city boundaries to account for any other putative factors (e.g.,
water travel time between the treatment plant and the home plumbing
system) likely to be spatially structured. Their spatial distribution was
assessed here using the percentage of data collected in each of the nine
city wards, since these geographical units were used in the seminal paper
on children’s blood lead levels in Flint that caused the media storm and
triggered the emergency response².
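
The ward-level check described above amounts to comparing two sets of percentages: the share of the population in each ward and the share of the sample in each ward. The sketch below shows one way to tabulate the difference; the ward labels and counts are invented, and the helper names `share_by_category` and `representation_gap` are hypothetical. The same comparison would apply to any categorical characteristic, such as presence of an LSL or a construction-year class.

```python
from collections import Counter


def share_by_category(values):
    """Percentage of records falling in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def representation_gap(population, sample):
    """Sample share minus population share, in percentage points.

    Only categories present in the population are reported; a positive
    value means the category is over-represented in the sample.
    """
    pop_share = share_by_category(population)
    smp_share = share_by_category(sample)
    return {cat: smp_share.get(cat, 0.0) - pop_share[cat] for cat in pop_share}


# Made-up example with three wards instead of nine.
population_wards = [1] * 20 + [2] * 30 + [3] * 50  # 100 parcels
sample_wards = [1] * 10 + [2] * 6 + [3] * 4        # 20 sampled parcels

gaps = representation_gap(population_wards, sample_wards)
for ward, gap in sorted(gaps.items()):
    print(f"ward {ward}: {gap:+.1f} percentage points")
```

In this invented case ward 1 is over-represented and ward 3 under-represented by the same margin, which is the kind of imbalance the comparison is meant to expose.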
Assessing the representativeness of both sample sets did not require the
application of advanced statistical procedures. A simple comparison of
percentages computed from the reference population (Flint housing stock)
and the two datasets