"Bad data" and the push to restrict distribution
- Archived: Fri, 22 Sep 16:02
- Date: Fri, 22 Sep 2000 15:58:05 -0400 (EDT)
- From: Rich Puchalsky <rpuchalsky@att.net>
- Subject: "Bad data" and the push to restrict distribution
James Conrad wrote:
>I have a harder time understanding the hostility to the Office
>of Environmental Information. Do folks really prefer EPA's
>databases to contain wrong information, or not to include any
>sort of contextual explanations or disclaimers? Mechanisms
>exist to fix errors that reporters make, but until recently no
>mechanisms existed to fix errors EPA made. Kind of Orwellian
>from our perspective. OEI does need to get on the stick, but
>that's an issue of bureaucracy, not politics.

I handled complaints to RTK NET about accuracy of the TRI database
for many years. A representative of a TRI respondent would call
and tell us that we were showing the wrong data. In every case
that I handled, it turned out to be a reporting error on their part
that had dutifully been put into the database by EPA. In many
cases, these respondents had filed revisions that hadn't gotten
into the public database yet, since EPA generally supplied one copy
of the database per year at that time.

In every case, I passed the complaint along to someone at EPA who
worked with the respondent to fix the data. In some cases I would
manually make a change in RTK NET's data to reflect an update that
hadn't gotten out yet.

So there always have been mechanisms to fix data. Just call EPA,
and they'll help you fix it. That has been true since at least
1991. All of these data are submitted by industry, not EPA, and
it's the individual respondent's responsibility to check them before
sending them. If EPA did make an error in copying the data, the same
mechanisms were available to fix it. EPA even sends back TRI
numbers to people who report high or unusual ones, just to check
whether EPA made an error.

So why such a fuss about it now? Because it distracts EPA from
new projects and puts them on the defensive about releasing data.
As for contextual explanations for data, the push for more of those
is a push for EPA to restrict distribution of raw data. EPA more
and more often refuses to distribute raw data because of the chance
of misinterpretation.

In fact, one recent industry lobbying effort concerned "responsible
use". As far as I could determine, this meant that EPA should not
distribute data in any format that could be "misused", and should
design its data provision tools to prevent "misuse". What the heck
is misuse? Unsupported interpretation or bad statistics? That is
the responsibility of the data user, not the data provider.

The whole responsible use concept is a push for suppression of
information. It's like telling a library that they shouldn't
provide access to a book on revolutions because someone could misuse
it to start a revolution. I suspect that "misuse" is defined as
any use that could embarrass EPA or industry.

But the issue is presented as one of contextual explanations and
disclaimers. How can anyone be against those? Well, you can be
against them if the only tools of data release are those that will
helpfully display contextual explanations all the time -- in other
words, predigested summaries of minimal value.