A few notes on the draft report, Preliminary Water Quality Screening
Results; Lake Merced Pilot Stormwater Enhancement Project:
1) While full contact recreation may not be 'permitted' at Lake Merced,
Water Contact
Recreation remains a designated beneficial use according to the
Regional Water Quality Control Board (RWQCB), and water quality
standards appropriate to that beneficial use should apply.
Further, full body contact does occur, occasionally, and usually
inadvertently, when someone flips a boat. I think that we have
now agreed with the Health Department and the Public Utilities
Commission that standards for infrequent fresh water contact should
apply. I believe that represents an E. coli count of 583 MPN/L or
less. (Your report says 576; I'll go with that.)
2) If metals are in the feedstock, and not in the lake and not in the
ground, where did they go?
3) Scientific method dictates that the hypotheses tested should be the
opposite of the desired outcome, the so-called null hypotheses. I have
not yet read the actual statistical evaluation, so perhaps this issue
has been addressed. However, the hypotheses as stated are not fully
testable.
4) Your statistical analysis (Appendix E) states that "all of the
log-normally transformed groups are normally distributed." However,
your tables indicate no reportable results for either the
Kolmogorov-Smirnov or the Shapiro-Wilk test of normality when sample
size was just 3, so normality has not been demonstrated for those
groups. In fact, calculating a variance with a sample of just three
points is a highly dubious enterprise. I would not be willing to bet
my lake on that outcome.
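To put a number on the point in note 4, here is a minimal sketch of the textbook confidence interval for a variance estimated from three points. It assumes the underlying data are normally distributed, which is the very thing in question; the function name and the 95% level are my own choices for illustration and are not taken from the report.

```python
import math

def variance_ci_three_points(sample_variance, confidence=0.95):
    """Confidence interval for the true variance when n = 3 (2 degrees of freedom).

    Assumes normally distributed data. A chi-square variable with 2 degrees
    of freedom is exponential with mean 2, so its quantile function has the
    closed form q(p) = -2 * ln(1 - p) and no statistical tables are needed.
    """
    alpha = 1.0 - confidence
    df = 2
    chi2_upper = -2.0 * math.log(alpha / 2.0)        # quantile at 1 - alpha/2
    chi2_lower = -2.0 * math.log(1.0 - alpha / 2.0)  # quantile at alpha/2
    return df * sample_variance / chi2_upper, df * sample_variance / chi2_lower

low, high = variance_ci_three_points(1.0)
print(f"true variance lies between {low:.2f} and {high:.1f} times the estimate")
```

Under these assumptions the 95% interval runs from roughly a quarter of the estimated variance to nearly forty times it, which is why I would treat any three-point variance with caution.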
5) That said, your next observation, that the Student t test assumes
normally distributed populations, is quite correct. Therefore, with
sample sizes smaller than 5, where normality cannot be verified, I
think that the Student t test is simply not applicable. Whether there
is some opportunity to cluster or group these samples is worth
exploring.
6) I note that in some places you report taking a logarithmic mean, in
others a geometric mean of logarithms. I'm not sure what the
implication of the latter is, but it sounds like a double smoothing.
If so, it may obscure differences rather than correct for non-normal
distributions in the raw data.
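To show what I mean in note 6, here is a small sketch comparing the two averages. The three counts are made up purely for illustration and are not data from the report, and the use of base-10 logarithms is my assumption.

```python
import math
import statistics

# Hypothetical counts, chosen only for illustration (not report data).
counts = [100.0, 1000.0, 10000.0]
logs = [math.log10(c) for c in counts]

# Arithmetic mean of the logs, back-transformed. This is the ordinary
# geometric mean of the raw counts.
back_from_mean_of_logs = 10 ** statistics.mean(logs)

# Geometric mean of the logs, back-transformed. This averages twice on a
# compressing scale and is only defined when every log is positive.
back_from_geo_mean_of_logs = 10 ** statistics.geometric_mean(logs)

print(back_from_mean_of_logs, back_from_geo_mean_of_logs)
```

For these made-up counts the first figure is 1000 and the second is noticeably lower. Because a geometric mean never exceeds the arithmetic mean of the same numbers, the second method pulls the back-transformed result downward whenever the logs differ.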
7) Again, with respect to significant differences, since the standard
error of a mean is inversely related to the square root of the sample
size, with very small samples (i.e., <5) the sensitivity of this test
is substantially reduced. That is, the difference between two outcomes
would need to be quite large to indicate statistical significance. We
come then to the distinction between importance and significance. If
the observed differences would be important were they real, then the
question of whether they are statistically significant may be moot.
Has that question been addressed?
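As a rough illustration of note 7, the sketch below computes how large a difference between two group means must be before a pooled two-sample Student t test calls it significant at the two-sided 5% level. It assumes two groups of equal size n with a common standard deviation; the critical values are the usual table figures, and the function names and group sizes are my own choices rather than anything in the report.

```python
import math

# Two-sided 5% critical values of Student's t from standard tables,
# keyed by degrees of freedom (2n - 2 for two groups of size n).
T_CRITICAL_5PCT = {4: 2.776, 10: 2.228, 30: 2.042}

def t_statistic(mean_difference, pooled_sd, n):
    """Pooled two-sample t for two groups of equal size n."""
    standard_error = pooled_sd * math.sqrt(2.0 / n)
    return mean_difference / standard_error

def smallest_significant_difference(pooled_sd, n):
    """Smallest difference in means that reaches significance at 5%."""
    return T_CRITICAL_5PCT[2 * n - 2] * pooled_sd * math.sqrt(2.0 / n)

for n in (3, 6, 16):
    needed = smallest_significant_difference(1.0, n)
    print(f"n = {n:2d}: difference must exceed {needed:.2f} standard deviations")
```

With three samples per group the means must differ by about 2.3 standard deviations to register as significant, against about 0.7 with sixteen per group. A difference large enough to matter for the lake could therefore easily fail the test at n = 3.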
John Plummer
September 15, 2005