## Introduction

When limnologists contemplate a sampling design for a project, statistical considerations are sometimes treated as low-priority and considered only after the project is completed. For volunteer monitoring projects, the role of statistics may be even further marginalized as:

• Control of lake, site, date, or even technique is removed from the coordinator’s grasp.
• The project coordinator is faced with the dilemma of obtaining meaningful data, despite the variability inherent in data submitted by numerous volunteers.

However, high-quality data greatly facilitate all subsequent predictive and diagnostic efforts. A number of manuals are aimed specifically at the sampling of lakes and reservoirs and offer coordinators suggestions on sampling design.

Reservoirs

• Gaugush, R.F. 1986. Statistical methods for reservoir water quality investigations. Instruction Report E-86-2, U.S. Army Engineer Waterways Experiment Station, Vicksburg, Miss.
• Gaugush, R.F. 1987. Sampling design for reservoir water quality investigations. Instruction Report E-87-1, U.S. Army Engineer Waterways Experiment Station, Vicksburg, Miss.

Lakes

• Reckhow, K.H. 1979. Quantitative techniques for the assessment of lake quality. U.S. E.P.A. Office of Water Planning and Standards. EPA-440/5-79-015.
• Reckhow, K.H. and S.C. Chapra. 1983. Engineering Approaches for Lake Management. Volume 1: Data Analysis and Empirical Modeling. Butterworth Publishers (Ann Arbor Science).

## Data Integrity Considerations

### Statistical Terms To Know:

Accuracy: the closeness of a measured or computed value to its true value (Sokal and Rohlf, 1981). Robust sampling programs seek greater accuracy.

Precision: the closeness of repeated measurements of the same quantity (Sokal and Rohlf, 1981). Greater precision is desirable for any sampling program.

• Precision is maximized through careful instructions to the volunteers, a standardized sample collection and storage procedure, minimal handling of samples by the volunteer, and, of course, careful analytical laboratory procedures.
• Quality Assurance (QA): the set of operating principles that should be set up at the beginning of the program to guide both the volunteer and the laboratory in pursuit of precision.
• Quality Control (QC): procedures formulated after the QA plan is created and used to assure that the volunteer and the laboratory are maintaining the necessary degree of precision.

Bias: a systematic error introduced into the estimate by either the method itself or the use of the method (American Public Health Association, 1989). Reduction of bias should be a goal for all sampling programs.

• Method components which may introduce errors include (but are not limited to): method interpretation by samplers, site selection, sampling, storage of samples, or analysis of samples.
• Bias can invalidate the data of the most well-planned monitoring program.
• Techniques to deal with bias:
    • Identify and remove forms of bias (can be costly and/or impractical).
    • Acknowledge existing bias and adjust expectations and interpretations of the data accordingly.

## Mitigating Analytical Bias

### Establish A Sampling Plan

To maximize collection of accurate and useful data:

• Carefully define project goals and expectations
• Based on defined goals/expectations, devise appropriate sampling strategies (methods).
• List possible biases accompanying each method
• Eliminate, minimize, or manage each bias (as possible).

Example:

Goals stated for the State of Vermont’s lake monitoring program (Smeltzer, et al., 1989) were to:

1. Provide a perspective on the range of eutrophication conditions found in the state’s lakes
2. Describe water quality conditions in individual lakes
3. Provide data useful in developing regional empirical eutrophication models
4. Detect trends of deteriorating water quality caused by nutrient pollution.

Existing bias:

• Large and more heavily used lakes within the state were more likely to be selected for the program (Smeltzer, et al., 1989).

Solution(s):

• The bias was recognized, but could not be eliminated. Consequently, the description of lakes included within the program was altered from “Vermont’s lakes” to “Vermont’s major recreational lakes.”

Re-phrasing the description allowed the program to provide information on the lakes thought to be of greatest importance to the State of Vermont (i.e. large, recreational lakes), as opposed to a more unbiased estimate of the overall lake water quality in the state.

## Standard Sampling Strategies (Methods)

Table 1. Common sampling strategies for lakes and reservoirs. The “Typical Sampling” scheme is not recommended.

| | Random | Stratified | Sequential | “Typical” Limnological Sampling Strategy |
|---|---|---|---|---|
| Lake | Chosen randomly from geographic area | Randomly chosen from within a geographic region (state, county, ecoregion) or some other classification (recreational, water supply) | Sample every lake along a transect, using a randomly chosen transect starting point | Choose lake based on convenience, access, proximity, or interest |
| Site | Randomly chosen from lake grid | Randomly selected from within regions of lake | Sample pre-chosen, equidistant sites along a transect of the lake, starting with a randomly chosen point | Sample at the dam or over the deepest part of the lake |
| Depth | Randomly chosen | Randomly chosen within depth regions (epilimnion, hypolimnion, photic zone, etc.) | Sample at preset intervals, starting with a randomly chosen depth | Sample at the surface or at preset intervals, surface to bottom |
| Date | Randomly chosen | Randomly chosen within season, month, or limnological period | Sample every two weeks, starting with a randomly chosen date | Sample the same day every week |
| Time | Randomly chosen | Randomly chosen within a period such as daylight or some other division of the day | Sample every two hours, starting with a randomly chosen time | Sample when you get there |

### Caveats for Each Method

Random Sampling

• Simplest to analyze, but it does not account for known gradients within lakes, depths, or geographical regions, which may have differing mean values and variances.
• May be affected by auto-correlation (the next sample in the series may be predicted to some extent by the preceding sample).

Stratified Sampling

• May be affected by auto-correlation (the next sample in the series may be predicted to some extent by the preceding sample).

Sequential Sampling

• May or may not produce more accurate estimates than random sampling, but is often the easiest.
• Samples drawn in sequence may be auto-correlated (the next sample in the series may be predicted to some extent by the preceding sample), posing a problem: samples are not being drawn independently and randomly from a pool of possible values.
• Therefore, the effective number of samples, in the statistical sense, is less than the actual number of samples taken (Cochran, 1977; Reckhow and Chapra, 1983). Autocorrelation may affect random and stratified sampling procedures as well.
• If there are periodic fluctuations in the variable of interest, it is possible with sequential sampling to always sample the same part of the curve (Cochran, 1977).
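The loss of information caused by autocorrelation can be sketched numerically. The example below is a minimal illustration, not a method taken from the cited manuals: it assumes the series is equally spaced and behaves like a first-order autoregressive, AR(1), process, for which the effective number of independent samples for estimating a mean is approximately n(1 - ρ)/(1 + ρ), where ρ is the lag-1 autocorrelation. The function names are hypothetical.

```python
def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of an equally spaced series."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0  # a constant series carries no usable correlation estimate
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return num / denom


def effective_sample_size(n, rho):
    """Approximate number of independent samples for estimating a mean.

    Assumes an AR(1) series (an assumption of this sketch, not of the source).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)
```

Under these assumptions, 20 sequential samples with a lag-1 autocorrelation of 0.5 carry about as much information about the mean as 6.7 independent samples.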

## Choosing Sampling Locations

### Waterbody & Volunteer Selection

Bias can creep into a monitoring program when lakes are chosen for the program. If the purpose of the program is to provide information on, for example, the trophic state of the state’s lakes, then the lakes of the state must be sampled without bias; every lake must have an equal chance of being included in the program. The coordinator might wish to stratify the sampling within geographic, political, or ecological regions within the state, but, within each division, the lakes still must be sampled randomly.
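A stratified random selection of lakes might be sketched as follows. This is only an illustration: the region names, lake names, and the function `stratified_lake_sample` are hypothetical, and an equal number of lakes per region is assumed for simplicity.

```python
import random


def stratified_lake_sample(lakes_by_region, per_region, seed=None):
    """Draw a simple random sample of lakes within each region (stratum)."""
    rng = random.Random(seed)  # a seed makes the draw reproducible and auditable
    chosen = {}
    for region, lakes in sorted(lakes_by_region.items()):
        k = min(per_region, len(lakes))  # small strata are taken whole
        chosen[region] = sorted(rng.sample(sorted(lakes), k))
    return chosen


# Hypothetical lake lists, for illustration only.
lakes_by_region = {
    "north": ["Lake A", "Lake B", "Lake C", "Lake D"],
    "south": ["Lake E", "Lake F", "Lake G"],
}
selection = stratified_lake_sample(lakes_by_region, per_region=2, seed=1)
```

Recording the seed with the program documentation would let anyone confirm later that the lakes were drawn at random rather than by convenience.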

Rarely do volunteer monitoring programs have the luxury of choosing volunteers on randomly selected lakes. More often, the lakes are selected based on the availability of volunteers. Lakes chosen this way are unlikely to be a random sample and will probably not yield unbiased estimates for the state’s lakes. For example, volunteers may be more available for recreational lakes, for lakes where they have cabins, or for lakes that are easily accessible. Private lakes, lakes in state parks, and small, shallow, or weedy lakes may be under-represented. A solution to this dilemma is to reexamine the goals in light of the realities of the volunteers and their lakes. The coordinator could either match randomly chosen lakes with volunteers and cover the lakes without volunteers with agency staff, or change the goals of the program. Perhaps the expectations can be modified so that the recreational lakes of the state are adequately sampled. As a last resort, the coordinator can accept that the program estimates the trophic state of a potentially biased set of volunteer-sampled lakes. The important consideration may not be that the data are biased, but how the data are interpreted and used once they are gathered.

### Sampling Location within a Lake

The location of the sampling point in a monitoring program is heavily dependent on the goals of the program. In the past, many limnologists would sample natural lakes at a single site over the deepest part of the lake. There is a good reason for using the point over the deepest part of the lake; it allows the sampling of every possible depth. However, using this point as the single site to characterize a lake assumes that there are no horizontal gradients in the lake; that is, each depth at that single sampling site is taken to represent the conditions at that depth throughout the lake. As emphasized by Peters (1990), the use of this sampling site also ignores the characterization and influence of the littoral zone, which can constitute a significant percentage of the total lake area and volume. Peters also suggests that marked horizontal differences should occur over horizontal distances of only 1500 meters (0.9 miles) in a 15-meter-deep lake. It is difficult to gauge how greatly this single-point, deep-water sampling procedure has influenced our perspective of lakes. It not only rejects the possibility of spatial gradients in the lake, but also ignores the littoral zone entirely, both as a component of the lake and in its functional role in the lake system.

The characterization of reservoirs adds an entirely new dimension to the problem of selecting a sampling site. Many reservoirs are long and narrow, with a majority of the water flowing into the reservoir at a single point. Others are highly dendritic, with many bays and inlets, each often with its own contributing stream. Spatial differences in water quality in reservoirs are the rule, not the exception, and assumptions about homogeneity at a given depth stratum simply cannot be made. To attempt to characterize the entire reservoir with a single sample would be almost meaningless.

Two different approaches have been suggested to characterize heterogeneous bodies of water. The first, recommended by Thornton, et al. (1982) and Gaugush (1987), suggests that sampling stations should be more numerous where concentration gradients are the highest, such as near stream and river inflows. Before initiation of a sampling program, the lake or reservoir would be sampled intensively throughout. Replicate surface samples are taken at each site, and a statistical multiple range test is used to identify significant differences between station means. If the difference between the means of two stations is not significant, then a single sampling station would probably be sufficient for that area. Where station differences are significant, separate stations would be established.

In contrast, Walker (1987) suggests that sometimes these areas of greatest change, such as near inputs and in narrow embayments, do not constitute a large proportion of the reservoir’s volume. More emphasis should be placed on sampling the areas of a reservoir where most of the volume can be found. Walker’s approach would concentrate on sampling the bulk of the reservoir’s volume rather than intensively sampling the areas where the greatest change occurs. These two methods of sampling reflect the need to characterize the entire surface area of the reservoir (or lake) versus the need to represent the condition of the bulk (not the entirety) of the water body.

A third approach does not require such extensive sampling effort. If the goal of the program is to monitor changes in water quality over seasons and years, then complete characterization of the lake or reservoir may not be necessary. In this case, sampling at a single site might be appropriate. The important factor in single-site monitoring is that the site be clearly marked, so that anyone, not only the initial volunteer, can find the site again. The sites should be documented using either a satellite-based GPS device to establish the exact latitude and longitude or triangulation with permanent shoreline markers, such as houses, water towers, etc. Several sites should be monitored on large or dendritic lakes in order to keep track of local changes.

## Sampling at Depth

Limnologists probably use one of several methods when sampling with depth: they may sample only at the surface or they may take a series of samples at pre-set intervals from the surface to the bottom of the lake (0 m, 1 m, 2 m, etc.). A “surface” sample can be taken just below the surface (0.5 meters or 1 foot) to avoid surface scums. Another approach is to take a single integrated sample with a tube or hose sampler from the surface to some pre-determined depth (euphotic depth or thermocline). All of these techniques have advantages and drawbacks, either because of limnological, logistical, or statistical considerations.

The advantage of a single sample taken at the surface is that it requires only one analysis, an important consideration to a program on a budget. It also does not require any special sampling apparatus; the volunteer can just hold a sampling container under the surface of the water. Because it is simple and inexpensive, it is an attractive sampling technique for volunteer programs.

Taking a single sample at the surface assumes that the lake, or at least the epilimnion, is completely mixed, and that the surface sample is an adequate estimator of the lake or epilimnetic concentration. In many cases, the assumption of a completely mixed body of water is incorrect. Many lakes have vertical gradients of temperature, oxygen, nutrients, and algae which will be missed by the surface sample. Most certainly, deeper lakes establish a thermal gradient in the summer and the lake will have at least two thermal regions, and, possibly, at least two chemical regions as well. Even in shallow, usually well-mixed lakes, chemical gradients will develop during calm periods. Gradients can even occur in the upper waters of lakes. Metalimnetic populations of algae will develop in many stratified lakes (Hanna and Peters, 1991; Knowlton and Jones, 1989). Blue-green algae also migrate deeper into the epilimnion during the day, and a surface sample would miss them. On the other hand, on calm days, blue-green algae may float to the surface, forming surface scums. If sampled on these days, the amount of phosphorus and chlorophyll would be highly exaggerated.

Sampling at intervals through the entire water column is the standard alternative to the single surface sample. If sampled at sufficiently close intervals, the technique detects gradients within the water column. The technique is also used when constructing a nutrient budget to estimate the total content of a nutrient, such as phosphorus, in the water column.

As it is commonly done, however, interval sampling violates one of the requirements of sequential sampling: it is usually started at the same depth each time. Sequential sampling requires that the starting depth be randomly selected. For example, sampling might be chosen randomly to begin at 0 meters on one day and at 0.4 meters on the next visit. If a randomly chosen starting point is not selected each time, bias may be introduced into the estimate. For example, the predetermined sampling points might continue to miss a thin, metalimnetic algal population or a near-sediment phosphorus or oxygen gradient. Interval sampling also requires a larger number of analyses than the single sample. The coordinator should consider whether the information gained, or the error minimized, by interval sampling can be justified by the added expense.
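A random starting depth for interval sampling could be generated along the following lines. This is a minimal sketch; the function name and the choice of a uniformly distributed offset within the first interval are assumptions made for illustration.

```python
import random


def interval_depths(max_depth, interval, rng=random):
    """Depths at a fixed interval, offset by a random start in [0, interval)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    start = rng.random() * interval  # new random offset on every visit
    depths = []
    k = 0
    while start + k * interval <= max_depth:
        depths.append(start + k * interval)
        k += 1
    return depths  # empty if the random start falls below max_depth
```

Calling `interval_depths(10.0, 1.0)` on each visit yields a different set of depths spaced 1 m apart, so a thin layer that sits between the fixed depths of a conventional profile is not missed on every visit.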

The hose or tube sampler collects and integrates a column of water into a single sample. The volunteer lowers the weighted end of a plastic hose (even garden hose has been used) to a given depth. The other end of the hose, which is in the boat, is then corked. The weighted end is then pulled to the surface by an attached line. If the top end of the hose is well-sealed, all or most of the water will remain in the hose as it is raised. Once the weighted end is in the boat, it is placed inside a sample container, the other end is uncorked, and the entire volume of the hose is emptied into the container. A rigid tube can be used instead of a hose, but a longer tube becomes cumbersome to lift into the boat, and it is difficult, especially for a single person, to pour the sample into a bottle.

Like the interval sampling technique, a greater proportion of the water column is included in the hose sample than is obtained with the surface sample, but unlike interval sampling, the water is mixed together in the hose or sampling container, producing a single sample. A single sample is less costly, but information about possible gradients is lost. The supposed advantage of the hose sample is the possibility that it better represents the actual concentration of the target variable within the water column than does the surface sample and that it is less expensive, both in terms of expense of equipment and of number of analyses, than interval sampling.

The hose sample has several disadvantages, especially for volunteer monitoring. Someone, probably the volunteer, is going to have to decide the maximum depth to which the sampler will be lowered. The hose could be lowered to the bottom of the lake, to the bottom of the epilimnion, to the bottom of the euphotic zone (1% light), or to a fixed depth. It makes little sense for these depths to be the same from lake to lake, and they are not necessarily constant from visit to visit. The volunteer would have to make a decision about the depth prior to lowering the sampler.

If the hose is to be lowered to the bottom of the lake, the volunteer would have to estimate the depth of the lake without disturbing the sediments, perhaps using the Secchi disk as a depth indicator. Any sediment disturbed by the disk or the sampler itself would invalidate the sample.

If the hose is to be lowered to the thermocline or bottom of the epilimnion, the volunteer would have to estimate either the depth of the thermocline or the epilimnetic-metalimnetic boundary, a determination that requires a temperature meter and some fairly sophisticated understanding of thermal structure.

The hose could be lowered to the extent of the euphotic zone, with the euphotic depth estimated, not necessarily accurately, using some multiple of the Secchi depth. However, the euphotic depth (1 percent of surface light) is not a fixed multiple of Secchi depth. Although the computation of 2× Secchi depth is often used, the Secchi depth is not a constant function of sub-surface light intensity as is euphotic depth. Estimates of euphotic depth have been reported to be from 1.16 to 2.3 times the Secchi depth (Davies-Colley and Vant, 1988). Because this multiple varies as a function of a number of uncontrollable environmental variables, the euphotic depth, as estimated by Secchi depth, is a very rough estimate indeed.
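As a rough worked illustration, using only the reported range of multipliers and a hypothetical Secchi depth of 2 m (the symbols $z_{eu}$ for euphotic depth and $z_{SD}$ for Secchi depth are introduced here for convenience and are not the source's notation):

$1.16\,z_{SD}\le z_{eu}\le 2.3\,z_{SD}\quad\Rightarrow\quad 2.32\ \mathrm{m}\le z_{eu}\le 4.6\ \mathrm{m}\ \ \text{for}\ z_{SD}=2\ \mathrm{m}$

The common 2× rule would send the hose to 4.0 m, which could overshoot the euphotic depth by as much as 1.68 m or fall short of it by as much as 0.6 m.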

The second consideration about using euphotic depth is that it is unclear what value there is to most monitoring programs of knowing concentrations in the euphotic zone. Euphotic depth is considered to be the maximum depth where photosynthesis occurs. Knowledge of the concentration of chlorophyll in the zone of photosynthesis is important if photosynthesis is being modeled or estimated. However, if the intent of gathering data is to estimate the total (or average) concentration of chlorophyll in the lake, not just the euphotic zone needs to be measured. The concentration of phosphorus or any material other than chlorophyll in the euphotic zone has no intrinsic meaning, other than if a relationship exists with chlorophyll. It is possible that, in different lakes, the euphotic zone may be less than a meter in one and could extend into the hypolimnion in another, which might confuse any other connotations the measurement might have.

The sampler could be lowered to a fixed depth. The same fixed depth could be used in every lake, or this depth could reflect an estimate of the mean depth of the lake, the epilimnion, or the thermocline in each individual lake. The coordinator could visit each lake and determine the depth of the epilimnion, but most programs are initiated in the spring, perhaps before a thermal gradient has set up. Alternatively, the thermocline or epilimnetic depth could be estimated using any of numerous equations relating these variables to morphometric measures. Hanna (1990) devised equations relating thermocline depth to maximum effective length (MEL), lake area, and shoreline length, all of which could be obtained from a map before the sampling season began (Table 2). Maximum effective length is defined by the straight line connecting the two most distant points on the shoreline over which wind and waves may act without interruptions from land or islands (Håkanson 1981; Hanna 1990). These values could then be field-evaluated later in the season.

Table 2. Models that could be used to estimate thermocline depth (THER) based on maximum effective length (MEL), area (A), and shoreline length (L).

| Equation | r | Source |
|---|---|---|
| log THER = 0.336 log (MEL) - 0.245 | 0.92 | Hanna, 1990 |
| log THER = 0.185 log (A) + 0.842 | 0.91 | Hanna, 1990 |
| log THER = 0.282 log (L) - 0.225 | 0.91 | Hanna, 1990 |
| log THER = 0.443 log (A) + 0.553 | 0.91 | Osgood, 1987 |
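The models in Table 2 can be evaluated with a few lines of code. The sketch below simply transcribes the slopes and intercepts from the table; it assumes base-10 logarithms, and it deliberately does not state units, because the units of MEL, A, L, and THER are not given here and must be taken from the cited sources before the equations are applied to real lakes. The dictionary keys and function name are hypothetical.

```python
import math

# (slope, intercept) pairs transcribed from Table 2.
THERMOCLINE_MODELS = {
    "MEL, Hanna 1990": (0.336, -0.245),
    "A, Hanna 1990": (0.185, 0.842),
    "L, Hanna 1990": (0.282, -0.225),
    "A, Osgood 1987": (0.443, 0.553),
}


def thermocline_depth(value, slope, intercept):
    """Back-transform log THER = slope * log(value) + intercept.

    Assumes base-10 logs; input and output units follow the cited source.
    """
    if value <= 0:
        raise ValueError("morphometric value must be positive")
    return 10 ** (slope * math.log10(value) + intercept)


slope, intercept = THERMOCLINE_MODELS["MEL, Hanna 1990"]
estimate = thermocline_depth(10.0, slope, intercept)  # hypothetical MEL of 10
```

Because the two area-based models come from different data sets, comparing their estimates for the same lake gives a quick sense of how uncertain a map-based thermocline depth may be.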

A disadvantage of the hose sampler is that it samples each depth equally, even though each depth contributes unequally to the total volume of water in the lake. The average concentration in the lake (CAVG) is calculated as:

$C_{AVG}=\frac{\sum_{z=0}^{z_{max}}C_{(z_{n}-z_{m})}V_{(z_{n}-z_{m})}}{\sum_{z=0}^{z_{max}}V_{(z_{n}-z_{m})}}$

where $C_{(z_{n}-z_{m})}$ is the averaged concentration between depths $z_{n}$ and $z_{m}$ and $V_{(z_{n}-z_{m})}$ is the volume of water in the same interval (Fig. 2). The numerator is the total mass of the substance in the lake; dividing by the total volume gives a concentration. The bulk of the water is at the surface; therefore, the concentration of a substance at the surface usually contributes most to the average concentration in the lake. The volume of water in the metalimnion and hypolimnion is much less, and concentrations in these regions should contribute less to the average concentration.

Figure 2. Calculating the volume of a layer of water in a lake. The volume of a lake or reservoir can be estimated from the morphometric map by measuring the area of each depth interval with a planimeter or image analysis software. The volume of each interval can then be calculated for the horizontal slice between depths m and n using the formula given in Hutchinson (1957) and Lind (1979): $V_{(m-n)}=\frac{1}{3}\left(A_{m}+A_{n}+\sqrt{A_{m}A_{n}}\right)(n-m)$, where A is the area at depth m or n. The total volume of the reservoir is then equal to the sum of the volumes of the individual segments. It is a good idea to estimate the volume at very close intervals in the range of the normal fluctuation of the water height to allow the calculation of the volume at any given reservoir elevation. The accuracy of such volume estimates is dependent on the accuracy of the morphometric map.

When a tube sampler is used, the concentrations at each depth are weighted equally rather than in proportion to the contribution of each volume to the total volume of the lake. This equal sampling increases the influence of the deeper concentrations on the calculated average concentration. If concentrations at these lower depths are much higher than at the surface, their contribution to the lake’s average concentration is higher as well.  Large metalimnetic chlorophyll peaks or hypolimnetic phosphorus concentrations would have an exaggerated impact on the estimate of the average concentration. There is no guarantee that a hose sampler lowered to the depth of the epilimnion or thermocline increases the accuracy of an estimate of epilimnetic or lake concentration. Increased accuracy will depend on the morphometry of the lake and the nature of the vertical distribution of the variable concentrations.
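The difference between volume weighting and the equal-depth weighting of a hose sample can be illustrated with a short calculation. The sketch below uses the layer-volume formula from Figure 2 and divides the summed mass by the summed volume so that the result is a concentration. The lake areas and concentrations are hypothetical values chosen for illustration, and the function names are not from the source.

```python
import math


def layer_volume(area_top, area_bottom, thickness):
    """Volume of the slice between two depth contours (Figure 2 formula)."""
    return thickness / 3.0 * (
        area_top + area_bottom + math.sqrt(area_top * area_bottom)
    )


def volume_weighted_mean(concs, volumes):
    """Average concentration with each layer weighted by its volume."""
    return sum(c * v for c, v in zip(concs, volumes)) / sum(volumes)


def depth_weighted_mean(concs, thicknesses):
    """What an ideal hose sample integrates: equal weight per unit depth."""
    return sum(c * t for c, t in zip(concs, thicknesses)) / sum(thicknesses)


# Hypothetical lake: contour areas (m^2) at 0, 2, 4, and 6 m, and the
# mean total phosphorus (ug/L) in each of the three 2 m layers.
areas = [100_000.0, 60_000.0, 30_000.0, 10_000.0]
thickness = 2.0
concs = [20.0, 25.0, 60.0]

volumes = [layer_volume(areas[i], areas[i + 1], thickness) for i in range(3)]
lake_mean = volume_weighted_mean(concs, volumes)
hose_mean = depth_weighted_mean(concs, [thickness] * 3)
```

For this hypothetical basin the hose sample averages 35 μg/L, while the volume-weighted lake mean is roughly 27 μg/L, because the small, phosphorus-rich bottom layer counts as much as the large surface layer in the hose.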

If a program does not have the funds to sample each depth independently, yet considers that the hose sampler inaccurately measures the average concentration, an alternative would be to have the volunteer produce a composite sample. This would entail having the volunteer pour a volume of sample proportional to the volume of water at each depth into a sample container. The resulting concentration in the sample container should be very close to the average lake concentration. However, the volume of each stratum would have to be known, and the volunteer would have to carefully measure volumes appropriate for each depth. This may be more than can be expected from the average volunteer.
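The volumes to be poured for such a composite follow directly from the stratum volumes. The following is a minimal sketch; the function name and the 600 mL bottle are hypothetical, and the stratum volumes are assumed to be known from the morphometric map.

```python
def composite_aliquots(stratum_volumes, bottle_ml):
    """Millilitres to pour from each depth so the composite is volume weighted."""
    total = sum(stratum_volumes)
    if total <= 0:
        raise ValueError("stratum volumes must sum to a positive number")
    return [bottle_ml * v / total for v in stratum_volumes]


# Three strata whose volumes are in the ratio 3 : 2 : 1, filling a 600 mL bottle.
aliquots = composite_aliquots([3.0, 2.0, 1.0], 600.0)  # [300.0, 200.0, 100.0]
```

A coordinator could compute these aliquots once per lake and print them on the volunteer's field sheet, which removes the arithmetic from the boat but still leaves the careful measuring to the volunteer.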

Several studies have evaluated the effect of vertical sampling strategies on estimates of lake concentrations. Hanna and Peters (1991) compared several strategies, including weighted-composite samples to the trophogenic depth (estimated as 2× Secchi depth), euphotic depth (estimated as 1% light), and maximum depth, hose samples to the epilimnetic and trophogenic depths, and subsurface samples. They found that subsurface samples had the lowest phosphorus concentrations, epilimnetic hose samples somewhat higher, samples of the trophogenic and euphotic zones higher still, and water column composites (total water column) the highest. In their lakes the maximum difference of the means was only slightly more than 20%. Chlorophyll concentrations were also lowest in the subsurface samples, higher in the epilimnetic hose samples, higher yet in the volume-weighted samples, and highest in the hose samples of the trophogenic zone. Although there were differences between techniques, the inter-lake differences were even greater, meaning that any of the techniques could distinguish between lakes. However, they thought that subsurface samples yielded low values and were subject to greater daily variation and therefore should be avoided in inter-lake surveys. They recommended the use of the hose sample over integrated composite samples because the hose sample requires less effort. They recommended using 2× Secchi depth because such samples represent lake productivity better in principle, although they could not demonstrate this advantage in practice.

Knowlton and Jones (1989) compared subsurface (10 cm) samples with samples continuously drawn with a pump from the surface to the oxycline, or, when no oxycline was present, from the surface through the entire mixed layer. Differences between surface and composite samples in chlorophyll and total phosphorus were usually small, but differences increased when phytoplankton formed subsurface layers. Chlorophyll concentrations tended to increase with depth, seldom being maximal at the surface. However, they considered the error arising from differences between the two sampling techniques to be much less than temporal variation and concluded that the sampling technique would add negligible error in inter-lake surveys.

The difference between this study and that of Hanna and Peters (1991) may be the stability of the algal layers. Knowlton and Jones found subsurface chlorophyll layers to be the major cause of differences between the techniques, but these layers were transient, common only on calm days. Metalimnetic maxima were also transient, occurring infrequently in the lakes. More importantly, the concentrations in the peaks were less than 20% higher than surrounding layers. It may be that metalimnetic peaks accounted for some of the differences found by Hanna and Peters (1991). In their lakes the peaks may have been more prevalent and more stable. This would explain why hose samples actually had higher chlorophyll concentrations than did weighted composite samples.

The final word on the best method to sample with depth has yet to be written. Surface samples are the easiest to collect and require the least equipment. However, they may be more variable and under-represent the concentrations in the lake. Hose samplers are marginally more expensive and may sample more of the water column, thus reducing variability caused by vertical heterogeneity. However, if high concentrations are found deeper in the sampled water column, the hose sampler may over-represent the epilimnetic or lake concentration. Also, there is no agreement as to the appropriate depth to which the hose should be lowered. A volume-weighted composite sample may be best, but it requires knowledge of the lake volume at each stratum and a great deal more effort on the part of the volunteer.

## When To Sample

Deciding when, for how long, and how frequently the volunteers should take samples is as fraught with statistical decisions as deciding on sampling location and depth. There seems to be no statistically acceptable method that coincides well with the realities of using volunteers to collect samples. Because of this, some compromises may have to be made.

### Length of Sampling Season

The length of time that the volunteers are asked to sample is dependent on the goals of the program. If for example, the goal is to obtain annual mean concentrations or to examine the annual fluctuations in the target variables, then the calendar year is the appropriate duration of the sampling season. If ice-free or summer mean concentrations are what is desired, then the sampling period becomes obvious. If a whole-lake springtime phosphorus concentration is needed for a nutrient budget or to calibrate a loading model, then sampling should include the period of spring turnover. If that spring turnover total phosphorus concentration were used in a model to predict summer chlorophyll (Dillon and Rigler, 1973), then calibrating that model would require summer chlorophyll concentrations as well.

It may be a little more difficult to select a sampling season if the goals are to determine concentrations or parameters that may not be dependent on a certain season, i.e., maximum concentrations. Peters (1990) points out that the typical limnological preoccupation with the summer season may miss spring and fall peak chlorophyll concentrations, which, according to Marshall and Peters (1989) may be higher than peak concentrations found in the summer (Fig. 2), especially in eutrophic lakes. Marshall and Peters (1989) point out that most models predicting mean chlorophyll concentrations ignore the temporal distribution of chlorophyll. The result is that when such models are used, the resulting predictions may be irrelevant to the program goals. Does the program need to know the annual or seasonal mean of chlorophyll or phosphorus, or is the peak concentration of more interest? Will a single number (the mean) be all that is used of the volunteer data, or are seasonal fluctuations of importance as well? Does it matter to the goals of the program that the maximum comes in April or is the peak concentration in July or August of more interest?

There is no single recommendation that can be made regarding sampling duration. The duration depends on the goals of the program. The better these goals are defined, the easier it will be for the coordinator to choose a sampling period. Once the ideal period has been chosen, the realities of volunteer cooperation and capability should be considered. An annual mean requires that someone visit the lake throughout the calendar year. Will volunteers be willing to do so? In the northern states, being on the lake in the winter or early spring or late fall can be cold and dangerous; some volunteers may not launch their boats or go to their cabins until May or June. The coordinator may have to modify the program expectations to incorporate the schedules of the majority of the volunteers.

### Frequency of Sampling

The frequency of sampling is another important consideration in the design of the volunteer program. If the coordinator has the volunteers sample too infrequently, then there is a good chance that the variation will be so large that it will be impossible to statistically identify differences between lakes or between sampling years. If the volunteers sample too frequently, their enthusiasm may wane and the added statistical reliability may not be justified. Oversampling will also waste money that could have been used to fund more lake monitoring. When sampling frequency translates into dollars and volunteer time, it becomes necessary to balance the need for statistical confidence against the monetary and human costs incurred in obtaining that level of confidence.

Most volunteer programs probably have as one of their goals the estimation of the annual or seasonal statistic of central tendency (mean or median) for each of the monitored lakes. Probably this statistic will be used to compare lakes and to monitor year-to-year changes in each lake. In order to do either, a sufficient number of samples must be taken so that a reliable estimate of the true central tendency can be obtained. The term “reliable” in this case is relative; it depends on the level of uncertainty that the coordinator will accept. The less uncertainty (the more precision) desired, the more samples will have to be taken.

There are two methods that have been used to estimate the number of samples needed in a monitoring program; both are discussed in Marshall et al. (1988). The first, also discussed in Gaugush (1987), Reckhow and Chapra (1983), and Thornton et al. (1982), uses the equation

$n=\frac{t^{2}s^{2}}{d^{2}}$

where n = number of samples

t = appropriate value from Student’s t distribution for the degrees of freedom (n-1)

$s^{2}$ = sample variance

d = desired precision about the mean, expressed in the same units as the mean (e.g., μg/L).

The term d can also be written as $r\bar{x}$, where r represents the desired precision expressed as a fraction and $\bar{x}$ is the mean of the sample (Gaugush, 1987). For example, if the mean chlorophyll concentration was estimated at 20 μg/L and the desired level of precision is ± 10 percent, d would have the value 2 μg/L (0.1 × 20). The equation then looks like

$n=\frac{t^{2}s^{2}}{r^{2}\bar{x}^{2}}$

To use this equation, the coordinator must first decide what level of precision is desired and have some estimate of the mean and variance of the target variable. The desired precision in most experiments is commonly 5%, but this level of precision is very difficult to obtain in limnological monitoring programs without a very large number of samples. More often, values of 10 or 20% are found to be attainable (Gaugush, 1987; Marshall et al., 1988).
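Written out, the step between the two forms of the equation is the substitution $d=r\bar{x}$. The last equality below, which groups s and $\bar{x}$ into the ratio $s/\bar{x}$ (the coefficient of variation of the samples), is a restatement added here for illustration and is not taken from the cited sources:

$n=\frac{t^{2}s^{2}}{d^{2}}=\frac{t^{2}s^{2}}{(r\bar{x})^{2}}=\frac{t^{2}}{r^{2}}\left(\frac{s}{\bar{x}}\right)^{2}$

In this form the required number of samples depends only on t, on the relative precision r, and on the ratio $s/\bar{x}$, so halving r roughly quadruples n.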

There are several commonly used methods for estimating the mean and variance of a particular variable (Gaugush, 1987). First, the coordinator can conduct a pilot study, gathering sufficient data to estimate mean and variance. Second, the results of previous studies on the lake or on similar lakes can be used. The third method, according to Gaugush (1987), would be to make an educated guess. A fourth method has been reported by Marshall et al. (1988) for chlorophyll and by France and Peters (1992) for total phosphorus. This method relies on the reported relationship between the log variance of an annual or seasonal sample and the log mean value of the same sample (Walker, 1985). Marshall et al. (1988) report the relationship between variance ($s^{2}$) and mean ($\bar{x}$) for chlorophyll to be

$\log(s^{2})=-0.53+2.10\,\log\bar{x}$

for annual data, and

$\log(s^{2})=-1.03+2.50\,\log\bar{x}$

for seasonal data. France and Peters (1992) report the relationship for total phosphorus to be

$\log(s^{2})=-0.563+1.65\,\log\bar{x}$

for seasonal data (April – October).
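As a rough illustration of how the seasonal relationships might be applied, the sketch below predicts a variance from an assumed mean. The function and constant names are hypothetical, and the logarithms are assumed to be base 10, which the text does not state explicitly.

```python
import math

# Seasonal regression coefficients as quoted in the text, stored as
# (intercept, slope) for log(s^2) = intercept + slope * log(mean).
CHLOROPHYLL_SEASONAL = (-1.03, 2.50)    # Marshall et al. (1988)
PHOSPHORUS_SEASONAL = (-0.563, 1.65)    # France and Peters (1992)


def variance_from_mean(mean, coefficients):
    """Predict the sample variance from the mean (assumes base-10 logs)."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    intercept, slope = coefficients
    return 10 ** (intercept + slope * math.log10(mean))


if __name__ == "__main__":
    # A mean of 10 ug/L is an arbitrary illustrative value.
    for label, coeffs in (("chlorophyll", CHLOROPHYLL_SEASONAL),
                          ("total phosphorus", PHOSPHORUS_SEASONAL)):
        s2 = variance_from_mean(10.0, coeffs)
        cv = math.sqrt(s2) / 10.0
        print(f"{label}: predicted variance {s2:.1f}, s / mean = {cv:.2f}")
```

The predicted variance can then be supplied, together with the assumed mean, to either form of the sample-size equation.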

Marshall et al. (1988) used these relationships to estimate the variance from the mean value alone, for use in the second sample-size equation above. Knowledge of the mean is still necessary, but again it can be estimated from similar lakes or previous studies, or from information on Secchi depth, which can be related to total phosphorus or chlorophyll.

The most difficult part of estimating n is the determination of t, the value from the Student’s t distribution for the degrees of freedom (n-1). This value depends on n, the very value being sought. An iterative approach is therefore used: an initial guess at n is used to obtain t, the equation is solved for n, the calculated value of n is used to obtain a new t, and so on until n converges (Thornton et al., 1982; Gaugush, 1987).
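The iteration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a two-sided 95 percent confidence level is assumed for t, the starting guess of 30 is arbitrary, and the t quantile is computed numerically with the standard library (a statistics package such as SciPy would normally supply it). All function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0, integrating the density with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Value of t with P(T <= t) = p, for p > 0.5, by bracketing and bisection."""
    hi = 1.0
    while t_cdf(hi, df) < p:
        hi *= 2.0
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_samples(variance, d, confidence=0.95, n_start=30, max_iter=100):
    """Iterate n = t^2 s^2 / d^2 until n stops changing."""
    p = 1 - (1 - confidence) / 2          # two-sided interval (an assumption)
    previous, n = None, max(2, n_start)
    for _ in range(max_iter):
        t = t_quantile(p, n - 1)          # t for the current guess at n
        n_new = max(2, math.ceil(t * t * variance / (d * d)))
        if n_new == n:                    # converged
            return n
        if n_new == previous:             # flipping between two values:
            return max(n, n_new)          # keep the larger, conservative one
        previous, n = n, n_new
    return n


if __name__ == "__main__":
    # Illustrative values only: variance of 25 (ug/L)^2, precision of 2 ug/L.
    print(required_samples(variance=25.0, d=2.0))
```

Because t shrinks as n grows, the iteration either settles on one value or alternates between two neighboring values; the sketch returns the larger in the second case.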

Marshall et al. (1988) estimate that 10 observations per year would be needed to estimate the seasonal mean of chlorophyll with a coefficient of variation (r) of 20 percent. Between 30 and 40 observations would be needed to obtain a coefficient of variation of 10 percent. Total phosphorus has a lower variance around the mean than does chlorophyll, and fewer samples would be needed (Knowlton et al., 1984; France and Peters, 1992). However, if a relationship between chlorophyll and phosphorus is desired, then the larger number of samples might be desirable for each. Since the variance apparently increases as the mean of the sample increases, oligotrophic lakes would not need as many samples as would eutrophic lakes (Marshall et al., 1988).
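The link between trophic state and sampling effort can be made explicit with a short derivation. Assuming the variance relationships above have the form $\log(s^{2})=a+b\,\log\bar{x}$ with base-10 logarithms, where a and b are symbols introduced here for the fitted intercept and slope, the relative variance that enters the sample-size equation is

$\frac{s^{2}}{\bar{x}^{2}}=10^{a}\,\bar{x}^{\,b-2}$

For the chlorophyll slopes quoted above (2.10 and 2.50), the exponent b-2 is positive, so the relative variance, and with it the number of samples needed for a given r, rises with the mean. For the total phosphorus slope (1.65) the exponent is negative, so the same argument would not carry over unchanged to phosphorus.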

### Temporal Strategies for Sampling

Sampling the lake over the year could be done by randomly pre-selecting sampling dates for the volunteers. Having the volunteers sample the entire calendar year may have some virtues, but most programs, at least in the northern states, may find the ice-free period both more important and more logistically feasible.

Within the summer or ice-free period, there is rarely a random assortment of possible concentrations and conditions. Day length, temperature, and, in some lakes, algal chlorophyll increase to a maximum and then decrease. Inputs of phosphorus may be highest during spring runoff and these may translate into higher lake concentrations in the spring. On the other hand, internal loading in some lakes may cause a continual increase in nutrient concentrations throughout the summer.

Because of these non-random fluctuations in limnologically interesting variables, a simple random selection of sampling dates during the summer may not adequately sample these systematic fluctuations. It may be more appropriate to stratify the sampling dates or to use a sequential sampling technique. Stratification could be done within months (June, July, etc.), within limnological periods (ice-out to stratification, stratification to turnover, turnover to ice-over), or within hydrologic periods (high flows in the spring, low-flow inputs in the summer, autumnal high flows). In any of these cases, the coordinator would supply the volunteer with a list of randomly selected sampling dates within each period, and the volunteer would be expected to sample on those dates, rain or shine.
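A stratified random schedule of this kind might be generated as in the following sketch. The year, the monthly strata, the number of dates per stratum, and the function name are all illustrative assumptions rather than recommendations.

```python
import random
from datetime import date, timedelta


def stratified_dates(strata, per_stratum, rng=None):
    """Draw `per_stratum` distinct random dates from each (start, end) stratum."""
    rng = rng or random.Random()
    schedule = []
    for start, end in strata:
        days_in_stratum = (end - start).days + 1   # both end dates included
        if per_stratum > days_in_stratum:
            raise ValueError("stratum is shorter than the number of dates requested")
        offsets = rng.sample(range(days_in_stratum), per_stratum)
        schedule.extend(start + timedelta(days=o) for o in sorted(offsets))
    return schedule


if __name__ == "__main__":
    # Hypothetical monthly strata for one summer.
    months = [(date(2024, 6, 1), date(2024, 6, 30)),
              (date(2024, 7, 1), date(2024, 7, 31)),
              (date(2024, 8, 1), date(2024, 8, 31))]
    for day in stratified_dates(months, per_stratum=2):
        print(day.isoformat())
```

Limnological or hydrologic periods could be used in place of months by changing the list of start and end dates.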

If a sequential sampling technique were used, the coordinator would first decide on the period of interest and on an appropriate time segment (7 days, 14 days, 30 days, etc) with which to divide that period. An initial sampling day would be chosen randomly from the first time segment. Sampling would be initiated on that randomly chosen date and at equal time segments thereafter. For example, a coordinator might choose to sample every 15 days from May 1 to September 30. To do so, a date would be chosen at random between May 1 and May 15 (a 15 day interval). If May 6 were chosen, sampling would commence on May 6 and then at 15 day intervals (May 21, June 5, etc.) throughout the summer. New starting dates would be chosen for each lake or volunteer.
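The May 6 example can be reproduced with a short sketch. The year and the function name are hypothetical, and the `offset` argument exists only so that the worked example in the text can be repeated exactly; in normal use the starting day would be drawn at random.

```python
import random
from datetime import date, timedelta


def sequential_dates(start, end, interval_days, rng=None, offset=None):
    """Random start within the first interval, then equal steps until `end`."""
    rng = rng or random.Random()
    if offset is None:
        offset = rng.randrange(interval_days)   # 0 .. interval_days - 1
    current = start + timedelta(days=offset)
    dates = []
    while current <= end:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates


if __name__ == "__main__":
    # An offset of 5 days from May 1 corresponds to a May 6 start.
    schedule = sequential_dates(date(2024, 5, 1), date(2024, 9, 30), 15, offset=5)
    print([d.strftime("%b %d") for d in schedule])
```

Calling the function once per lake or volunteer, without `offset`, would give each a separate randomly chosen starting date.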

A sequential sampling program is the easiest to organize because the volunteer would know when they sample, but since the initial sampling times are randomly pre-determined, it might mean that the volunteer would have to sample every Thursday, even though they go to the lake only on weekends.

Although a random or a sequential design could be devised for a volunteer program, the reality may be that it will be difficult to get volunteers to agree to go out to sample on the assigned days. On some lakes it may be dangerous to go out on days when the boat traffic is highest. Certainly no volunteer should go out if there is a threat of lightning or high winds. Rainstorms may discourage others. Volunteers may only go to their cabins on weekends, and sampling would therefore be limited to certain dates. Any of these instances would violate one or more assumptions of the sampling regime, and therefore increase the possibility of biased estimates.

It is the question of bias that should be of concern. The goal of the program should be to obtain the least biased estimates of the condition of the lake. A sampling date chosen by the volunteers could introduce bias. For example, if a lake is only sampled on a weekend, boat traffic may temporarily increase turbidity, and therefore lower Secchi depths would be recorded. On the other hand, increased numbers of visitors at the lake during the weekend might increase nutrient inputs from septic tanks, which, in turn, might translate into higher algal densities during the next week. In the first instance, sampling every Wednesday, for example, would yield a larger Secchi depth than sampling on Sunday; in the second, it might yield a smaller one. Consider the validity of comparing the chlorophyll concentrations in two lakes, one sampled on Sunday, the other on Wednesday. Without evidence that there are no regular weekly fluctuations, the comparison could be invalid. In some lakes, sampling during or after a rainstorm may produce Secchi depths and nutrient concentrations altered by inputs of silt and clay from runoff. Three days later, the lake may be clear or it may have higher algal populations. Statistically valid sampling techniques assume that each of the events has an equal chance of being sampled, and each event contributes equally to the final estimate of central tendency.

A range of temporal sampling strategies seems possible. At one extreme, the coordinator provides a rigid sampling scheme that is statistically correct, using a random, a stratified random, or a sequential sampling design. Volunteers would be required to adhere to the sampling regime in order to be a part of the program. Although it might sound rigid, it would provide the needed statistical rigor.

The other extreme would be to ignore the whole business of sampling strategy, hoping that there is sufficient randomness and minimum bias in the volunteer sampling that the numbers will still approximate the population mean. For example, volunteers could be urged to sample twice a month, suggesting that the two dates not be close together. No more guidance would be given as to possible sampling dates, relying on the judgment of the volunteer.

Between these two extremes is the possibility that the coordinator could establish a sampling “concept,” in which volunteers attempt to sample either within certain time periods or within a given number of days of a central date. For example, volunteers could be asked to sample within the 2-week period July 1 to 15, or on a central day plus or minus a given number of days (June 1 ± 1 week). The central day or the time period would be chosen at random by the coordinator. The actual sampling date would not be randomly selected, but selected at the convenience of the volunteer. This scheme would provide some random structure to the sampling regime, but allow some flexibility to the volunteer.

It should be obvious that the first extreme may be too rigid for a volunteer program and the other too haphazard. The third still may contain the weekly biases mentioned above. If, however, an effort was made to determine if there were or were not weekly sampling biases, the procedure might have more validity. It could be suggested to the volunteers that they measure Secchi depth several times in a single week. From these multiple measurements, some estimate of within-week variability might be obtained. If there was a strong weekly pattern, then some accommodation may have to be made. If not, then there would be some confidence that the sampling “concept” may also be statistically valid.

In the case of sampling during or after rainstorms, it may be that it would be easier to filter the data after collection rather than to urge the volunteer to endanger themselves adhering to a rigid schedule. Volunteers could be urged to write the weather conditions for the past 3 days on their note card. When the data are analyzed, the coordinator could test as to whether prior precipitation had an effect on the Secchi depth or any other variable. Then, if necessary, the coordinator could eliminate all data in which precipitation occurred within a 24, 48, or 72 hour period prior to sampling.
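The filtering step could look something like the sketch below. The record layout, field names, and Secchi values are invented for illustration; a real program would read them from the volunteers' note cards, and the comparison of means shown here is only a first look, not a formal test.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Reading:
    """One volunteer note card (hypothetical field names)."""
    secchi_m: float
    hours_since_rain: Optional[float]   # None: no rain noted in the past 3 days


def split_by_rain(readings, window_hours):
    """Separate readings taken within `window_hours` of rain from the rest."""
    wet = [r for r in readings
           if r.hours_since_rain is not None and r.hours_since_rain <= window_hours]
    dry = [r for r in readings
           if r.hours_since_rain is None or r.hours_since_rain > window_hours]
    return dry, wet


if __name__ == "__main__":
    # Made-up readings for illustration only.
    cards = [Reading(2.4, None), Reading(1.1, 6), Reading(1.6, 30),
             Reading(2.2, 60), Reading(2.5, None)]
    for window in (24, 48, 72):
        dry, wet = split_by_rain(cards, window)
        print(f"{window} h window: kept {len(dry)}, set aside {len(wet)}, "
              f"mean Secchi of kept readings {mean(r.secchi_m for r in dry):.2f} m")
```

Setting readings aside rather than deleting them keeps the option of testing whether prior precipitation actually affected the variable before any data are discarded.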

## Reconciling Statistical Accuracy With Volunteer Reality

This chapter has attempted to deal with the problems of obtaining statistically reliable data from a volunteer program. In a real sense, the problems discussed in the chapter are not limited to volunteer programs, but are commonly encountered in professional limnological investigations. Solutions vary, but too often the solution is to ignore rather than to deal with the problems. Using volunteers instead of trained professionals only adds to the problem because it is more difficult to control when, where, or how samples are taken. The coordinator must rely on non-professional eyes to avoid non-normal situations and on non-professional integrity to ensure that shortcuts in technique, location, or sampling dates are not taken. The volunteer is being asked to shoulder a considerable burden, and may not fully realize the importance of that burden.

The solutions offered in this chapter often center on two themes: that statistical validity is important, and that the realities of volunteer sampling must be recognized. If the coordinator and the volunteer both realize that it is important to sample on certain days, at certain sites, and with careful, repeatable, defined procedures, then a first step has been made towards statistical validity of the data. The second step is made when the coordinator recognizes, and deals with, the known or potential biases included in the sampling procedures used by the volunteers.

The second theme reduces to one of acknowledging the potential biases remaining in procedures. In the same sense that one would not make pronouncements about the annual mean concentration of phosphorus based on an intensive summer sampling procedure, one should recognize that non-randomly selected lakes do not necessarily allow unrestricted statements about the condition of the state’s lakes. Samples taken at the surface may or may not represent the average content of the lake, and samples taken at a single point do not necessarily represent the entire extent of variation within the lake. One can, however, make statements regarding the average summer concentration of phosphorus at the surface at a location near the center of the lake. If that station is well-marked and could be returned to repeatedly, then weekly, monthly, or yearly fluctuations of the surface water phosphorus at that site could be measured and compared.

## Recommendations

### Secchi Dip-In offers the following recommendations

1. The coordinator should clearly and completely define the goals of the monitoring program. These goals should be used to guide decisions as to sampling times, locations, etc.
2. It may be that availability of volunteers, rather than the coordinator’s needs and desires, will decide the lakes that are sampled and the locations and times that samples are taken. The coordinator should carefully consider the biases introduced by these non-random elements in the sampling program. Efforts should be made to eliminate biases, filter the data, or assure oneself that the amount of bias introduced is not significant, relative to the needs of the program.
3. At least one permanent sampling site should be chosen for each lake. This site should be over the deepest portion of the reservoir or lake, so that vertical samples can be taken of the entire water column. This site should be well-marked, so that it could be used in the future by other volunteers or professionals. This primary site may not be used to characterize the entire lake basin, but can serve as a point of comparison from year to year.
4. Volunteers should be urged to sample at more than one site. In dendritic or flow-through reservoirs or lakes, samples should be taken at several sites. These sites should also be well-marked.
5. No recommendation is made as to how to best sample a water column. Surface samples are inexpensive but may have high variability and underestimate epilimnetic and lake concentrations. Hose samples may integrate the waters of the lake, but may cause overestimation of lake concentrations. A volume-weighted composite sample or individual samples at several depths may be the best method, but may involve too much labor and laboratory expense for volunteer programs.
6. The seasonal period of sampling may depend on the goals of the project and the latitude of the sample lakes. For many purposes, seasonal or ice-free sampling seasons may be acceptable. Shorter sampling periods may reduce variability, but also reduce information on seasonal variation.
7. It is recommended that some random structure be given to deciding on sampling days, while still giving the volunteers some degree of freedom for selecting convenient days. Final sampling schemes can be tested for randomness. Within-week sampling bias should be considered prior to analysis of the data and the data filtered if necessary.

## Literature Cited

American Public Health Association. 1989. Standard Methods for the Examination of Water and Wastewater. 17th Edition. APHA.

Cochran, W.G. 1977. Sampling Techniques. John Wiley and Sons.

Davies-Colley, R.J. and W.N. Vant. 1988. Estimation of optical properties of water from Secchi disk depths. Water Res. Bull. 24: 1329-1335.

Dillon, P.J. and F.H. Rigler. 1973. The phosphorus-chlorophyll relationship in lakes. Limnol. Oceanogr. 17: 767-773.

France, R.L. and R.H. Peters. 1992. Temporal variance function for total phosphorus concentration. Can. J. Fish. Aquat. Sci. 49: 975-977.

Gaugush, R.F. 1986. Statistical methods for reservoir water quality investigations. Instruction Report E-86-2, U.S. Army Engineer Waterways Experiment Station, Vicksburg, Miss.

Gaugush, R.F. 1987. Sampling design for reservoir water quality investigations. Instruction Report E-87-1, U.S. Army Engineer Waterways Experiment Station, Vicksburg, Miss.

Hanna, M. Evaluation of models predicting mixing depth. Can. J. Fish. Aquat. Sci. 47: 940-947.

Hanna, M. and R.H. Peters. 1991. Effect of sampling protocol on estimates of phosphorus and chlorophyll concentrations in lakes of low to moderate trophic status. Can. J. Fish. Aquat. Sci. 48: 1979-1986.

Knowlton, M.F. and J.R. Jones. 1989. Comparison of surface and depth-integrated composite samples for estimating algal biomass and phosphorus values and notes on the vertical distribution of algae and photosynthetic bacteria in midwestern lakes. Arch. Hydrobiol./Suppl 83. 2: 175-196.

Hakanson, L. 1981. A manual of lake morphometry. Springer-Verlag.

Hutchinson, G.E. 1957. A Treatise on Limnology. Vol. I. John Wiley and Sons, Inc. New York.

Lind, O.T. 1979. Handbook of Common Methods in Limnology. St. Louis: Mosby.

Marshall, C.T. and R.H. Peters. 1989. General patterns in the seasonal development of chlorophyll a for temperate lakes. Limnol. Oceanogr. 34: 856-867.

Marshall, C.T., A. Morin, and R.H. Peters. 1988. Estimates of mean chlorophyll-a concentration: precision, accuracy, and sampling design. Water Res. Bull. 24: 1027-1034.

Osgood, R.A. 1988. Lake mixis and internal phosphorus dynamics. Arch. Hydrobiol. 113: 629-638.

Peters, R.H. 1990. Pathologies in limnology. In: de Bernardi, R., G. Giussani, and L. Barbanti (Eds). Scientific Perspectives in Theoretical and Applied Limnology. Mem. Ist. Ital. Idrobiol. 47: 181-217.

Reckhow, K.H. 1979. Quantitative techniques for the assessment of lake quality. U.S. E.P.A. Office of Water Planning and Standards. EPA-440/5-79-015.

Reckhow, K.H. and S.C. Chapra. 1983. Engineering Approaches for Lake Management. Volume 1: Data Analysis and Empirical Modeling. Butterworth Publishers (Ann Arbor Science).

Thornton, K.W., R.H. Kennedy, A.D. Magoun, and G.E. Saul. 1982. Reservoir water quality design. Water Res. Bull. 18: 471-480.

Smeltzer, E., V. Garrison, and W.W. Walker, Jr. 1989. Eleven years of lake eutrophication monitoring in Vermont: a critical evaluation. Enhancing State’s Lake Management Programs. 1989:

Walker, W.W. Jr. 1985. Statistical bases for mean chlorophyll a criteria. J. Lake Reserv. Manag. 4: 57-62.

Walker, W.W. Jr. 1987. Empirical methods for predicting eutrophication in impoundments. Report 4. Phase III: Applications Manual. Technical Report E-81-9. US Army Engineer Waterways Experiment Station, Vicksburg, Mississippi.53-62.