Upper Midwest Environmental Sciences Center

|
Introduction

The LTRMP collects data at sampling locations that have been selected both probabilistically and nonprobabilistically (in LTRMP parlance, "stratified random" and "fixed-site" data, respectively). Sample information from probabilistically selected locations may be used to make inferences about the populations from which those samples were drawn. For example, the prevalence of submersed aquatic vegetation may be estimated for the entire population of sample units (generally defined for an entire reach) by combining data collected at probabilistically selected locations with information about the sampling design. Our "fixed-site" data do not permit such design-based generalizations. For this reason, this web site is primarily concerned with the use of sample information from probabilistically selected sites.

The LTRMP estimates annual means and associated standard errors by relying on the sampling design (rather than on distributional assumptions about the observed data). These so-called "design-based" methods accommodate complexities often associated with survey designs, including stratification and nonproportional sampling ("strata" represent populations from which independent samples are drawn). A useful comparison of design- and model-based methods for the analysis of survey data is provided by Lohr (1999).

The LTRMP does not presently adjust statistics from the biological components for detection probabilities or capture efficiencies. Consequently, statistics from the Program's biological components are more properly termed index statistics. Index statistics are presumed to be correlated with parameters of interest (e.g., abundance, percent frequency of occurrence). However, because index statistics have not been adjusted for variation in detection probabilities, changes in index statistics cannot be explicitly differentiated from changes in detection probabilities.
Further information on index statistics and detection probabilities is provided in Thompson et al. (1998).

A sample inclusion probability is the probability that an individual population unit (for the LTRMP, a grid point) is selected for sampling. Example: 20 grid points from each of strata i and j are selected using simple random sampling. If the population sizes of these strata are 1000 and 2000 grid points, then the sample inclusion probabilities are 20/1000 = 2% and 20/2000 = 1%, respectively. For inclusion probabilities to be constant across strata (i.e., "proportional to size"), the number of grid points selected would need to be directly proportional to the strata sizes (e.g., select 10 and 20 units from strata i and j, respectively). For a given component, these inclusion probabilities have varied across strata (within a given pool) but, with few exceptions, have been generally constant across years.

The LTRMP uses sampling weights to adjust for nonproportional sampling. Sampling weights for the LTRMP are generally defined as the inverses of the sample inclusion probabilities, and they may also be viewed as the number of potentially sampleable units represented by a given sampled unit. Continuing with the previous example, each sampled unit in strata i and j may be viewed as representing 50 and 100 (i.e., 1000/20 and 2000/20) potentially sampleable units, respectively.

In some instances, locations selected for sampling by the LTRMP were not sampled. This might have occurred, for example, when the intended sampling location was inaccessible. At present, the LTRMP handles these missing observations either by treating them as missing completely at random (vegetation component) or by substituting predefined alternate locations (other components). If unsampled locations either (1) were not missing completely at random or (2) were not interchangeable with the alternate locations for the given metric, then we may expect our reported statistics to reflect bias of unknown magnitude.
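
The stratum i and j example above (20 grid points sampled from strata of 1000 and 2000 grid points) can be reproduced with a short calculation. The following is a minimal Python sketch, not LTRMP software; the function and variable names are illustrative assumptions.

```python
# Worked example from the text: 20 grid points drawn by simple random
# sampling from each of two strata containing 1000 and 2000 grid points.
population_sizes = {"i": 1000, "j": 2000}  # grid points per stratum
sample_sizes = {"i": 20, "j": 20}          # grid points sampled per stratum


def inclusion_probability(n_sampled, n_population):
    """Probability that any one grid point in the stratum is selected."""
    return n_sampled / n_population


def sampling_weight(n_sampled, n_population):
    """Inverse of the inclusion probability: the number of potentially
    sampleable units represented by each sampled unit."""
    return n_population / n_sampled


inclusion = {
    s: inclusion_probability(sample_sizes[s], population_sizes[s])
    for s in population_sizes
}
weights = {
    s: sampling_weight(sample_sizes[s], population_sizes[s])
    for s in population_sizes
}

print(inclusion)  # {'i': 0.02, 'j': 0.01}
print(weights)    # {'i': 50.0, 'j': 100.0}
```

Changing the sample sizes to 10 and 20 gives an inclusion probability of 1% in both strata, which is the proportional case described above.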
At present, the LTRMP ignores the issue of missing data, and sample inclusion probabilities are estimated using the observed rather than the intended number of sample units. Sample inclusion probabilities and sampling weights by stratum, component, and reach are calculated using the number of sampling observations and the corresponding population sizes. Population sizes are provided below in both pdf and Excel format.

Population units (Excel file) (pdf file)

Estimating Design-based Means and Standard Errors

For the LTRMP, multi-strata means are adjusted for nonproportional sampling, and standard errors of multi-strata means are adjusted for both nonproportional sampling and stratification. Design-based means and standard errors are estimated using the SAS survey means procedure (proc surveymeans); further technical details are provided in SAS (2003).

Comments by sampling component:
Macroinvertebrate
Vegetation
Water quality
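
The adjustment of a multi-strata mean and its standard error for nonproportional sampling and stratification can be illustrated with the standard textbook stratified estimator. The sketch below is in Python rather than SAS and is not a reproduction of proc surveymeans. The observations are hypothetical, the function name is an assumption made for illustration, and no finite population correction is applied.

```python
import math
from statistics import mean, variance


def stratified_mean_and_se(samples, population_sizes):
    """Return the design-based mean and its standard error.

    samples: dict mapping stratum -> list of observed values
    population_sizes: dict mapping stratum -> number of population units
    """
    total_units = sum(population_sizes[h] for h in samples)
    est_mean = 0.0
    est_var = 0.0
    for h, values in samples.items():
        share = population_sizes[h] / total_units  # N_h / N
        est_mean += share * mean(values)
        # Variance of the stratum mean, weighted by the squared stratum
        # share; no finite population correction is applied here.
        est_var += share ** 2 * variance(values) / len(values)
    return est_mean, math.sqrt(est_var)


# Hypothetical observations: equal sample sizes from unequal strata,
# so sampling is nonproportional.
samples = {"i": [2.0, 4.0, 6.0, 8.0], "j": [1.0, 1.0, 3.0, 3.0]}
population_sizes = {"i": 1000, "j": 2000}

est_mean, est_se = stratified_mean_and_se(samples, population_sizes)
pooled_mean = mean(samples["i"] + samples["j"])  # ignores the design

print(f"design-based mean: {est_mean:.3f}")
print(f"standard error:    {est_se:.3f}")
print(f"unweighted mean:   {pooled_mean:.3f}")
```

With these made-up values the unweighted mean is 3.5, while the design-based mean is about 3.0, because stratum j holds two thirds of the population units but supplies only half of the sample.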
Estimating means and standard errors from a subpopulation not defined by the design typically requires methods that acknowledge that the number of samples in the subpopulation is a random variable (Thompson 2002). This issue is most commonly faced when estimating means and standard errors for upper and lower Pool 4, but is also faced when estimating means from vegetation data collected in 1998 (see Vegetation above).

Species richness estimates reported by the LTRMP represent the number of detected species and, as such, should be treated as possible underestimates. Methods for estimating species richness that adjust for species-specific detection probabilities are reviewed by MacKenzie et al. (2005).

Given a finite number of potential sampling locations, the variance of a statistic will decrease as increasing proportions of those locations are sampled. For example, when an entire population is sampled, the design-based sampling variance is zero (because the population is censused). While corrections for the sampling fraction of a population may be addressed using finite population correction factors, such corrections are often ignored when sample inclusion probabilities are less than 10%. For the LTRMP, sampling fractions only rarely exceed 10%. A discussion of our approach for these few exceptions is provided under notes on finite population corrections and confidence intervals.

Gutreuter, S., R. Burkhardt, and K. Lubinski. 1995. Long Term Resource Monitoring Program Procedures: Fish monitoring. National Biological Service, Environmental Management Technical Center, Onalaska, Wisconsin, July 1995. LTRMP 95-P002 1. 42 pp. + Appendixes A-J.

Sauer, J. 1998. Temporal analyses of select macroinvertebrates in the Upper Mississippi River System, 1992-1995. U.S. Geological Survey, Environmental Management Technical Center, Onalaska, Wisconsin, April 1998. LTRMP 98-T001. 26 pp. + Appendix. (NTIS PB98-140874)

Thompson, W. L., G. C. White, and C. Gowan. 1998. Monitoring vertebrate populations. Academic Press, San Diego, California.

Yin, Y., H. Langrehr, T. Blackburn, M. Moore, J. Winkelman, R. Cosgriff, and T. Cook. 2001. 1998 annual status report: Submersed and rooted floating leaf vegetation in Pools 4, 8, 13, and 26 and La Grange Pool of the Upper Mississippi River System. U.S. Geological Survey, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin, May 2001. LTRMP 2001-P001. 9 pp. + Appendix + Chapters 1-5. (DTIC ADA392067)

Contact: Further information about estimating means and standard errors from LTRMP data may be obtained from Brian Gray, LTRMP statistician, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin, at brgray@usgs.gov. |