Upper Mississippi River Restoration Program


Long Term Resource Monitoring

 

LTRM Statistics

Sampling Designs


The LTRM sampling designs include those that rely on random (probabilistic) and nonrandom (nonprobabilistic) sampling. From 1988 through 1991, sampling locations were selected nonprobabilistically (in LTRM parlance, "fixed-site sampling"). Beginning in 1992 and 1993, however, probability sampling was introduced using a stratified random design (in LTRM parlance, "SRS"). At present, sampling effort is predominantly allocated to SRS designs and, therefore, most inferences are derived from data collected under those designs. Use of data from sites not selected probabilistically is discussed under Using Data from LTRM Nonrandom ("Fixed-Site") Locations.

Probabilistic Sampling Designs

The LTRM samples using a stratified random sampling design within five reaches of the Upper Mississippi River (UMR) and one reach of the Illinois River. These six reaches represent a judgment sample of reaches within the Upper Mississippi River System (UMRS), and, as a result, inferences about either the UMR or the UMRS using LTRM data must rely on investigators' assumptions regarding reality (i.e., models) rather than on the design.

Probabilistic sampling began in 1992, 1993, and 1998 for LTRM's macroinvertebrate, fish and water quality, and vegetation components, respectively. Within reaches, the list of sampling locations (i.e., sampling "frame") is stratified by broad geomorphic features (Wilcox 1993); sampling locations are selected randomly within these strata. The number of strata per reach is small (roughly four), and strata definitions have been constant (excepting a minor change in the vegetation component in 1999). The sampling frames (grids) used for LTRM sampling components may be viewed at: fish, vegetation, and water quality. Sampling intensities have varied by component, reach, and stratum but have remained roughly constant across sampling years within component-reach-stratum combinations. The Program tries to keep the sampling frame constant over time (see FAQs below for explanation).
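As a rough illustration of selecting locations at random within strata, the sketch below draws a simple random sample, without replacement, from each stratum of a gridded frame. The function name, the frame layout, and the sample sizes are hypothetical and chosen only for illustration; this is not LTRM's selection software, and the numbers are not LTRM allocations.

```python
import random
from collections import defaultdict


def select_stratified_sample(frame, n_per_stratum, seed=None):
    """Draw a simple random sample, without replacement, within each stratum.

    frame: iterable of (site_id, stratum) pairs listing every frame element.
    n_per_stratum: dict mapping stratum code -> number of sites to select.
    """
    rng = random.Random(seed)

    # Group the frame elements by stratum.
    by_stratum = defaultdict(list)
    for site_id, stratum in frame:
        by_stratum[stratum].append(site_id)

    sample = {}
    for stratum, n in n_per_stratum.items():
        sites = by_stratum[stratum]
        if n > len(sites):
            raise ValueError(f"stratum {stratum!r} has only {len(sites)} sites")
        # random.Random.sample selects without replacement.
        sample[stratum] = rng.sample(sites, n)
    return sample


# Hypothetical frame: 200 grid cells, 120 in one stratum and 80 in another.
frame = [(i, "BWC" if i < 120 else "SC") for i in range(200)]
sample = select_stratified_sample(frame, {"BWC": 10, "SC": 5}, seed=1)
```

Passing a seed makes a given selection reproducible; omitting it gives a fresh selection on each call.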

Further information about LTRM's probabilistic sampling designs is provided in reports published for the fish, macroinvertebrate, vegetation, and water quality components, as well as in a report on the monitoring rationale for the fish component. The FAQ section below covers some issues in detail.

Contact: Questions or comments may be directed to LTRM statistician Brian Gray or LTRM support staff Jim Rogala, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin.


Probabilistic Sampling Design FAQs

Questions related to the sampling frame (the frame is the listing of all possible sampling locations from which random locations are selected):

Questions related to how we sample the frame:

 

What are the consequences of possible "errors" in strata designation in the original frame?

Because of LTRM's interest in stratum-scale estimates, it is important to consider whether inaccuracies in the frame (e.g., designating as side channel an area that more closely resembles backwater conditions) result in estimates that do not adequately capture differences among strata within and across reaches. In doing so, it is also important to understand that strata are not equivalent to "habitats". Soballe (1997) addressed this issue as follows:

"LTRM sampling strata are NOT habitat classes. LTRM sampling strata do reflect differing habitat types approximately, and this correspondence is desirable, but it is not precise, and it is not required by the statistical design."

"We expect each stratum to contain a broad mosaic of habitat types, with the distribution of habitats differing among the strata. Unlike habitats, strata do not have a set of physical, chemical, or biotic attributes (i.e. depth, velocity, substrate type, vegetation) that uniquely define them. ... strata are fundamentally just geographic areas on a map."

LTRM response: Given the above statements, it seems implied that there are no strata "errors", because we do not expect a strict delineation in characteristics between strata. Frame elements that appear to be in the wrong stratum are just the extremes in a wide distribution of characteristics within a stratum, yet those distributions do vary among strata within a sampling event. Evidence of these differences among strata has been observed in many variables measured by LTRM.


Why haven't the frames been changed to reflect changes that have occurred in land water boundaries?

Large rivers can be dynamic, with both new wetted areas and loss of wetted areas occurring over time. In addition, human modifications to the system alter the aquatic portion of the floodplain. Below are examples of changes that could occur in the extent of the overall frame:

1. Sedimentation/erosion changing the "shoreline" in backwaters - The actual boundary between aquatic and terrestrial, at least for the purposes of creating strata maps, was determined based on vegetation, which may not reflect true changes in bed elevation. Surveys of backwater sedimentation (Rogala 2003) indicate that near shore changes are slow. Sedimentation due to fluvial processes (e.g., alluvial fan formation) is greater, but differences between the 1989, 2000, and 2010 Land Cover/Use (LCU) databases suggest that changes from aquatic to terrestrial are a small proportion of the backwater strata size.

2. Sedimentation/erosion changing the "shoreline" of channels - Changes in channel shorelines can be much greater than those in backwaters. However, these changes are not always unidirectional (e.g., conditions may revert in following years), and they can occur quickly in response to extreme discharge events.

3. Island building - Construction of islands does not significantly reduce the overall frame size (i.e., turn areas from aquatic to terrestrial) for most strata. The exception is the fish component, where strata for shoreline gear can be relatively small.


LTRM response: Although changes in land water boundaries occur, we believe these changes will not have measurable effects on parameter estimates derived from the SRS design. This belief is supported by comparisons of the three land cover databases LTRM now has over the period of record, and the small proportion of SRS sites that are inaccessible during any given sampling event due to terrestrial conditions. In addition, frame changes are sometimes temporary, so altering the frame to incorporate these changes does not seem beneficial. Furthermore, incorporating some of these changes is not practical given data required to make the changes (e.g., land cover databases are generated at a decadal scale, often using variable methods). In the early years of the LTRM it was expected that the need to remap the frame would be evaluated on approximately a ten-year cycle. Now that we better understand the limited changes occurring at the scale of the overall frame, and the consequences related to data analysis and interpretation, LTRM does not plan to alter the overall sampling frame or strata boundaries unless a major event alters the system in a significant way.


Why haven't changes in strata designation been made within the existing frame?

The dynamic nature of large rivers and human modifications can result in permanent changes from one aquatic area type (i.e., the basis of the stratification) to another. Some examples include:

| Morphometric change | Original aquatic area type | New aquatic area type |
| --- | --- | --- |
| Loss of islands in lower portions of pools | BWC | IMP |
| Formation of natural levees at both ends of a side channel | SC | BWI |
| Formation of a natural levee at upper end of a side channel | SC | BWC |
| Breach of a natural levee in a narrow linear backwater | BWC | SC |
| Island construction in lower portions of pools | IMP | BWC |
| Removal of portion of a levee around an isolated backwater | BWI | BWC |

Aquatic area codes: BWC = Backwater contiguous; BWI = Backwater isolated; IMP = Impounded; SC = Side channel

LTRM response: While enduring changes in some areas will happen over the life of the Program, several issues complicate changing the frame to capture those changes. Some of these issues have been stated above, such as determining the persistence of the change and having GIS data suitable to map the change. However, we believe that the single most important issue related to changing strata is that data analysis and interpretation become difficult if the stratification of the frame is altered. That topic is discussed further under the question: In general, what is the justification for a constant sampling frame?


In general, what is the justification for a constant sampling frame?

Primary justification is related to data analysis and interpretation:

1. A constant frame simplifies data analysis and its subsequent interpretation (e.g., weights remain constant through time). This is true if the primary interest is in pool-wide estimates across strata, which Soballe (1997) states is the case: "The primary target of the LTRM SRS design is the pool/reach - annual scale." It is especially true when multiyear estimates are desired. Such estimates will typically be obtained using models rather than by appealing to the sampling design (i.e., using design-based estimators); addressing variation in sampling weights using models is often challenging and remains a research area for statisticians. Hence, we think it will generally be easier for users of LTRM data to adjust for covariates than to adjust for temporal changes in sampling probabilities.
2. It is likely more efficient to detect change within a constant frame than to look for change in a changing frame. This reasoning was introduced by Soballe (1997): "Two fundamentally different approaches can be taken to assess changes within a stratified framework. One method is to frequently remap the strata based on field observations and then attempt to track or quantify these changes in the map. ... An alternative approach is to have the sampling strata permanently fixed in space, and then quantify changes in conditions within the permanently fixed strata based on the sampling data. .... Changes in conditions within the strata (and at the pool or reach level), rather than changes in the stratum boundaries, are used to indicate changes in the River." From these statements, and given the uncertainty in "remapping" the strata, it seems apparent that the alternative approach using a permanently fixed sampling frame is more efficient.
3. Related to the previous justification, altering the frame at decadal intervals to reflect changes in the aquatic area types would make interpretation over long periods difficult because of the "resetting" of strata estimates. This figure illustrates this point when comparing the two approaches.

Other justifications/considerations:

4. Changes in the frame are generally small relative to the total frame size. We see evidence of this in the land cover databases and in the small number of sites that were not sampled or were recorded as falling in a different stratum.
5. Temporal variability is large, so spatial changes likely have a small effect.
6. Change in the frame may be temporary. From the Fish Procedures Manual (Gutreuter et al. 1995): "The strata are based on enduring geomorphic physical features, called aquatic areas (Wilcox 1993), that help define important habitat types for fishes."
7. Frame changes can't be incorporated in a timely manner. In most cases, changes would need to be incorporated when a new Land Cover/Use (LCU) GIS coverage is developed. Currently, these are generated at a decadal frequency and lag many years behind the collection of the aerial photography.
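
To make the point in item 1 about constant weights concrete, the textbook design-based estimator of a reach-wide mean under stratified random sampling is sketched below. The notation is an assumption introduced here for illustration ($N_h$ frame elements in stratum $h$, $n_h$ of them sampled, $\bar{y}_h$ and $s_h^2$ the stratum sample mean and variance, $H$ strata); it is the standard form and not necessarily the exact estimator used in LTRM reports.

```latex
\bar{y}_{st} = \sum_{h=1}^{H} W_h \, \bar{y}_h ,
\qquad W_h = \frac{N_h}{N} ,
\qquad N = \sum_{h=1}^{H} N_h

\widehat{\mathrm{Var}}\left(\bar{y}_{st}\right)
  = \sum_{h=1}^{H} W_h^{2} \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^{2}}{n_h}
```

With a constant frame, each $W_h$ is the same in every year, so year-to-year differences in $\bar{y}_{st}$ reflect changes in the stratum means rather than changes in the weights.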


Are strata comparable across study reaches?

LTRM has an interest in strata-scale estimates, and strata comparisons across study reaches may be desired. Given the origin of the sampling frame, one may question if these strata represent similar areas across pools.


LTRM response: We expected differences among reaches in many of the parameters estimated from LTRM data that would be unassociated with how well strata represented the enduring geomorphic attributes of interest. For example, differences in water temperature between northern and southern reaches reflect climatic differences and not differences in geomorphic physical features. There are a few known issues with strata that do differ across reaches. The impounded strata are different among pools, and comparisons of impounded strata across reaches should be limited to Pools 8 and 13. The impounded area in Pool 26 is newly formed and so it lacks some impounded backwater features defined by Wilcox (1996); in particular, it is small and deep and would be expected to differ ecologically because there are minimal wind-generated wave effects. A second issue is that backwaters in the LaGrange Pool include managed/manipulated backwater units that are absent from the other pools; differences between the LaGrange backwater strata and those of other pools will often reflect the inclusion of these backwater units.


Are reach-wide estimates valid given that the sampling frame doesn't cover the entire aquatic area of a reach?

Most of the incompleteness in the frame is due to reductions of the sampling frame made so that a selected location has a high probability of actually being sampled. These reductions varied by component. The fish frame was reduced the most because of gear deployment issues. Boat accessibility issues were addressed across all components and strata. These issues are greatest for isolated backwaters, so, at best, a component may sample only selected isolated backwaters. There are many other reasons for frame reduction to ensure a high probability of sampling, two of which are shallow stump fields in impounded areas and managed backwaters that may be completely dewatered during some sampling events.

There were several other frame reductions that were related to other issues, and a few examples are as follows. Deep areas, as defined by a selected depth at low discharge conditions, were excluded from the frame used for vegetation sampling because the probability of vegetation in these areas was near zero. Some large areas were mapped as tributaries during the aquatic areas database generation and therefore left out of the sampling frame. The most prominent example of this is the portion of Pool 8 that connects the east spillway of Lake Onalaska with the rest of the Mississippi River as it flows on both sides of French Island. Technically, though called the Black River, it is not a tributary at all, as it already traveled through a large floodplain lake (Lake Onalaska).


LTRM response: As is true of all inferences from sampling data, inferences from LTRM data apply strictly to the frame, and are not exactly "pool-wide" or "aquatic area type" estimates. However, estimates from the LTRM sampling frame are superior to estimates across all aquatic areas for several reasons. First, excluding areas that can't be consistently sampled through time minimizes periodic missing data; such missing data make comparisons across years invalid (i.e., the frame would in effect change each year depending on the accessibility of some portions of the reach). Second, in the case of excluding deep areas for vegetation sampling, the estimates have more precision because sampling effort is not applied to areas where vegetation is very unlikely to occur. In this specific case, the estimates are valid for the area where most vegetation is likely to occur, and this "index" is valid across years and study reaches. Most other exclusions were areas that represented a very small proportion of the total aquatic area, with even the special case of a large excluded area (the "Black River" in Pool 8) representing only 3% of the total aquatic area in the reach.


Is the grid of points used to pick the random sample at a fine enough resolution to assure the sample is completely random, or is it somewhat a systematic sample?

In most cases when a spatial random sample is selected, it is done so with some grid of points. These may be defined by each unit in the geographic projection (e.g., meters in a UTM projection), or a wider spaced interval. If the spacing of the elements in the sampling frame is too wide, then the sample more closely resembles a systematic sample rather than a random sample. The concern with systematic samples is that they underestimate variance. LTRM uses a rather large grid, and there may be a concern that variances are being underestimated.


LTRM response: Although the LTRM grid size is somewhat large (typically 50 m), we are sampling over large areas, so the number of elements in each sampling stratum is still large (see stratum population sizes, Nh, at http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.pdf). Given the rather modest sample selection sizes used by LTRM, the sampling fractions tend to be low (<10%) in nearly all cases, with sampling fractions often being less than 1%. While this doesn't eliminate the concern for underestimating variance, it does illustrate that such underestimates would be expected to be minimal in most cases. A good understanding of spatial variance at small scales would be needed to definitively address this concern, and those data do not exist.
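
The relationship between grid spacing, stratum size, and sampling fraction can be sketched as follows. The stratum area and sample size are hypothetical values chosen for illustration; actual stratum population sizes (Nh) are given in the document linked above.

```python
def frame_elements(stratum_area_m2, grid_spacing_m=50.0):
    """Approximate number of grid cells (frame elements, N_h) in a stratum."""
    return int(stratum_area_m2 // (grid_spacing_m ** 2))


def sampling_fraction(n_sampled, n_frame):
    """Fraction of frame elements selected in one sampling event."""
    return n_sampled / n_frame


# Hypothetical stratum of 1,000 ha (10,000,000 m^2) with 50 sites selected.
N_h = frame_elements(10_000_000)   # 4000 cells at 50 m spacing
f_h = sampling_fraction(50, N_h)   # 0.0125, i.e., 1.25%
```

In this hypothetical case only about 1 in 80 grid cells is selected, which is the situation in which a sample drawn from a coarse grid is expected to behave much like a sample drawn from a finer one.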


Do sampling probabilities vary among strata?

In a stratified design, samples can be allocated at varying rates among strata. Reasons for doing so include avoiding oversampling larger strata and increasing sample size in strata with larger variance to more efficiently estimate across strata. A key concept here is that it is not stratum size that determines the number of samples desired, but rather the variance within that stratum, and a minimum sample size required to adequately determine variance. Even with a fixed rate of sampling across strata, there can be differences in sampling probability because of missing data or rounding errors. When estimating statistics across strata, weighting is required when sampling weights vary among strata.


LTRM response: Sampling probabilities vary among strata for LTRM data, and thus estimates across strata require weighting. LTRM used unequal sampling probabilities to allocate samples to some extent, but because many parameters are measured at each site and the variances of those parameters differ, only obvious deviations from an equal sampling probability design could be addressed. For example, it can be reasonably assumed that most water quality parameters vary less in the well-mixed channels than in the patchy off-channel areas, so sample sizes per unit area in channels need not be as large as those in off-channel areas. In contrast, it is less clear how the variance in fish parameters might vary among strata. In most cases, LTRM assigned sample sizes to strata by 1) determining a minimum sample size required for any stratum, 2) using existing knowledge of variances within strata to consider increases in sample size in strata with larger variances, and 3) considering stratum size when variances within strata were unknown. The last of these steps assumes a relationship between stratum sizes and variances within strata.
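
The need for weighting can be seen in a minimal sketch that compares a design-weighted mean with a naive pooled mean. The stratum sizes and observed values below are hypothetical, and the function names are illustrative; the weights are simply each stratum's share of the frame, which is the textbook choice and an assumption here rather than a description of LTRM's software.

```python
from statistics import mean


def stratified_mean(samples, frame_sizes):
    """Design-weighted mean across strata.

    samples: dict mapping stratum -> list of observed values.
    frame_sizes: dict mapping stratum -> number of frame elements (N_h).
    """
    total = sum(frame_sizes[h] for h in samples)
    return sum(frame_sizes[h] / total * mean(y) for h, y in samples.items())


def unweighted_mean(samples):
    """Naive mean that pools all observations and ignores the design."""
    return mean(v for y in samples.values() for v in y)


# Hypothetical case: the larger stratum (SC) is sampled at a lower rate.
samples = {"SC": [2.0, 2.0], "BWC": [6.0, 6.0, 6.0, 6.0, 6.0, 6.0]}
frame_sizes = {"SC": 3000, "BWC": 1000}

weighted = stratified_mean(samples, frame_sizes)  # 0.75 * 2 + 0.25 * 6 = 3.0
naive = unweighted_mean(samples)                  # 40 / 8 = 5.0
```

The pooled mean is pulled toward the more heavily sampled stratum, whereas the weighted mean reflects each stratum's share of the frame.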


How frequently are data missing (i.e., a selected location can't be sampled)?

Missing data can be a concern when the data are not missing completely at random (MCAR) because the estimates may be biased. For example, if more than a trivial amount of data is missing because an area becomes inaccessible during a sampling event, then the non-missing samples are unrepresentative of the whole sampling frame because the data are not MCAR. Some other reasons for missing data include sampling gear/equipment failure, and, for water quality, critical errors in processing laboratory samples.


LTRM response: For some components, LTRM randomly selects alternate sampling locations to replace locations that can't be sampled, but those replacements do not eliminate the concern about potential sampling bias. The amount of missing data varies greatly among components, years, and study reaches in LTRM. The primary reason for missing data is water levels, with restricted access during low water periods being the most common cause. Data missing due to access problems are prevalent in only some reaches, and such problems should be evaluated when evident in sample sizes. Except for these unique occurrences in some reaches, missing data occur at low rates. Generally, we find missing data least prevalent in vegetation data and most prevalent in water quality data, but again restricted to some reaches. Missing data from equipment/gear failure and laboratory errors are both reasonably treated as MCAR, so these are expected to have only small impacts on parameter estimates.
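
The following sketch illustrates why access-related missingness is not MCAR. It uses a small made-up set of selected sites in which shallow sites have higher values and become inaccessible at low water; all numbers, field names, and the depth threshold are hypothetical. In real data the mean over all selected sites is of course unknown, which is exactly the difficulty.

```python
from statistics import mean


def summarize_access_loss(sites, min_depth_m):
    """Return (missing rate, mean over all selected sites, mean over accessible sites).

    sites: list of dicts with hypothetical keys "depth_m" and "value".
    min_depth_m: sites shallower than this are treated as inaccessible.
    """
    accessible = [s for s in sites if s["depth_m"] >= min_depth_m]
    missing_rate = 1 - len(accessible) / len(sites)
    full_mean = mean(s["value"] for s in sites)
    observed_mean = mean(s["value"] for s in accessible)
    return missing_rate, full_mean, observed_mean


# Hypothetical selected sites: the shallow sites carry the highest values.
sites = [
    {"depth_m": 0.2, "value": 10.0},
    {"depth_m": 0.3, "value": 12.0},
    {"depth_m": 1.0, "value": 4.0},
    {"depth_m": 1.5, "value": 3.0},
    {"depth_m": 2.0, "value": 1.0},
]

rate, full_mean, observed_mean = summarize_access_loss(sites, min_depth_m=0.5)
```

Here 40% of the selected sites are lost, and the mean of the sites that could be sampled falls well below the mean over all selected sites because the missingness is related to the measured value.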


Is the sample selected with, or without, replacement? Does the answer to this question affect inferences from LTRM data?

Selecting a sample with, as opposed to without, replacement can affect the inference for the sample.


LTRM response: LTRM selects sampling locations without replacement within a sampling event, but with replacement across sampling events (an exception was for vegetation in Pool 8 from 2001 to 2004). Within an event, sampling without replacement is more efficient, yielding lower variance for a given sample size. Across events, sampling with replacement means that the variance associated with the selection itself differs among events, which makes change detection across a few years less probable. However, the estimates over a longer period more reasonably represent the entire frame, rather than the inference being on the initial selection. Because LTRM is intended to collect data over an extended period, the inclusion of variance associated with sample selection itself seems minor in comparison to the risk of repeatedly sampling a single selection that may be unrepresentative of the entire frame.
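
The within-event efficiency gain from sampling without replacement can be sketched with the standard textbook variance formulas for a simple random sample mean. The stratum size, sample size, and variance below are hypothetical values used only for illustration.

```python
def var_mean_without_replacement(s2, n, N):
    """Variance of the sample mean under simple random sampling without replacement.

    s2: population variance computed with divisor N - 1.
    n: sample size; N: number of frame elements.
    """
    return (1 - n / N) * s2 / n


def var_mean_with_replacement(s2, n, N):
    """Variance of the sample mean under simple random sampling with replacement.

    Converts s2 (divisor N - 1) to the divisor-N variance before dividing by n.
    """
    return (N - 1) / N * s2 / n


# Hypothetical stratum: N = 4000 frame elements, n = 50 sites, S^2 = 25.
wor = var_mean_without_replacement(25.0, 50, 4000)  # 0.49375
wr = var_mean_with_replacement(25.0, 50, 4000)      # 0.499875
```

At a sampling fraction near 1%, as in this hypothetical case, the two variances differ by only about 1%, so the gain from sampling without replacement is real but small when sampling fractions are low.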


How are the multiple sampling events within year handled?

Both fish and water quality components stratify sampling within year. Fish has three continuous six-week periods running from June 15 to October 30. Water quality has four discontinuous periods selected to represent winter, spring, summer, and fall seasons. These sampling periods could potentially be treated as individual sampling events with periods and seasons as strata.


LTRM response: The three periods of fish sampling were established to ensure data were collected over the entire warm season portion of a given sampling year. For design-based estimates, these could be, and most often are, treated as a single sample to estimate at an annual scale. The periods may be treated as strata if there is a specific information need, but the small sample sizes will produce poor estimates in some cases. The water quality seasonal samples are included in the design specifically to produce estimates at the seasonal scale. Annual-scale estimates across water quality seasons are not typically made, because these periods are not contiguous and sample less than 15% of the entire year, and because such estimates are less informative given that water quality conditions have the most ecological significance at the seasonal scale. Model-based estimates might be approached differently than design-based estimates by considering dependence across periods and seasons.


Should the annual sampling data be treated as independent samples?

As with the issue of independence when modeling multiple within-year samples, investigators should consider dependency across years for LTRM data.


LTRM response: The annual data for fish, vegetation and invertebrates are not independent across years. For fish, individuals within year classes can be captured across many years. For vegetation, established vegetation beds likely persist from one year to the next not simply because of suitable conditions, but because of development of rooting structures that enable growth in the following year, or actual active growth over the entire year. For invertebrates, similar lack of independence across years can be imagined. Those modeling trends in LTRM fish, vegetation and invertebrate data must consider how to address the possibility of dependency across years. Water quality, on the other hand, is frequently reset by seasonal conditions. This suggests that the annual seasonal samples for water quality can be treated as independent samples in most cases.
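
One simple diagnostic an investigator might use to look for dependency across years is the lag-1 autocorrelation of a series of annual estimates. The sketch below is a generic illustration of that diagnostic, not a method prescribed by LTRM, and the two series are made up.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of an equally spaced annual series."""
    if len(series) < 3:
        raise ValueError("need at least three annual values")
    m = mean(series)
    # Sum of products of deviations for consecutive years.
    numerator = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    denominator = sum((x - m) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("series is constant")
    return numerator / denominator


# Hypothetical annual index values.
persistent = [1.0, 2.0, 3.0, 4.0, 5.0]    # each year resembles the previous one
alternating = [1.0, 5.0, 1.0, 5.0, 1.0]   # each year reverses the previous one

r_persistent = lag1_autocorrelation(persistent)    # 0.4
r_alternating = lag1_autocorrelation(alternating)  # -0.8
```

Values well away from zero suggest that annual values should not be modeled as independent; with the short series typical of monitoring data, the statistic is itself quite uncertain and should be read as a prompt for closer modeling rather than as a test.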


Why aren't all components sampled at the same time and place?

In many monitoring programs, multiple components are sampled at the same time and place to provide information on explanatory variables (e.g., habitat suitability). LTRM does not use such a sampling design. Users often ask why not, and whether any investigations into associations across components can be done with LTRM data.


LTRM response: The primary objective for the SRS sampling by LTRM is to produce status and trends information for measured variables at the reach and annual scale, and the sampling design was optimized for those purposes. The data are suitable for analysis of explanatory variables at those scales (e.g., in a given pool, higher catch years of some fish species were associated with years with increased vegetation indices). The design could have incorporated the additional objective of collecting data suitable for analysis of explanatory factors at the site scale, but there are several reasons, including logistics, why that was not desirable given the primary objective.

References

Gutreuter, S., R. Burkhardt, and K. Lubinski. 1995. Long Term Resource Monitoring Program Procedures: Fish Monitoring. National Biological Service, Environmental Management Technical Center, Onalaska, Wisconsin, July 1995. LTRMP 95-P002-1. 42 pp. + Appendixes A-J

Rogala, J. T., P. J. Boma, and B. R. Gray. 2003. Rates and patterns of net sedimentation in backwaters of Pools 4, 8, and 13 of the Upper Mississippi River. U.S. Geological Survey, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin. An LTRMP Web-based report available online at http://www.umesc.usgs.gov/data_library/sedimentation/documents/rates_patterns/. (Accessed August 2019.)

Soballe, D. M. 1997. May 12, 1997 Memorandum to LTRMP Field Teams and Component Specialists from Dr. David M. Soballe on the subject of LTRMP Stratified Sampling Issues. 

Wilcox, D. B. 1993. An aquatic habitat classification system for the Upper Mississippi River System. U.S. Fish and Wildlife Service, Environmental Management Technical Center, Onalaska, Wisconsin, May 1993. EMTC 93-T003. 9 pp. + Appendix A. (NTIS PB93-208981) 

U.S. Department of the Interior | U.S. Geological Survey

Page Last Modified: December 5, 2019