Classification

This page covers developing a classification scheme and your reference water body classification database.

Step 2, accurately classifying water bodies, applies only to the spatial reference condition approach. Group the set of reference water bodies identified in Step 1 into classes with similar nutrient concentrations. Other factors, such as elevation, geology, hydrology, position within the watershed, and the size and shape of the channel or basin, also might be useful in defining your classes. Define each class so that the variability of nutrient concentrations within the class is small relative to the variability among classes. You can then specify different nutrient criteria for the waters in each class.
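One way to compare candidate classification schemes against that goal is to split the total variability in nutrient concentrations into a within-class part and an among-class part. The sketch below is illustrative only: the function name, the class labels, and the total phosphorus values are hypothetical, and sums of squares are used here as one simple measure of variability.

```python
from statistics import mean

def variance_partition(classes):
    """Split total sum of squares into within-class and among-class parts.

    `classes` maps a class label to a list of nutrient concentrations
    (hypothetical values; units are whatever the data use).
    """
    all_values = [v for values in classes.values() for v in values]
    grand_mean = mean(all_values)
    # Within-class: squared deviations of each value from its class mean.
    within = sum((v - mean(values)) ** 2
                 for values in classes.values() for v in values)
    # Among-class: squared deviations of class means from the grand mean,
    # weighted by the number of water bodies in each class.
    among = sum(len(values) * (mean(values) - grand_mean) ** 2
                for values in classes.values())
    return within, among

# Hypothetical total phosphorus values (ug/L) for the same six reference
# water bodies grouped under two candidate schemes.
by_elevation = {"upland": [8.0, 10.0, 12.0], "lowland": [28.0, 30.0, 32.0]}
by_size = {"small": [8.0, 30.0, 12.0], "large": [28.0, 10.0, 32.0]}

for name, scheme in (("elevation", by_elevation), ("size", by_size)):
    within, among = variance_partition(scheme)
    # A smaller within-class share suggests a more useful scheme.
    print(name, round(within / (within + among), 3))
```

In this invented data set, the scheme with the smaller within-class share of the total would be the better candidate for setting class-specific criteria.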

Classification can be an iterative process. You might have to revisit and refine your classification scheme as you progress through developing your criteria. For example, you might identify factors later in the process that affect nutrient dynamics and primary production responses.

Developing Your Classification Scheme

There are two basic approaches for grouping reference water bodies:

Using a geographic classification scheme (e.g., ecoregions, land resource areas, and physiographic provinces).
Applying a statistical method to biological, chemical, and physical data.

Geographic classification

EPA has developed a useful resource for initial classifications by delineating geographic divisions based on non-anthropogenic factors that affect ecosystems. The Agency's maps of ecoregions in the United States are available at various levels of resolution and aggregation. They are based on interpretations of the spatial coincidence of geographic phenomena that cause or reflect differences in ecosystem patterns (e.g., geology, hydrology, physiography, and soils). EPA has also developed aggregations of Level III ecoregions for use in nutrient management and assessment that reflect similar expectations for nutrient dynamics.

Statistical classification

You can also classify freshwater environments statistically by identifying characteristics of the reference water bodies associated with naturally expected nutrient concentrations. The first step in this process is to assemble data from reference waters that can predict naturally expected nutrient concentrations.

For lakes, physical classification factors might include water body morphologies (e.g., depths, widths, shapes, and volumes), water body origin, hydrodynamics (e.g., residence time), contributing watershed size, management regime (for reservoirs), and natural landscape and watershed vegetation. Chemical factors that might influence or be associated with naturally expected nutrient concentrations include alkalinity, color, hardness, iron, pH, silica, and temperature.

Stream classification factors might be based on canopy cover, channel morphology, flow continuity, retention time, slope, substrate features, and system size. Chemical factors similar to those for lakes also might be relevant for streams.

Physical classification of reference estuaries and coastal marine ecosystems can be more challenging because each water body is unique in terms of response to nutrient pollution, sensitivity, and size. For those water bodies, consider depth, habitat type (e.g., mangrove, seagrass, coral), morphometry, physical/hydrodynamic factors (e.g., circulation, mixing, stratification, water residence time, tidal amplitude, river flow, tides, wave exposure), and salinity gradients.

After assembling data on water body characteristics and nutrient concentrations in those water bodies, you can use multivariate statistical analyses such as cluster analysis or classification and regression tree analysis to partition reference water bodies into groups with similar nutrient concentrations. Figure 1 shows the results of an analysis in which TP concentrations at reference sites across the conterminous U.S. were modeled using a classification and regression tree with the following predictor variables: catchment area, elevation, longitude, latitude, mean annual temperature, and mean annual precipitation. The regression tree selected the predictor variables that minimized variance in TP concentrations within groups relative to among-group variance. Longitude was the first variable identified in the analysis, splitting the national data into western and eastern groups. Subsequent splits were found for latitude (in western streams) and for catchment area (in eastern streams). Distributions of TP concentrations differ substantially among the groups, with streams in the northwestern U.S. having the lowest concentrations and large streams in the eastern U.S. having the highest concentrations.

Figure 1. Results of a classification and regression tree analysis of reference site TP concentrations in the conterminous U.S. Boxplots below each group show the distribution of log transformed TP concentrations. Labels in the circles are as follows: Long = degrees longitude, Lat = degrees latitude, and Area = catchment area.
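The core step of a regression tree can be sketched as a search for the single binary split that leaves the least variability within the two resulting groups. The example below is a minimal illustration, not the analysis behind Figure 1: the function names, the six sites, and their values are all invented. A full regression tree would apply the same search repeatedly within each group and would add stopping or pruning rules.

```python
from statistics import mean

def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(sites, predictors, response):
    """Return (predictor, threshold, within_ss) for the single binary split
    that minimizes the summed within-group sum of squares of `response`."""
    best = None
    for p in predictors:
        levels = sorted({s[p] for s in sites})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [s[response] for s in sites if s[p] < t]
            right = [s[response] for s in sites if s[p] >= t]
            ss = sum_sq(left) + sum_sq(right)
            if best is None or ss < best[2]:
                best = (p, t, ss)
    return best

# Hypothetical reference sites: longitude (degrees), catchment area (km2),
# and log10 TP. Values are invented for illustration only.
sites = [
    {"long": -120.0, "area": 50.0, "log_tp": 0.9},
    {"long": -115.0, "area": 400.0, "log_tp": 1.0},
    {"long": -110.0, "area": 80.0, "log_tp": 1.1},
    {"long": -90.0, "area": 60.0, "log_tp": 1.6},
    {"long": -85.0, "area": 500.0, "log_tp": 1.8},
    {"long": -80.0, "area": 90.0, "log_tp": 1.7},
]

predictor, threshold, within_ss = best_split(sites, ["long", "area"], "log_tp")
print(predictor, threshold, round(within_ss, 3))
```

With these invented values the search selects longitude as the first splitting variable, because dividing the sites into western and eastern groups leaves far less within-group variability than any split on catchment area.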

Examples of Classification Schemes

Example 1: Bed-sediment Phosphorus Levels

In Florida, the concentrations of phosphorus found in the bed sediment of relatively undisturbed headwater streams were used to help identify and classify stream reference sites. Figure 1 displays regions in South Florida delineated according to bed-sediment phosphorus levels based on samples collected from headwater stream draining areas from 1976 through 2006 (USGS).

Figure 1. Mean values of bed-sediment phosphorus concentration within geologic map units

Example 2: Salinity Zones

The Chesapeake Bay and its subestuaries are delineated by salinity zones. Grouping data by salinity zone has greatly reduced the variability in nitrogen, phosphorus, or chlorophyll a data within each group.

Figure 2. Salinity zones in the Chesapeake Bay and its subestuaries

Example 3: Cluster Analysis

Cluster analysis and similarity analysis were used with water quality monitoring data from sites in Biscayne Bay, Florida. The cluster analysis factored in TN, TP, and chlorophyll a. The resulting dendrogram indicated the degree of similarity among sites as a function of the three variables. Sites 101 and 102 in Figure 3 were highly similar. Distant clusters such as BB1 and BB5 were less similar.

Figure 3. Three-variable (TN, TP, chlorophyll a) cluster analysis dendrogram for Biscayne Bay. Distance between clusters is defined as the Euclidean distance between standardized values of each variable.
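The distance measure described in the caption can be sketched as follows: each variable is standardized, Euclidean distances are computed between sites, and the closest clusters are merged step by step to give the order of joins in a dendrogram. This is a minimal sketch under stated assumptions. The site names and values are hypothetical, and single linkage (distance between the nearest members of two clusters) is an assumption made here, since the linkage rule used for Figure 3 is not stated.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (mean 0, population SD 1).

    Assumes no column is constant; a constant column would have SD 0.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds))
            for row in rows]

def merge_order(names, rows):
    """Single-linkage agglomerative clustering; returns the merges in order
    as (cluster_a, cluster_b, distance)."""
    z = dict(zip(names, standardize(rows)))
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Distance between clusters = smallest distance between any members.
        d, a, b = min(
            (min(dist(z[i], z[j]) for i in a for j in b), a, b)
            for a, b in combinations(clusters, 2)
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

# Hypothetical sites with (TN mg/L, TP mg/L, chlorophyll a ug/L).
names = ["A", "B", "C", "D"]
rows = [
    (0.30, 0.010, 0.5),
    (0.32, 0.011, 0.6),
    (0.80, 0.040, 3.0),
    (0.85, 0.045, 3.5),
]

for a, b, d in merge_order(names, rows):
    print(a, b, round(d, 3))
```

Sites that join at a small distance are highly similar across the three variables, and clusters that join only at a large distance are less similar, which is how the dendrogram is read.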

Developing Your Reference Water Body Classification Database

At this stage in developing nutrient criteria, you probably have compiled a variety of water quality, biological, and physical data used in previous efforts such as characterizing water bodies, selecting water body types, developing a conceptual model, defining reference conditions, and identifying reference water bodies. In this step, because the focus is exclusively on the reference water bodies you identified in Step 1, you will need to partition those data from the general population.
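Partitioning the reference data can be as simple as filtering compiled records by the identifiers of the reference water bodies. The sketch below assumes a hypothetical record layout with a `site_id` field. The site identifiers and values are invented, and a real database would use its own field names.

```python
def partition_reference(records, reference_ids):
    """Split records into reference and non-reference lists, and report any
    reference water bodies that have no records at all."""
    reference = [r for r in records if r["site_id"] in reference_ids]
    other = [r for r in records if r["site_id"] not in reference_ids]
    found = {r["site_id"] for r in reference}
    missing = sorted(reference_ids - found)
    return reference, other, missing

# Hypothetical compiled records from the general population of water bodies.
records = [
    {"site_id": "S1", "tp_ug_l": 9.0},
    {"site_id": "S2", "tp_ug_l": 45.0},
    {"site_id": "S3", "tp_ug_l": 11.0},
    {"site_id": "S2", "tp_ug_l": 52.0},
]
# Hypothetical identifiers of the reference water bodies from Step 1.
reference_ids = {"S1", "S3", "S9"}

reference, other, missing = partition_reference(records, reference_ids)
print(len(reference), len(other), missing)
```

Listing reference water bodies that have no records is one way to see where additional data might be needed before classification.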

If you need additional data to effectively accomplish the classification task, you can access other information sources that include local, state, regional, and national databases. The Water Quality Portal is an example of a national database that is a clearinghouse for USGS, EPA, and USDA data.

Data Sources for Lakes and Reservoirs

For lakes and reservoirs, other data sources include the following:

National Eutrophication Survey (NES)
National Surface Water Survey (NSWS)
U.S. Army Corps of Engineers
U.S. Department of the Interior, Bureau of Reclamation
National surveys such as the EPA Eastern Lakes Survey
Clean Lakes Program
Regional limnological studies

Data Sources for Rivers and Streams

For rivers and streams, other data sources include the following:

USGS Hydrologic Benchmark Network (HBN) and National Stream Quality Accounting Network (NASQAN)
USGS National Water-Quality Assessment Program (NAWQA)
USGS Water, Energy, and Biogeochemical Budgets (WEBB)
USDA Agricultural Research Service (ARS)
U.S. Forest Service

Data Sources for Estuaries and Coastal Waters

For estuaries and coastal waters, other data sources include the following:

Ocean Data Evaluation System (ODES)
Chesapeake Bay Program
National Estuary Program (NEP)
National Oceanic and Atmospheric Administration (NOAA)
National Estuarine Research Reserve System

Contacting local and state technical groups (e.g., university researchers or local NEP staff) might be a good way to check on available data resources. If an EPA- or state-sponsored technical advisory group exists, members or associated researchers will likely be knowledgeable about the availability and applicability of data resources for criteria development. In addition, local experts and researchers will likely have knowledge of and access to other local scientific expertise to assist in the sampling design and collection of additional data.