7  Notes on using the web interface

The web interface for the database allows users to select subsets of data from the database by a range of criteria. Sites can be selected based on:

In all cases, except when selecting on project, samples can be selected based on:

For ‘Projects’, all of the data collected for that project is provided without an option for subsetting.

Summary information on the number and types of samples at each site can be accessed on an interactive map, and, if the selected subset contains data from 50 or fewer sites, a summary table of the numbers and types of samples is shown.

The web interface compiles selected data into an Excel file which can be downloaded, together with an explanatory document providing a summary of the data downloaded, including licence information, metadata, and cautionary notes on using the downloaded data. You should read the explanatory document before using downloaded data. The compiled data in the download file differs from the database data in several ways.

  1. The samples table includes a SIGNAL score field calculated from the biota data to provide a quick stream-health indicator.

  2. The count field in the biota table is adjusted depending on the ‘data type’ chosen:

    1. if “presence-absence (all collection methods)” is selected, then all samples from the chosen sites are selected, and all counts are converted to 1.

    2. if “unbiased count (lab-sorted methods)” is selected, then only lab-sorted samples are selected. For each record, the number counted in the subsample is recorded as count and the percentage subsample size is recorded in subsamp_perc (subsamp_perc for coarsepick specimens is set to 100%). This data format permits analysis of the data so that subsampling error is modelled separately from ecological processes (e.g. Walsh et al. 2023).

    3. if “quantitative methods” is selected, then only samples collected by quantitative methods (e.g. airlift sampler, Hess sampler or snag-bag) are included. Counts and subsamp_perc are reported as for the “unbiased count (lab-sorted methods)” option.

  3. Taxa in the biota table are combined if a taxonomic resolution of “family” or “genus” is selected. This is done by converting the last four characters of each taxoncode to “9999” for family level, or the last two characters to “99” for genus level, and then summing counts for each resulting unique taxoncode. The resolution also determines which samples are selected. If “lowesttaxon” is selected, taxoncodes are unchanged, but only samples with a processing_method indicating they were identified to lowest taxon are selected. If “genus” is selected, samples identified to either genus or lowest taxon are selected. If “family” is selected, all samples are selected.

  4. The taxonomy table in the download file is a list of all taxa in the dataset, compiled using the function codeTaxonomy() (see above).
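
The taxoncode conversion used to combine taxa at family or genus level can be illustrated with a minimal Python sketch. It is not the web interface's own code. The function names, the record layout (sample identifier, taxoncode, count) and the example taxoncodes are all hypothetical, and the sketch assumes that counts are summed within each sample.

```python
from collections import defaultdict


def aggregate_taxoncode(taxoncode, resolution):
    """Return the taxoncode rewritten for the requested taxonomic resolution."""
    if resolution == "family":
        # Replace the last four characters with "9999".
        return taxoncode[:-4] + "9999"
    if resolution == "genus":
        # Replace the last two characters with "99".
        return taxoncode[:-2] + "99"
    if resolution == "lowesttaxon":
        # Taxoncodes are left unchanged.
        return taxoncode
    raise ValueError(f"unknown resolution: {resolution!r}")


def combine_taxa(records, resolution):
    """Sum counts for each (sample, converted taxoncode) pair.

    `records` is an iterable of (sample_id, taxoncode, count) tuples.
    Summing within each sample is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for sample_id, taxoncode, count in records:
        key = (sample_id, aggregate_taxoncode(taxoncode, resolution))
        totals[key] += count
    return dict(totals)


# Made-up records; "AB010101" etc. are not real taxoncodes.
records = [
    ("S1", "AB010101", 3),
    ("S1", "AB010102", 2),
    ("S1", "AB010201", 4),
    ("S1", "AB020101", 1),
    ("S2", "AB010101", 5),
]

family_counts = combine_taxa(records, "family")
genus_counts = combine_taxa(records, "genus")
print(family_counts)
print(genus_counts)
```

With these records, the three S1 entries whose codes begin with “AB01” collapse to a single family-level code with a count of 9, while at genus level “AB010101” and “AB010102” combine to a count of 5 and “AB010201” stays separate.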