Appendix A — Version changes

The PostgreSQL version of the database on the Waterway Ecosystem Research Group ‘water’ server always holds the most up-to-date version of the mwbugs database. Most recent changes are documented at the end of this appendix. If the final listed version is pending, the changes have not yet been committed to the downloadable gpkg version of the database. Past versions will be available here.

As new data are generated, they are added to the database. This appendix details structural changes to the database since version 1, separate from the addition of new data.

A.1 Version 2

Version 2 of the database contains two new tables: site_groups and site_photos (see Chapter 2).

The projection of the version 2 spatial tables (sites and site_photos) is GDA 2020 MGA Zone 55 (EPSG 7855). In version 1, they were in GDA 1994 MGA Zone 55 (EPSG 28355). Versions 1.3 and 1.2 of the Melbourne Stream Network database (Walsh 2023), which match the sitecode_v12 and reach_v12 fields in the sites table, are also in EPSG 7855. Note that version 1.1 of the stream network, which matches the sitecode and reach codes in the mwbugs database, is in EPSG 28355.

The geometry column in the spatial tables is called geom in version 2 (it was ‘geometry’ in version 1).
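Because the geometry column name differs between versions, scripts that read the downloadable gpkg file may prefer to look the name up rather than hard-code it. The following is a minimal sketch, assuming the file follows the GeoPackage standard and therefore registers each spatial table in the gpkg_geometry_columns metadata table. The in-memory database, the 'POINT' geometry type and the function name geometry_column are illustrative assumptions, not part of the mwbugs database.

```python
import sqlite3


def geometry_column(conn, table):
    """Return the geometry column name registered for `table` in a GeoPackage."""
    row = conn.execute(
        "SELECT column_name FROM gpkg_geometry_columns WHERE table_name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise LookupError(f"{table!r} is not registered as a spatial table")
    return row[0]


# Stand-in for a downloaded gpkg file: an in-memory SQLite database holding
# only the standard metadata table, filled with assumed version 2 values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_geometry_columns ("
    "table_name TEXT, column_name TEXT, geometry_type_name TEXT, "
    "srs_id INTEGER, z INTEGER, m INTEGER)"
)
conn.execute(
    "INSERT INTO gpkg_geometry_columns VALUES ('sites', 'geom', 'POINT', 7855, 0, 0)"
)

geom_col = geometry_column(conn, "sites")
```

With a real file, the connection would instead be opened with sqlite3.connect() on the path to the gpkg, and the same lookup should return the column name appropriate to that version.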

sitecode_v12 and reach_v12 are new additions to the sites table, and the fields mgae and mgan (coordinates in EPSG 28355 in version 1) have been replaced by mgae_7855 and mgan_7855. mw_subcat and mw_cat have replaced the Melbourne Water management codes in the version 1 sites table, in line with the adoption of 69 ‘subcatchments’ (mw_subcat) and 5 ‘catchments’ (mw_cat) by Melbourne Water. The full names and geographic boundaries of the mw_cat and mw_subcat fields can be found in the mwsubcatchments table in the mwstr_v13 database (Walsh 2023).

While taxonomic changes are an ongoing feature of updates within versions, several major changes have been made to the taxonomy tables between versions 1 and 2. Following a collation of DNA barcodes for the highly diverse aquatic mites (Carew et al. 2022), I revised the taxoncode scheme for mites, which had previously been lumped under the t1code of “MM”. The code “MM” remains in the taxonomy tables as it has been widely used in family-level identifications throughout the database. However, “MM” should be considered ambiguous in datasets that contain records with taxoncodes beginning with any of the following two-letter codes, which split the mites into orders:

  • “MS” Astigmata

  • “MO” Oribatida

  • “MP” Prostigmata (which includes the groups usually termed ‘Hydracarina’)

  • “MH” Sphaerolichida

  • “MC” Mesostigmata.
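The ambiguity rule for mite codes can be checked programmatically before an analysis. This is a minimal sketch, assuming taxoncodes are available as plain strings. The function name and the example codes other than the two-letter prefixes are hypothetical.

```python
# Two-letter prefixes that split the mites into orders (from the list above).
MITE_ORDER_PREFIXES = ("MS", "MO", "MP", "MH", "MC")


def mm_is_ambiguous(taxoncodes):
    """Return True if a dataset mixes 'MM' records with order-level mite codes."""
    codes = list(taxoncodes)
    has_mm = any(code.startswith("MM") for code in codes)
    has_order_level = any(code.startswith(MITE_ORDER_PREFIXES) for code in codes)
    return has_mm and has_order_level


# Hypothetical taxoncodes, for illustration only.
mixed = mm_is_ambiguous(["MM", "MP0101", "QD01"])
lumped_only = mm_is_ambiguous(["MM", "QD01"])
```

A dataset for which the function returns True contains both lumped and order-level mite records, so any analysis of its “MM” records should allow for that ambiguity.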

Similarly, following advice from Z Billingham (pers. comm.), codes for Tipulidae (“QD01”) and Limoniidae (“QD02”) species have been revised. All family-level records of Tipulidae (“QD01”) before 2021 should be considered ambiguous, conceivably being either Tipulidae or Limoniidae.
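A comparable check can flag potentially ambiguous Tipulidae records. The sketch below rests on two assumptions that are not confirmed by this appendix: that a family-level record is one whose taxoncode is exactly “QD01”, and that “before 2021” refers to the collection date of the sample. The function name is hypothetical.

```python
from datetime import date


def tipulidae_is_ambiguous(taxoncode, collection_date):
    """Flag family-level Tipulidae records collected before 2021.

    Assumes family-level means the taxoncode is exactly 'QD01' and that
    the cutoff applies to the collection date (both are assumptions).
    """
    return taxoncode == "QD01" and collection_date.year < 2021


old_family_record = tipulidae_is_ambiguous("QD01", date(2018, 10, 15))
new_family_record = tipulidae_is_ambiguous("QD01", date(2021, 3, 1))
```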

A.1.1 Version 2.0.1 (pending)

Two new tables, projects and sample_project_groups, have been added to permit easier searching for samples by project, allowing for the possibility that some samples have been used in more than one project (see … in manual; still pending).
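The relationship between the two new tables is many-to-many: a sample may belong to several projects. The sketch below illustrates how such a lookup might work. Because the table structure is not documented here, all column names (project_id, project_name, smpcode as a key in sample_project_groups) and all values are hypothetical placeholders.

```python
import sqlite3

# In-memory stand-in with a hypothetical structure for the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (project_id INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE sample_project_groups (smpcode TEXT, project_id INTEGER);
    INSERT INTO projects VALUES (1, 'project A'), (2, 'project B');
    -- Placeholder sample 'S2' is used in both projects.
    INSERT INTO sample_project_groups VALUES ('S1', 1), ('S2', 1), ('S2', 2);
    """
)


def samples_in_project(conn, project_name):
    """Return the sorted sample codes linked to the named project."""
    rows = conn.execute(
        "SELECT g.smpcode FROM sample_project_groups AS g "
        "JOIN projects AS p ON p.project_id = g.project_id "
        "WHERE p.project_name = ? ORDER BY g.smpcode",
        (project_name,),
    ).fetchall()
    return [row[0] for row in rows]


in_a = samples_in_project(conn, "project A")
in_b = samples_in_project(conn, "project B")
```

Keeping the sample-to-project links in a separate table, rather than as a single project field in the samples table, is what allows one sample to appear under more than one project.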

An additional new table, sitecode_env, contains attenuated forest values for all sites in the sites table. It was added to this database because the values in the mwstr database, calculated for the bottom of each reach, are not always accurate for the locations of sites in this database. reach_v12 and sitecode_v12 values for 32 sites in the sites table were corrected in preparing this table.

Records added for: 384-JKS-470-7-1-AC and 384-JKS-470-7-1-BC (to samples, biota, and sampprs tables).

samples table: incorrect collection dates were corrected for smpcodes beginning with: “384-BNY-69016-6”, “383-BRS-1472-0”, “383-LSN-150-4”, “382-REI-3857-8”, “384-YAR-215620-0”, “381-CHA-1268-2”, “381-EMU-14206-0”, “381-MAM-1366-8”, “381-BOY-13537-6”, “381-EMU-1643-1”, and “384-LSS-96-6”. The errors were caused by confusion between sample dates for DNA samples and for morphologically identified samples. As a result, the monthcodes (and therefore the smpcodes and sampprs codes) for some of these samples do not match the corrected date (differing by no more than 1). These code errors were not corrected, but the dates were: any analysis of differences in sampling dates should use dates rather than monthcodes.

processing_methods table: pcodes L and M were corrected and pcode P was added. These three pcodes cover the three processing methods used in DNA metabarcode identification.

biota table: an error in the taxoncodeToShortcode() function was discovered and corrected (and the corrected version uploaded to bug_database_functions.R). As a result of the error, the shortcode field in the biota table was incorrect for many records. These have all now been corrected.

Taxonomy tables:

With the addition of the first set of data identified by DNA barcodes, many new taxa have been added to the taxon tables, particularly the taxon_spp and morphospp_etc tables. Many entries in these tables in version 2 that were flagged as being added from DNA analysis have been replaced with revised data, following the inclusion of samples with sourcecode = 64 (Spring 2018 samples). These samples comprise two sets: one taken as an additional sample with the morphologically identified samples at each site, which was subsampled, picked, homogenized and metabarcoded (samples ending in L); and picked samples from one of the morphologically identified samples, which were individually barcoded (samples ending in M). Further revision of the taxonomy tables, including the storing of sequences and Barcode Index Numbers, is in progress and will be complete before release of version 2.0.1.