Publication information

Joint optimization of cluster number and abundance transformation for obtaining effective vegetation classifications

Authors	LENGYEL Attila, LANDUCCI Flavia, MUCINA Ladislav, TSAKALOS James L., BOTTA-DUKAT Zoltan
Year of publication	2018
Type	Article in a scholarly periodical
Journal / Source	Journal of Vegetation Science
MU Faculty / Unit	Faculty of Science
Citation
DOI	http://dx.doi.org/10.1111/jvs.12604
Keywords	cluster validation; clustering; community similarity; cover scale; data type; multivariate data analysis; numerical classification; stability of classification
Description	Question: Is it possible to determine which combination of cluster number and taxon abundance transformation would produce the most effective classification of vegetation data? What is the effect of changing cluster number and taxon abundance weighting (applied simultaneously) on the stability and biological interpretation of vegetation classifications? Locality: Europe, Western Australia, simulated data. Methods: Real data sets representing Hungarian sub-montane grasslands, European wetlands, and Western Australian kwongan vegetation, as well as simulated data sets, were used. The data sets were classified using the partitioning around medoids method. We generated classification solutions by gradually changing the transformation exponent applied to the species projected covers and the number of clusters. The effectiveness of each classification was assessed with a stability index. This index is based on bootstrap resampling of the original data set with subsequent elimination of duplicates. The vegetation types delimited by the most stable classification were compared with other classifications obtained at local maxima of the stability values. The effect of changing the transformation power exponent on the number of clusters, indexed according to their stability, was evaluated. Results: The optimal number of clusters varied with the power exponent in all cases, both with real and simulated data sets. With the real data sets, optimal cluster numbers obtained with different data transformations recovered interpretable biological patterns. Using the simulated data, the optima of stability values identified the simulated number of clusters correctly in most cases. Conclusions: By changing the settings of data transformation and the number of clusters, classifications of differing stability can be produced. Highly stable classifications can be obtained from different settings for cluster number and data transformation. Despite similarly high stability, such classifications may reveal contrasting biological patterns, thus suggesting different interpretations. We suggest testing a wide range of available combinations to find the parameters resulting in the most effective classifications.
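
The Methods part of the abstract names three ingredients: a power transformation of cover values, a grid of (exponent, cluster number) settings, and bootstrap resampling followed by elimination of duplicates. The Python sketch below illustrates only those three steps. It is not the authors' code, and it does not implement partitioning around medoids or the stability index itself. The function names, the example cover values, and the convention that an exponent of 0 yields presence/absence data are all assumptions made for illustration.

```python
import random


def power_transform(covers, exponent):
    """Raise every cover value to `exponent`.

    Assumption: exponent 0 is treated as presence/absence. It is handled
    explicitly because 0.0 ** 0 evaluates to 1.0 in Python, which would
    mark absent species as present.
    """
    transformed = []
    for row in covers:
        if exponent == 0:
            transformed.append([1.0 if v > 0 else 0.0 for v in row])
        else:
            transformed.append([float(v) ** exponent for v in row])
    return transformed


def bootstrap_without_duplicates(n_plots, rng):
    """Draw n_plots plot indices with replacement, then keep each index once."""
    drawn = [rng.randrange(n_plots) for _ in range(n_plots)]
    return sorted(set(drawn))


def parameter_grid(exponents, cluster_numbers):
    """List every (exponent, cluster number) combination to be evaluated."""
    return [(p, k) for p in exponents for k in cluster_numbers]


# Hypothetical plot-by-species matrix of percentage covers (not real data).
covers = [[0, 4, 25], [16, 0, 1]]
rng = random.Random(0)

for exponent, k in parameter_grid([0, 0.5, 1], [2, 3]):
    data = power_transform(covers, exponent)
    subsample = bootstrap_without_duplicates(len(data), rng)
    # A full analysis would cluster data[subsample] into k groups here and
    # compare the result with the classification of the complete data set.
```

Lower exponents reduce the weight of dominant species relative to rare ones, which is why the same plots can fall into different clusters as the exponent changes.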