INTRODUCTION
Tea is one of the world’s most popular beverages. It is made primarily from the leaves and buds of the tea plant (Camellia sinensis L.) (Zhang et al., 2019). The optimum climate for growing tea is 19°C–23°C with 2,500–3,000 mm of rainfall per year. Shoot growth is greatly reduced when the temperature falls outside the 13°C–30°C range or the rainfall is less than 1,200 mm per year (Jayasinghe et al., 2019). Tea is therefore best suited to cultivation in tropical and subtropical areas. According to 2019 data, China is the world’s leading tea producer, accounting for nearly half of global tea production, followed by India, Kenya, and Sri Lanka (FAO, 2021).
Tea from different geographical areas frequently has a distinct flavor. Elevation is one of the factors that influence tea aroma. Although tea can be grown at elevations ranging from 0 to 2,200 m above sea level, higher elevations produce tea with a richer flavor because aromatic oils are more concentrated when the plant is grown under stress conditions (Jayasinghe et al., 2019). The disadvantage of cultivating in a stressed environment is a lower crop yield. The combination of distinctive taste and limited availability gives authentic tea of a specific origin a higher economic value, particularly in a globalized market. Several teas have received a protected geographical indication (PGI) certificate, which recognizes a regional specialty, such as Wuyi-Rock tea from Wuyi Mountain in the north of Fujian Province in China, Xihu Longjing tea from the Xihu district of Zhejiang Province in China, and Kangra tea from the Western Himalayas in northern India (European Commission, 2021).
As tea is a popular commodity with some products having a higher value than others, tea has been a target of counterfeiting or adulteration. Detecting counterfeit, mislabeled, or adulterated tea is a difficult task. The appearance of the fraudulent products is often identical to that of the originals, so they cannot be distinguished by appearance. Authentication is essential to ensure product quality and originality, which protects both consumers and legitimate producers. Elemental fingerprinting, defined by a unique pattern of elemental concentration in tea or tea infusion, has been used in many publications for tea authentication, especially concerning the geographical origin of tea. Due to a large amount of data available in multielemental fingerprinting, multivariate data analysis is required to help interpret the results. A suitable multivariate data analysis technique provides a powerful prediction for the data sets.
The purpose of this review is to discuss the recent applications of elemental fingerprinting for the authentication of tea. The most commonly used multielemental determination techniques and multivariate data analyses in the literature over the last two decades are highlighted.
METHODS
A literature search was performed between May 10 and 17, 2021, using the electronic database Web of Science and covering publications from 2000 to 2021. The search strings were “tea AND authentication AND element*”, “tea AND origin AND element*”, and “tea AND classification AND element*”. The abstracts of the search results were examined for potential inclusion of the articles. A total of 67 articles were further reviewed, and unsuitable articles were excluded. Articles were excluded if they did not use multivariate data analysis, did not use tea (C. sinensis L.) as the main object of the study, or were review articles. Two appropriate articles were included following a manual search of the reference lists of the included articles. A final total of 48 articles were used in this review study. Figure 1 depicts a flowchart of the selection process. A summary of the research articles is available in Table 1.
MULTIELEMENTAL DETERMINATION TECHNIQUES
Authentication of tea by means of elemental fingerprinting requires instruments that are suitable for high-throughput analysis. Inductively coupled plasma-mass spectrometry (ICP-MS) is the most popular technique for multielemental analysis in the authentication of tea (Table 1). Earlier studies employed inductively coupled plasma-optical emission spectroscopy (ICP-OES) for multielement determination. Flame atomic absorption spectroscopy (FAAS), X-ray fluorescence (XRF), and laser-induced breakdown spectroscopy (LIBS) have seen only limited application. ICP-MS outperforms the other instruments mainly because of its low limit of detection (LOD) and wider linear dynamic range, which allow many more elements to be determined simultaneously. The growing popularity and availability of ICP-MS for elemental determination have aided the development of this topic, which has accelerated markedly since 2017 (Fig. 2).
Figure 1. The flowchart of the literature search for this review.
Table 1. Summary of the articles included in this study.
The introduction of a sample into an ICP instrument requires a suitable system, which depends on the type of sample. A pneumatic nebulizer and spray chamber are typically used to introduce liquid samples. For elements that can be converted to volatile species, such as As and Hg, a hydride generator is an appropriate sample introduction system. A laser-ablation system can be coupled to an ICP instrument to allow the direct analysis of solid samples. The ICP efficiently atomizes and ionizes the sample that enters the instrument. Elements with an ionization energy of less than 8 eV are almost completely ionized, whereas metalloids and nonmetals may be only partially ionized.
In ICP-MS, the ions produced by the plasma are carried into a mass spectrometer, whereas in ICP-OES, the light emitted by excited atoms and ions in the plasma is measured by an optical emission spectrometer. Both instruments allow the rapid and simultaneous detection of multiple elements in a single analysis. ICP-MS can be equipped with a quadrupole-based, time-of-flight (TOF), or sector-field mass analyzer. Quadrupole-based ICP-MS is the preferred instrument for elemental quantification, where exact ion mass determination and high mass resolution are not required. It offers sufficiently fast analysis and high sensitivity at a lower cost than TOF and sector-field instruments. TOF is a faster instrument that performs better for transient signals (e.g., when coupled with laser ablation or HPLC). Sector-field ICP-MS offers higher resolution, and multicollector sector-field ICP-MS has superior precision, which makes it suitable for isotopic analysis. Further reviews of ICP-MS instrumentation are available elsewhere (Jakubowski et al., 2011; Meermann and Nischwitz, 2018). In ICP-OES, instruments can be differentiated by how the light emitted from the plasma is viewed. The conventional radial view examines the plasma from the side, while the axial view examines the plasma end-on, along its central axis. As more light is collected by the axial system, the sensitivity is higher, but the measurement suffers more from noise, which affects the precision. Thus, the selection of the instrument depends on the concentration of the analyte and the complexity of the matrices (de Souza et al., 2008).
Figure 2. The number of studies on the authentication of tea using elemental concentrations and multivariate data analysis from 2001 to mid-2021. The increasing number of studies was influenced by the higher utilization of ICP-MS instruments.
ICP-MS is superior to ICP-OES in terms of LOD. Zhang et al. (2019) validated multielemental analysis of 64 elements in tea via ICP-MS. The LOD was typically under 1 ng/ml, ranging from 0.0004 ng/ml (for Re) to 35.906 ng/ml (for Ca). By comparison, the LODs of 14 elements analyzed by ICP-OES were 0.001–0.045 µg/ml (McKenzie et al., 2010). Thus, ICP-MS is capable of analyzing rare earth elements (REEs) in tea, which are present in ultra-trace concentrations. ICP-MS, but not ICP-OES, can also distinguish the isotopes of an element, which serve as additional parameters for the authentication of tea (Liu et al., 2020a, 2020d). However, ICP-MS comes with higher purchasing and operating costs. A standard quadrupole ICP-MS costs around €150,000 to purchase, which is two to three times the cost of an ICP-OES. In addition, because of the high sensitivity of the instrument, a low-metal environment must be maintained, which raises operating costs through the need for a clean room, trace metal grade acids, and perfluoroalkoxy alkane vessels instead of glassware.
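To compare the two LOD ranges quoted above on a common scale, the ICP-OES values can be converted from µg/ml to ng/ml. The short Python sketch below performs only this unit conversion. The variable names are illustrative, and because the two studies measured different sets of elements, the comparison is indicative rather than element by element.

```python
# Convert the LODs quoted in the text to a common unit (ng/ml).
# 1 µg/ml = 1000 ng/ml.
UG_TO_NG = 1000.0

# ICP-MS LOD range reported by Zhang et al. (2019), already in ng/ml.
icp_ms_lod_ng_ml = (0.0004, 35.906)

# ICP-OES LOD range reported by McKenzie et al. (2010), in µg/ml.
icp_oes_lod_ug_ml = (0.001, 0.045)

# Rounding removes floating-point noise from the multiplication.
icp_oes_lod_ng_ml = tuple(round(v * UG_TO_NG, 6) for v in icp_oes_lod_ug_ml)

print("ICP-MS  LOD range (ng/ml):", icp_ms_lod_ng_ml)
print("ICP-OES LOD range (ng/ml):", icp_oes_lod_ng_ml)  # (1.0, 45.0)
```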
FAAS can be used in the authentication of tea despite its limitations. FAAS is much cheaper (around €20,000) and easier to operate than ICP instruments. Thus, FAAS is appropriate for routine analysis in a nonspecialized laboratory. However, it is not widely used for tea authentication because it takes a long time to analyze multiple elements. As a result, it is not cost-efficient and requires a large amount of sample for multielemental analysis. While ICP instruments usually require less than 1 g of tea sample, FAAS may need up to 10 g for the analysis of 14 elements (Ca, K, Mg, Na, P, Mn, Fe, Zn, Cu, Co, Cd, Cr, Ni, and Pb). Moreover, the LOD was 0.01–0.40 µg/g for the 14 elements analyzed, which was much higher than those of the ICP-based methods. Additional sample preparation procedures were required to compensate for matrix and spectral interferences in FAAS analysis, such as the addition of cesium chloride for K and Na measurements to increase analyte atomization and the addition of lanthanum(III) oxide for Ca and Mg measurements to dissociate the analyte from the matrix (Brzezicha-Cirocka et al., 2016).
ICP-MS, ICP-OES, and FAAS commonly require samples in solution form. Sample preparation prior to analysis is critical as it affects analytical performance and often takes a significant amount of time. Sample preparation may include sample size reduction, digestion, and dilution. Sample digestion is needed to release the analyte and eliminate the matrix. Microwave-assisted digestion is available for fast and efficient digestion. The digestion by the microwave digestion system required only 30 minutes of gradient heating (120°C for 10 minutes, 160°C for 10 minutes, and 180°C for 10 minutes) with a common digestion solution of 5 ml of 65% HNO3 and 1 ml of 30% H2O2 (Li et al., 2019). On the other hand, conventional wet digestion required the samples to be digested with a strong acid (such as HNO3) overnight at room temperature before adding H2O2 and heating to achieve complete digestion, ensuring a quantitative process (Meng et al., 2020). The latter remains a choice when the microwave digestion system is not available or is limited by the number of vessels.
XRF provides an alternative method for rapid multielemental analysis that eliminates the need for sample digestion. Sample preparation is normally nondestructive, involving only sample size reduction and homogenization (Lim et al., 2021; Rajapaksha et al., 2017). However, the method is limited by the concentration of the elements in the sample. Elements at concentrations below 3 µg/g exhibited a very high relative standard deviation (RSD) because of the low sensitivity of the method. Furthermore, elements with low atomic numbers, such as Na, are difficult to analyze using XRF (Lim et al., 2021). Because the fluorescence energy of light elements is low (around 1 keV), the fluorescence is mostly absorbed by the sample, and the detector is not sensitive enough to detect the remaining unabsorbed fluorescence. Additionally, the fluorescence undergoes scattering effects, such as Rayleigh and Raman scattering (Da-Col et al., 2015).
LIBS has recently been used for elemental analysis. LIBS can detect multiple elements simultaneously with a minimum sample preparation step. Prior to analysis, the sample is usually dried to achieve an accurate and precise result. The sample is then ground and compressed into a tablet. Although the accuracy and precision of LIBS are generally lower than those of XRF, LIBS has better performance for the analysis of lighter elements. In LIBS instrumentation, a laser is focused through a lens to generate intense plasma on the sample’s surface. The emitted light is collected and transmitted to a detector. The signal is sent to a personal computer for data acquisition and analysis. To observe the elements, the resulting spectra are compared to an established database, such as those from the National Institute of Standards and Technology (Wang et al., 2016). Aside from metal elements such as Ca, Fe, Al, Mn, Mg, K, and Si, molecular spectra of CN and C2 can be observed from the LIBS spectra generated from tea leaves (Wang et al., 2016; Zhang et al., 2018a).
MULTIVARIATE DATA ANALYSES
Multielemental determination for authentication studies generates a large amount of data. Multivariate data analysis is essential for pattern recognition, that is, for determining whether samples share enough similarities to be classified into a group. Multivariate data analysis can be categorized into unsupervised and supervised methods based on the algorithm used to build the models. Unsupervised methods do not make any assumptions on sample grouping prior to modeling, so the algorithm discovers the inherent pattern in the data. Supervised methods, on the other hand, assign the data to groups before constructing a model. The model then learns the pattern of each group through iterative prediction.
Principal component analysis (PCA) and hierarchical cluster analysis (HCA) are the most common unsupervised methods of multivariate data analysis used in tea authentication (Table 1). PCA generates principal components (PCs) from the variables to explain the variation in the data. The PCs that explain the most variance are usually plotted to visualize the clusters and the contributing variables. HCA assigns the data to clusters, commonly based on the straight-line (Euclidean) distance between the objects and Ward’s linkage method. The result of the analysis is visualized as a dendrogram (Moreda-Piñeiro et al., 2003). Because PCA and HCA are unsupervised techniques that seek natural patterns in data, they do not always separate the samples clearly enough for confident classification (Liu et al., 2020c; Zhang et al., 2018b). These techniques are therefore best used first to explore the data and find patterns for further statistical analysis.
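As an illustration of how PCA condenses an elemental data table, the following Python sketch computes PC scores, loadings, and explained-variance ratios with NumPy. It is a minimal sketch and not the procedure of any cited study: the autoscaling step, the function name pca_scores, and the concentration values are assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PC scores, loadings, and explained-variance ratios.

    X: (n_samples, n_elements) array of elemental concentrations.
    Columns are autoscaled (mean 0, unit variance) first, a common choice
    when elements span very different concentration ranges.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the scaled data: rows of Vt are the loading vectors.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components].T, explained[:n_components]

# Hypothetical concentrations (rows: tea samples; columns: three elements).
X = np.array([
    [10.0, 200.0, 1.0],
    [11.0, 210.0, 1.2],
    [9.5, 190.0, 0.9],
    [20.0, 400.0, 3.0],
    [21.0, 390.0, 3.1],
    [19.0, 410.0, 2.8],
])
scores, loadings, explained = pca_scores(X)
print("Explained variance ratio:", explained)
print("PC1 scores:", scores[:, 0])
```

In this made-up data set, the first three and last three samples fall on opposite sides of PC1, which is the kind of grouping a score plot is used to reveal.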
Linear discriminant analysis (LDA) is a widely used multivariate data analysis technique for authentication of tea. LDA is categorized as a supervised pattern recognition method because sample groups are assigned a priori. The dimensionality reduction is made by creating discriminant functions from a linear combination of the descriptors to maximize between-group variance while minimizing within-group variance (Ni et al., 2018). When developing an LDA model, one can use all available variables or perform a selection to eliminate variables that do not potentially influence clustering. Analysis of variance (ANOVA) can be used to eliminate elements that are not significantly different between groups, resulting in a more accurate prediction by LDA (Zhang et al., 2018b, 2020b). Liu et al. (2019) utilized the first few PCs of PCA to create an LDA model that distinguished Westlake Longjing tea from tea products from nearby regions in the same province. The resulting LDA model was satisfactory, with an accuracy of at least 97.6% in the training set and 87.8% in the testing set, despite the high similarity between samples (Liu et al., 2019). Another alternative is to use stepwise selection of the variables using Wilks’ lambda criterion to reduce the number of variables. The method is known as stepwise LDA, which produces a better discrimination model (Liu et al., 2021b; Ni et al., 2018).
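The idea of maximizing between-group variance relative to within-group variance can be sketched for the simplest case of two groups (Fisher's discriminant). The Python example below is a hedged illustration only: the function names, the two-element data, and the midpoint threshold are assumptions, and the cited studies used multi-class and stepwise variants that are not reproduced here.

```python
import numpy as np

def fisher_lda_direction(X_a, X_b):
    """Two-class Fisher discriminant: direction w and decision threshold."""
    X_a, X_b = np.asarray(X_a, float), np.asarray(X_b, float)
    mean_a, mean_b = X_a.mean(axis=0), X_b.mean(axis=0)
    # Pooled within-group scatter matrix S_w.
    S_w = (X_a - mean_a).T @ (X_a - mean_a) + (X_b - mean_b).T @ (X_b - mean_b)
    # w = S_w^{-1} (mean_a - mean_b) maximizes between/within variance.
    w = np.linalg.solve(S_w, mean_a - mean_b)
    # Threshold: midpoint of the two projected group means.
    threshold = 0.5 * (mean_a + mean_b) @ w
    return w, threshold

def predict(X, w, threshold):
    """Label 'A' when the projection lies on group A's side of the threshold."""
    return np.where(np.asarray(X, float) @ w > threshold, "A", "B")

# Hypothetical concentrations of two elements for teas of origin A and B.
X_a = [[5.0, 1.0], [5.5, 1.2], [4.8, 0.9], [5.2, 1.1]]
X_b = [[3.0, 2.0], [3.2, 2.1], [2.9, 1.8], [3.1, 2.2]]

w, threshold = fisher_lda_direction(X_a, X_b)
print(predict(X_a + X_b, w, threshold))
```

Because the model is evaluated here on the same samples used to build it, the result corresponds to a training-set classification rate; a separate testing set, as in the studies cited above, is needed to judge predictive ability.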
Other methods that have been frequently used in the authentication of tea are partial least squares-discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), and artificial neural networks (ANN). The PLS-DA algorithm focuses on the discrimination of classes in the model. On the other hand, SIMCA models individual classes and calculates the distance between unknown samples and the center of the class (Lim et al., 2021). ANN is a group of classification modeling techniques that employ variables as inputs to a “neuron.” These inputs are subjected to a mathematical function that yields values (outputs). The “neurons” are connected by nonlinear functions depending on the type of the method, such as back propagation-artificial neural networks (BP-ANN) and counter propagation-artificial neural networks (CP-ANN) (Marini, 2009). Reviews of ANN in food analysis and authentication are available elsewhere (Liang et al., 2020; Marini, 2009).
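The “neuron” described above can be sketched in a few lines of Python. The sigmoid activation, the weights, the bias, and the function name are assumptions chosen for illustration; actual BP-ANN or CP-ANN models combine many such units and learn the weights from training data.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical scaled concentrations of three elements for one tea sample,
# with arbitrary (untrained) weights and bias.
output = neuron([0.8, 0.1, 0.5], [1.5, -2.0, 0.7], -0.3)
print(output)  # a value between 0 and 1
```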
In building a classification model, overfitting can occur when too many irrelevant variables, referred to as noise, are incorporated into a model. The model closely fits the initial (training) data but will struggle to predict and classify other data as the generalization of the model is diminished. Some of the most common methods for selecting relevant variables are Pearson’s correlation and ANOVA. Pearson’s correlation can be used to select elements whose concentrations have a significant correlation in tea and soil to build a stronger classification model of tea based on geographical origin (Zhang et al., 2021). ANOVA can be applied to identify elements that differ significantly between groups, e.g., those of different origins (Zhang et al., 2018b).
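The ANOVA-based screening mentioned above can be illustrated with a one-way F statistic computed per element. The following Python sketch uses only the standard library; the function name and the concentration values are hypothetical, and the significance test itself (comparison of F with the F distribution, as done for instance by scipy.stats.f_oneway) is left out.

```python
def one_way_anova_f(groups):
    """F statistic for one element measured in several groups (e.g., origins)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical concentrations of two elements in teas from three origins.
element_1 = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]  # differs by origin
element_2 = [[10, 20, 30], [11, 21, 31], [12, 22, 32]]  # does not

f_1 = one_way_anova_f(element_1)
f_2 = one_way_anova_f(element_2)
print(f_1, f_2)
```

An element with a large F, such as the first one here, varies mainly between origins and is a candidate variable for the classification model, whereas an element with a small F mostly contributes noise.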
APPLICATIONS
Geographical origin
Many tea products have been certified based on their geographical origin as certain regions produce premium products. Multielemental analysis has been extensively applied to the authentication of tea based on its geographical origin. Green tea from Guizhou Province in China, which has a PGI certificate, displays a distinct multielement profile when compared to tea from other provinces. The analysis was carried out using LDA and orthogonal projection to latent structure-discriminant analysis (OPLS-DA) based on 31 elemental concentrations of mostly REEs and four stable isotopes of C, N, H, and O. Over a smaller geographical area, green tea collected from different counties in the same province shows clustered samples on LDA and OPLS-DA score plots (Liu et al., 2021b). The distinct elemental profile was also observed in the protected designation of origin product of Westlake Longjing (Westlake Dragon Well) green tea. The product was well separated from green tea products from the surrounding area and other provinces in China based on the random forest model generated from 17 elemental concentrations and 5 stable isotopes. The most important geographical proxies were Rb and Mg, which contributed to 20.58% and 12.5% of the prediction, respectively (Deng et al., 2020). Tea from various tea-growing regions in Sri Lanka, all of which were on the same island, was correctly classified by canonical discriminant analysis (CDA) of 13 elemental concentrations and two stable isotopes of C and N. PCA revealed that variations in elemental concentrations, primarily Rb, Mn, Cu, K, and Fe, had a higher contribution to the loading scores compared to 13C and 15N. Therefore, the stable isotopes of 13C and 15N did not have a strong influence on the provenance (Rajapaksha et al., 2017).
Systematic studies on paired soil and tea samples revealed a strong correlation between the concentrations of many elements in soil and tea. Zhao et al. (2017a, 2017b) analyzed tea and soil samples from different provinces in China. Of 20 elements, the concentrations of Na, Mg, Ca, Cr, Fe, Ni, Rb, Sr, Cd, Tl, and Pb in tea had a significant correlation with the elemental concentrations in the corresponding topsoil and subsoil (depths of 0–20 cm and 20–40 cm, respectively) (Pearson’s correlation, P < 0.01). Therefore, the concentration of the aforementioned elements in tea was affected by the soil parent material. On the other hand, the concentrations of Al, K, V, Mn, Co, As, Se, Mo, and Sn in tea did not significantly correlate with the elemental concentrations in soil (Zhao et al., 2017b). Zhang et al. (2020a, 2020b) used 16 elemental concentrations in tea and soil that were significantly correlated to distinguish tea from different areas in Guizhou Province (Pearson’s correlation, P < 0.05 or P < 0.01). The resulting PCA model generated five PCs, with the first PC explaining 71.2% of the total variance. Co, Mn, Tl, Y, P, and Pb contributed as the highest loadings on PC 1. Even when two tea cultivars were included, the stepwise LDA based on Sr, Cr, Pb, P, U, and Cd provided a 100% correct classification rate on geographical origin (Zhang et al., 2020a).
Besides the elemental concentrations in soil, altitude and soil pH influence the elemental concentrations in tea. Li et al. (2018) used Pearson’s correlation tests to determine the correlation of soil pH and altitude to the elemental concentrations in tea. The concentrations of Ni, Mn, Cu, and K in tea had a significant negative correlation with soil pH (Li et al., 2018). The tea plant thrives in acidic soil with a pH of 4–6.5. At lower pH, the trace element mobility from the soil to the plant is increased because of the lower metal adsorption and higher desorption. This factor may influence the correlation between soil pH and elemental concentrations in tea (Neina, 2019). Meanwhile, the altitude had a significant positive correlation with Zn, Mg, Cu, and K concentrations and a significant negative correlation with Ca concentrations in tea according to Pearson’s correlation (Li et al., 2018). The level of precipitation may have an effect on the correlation of altitude with metal concentrations in plants. Higher altitudes have more precipitation, which generally brings more metal to the soil. The metal is then absorbed by the plant, influencing the metal concentration in the plant (Zechmeister, 1995).
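A Pearson's correlation test of the kind described above can be sketched as follows. The Python example computes the correlation coefficient r and the associated t statistic with the standard library only; the function names and the soil pH and Mn values are hypothetical and are not taken from Li et al. (2018). The P value would be obtained from the t distribution with n − 2 degrees of freedom (for example via scipy.stats.pearsonr), which is not done here.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical paired observations: soil pH and Mn concentration in tea.
soil_ph = [4.0, 4.5, 5.0, 5.5, 6.0]
mn_in_tea = [900.0, 800.0, 650.0, 500.0, 400.0]

r = pearson_r(soil_ph, mn_in_tea)
t = t_statistic(r, len(soil_ph))
print(r, t)  # r is negative: Mn in tea decreases as soil pH increases
```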
Harvesting time
Aside from geographical origin, harvesting time affects the variability of elemental content in tea. Tea harvested in spring, summer, and autumn of the same year had a distinct REE fingerprint which lowered the correct classification rates of the stepwise LDA model for the geographical provenance. In comparison to geographical origin and variety, multiway ANOVA revealed that the variability of the levels of La, Ce, Pr, Nd, Sm, Eu, Gd, Dy, Er, and Yb in tea was mostly affected by season. Interestingly, the REE concentrations in the soil of the corresponding tea-growing locations did not vary by season (Zhao and Yang, 2019). It could be speculated that the seasonal variability of elemental contents was influenced more by climatic factors such as temperature, rainfall, and sunlight duration, which varied in spring, summer, and autumn (Zhao and Zhao, 2019).
Tea from different harvesting years was difficult to classify using elemental fingerprinting in combination with stable isotopes. The tea samples were collected from the same region by the same manufacturer and harvested in different years over a 5-year period. Tea from various years was clustered in the same area based on PCA, PLS-DA, LDA, and HCA. Although the elemental fingerprinting was only moderately successful, Mn, Zn, and Tl had a significant impact on the differentiation of tea from different production years. The authors argued that the elemental concentration in tea was affected by differences in annual climate and the use of different fertilizers (Liu et al., 2020a, 2020c, 2020d).
Cultivars
The elemental fingerprint of tea infusion was distinct from that of other herbal infusions (other than Theaceae) as modeled by PCA of 13 elements (Winkler et al., 2020). When grown in a research garden, eight different tea cultivars could be distinguished from each other with 100% accuracy using HCA, LDA, and back propagation-neural network (BP-NN). The research garden had a regulated environment, whereby the elemental concentrations, pH, and percentage of organic matter in the soil were all comparable. Mn and Al were the most important factors in distinguishing the cultivars as revealed by LDA (Chen et al., 2009). Similarly, PLS-DA of either 98 or 261 LIBS spectral peaks representing C, Fe, Mg, Mn, Al, and Ca clearly separated six cultivars each with 100 tea samples. All of the samples were acquired from the same research institute (Zhang et al., 2018a).
In real-world samples, cultivars had a minor influence on the elemental fingerprints. Among the 12 REEs, only Er and Yb differed significantly between cultivars (Zhao and Yang, 2019). Another study found that the concentrations of only 2–4 elements (among As, Cr, Cu, Pb, Sb, and Zn) out of the 24 elements analyzed significantly differed between tea cultivars, based on the ANOVA test. The stepwise LDA model of 17 elemental concentrations in tea was applied to classify tea grown in different geographical regions within a province. Although each region has 2–3 tea cultivars, the stepwise LDA model successfully classified tea based on geographical origin with a 94.25% discrimination rate (Zhang et al., 2021). Thus, the geographical origin was a stronger determinant for the classification of tea compared to cultivar.
Types
Various types of tea that are produced by different postharvesting processes are available in the market. Tea is typically oxidized at controlled temperature (20°C–30°C) and humidity (95%–98%), which are optimized to produce consistent quality and optimal yield (Pou, 2016). The level of oxidation, mostly by enzymatic processes, determines the tea types produced. White tea is made by drying the buds and leaves as soon as they are harvested, allowing for the least amount of oxidation. Green tea is minimally oxidized, whereas black tea undergoes full oxidation before being stored in a cool and humid environment. To produce Puerh tea, secondary fermentation is applied to black tea after full oxidation and heating (McKenzie et al., 2010; Minca et al., 2015).
LDA successfully distinguished green and black tea using Zn, Mn, Fe, Mg, Cu, Ti, Al, Sr, Ca, Ba, Na, and K with a 98.7% accuracy (Fernández-Cáceres et al., 2001). Similarly, Wang et al. (2016) used the LIBS technique in combination with discriminant analysis to classify green, black, white, and Puerh tea to achieve a minimum of 92% classification accuracy. However, LDA showed lower sensitivity and specificity (64%–100%) than classification by a probabilistic neural network (PNN) (93%–100%) when 10 selected elements (Al, Ba, Ca, Cu, Fe, Mg, K, Sr, S, and Zn) were used to classify black, green, and Puerh tea. In this case, a nonlinear approach of PNN showed better suitability than a linear model of LDA to study the relationship between elemental fingerprints and tea types (McKenzie et al., 2010). Different levels of oxidation, however, did not always result in a distinct elemental profile for each type of tea. PCA was frequently unsuccessful in distinguishing tea types (Diniz et al., 2015; Paz-Rodríguez et al., 2015; Pohl et al., 2020). Another study successfully classified green, black, and Puerh tea by stepwise LDA based on 12 elements (Li, Be, Cr, Co, Cu, Zn, Cd, Pb, Mn, Mg, Sc, and Ce). However, because each tea type originated in a different province, the geographical origin of the tea also influenced its classification (Ma et al., 2019).
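Sensitivity and specificity, the figures of merit used above to compare LDA with PNN, are computed per class from the true and predicted labels. The Python sketch below shows the calculation; the function name and the ten labels are hypothetical and do not reproduce the data of McKenzie et al. (2010).

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity) for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical true and predicted tea types for ten samples.
y_true = ["green"] * 4 + ["black"] * 4 + ["puerh"] * 2
y_pred = ["green", "green", "green", "black",
          "black", "black", "black", "black",
          "puerh", "black"]

for tea_type in ("green", "black", "puerh"):
    print(tea_type, sensitivity_specificity(y_true, y_pred, tea_type))
```

In this example, green tea has a sensitivity of 0.75 (one green sample is misclassified) and a specificity of 1.0 (no other tea is labeled green), which shows why both values are reported for each class.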
CONCLUSION
Many studies have reported on the authentication of tea using elemental analysis in combination with multivariate data analysis. ICP techniques, both ICP-MS and ICP-OES, have remained the methods of choice for elemental fingerprinting due to their high sensitivity, sufficient accuracy and precision, and high-throughput capability. In fact, ICP-MS has recently become more accessible, and its availability has helped advance this field. Various multivariate data analyses, with PCA and LDA being the most popular methods, were optimized and implemented for the authentication studies. Elemental fingerprinting has considerable potential as a tool for tea authentication, especially in distinguishing the geographical origin of tea. The method can be further applied to the authentication of other high-value plants, such as saffron, white truffle, specialty coffee, and medicinal plants (e.g., American ginseng (Panax quinquefolium L.), Indian snakeroot (Rauvolfia serpentina), and turmeric (Curcuma longa L.)). In the case of medicinal plants, geographical origin may influence the level and proportion of active substances, thereby influencing biological activities.
While ICP-MS remains the most popular method for multielemental determination for tea authentication, the lengthy preparation procedure is a limiting step for high-throughput analysis. The development of faster preparation methods, possibly using a laser-ablation technique, will further accelerate the progress in this field. XRF and LIBS, which require less sample preparation than ICP-MS, can be improved to increase sensitivity and precision, thereby expanding the number of elements that can be analyzed using these methods.