Vibrational spectroscopy in combination with chemometrics as an emerging technique in the authentication of coffee: A review

Coffee enthusiasts now consume coffee not only to relieve drowsiness but also as part of a lifestyle. Sizeable annual consumption and high export demand can trigger shortages of coffee stocks among suppliers, and such shortages have pushed some producers toward fraudulent practices such as coffee counterfeiting. Given the considerable economic gains from substituting or adulterating coffee, the development of authentication methods is needed to counter this practice. Chemometric methods, including pattern recognition and multivariate calibration, can be combined with fingerprint analysis by Fourier transform infrared (FTIR) spectroscopy to authenticate coffee products. The use of chemometrics is unavoidable because of the large amount of data obtained even from a single FTIR scan. Chemometric methods are commonly applied to build classification and prediction models of adulterants in coffee. The objective of this review is to feature the application of infrared (IR) spectroscopy and chemometric analysis to authenticate coffee against various adulterants.


INTRODUCTION
Coffee has become an essential food commodity for countries' economies in production and export (Sezer et al., 2018). Global coffee output was about 172 million bags in 2020/21, of which Robusta (Coffea canephora) and Arabica (Coffea arabica), the main commercialized species, represented 41% and 59%, respectively (Couto et al., 2022; Dias and Benassi, 2015). The world's coffee consumption in 2020/2021 increased by 1.9% compared to that in 2019/2020, whereas coffee production increased by only 1.4% (ICO, 2021). Differences in demand and production levels have caused coffee prices to rise. As a result, coffee adulteration for profit is more common than ever (Cai et al., 2015). Adulteration is carried out to reduce production costs. Coffee counterfeiting involves substituting high-quality coffee beans with lower-quality ones, adding other ingredients to coffee blends to make them less expensive, or adding certain chemicals such as sibutramine (SIB) to coffee. Raw Arabica and Robusta beans can be easily distinguished by differences in size and color by a qualified examiner. However, these visual cues are lost during roasting and milling (Barbin et al., 2014). Ground roasted coffee has a physical appearance that is easy to reproduce, making it the target of fraudulent admixtures with lower-quality coffee beans (Reis et al., 2013a). Various factors affect the end product of coffee, including the grade of coffee beans, roasting level, and grinding process. The addition of adulterants is challenging to detect, especially after roasting and grinding; therefore, identifying adulterants in coffee becomes very complex (Monteiro et al., 2018). For this reason, to ensure coffee quality and authenticity, it is necessary to develop and standardize reliable analytical methods with high sensitivity, reliability, traceability, and comparability of results (de Morais et al., 2018; Milani et al., 2020).
Coffee is mainly authenticated through three analytical approaches, namely, physical, chemical, and biological approaches, using different conventional and modern (instrumental) analytical methods. The conventional methods typically used in analytical laboratories for detecting adulteration in roasted and ground coffee are based on physical properties such as moisture content, mineral residues, extractable substances, and optical and electron microscopy (Milani et al., 2020; Toci et al., 2016). The modern techniques which have more recently provided reliable and reproducible authentication results include spectroscopic-based methods such as multispectral imaging (Calvini et al., 2017), proton transfer reaction-mass spectrometry (MS) (Monteiro et al., 2018), nuclear magnetic resonance spectroscopy, and vibrational spectroscopy (VS) [near-infrared (NIR), mid-infrared, and Raman] (de Araújo et al., 2021; Forchetti et al., 2020). Capillary electrophoresis (Daniel et al., 2018), gas chromatography (Pua et al., 2019), and high-performance liquid chromatography (Núñez et al., 2021) were all reported to successfully distinguish authentic coffee from adulterated coffee. Additionally, biological approaches using DNA-based methods like polymerase chain reaction (PCR) offered specific and sensitive results. PCR has successfully detected barley, maize, and rice used to adulterate coffee (Ferreira et al., 2016). Among all techniques, VS appears to be the most widely used technique for food authentication purposes. Barbin et al. (2014) published a review that focused on the impact of IR spectral approaches on coffee quality and compositional parameters. That review does not emphasize the use of chemometrics, does not focus on coffee authenticity issues, and is regarded as outdated. Munyend and Njoroge (2022) have reviewed the application of spectroscopy for coffee authentication. However, like the prior review, the authors did not highlight chemometrics.
Owing to the rapid development of sophisticated VS instruments, a massive amount of data can be collected. Chemometric procedures are required for further data interpretation, which is why VS and chemometric processes are inextricably linked. To the best of the author's knowledge, there is no other review highlighting the use of VS in conjunction with chemometric analysis for coffee authenticity.
The application of molecular spectroscopy and chemometrics to authenticate coffee-based products was emphasized in this review. This review provides the latest and detailed information about the usage of VS coupled with chemometrics for coffee authentication purposes. This study thus gives an overview of the use of the IR spectral approach (in combination with chemometrics) as a valid procedure for assessing coffee composition and quality characteristics and classifying coffee samples from various types and quality grades. The author hopes that this paper can be useful as a resource material or information provider to anyone interested in the subject.

METHODS
Various databases, namely, PubMed, Scopus, and DOAJ, were explored to find the references used in this review. The keywords employed during literature searching were coffee, authentication, IR spectroscopy, Raman spectroscopy (RS), adulteration, and analytical methods. The Boolean operators AND, OR, and NOT were used to explore references effectively. Following a manual review of the reference lists of the retrieved articles, some additional qualified articles were also included. The references found were then screened for duplicates and critically assessed before being used.

Vibrational spectroscopy
VS, including near-infrared spectroscopy (NIR), mid-infrared spectroscopy (MIR), hyperspectral imaging, and RS, measures the absorption of IR radiation by the analyte(s) of interest due to the presence of chemical bonds that are IR active. The same functional groups present in the molecular structures of different compounds tend to absorb IR radiation in the same frequency range regardless of the rest of the molecular structure in which the functional group is located. This principle is used to identify the structure of an unknown molecule (Haughey et al., 2015; Rohman and Man, 2012). The specific interaction and binding behaviors between functional groups are distinctive and can be useful for fingerprint analysis (Kucharska-Ambrożej and Karpinska, 2019). Fingerprint profiling techniques such as VS, liquid chromatography-MS/MS, or DNA-based methods are currently applied to detect the adulteration of food products, including in coffee authentication. The VS method is frequently used for coffee authentication and quality control as a sensitive and rapid analytical tool (Bázár et al., 2016; Chen et al., 2019; Lohumi et al., 2015). VS methods require only simple sample preparation and minimal solvents and reagents, which supports the application of green analytical methods in chemical analyses (Moros et al., 2010). VS is also considered a non-destructive technique, so samples analyzed using VS can be further analyzed with other instrumental techniques such as chromatography (Bassbasi et al., 2013; de Marchi et al., 2014). Due to the extensive data set generated during measurement with VS, data treatment using chemometric approaches is typically applied. The combination of VS and chemometrics has been successfully employed in authenticating food products, including squeezed orange juice, meat, beeswax, and honey (Ferreiro-González et al., 2018; Maia et al., 2013; Shen et al., 2016; Yang et al., 2018).
Coffee authentication using VS can be approached through targeted and untargeted analysis of adulterants. Targeted analytical methods focus on finding specific adulterants known as potential materials for adulterating coffee. The commonly targeted adulterants in coffee include barley (Ebrahimi-Najafabadi et al., 2012; Ferreira et al., 2016), coffee grounds, coffee husks, and coffee sticks (de Morais et al., 2018), corn/maize (Monteiro et al., 2018), soybeans (Daniel et al., 2018), chickpea and rice (Sezer et al., 2018), oat (Flores-Valdez et al., 2020), or even Robusta coffee (Adriansyah et al., 2021). On the other hand, untargeted analytical methods aim to detect whatever adulterants are present in a coffee sample. Targeted and untargeted analytical methods each have their advantages. Targeted modeling provides more detailed information about an adulterant. Targeted analytical methods present several responses or assignments, while untargeted modeling serves as a more restrictive model with one response (a sample fits the model or not) (López et al., 2013). Targeted modeling, however, needs standardized and validated operating procedures, chemicals, analysis, and statistical models. In contrast, non-targeted approaches can be used even for samples with simple or no preparation (Esslinger et al., 2014).

Chemometrics
Fingerprint features in VS contain a concurrent determination of various components of the samples in a single test. The main advantage of VS over other multicomponent methods is the ability to analyze samples without any treatment (Cuadros-Rodríguez et al., 2016). Even from a single measurement, the data gathered from VS are rather complex and are considered big data; therefore, a special statistical treatment called chemometrics is needed. Chemometrics is the study of statistical or mathematical processes used to analyze measurements in order to extract as much information as possible from chemical data. Chemometrics is defined by The International Chemometrics Society as "the science of relating chemical measurements made on a chemical system to the property of interest (such as concentration) through the application of mathematical or statistical methods" (Rohman and Windarsih, 2020). Nowadays, chemometrics has become necessary for analyzing data from vibrational-based instruments (Peris-Díaz and Krężel, 2021; Xu et al., 2020). VS combined with chemometrics provides specific analysis and classification or discrimination methods. Dimensional overload, collinearity, spectral noise, and spectral interference all call for this combination when quantifying and determining adulterants in samples (Cuadros-Rodríguez et al., 2016).
Chemometrics is mainly based on applying empirical models intended to build predictive models for qualitative (classification) or quantitative (calibration) purposes. The experimental measurements could provide extensive data containing much information, allowing the analyst to predict one or more properties of interest. The same data could be treated with different chemometric techniques. Consequently, selecting appropriate chemometric models under investigation and verifying model reliability are fundamental aspects. The chemometric strategies for performing these tasks are collectively referred to as validation. Validation is intended to assess whether chemometric modeling can generate dependable conclusions (Brereton et al., 2017). During the validation process, including some criteria is suggested such as the appropriateness of the chemometric model, the adequacy of computational calculations used in the fitting procedure, statistical reliability of the models, and the generalization of any resulting interpretations (Westad and Marini, 2015).
Chemometric methods are grouped into two groups based on their development, namely, (1) traditional chemometrics including k-nearest neighbor (k-NN), linear discriminant analysis (LDA), one-class partial least squares (OC-PLS), partial least squares-discriminant analysis (PLS-DA), quadratic discriminant analysis, and soft independent modeling by class analogy (SIMCA) and (2) machine learning methods including artificial neural networks, classification and regression trees, naïve Bayes, random forest, and support vector machine (SVM) (Cuadros-Rodríguez et al., 2016). Depending on their objectives, chemometric methods are grouped into three types: (1) processing techniques used to enhance the information available from spectra such as normalization, baseline corrections, and centering, (2) classification chemometrics which can be in the form of exploratory data analysis and unsupervised pattern recognition methods such as cluster analysis and supervised pattern recognition such as discriminant analysis, and (3) regression methods involving multivariate calibrations which link vibrational spectra to quantifiable properties of analytes such as concentration (Moros et al., 2010).
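To illustrate the processing techniques named above, the following is a minimal Python sketch of two common spectral pretreatments: standard normal variate (SNV) scaling, which is one form of normalization, and mean centering. The function names and the toy absorbance values are hypothetical and are not taken from any of the reviewed studies; in practice, dedicated chemometric software would normally be used.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    return [(x - m) / s for x in spectrum]

def mean_center(spectra):
    """Subtract the column-wise (per-variable) mean from every spectrum."""
    n_vars = len(spectra[0])
    col_means = [mean(row[j] for row in spectra) for j in range(n_vars)]
    return [[row[j] - col_means[j] for j in range(n_vars)] for row in spectra]

# Toy absorbance values (hypothetical): the second spectrum is the first one
# multiplied by 2, mimicking a purely multiplicative scatter effect.
raw = [
    [0.10, 0.20, 0.40, 0.30],
    [0.20, 0.40, 0.80, 0.60],
]
snv_spectra = [snv(s) for s in raw]   # row-wise treatment
centered = mean_center(raw)           # column-wise treatment
```

In this toy case, SNV makes the two spectra coincide because they differ only by a scaling factor, while mean centering leaves every variable with a column mean of zero.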
To assess the performance of the developed models, some diagnostic parameters based on model parameters or the statistical calculation of residuals (the difference between actual and predicted values) are often used as error criteria. Validation of chemometric models can be performed using two approaches, internal validation (cross-validation) and external validation. Cross-validation is required to avoid overfitting the model. It is based on repeatedly resampling the dataset into training and testing subsets. In multivariate calibration models, this validation is typically done using the leave-one-out technique: one of the calibration samples is removed from the calibration set, and the remaining calibration samples are used to create a new calibration model. The value of the excluded sample is then predicted using the newly generated model. The procedure is repeated, removing one calibration sample at a time. The statistical performance is then evaluated to determine whether the model is reliable enough. Cross-validation is suitable if the number of evaluated samples is small and there is no possibility of building an external test set. The main disadvantage of this validation is that the resulting estimates could still be biased because the calibration and validation datasets are never entirely independent. In external validation, a second dataset, independent of the dataset used in the calibration model, is used. In this approach, the residuals (test set validation) are calculated from independent samples, which mimics how the developed model will be routinely used. Therefore, this strategy is recommended whenever possible (Biancolillo and Marini, 2018). It is suggested that the validation approach be selected based on the sample size. When the dataset or number of samples is small (less than 50), cross-validation is preferred, while external validation is typically used if the number of samples is more than 50 (Kos et al., 2003).
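The leave-one-out procedure described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a univariate least-squares line stands in for a multivariate calibration such as PLS, and the absorbance and adulterant values are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def loo_rmsecv(x, y):
    """Leave-one-out cross-validation: refit without sample i, then predict sample i."""
    squared_errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]   # calibration set without sample i
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        squared_errors.append((a + b * x[i] - y[i]) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical data: absorbance at one band vs. adulterant level (% w/w)
absorbance = [0.11, 0.19, 0.32, 0.41, 0.48, 0.62]
adulterant = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
rmsecv = loo_rmsecv(absorbance, adulterant)
```

Each sample is predicted by a model that never saw it, so the resulting RMSECV is a less optimistic error estimate than the error computed on the calibration samples themselves.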
To evaluate the classification chemometrics, some performance characteristics including sensitivity, specificity, accuracy, and model efficiency are used (Oliveri and Downey, 2012; Oliveri et al.). Other important parameters used for classification are the number of misclassifications (NMC) and the reliability rate. Statistical parameters typically used for the performance characteristics evaluation during the validation of analytical methods involving multivariate calibrations are the coefficient of determination (R²) for the relationship between two variables, actual values on the x-axis and predicted values using specific instruments (accuracy). The precision of validated analytical methods is assessed by root mean square error of calibration (RMSEC) for error evaluation in the calibration model and root mean square error of prediction (RMSEP) for error evaluation in the prediction model. The following formulae are used to obtain RMSEC and RMSEP:

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{M} \left(\hat{Y}_i - Y_i\right)^2}{M}}$$

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{N} \left(\hat{Y}_i - Y_i\right)^2}{N}}$$

M and N are the numbers of samples used in calibration and validation, respectively; $\hat{Y}_i$ is the predicted value, while $Y_i$ is the actual value. Basically, $(\hat{Y}_i - Y_i)$ is the residual. In the case of cross-validation using the leave-one-out technique, RMSEC is replaced with RMSECV.
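As a worked illustration of these performance characteristics, the sketch below computes sensitivity, specificity, accuracy, and the number of misclassifications from the counts of a two-class confusion matrix, together with a root mean square error. The definitions used are the widely adopted ones (for example, sensitivity as true positives divided by all actual positives) and are assumed here rather than copied from the cited works; the counts and values are hypothetical.

```python
from math import sqrt

def classification_metrics(tp, fn, tn, fp):
    """Common two-class figures of merit from confusion-matrix counts.

    tp/fn: target-class samples accepted/rejected by the model
    tn/fp: non-target samples rejected/accepted by the model
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "n_misclassified": fn + fp,
    }

def rmse(predicted, actual):
    """Root mean square error; gives RMSEC, RMSECV or RMSEP depending on the data used."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Hypothetical example: 20 authentic and 30 adulterated samples
metrics = classification_metrics(tp=18, fn=2, tn=27, fp=3)
# Hypothetical predicted vs. actual adulterant levels (% w/w)
error = rmse([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

With these counts, the sensitivity, specificity, and accuracy are all 0.9 and five samples are misclassified; the same `rmse` function applies to calibration, cross-validation, or prediction residuals.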

Coffee authentication using MIR spectroscopy combined with chemometrics
Attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR) and diffuse reflectance Fourier transform infrared spectroscopy (DRIFTS) are used as rapid analytical methods in food authentication analysis, including for different types of coffee. Chemometric approaches such as principal component analysis (PCA) and LDA were used to determine and discriminate the presence of spent coffee grounds, barley, and corn from pure roasted coffee. All samples (adulterated and unadulterated coffee and each adulterant sample) are separated clearly by LDA modeling with accuracy levels of 100%, although there is an overlap between barley and corn samples. The absence of starch in coffee, by-products in barley or corn, and the different contents of oils and caffeine are all associated with the discrimination between coffee and its adulterants according to the PCA results (Reis et al., 2013b). Using LDA, the samples are separated into six groups according to their classes: pure coffee, adulterated coffee with adulterant levels as low as 1/100 g, spent coffee grounds, coffee husks, corn, and barley.
Combined with PCA and LDA, DRIFTS has been developed as an analytical tool for detecting adulteration in roasted and ground coffee. In this study, a PCA score plot based on PC1 and PC2 applying the first derivative FTIR spectra variable at wavenumbers of 3,200-700 cm⁻¹ showed the best classification profiles compared to average spectra and FTIR spectra subjected to normalization and baseline correction. The method can classify ground coffee, coffee husks, and corn samples. LDA classification models in this research also provide complete discrimination between four groups of samples (authentic coffee, corn, coffee husks, and coffee adulterated with coffee husks, corn, or both with adulterant levels of 5%-50%) (Reis et al., 2013a). However, the validation datasets for confirmation of the developed classification models of authentic and adulterated coffee are not found.
The combination of PLS and principal component regression with ATR-FTIR spectroscopy has been applied to predict adulterant levels of non-Kona coffee in Kona coffee, a premium coffee in the USA. ATR-FTIR spectra are subjected to transformations of mean centering (MC) and first and second derivatization, and the performance of the models using these transformed FTIR spectra is compared. Some wavenumber regions are also assessed for the optimization process. Finally, PLS using second derivative spectra at wavenumbers of 800-1,900 cm⁻¹ could predict non-Kona coffee as an adulterant accurately and precisely, as indicated by high R² values (> 0.999) and a low standard error of calibration of 0.81 (Wang et al., 2009). The calibration model was also validated using commercial Kona coffee blends. Furthermore, the combination of second derivative FTIR spectra at wavenumbers of 800-1,900 cm⁻¹ and canonical analysis could discriminate Kona coffee from different regions on Hawaii Island. The authentication of the geographical origin of Arabica and Robusta coffee in China was also carried out by applying the combination of mid-IR transmittance spectroscopy and the classification chemometrics of PCA, k-NN, PLS-DA, SIMCA, SVM, backpropagation neural network (BPNN), and radial basis function neural network (RBFNN). Using the whole FTIR spectral region (4,000-400 cm⁻¹), both BPNN and RBFNN provide the best classification models with accuracy levels of 100% in both training and test datasets (Zhang et al., 2016).
The first derivative FTIR spectra at wavenumber regions of 3,800-2,800 and 1,800-500 cm⁻¹ combined with classification chemometrics of PCA and LDA have been used to facilitate the investigation and authentication of commercial coffee according to global quality (divided into gourmet, inferior, superior, or traditional classes). PC1 and PC2 accounted for 80.1% of the data variability, and based on the loading plot, it is suggested that carbohydrates, esters and lipids, caffeine, and chlorogenic acids (CGA) contribute more to group differentiation. However, PCA does not offer substantial evidence of any categories among the studied coffee samples; therefore, LDA, a supervised pattern recognition algorithm, was used. LDA using the same conditions used in PCA can provide sensitivities of 83%-100% and specificities of 93%-100%, showing that the developed model is reliable for commercial coffee authenticity (Silva et al., 2021). Cebi et al. (2017) developed FTIR spectroscopy and chemometrics of hierarchical cluster analysis (HCA) and PCA for the detection of SIB in dietary supplements. This technique offers several advantages including being cost-effective, rapid, easy, non-destructive, and environmentally friendly. The variables used for modeling are FTIR spectra at wavenumbers of 2,746-2,656 cm⁻¹ through the Euclidean distance and Ward's algorithm. The results show that the presence of active drugs that are not allowed in herbal products, including coffee, could be successfully determined at levels of 0.375-12 mg in a total of 1.75 g with acceptable validation criteria.
Data fusion (DF) of FTIR spectra obtained by ATR and diffuse reflectance (DR) sampling, in combination with PLS-DA, has been developed as a dependable technique for the discrimination of ground roasted coffee from different adulterants (spent coffee grounds, roasted coffee husks, roasted corn, and roasted barley). Compared to each sampling technique alone, DF improves the discrimination models used for detecting and identifying multiple adulterants in roasted and ground coffee. DF models could also detect samples blended with adulterants, even when four different adulterants were mixed. Across the training and test sets, the application of DF decreased the percentage of misclassified samples. The PLS-DA model is also successfully used to verify whether unknown samples can be separated according to the adulterant types present in the authentic coffee (Reis et al., 2017).
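One common way to fuse data from two measurement techniques is low-level fusion, in which the scaled data blocks are simply concatenated sample by sample before modeling. The sketch below illustrates this idea only; the text above does not state which fusion level or scaling the cited study used, so the autoscaling step, the function names, and the toy values are all assumptions.

```python
from statistics import mean, pstdev

def autoscale(block):
    """Scale every column (variable) of one data block to zero mean and unit SD."""
    n_vars = len(block[0])
    columns = [[row[j] for row in block] for j in range(n_vars)]
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose SD would be zero
    sds = [pstdev(col) or 1.0 for col in columns]
    return [[(row[j] - means[j]) / sds[j] for j in range(n_vars)] for row in block]

def low_level_fusion(block_a, block_b):
    """Concatenate two autoscaled blocks sample by sample (rows must match)."""
    if len(block_a) != len(block_b):
        raise ValueError("both blocks must contain the same samples")
    scaled_a, scaled_b = autoscale(block_a), autoscale(block_b)
    return [row_a + row_b for row_a, row_b in zip(scaled_a, scaled_b)]

# Hypothetical absorbances for three samples measured by two sampling techniques
atr_block = [[0.10, 0.30], [0.20, 0.50], [0.40, 0.60]]
dr_block = [[1.0, 2.0, 3.0], [1.5, 2.5, 2.0], [2.0, 4.0, 1.0]]
fused = low_level_fusion(atr_block, dr_block)
```

The fused matrix keeps one row per sample and gains the variables of both blocks, so it can be passed to a classifier such as PLS-DA in the same way as a single-technique data set.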
FTIR spectroscopy with photoacoustic detection (FTIR-PAS) using the wavenumber region of 4,000-600 cm⁻¹ combined with PCA and PLS-DA was developed to discriminate coffee blends based on differences in coffee species, types, and number of defects. PCA allowed the prediction of the amount/fraction and kind of defects in coffee blends, while PLS-DA could discriminate the samples according to their classes. Based on the loading plot, bands at 3,000-3,600 cm⁻¹ contribute more to PC1 and PC4. The peak at 1,067 cm⁻¹ comes from pyruvic acid, pyridine, and quinic acid, while the peak at 3,356 cm⁻¹ is specific to CGA. Combining these techniques has proven to be an easy, fast, and green solution as a quality control tool for roasted and ground coffee (Dias et al., 2017).
DRIFTS coupled with PLS was employed for the prediction of adulterant levels (spent coffee grounds, coffee husks, roasted corn, and roasted barley) in roasted and ground coffee. Some wavenumber regions or their combinations and spectral preprocessing methods were compared to find the conditions providing the best prediction model. The optimized models were obtained using spectra at combined wavenumbers of 3,200-2,730 and 1,800-700 cm⁻¹ previously subjected to standard normal variate (SNV) transformation and MC. Under these conditions, R² values for the accuracy evaluation were > 0.99 with RMSEC of 1.96 and RMSEP of 3.74% using 10 latent variables. This indicates that DRIFTS and PLS together can detect and quantify some interfering substances in ground roasted coffee, with adulteration rates ranging from 1% to 66% w/w (Reis et al., 2013c). Furthermore, the same authors (Reis et al., 2016) compared two methods (DRIFTS and FTIR-ATR) combined with PLS models to identify the adulterants present in ground and roasted coffee. FTIR-ATR provides a better model for predicting adulteration levels in the range of 0.5% to 40%. The application of MIR spectroscopy combined with chemometrics to authenticate coffee has been summarized in Table 2.
Authentication of coffee using near-IR spectroscopy
NIR spectroscopy combined with multivariate techniques can be developed into a reliable method for detecting coffee adulteration. Pizarro et al. (2007) used NIRS and multivariate calibration to quantify the levels of the Robusta variety in roasted coffee samples from varied origins (36 Arabica and 47 Robusta coffees). NIR spectra were subjected to preprocessing using OWAVEC (special software designed by the author's research group) and then subjected to partial least squares regression (PLSR) for quantitative modeling. In addition, some preprocessing techniques, including mean centering, first derivative, and orthogonal signal correction (OSC), were also used to meet the essential needs of PLSR modeling. The calibration model using NIR spectra subjected to OWAVEC preprocessing provided high-quality results, with R² for the correlation between actual and predicted values of Robusta variety contents of > 0.999 and RMSEP of 0.79%. This method can detect and quantify potential coffee adulterations, although successful modeling critically depends on the signal preprocessing methods applied.
Civet coffee's high prices in coffee markets have attracted unethical players to adulterate it with low-price coffee or other cheaper additives (Adriansyah et al., 2021). NIRS in combination with full-spectrum (FS)-PLSR has been successful for quantitative analysis of the adulteration degree of ground roasted coffee samples in civet coffee in the concentration range of 0%-51%. Spectral data were scanned at wavelengths of 1,300-2,500 nm. The samples were divided into calibration (84 samples) and validation (42 samples) datasets during modeling. FS-PLSR provided an acceptable model with R² values for the correlation between actual and NIRS-predicted values of 0.96 (for calibration) and 0.92 (for validation). The accuracy of the developed method is good, as indicated by the low RMSEP value of 4.67. This result confirmed that the developed method could be applied as a non-destructive authentication system for civet coffee (Suhandy et al., 2018).
With the development of miniature NIR instrumentation, portable micro-NIR was successfully applied as an effective means of quality control of Arabica coffee against adulteration by identifying and quantifying Robusta coffee (at different roasting levels) as an adulterant. This method also analyzes Arabica coffee adulterated with inexpensive ingredients typically added to coffee, such as corn, peels, and sticks. To perform these tasks, PCA and PLS were used to treat NIR spectra at combined wavelengths of 900-1,000, 1,100-1,200, and 1,400-1,500 nm previously subjected to Savitzky-Golay derivatization. PCA could classify samples according to their level of adulterants. In addition, PLS could predict the level of adulterants with limit of quantification values of 5-8 wt% with acceptable accuracy and precision, as indicated by high R² values ranging from 0.9732 to 0.9925 and low values of RMSEP (2.8 wt%). This method saves time and sample preparation, allows efficient real-time data acquisition, and can cover a significant variability of adulterants in high-quality coffee (Correia et al., 2018).
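The Savitzky-Golay derivatization mentioned above can be illustrated with a minimal Python sketch of a first derivative computed with a 5-point, second-order filter, whose convolution weights are (-2, -1, 0, 1, 2)/10 for unit point spacing. The window size and polynomial order are assumptions chosen for brevity and are not the settings of the cited study; the toy signal is hypothetical.

```python
def savgol_first_derivative(y, spacing=1.0):
    """First derivative with a 5-point, second-order Savitzky-Golay filter.

    Returns values for interior points only: the two points at each edge,
    where the full window does not fit, are dropped.
    """
    coeffs = (-2.0, -1.0, 0.0, 1.0, 2.0)
    derivative = []
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        derivative.append(sum(c * v for c, v in zip(coeffs, window)) / (10.0 * spacing))
    return derivative

# Toy signal y = x^2 sampled at x = 0..6, and the same signal with a
# constant baseline offset of 0.5 added (hypothetical values)
spectrum = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
shifted = [v + 0.5 for v in spectrum]
d1 = savgol_first_derivative(spectrum)
d1_shifted = savgol_first_derivative(shifted)
```

For this quadratic signal the filter returns the exact slope 2x at the interior points, and the constant offset has no effect on the derivative, which is why derivative spectra are often used to remove baseline shifts before PCA or PLS.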
Coffee adulteration with barley was successfully identified using a combination of NIRS and PLS. The adulterants of four types of barley with level ranges of 2%-20% (wt/wt) were added to roasted and ground coffee samples at different roasting degrees. The calibration and validation samples were selected using a D-optimal design, resulting in 100 datasets for calibration and 30 datasets for validation. The chemometrics of the genetic algorithm (GA) was applied to determine the wavelength regions giving the best models using PLS. Using absorbance values at wavenumbers of 6,032-5,748, 4,880-4,788, 4,688-4,628, and 4,336-4,276 cm −1 Table 1. The application of NIR and chemometrics for authentication of coffee products.

Results
Ref.

Authentication analysis of
Robusta coffee in roasted coffee by NIRS

Issues Methods and measurement conditions Chemometrics Results
Ref.
Differentiation of roasted Arabica coffee from common adulterants (roasted corn and coffee husks) DR-MIRS utilizing 20 scans at a resolution of 4 cm −1 at wavenumbers 4,000-400 cm −1 . The noise values on the upper and lower ends were removed To distinguish roasted Arabica coffee from adulterants, PCA and LDA were used Using DRIFTS in conjunction with PCA and LDA analysis, roasted coffee husks and roasted corn could be distinguished.
When it comes to distinguishing between roasted coffee, pure adulterants, and adulterated coffee samples, LDA could achieve 100% accuracy. Reis et al. (2013a)

| Objective | IR conditions | Chemometrics | Results | Reference |
| --- | --- | --- | --- | --- |
| Identification of adulterants in roasted and ground coffee (spent coffee grounds, coffee husks, roasted corn, and roasted barley) | DRIFTS with a resolution of 4 cm−1 and 20 scans at 4,000-400 cm−1 | PLS for quantification of adulterants using FTIR spectra previously subjected to preprocessing techniques (MC, derivatization, MSC, and SNV) | PLSR was successfully implemented to detect and quantify various adulterants in roasted and ground coffee, with adulteration levels ranging from 1% to 66% wt/wt, using first-derivative FTIR spectra at the combined wavenumbers of 3,200-2,730 and 1,800-700 cm−1. The accuracy is greater than 98% | Reis et al. (2013c) |
| Detection of multiple adulterants (coffee grounds, barley, and corn) in roasted and ground coffee | FTIR, wavenumbers (1/λ) 4,000-400 cm−1 with a resolution of 4 cm−1 | The Savitzky-Golay method was used to pretreat the spectra. PCA and LDA were used for classification | With 100% recognition and prediction abilities, DRIFTS can also be used to discriminate between roasted coffee and coffee that has been contaminated with numerous adulterants. Adulterants as little as 1/100 g can be detected using this technology | Reis et al. (2013b) |
| Authentication of Kona coffee from non-Kona coffees | FTIR, wavenumbers (1/λ) 4,000-400 cm−1 using 256 scans at a resolution of 8 cm−1 | PLSR was used to predict Kona coffee levels in ground and brewed coffee samples | With the best degree of accuracy, FTIR paired with PLS employing second-derivative FTIR spectra at 1,900-800 cm−1 may predict the levels of ground and brewed Kona coffee as adulterants in | |
| Tracing commercial coffee for potential adulteration practices | ATR-FTIR with 32 scans and a resolution of 4 cm−1 at 4,000-500 cm−1 | PCA was conducted to explore the spectral data, while LDA was used for the classification of commercial coffee samples | Using the PCs derived from the PCA result as input variables, the classification performance of LDA was greatly enhanced using FTIR standard spectra at 4,000-500 cm−1 | Cebi et al. (2017) |
| Determination of specific defects in roasted ground coffee using quantitative methods | FTIR coupled with a photoacoustic detector at wavenumbers 4,000-600 cm−1, 16 scans at a resolution of 4 cm−1. Each sample was replicated thrice | PCA and PLS-DA were used to distinguish blends with different bases and to separate the range of typical faults | FTIR-PAS combined with PCA and PLS-DA can differentiate and quantify defective coffee in roasted, ground coffee. PCA enabled the amount/fraction and nature of flaws in blends to be predicted. PLS-DA was able to classify 100% of the data into their respective classes | Dias et al. (2017) |
| Simultaneous detection and quantification of four adulterants (coffee husks, spent coffee grounds, barley, and corn) in roasted and ground coffee | FTIR wavenumbers ranging from 4,000 to 700 cm−1 (4 cm−1 resolution, 20 scans, and background subtraction). Each sample was replicated five times | PLS for quantifying adulterants in roasted coffee samples with degrees of adulteration ranging from 0.5% to 66% wt/wt | PLS combined with ATR-FTIR proved effective for detecting various adulterants in roasted and ground coffee, with low errors and excellent R2 values of 0.99 | Reis et al. (2016) |

with total wavenumbers of 128, PLS-GA provides a low root mean standard error of 1.10% in calibration sets and 1.4% w/w and 0.8% w/w in the test and external set, respectively (Ebrahimi-Najafabadi et al., 2012). NIR spectroscopy combined with digital images and the supervised classification method data-driven SIMCA (DD-SIMCA) has been applied to authenticate gourmet ground roasted coffees from traditional and superior coffee samples. The coffee powders were measured directly in an NIR spectrometer at wavenumbers of 4,000 to 10,000 cm−1 using a diffuse reflectance accessory at a spectral resolution of 8 cm−1, integrating 32 scans. Using 10 PCs of NIR spectral variables previously preprocessed to remove noise and correct baseline shifts (offset correction, linear baseline correction, and Savitzky-Golay derivatization), the DD-SIMCA model discriminated the studied coffee samples according to their classes with sensitivity and specificity values of 1.00 (accuracy of 100%) in both the training and test sets (de Araújo et al., 2021). Table 1 provides a summary of coffee authentication using NIR spectroscopy coupled with chemometrics.
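The relation between the sensitivity, specificity, and accuracy figures quoted for the one-class model can be illustrated with a minimal sketch. The function name, the class labels, and the sample counts below are hypothetical and are not taken from de Araújo et al. (2021); the sketch only assumes that both target and non-target samples are present.

```python
def classification_metrics(y_true, y_pred, target="gourmet"):
    """Return (sensitivity, specificity, accuracy) for the modeled class.

    Sensitivity: fraction of target samples accepted by the model.
    Specificity: fraction of non-target samples rejected by the model.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == target:
            if pred == target:
                tp += 1
            else:
                fn += 1
        else:
            if pred == target:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical test set: 4 gourmet, 3 traditional, and 3 superior samples.
y_true = ["gourmet"] * 4 + ["traditional"] * 3 + ["superior"] * 3

# A perfect model reproduces every label.
perfect = list(y_true)

# A model that rejects one genuine gourmet sample.
one_miss = ["rejected"] + y_true[1:]

print(classification_metrics(y_true, perfect))
print(classification_metrics(y_true, one_miss))
```

Sensitivity and specificity of 1.00 together imply an accuracy of 100%, whereas a single rejected target sample lowers sensitivity and accuracy but leaves specificity unchanged.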

Authentication of coffee using RS and chemometrics
RS is considered a simple, fast, and reliable analytical technique for the authentication of coffee. In some cases, this technique also offers a non-destructive analytical method depending on the laser power used. The Raman spectrum provides valuable information about the structure and composition of the evaluated samples. The Raman effect relies on the inelastic scattering of monochromatic laser radiation by molecular vibration when the scattering is accompanied by a change of polarizability in the chemical bonds. Since the energy losses (frequency shifts) reflect the internal vibrational energies of the molecules in a sample, and the intensity of scattering is directly proportional to the concentration of these molecules, the Raman spectrum, which aggregates the effects of all the different functional groups, is considered a fingerprint of the sample (Dias and Yeretzian, 2016).
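The frequency shift described above is usually reported as a Raman shift in cm−1, computed from the wavelengths of the exciting laser and the scattered light. The following is a minimal sketch of that conversion; the 785 nm laser line and the 1,500 cm−1 band are assumptions chosen for illustration, not values from the cited studies.

```python
def raman_shift_cm1(laser_nm, scattered_nm):
    """Raman shift in cm^-1: 1/lambda_laser - 1/lambda_scattered.

    Wavelengths are given in nm; the factor 1e7 converts nm^-1 to cm^-1.
    """
    return 1e7 / laser_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(laser_nm, shift_cm1):
    """Wavelength (nm) of Stokes-scattered light for a given Raman shift."""
    return 1e7 / (1e7 / laser_nm - shift_cm1)


# Assumed excitation wavelength and band position (illustrative only).
laser_nm = 785.0
shift_cm1 = 1500.0

stokes_nm = scattered_wavelength_nm(laser_nm, shift_cm1)
print(f"Stokes line at {stokes_nm:.1f} nm for a {shift_cm1:.0f} cm^-1 shift")
print(f"Recovered shift: {raman_shift_cm1(laser_nm, stokes_nm):.1f} cm^-1")
```

Because the shift is a difference of wavenumbers, the same vibration appears at the same Raman shift regardless of the laser wavelength used.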
RS, in combination with PCA and PLS-DA, was developed to discriminate Arabica from Robusta coffee. The Raman spectrum baseline was corrected using a weighted least squares algorithm, and the corrected Raman spectra were normalized to unit vector length. The peaks at wavenumbers of 3,500-50 cm−1 were used as variables during PCA and PLS-DA. The combination of RS with PCA proved to be a robust method for discriminating between the two coffee species, with six PCs explaining 94.03% of the total data variance. During PLS-DA, variables were selected based on variable importance in projection (VIP); a variable was used for modeling if its VIP was greater than one. Peaks at 1,680, 1,637, 1,606, 1,479, and 1,336 cm−1 contributed most to the discrimination between Arabica and Robusta coffee (El-Abassy et al., 2011). This result demonstrated that RS in combination with PCA and PLS-DA could be a trustworthy method for coffee authentication. RS can also identify the chemical changes in natural pulped green coffee beans stored in various packaging materials using an appropriate chemometric tool (Q control charts) (Abreu et al., 2019).
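The unit-vector normalization and the VIP > 1 selection rule can be sketched as follows. This is an illustrative implementation under stated assumptions, not the procedure of El-Abassy et al. (2011): the two-class PLS-DA is approximated by a PLS1 regression on 0/1 class codes fitted with a small NIPALS routine, the helper names are hypothetical, and the six-variable "spectra" are synthetic.

```python
import numpy as np


def unit_vector_normalize(spectra):
    """Scale each spectrum (row) to unit Euclidean length."""
    return spectra / np.linalg.norm(spectra, axis=1, keepdims=True)


def pls1_nipals(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centered data.

    Returns weights W (variables x components), scores T
    (samples x components), and y-loadings q (components).
    """
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    n_samples, n_vars = Xr.shape
    W = np.zeros((n_vars, n_components))
    T = np.zeros((n_samples, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q[a] = yr @ t / tt
        Xr = Xr - np.outer(t, p)   # deflate X
        yr = yr - q[a] * t         # deflate y
        W[:, a], T[:, a] = w, t
    return W, T, q


def vip_scores(W, T, q):
    """VIP per variable: weights squared, weighted by explained y-variance."""
    n_vars = W.shape[0]
    ss = q**2 * np.sum(T**2, axis=0)              # y-variance per component
    w2 = (W / np.linalg.norm(W, axis=0)) ** 2     # normalized squared weights
    return np.sqrt(n_vars * (w2 @ ss) / ss.sum())


# Synthetic data (assumption): 20 samples per class, 6 variables,
# with variable index 2 carrying the class difference.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 20)
X = rng.uniform(1.0, 2.0, size=(40, 6))
X[:, 2] += 1.0 * y

X_norm = unit_vector_normalize(X)
W, T, q = pls1_nipals(X_norm, y, n_components=2)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1.0)   # variables kept for modeling
print(np.round(vip, 2), selected)
```

By construction the mean of the squared VIP scores equals one, which is why a threshold of one is a common cut-off for retaining variables.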
RS combined with the chemometric techniques PCA, LDA, mixture-DA, quadratic-DA, regularized-DA, PLS-DA, and SIMCA was applied and compared for the classification and discrimination of clonal varieties of coffee. The RS peaks at wavenumbers of 1,200-1,800 cm−1, previously subjected to baseline alignment, MC, or multiplicative scatter correction (MSC), were compared for the discrimination of coffee. Raman spectra corrected with MSC provided more accurate results than the other spectral treatments. Using MSC, Raman spectra coupled with LDA and SIMCA provided the best classification models, with correct classification rates of 98.7% and 97.3%, respectively (Luna et al., 2019). PCA and PLS-DA in combination were also successful in the discrimination of four genotypes of Arabica coffee, one Mundo Novo line (G1) and three Bourbon lines (G2, G3, and G4). The variables used were absorbance values at wavenumbers of 3,500-400 cm−1. Based on VIP scores, the bands at 1,567 and 1,479 cm−1, which are associated with kahweol, contributed the most to genotype discrimination, followed by the bands at 1,502 and 1,442 cm−1, which are related to the cafestol molecule and lipids, respectively. Thus, by selecting the appropriate bands with high VIP scores and suitable chemometric techniques, the discrimination of coffee genotypes is deemed possible (Figueiredo et al., 2019).
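Multiplicative scatter correction, the pretreatment reported to work best above, can be sketched as a per-spectrum linear regression against a reference spectrum. The sketch below is a minimal illustration and not the implementation used by Luna et al. (2019); the function name, the use of the mean spectrum as reference, and the synthetic single-band spectra are assumptions.

```python
import numpy as np


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum (by default the
    mean spectrum); the fitted offset is subtracted and the result is
    divided by the fitted slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic spectra (assumption): one Gaussian band on the 1,200-1,800 cm^-1
# axis, distorted by sample-specific additive offsets and multiplicative gains.
shifts = np.linspace(1200, 1800, 61)
band = np.exp(-((shifts - 1500) / 40.0) ** 2)
offsets = np.array([0.1, 0.3, 0.5])
gains = np.array([0.8, 1.0, 1.3])
raw = offsets[:, None] + gains[:, None] * band

corrected = msc(raw)
print(np.ptp(raw, axis=0).max(), np.ptp(corrected, axis=0).max())
```

In this idealized case the spectra differ only by offset and gain, so all corrected spectra collapse onto the mean spectrum; with real data, the chemical differences between samples remain after correction.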

CONCLUSIONS
IR spectroscopy techniques, particularly fingerprinting models, can authenticate a wide variety of coffees (civet, Arabica, and Robusta) from their adulterants. IR spectroscopic methods coupled with chemometric approaches have proven to be powerful and efficient for identifying and quantifying targeted and untargeted adulterants. Combined with unsupervised and supervised pattern recognition, FTIR spectra have been successfully employed for the discrimination and classification of authentic and adulterated coffee with acceptable performance characteristics. In addition, when the wavenumber region and spectral preprocessing are optimized and combined with multivariate calibration, the developed methods are reliable enough to predict the levels of adulterants, providing a fast and green analytical approach with minimal use of chemicals and solvents. As a next step, the developed methods could be standardized for use in coffee authentication.

AUTHOR CONTRIBUTIONS
All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agree to be accountable for all aspects of the work. All the authors are eligible to be authors as per the International Committee of Medical Journal Editors (ICMJE) requirements/guidelines.

ACKNOWLEDGMENTS
The authors are grateful to the reviewers for their thoughtful comments on how to improve this review.