FTIR-based fingerprinting combined with chemometrics method for rapid discrimination of Jatropha spp. (Euphorbiaceae) from different regions in South Sulawesi

Jatropha species are medicinal plants commonly used in traditional medicine and as raw material for biodiesel. The classification, identification, and discrimination of closely related Jatropha spp. are crucial to ensure the quality of the raw material. This study developed an integrated method that combines Fourier transform infrared (FTIR) spectroscopy, used to explore functional group information in Jatropha species, with multivariate statistical analysis. Jatropha curcas and Jatropha multifida were collected from different regions. FTIR profiles combined with principal component analysis (PCA) were used to obtain holistic fingerprint patterns. Orthogonal partial least squares-discriminant analysis (OPLS-DA) and variable importance in the projection (VIP) were then used to screen for potential characteristic functional groups (VIP > 1) in Jatropha spp. Chemometric analysis of the FTIR data using PCA and hierarchical cluster analysis (HCA) classified and differentiated the leaves and stem bark of samples from different origins. The OPLS-DA revealed that C–H and C–O, with VIP values > 1, are the most important functional groups for distinguishing J. curcas from J. multifida. In summary, this research presents a method for rapid discrimination between J. curcas and J. multifida from different regions that can be used to identify and discriminate closely related plant species.


INTRODUCTION
Jatropha spp., locally known as Jarak, belong to the Euphorbiaceae family. Jatropha curcas and J. multifida are the most common species found in Indonesia. Jatropha curcas is widely cultivated as a biodiesel feedstock (Ezeldin Osman et al., 2021; Shamsi and Babazadeh, 2022) and for phytoremediation of ex-mining lands rich in heavy metals (Bhatt et al., 2021; Riyazuddin et al., 2022), while J. multifida was developed as an ornamental plant. In Indonesian ethnopharmacology, the sap from the leaves and bark of these plants is traditionally used to treat wounds. Both species are reported to have anti-inflammatory, antimicrobial, anticancer, antiviral, antidiabetic, and anticoagulant activities (Abdelgadir and Van Staden, 2013; Srinivasan et al., 2019). However, these species have also been reported to be toxic, especially the seeds (Dong et al., 2022).
Fourier transform infrared (FTIR) spectroscopy is a form of vibrational spectroscopy that provides fingerprint information on the fundamental vibrations of a material's chemical structure. It is a nondestructive technique that has been widely used in various fields (Taylan et al., 2021). Compared with chemical detection methods for standardization and quality control of herbal medicines, infrared spectroscopy combined with chemometrics is a quicker, more convenient, reliable, effective, and low-cost identification method (Wu et al., 2022). A combination of FTIR and chemometrics can be applied to classify medicinal plants using fingerprint analysis, for example, classification of Sida rhombifolia from different growing locations; classification of Andrographis paniculata based on planting age and the extraction solvent used (Kautsar et al., 2021); discrimination of Curcuma longa, Curcuma xanthorrhiza, and Zingiber cassumunar from different locations; differentiation of wild and cultivated agarwood (Yao et al., 2022); and quality grade discrimination of Gastrodia elata powder (Zhan et al., 2022).
Closely related plant species may have similar compound compositions, but their metabolism is influenced by several factors, such as altitude, planting age, growing location, and water requirements (Kautsar et al., 2021; Umar et al., 2021). The three sampling locations in this study have different altitudes, which affects the levels of secondary metabolites. At higher altitudes, soil pH and soil micronutrient content decrease, which is attributed to reduced mineralization at lower pH. Some compounds increase in response to different altitudes. These secondary metabolites help stressed plants adapt to different environmental conditions (Hashim et al., 2020). In addition, these two species are widely used as raw materials for traditional medicine. Because the two species have different pharmacological effects (Sabandar et al., 2013), mistaken selection or adulteration of raw materials is a risk. For this reason, quality control of Jatropha species is crucial, and discrimination between the two species has not been previously reported. However, because spectral data from plant samples are complex, the species are not easy to distinguish visually. Therefore, chemometric techniques are needed to overcome this difficulty. Our literature search found no reports applying FTIR-based metabolomics fingerprinting combined with chemometrics to classify and discriminate J. curcas and J. multifida and their organs from different regions. Chemometric methods, mainly principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), have been used to discriminate and identify key functional groups among different types of plants from different geographical origins (Nasr et al., 2022; Yao et al., 2022). The OPLS-DA score plot model has also been applied to identify functional group and metabolite markers in samples (Yao et al., 2022). This study aims to classify J. curcas and J. multifida (species and plant organs) from different growing locations based on a combination of FTIR spectral data with chemometric analysis using PCA, and to identify functional group markers using OPLS-DA.

Plant material and chemicals
Jatropha curcas and J. multifida samples were collected from different regions (Toraja Utara, Tana Toraja, Maros, Gowa, and Makassar) in South Sulawesi (Table 1). The plant parts used were leaves and stem bark. Spectroscopy-grade potassium bromide (KBr) was purchased from Sigma-Aldrich (St. Louis, MO).

Sample preparation
All samples were sieved, dried, and pulverized before use. In this experiment, the powder samples (10 mg) were blended with KBr (90 mg) disks. The powdered samples were pressed to the germanium crystal surface for spectra collection.

FTIR spectra acquisition
The FTIR spectra of the samples were scanned using an FTIR spectrophotometer (Nicolet™ iS10 FTIR, Thermo Scientific™, USA) controlled with the OMNIC™ software (Thermo Scientific™, USA). The measurements were carried out over 4,000-400 cm⁻¹ with 16 scans at a resolution of 16 cm⁻¹ and an interval of 1.928 cm⁻¹ using a horizontal attenuated total reflectance (HATR) accessory with a germanium crystal. All FTIR spectra were corrected against the FTIR spectrum of air as background. In addition, each FTIR spectrum was recorded in five replicates as absorbance values at each data point. FTIR analysis was used as a qualitative technique for analyzing the functional groups of the materials and for obtaining the spectral pattern of each plant part, i.e., leaves and stem bark. The spectra and functional groups obtained were then analyzed by chemometrics to identify, discriminate, and group the samples and to detect functional groups that might serve as markers for distinguishing J. curcas from J. multifida.

Data analysis
The processed wavenumber data from the FTIR analysis were analyzed using MetaboAnalyst 5.0 (https://www.metaboanalyst.ca/) for multivariate data analysis (MVDA) (Xia and Wishart, 2016). PCA score plots were used to evaluate sample grouping, similarities, and differences among species and plant parts. Furthermore, OPLS-DA was used to separate the two Jatropha species (J. curcas and J. multifida). In addition, VIP scores were used to screen for potential characteristic functional groups in the samples. Sample normalization by median and Pareto scaling (data scaling) were chosen for the PCA and OPLS-DA analyses. The MVDA results were validated with a permutation test (R² and Q²) and VIP values to confirm the reliability of the OPLS-DA model (Lim et al., 2022).
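The two preprocessing steps named above can be sketched in a few lines of NumPy. This is only an illustrative sketch and not the MetaboAnalyst implementation: it assumes spectra are stored as rows (samples) by columns (wavenumbers), that median normalization divides each spectrum by its own median, and that Pareto scaling mean-centers each variable and divides by the square root of its standard deviation. The function names and the toy absorbance values are hypothetical.

```python
import numpy as np

def normalize_by_median(X):
    """Divide each spectrum (row) by its own median absorbance."""
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=1, keepdims=True)
    return X / med

def pareto_scale(X):
    """Mean-center each variable (column) and divide by sqrt(std)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    # Constant columns have sd == 0; leave them unscaled (they stay at 0).
    scale = np.sqrt(np.where(sd > 0, sd, 1.0))
    return centered / scale

# Hypothetical toy data: 4 spectra x 3 wavenumber points (absorbance).
spectra = np.array([
    [0.10, 0.40, 0.20],
    [0.30, 0.80, 0.40],
    [0.15, 0.30, 0.45],
    [0.05, 0.25, 0.10],
])

normalized = normalize_by_median(spectra)
processed = pareto_scale(normalized)
print(processed.round(3))
```

Pareto scaling sits between no scaling and unit-variance scaling, so intense bands still carry more weight than weak ones but no longer dominate the model completely.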

Sample
Sample powder was used instead of an extract to avoid bias in the FTIR analysis. Extraction solvents such as water and alcohol produce an -OH stretching vibration band at 3,200-3,500 cm⁻¹ (Rohman et al., 2021).

Spectroscopy characteristics
In this study, we evaluated the capability of rapid, nondestructive, reliable, and robust FTIR spectroscopy coupled with multivariate analyses for discrimination of J. curcas and J. multifida. Representative FTIR spectra of the Jatropha sample powder are shown in Figure 1, and the overall sample spectra are shown in Figure S1.
The characteristic FTIR peaks were initially used to distinguish the different substances (Zhan et al., 2022). The FTIR spectra of the leaf and stem bark samples from both species showed the same pattern but differed in the intensity of the peak transmittance. Each peak and shoulder in the FTIR spectra corresponds to functional groups responsible for infrared absorption by metabolites in J. curcas and J. multifida (Akin Geyik et al., 2022). The spectral patterns of the leaves and stem bark of the two species show noticeable differences at wavenumbers 2,848 cm⁻¹ (leaves) and 1,421 cm⁻¹, 1,384 cm⁻¹, 1,247 cm⁻¹, and 1,156 cm⁻¹ (stem bark). The peak near 2,848 cm⁻¹ corresponds to C-H (alkane, aldehyde) asymmetric stretching vibration. The peak at 1,421 cm⁻¹ is attributed to C-C (aromatic) bending vibration. The spectral band near 1,384 cm⁻¹ corresponds to C-H (alkanes) bending vibration. The peaks near 1,247 and 1,156 cm⁻¹ are assigned to C-O (esters, ethers, alcohol, and carboxylic acid) stretching vibration and C-N (aliphatic amine) stretching vibration. The fingerprint region at 1,000-400 cm⁻¹ has the same pattern but different absorbance (Gad and Bouzabata, 2017; Wang et al., 2021). The functional groups present in the samples and their relative abundance (absorbance values) are shown as a hierarchical cluster analysis (HCA) heatmap in Figure 2. According to Vilkickyte and Raudone (2021), sunshine duration, temperature, air humidity, longitude and altitude of the collection location, and soil macronutrients significantly affect the levels of compounds in plants.
Exploratory analysis of the FTIR spectral data in the 4,000-400 cm⁻¹ region was carried out using PCA. The PCA score plot is shown in Figure 3.
The first two principal components (PC) explained 97.8% of the variance (PC-1, 92.6% and PC-2, 5.2%) when the FTIR spectral data were analyzed using PCA. The total PC values for leaf and stem bark discrimination of J. curcas (Fig. S2) and J. multifida (Fig. S3) were 97.65% (PC-1, 93.1% and PC-2, 4.5%) and 98.8% (PC-1, 88.7% and PC-2, 10.1%), respectively. We also analyzed the leaves and stem bark of each sample separately. The total PC value for the leaves of J. curcas and J. multifida (Fig. S4) was 98.5%, while that for the stem bark (Fig. S5) was 98.1%. In this study, PCA and HCA, as unsupervised pattern recognition techniques, were used to obtain information on sample grouping patterns, similarities, and differences between plant species and parts (Batsukh et al., 2020; Umar et al., 2021). The high total PC value obtained in each analysis indicates good accuracy in grouping samples by species, organ, and geographic origin (Umar et al., 2021). Sample groups close to the central coordinate point (0,0) are similar to one another, whereas groups farther from this point differ more strongly from the other samples.
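The explained-variance percentages quoted for PC-1 and PC-2 come from the eigenvalue spectrum of the centered data matrix. The following is a minimal sketch of that calculation using a singular value decomposition; it is not the MetaboAnalyst code, the function name `pca_scores` is hypothetical, and the simulated "spectra" are synthetic data invented purely to show the mechanics.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores and explained-variance ratios via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # column-wise mean centering
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # fraction of variance per PC
    scores = U[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Synthetic example: two groups of five "spectra" with 50 points each.
rng = np.random.default_rng(0)
base = np.linspace(0.0, 1.0, 50)
group_a = base + 0.01 * rng.standard_normal((5, 50))
group_b = 2.0 * base + 0.01 * rng.standard_normal((5, 50))
X = np.vstack([group_a, group_b])

scores, ratio = pca_scores(X)
print("explained variance (PC-1, PC-2):", ratio)
print("PC-1 scores:", scores[:, 0].round(2))
```

In a score plot, the two columns of `scores` are the x and y coordinates of each sample, and the sum of the two ratios corresponds to the "total PC value" reported in the text.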
OPLS-DA was used to identify the highest-ranking functional groups by their VIP values, which must be considered when interpreting OPLS-DA data. Four functional groups were identified (C-H stretching, C-H bending, C-O stretching, and C=C stretching) based on a VIP score greater than 1.0 (Fig. 4). Functional groups with VIP values > 1 are thought to play an important role in distinguishing J. curcas from J. multifida (Mashiane et al., 2021).
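For reference, the conventional VIP definition for a PLS-type model is written out here. The text does not state which VIP variant the software applies to OPLS-DA, so this should be read as the standard textbook form under that assumption. The symbols are introduced for illustration only: p is the number of variables (wavenumbers), A the number of components, w_{ja} the weight of variable j on component a, and SSY_a the sum of squares of the response explained by component a.

```latex
\mathrm{VIP}_j
  = \sqrt{\, p \,
      \frac{\sum_{a=1}^{A} \mathrm{SSY}_a
            \left( \dfrac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}
           {\sum_{a=1}^{A} \mathrm{SSY}_a} },
\qquad
\frac{1}{p} \sum_{j=1}^{p} \mathrm{VIP}_j^{2} = 1 .
```

The identity on the right follows because the squared normalized weights of each component sum to one. The mean squared VIP is therefore always 1, which is why VIP > 1 is the usual cutoff for variables that contribute more than average.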
Using the OPLS-DA loading S-plot (Fig. 5), C-H stretching and C-O stretching vibrations were identified as the functional groups separating J. curcas and J. multifida. The loading S-plot model has also been used to identify differentiating compounds across cooking methods of pumpkin leaves (Momordica balsamina L.) (Mashiane et al., 2021) and to authenticate saffron spice accessions (Hegazi et al., 2022).
In this study, we also embedded sample information into the OPLS-DA model to exploit metabolite abundance for group discrimination and used permutation tests to verify and evaluate the quality of the OPLS-DA model. Thus, we performed a supervised multivariate analysis to see whether the functional group information alone could discriminate the two samples (Misra et al., 2019; Pan et al., 2022; Worley and Powers, 2016). The T-score and orthogonal T-score values are 1.6% and 90.4%, respectively (Fig. 6). The values obtained clearly distinguish the two species, J. curcas (JC) and J. multifida (JM). The permutation test values of R² (0.919) and Q² (0.89) (Fig. S6) for the OPLS-DA model indicate that the model was stable, fit the data well, and had good predictive ability (Pan et al., 2022). This combination of FTIR and chemometrics effectively discriminated the species and organs of J. curcas and J. multifida. Because of these benefits, the combined method could serve as one of the standard methods for identifying plant raw materials in the future.
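The two validation statistics can be stated compactly. The definitions below are the usual ones for PLS-type models and are given as an assumption about what the software reports, since the text does not spell them out. The notation is introduced here for illustration: y_i is the class label of sample i, \hat{y}_i its fitted value, \hat{y}_{(i)} its prediction from a model built without sample i (cross-validation), and B the number of label permutations.

```latex
R^{2}Y = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^{2}}
                  {\sum_{i} \left( y_i - \bar{y} \right)^{2}},
\qquad
Q^{2} = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_{(i)} \right)^{2}}
                 {\sum_{i} \left( y_i - \bar{y} \right)^{2}},
\qquad
p_{\mathrm{perm}} = \frac{1 + \#\{\, b : Q^{2}_{b} \ge Q^{2}_{\mathrm{obs}} \,\}}{1 + B} .
```

R²Y measures goodness of fit and Q² measures cross-validated predictive ability. In a permutation test the class labels are shuffled B times, the model is refitted each time, and the observed statistics are compared with the permuted distribution. A model is considered reliable when the observed values clearly exceed those from the shuffled labels.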

CONCLUSION
In this study, FTIR spectroscopic data combined with the chemometric techniques PCA and OPLS-DA classified J. curcas and J. multifida from different regions with clear separation and identified the main functional groups that distinguish the two species. The method is safe because it is solvent-free and requires minimal sample preparation.

Figure S1. FTIR spectra of powdered Jatropha spp. samples from different regions.