Integrating molecular docking and molecular dynamics simulations to evaluate active compounds of Hibiscus schizopetalus for obesity

The group screened the constituents of Hibiscus schizopetalus by protein–protein interaction analysis, molecular docking, and molecular dynamics to identify a potential therapy for obesity with pancreatic lipase (PNLIP) as the protein target. First, the group searched online databases (http://www.knapsackfamily.com/ and http://ijah.apps.cs.ipb.ac.id/) for the active ingredients of H. schizopetalus. The 3-D structures and canonical SMILES of the active compounds were taken from the PubChem database, and all compounds were then analyzed with pkCSM and ProTox-II to obtain their pharmacokinetic and physicochemical properties. The protein targets of obesity were identified using the Open Targets Platform. After the protein targets of the plant compounds and of obesity were collected, the group analyzed them using Cytoscape. Protein–protein interaction was analyzed using STRING, Gene Ontology, and KEGG pathways. Virtual screening was done with PyRx and visualization with BIOVIA Discovery Studio, followed by molecular docking using AutoDockTools-1.5.7 and, finally, molecular dynamics (MDs) simulation using YASARA. The group collected 70 compounds from research journals and found 196 protein targets for them. Obesity had 165 protein targets. The 196 protein targets of H. schizopetalus and the 165 protein targets of obesity were merged using Cytoscape, which gave 11 proteins shared by H. schizopetalus and obesity. The group then used PyRx to determine which compounds of H. schizopetalus bind these 11 protein targets with the highest binding affinity. PNLIP had the highest binding affinity of the 11 proteins, so the group analyzed PNLIP and its relationship to obesity. The group found three compounds that act on PNLIP: beta-sitosterol, kaempferol, and gallocatechin gallate. After docking these three compounds, the group found that only one had a higher binding affinity than the commercial drug Orlistat. The process ended with MDs simulation of this active compound as an anti-obesity drug candidate. In this bioinformatics study, the group found that gallocatechin gallate, an active compound of H. schizopetalus, can inhibit the PNLIP enzyme for obesity therapy.


INTRODUCTION
Over 1 billion people worldwide are obese: 650 million adults, 340 million adolescents, and 39 million children. The World Health Organization (WHO) estimates that in 2025, around 167 million adults and children will become unhealthy due to being overweight or obese, so the WHO asks all countries to develop anti-obesity drugs or other drugs [1]. Data from Indonesia show that around 13.5% of adults aged 18 years and over are overweight, while 28.7% are obese [body mass index (BMI) ≥ 25]; based on the indicators of the RPJMN (National Medium Term Development Plan) for 2015-2019, 15.4% were obese (BMI ≥ 27). Among children aged 5-12 years, 18.8% were overweight and 10.8% were obese [2]. Since 2004, obesity has been described as the "New World Syndrome," with its prevalence steadily increasing globally in all age groups [3]. Obesity is a condition of abnormal or excessive fat accumulation that can pose a health risk [1].

Identification and screening of active compounds of H. schizopetalus
Plant active compounds can be obtained from online databases (http://www.knapsackfamily.com/ and http://ijah.apps.cs.ipb.ac.id/). However, the active compounds of H. schizopetalus have yet to be entered in these databases because the plant has not been studied extensively. The active compounds used here are components isolated from the leaves and flowers of H. schizopetalus [16], taken from research journals found through the PubChem web (https://pubchem.ncbi.nlm.nih.gov/) that report successful isolation of the active compounds [17]. Protein codes for all active compounds were found using the GeneCards database (https://www.genecards.org/) [18].
Computer-based methods are often used as the initial criteria for eliminating compounds with poor pharmacokinetic and physicochemical profiles and toxicity levels [19]. The pharmacokinetic phase has four stages: absorption, distribution, metabolism, and excretion (ADME); toxicity is assessed alongside them. Lipinski's rule of five (Ro5) (1997) is used to predict the oral bioavailability of a drug and is widely used as a filter for drug-like properties. The rule is based on the physicochemical characteristics [20] of the tested compounds, among others: clog P ≤ 5, molecular weight (MW) ≤ 500 g/mol, number of hydrogen bond acceptors (HBA) (sum of N and O atoms) ≤ 10, number of hydrogen bond donors (HBD) (sum of OH and NH groups) ≤ 5, number of rotatable bonds ≤ 10, and polar surface area < 140 Å² [21].
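The thresholds listed above can be applied as a simple filter once the descriptors are known. The following is a minimal sketch in Python, not the procedure used by pkCSM or SwissADME: the descriptor values are assumed to have been computed elsewhere, the class and function names are hypothetical, and the number of tolerated violations (zero by default) is an assumption exposed as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (supplied by the user)."""
    name: str
    logp: float        # calculated log P
    mol_weight: float  # molecular weight, g/mol
    hba: int           # hydrogen bond acceptors (N and O atoms)
    hbd: int           # hydrogen bond donors (OH and NH groups)
    rot_bonds: int     # rotatable bonds
    psa: float         # polar surface area, square angstroms


def rule_violations(d: Descriptors) -> list[str]:
    """Return the thresholds from the text that the compound fails."""
    checks = {
        "logP <= 5": d.logp <= 5,
        "MW <= 500": d.mol_weight <= 500,
        "HBA <= 10": d.hba <= 10,
        "HBD <= 5": d.hbd <= 5,
        "rotatable bonds <= 10": d.rot_bonds <= 10,
        "PSA < 140": d.psa < 140,
    }
    return [rule for rule, passed in checks.items() if not passed]


def passes_filter(d: Descriptors, max_violations: int = 0) -> bool:
    """Assumption: a compound is kept if it fails at most max_violations rules."""
    return len(rule_violations(d)) <= max_violations
```

Calling `rule_violations` on each compound gives a short list of failed criteria, which is easier to audit than a single pass/fail flag.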
Toxicity is predicted using the ProTox-II application. To predict various toxicity endpoints, such as acute toxicity, hepatotoxicity, cytotoxicity, carcinogenicity, mutagenicity, immunotoxicity, adverse outcome pathways (Tox21), and toxicity targets, ProTox-II incorporates molecular similarity, pharmacophores, fragment propensities, and machine-learning models. The web server accepts a 2-D chemical structure as input. It outputs 33 models with confidence ratings, an overall toxicity radar chart, and the 3 most similar compounds with known acute toxicity. It also reports the chemical's potential toxicity profile [23].

Obesity target prediction
The obesity genes are obtained from databases such as PubMed and STRING disease in the Cytoscape 3.9.1 application, which store information for up to thousands of target genes. Searching with the keyword "obesity" returns the obesity genes. The confidence value is set to 0.9. In addition, to see the similarity of target genes, you can use DisGeNET (https://www.disgenet.org/search) by typing "obesity."
Many drugs have been used to prevent and treat obesity over the years. Anti-obesity drugs such as phentermine, rimonabant, mazindol, diethylpropion, and sibutramine have been withdrawn from the market by the Food and Drug Administration (FDA) and the European Medicines Agency because of side effects such as tachycardia, palpitations, hypertension, fatigue, dizziness, insomnia, excessive excitement, tremors, changes in libido, shortness of breath, anxiety, and chest pain. These drugs are also associated with an increased risk of severe nonfatal cardiovascular events, such as stroke and myocardial infarction, and even with hallucinations and suicidal ideation [4].
Only Orlistat has been approved by the FDA [5] and can be used long-term in treating obesity [4]. Orlistat works by reversibly inhibiting gastric and pancreatic lipase (PNLIP) enzymes. Lipase inactivation prevents hydrolysis of triglycerides so that free fatty acids (FAs) are not absorbed [6]. Several clinical trials have confirmed that Orlistat, a PNLIP inhibitor, can reduce obesity caused by a high-fat diet [7].
In the last 5 years, there has been a global increase in the use of herbal/traditional medicines (TMs) as complementary and alternative medicine developed by developing countries. TMs derived from plants can complement each other and act as an alternative treatment that is important in maintaining health [8]. Several studies have shown that herbal plants, as TMs, can act as lipase inhibitors in controlling obesity [9]. In Indonesia, a country with high biodiversity, Hibiscus schizopetalus (Kembang sepatu gantung) is used as an ornamental plant, and the people of North Maluku use it to induce labor and improve the condition of pregnant women [10]; the people of Colombia use an infusion of H. schizopetalus flowers to treat colds and coughs [11]. In Malaysia, H. schizopetalus is also a medicinal plant because it contains chemical compounds, including anthocyanins and triterpene esters, with antioxidant, antipyretic, anti-inflammatory, analgesic, hypoglycemic, and hypolipidemic activity [12]. Hibiscus schizopetalus was chosen for this study because it is suspected to contain active compounds that act as lipase inhibitors. However, it is not known which active compounds can act as lipase inhibitors. Hibiscus schizopetalus is a plant from the Plantae kingdom, Tracheophyta division, Magnoliopsida class, Malvaceae family, and Hibiscus L. genus [13].
Technological developments have led drug development to start with bioinformatics (in silico) methods, which are computer-based and use software programs readily available on the public web [14]. In silico methods are generally preferred at an early stage to predict and provide a preliminary estimate of the activity of a ligand/compound. They are inexpensive and fast [15]. Searching for target proteins and for interactions between obesity proteins and the active compounds of the H. schizopetalus plant is one step in this in silico study. The in silico method adds value to this study, especially in terms of time and the prediction of interactions with target proteins. Furthermore, the results of these interactions will be compared with the commercial drug Orlistat.
complex can be downloaded via the https://www.rcsb.org/ platform. The structure is then prepared by removing water molecules, adding hydrogen atoms, and optimizing amino acids. Meanwhile, the ligand is extracted and stored in Protein Data Bank (PDB) format for grid box optimization in the method validation process. This preparation can be done using the BIOVIA Discovery Studio application. The molecular structure can be obtained by downloading the 3-D structure from the Protein Data Bank (https://www.rcsb.org) [25] and converting it into a PDB file using UCSF Chimera (https://www.cgl.ucsf.edu/chimera/) [26].
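Two of the preparation steps described above, removing water molecules and separating the native ligand, amount to filtering records of the PDB file. The sketch below illustrates this in plain Python as an alternative to doing it interactively in BIOVIA Discovery Studio; the function name is hypothetical, the ligand residue name is passed in by the user, and hydrogen addition and residue optimization are not covered.

```python
def split_pdb(pdb_text: str, ligand_resname: str):
    """Split PDB text into (protein_lines, ligand_lines).

    Water records (HOH/WAT) are dropped. HETATM records that are neither
    water nor the requested ligand (ions, other cofactors) are also dropped,
    which is an assumption of this sketch.
    """
    protein, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # skip headers, remarks, connectivity, etc.
        resname = line[17:20].strip()  # residue name, PDB columns 18-20
        if resname in ("HOH", "WAT"):
            continue  # remove water molecules
        if record == "HETATM" and resname == ligand_resname:
            ligand.append(line)
        elif record == "ATOM":
            protein.append(line)
    return protein, ligand
```

The two returned lists can be written to separate files, one for the receptor and one for the native ligand used in grid box validation.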

Simulation of molecular docking and visualization of docking results
Molecular docking, or the blind docking process, can be done using the PyRx Vina application (https://pyrx.sourceforge.io/) [27]. The docking results are then visualized with BIOVIA Discovery Studio (https://discover.3ds.com/discovery-studio).

Molecular docking validation
The native ligand that had been separated was then re-docked into the macromolecule using the AutoDock 4.0 application. This is done to obtain the grid box values associated with the active site of the macromolecule.
The appropriate grid box parameters, with a reference root mean square deviation (RMSD) of < 2 Å, will be the benchmark for the subsequent virtual screening.
Validation of the molecular docking is conducted using AutoDock (AutoDockTools-1.5.7) [28]. In this validation, the native ligand is used to determine the grid box that will be used for docking the macromolecule and the ligands to form a ligand-protein complex.
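The RMSD criterion used in this validation can be written out explicitly. The sketch below assumes that the crystal and re-docked ligand poses are given as matched lists of heavy-atom coordinates in the same receptor frame, so no superposition is applied; the function names are hypothetical and the 2 Å cutoff is the reference value stated above.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists.

    Each argument is a list of (x, y, z) tuples in angstroms, with atoms
    in the same order in both lists.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def redocking_is_valid(crystal, docked, cutoff=2.0):
    """True when the re-docked pose is within the RMSD cutoff of the crystal pose."""
    return rmsd(crystal, docked) < cutoff
```

Atom ordering matters here: if the docking program reorders atoms, the two lists must be matched by atom name before the comparison.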
The interactions and binding of these ligand-protein complexes will be further analyzed through molecular dynamics simulations.

Molecular dynamics (MDs) simulation
MDs simulation was carried out using the YASARA Dynamics software package (http://www.yasara.org/). The best pose of the hit from the virtual screening was chosen, and the scene was then set up in YASARA Structure using the default mode [29]. The ligand-protein complex from AutoDock was loaded into YASARA. The simulation used 0.9% NaCl at a temperature of 298 K. The water density was 0.997 g/mL, and solvation was represented with water molecules in a cubic simulation cell. The whole system was neutralized at pH 7.4 and run at normal speed. The length of the MD run was 34.60 ns, and the force field was AMBER14. Hydrogen bonds, RMSD of the protein (backbone and side chains), and root mean square fluctuation (RMSF) of the amino acid residues were used to evaluate the ligand-protein interaction [30,31].
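For readers unfamiliar with RMSF, the quantity can be computed directly from trajectory coordinates. The following sketch is not YASARA's analysis macro; it assumes the frames have already been superposed on a reference structure and are supplied as plain Python lists, and the function name is hypothetical.

```python
import math


def rmsf(frames):
    """Per-atom root mean square fluctuation over a trajectory.

    frames is a list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, with the same atom order in every frame.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

Per-residue RMSF is then obtained by averaging the values of the atoms belonging to each residue, or by passing only one representative atom per residue.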

RESULTS AND DISCUSSION

Active compounds of H. schizopetalus
The physicochemical and pharmacokinetic properties of the active compounds were predicted, as shown in Tables 1 and 2 [17]. Each of these compounds was then downloaded as canonical SMILES and as 2-D and 3-D structures in structure-data file (SDF) format using the facilities available in PubChem (https://
This data collection is kept in a separate place to facilitate data processing. The target genes are listed in text or Excel format. Once the data are complete, each network is constructed. Obesity targets are identified using the Open Targets Platform (https://platform.opentargets.org/). Target proteins with an Overall Association Score above 0.50 are chosen.
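The score cutoff described above is a one-line filter once the associations have been exported. The sketch below assumes the export has been reduced to (gene symbol, overall score) pairs; the function name and the gene names in the usage note are hypothetical and do not come from the study.

```python
def select_targets(associations, threshold=0.50):
    """Return the sorted gene symbols whose overall score exceeds the threshold.

    associations is an iterable of (gene_symbol, overall_score) pairs.
    Duplicated symbols are reported once.
    """
    return sorted({gene for gene, score in associations if score > threshold})
```

For example, `select_targets([("GENE_A", 0.81), ("GENE_B", 0.50), ("GENE_C", 0.12)])` keeps only `GENE_A`, because the cutoff is strict ("above 0.50").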

Relation protein target-active compound
After obtaining the protein targets of obesity and the protein targets of the active compounds of H. schizopetalus, we searched for their intersection using the Venny diagram tool (https://bioinfogp.cnb.csic.es/tools/venny/). The entire target list of each compound was collected, and a network was then retrieved through the STRING database in Cytoscape 3.9.1 with a confidence value of 0.9 to form the target network of H. schizopetalus extract compounds.
The second network, the obesity gene network, is built in the same way. A list of target genes was collected, and a network was then retrieved with a confidence value of 0.9. The result is the obesity network.
Next, the set of genes shared by the H. schizopetalus extract compound network and the obesity network is obtained by selecting "tools," then "merge," then selecting the desired networks, and selecting "intersection." This yields the target genes that can be used as therapeutic targets.
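The intersection step performed through the Cytoscape menus can be cross-checked outside the application with a set operation on the two gene lists. The sketch below is only such a cross-check, not a replacement for the network merge; the function name is hypothetical, and normalizing symbols to upper case is an assumption.

```python
def shared_targets(compound_targets, disease_targets):
    """Return the sorted gene symbols present in both target lists.

    Symbols are stripped and upper-cased before comparison so that
    formatting differences between exports do not hide a match.
    """
    def normalize(genes):
        return {g.strip().upper() for g in genes if g.strip()}

    return sorted(normalize(compound_targets) & normalize(disease_targets))
```

The length of the returned list should match the number of nodes in the intersection network.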

Protein network construction
Using CytoCluster, the resulting network can be grouped into several nearby clusters to support hypotheses about possible pathways. The clusters are assessed based on degree, closeness centrality, and betweenness values. The data are processed with Cytoscape (https://cytoscape.org/) to view the network of protein components and targets. This network was obtained by linking the active plant compounds to the target codes of each active compound using the GeneCards database (https://www.genecards.org/). The network construction shows the interaction between the active compounds of H. schizopetalus and the target codes of obesity.
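Two of the topology measures named above, degree and closeness centrality, are simple enough to compute by hand on a small network. The sketch below uses an undirected, unweighted edge list and breadth-first search; it is an illustration of the definitions rather than the Cytoscape implementation, the function names are hypothetical, and betweenness is left out because it needs a longer algorithm.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node_a, node_b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness(graph):
    """Closeness centrality: reachable nodes divided by the sum of
    shortest-path distances to them (0.0 for an isolated node)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result
```

In a three-node chain A-B-C, the middle node B has degree 2 and closeness 1.0, while the end nodes have degree 1 and a lower closeness, which is the pattern used to pick hub genes.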

Pathway and enrichment analysis
Analysis of interactions between genes/proteins can be carried out with the help of the STRING platform (https://string-db.org/), STITCH (http://stitch.embl.de/), and the Human Reference Interactome (http://www.interactome-atlas.org/).
The results of the STRING analysis can lead researchers to analyze further the possible pathways through which the test substances act. Accessible platforms for this analysis include DAVID: Functional Annotation Tools (ncifcrf.gov), metascape.org, and the KEGG Pathways database (https://www.genome.jp/). To obtain the target protein that has the desired pharmacological effect [24], a specific protein can be searched for as the best target using the Metascape application (https://metascape.org/) and KEGG (https://www.kegg.jp/).
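Pathway enrichment tools of this kind commonly rest on an over-representation test. As a hedged illustration, the sketch below computes a one-sided hypergeometric p-value for the overlap between a gene list and a pathway; the exact statistic and multiple-testing correction used by each platform may differ, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """Probability of seeing at least `overlap` pathway genes when
    `selected` genes are drawn without replacement from `population`
    genes, of which `pathway_size` belong to the pathway."""
    if not 0 <= overlap <= min(pathway_size, selected):
        raise ValueError("overlap is outside the possible range")
    upper = min(pathway_size, selected)
    favourable = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(population, selected)
```

A small p-value indicates that the pathway contains more of the selected genes than expected by chance; when many pathways are tested, the p-values should be corrected for multiple testing before interpretation.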

Macromolecule preparation
The ligand and target structures must be prepared in advance. The target macromolecule/protein attached to the
pubchem.ncbi.nlm.nih.gov). When choosing the most suitable structure, it is important to match the MW and molecular formula with the LC/GC-MS information for the extract. Each compound was then selected for its bioavailability value through the SwissADME platform (http://www.swissadme.ch/index.php) and SCFBio (http://www.scfbio-iitd.res.in/software/drugdesign/lipinski.jsp). The result is 14 compounds found in H. schizopetalus with a bioavailability value of 0.17-0.55, as shown in Table 2.
The numbers of genes involved, plotted with https://bioinformatics.psb.ugent.be/webtools/Venn/, can be seen in Figure 1.
Eleven genes were shared between the obesity targets and the H. schizopetalus compounds. By this count, 7.14% of obesity genes are targets of H. schizopetalus compounds. This means that H. schizopetalus compounds can act on several of the genes involved in obesity.
From here, the genes can be filtered for those that tend to be closely related to one another based on degree, closeness centrality, and betweenness values. The grouping uses CytoCluster with the ClusterONE algorithm and default values for the other parameters. Several clusters with different confidence values are formed, and Cluster 5, which has a p-value of 0.001, is chosen, as shown in Figure 2a and b.

Protein network construction
Based on the Venn diagram in Figure 1 and the network construction of the target proteins and active compounds in Figure 2a and b, 11 target proteins can be
(TG) substrates found in digested oils into monoglycerides and free FAs. After the human body digests fat-containing food, the lipids containing triglycerides are first hydrolyzed by lipase into monoglycerides, glyceryl esters, and free FAs, and the content of 1,2-glycolide and FAs in the product is higher.
Then, the fat enters the body and is further hydrolyzed into monoacylglycerol (MG) and FA by gastric lipase (10%-30% decomposition) and PNLIP (50%-70% decomposition) in the digestive tract and small intestine, as shown in Figure 4. Cholesterol and lipoproteins (LPAs) are then formed in the body. MG, FA, cholesterol, LPA, and bile acids are absorbed by the small intestine, and TG is then re-synthesized as energy storage for adipose tissue. This adipose tissue will secrete excess fat, which, if not removed, will result in fat accumulation and can cause obesity [7].
Figure 5 shows that the enrichment analysis related to the metabolic process is carried out by the PNLIP protein through the olefinic compound metabolic pathway so that this PNLIP protein will be further analyzed.
For the 11 target proteins above, the protein codes were searched in the protein database (https://www.rcsb.org/), and the data obtained are shown in Table 3.
The topology above can be interpreted as an identification of the role of each gene in the network. The degree is the number of other genes to which a gene is directly connected in the network; closeness centrality is how close the gene is to the other genes through indirect connections.

KEGG pathway
The genes are then analyzed for enrichment and gene ontology to find the most likely pathways. Interaction analysis between genes/proteins can be carried out with the help of the STRING platform (https://string-db.org/), DAVID: Functional Annotation Tools (ncifcrf.gov), and metascape.org, which are linked to other gene ontology servers such as the Genetic Association Database (GAD) (https://geneticassociationdb.nih.gov/), the KEGG Pathways database (https://www.genome.jp/), and the Human Reference Interactome (http://www.interactome-atlas.org/). The group can identify overrepresented genes in obesity networks using KEGG pathways [23]. The KEGG result contains five pathways: glycerolipid metabolism, metabolic pathway, pancreatic secretion, fat digestion and absorption, and vitamin digestion and absorption. Each pathway is then analyzed, and the most likely pathway associated with obesity is the fat digestion and absorption pathway, as shown in Figure 4.
According to Figure 4, lipase plays an important role in the upstream section. PNLIP is the main lipase enzyme secreted from the pancreas; it hydrolyzes dietary fats in the digestive system, converting triacylglycerol
obtained by downloading the 3-D structure from the database (https://pubchem.ncbi.nlm.nih.gov/) and has been converted to PDB format. The structures of the best hit compounds can be seen in Table 1. The docking scores and binding activity of the hit compounds are shown in Table 5.

Interaction of best hit compounds with ligand on receptor using BIOVIA Discovery Studio
Based on the results of this bioinformatics study, one of the most potent compounds is gallocatechin gallate, as shown in Figure 6a-e. (−)-Gallocatechin gallate is a gallate ester formed by the formal condensation of the carboxy group of gallic acid with the (3R)-hydroxy group of (−)-gallocatechin. It is a natural substance found in green tea. It functions as an inhibitor of EC 3.4.22.69 (severe acute respiratory syndrome coronavirus main proteinase), a human xenobiotic metabolite, an antineoplastic agent, and a plant metabolite. It is a gallate ester, a polyphenol, and a catechin. It is structurally related to (−)-gallocatechin and gallic acid, and it is the enantiomer of (+)-gallocatechin gallate.

Molecular docking validation using AutoDock tools
Grid box validation requires the structures of the native ligand MUP and the 1LPB macromolecule in PDB format. This validation is necessary for the formation of a ligand-protein complex. Validation is done with three grid box sizes. The native ligand is methoxyundecylphosphinic acid (C12H27O3P), taken from the 3-D macromolecular structure (PDB ID: 1LPB).
The protein target (PNLIP) was then docked with 14 active compounds with known pharmacokinetic and physicochemical properties and 1 native ligand to determine which target protein has the strongest role in obesity. The PNLIP (PDB ID: 1LPB) macromolecule [42] was obtained from the information contained in the active compounds of the H. schizopetalus extract. An analysis related to the target of obesity has been carried out. The results of the docking can be seen in Table 4.
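Comparing docked compounds against a reference drug is a matter of sorting the scores. The sketch below assumes binding affinities are reported in kcal/mol, where a more negative value means stronger predicted binding; the function name is hypothetical, and the names and numbers in the usage note are placeholders rather than values from Table 4.

```python
def better_than_reference(scores, reference):
    """Return (name, score) pairs that bind more strongly than the reference.

    scores maps compound name to binding affinity in kcal/mol (assumed);
    the result is sorted from the strongest (most negative) score upward.
    """
    if reference not in scores:
        raise KeyError(f"reference compound {reference!r} has no score")
    ref_score = scores[reference]
    hits = [
        (name, score)
        for name, score in scores.items()
        if name != reference and score < ref_score
    ]
    return sorted(hits, key=lambda item: item[1])
```

With placeholder scores such as `{"reference_drug": -6.0, "compound_A": -7.5, "compound_B": -5.0}`, only `compound_A` is returned, mirroring the way a single hit is singled out when it outperforms the commercial drug.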

Preparation of ligands from active compounds of H. schizopetalus
The molecular structures of the active compounds from H. schizopetalus and of Orlistat as the commercial drug were

MDs simulation
Based on the docking results for the ligand-protein complexes, the active compound gallocatechin gallate has the highest binding energy, so this compound was carried forward to MD analysis. The results of the MDs simulation at 34.60 ns are shown in Table 8.
The MDs simulation results for the PNLIP protein with the active compound gallocatechin gallate showed binding stability at times above 25 ns. RMSF shows the stability of the binding site between the compound and the target protein [43]. The RMSF in Table 8 is close to zero, which means that the residues at the binding site show almost no fluctuation.
as shown in Table 6. The grid box was set to the center position of the ligand.

Redocking using AutoDock tools
The RMSD of heavy atoms between the docked conformation and the experimental (crystallographic) conformation, which should be no more than 2-3 Å, can be used to evaluate the validity of a molecular docking process. The grid box dimensions were 60 in each direction. The target protein was prepared, and the grid box was set to the center position of the ligands, as shown in Table 7.
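Centering the grid box on the ligand means taking the geometric center of the ligand atoms. The sketch below reads the coordinates from PDB-format ATOM/HETATM lines and returns the center together with the box dimensions of 60 stated above; the function name and the layout of the returned dictionary are hypothetical, and no grid spacing is assumed.

```python
def grid_box(ligand_pdb_lines, size=60):
    """Return a grid box definition centered on the ligand.

    Coordinates are read from PDB columns 31-54 of ATOM/HETATM records.
    The center is the unweighted mean of the atom positions.
    """
    coords = [
        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for line in ligand_pdb_lines
        if line.startswith(("ATOM", "HETATM"))
    ]
    if not coords:
        raise ValueError("no atom records found in the ligand lines")
    center = tuple(round(sum(axis) / len(coords), 3) for axis in zip(*coords))
    return {"center": center, "size": (size, size, size)}
```

The returned center can be entered as the grid center in AutoDockTools, and it can be compared with the values reported for the validated grid box.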

ETHICAL APPROVALS
This study does not involve experiments on animals or human subjects.

Figure 2. (a) Network construction of obesity and active compounds. (b) Combined network construction and active compounds.

Figure 4. KEGG pathway of fat digestion and absorption.

Table 3. Complex proteins and ligands from the PDB database.

Table 4. Binding affinity of the protein targets and active compounds of H. schizopetalus.

Table 5. Docking scores, H-bond interactions, electrostatic/hydrophobic interactions, and the binding affinity of the best hit compound (BIOVIA Discovery Studio).