
In-silico analysis of deleterious non-synonymous SNPs in the human AVPR1a gene linked to autism

Abstract

Single nucleotide polymorphisms (SNPs) are the most prevalent type of DNA variation, each occurring at a single nucleotide within the genomic sequence. The AVPR1a gene is polymorphic and is linked to neurological and developmental problems, including autism spectrum disorder. Because studying all non-synonymous single nucleotide polymorphisms (nsSNPs) of the AVPR1a gene in the general population is difficult, our goal was to use a computational approach to identify the most detrimental nsSNPs of the AVPR1a gene. We employed several bioinformatics tools, such as SNPnexus, PROVEAN, PANTHER, PhD-SNP, SNPs&GO, and I-Mutant2.0, to detect the 23 most detrimental mutants (R85H, D202N, E54G, H92P, D148Y, C203G, V297M, D148V, S182N, Q108L, R149C, G212V, M145T, G212S, Y140S, F207V, Q108H, W219G, R284W, L93F, P156R, F136C, P107L). We then used other bioinformatics tools to perform domain and conservation analysis. We analyzed the consequences of high-risk nsSNPs on active sites and post-translational modification (PTM) sites, as well as their effects on protein stability. 3D modeling, structure validation, protein-ligand binding affinity prediction, and protein-protein docking were conducted to characterize five significant substitutions (R284W, Y140S, P107L, R149C, and F207V) and to explore the changes these mutants induce. These nsSNPs are potential targets for future investigations into illnesses caused by AVPR1a malfunction. Employing in-silico methodologies to evaluate AVPR1a gene variants will facilitate the coordination of extensive investigations and the formulation of specific therapeutic approaches for diseases associated with these variations.


Introduction

Human genomes are approximately 99.9% identical, with individual genetic variation making up the remaining 0.1%. These genetic differences arise from random mutations [1]. The most common kind of genetic variation in humans is the single-nucleotide polymorphism (SNP), an invaluable resource for deciphering complex genetic traits [2]. Non-synonymous single nucleotide polymorphisms (nsSNPs), also known as missense mutations, have the potential to induce phenotypic diversity in humans through modifications in protein expression [3]. Prior research suggests that nsSNPs account for around 50% of the mutations linked to different genetic disorders [4]. Substituting amino acids in conserved regions can affect the structure, stability, and function of proteins, and nsSNPs that alter protein function can in turn elevate susceptibility to human diseases [5]. Autism spectrum disorder (ASD) is a severe neuropsychiatric illness with strong hereditary underpinnings. Nevertheless, the genetic factors that contribute to autism are quite diverse, with several loci playing distinct roles in different individuals [6].

Autism is a neurodevelopmental condition caused by several genes, with more than 90% of cases being influenced by genetics [7]. Arginine vasopressin (AVP) is an endogenous ligand that spontaneously binds to and stimulates AVPR1A receptors in both the peripheral and central nervous systems. AVPR1A, the arginine vasopressin receptor 1A, has a profound influence on behaviors such as forming pair bonds, providing parental care, displaying aggression, and managing stress [8,9,10,11]. This receptor plays a crucial role in brain signaling. Pharmacological approaches and the examination of various animal models have demonstrated the benefits of understanding the role of AVPR1A in behavior [12, 13]. AVP receptors have seven transmembrane domains and are categorized as G-protein-coupled receptors. At least three types of vasopressin receptors (V1R/V1a, V2R, and V3R/V1b) have been found in humans. AVPR1a, located on chromosome 12q14-15, is especially relevant to human behavioral research. This is because the specific patterns of V1a receptor gene expression in the brain play a significant role in the observed variations in social and reproductive behavior within and between species. The vole model has demonstrated this relationship [14,15,16].

Preclinical research has demonstrated that arginine vasopressin (AVP) enhances some social behaviors, such as association and connection, through interacting with the V1a receptor (AVPR1A) in the brain. The effects of AVP on behavior and the location of the V1a receptor in the brain differ significantly among different mammalian species [17]. This suggests that the AVPR1a gene is a probable candidate for susceptibility to autism [18]. Previous studies investigating familial ties have demonstrated a strong association between the AVPR1A gene and autism [19]. The presence of two microsatellite polymorphisms, RS1 and RS3, in the vicinity of the promoter region of AVPR1A, which codes for the receptor subtype primarily responsible for regulating behavior, has been linked to autism and behavioral traits [20, 21]. The severity of autistic traits can be significantly influenced by a single nucleotide polymorphism (SNP) of the AVPR1a gene [22]. The AVPR1a gene encodes the vasopressin V1a receptor, one of the primary receptors for arginine vasopressin (AVP). A low arginine vasopressin (AVP) concentration in cerebrospinal fluid (CSF) is an indicator of social impairment in monkeys with low social behavior and in autistic children [23]. An extensive association study was conducted involving three microsatellites and twelve tag single nucleotide polymorphisms (SNPs) situated within and near the AVPR1A gene in 205 Finnish families. This was followed by an assessment of the gene’s promoter, which revealed a significant correlation with autism [24]. A study was undertaken in the Korean population to evaluate the relationship between autism spectrum disorder and changes in the AVPR1A promoter region. The study used a family-based association test (FBAT) for this purpose. The results suggest that alterations in the AVPR1A promoter region may have a role in the development of ASD and the regulation of AVPR1A expression [25].
Here, we explored several computational approaches to pin down deleterious non-synonymous polymorphisms in the human AVPR1A gene.

Materials and methods

The overall workflow of this project is shown in Fig. 1.

Fig. 1

Project workflow

Retrieval of SNPs

A total of 402 nsSNPs associated with the human AVPR1a gene were retrieved from the dbSNP database (https://www.ncbi.nlm.nih.gov/). We collected information on each SNP, including SNP ID, protein accession number, location, residue alteration, and global minor allele frequency (MAF) [26]. The AVPR1a sequence was sourced from UniProt (https://www.uniprot.org). This study investigated the potentially harmful effects of missense SNPs in the AVPR1a gene.

GeneMANIA to understand AVPR1a interactions with other genes

GeneMANIA (https://genemania.org/) was used to investigate the relationship between the AVPR1a gene and other genes based on pathways, expression, localization, genetics, and protein interaction. This tool confirms the connective network between the AVPR1a gene and other genes [27].

Screening of deleterious nsSNPs

We employed two bioinformatics tools to evaluate the likely impact of the genetic variants extracted from the dbSNP database: SNPnexus and PROVEAN. SNPnexus (https://www.snp-nexus.org) incorporates Sorting Intolerant from Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) [28]. SIFT predicts harmful nsSNPs by examining protein homology sequences and natural nsSNP alignments. A score below 0.05 indicates that SIFT considers the nsSNP to have a deleterious effect on protein function [29]. PolyPhen-2 predicts the functional impact of amino acid substitutions on protein structure and function using sequence-based characterization [30]. PolyPhen generates a position-specific independent count (PSIC) score for each amino acid variant. Differences in PSIC scores between variants indicate their direct functional impact [31, 32]. PROVEAN, the second tool we used to screen deleterious nsSNPs, predicts the functional impact of variants. A score below the threshold of −2.5 indicates a deleterious nsSNP, whereas a score above it is considered neutral [33].
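The screening cutoffs can be sketched as a small filter. This is only an illustration, not the pipeline used in the study: the class and function names, the variant labels, and all scores are hypothetical. The strict mode mirrors the stricter selection reported in the Results (SIFT score of 0 and PolyPhen score of 1), and treating a PROVEAN score exactly equal to −2.5 as neutral is an assumption.

```python
from dataclasses import dataclass

SIFT_CUTOFF = 0.05      # SIFT: scores below this are called deleterious
PROVEAN_CUTOFF = -2.5   # PROVEAN: scores below this are called deleterious


@dataclass
class VariantScores:
    name: str        # amino acid substitution label (hypothetical)
    sift: float      # 0 (most damaging) to 1
    polyphen: float  # 0 to 1 (most damaging)
    provean: float   # more negative means more damaging


def is_deleterious(v: VariantScores, strict: bool = False) -> bool:
    """Apply the PROVEAN and SIFT cutoffs; strict mode demands SIFT 0 and PolyPhen 1."""
    if v.provean >= PROVEAN_CUTOFF:
        return False
    if strict:
        return v.sift == 0.0 and v.polyphen == 1.0
    return v.sift < SIFT_CUTOFF


# Hypothetical scores for illustration only; these are not values from the study.
variants = [
    VariantScores("A10V", sift=0.00, polyphen=1.00, provean=-4.2),
    VariantScores("G20S", sift=0.03, polyphen=0.90, provean=-3.0),
    VariantScores("L30P", sift=0.00, polyphen=1.00, provean=-1.1),
]

lenient_hits = [v.name for v in variants if is_deleterious(v)]
strict_hits = [v.name for v in variants if is_deleterious(v, strict=True)]
print(lenient_hits)  # ['A10V', 'G20S']
print(strict_hits)   # ['A10V']
```

Keeping the cutoffs as named constants makes it easy to see how the candidate list changes when a threshold is relaxed.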

Confirmatory analysis of the deleterious nsSNPs

We cross-checked our screened nsSNPs with three additional bioinformatics tools to reconfirm their severity and deleterious nature. PANTHER (http://pantherdb.org) compiles the biological and evolutionary information for every protein-coding gene [34]. PPh-2 (http://genetics.bwh.harvard.edu/pph2) predicts how point mutations affect protein structure and function [35]. MutPred2 (http://mutpred.mutdb.org/) uses molecular and biological data to assess the possible structural consequences of nsSNPs arising from alterations in proteins [36].

Screening of disease-associated SNPs

To examine the association of the screened nsSNPs with disease, we used PhD-SNP, SNPs&GO, and Meta-SNP. To categorize an SNP’s effect as either disease-related or neutral, the PhD-SNP tool (https://snps.biofold.org/phdsnp/phd-snp.html) generates an accuracy index score from 36,000 benign and harmful SNVs. It was developed and verified using the ClinVar dataset [37]. SNPs&GO (https://snps-and-go.biocomp.unibo.it/snps-and-go) assesses changes in amino acids at a particular location in a protein [38]. SNPs&GO and PhD-SNP are machine learning approaches that leverage comparative conservation scores derived from multiple sequence alignments [39]. Meta-SNP (https://snps.biofold.org/meta-snp) takes the outputs of individual predictors as input and predicts a mutation to be disease-associated if its score surpasses a threshold of 0.5 [40].

Functional effects of SNPs on protein stability

To determine the changes in protein stability, we used three different tools: MUpro, I-Mutant 2.0, and INPS3D. Protein stability assessment is commonly conducted using the MUpro server (http://mupro.proteomics.ics.uci.edu). This web server is built using two machine learning techniques: Support Vector Machines (SVM) and Neural Networks. These techniques assess how single-site changes in amino acids affect the stability of proteins and display the results as a rise or fall, denoted by positive or negative scores [41]. The neural network technique is employed by the I-Mutant 2.0 web server (https://folding.biofold.org/i-mutant/i-mutant2.0.html). It is applied to predict potential changes in protein stability after mutations. A reliability index (RI) of 0 to 10, with 10 denoting the maximum dependability, is used to make predictions. The server also assesses the degree of protein instability and gives a free energy change number (ΔΔG) that shows if stability will rise or fall. A protein stability decrease is indicated by a ΔΔG value less than 0, whereas an increase in protein stability is suggested by a value greater than 0 [42]. Protein stabilities can be predicted for both wildtype and mutant variants using a recently developed tool named INPS3D. The INPS-MD (Impact of Non-synonymous mutations on Protein Stability—Multi Dimension) web server (https://inpsmd.biocomp.unibo.it/inpsSuite/default/index3D) was utilized for this purpose. This tool takes into account several variables, including the molecular weights and hydrophobicities of the native and mutated amino acids, the alignment score difference, the likelihood of the original residue undergoing mutation, the relative solvent accessibility (RSA) of the original amino acid, and the local energy difference between the wildtype and altered protein structures [43].
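The sign convention described above can be sketched as follows. The majority-vote step is not part of the described method; it is one hypothetical way to combine the three predictors. The example scores are invented, and the assumption that all three tools share the same sign convention (negative means destabilizing) should be checked against each server's output.

```python
from collections import Counter
from typing import Mapping


def stability_call(score: float) -> str:
    """Map a predicted stability change to a label (negative = less stable)."""
    if score < 0:
        return "decrease"
    if score > 0:
        return "increase"
    return "no change"


def consensus_call(scores: Mapping[str, float]) -> str:
    """Return the label backed by a strict majority of tools, else 'ambiguous'."""
    calls = Counter(stability_call(s) for s in scores.values())
    call, votes = calls.most_common(1)[0]
    return call if votes > len(scores) / 2 else "ambiguous"


# Hypothetical predictions for a single variant; not values from the study.
example = {"MUpro": -0.8, "I-Mutant2.0": -1.2, "INPS3D": 0.3}
print({tool: stability_call(s) for tool, s in example.items()})
print(consensus_call(example))  # decrease
```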

Domain analysis of AVPR1a

We utilized a widely used computational tool, InterPro (https://www.ebi.ac.uk/interpro/), to identify the functional domains of our desired protein (AVPR1a) [44]. This application uses a database of protein families, domains, and functional sites to find motifs and domains of proteins and, in turn, determine their functional characterization [45].

Conservation analysis

To evaluate the amino acid conservation pattern within the protein sequence, we used the PredictProtein server (https://predictprotein.org). The AVPR1a protein’s single-letter amino acid sequence was submitted for evaluation. More than thirty tools are integrated with this service, including ConSurf and other techniques for finding functional regions. Evolutionary conservation was analyzed using Bayesian empirical inference [46].

Predictions of ligand binding sites

The meta-server program COACH (http://zhanglab.ccmb.med.umich.edu/COACH/) uses two comparison techniques, TM-SITE and S-SITE, to find ligand binding templates in the BioLiP protein function database in order to predict protein-ligand binding sites. These techniques also draw on sequence feature correlations and binding-specific substructure. To predict ligand binding sites (LBS), COACH employs a consensus approach that combines the predictions from several algorithms, including TM-SITE, S-SITE, COFACTOR, FINDSITE, and ConCavity. Cluster size, PDB hits, ligand names, consensus binding residues, and downloadable complex structures are the factors used by the COACH server to select the top ten models. Each model is then given a C-score. The C-score, which ranges from 0 to 1, indicates the expected reliability of the prediction; higher scores correspond to higher reliability [47].

Prediction of post translational modification (PTM) site

The widely used neural network-based program NetPhos 3.1 (https://services.healthtech.dtu.dk/services/NetPhos-3.1/) was used to predict the probable phosphorylation sites of the AVPR1a protein. A score above the threshold of 0.5 suggests that a site is probably phosphorylated [48]. To predict probable MHC-binding sites, we employed GPS-MBA 1.0 (https://mba.biocuckoo.org/) [49]. To identify potential SUMOylation and ubiquitylation sites, we used GPS-SUMO (https://sumosp.biocuckoo.org/) and GPS-Uber (http://gpsuber.biocuckoo.cn/wsresult.php) [50, 51].

3D modeling

The native structure of the AVPR1a protein was downloaded from the AlphaFold protein structure database (AlphaFold DB, https://alphafold.ebi.ac.uk/) [52]. The mutant protein structures were predicted with AlphaFold2 [53]. For each mutant, the protein sequence was modified by substituting the amino acid at the relevant position. To minimize steric clashes, obtain precise side-chain positions, and eliminate distracting stereochemical violations without compromising accuracy, we used gradient descent in the Amber force field through AlphaFold2 to predict relaxed structures [54]. The ModRefiner tool (http://zhanglab.ccmb.med.umich.edu/ModRefiner) was used to refine the predicted structures [55].
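Preparing a mutant sequence for structure prediction amounts to a checked single-residue substitution. The helper below is a minimal sketch with a hypothetical function name, and the ten-residue toy sequence is not the AVPR1a sequence. It verifies that the wild-type residue named in the mutation actually occurs at the stated 1-based position before substituting, which guards against off-by-one numbering errors.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return `sequence` with a substitution such as 'R284W' applied (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


# Toy sequence for illustration only; not the AVPR1a sequence.
toy_sequence = "MRLSAGPDAG"
toy_mutant = apply_substitution(toy_sequence, "R2H")
print(toy_mutant)  # MHLSAGPDAG
```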

Structural validation and RMSD calculation

The selected structural model was validated using the widely accepted server SAVES v6.0 (https://saves.mbi.ucla.edu). This site offers tools like PROCHECK and ERRAT to assess the overall quality of the 3D model [56]. Furthermore, the Ramachandran plot produced by PROCHECK was used to evaluate the model’s quality [57]. Verify3D evaluates the compatibility of a protein’s tertiary structure with its primary structure [58]. We used PyMOL (https://pymol.org/2) to compute the root-mean-square deviation (RMSD) by superimposing the native and mutant protein structures. The RMSD represents the difference between the two compared models, and a higher value indicates a greater deviation between the two structures. The template modeling score (TM-score) was then calculated by comparing the wild-type protein structure with the mutant protein structures using TM-align (https://zhanglab.ccmb.med.umich.edu/TM-align) [60]. On a scale ranging from 0 to 1, the TM-score evaluates the structural similarity of two models; a score of 1 denotes a perfect match, while lower values indicate growing dissimilarity [59].
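As a reminder of what the RMSD measures, the sketch below computes it for two sets of paired atom coordinates that are assumed to be superimposed already. It does not perform the superposition itself and is not a substitute for PyMOL's alignment. The coordinates are invented for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired, pre-superimposed coordinates (in Å)."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("Coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Invented coordinates: every atom of the second model is shifted 1 Å along z.
native_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
mutant_atoms = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(native_atoms, mutant_atoms))  # 1.0
```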

Protein-ligand interaction analysis

We docked all chosen ligands with AVPR1a using the PyRx program (https://pyrx.sourceforge.io), which combines AutoDock and AutoDock Vina [61]. Virtual ligand screening was carried out using the Lamarckian genetic algorithm (LGA) [62]. AutoDock tools were applied to convert PDB files to PDBQT format and to determine binding affinities. The grid size was adjusted around the grid center along the X, Y, and Z axes. The grid box center remained at coordinates X: −7.4813, Y: 4.9867, Z: 10.6823, with dimensions set to X: 109.4936, Y: 98.4684, and Z: 130.9739 Å [63]. The binding affinities of the ligands to the receptors were computed in kcal/mol; more negative values indicate stronger binding of the ligand to the target receptor [64]. Discovery Studio (https://discover.3ds.com/discovery-studio-visualizer-download) was used to visualize 2D and 3D interactions between ligands and proteins. It depicted the position and size of binding sites, nonbonding interactions, and the bond angles and lengths of a docked ligand [65].
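The reported grid box can be turned into explicit bounds to check which coordinates the search space covers. The sketch below uses the center and dimensions quoted above; the helper names are hypothetical. It also writes the box in the `center_x`/`size_x` key style of an AutoDock Vina configuration file, on the assumption that the reported dimensions are box edge lengths in Å.

```python
CENTER = (-7.4813, 4.9867, 10.6823)     # grid box center (X, Y, Z) from the text
SIZE = (109.4936, 98.4684, 130.9739)    # grid box dimensions (X, Y, Z), assumed to be in Å


def box_bounds(center, size):
    """Return the lower and upper corners of an axis-aligned box."""
    lower = tuple(c - s / 2 for c, s in zip(center, size))
    upper = tuple(c + s / 2 for c, s in zip(center, size))
    return lower, upper


def contains(point, center, size):
    """True if `point` lies inside the box (boundaries included)."""
    lower, upper = box_bounds(center, size)
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))


def vina_config(center, size):
    """Format the box using AutoDock Vina's configuration keys."""
    lines = [f"center_{axis} = {c}" for axis, c in zip("xyz", center)]
    lines += [f"size_{axis} = {s}" for axis, s in zip("xyz", size)]
    return "\n".join(lines)


lower_corner, upper_corner = box_bounds(CENTER, SIZE)
print(lower_corner, upper_corner)
print(vina_config(CENTER, SIZE))
```

A box of this size is large relative to a single binding pocket, so a containment check like `contains` is a quick way to confirm that the whole receptor falls inside the search space.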

Analyze docking results of protein-protein complex by ClusPro

We utilized the ClusPro web server (https://cluspro.org) to conduct protein-protein docking analysis. This tool is extensively employed for studying protein-protein docking interactions. ClusPro offers various sophisticated options to tailor the search procedure, such as removing unstructured protein regions, applying attraction or repulsion forces, considering pairwise distance constraints, producing homo-multimers, incorporating data from small-angle X-ray scattering (SAXS), and locating heparin-binding sites. Based on the type of protein, six different energy functions are accessible. Ten models, each with a center of densely packed clusters of low-energy docked structures, are produced by docking with each set of energy parameters [66].

Molecular dynamics (MD) simulation

The protein–ligand complexes were subjected to MD simulations using GROMACS [67] and the WebGro server (https://simlab.uams.edu/). The ligand topology files were generated using the PRODRG Server [68], with a triclinic simulation box employed for system setup. The complexes were solvated using the SPC water model, and the system was neutralized by adding 0.15 M NaCl. The simulations were performed using the GROMOS96 43a1 force field. An initial energy minimization was carried out with 5000 steps of the steepest descent algorithm. Subsequently, the system was equilibrated under NVT and NPT ensembles with standard parameters, maintaining a temperature of 300 K and a pressure of 1.0 bar. Using the Leap-frog MD integrator, the MD trajectories were generated over a 200 ns timescale, with trajectory snapshots taken every 0.1 ns, yielding 2000 frames for analysis. The trajectory snapshots were subsequently analyzed to determine Rg, RMSD, RMSF, and SASA [67, 69].
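Two quantities from this setup are simple to reproduce: the number of saved frames and the radius of gyration (Rg) of a single frame. The sketch below is illustrative only and does not replace the GROMACS analysis tools. The function names are hypothetical, the coordinates are invented, and equal atomic masses are assumed when none are supplied.

```python
import math


def n_frames(total_ns: float, interval_ns: float) -> int:
    """Number of snapshots saved over a trajectory at a fixed interval."""
    return round(total_ns / interval_ns)


def radius_of_gyration(coords, masses=None) -> float:
    """Mass-weighted radius of gyration for one frame of 3D coordinates."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses
    total_mass = sum(masses)
    # Center of mass along each axis.
    com = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((c[axis] - com[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


print(n_frames(200, 0.1))  # 2000
print(radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]))  # 1.0
```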

Results

Download datasets of interest

The SNPs of the AVPR1a gene were acquired from the dbSNP database, which is widely considered the most actively used and comprehensive database currently accessible. According to the NCBI dbSNP database, the human AVPR1a gene contains a total of 4190 single nucleotide polymorphisms (SNPs). Of these, 402 were non-synonymous SNPs (nsSNPs) (Table S1), 177 were synonymous SNPs, 1625 were located in the 3’ UTR, 168 in the 5’ UTR, and 893 in intronic regions. The remaining SNPs were classified into various other categories (Fig. 2). Only the nsSNPs were selected for this investigation.
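The category shares follow directly from the counts above. The short sketch below recomputes them. The "Others" count is derived by subtraction rather than stated in the text, and the percentages may differ from the figure labels in the last digit because of rounding.

```python
TOTAL_SNPS = 4190

# Counts reported in the text.
counts = {
    "3' UTR": 1625,
    "Intron": 893,
    "Nonsynonymous": 402,
    "Synonymous": 177,
    "5' UTR": 168,
}
# Derived by subtraction; not stated explicitly in the text.
counts["Others"] = TOTAL_SNPS - sum(counts.values())

percentages = {name: 100 * n / TOTAL_SNPS for name, n in counts.items()}

for name, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{name:14s} {counts[name]:5d} {pct:6.2f}%")
```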

Fig. 2

SNP distribution graph of AVPR1a gene. 3’ UTR = 38.7%, Others = 22.07%, Intron = 21.31%, Nonsynonymous = 9.6%, Synonymous = 4.20%, 5’ UTR = 4%

GeneMANIA to understand AVPR1a interactions with other genes

GeneMANIA efficiently analyzes the other genes related to the AVPR1a gene. The graphical representation of the analysis is illustrated in Fig. 3. These findings suggest that the AVPR1a gene may have a functional connection to the co-expressed genes and could be involved in common biological pathways. So, if any mutation occurs in the AVPR1a gene, it may also affect the overall gene network interactions among all the related genes.

Fig. 3

Other genes related to the AVPR1a gene. Different colors of the legend indicated the parameters of constructed networks

Screening of deleterious nsSNPs

SIFT and PolyPhen scores provide an initial indication of deleterious SNPs. Both output scores range from 0 to 1; a SIFT score of 0 and a PolyPhen score of 1 denote the most deleterious nsSNPs. We specifically chose the nsSNPs that received both a score of 0 from SIFT and a score of 1 from PolyPhen. These selection criteria ensure that only the most harmful SNPs are included in our study. PROVEAN was then used to narrow down the most harmful SNPs further. The threshold value of this tool is −2.5: nsSNPs scoring below −2.5 were classified as harmful, whereas those scoring above −2.5 were predicted to be neutral. In total, we identified 23 nsSNPs that met the specified criteria. These nsSNPs were classified as having a high likelihood of impacting protein function (Table 1).

Table 1 Risky nsSNPs screened by SIFT, PolyPhen, and PROVEAN

Confirmatory analysis of the deleterious nsSNPs

Three additional computational tools were applied to reconfirm the detrimental nature of the initially screened nsSNPs. The combined predictions of PANTHER, PPh-2, and MutPred2 were used to finalize the set of deleterious SNPs for further analysis (Table 2). PANTHER predicts how the nsSNPs will affect protein function. Every screened nsSNP was predicted by PPh-2 to be harmful (PSIC > 0.5); these variants were expected to be extremely harmful, with a PSIC score of 1. The MutPred2 score shows how likely it is that an amino acid change will impact the function of the protein. Pathogenicity is predicted using a score threshold of 0.5: the higher the score, the more probable it is that an amino acid substitution is linked to disease.

Table 2 Confirmatory assessment of screened deleterious nsSNPs through PANTHER, PPh-2, and MutPred2

Screening of disease-associated SNPs

Identifying disease-related nsSNPs is crucial for further analysis. The Meta-SNP algorithm detected 22 nsSNPs that were associated with disease, excluding D202N (rs1267958616). SNAP classified the G212S (rs376518166) mutation as neutral and deemed the remaining mutations to be associated with disease. The PhD-SNP analysis identified 3 nsSNPs as neutral (E54G, V297M, and R284W) and confirmed the rest as disease-causing. SNPs&GO identified all 23 nsSNPs as associated with disease. The complete prediction results are summarized in Table 3.

Table 3 Prediction of disease-associated SNPs by Meta-SNP, SNAP, PhD-SNP, and SNPs&GO

Impact of screened nsSNPs on the stability of proteins

According to MUpro, Q108L and P107L make the protein more stable, whereas the remaining 21 nsSNPs were predicted to make it less stable, which would reduce protein activity. I-Mutant2.0 predicted that S182N and Q108L increase protein stability and that the remaining nsSNPs decrease it. The predicted structural effects of the 23 nsSNPs were also obtained from INPS3D. The outputs of the protein stability evaluation are presented in Table 4.

Table 4 Results of protein stability changes due to the nucleotide polymorphisms predicted by MUpro, I-Mutant2.0, and INPS3D

Detection of nsSNPs on the AVPR1a domains

InterPro predicted two functional domains of AVPR1a: the seven-transmembrane rhodopsin-like G protein-coupled receptor domain (GPCR_Rhodpsn_7TM, amino acids 68 to 348) and the conserved C-terminal domain of vasopressin V1 receptors (V1R_C, amino acids 372 to 418) (Fig. 4). The domain analysis indicated that 22 out of 23 nsSNPs are positioned in the large GPCR_Rhodpsn_7TM domain. Polymorphisms in the domain region could significantly alter the activity of the protein.
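The domain assignment can be reproduced by comparing each variant's residue position with the reported domain boundaries. The sketch below uses the 23 substitutions listed in the abstract and the InterPro ranges given above. The parsing helper is hypothetical and assumes single-letter substitution labels.

```python
# The 23 high-risk substitutions listed in the abstract.
VARIANTS = [
    "R85H", "D202N", "E54G", "H92P", "D148Y", "C203G", "V297M", "D148V",
    "S182N", "Q108L", "R149C", "G212V", "M145T", "G212S", "Y140S", "F207V",
    "Q108H", "W219G", "R284W", "L93F", "P156R", "F136C", "P107L",
]

# Domain boundaries (inclusive residue ranges) as reported by InterPro in the text.
DOMAINS = {
    "GPCR_Rhodpsn_7TM": (68, 348),
    "V1R_C": (372, 418),
}


def position(variant: str) -> int:
    """Extract the residue number from a label such as 'R284W'."""
    return int(variant[1:-1])


def domain_of(variant: str) -> str:
    """Name of the domain containing the variant, or 'outside' if none does."""
    pos = position(variant)
    for name, (start, end) in DOMAINS.items():
        if start <= pos <= end:
            return name
    return "outside"


in_gpcr = [v for v in VARIANTS if domain_of(v) == "GPCR_Rhodpsn_7TM"]
outside = [v for v in VARIANTS if domain_of(v) == "outside"]
print(len(in_gpcr), outside)  # 22 ['E54G']
```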

Fig. 4

Schematic representation of the matched high-risk nsSNPs in the MHC binding and phosphorylation sites (PTM sites) into the domain area of the AVPR1a protein

Conservation analysis

The conservation analysis demonstrated a significant level of structural and functional conservation across AVPR1a residues. The PredictProtein server provides three levels of conservation score: 1–3 (minimal), 4–6 (mid-level), and 7–9 (high). We focused solely on residues scoring 7–9, which correspond to the highly conserved region (Figure S2). All 23 detected nsSNPs were located in the highly conserved region (Table 5). Prior research has demonstrated that essential amino acids, which play an active role in multiple biological functions, are situated within a protein’s conserved regions. Therefore, it may be inferred that nsSNPs at highly conserved positions have a significant detrimental impact on both the structural and functional characteristics of the AVPR1a protein.

Table 5 Conservation analysis results of the 23 identified mutants evaluated by the PredictProtein server

High‑risk nsSNPs consequences on ligand binding sites

We employed the COACH server to forecast the ligand binding location of the AVPR1a protein. The COACH server utilizes a combination of programs from TM-SITE, S-SITE, COFACTOR, FINDSITE, and ConCavity to estimate the combined output. The predicted binding site residues are E54, Q108, W111, Q131, V132, M135, F136, D202, C203, W204, F207, Q209, K128, W304, F307, F308, M220, I224, S338, A334, Q311. We noticed that E54, Q108, F136, D202, C203, and F207 positions matched our screened highly deleterious nsSNPs. Hence, we can conclude that mutation of these positions can significantly alter the function of the protein due to active site modifications.
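The overlap between predicted binding-site residues and high-risk variants is a set intersection on residue positions. The sketch below uses the positions listed above and the 23 substitutions from the abstract. It matches on position only and does not consider the amino acid identity.

```python
# Residue positions of the predicted ligand-binding site listed in the text.
BINDING_SITE = {
    54, 108, 111, 131, 132, 135, 136, 202, 203, 204, 207,
    209, 128, 304, 307, 308, 220, 224, 338, 334, 311,
}

# The 23 high-risk substitutions listed in the abstract.
VARIANTS = [
    "R85H", "D202N", "E54G", "H92P", "D148Y", "C203G", "V297M", "D148V",
    "S182N", "Q108L", "R149C", "G212V", "M145T", "G212S", "Y140S", "F207V",
    "Q108H", "W219G", "R284W", "L93F", "P156R", "F136C", "P107L",
]

# Keep the variants whose residue number falls in the binding site.
hits = [v for v in VARIANTS if int(v[1:-1]) in BINDING_SITE]
hit_positions = sorted({int(v[1:-1]) for v in hits})

print(hit_positions)  # [54, 108, 136, 202, 203, 207]
print(hits)
```

Two substitutions (Q108L and Q108H) share position 108, so seven variants map onto the six matching positions.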

High‑risk nsSNPs consequences on post-translational modification (PTM) site

NetPhos 3.1 predicted 43 probable phosphorylation sites in the AVPR1a protein (Table S2). The positions are S4, S16, T23, T28, S29, T61, S70, T79, T83, S84, S94, T114, S138, Y150, S167, S182*, T183, Y186, S190, T206, S213, T234, S253, S256, S278, S281, S283, T289, T323, S338, S341, S380, S382, T386, Y388, S389, S393, S397, S404, S407, S408, S410, S417 (Figure S3). Among them, position S182 matched one of our predicted highly damaging nsSNPs. To identify the MHC binding sites of the AVPR1a protein, we employed the GPS-MBA 1.0 tool (Table S3). The position ranges are 22–33, 26–34, 29–37, 96–104, 100–108, 154–162, 253–261, 260–268, 273–281. Within the range 100–108, positions 107 and 108 match our screened risky nsSNPs. The presence of highly damaging nsSNPs at these PTM sites indicates that mutation of those positions can significantly affect protein activity (Fig. 4). We also predicted SUMOylation and ubiquitylation sites of the AVPR1a protein, but none coincided with our screened risky nsSNPs.

3D modeling

We predicted the 3D structures of all 23 mutants using the AlphaFold2 Colab platform. We used relax number 5 for all of the mutants to predict the proper relaxed structure. For each of the 23 mutants, we used a protein sequence modified at the corresponding mutation position. The predicted protein structures were downloaded in PDB format. We obtained the wild-type protein sequence from the UniProt database and downloaded the wild-type structure from the AlphaFold Protein Structure Database (Figure S1).

Structural validation and RMSD calculation

The modeled structures were validated by the SAVES v6.1 server, and the evaluation of the secondary structure was conducted using the Ramachandran plot. The Ramachandran plot revealed that a large proportion of the amino acid residues in the predicted structures occupied the most favored regions. The comprehensive validation results, RMSD values, and TM-scores for all mutants are presented in Table 6. We calculated the RMSD of all 23 mutants using PyMOL (Fig. 5) and nominated 5 mutants (R284W, Y140S, P107L, R149C, and F207V) based on the maximum RMSD value (Fig. 6). The TM-scores of all screened mutants also indicate the degree of structural similarity between the native and mutant protein models.

Table 6 Structural validation and RMSD calculation results of all the 23 High‑risk nsSNP mutants by several rigorous analysis tools
Fig. 5

RMSD line graph of selected deleterious mutants

Fig. 6

Comparison between native protein structure and nominated 5 mutant forms (R284W, Y140S, P107L, R149C, and F207V). The green color denotes native residues, while the mutated residues are yellow

Protein-ligand interaction analysis

Because our targeted gene is associated with autism, for the molecular docking analysis we used 93 reference compounds with potential for autism treatment that are also available in DrugBank, including approved and clinical-trial drugs (Table S4). Our primary goal was to assess the variation in protein-ligand interactions between native and mutant proteins. We downloaded the reference compound structures in SDF format from the PubChem database. After molecular docking analysis, we selected the top 3 compounds for each target based on the strongest binding affinity. The selected top 3 drug compounds represent lead molecules for each mutated variant and may have the potential to act against the reported deleterious SNPs (Table 7). We used Discovery Studio to visualize the 2D interactions (Fig. 7). We found significant differences between the native protein and the mutants in binding affinity and in the interacting residues responsible for hydrogen bond and hydrophobic interaction formation (Table 7). In some cases, the best-binding molecules were entirely different from those of the native protein. For example, mutant F207V exhibits an entirely new binding interaction profile compared to the native protein. Hence, we conclude that polymorphisms can drastically alter the protein’s conformation.

Table 7 Binding affinities and interacting residues of protein ligands of native AVPR1a and mutants R284W, Y140S, P107L, R149C, and F207V, acquired through molecular docking and non-bond interaction analysis
Fig. 7

2D visual representation of wild protein and mutant variants (R284W, Y140S, P107L, R149C, and F207V) with ligands including their binding residues

Analyze docking of protein-protein complex by ClusPro

To assess the variation in the protein-protein docking results, we used the native protein as a reference and evaluated the changes against the mutants. We performed protein-protein docking for 6 protein-protein complexes: Native-Native, Native-F207V, Native-P107L, Native-R149C, Native-R284W, and Native-Y140S. We noticed significant variation in the binding energy among the 6 complexes (Fig. 8). Thus, we conclude that mutations can significantly alter the structural and functional characteristics of the protein.

Fig. 8

Binding affinities of 6 protein-protein complexes analyzed by ClusPro

Assessment of MD simulation trajectories

Molecular dynamics (MD) simulations are an essential computational method for studying the conformational flexibility, thermodynamic stability, and time-dependent behavior of biomolecular systems [64]. We performed 200 ns molecular dynamics simulations to validate the docking outputs and to examine the dynamic behaviour of the best-binding drug molecules against the five nominated mutated variants of the target gene in the cellular environment. After the simulations were completed, the dynamic trajectories were examined, and various metrics such as root-mean-square deviation (RMSD), radius of gyration (Rg), solvent-accessible surface area (SASA), and root-mean-square fluctuation (RMSF) were determined (Figs. 9 and 10). To understand the dynamic profile of the native AVPR1A protein, we simulated it as well and used it as a control (AVPR1A_APO) to evaluate the flexibility of the mutated protein forms in complex with the best-binding drug molecules (Y140S_115237, R149C_115237, R284W_5073, P107L_213046, and F207V_46200932). The PubChem CIDs of the studied compounds are: Paliperidone (115237), Risperidone (5073), Lurasidone (213046), and Balovaptan (46200932).

Fig. 9

a Rg and b RMSD analysis of the unliganded control (AVPR1A_APO) and the Y140S_115237, R149C_115237, R284W_5073, P107L_213046, and F207V_46200932 complexes. Each system was simulated for 200 ns

Fig. 10

a RMSF and b SASA analysis of the unliganded control (AVPR1A_APO) and the Y140S_115237, R149C_115237, R284W_5073, P107L_213046, and F207V_46200932 complexes. Each system was simulated for 200 ns

The compactness of the ligand-protein complexes was evaluated from their radius of gyration (Rg) values. Higher Rg values indicate greater conformational instability, while lower Rg values reflect a more stable and rigid complex. All protein-ligand complexes exhibited lower Rg values than the native (unbound) protein, suggesting increased structural compactness upon ligand binding (Fig. 9a). Among the complexes, R284W_5073 displayed the most stable Rg profile, indicating enhanced conformational stability. Although most complexes fluctuated markedly during the first 100 ns, all of them stabilized over 101–200 ns, demonstrating sustained structural integrity over extended dynamics. To evaluate the stability of the protein-ligand complexes, the root-mean-square deviation (RMSD) was also examined. Initial fluctuations (0–50 ns) were observed in all complexes except P107L_213046, which remained relatively stable. Four of the five complexes (R149C_115237, R284W_5073, P107L_213046, and F207V_46200932) reached equilibrium after about 50 ns and maintained stable conformations until the end of the simulation (200 ns). In contrast, the Y140S_115237 complex continued to fluctuate, although its RMSD profile remained acceptable relative to that of the control (Fig. 9b).
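As a reading aid for the two metrics discussed above, the following minimal sketch shows how Rg and RMSD can be computed for a single frame of coordinates. It is not the analysis pipeline used in the study: the function names are hypothetical, coordinates are plain Python tuples, equal atomic masses are assumed unless masses are supplied, and the RMSD function assumes the frames have already been superimposed.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one frame.

    coords: list of (x, y, z) tuples; masses: optional list of atomic masses
    (equal masses are assumed when omitted).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Centre of mass of the frame.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)

def rmsd(coords, reference):
    """RMSD between two frames with identical atom order.

    No superposition is performed here; the frames are assumed pre-aligned.
    """
    if len(coords) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        sum((a[k] - b[k]) ** 2 for k in range(3))
        for a, b in zip(coords, reference)
    )
    return math.sqrt(squared / len(coords))
```

Applying `radius_of_gyration` to every frame and `rmsd` against the first frame would give time series of the kind plotted in Fig. 9; a lower, flatter Rg curve corresponds to the compact, stable behaviour described in the text.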

To examine ligand-induced conformational changes and stabilization in the protein, root-mean-square fluctuation (RMSF) analysis was performed. Except for Y140S_115237, all complexes displayed a range of fluctuations similar to that of the control, suggesting that residue flexibility was largely preserved. The Y140S_115237 complex exhibited slightly higher fluctuations, particularly within residues 380–418 (Fig. 10a). Solvent-accessible surface area (SASA) was analyzed to assess protein folding, structural stability, and the influence of ligands on the protein's surface area. All ligand-bound complexes exhibited a markedly lower SASA profile than the control (AVPR1A_APO) over the entire 200 ns trajectory. The persistently lower SASA values suggest enhanced structural compactness and stability, indicative of tighter packing and more favorable ligand-induced conformational dynamics (Fig. 10b). This consistency supports the stability of the complexes and reinforces their potential as promising candidates for further investigation.
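The per-residue RMSF values discussed above measure how far each atom strays from its own time-averaged position. The sketch below illustrates that definition only; the function name is hypothetical, the trajectory is represented as nested Python lists, and the frames are assumed to have been fitted to a common reference beforehand.

```python
import math

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation over a trajectory.

    trajectory: list of frames, each a list of (x, y, z) tuples with a fixed
    atom order. Frames are assumed to be already fitted to a reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

With one representative atom per residue (for example the alpha carbon, an assumption here), the returned list maps directly onto a residue-wise plot such as Fig. 10a.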

Discussion

In this study, we used several in silico tools to identify the most detrimental missense mutations in the AVPR1a gene. We identified 23 highly deleterious nsSNPs with a strong potential to affect AVPR1a activity. To determine the most significant genetic alterations, we combined predictions from multiple tools and examined the nsSNPs that have a high likelihood of impacting biological processes. AVP and AVPR1a have already been linked to anxiety-like behavior in multiple studies [70, 71], and AVPR1a gene methylation is directly associated with changes in social behavior [72]. A previous study identified two microsatellite polymorphisms in the 5′ flanking region of the AVPR1a gene in 115 autism trios; the authors also screened approximately 2 kb of the 5′ flanking region and the coding region, identifying 10 single nucleotide polymorphisms (SNPs) [17]. Another study suggests that variations in the noncoding regions of the vasopressin 1a receptor gene (AVPR1a) are associated with a range of socioemotional traits in voles, chimpanzees, and humans. These variations may influence behavior by altering gene expression at specific sites [73]. The AVPR1a gene plays a key role in regulating social behaviors, including social interaction, social recognition, pair bonding, and aggression, primarily by encoding the vasopressin receptor 1A (V1aR). A SNP or point mutation in the AVPR1a gene can change gene expression, which could contribute directly to autism. For instance, the AVPR1a SNP rs1042615 was identified in a previous study on autism susceptibility [18]. Altered vasopressin signaling may affect emotional processing and social memory in ASD.

Additionally, polymorphism in the AVPR1a gene has been linked to other health conditions, such as pain. The AVPR1a SNP rs10877969, located in the promoter region of the gene on chromosome 12, was previously identified as a candidate pain SNP [74]. The AVPR1a gene (12q14–15) has three functionally significant microsatellite loci in its promoter region: (GT)25, RS1, and RS3 [75]. The RS3 microsatellite is associated with altruism [76] and autism [17], while RS1 is associated with variation in Novelty Seeking and Harm Avoidance [20] and with autism [21]. The role of the AVPR1a gene in regulating social behavior is supported by experimental research in animal models: in particular, an AVPR1A antagonist reduced aggression [77], while decreased AVPR1A expression resulted in reduced anxiety and social behavior deficits in voles [73]. The promoter region of AVPR1A has polymorphisms that may interact differently with specific transcription factors, affecting quantitative traits such as sociality in autistic children [25]. This body of evidence makes the AVPR1a gene a prime candidate for the investigation undertaken in this study.

We used the NCBI dbSNP database to identify all available SNPs for the AVPR1a gene and then used GeneMANIA to assess the interaction types in the gene network related to AVPR1a. Screening then began with all accessible non-synonymous SNPs of the AVPR1a gene. Figure 2 shows the segmental SNP distribution graph of the AVPR1a gene. Among a total of 402 non-synonymous SNPs of the AVPR1a gene, our meta-analysis with several reliable computational methods ranked the following 23 as the most risky: rs1260022270, rs1267958616, rs1321994497, rs1325662981, rs1337643184, rs1338176647, rs1369668995, rs1377891669, rs1417441306, rs1424280726, rs1440280008, rs1449556252, rs369710823, rs376518166, rs745458336, rs748572296, rs754449459, rs758567125, rs767540299, rs772227542, rs773269527, rs776488571, and rs780705756. We first screened for deleterious nsSNPs and then performed confirmatory analysis and disease association studies for those nsSNPs. We also predicted the probable impact of the screened SNPs on the stability of the protein. Domain analysis identified two functional domains, and 22 of the 23 nsSNPs fall within these functional domains. Conservation analysis was then used to identify the highly conserved regions of the target protein and to locate the screened risky nsSNPs within those regions. Post-translational modification (PTM) site prediction identified probable PTM sites, and ligand binding site prediction pinpointed the active sites of the protein. We found high-risk nsSNPs at both PTM sites and active sites of the AVPR1a protein, indicating that protein function may be significantly altered by polymorphisms at these crucial sites. AlphaFold2 was used to generate relaxed 3D models of the mutant protein sequences, and the wild-type structure was downloaded from the AlphaFold Protein Structure Database. Structure validation was necessary to assess the accuracy of the modeled structures, and we used the SAVES v6.1 server to validate these three-dimensional protein structures.
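The multi-tool screening summarized above can be pictured as a consensus filter over per-tool predictions. The sketch below is only an illustration under stated assumptions: the function name, the variant labels, the tool labels, and the rule that every listed tool must flag a variant are all hypothetical, and the calls shown are placeholders rather than results from this study.

```python
def consensus_deleterious(predictions, required_tools):
    """Return variants flagged as deleterious by every tool in required_tools.

    predictions: {variant: {tool_name: bool}} where True means the tool called
    the substitution deleterious. A missing call counts as not flagged.
    """
    return sorted(
        variant
        for variant, calls in predictions.items()
        if all(calls.get(tool, False) for tool in required_tools)
    )

# Placeholder calls for illustration only; these are not data from the study.
example = {
    "variant_A": {"tool_1": True, "tool_2": True, "tool_3": True},
    "variant_B": {"tool_1": True, "tool_2": False, "tool_3": True},
    "variant_C": {"tool_1": True, "tool_2": True},
}
print(consensus_deleterious(example, ["tool_1", "tool_2", "tool_3"]))
```

Requiring unanimous agreement is the strictest possible rule; a majority vote would be a looser alternative and would retain more candidates.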

RMSD calculation measures the structural changes in a protein caused by mutations. Figure 5 presents the RMSD of all selected mutants as a line graph. The mutants with high RMSD values (R284W, Y140S, P107L, R149C, and F207V) were nominated as the five most detrimental variants. PyRx was used to assess the variation in protein-ligand binding affinities between the native and mutant proteins. We docked 93 reference ligands to the wild-type protein and to each of the five nominated mutants. For the F207V variant, Balovaptan, Naltrexone, and Risperidone were the top three compounds with the best binding energy profiles, suggesting potential efficacy against this mutant. Similarly, the most promising compounds were Lurasidone, Balovaptan, and Paliperidone for P107L; Paliperidone, Leucovorin, and Lurasidone for R149C; Risperidone, Piperacillin, and Paliperidone for R284W; and Paliperidone, Lurasidone, and Brexpiprazole for Y140S. These compounds exhibited strong binding affinities and may have therapeutic potential against their respective mutant variants. Post-docking analysis was carried out using Discovery Studio to evaluate non-bond interactions. The purpose of docking is to examine how ligand binding correlates with the 3D protein structure and to identify compounds that could feasibly act against the target mutant variants. To compare protein-protein interaction levels, we used the widely used ClusPro server. We observed considerable variation in both the protein-protein and the protein-ligand docking outputs, which suggests a noticeable impairment of the AVPR1a protein due to these polymorphisms. We carried out 200 ns molecular dynamics simulations to verify the docking outcomes and to analyze the structural dynamics of the most promising drug molecules bound to the five specified mutant forms of the protein in a simulated, biologically relevant environment [78]. The dynamic properties were evaluated by computing RMSD, RMSF, SASA, and Rg from the MD simulation data. The results collectively indicated stable structural conformations in most protein-ligand systems, consistent with stable binding and structural robustness. The methodology of this study rests on establishing a connection between the alterations and their molecular effects on the protein. When several programs or tools are used to achieve a single goal, the results are more dependable because each tool relies on a different algorithm. Although the highly detrimental nsSNPs screened in this study were not tested under laboratory conditions, such as in vitro or other assays for the functional significance of mutations, the overall findings, obtained through meta-analysis with different computational approaches, prioritize these nsSNPs for further laboratory studies and clinical assessments. Thorough wet lab research and trials in various model species may help to clarify the specific effects of these harmful nsSNPs on the AVPR1a gene. The findings of this study may also help future genome association studies to identify damaging SNPs in patients with autism and other health conditions.
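The per-variant ligand ranking discussed above amounts to sorting docking scores and keeping the best few. The following sketch assumes scores are binding energies in kcal/mol, where a more negative value means stronger predicted binding; the function name, ligand labels, and numbers are placeholders chosen for illustration and do not reproduce any value from this study.

```python
def top_ligands(scores, k=3):
    """Return the k ligands with the lowest (most favourable) docking score.

    scores: {ligand_name: binding_energy_kcal_per_mol}; more negative values
    are treated as stronger predicted binding.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:k]]

# Placeholder scores for illustration only; these are not data from the study.
placeholder = {"ligand_A": -7.1, "ligand_B": -9.4, "ligand_C": -8.2, "ligand_D": -6.0}
print(top_ligands(placeholder))
```

Running the same function once per variant gives a top-three list for each mutant, which is the form in which the docking results are reported in the text.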

Conclusion

This study employed in-silico analysis to investigate the potential impact of nsSNPs on the structure, function, and stability of the AVPR1a protein. The 23 mutations identified are likely to impair the structure and function of the AVPR1a protein, potentially affecting its activity. We evaluated the influence of these 23 non-synonymous single nucleotide polymorphisms on protein stability and function. Additionally, we predicted the functional domains, probable active sites, and PTM sites of the AVPR1a protein and examined the consequences of high-risk nsSNPs in these domains, ligand binding sites, and PTM sites. We proposed five mutants based on their high RMSD values and then evaluated how the interaction profiles of the native and mutant proteins differ through protein-ligand and protein-protein docking. These analyses show the effects of the mutants on the protein's conformation and on its structural and functional properties. To fully understand these data on SNPs, comprehensive clinical studies that include a diverse population are necessary. Additionally, experimental studies focusing on these mutations are required to validate the findings.

Data availability

All relevant data generated or analysed during this study are included in this manuscript and will be available upon reasonable request. The datasets retrieved during the current study are available in the NCBI (https://www.ncbi.nlm.nih.gov/) and UniProt (https://www.uniprot.org) databases. The NCBI Gene ID is 552 and the UniProt accession is P37288. The details of the genetic polymorphisms of the AVPR1a gene considered in this study are listed in the S1 Table of the supplementary file.

Abbreviations

nsSNPs: Non-synonymous single nucleotide polymorphisms

ASD: Autism spectrum disorder

AVP: Arginine vasopressin

AVPR1A: Arginine vasopressin receptor 1A

FBAT: Family-based association test

MAF: Minor allele frequency

SVM: Support Vector Machines

RSA: Relative solvent accessibility

LBS: Ligand binding sites

RMSD: Root-mean-square deviation

LGA: Lamarckian genetic algorithm

SAXS: Small-angle X-ray scattering

References

  1. Forsberg L, de Faire U, Marklund SL, Andersson PM, Stegmayr B, Morgenstern R. Phenotype determination of a common Pro-Leu polymorphism in human glutathione peroxidase 1. Blood Cells Mol Dis. 2000;26:423–6.

  2. Marth GT, Korf I, Yandell MD, Yeh RT, Gu Z, Zakeri H, Stitziel NO, Hillier L, Kwok P-Y, Gish WR. A general approach to single-nucleotide polymorphism discovery. Nat Genet. 1999;23:452–6.

  3. George Priya Doss C, Rajasekaran R, Sudandiradoss C, Ramanathan K, Purohit R, Sethumadhavan R. A novel computational and structural analysis of nsSNPs in CFTR gene. Genomic Med. 2008;2:23–32.

  4. Radivojac P, Vacic V, Haynes C, Cocklin RR, Mohan A, Heyen JW, Goebl MG, Iakoucheva LM. Identification, analysis, and prediction of protein ubiquitination sites. Proteins Struct Funct Bioinf. 2010;78:365–80.

  5. Jia M, Yang B, Li Z, Shen H, Song X, Gu W. Computational analysis of functional single nucleotide polymorphisms associated with the CYP11B2 gene. PLoS One. 2014;9:e104311.

  6. Persico AM, Napolioni V. Autism genetics. Behav Brain Res. 2013;251:95–112.

  7. Bailey A, Phillips W, Rutter M. Autism: towards an integration of clinical, genetic, neuropsychological, and neurobiological perspectives. Autism. 2013;17(2):159–96.

  8. Meyer-Lindenberg A, Domes G, Kirsch P, Heinrichs M. Oxytocin and vasopressin in the human brain: social neuropeptides for translational medicine. Nat Rev Neurosci. 2011;12:524–38.

  9. Koshimizu T, Nakamura K, Egashira N, Hiroyama M, Nonoguchi H, Tanoue A. Vasopressin V1a and V1b receptors: from molecules to physiological systems. Physiol Rev. 2012;92:1813–64.

  10. Insel TR. The challenge of translation in social neuroscience: a review of oxytocin, vasopressin, and affiliative behavior. Neuron. 2010;65:768–79.

  11. Egashira N, Mishima K, Iwasaki K, Nakanishi H, Oishi R, Fujiwara M. Role of vasopressin receptor in psychological and cognitive functions. Nihon Yakurigaku Zasshi. 2009;134:3–7.

  12. Bielsky IF, Hu S-B, Szegda KL, Westphal H, Young LJ. Profound impairment in social recognition and reduction in anxiety-like behavior in vasopressin V1a receptor knockout mice. Neuropsychopharmacology. 2004;29:483–93.

  13. Appenrodt E, Schnabel R, Schwarzberg H. Vasopressin administration modulates anxiety-related behavior in rats. Physiol Behav. 1998;64:543–7.

  14. Thibonnier M. Vasopressin receptors: molecular mechanisms in hypertension and cardiovascular diseases. Mol Mech Hypertens. 2006:173.

  15. Thibonnier M, Graves MK, Wagner MS, Chatelain N, Soubrier F, Corvol P, Willard HF, Jeunemaitre X. Study of V1-vascular vasopressin receptor gene microsatellite polymorphisms in human essential hypertension. J Mol Cell Cardiol. 2000;32:557–64.

  16. Hammock EAD, Young LJ. Variation in the vasopressin V1a receptor promoter and expression: implications for inter- and intraspecific variation in social behaviour. Eur J Neurosci. 2002;16:399–402.

  17. Kim S-J, Young LJ, Gonen D, Veenstra-VanderWeele J, Courchesne R, Courchesne E, Lord C, Leventhal BL, Cook EH Jr, Insel TR. Transmission disequilibrium testing of arginine vasopressin receptor 1A (AVPR1A) polymorphisms in autism. Mol Psychiatry. 2002;7:503–7.

  18. Wassink TH, Piven J, Vieland VJ, Pietila J, Goedken RJ, Folstein SE, Sheffield VC. Examination of AVPR1a as an autism susceptibility gene. Mol Psychiatry. 2004;9:968–72. https://doi.org/10.1038/sj.mp.4001503.

  19. Yirmiya N, Rosenberg C, Levi S, Salomon S, Shulman C, Nemanov L, Dina C, Ebstein RP. Association between the arginine vasopressin 1a receptor (AVPR1a) gene and autism in a family-based study: mediation by socialization skills. Mol Psychiatry. 2006;11:488–94.

  20. Meyer-Lindenberg A, Kolachana B, Gold B, Olsh A, Nicodemus KK, Mattay V, Dean M, Weinberger DR. Genetic variants in AVPR1A linked to autism predict amygdala activation and personality traits in healthy humans. Mol Psychiatry. 2009;14:968–75.

  21. Tansey KE, Hill MJ, Cochrane LE, Gill M, Anney RJL, Gallagher L. Functionality of promoter microsatellites of arginine vasopressin receptor 1A (AVPR1A): implications for autism. Mol Autism. 2011;2:1–8.

  22. Wilczyński KM, Auguściak-Duma A, Stasik A, Cichoń L, Sieroń A, Janas-Kozik M. The role of single nucleotide polymorphisms within genes for oxytocin and vasopressin receptors in the presentation and severity of autistic traits. Eur Psychiatry. 2023;66:S102–S102.

  23. Talbot CF, Oztan O, Simmons SMV, Trainor C, Ceniceros LC, Nguyen DKK, Del Rosso LA, Garner JP, Capitanio JP, Parker KJ. Nebulized vasopressin penetrates CSF and improves social cognition without inducing aggression in a rhesus monkey model of autism. Proc Natl Acad Sci. 2024;121:e2418635121.

  24. Kantojärvi K, Oikkonen J, Kotala I, Kallela J, Vanhala R, Onkamo P, Järvelä I. Association and promoter analysis of AVPR1A in Finnish autism families. Autism Res. 2015;8:634–9.

  25. Yang SY, Kim SA, Hur GM, Park M, Park J-E, Yoo HJ. Replicative genetic association study between functional polymorphisms in AVPR1A and social behavior scales of autism spectrum disorder in the Korean population. Mol Autism. 2017;8:1–10.

  26. Bhagwat M. Searching NCBI’s dbSNP database. Curr Protoc Bioinformatics. 2010;32:1–19.

  27. Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010;38:W214–20.

  28. Hasnain MJU, Shoaib M, Qadri S, Afzal B, Anwar T, Abbas SH, Sarwar A, Talha Malik HM, Tariq Pervez M. Computational analysis of functional single nucleotide polymorphisms associated with SLC26A4 gene. PLoS One. 2020;15:e0225368.

  29. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31:3812–4.

  30. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–9.

  31. Mahmud Z, Malik SUF, Ahmed J, Azad AK. Computational analysis of damaging single-nucleotide polymorphisms and their structural and functional impact on the insulin receptor. Biomed Res Int. 2016;2016:2023803.

  32. Islam MJ, Khan AM, Parves MR, Hossain MN, Halim MA. Prediction of deleterious non-synonymous SNPs of human STK11 gene by combining algorithms, molecular docking, and molecular dynamics simulation. Sci Rep. 2019;9:16426.

  33. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7:e46688.

  34. Mi H, Poudel S, Muruganujan A, Casagrande JT, Thomas PD. PANTHER version 10: expanded protein families and functions, and analysis tools. Nucleic Acids Res. 2016;44:D336–42.

  35. Adzhubei I, Jordan DM, Sunyaev SR. Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet. 2013;76:7–20.

  36. Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, Nam H-J, Mort M, Cooper DN, Sebat J, Iakoucheva LM. Inferring the molecular and phenotypic impact of amino acid variants with MutPred2. Nat Commun. 2020;11:5918.

  37. Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006;22:2729–34.

  38. Magesh R, George Priya Doss C. Computational pipeline to identify and characterize functional mutations in ornithine transcarbamylase deficiency. 3 Biotech. 2014;4:621–34.

  39. Calabrese R, Capriotti E, Fariselli P, Martelli PL, Casadio R. Functional annotations improve the predictive score of human disease-related mutations in proteins. Hum Mutat. 2009;30:1237–44.

  40. Leong IUS, Stuckey A, Lai D, Skinner JR, Love DR. Assessment of the predictive accuracy of five in silico prediction tools, alone or in combination, and two metaservers to classify long QT syndrome gene mutations. BMC Med Genet. 2015;16:1–13.

  41. Cheng J, Randall A, Baldi P. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins Struct Funct Bioinf. 2006;62:1125–32.

  42. Capriotti E, Fariselli P, Casadio R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005;33:W306–10.

  43. Savojardo C, Fariselli P, Martelli PL, Casadio R. INPS-MD: a web server to predict stability of protein variants from sequence and structure. Bioinformatics. 2016;32:2542–4.

  44. Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, Bernard T, Binns D, Bork P, Burge S. InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res. 2012;40:D306–12.

  45. Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, Bucher P, Cerutti L, Corpet F, Croning MDR. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res. 2001;29:37–40.

  46. Yachdav G, Kloppmann E, Kajan L, Hecht M, Goldberg T, Hamp T, Hönigschmid P, Schafferhans A, Roos M, Bernhofer M. PredictProtein—an open resource for online prediction of protein structural and functional features. Nucleic Acids Res. 2014;42:W337–43.

  47. Yang J, Roy A, Zhang Y. Protein–ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics. 2013;29:2588–95.

  48. Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004;4:1633–49.

  49. Cai R, Liu Z, Ren J, Ma C, Gao T, Zhou Y, Yang Q, Xue Y. GPS-MBA: computational analysis of MHC class II epitopes in type 1 diabetes. PLoS One. 2012;7:e33884.

  50. Zhao Q, Xie Y, Zheng Y, Jiang S, Liu W, Mu W, Liu Z, Zhao Y, Xue Y, Ren J. GPS-SUMO: a tool for the prediction of sumoylation sites and SUMO-interaction motifs. Nucleic Acids Res. 2014;42:W325–30.

  51. Wang C, Tan X, Tang D, Gou Y, Han C, Ning W, Lin S, Zhang W, Chen M, Peng D. GPS-Uber: a hybrid-learning framework for prediction of general and E3-specific lysine ubiquitination sites. Brief Bioinform. 2022;23:bbab574.

  52. Varadi M, Bertoni D, Magana P, Paramval U, Pidruchna I, Radhakrishnan M, Tsenkov M, Nair S, Mirdita M, Yeo J. AlphaFold protein structure database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Res. 2024;52:D368–75.

  53. Skolnick J, Gao M, Zhou H, Singh S. AlphaFold 2: why it works and its implications for understanding the relationships of protein sequence, structure, and function. J Chem Inf Model. 2021;61:4827–31.

  54. Yang Z, Zeng X, Zhao Y, Chen R. AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct Target Ther. 2023;8:115.

  55. Xu D, Zhang Y. Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization. Biophys J. 2011;101:2525–34.

  56. Colovos C, Yeates TO. Verification of protein structures: patterns of nonbonded atomic interactions. Prot Sci. 1993;2:1511–9.

  57. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993;26:283–91.

  58. Mahfuz A, Khan MA, Deb P, Ansary SJ, Jahan R. Identification of deleterious single nucleotide polymorphism (SNP) s in the human TBX5 gene & prediction of their structural & functional consequences: an in silico approach. Biochem Biophys Rep. 2021;28:101179.

  59. Carugo O, Pongor S. A normalized root-mean-square distance for comparing protein three-dimensional structures. Protein Sci. 2001;10:1470–3.

  60. Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–9.

  61. Yuliana D, Bahtiar FI, Najib A. In silico screening of chemical compounds from roselle (Hibiscus Sabdariffa) as angiotensin-I converting enzyme inhibitor used PyRx program. ARPN J Sci Technol. 2013;3:1158–60.

  62. Dallakyan S, Olson AJ. Small-molecule library screening by docking with PyRx. Chem Biol Methods Protoc. 2015;1263:243–50.

  63. Kawsar, Sarkar MA, et al. Chemical descriptors, PASS, molecular docking, molecular dynamics and ADMET predictions of glucopyranoside derivatives as inhibitors to bacteria and fungi growth. Org Commun. 2022;15(2):203.

  64. Alom MW, Jibon MDK, Faruqe MO, Rahman MS, Akter F, Ali A, Rahman MM. Integrated gene expression data-driven identification of molecular signatures, prognostic biomarkers, and drug targets for glioblastoma. Biomed Res Int. 2024;2024:6810200.

  65. Kumer A, Chakma U, Rana MM, Chandro A, Akash S, Elseehy MM, Albogami S, El-Shehawi AM. Investigation of the new inhibitors by sulfadiazine and modified derivatives of α-d-glucopyranoside for white spot syndrome virus disease of shrimp by in silico: quantum calculations, molecular docking, ADMET and molecular dynamics study. Molecules. 2022;27(4):3694.

  66. Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, Beglov D, Vajda S. The ClusPro web server for protein–protein docking. Nat Protoc. 2017;12:255–78.

  67. Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, Lindahl E. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19–25.

  68. Schüttelkopf AW, Van Aalten DMF. PRODRG: a tool for high-throughput crystallography of protein–ligand complexes. Biol Crystallogr. 2004;60:1355–63.

  69. Kadoura A, Salama A, Sun S. Switching between the NVT and NpT ensembles using the reweighting and reconstruction scheme. Procedia Comput Sci. 2015;51:1259–68.

  70. Neumann ID, Landgraf R. Balance of brain oxytocin and vasopressin: implications for anxiety, depression, and social behaviors. Trends Neurosci. 2012;35:649–59.

  71. Bielsky IF, Hu S-B, Ren X, Terwilliger EF, Young LJ. The V1a vasopressin receptor is necessary and sufficient for normal social recognition: a gene replacement study. Neuron. 2005;47:503–13.

  72. Bodden C, van den Hove D, Lesch KP, Sachser N. Impact of varying social experiences during life history on behaviour, gene expression, and vasopressin receptor gene methylation in mice. Sci Rep. 2017;7:8719.

  73. Barrett CE, Keebaugh AC, Ahern TH, Bass CE, Terwilliger EF, Young LJ. Variation in vasopressin receptor (Avpr1a) expression creates diversity in behaviors related to monogamy in prairie voles. Horm Behav. 2013;63:518–26. https://doi.org/10.1016/j.yhbeh.2013.01.005.

  74. Roach KL, Hershberger PE, Rutherford JN, Molokie RE, Wang ZJ, Wilkie DJ. The AVPR1A gene and its single nucleotide polymorphism rs10877969: a literature review of associations with health conditions and pain. Pain Manag Nurs. 2018;19:430–44. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.pmn.2018.01.003.

  75. Kazantseva AV, Kutlumbetova YY, Malykh SB, Lobaskova MM, Khusnutdinova EK. Arginine-vasopressin receptor gene (AVPR1A, AVPR1B) polymorphisms and their relation to personality traits. Russ J Genet. 2014;50:298–307.

  76. Avinun R, Israel S, Shalev I, Gritsenko I, Bornstein G, Ebstein RP, Knafo A. AVPR1A variant associated with preschoolers’ lower altruistic behavior. PLoS One. 2011;6:e25274.

  77. Caldwell HK, Lee H-J, Macbeth AH, Young WS III. Vasopressin: behavioral roles of an “original” neuropeptide. Prog Neurobiol. 2008;84:1–24.

  78. Du Y, Wang H, Chen L, Fang Q, Zhang B, Jiang L, Wu Z, Yang Y, Zhou Y, Chen B. Non-RBM mutations impaired SARS-CoV-2 spike protein regulated to the ACE2 receptor based on molecular dynamic simulation. Front Mol Biosci. 2021;8:614443.

Acknowledgements

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU). We also would like to thank the Department of Computer Science and Engineering, University of Rajshahi, Rajshahi-6205, Bangladesh, for providing high-performance computers.

Mandated data types

Gene info: NCBI Gene ID: 552.

Sequencing data: Not applicable.

Genetic polymorphisms: Details of the genetic polymorphisms of the AVPR1a gene are provided in Table S1 of the supplementary file.

Linked genotype and phenotype data: Not applicable.

Macromolecular structure: Not applicable.

Microarray data: Not applicable.

Crystallographic data for small molecules: Not applicable.

Clinical trial number

Not applicable.

Author information

Contributions

M.D.K.J., M.A.I., and M.E.H. contributed to the conceptualization and design of the study. M.O.F., R.Z., and U.K.A. carried out data collection and initial analysis. B.S. and Y.K.T. contributed to methodology and validation. M.K. supervised the project and provided critical revisions. M.J. and M.E.A.Z. contributed to writing, review, and editing of the manuscript. All authors contributed significantly and have read and approved the final manuscript.

Corresponding authors

Correspondence to Yewulsew Kebede Tiruneh, Md. Khalekuzzaman or Magdi E. A. Zaki.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Jibon, M.K., Islam, M., Hosen, M. et al. In-silico analysis of deleterious non-synonymous SNPs in the human AVPR1a gene linked to autism. BMC Genomics 26, 492 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12864-025-11655-1

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12864-025-11655-1

Keywords