Phylogenetic analysis of Asiatic species in the tropical genus Beilschmiedia (Lauraceae)
BMC Genomics volume 26, Article number: 226 (2025)
Abstract
The tropical genus Beilschmiedia, comprising over 250 species worldwide, includes approximately 40 species distributed in the northern tropical forests of Asia. However, the phylogenetic relationships among these Asiatic Beilschmiedia species remain incompletely understood. In this study, we sequenced and assembled complete chloroplast genomes from six Asiatic Beilschmiedia species, including five from China and one from Indonesia. The genomes range in size from 158,275 to 158,620 bp and exhibit a typical quadripartite structure, similar to other basal Lauraceae species. We identified 116 to 122 simple sequence repeats (SSRs) and 19 to 28 dispersed repeats within the genomes. The relative synonymous codon usage (RSCU) of 79 protein-coding genes exhibited minimal variation. Notably, the boundary genes rpl23 and ycf1 displayed varying degrees of expansion and contraction, along with incomplete replication phenomena. Using a sliding window approach, we constructed a coalescent tree with ASTRAL software to analyze the phylogenetic relationships. The resulting main topology was highly consistent with the Maximum Likelihood (ML) and Bayesian inference (BI) analyses, clearly dividing the Asiatic core Beilschmiedia into two distinct groups: Group A and Group B. Group A showed an extremely low nucleotide diversity (π) value of 0.00063, while Group B exhibited 2.79-fold higher diversity. The highly variable regions trnS-trnG and rpl32-trnL are proposed as molecular markers for distinguishing between Groups A and B. Furthermore, we identified seven additional highly variable regions: ndhF, ndhF-rpl32, rpl2, rpl2-trnH, rpl32, rps15-ycf1, and ycf1. These regions may serve as potential molecular markers for the Asiatic Beilschmiedia species. These findings provide new insights into the phylogenetic relationships among Asiatic Beilschmiedia species, highlighting the potential of specific molecular markers in future research.
Introduction
The Sunda-Sahul Convergence Zone, which harbors some of the world’s most unique, diverse, and threatened plant species, is a critical region for understanding both the distribution and evolutionary history of its flora, making it a long-standing focus of biogeographical research [1, 2]. Many species within the family Lauraceae, particularly those of the genus Beilschmiedia Nees, serve as exemplars of this significance.
Beilschmiedia is recognized as one of the most widespread pantropical genera, with approximately 250 species distributed globally [3, 4]. These trees and shrubs, which are typically tall and evergreen, often dominate rainforest ecosystems. Their durable wood is highly valued in construction and furniture making, and certain species are notable for the high oil content in their seeds, which yield high-quality oils suitable for industrial applications [5]. To date, the subtribe Beilschmiediineae, encompassing six genera (Beilschmiedia, Endiandra R. Br., Hexapora, Potameia Thouars, Syndiclis Hook.f., and Yasunia van der Werff), has been subject to ongoing debate regarding the phylogenetic relationships among its genera and among species within certain genera [6]. Resolving the phylogenetic relationships within this subtribe is a complex task that requires additional suitable material and robust molecular evidence. The advancement and cost reduction of sequencing technology provides a valuable opportunity to address this challenge.
The chloroplast genome serves as a critical tool for elucidating the phylogenetic relationships and biogeographic history of plants, owing to its highly conserved and orthologous genes, as well as its moderate evolutionary rate [7, 8]. The initial molecular evidence for the Lauraceae family stems from the chloroplast matK sequence, which was employed in a study to reconstruct the phylogenetic relationships among 48 Lauraceae species, providing robust support for the Beilschmiedia-Cryptocarya clade [9]. Subsequent researchers have sought to resolve the phylogenetic and biogeographic puzzles of the genus Beilschmiedia through molecular methodologies [10, 11]. Rohwer et al. (2014) constructed a phylogenetic tree for the Cryptocarya group based on chloroplast trnK intron and ITS sequences, uncovering that two American species of Beilschmiedia and five Asiatic species formed a cluster [12]. This finding was subsequently validated by further studies. Song et al. (2020) reconstructed phylogenetic relationships using chloroplast genomes from 89 taxa across all subfamilies and 24 genera of Lauraceae, laying the foundation for the phylogenetic framework of the family [13]. Six of the 12 Beilschmiediineae species in that dataset are Asiatic Beilschmiedia. Li et al. (2020) investigated the phylogenetic relationships within the Beilschmiediineae using chloroplast genomes from 35 species, including 14 Asiatic Beilschmiedia species, revealing that the American species B. brenesii and the Asiatic species grouped together to form the Asia-American clade [14]. These studies aimed to resolve intergeneric relationships within the subtribe Beilschmiediineae, but none of them clarified the intrageneric phylogenetic relationships of Asiatic Beilschmiedia.
In this study, we report the newly completed chloroplast genomes and characteristics of B. kunstleri and five other Beilschmiedia species, marking the first publication for members of Beilschmiedia from the Malay Archipelago. We employed the chloroplast genomes of 42 species of the subtribe Beilschmiediineae, including 26 Asiatic species of Beilschmiedia, to reconstruct the phylogenetic relationships of Asiatic Beilschmiedia using coalescent and concatenated methods. This research not only enhances our understanding of the evolutionary history and biogeography of Asiatic Beilschmiedia but also provides a valuable dataset for future studies in plant systematics and conservation biology.
Materials and methods
Plant material and chloroplast genome sequencing
Fresh or silica-gel-dried leaves of B. kunstleri were collected on Sulawesi Island, Indonesia, and samples of five other species (B. purpurascens, B. robusta, B. yunnanensis, B. tungfangensis, and B. fordii) were collected in Yunnan, Hainan and Guangdong, China. The six specimens were identified by Professor Song Yu and were deposited in the Biodiversity Group of the Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences. Genomic DNA was extracted from 2 g of leaves using the modified CTAB method [15], and high-quality DNA samples meeting the library construction criteria were selected for sequencing. A paired-end library with an insert size of 500 bp was constructed and sequenced on the Illumina HiSeq 2500 platform (BGI-Shenzhen), yielding more than 4 Gb of raw data for each sample.
Genome assembly and annotation
The raw data were processed using fastp v0.23.2 [16] to filter out low-quality reads and adapters, generating a high-quality dataset of clean reads for downstream analyses. The chloroplast genomes were assembled with GetOrganelle v1.7.6.1 [17], followed by a comprehensive evaluation using Bandage v0.8.1 [18]. The genomes were initially auto-annotated with CPGAVAS2 [19] and subsequently manually examined in Geneious Prime v2023.2.1 [20] to precisely define the start/stop codons and intron/exon boundaries of protein-coding genes. A genome map was generated using OGDRAW [21]. The annotated genome sequences were deposited in the Lauraceae Chloroplast Genome Database (LCGD, https://lcgdb.wordpress.com) and GenBank, with the accession numbers LAU00234-00239 and PQ899492-899497, respectively.
Repeat sequence analysis
Simple sequence repeats (SSRs) and dispersed repeats were analyzed using the online programs MISA [22] and REPuter [23], respectively. SSRs included mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats, with minimum repeat counts of 10, 5, 4, 3, 3, and 3, respectively. The minimum size for dispersed repeats was set to 30 bp. Visualization was performed using Origin v2021.
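The SSR thresholds above can be illustrated with a minimal Python sketch. It is not MISA and does not reproduce its output format or its handling of compound SSRs; the function name find_ssrs and the regular-expression approach are assumptions made for illustration only. It reports perfect repeats whose copy number meets the stated minimum for each motif length.

```python
import re

# Minimum repeat counts per motif length, as stated in the text:
# mono=10, di=5, tri=4, tetra=3, penta=3, hexa=3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is 0-based. Hypothetical helper for illustration, not the MISA algorithm.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip 'AA' tracts, already reported as 'A' repeats
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGG" + "A" * 12 + "CC" + "AT" * 5 + "GGC" + "AAG" * 4
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

On a real plastome the sequence would be read from a FASTA file first; the toy string here only exercises the mono-, di-, and trinucleotide thresholds.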
Codon preference and IR boundary analyses
Protein-coding genes were extracted from the six chloroplast genomes using PhyloSuite v1.2.3 [24]. The relative synonymous codon usage (RSCU) of the coding sequences was calculated using CodonW v1.4.2, followed by visualization in the HeatMap module of TBtools-II v2.112 [25]. The contraction and expansion of IR boundaries were analyzed using CPJSdraw v1.0.0 [26].
Phylogenetic analyses
To elucidate the phylogenetic relationships within the Asiatic Beilschmiedia genus, sequences of 46 published species were downloaded from the NCBI and LCGDB databases, including 34 from Beilschmiedia, four from Syndiclis, four from Endiandra, one from Potameia, and three outgroups (Cryptocarya chinensis, C. chingii, and Eusideroxylon zwageri) (Supplementary Material 1: Table S1). The downloaded 46 sequences, along with the six newly sequenced ones, were aligned using MAFFT v7.520 [27], followed by manual adjustment with BioEdit v7.0.9.0 [28]. We employed both coalescent and concatenation methods to reconstruct the phylogenetic tree. For the coalescent method, the aligned matrix was divided into segments with a window size of 15,000 bp and a step size of 1,000 bp, after which IQtree v2.2.2 [29] was used to construct phylogenetic trees for each segment. Newick Utilities [30] were utilized to filter out nodes with bootstrap (BS) support values ≤ 10%, and subsequently, Astral-III v5.7.8 [31] was employed to construct the coalescent tree. For the concatenation method, both maximum likelihood (ML) and Bayesian inference (BI) approaches were utilized. In the ML approach, the embedded ModelFinder v2.3.6 [32] in IQtree v2.2.2 was used to calculate the best nucleotide substitution model, which was then applied to construct the ML tree with 1,000 replicates. For the BI approach, the best model was calculated using jModelTest v2.1.10 [33], and the analysis was performed in MrBayes [34], with the Markov Chain Monte Carlo (MCMC) chains running for 2,000,000 generations, sampling every 1,000 generations, and discarding the first 25% of trees as burn-in. Finally, all trees were visualized using FigTree v1.4.4.
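The segmentation step of the coalescent analysis can be sketched as follows. This is an illustrative Python sketch, not the authors' script: the function name sliding_windows and the keep_partial option are assumptions. For the 166,531 bp matrix reported in the Results, keeping only full-length windows gives 152 segments, whereas also keeping the truncated windows at the end of the alignment gives 167, the number of segments reported; the exact convention used in the study is not stated.

```python
def sliding_windows(length, window=15000, step=1000, keep_partial=False):
    """Return 0-based, half-open (start, end) windows over an alignment.

    keep_partial=True also keeps trailing windows shorter than `window`
    (an assumed convention; the source does not describe this detail).
    """
    windows = []
    for start in range(0, length, step):
        end = start + window
        if end > length:
            if not keep_partial:
                break  # every later window would be truncated as well
            end = length
        windows.append((start, end))
    return windows


if __name__ == "__main__":
    matrix_length = 166531  # aligned matrix length reported in the Results
    print(len(sliding_windows(matrix_length)))                     # full windows only
    print(len(sliding_windows(matrix_length, keep_partial=True)))  # with truncated tails
```

Each (start, end) pair would then be used to slice every aligned sequence before a per-segment tree is inferred.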
Sliding window analysis of the chloroplast genomes
Based on the results of the phylogenetic tree, the sequences of Group A and Group B were extracted from the adjusted matrix to obtain the corresponding matrices. DnaSP v6.12.03 [35] was used to calculate the nucleotide diversity values (π) for Groups A and B, with a window length set to 600 bp and a step size set to 200 bp. The values were plotted in the chloroplast genomes using an R program.
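For readers unfamiliar with the statistic, a windowed π calculation can be sketched in Python as below. This is a simplified illustration, not DnaSP: it computes π as the mean proportion of differing sites over all sequence pairs and skips gaps or ambiguous bases pair by pair, whereas DnaSP's treatment of alignment gaps may differ. The function names are assumptions.

```python
from itertools import combinations

VALID = set("ACGT")


def pairwise_diff(a, b):
    """Return (differences, compared_sites) for two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    diffs = sites = 0
    for x, y in zip(a, b):
        if x in VALID and y in VALID:
            sites += 1
            diffs += x != y
    return diffs, sites


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences."""
    props = []
    for a, b in combinations(seqs, 2):
        diffs, sites = pairwise_diff(a, b)
        if sites:
            props.append(diffs / sites)
    return sum(props) / len(props) if props else 0.0


def windowed_pi(seqs, window=600, step=200):
    """Return (window_start, pi) pairs along an alignment (0-based starts)."""
    seqs = [s.upper() for s in seqs]
    length = len(seqs[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "ACGT"]
    print(windowed_pi(toy, window=2, step=2))
```

With the default arguments the function uses the 600 bp window and 200 bp step described above; the toy call shrinks both so the result can be checked by hand.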
Results
Characteristics of the six newly obtained Beilschmiedia chloroplast genomes
By capturing chloroplast genome diversity, we newly assembled the genomes of six accessions within the genus Beilschmiedia. These chloroplast genomes exhibited the typical circular, double-stranded structure, with lengths ranging from 158,275 bp (B. purpurascens) to 158,620 bp (B. kunstleri) (Fig. 1; Table 1). They displayed a quadripartite structure, consisting of a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeat (IR) regions. The LSC region ranged in length from 89,199 bp to 89,377 bp, with a GC content of 37.71–37.76% (Table 1). The SSC region varied in length from 18,133 bp to 18,234 bp, with a GC content of 33.90–34.02%. The IR regions ranged from 25,436 bp to 25,505 bp in length and exhibited a GC content of 43.03–43.05%.
Despite slight variations in chloroplast genome lengths among the six species, genetic composition analyses revealed shared features. A total of 131 genes were annotated in all six Beilschmiedia species, including 85 protein-coding genes, 37 tRNA genes, eight rRNA genes, and one pseudogene (ycf15) (Table 1 and Supplementary Material 1: Table S2). All eight rRNA genes were located in the IR regions, while 23 tRNA genes were found in the SSC and LSC regions, and the remaining tRNA genes were positioned in the IR regions. Among the functional genes, 12 protein-coding genes (atpF, ndhA, ndhB (2), petB, petD, rpl2, rpl16, rpoC1, rps12 (2), and rps16) and 8 tRNA genes (trnA-UGC (2), trnG-UCC, trnI-GAU (2), trnK-UUU, trnL-UAA, and trnV-UAC) contained one intron, while clpP1 and ycf3 contained two introns (Supplementary Material 1: Table S2). The rps12 gene was trans-spliced, with one exon located in the LSC region (5′ end) and the other in the IR region (3′ end).
Circular gene map of the six Beilschmiedia chloroplast genomes. Genes drawn outside the circle are transcribed in the counterclockwise direction, those inside are transcribed in the clockwise direction. Different colored bars represent genes with different functions. An asterisk (*) on a gene name indicates that the gene contains introns. The dashed dark gray area in the inner circle indicates the GC content of the chloroplast genome, and the light gray area shows the AT content
SSR and dispersed repeat analyses
The six chloroplast genomes were analyzed for SSRs, revealing a total of 116 to 122 SSRs dispersed throughout each genome. These SSRs ranged from mononucleotide to hexanucleotide repeats, although not all six types were present in every genome (Fig. 2 and Supplementary Material 1: Table S3). Mononucleotide repeats were the most abundant, accounting for 80.17–81.97% of the total SSRs, with minimal variation among the genomes (Fig. 2A). The other five types of repeats showed even less variation, with no more than a single occurrence difference between genomes, indicating relatively low SSR diversity. Mononucleotide repeats were predominantly composed of A/T sequences, with only one to five C/G repeats observed (Fig. 2B). When mapped onto the genomic landscape, 71.07–72.13% of the SSRs were located in intergenic regions, while only 10 to 11 were found within coding regions (Supplementary Material 1: Table S3). Notably, the coding regions containing SSRs exhibited a high level of consistency across the six genomes, with all SSRs located in atpB, cemA, psbF, rpl22, rpoB, rpoC2, ycf1, and ycf2.
Type and number of repeats. (A) The type frequency of different SSR types. (B) The type frequency of SSR motifs in different repeat class types. (C) Type and number of dispersed repeats. On the left side of the 0-axis is the distribution of repeats of varying lengths, while on the right side is the distribution of different types of repeats. F represents forward repeats, R stands for reverse repeats, C denotes complement repeats, and P indicates palindromic repeats
Additionally, four types of dispersed repeat sequences—forward, reverse, complement, and palindromic repeats—were identified across the six genomes, with totals ranging from 19 (B. robusta) to 28 (B. yunnanensis). These repeats are predominantly composed of forward and palindromic repeat sequences, as observed in the genomes of B. fordii, B. purpurascens, B. robusta, and B. tungfangensis, which exclusively contain these two types of repeats (Fig. 2C). Reverse repeat is uniquely present in B. kunstleri, while complement repeat is found only in B. yunnanensis. The longest repeat is a palindromic repeat sequence in B. purpurascens, spanning 52 bp, and is unique within the 50–59 bp range. The majority of repeat lengths are enriched within the 30–39 bp range, exhibiting large variation, whereas the 40–49 bp range consistently contains two repeats per genome.
Codon usage bias analysis
The relative synonymous codon usage (RSCU) preferences of 79 protein-coding genes from six Beilschmiedia species exhibit similar frequencies with minimal variation. These genes generate between 22,080 and 22,099 codons, which can be categorized into 64 distinct codon types, encoding 20 amino acids (excluding stop codons) (Fig. 3). An RSCU value greater than 1 indicates that a codon is used more often than expected under equal usage of synonymous codons, while a value less than 1 indicates that it is used less often. Among the 64 codon types, 31 have RSCU values greater than 1, with 13 ending in A, 16 in U, one in C, and one in G, indicating a preference for codons ending in A/U. The highest RSCU values are observed for AGA and GCU, both exceeding 1.8, which encode Arginine and Alanine, respectively. AUG and UGG, encoding Methionine and Tryptophan, respectively, have RSCU values of 1, indicating no usage preference. Notably, AUG also serves as the usual initiation codon in translation, specifying the first amino acid (Methionine) in protein synthesis. CGC and AGC exhibit the lowest RSCU values in all six genomes.
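The RSCU statistic discussed above is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates this definition; it is not CodonW, the function name rscu is an assumption, and codons are written with T rather than U. The codon assignments are those of the standard genetic code, which the plastid code shares.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} computed over a list of in-frame coding sequences.

    RSCU = observed codon count / (amino-acid total / number of synonymous codons).
    Amino acids that never occur are omitted from the result.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    toy_cds = "ATG" + "GCT" * 3 + "GCC" + "TGG" + "TAA"
    result = rscu([toy_cds])
    print(result["GCT"], result["GCC"], result["ATG"])
```

Because Methionine and Tryptophan each have a single codon, their RSCU is 1 by construction, which is why AUG and UGG show no usage preference in the results above.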
Contraction and expansion of the IR regions
The boundaries of LSC/IRb (JLB), IRb/SSC (JSB), SSC/IRa (JSA), and IRa/LSC (JLA) are primarily associated with six genes: rps19, rpl23, ycf1, ndhF, trnN, and rpl2. The genes rpl23 and ycf1, along with their duplicated copies, are located at these four boundaries, while rps19, ndhF, trnN, and rpl2 are situated adjacent to the boundaries (Fig. 4). Genes not located at the boundaries exhibit identical lengths across the six genomes; however, their distances from the boundaries vary. In contrast, genes situated at the boundaries have undergone varying degrees of expansion and contraction. This phenomenon is observed not only among different genomes but also at different boundaries within the same genome, as exemplified by rpl23 and ycf1. In the genome of B. fordii, the rpl23 genes are completely identical. However, in the other genomes, they vary, with an expansion of 1 to 20 bp occurring in the LSC region. The pattern of rpl23 differs significantly from that of ycf1. The ycf1 genes have undergone severe incomplete duplication. The main body of ycf1 at the JSA boundary is located in the SSC region, while the ycf1 at the JSB boundary has lost most of its segment in the SSC region.
Phylogenetic analyses
After manual adjustment, a matrix of 166,531 bp in length was obtained, which was then divided into 167 segments to construct a coalescent tree. The main topological structure of the ASTRAL tree grouped the 30 individuals of 24 species of the Asiatic Beilschmiedia into two groups: Group A, consisting of 14 individuals from 10 species, and Group B, consisting of 16 individuals from 14 species (Fig. 5). The topology received strong support at the major branches (q1), but the quartet support values for the internal nodes within Group B were relatively low (q1 = 0.38, q2 = 0.27, q3 = 0.35), with each accounting for about one-third. This distribution suggests that the phylogeny of Group B is unstable, and different topologies may arise when using various chloroplast genome regions. Within Group B, B. kunstleri (LAU00236) is closely related to B. sp. (LAU00127) and B. mengwangensis (LAU00129), which forms a sister branch with B. yunnanensis (LAU00234) and B. tungfangensis (LAU00235) in the same subclade.
The ML and BI trees obtained from the concatenation method were largely consistent, except for a positional change of B. fordii (LAU00238) and B. kweichowensis (ON881523) in Group A (Supplementary Material 2: Fig. S1). The branch node exhibited very low support values (bootstrap support, BS = 40; posterior probability, PP = 0.59). The topological structure of the concatenated tree was also generally consistent with that of the coalescent tree, both dividing the Asiatic Beilschmiedia into two groups and exhibiting high support values at the main nodes (Supplementary Material 2: Fig. S1). However, the results for B. fordii (LAU00238) and B. kweichowensis (ON881523) in the concatenated tree were inconsistent with those in the coalescent tree, exhibiting a third topological structure. Additionally, some minor differences were observed between the internal branches of the concatenated and coalescent trees, especially in branches with lower support values. For example, the positions of B. laevis (LAU00091) and B. cylindrica (LAU00092) were swapped in Group A, and within Group B, B. kunstleri (LAU00236) did not cluster with B. sp. (LAU00127) and B. mengwangensis (LAU00129) in the concatenated tree but instead formed a sister relationship with B. yunnanensis (LAU00234) and B. tungfangensis (LAU00235). These changes occurred primarily in smaller branches. Another notable difference involved the two individuals of B. glauca (MT720938 and LAU00126). In the coalescent and BI trees, they formed a sister relationship with Syndiclis and the Asiatic Beilschmiedia, whereas, in the ML tree, they clustered exclusively with Syndiclis.
Phylogenetic reconstruction of the relationships in Beilschmiediineae based on analysis of whole-genome data set of chloroplasts. Pie charts for major clades correspond to quartet support, blue: main topology (q1), yellow: first alternative topology (q2), grey: second alternative topology (q3). The numbers on the branches represent q1/q2/q3 values
Identification of the most variable regions
The genetic diversity of the Groups A and B was assessed by calculating the nucleotide diversity values (π). Group A, comprising 14 Beilschmiedia sequences, exhibited minimal nucleotide variation, with π values ranging from 0 to 0.00606 and with an average of 0.00063 (Fig. 6). In contrast, Group B, which encompasses 16 sequences, displayed a broader range of π values, spanning from 0 to 0.01236, with an average of 0.00176. A significant discrepancy was observed between the two groups, with Group B demonstrating 2.79 times the nucleotide diversity of Group A. The nucleotide variation patterns between Groups A and B were strikingly similar, with a majority of high-variability regions being shared. Within Group B, we identified 11 highly variable sites (π > 0.0052), including the regions of atpB-rbcL, ndhF, ndhF-rpl32, rpl2, rpl2-trnH, rpl32-trnL, rpl32, rps15-ycf1, trnS-trnG, ycf1, and ycf4-cemA. All these sites, except trnS-trnG and rpl32-trnL, were also present in Group A (Fig. 6). Five of these sites (atpB-rbcL, trnS-trnG, rpl2, rpl2-trnH, and ycf4-cemA) are located in the LSC region, while the remaining sites are situated in the SSC region, except for part of the ycf1 gene located in the IR region—there are no other high variability regions in the IR. Notably, these highly variable sites predominantly occur in intergenic regions, with only four sites (ndhF, rpl2, rpl32, and ycf1) being located within genes.
Discussion
Gene transfer and loss in plant chloroplast genomes are relatively common and play a critical role in plant evolution and adaptation [36, 37]. This phenomenon is also observed in the chloroplast genomes of Lauraceae plants, particularly in the genus Cassytha. As the sole hemiparasitic genus within the Lauraceae family, Cassytha has undergone significant changes in its chloroplast genome, including the loss of the NADH dehydrogenase (ndh) gene family and the entire IR region [38]. In this study, we analyzed the chloroplast genomes of six Beilschmiedia species and found them to be highly similar, all encoding 131 genes, which is consistent with previous reports [39]. However, slight discrepancies were observed compared to the results of Song et al. (2017), primarily concerning the pseudogene ycf15 [38]. Previous studies have shown that the genome size of the basal Lauraceae group typically ranges from 157,577 bp to 158,530 bp, a range that has been continuously updated with increasing sample sizes. The B. kunstleri genome analyzed in this study provides additional data supporting this observation. Among the six newly sequenced genomes, B. kunstleri exhibited the largest genome size at 158,620 bp, while B. purpurascens had the smallest at 158,275 bp, resulting in a difference of 345 bp. These variations arise from multiple regions of the genome and are primarily associated with insertions, deletions, and the contraction and expansion of the IR regions [40].
The expansion and contraction of the IR regions in chloroplast genomes not only reflect the structural diversity of the genome but may also be closely related to the phylogenetic relationships of species [41, 42]. The genomes of the six species are highly similar, with no rearrangement events, indicating the conserved nature of chloroplast genomes and their consistency within the same genus. Previous reports found that the ycf1 gene is located at the boundary and exhibits incomplete replication [38]. This is also observed in this study, where the length of ycf1 at the JSB (IRb/SSC) and JSA (SSC/IRa) boundaries shows varying degrees of expansion and contraction among the six genomes, although no specific pattern is evident. The unique location of ycf1 may account for the size differences, which are reflected in the overall genome size. Additionally, we found that, apart from the incomplete replication of ycf1, rpl23 exhibits expansion and contraction. For instance, in B. kunstleri, B. purpurascens, B. robusta, B. tungfangensis, and B. yunnanensis, the rpl23 copies within the same genome are not identical, with an expansion of 1–20 bp towards the LSC region. The expansion and contraction of the IR regions in chloroplast genomes are influenced by various factors and represent a complex evolutionary process. These processes may be more pronounced in certain species or genomic regions, such as in Paphiopedilum [43], while being relatively minor in other species or regions. This uneven change is a natural phenomenon in the evolution of chloroplast genomes, reflecting the dynamic responses of genomes to environmental and genetic pressures.
This study represents the first application of the ASTRAL method to reconstruct the phylogeny of the Asiatic Beilschmiedia genus based on complete chloroplast genomes. Comparative analysis of the resulting topology with those generated by concatenation methods (including ML and BI trees) revealed a high degree of similarity. Both approaches consistently divided the Asiatic core Beilschmiedia populations into two groups: Group A and Group B. The morphology of the terminal bud (TB) is considered a key trait for Beilschmiediineae, particularly as a primary distinguishing feature among Beilschmiedia species [44, 45]. It can be categorized into large-TB and small-TB types (including crumpled, slender, and crescent forms). However, our molecular evidence did not fully align with the classification based on TB shape, consistent with Li et al. (2020) [14]. The discordance between molecular and morphological evidence remains a major challenge in contemporary plant systematics [46, 47, 48].
In the Asiatic core Beilschmiedia populations of Group A and Group B, the intermingled distribution of large-TB and small-TB species resulted in no significant differences in TB length, width, or length-to-width ratios between the two groups. Nevertheless, Group A predominantly comprised large-TB species, while Group B was dominated by small-TB species, suggesting that TB shape retains potential value as a taxonomic character [49]. Measurements of TB revealed the following ranges: large-TB measured 4.17–8.23 mm in length, 2.3–4.43 mm in width, and had a length-to-width ratio of 1.68–2.97; small-TB measured 2.95–8.0 mm in length, 0.8–2.25 mm in width, and had a length-to-width ratio of 1.36–6.13. Due to the diversity within small-TB types, neither length nor length-to-width ratio serves as reliable distinguishing criteria. In contrast, width (diameter) showed potential as a distinguishing feature, although this conclusion requires further validation with additional species data. It is also important to note that TB size may vary slightly between live plants and herbarium specimens, and even among different branches of the same species. The observed conflict between morphological traits and molecular evidence in phylogenetic analyses may be attributed to several factors: (1) The study primarily relied on chloroplast genomes for phylogenetic reconstruction. Given their maternal inheritance, the resulting phylogenetic relationships may not fully reflect the actual evolutionary history of the species, particularly in cases involving hybridization or introgression [50, 51]; (2) Although TB shape is considered an important taxonomic character, it may have been subject to independent natural selection pressures during evolution. Furthermore, our phylogenetic results were highly similar to those previously reported, yet there are still some differences. Unlike our study, Li et al. (2020) did not divide the Asiatic core Beilschmiedia into two groups [14]. 
This discrepancy may stem from incomplete sampling, as our study included approximately twice as many Asiatic core Beilschmiedia populations as Li et al.’s study. Song et al. (2023) also divided Asiatic Beilschmiedia into two groups, corresponding to our Group A and Group B [6]. However, in their study, B. kunstleri clustered with Syndiclis, while in our study, it was placed in Group B of the Asiatic Beilschmiedia. This difference may be due to insufficient informative sites provided by the barcode sequences. In conclusion, this study provides a more comprehensive and higher-resolution phylogenetic framework for the Asiatic core Beilschmiedia, laying a solid foundation for future taxonomic research.
Comparative analysis of chloroplast genome lengths between Groups A and B. A, The lengths of distinct genomic regions are mapped onto the phylogenetic tree, with red branches representing Group A and blue branches indicating Group B. The bar graphs, color-coded in blue, green, purple, and dark red, represent the total chloroplast genome length, the length of the LSC region, the length of the SSC region, and the length of the IR region, respectively, for each species, with units in kilobases (kb). B, Comparative analysis of the total chloroplast genome length; C, Comparative analysis of the LSC region length; D, Comparative analysis of the IR region length; E, Comparative analysis of the SSC region length
Mapping the chloroplast genome sizes of species from Groups A and B onto their corresponding phylogenetic trees revealed that the sizes of these genomes are very similar, with no distinct phylogenetic pattern emerging (Fig. 7A). This suggests that the chloroplast genomes of the species in these two groups did not undergo significant expansion or contraction during evolution. Similar ecological and environmental pressures may have contributed to the maintenance of their chloroplast genome sizes within a comparable range [52, 53]. A comparative analysis of the genome lengths and the lengths of various regions in Groups A and B showed significant differences in the IR and SSC regions (Fig. 7D and E), while no significant differences were observed in the total length and the LSC region (Fig. 7B and C). The variation in the IR and SSC regions may be the primary factors contributing to the differences in genome lengths.
Hypervariable loci have been widely used as barcoding markers for taxon identification and phylogenetic analysis [54]. The phylogenetic results in this study divided the Asiatic Beilschmiedia species into Groups A and B, and we aimed to explore the sequence variation within these two groups to identify potential molecular markers for distinguishing them. We successfully identified two potential sites, trnS-trnG and rpl32-trnL. However, we recommend using multiple regions in combination to improve the accuracy of the results. This recommendation is based on two main reasons: (1) the sequence variation among the 14 individuals in Group A is remarkably low, and (2) Group A and Group B share most of the hypervariable loci. These shared hypervariable regions (atpB-rbcL, ndhF, ndhF-rpl32, rpl2, rpl2-trnH, rpl32, rps15-ycf1, ycf1, and ycf4-cemA) have also been reported in previous studies on Lauraceae [55, 56]. By combining the variation patterns of Group A and Group B, the seven hypervariable loci (ndhF, ndhF-rpl32, rpl2, rpl2-trnH, rpl32, rps15-ycf1, and ycf1) may serve as more effective markers for species identification and phylogenetic studies, particularly the top three regions with the highest variation (ycf1, ndhF, and ndhF-rpl32). The identified hypervariable loci are not randomly distributed but represent hotspots for the Indel (insertion/deletion) and single nucleotide polymorphism (SNP) mutation events [57, 58]. These mutation events contribute to the rapid evolution of these hypervariable loci, which are regions of the chloroplast genome that evolve at an accelerated rate and ultimately influence genome size.
Conclusion
This study assembled the chloroplast genomes of six Beilschmiedia species, enriching the genomic data resources for this genus. Comparative analysis showed only minor variation in genetic composition, repeat sequences, and codon usage bias among these genomes, which supports the conserved nature of chloroplast genomes. Notably, the rpl23 and ycf1 genes at the IR boundaries showed varying degrees of expansion and contraction. The coalescent tree and the concatenated tree were highly concordant and supported the division of the Asiatic core Beilschmiedia into two distinct groups, providing a more comprehensive and higher-resolution phylogenetic framework. Our results suggest that the trnS-trnG and rpl32-trnL regions are promising molecular markers for distinguishing between these groups. Furthermore, we propose seven additional candidate markers for further exploration and validation in future studies.
Data availability
The chloroplast genome sequences of the six Beilschmiedia species in this study have been deposited in the Lauraceae chloroplast genome database (https://lcgdb.wordpress.com/) and GenBank under accession numbers LAU00234-00239 and PQ899492-899497, respectively.
References
Joyce EM, Thiele KR, Slik FJW, Crayn DM. Checklist of the vascular flora of the Sunda-Sahul Convergence Zone. Biodivers Data J. 2020;8:e51094.
Holzmeyer L, Hauenschild F, Muellner-Riehl AN. Sunda–Sahul floristic exchange and pathways into the Southwest Pacific: new insights from wet tropical forest trees. J Biogeogr. 2023;50(7):1257–70.
Nishida S. Revision of Beilschmiedia (Lauraceae) in the neotropics. Ann Mo Bot Gard. 1999:657–701.
van der Werff H. An annotated key to the genera of Lauraceae in the Flora Malesiana region. Blumea: Biodivers Evol Biogeogr Plants. 2001;46(1):125–40.
Salleh WMNHW, Ahmad F, Khong HY, Mohamed Zulkifli R. Comparative study of the essential oils of three Beilschmiedia species and their biological activities. Int J Food Sci Technol. 2015;51(1):240–9.
Song Y, Xia SW, Tan YH, Yu WB, Yao X, Xing YW, Corlett RT. Phylogeny and biogeography of the Cryptocaryeae (Lauraceae). Taxon. 2023;72(6):1244–61.
Clegg MT, Gaut BS, Learn GH Jr., Morton BR. Rates and patterns of chloroplast DNA evolution. Proc Natl Acad Sci U S A. 1994;91(15):6795–801.
Daniell H, Lin CS, Yu M, Chang WJ. Chloroplast genomes: diversity, evolution, and applications in genetic engineering. Genome Biol. 2016;17(1):134.
Rohwer JG. Toward a phylogenetic classification of the Lauraceae: evidence from matK sequences. Syst Bot. 2000;25(1).
Chanderbali AS, van der Werff H, Renner SS. Phylogeny and historical biogeography of Lauraceae: evidence from the chloroplast and nuclear genomes. Ann Mo Bot Gard. 2001;88(1).
Rohwer JG, Rudolph B. Jumping Genera: the phylogenetic positions of Cassytha, Hypodaphnis, and Neocinnamomum (Lauraceae) based on different analyses of trnK intron sequences. Ann Mo Bot Gard. 2005;92(2):153–78.
Rohwer JG, De Moraes PLR, Rudolph B, Werff HVD. A phylogenetic analysis of the Cryptocarya group (Lauraceae), and relationships of Dahlgrenodendron, Sinopora, Triadodaphne, and Yasunia. Phytotaxa. 2014;158(2).
Song Y, Yu WB, Tan YH, Jin JJ, Wang B, Yang JB, Liu B, Corlett RT. Plastid phylogenomics improve phylogenetic resolution in the Lauraceae. J Syst Evol. 2019;58(4):423–39.
Li H, Liu B, Davis CC, Yang Y. Plastome phylogenomics, systematics, and divergence time estimation of the Beilschmiedia group (Lauraceae). Mol Phylogenet Evol. 2020;151:106901.
Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19(1):11–5.
Chen S, Zhou Y, Chen Y, Gu J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884–90.
Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, Li DZ. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020;21(1):241.
Wick RR, Schultz MB, Zobel J, Holt KE. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31(20):3350–2.
Shi L, Chen H, Jiang M, Wang L, Wu X, Huang L, Liu C. CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res. 2019;47(W1):W65–73.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.
Greiner S, Lehwark P, Bock R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res. 2019;47(W1):W59–64.
Beier S, Thiel T, Munch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33(16):2583–5.
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29(22):4633–42.
Zhang D, Gao F, Jakovlic I, Zou H, Zhang J, Li WX, Wang GT. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20(1):348–55.
Chen C, Wu Y, Li J, Wang X, Zeng Z, Xu J, Liu Y, Feng J, Chen H, He Y, Xia R. TBtools-II: a one for all, all for one bioinformatics platform for biological big-data mining. Mol Plant. 2023;16(11):1733–42.
Li H, Guo Q, Xu L, Gao H, Liu L, Zhou X. CPJSdraw: analysis and visualization of junction sites of chloroplast genomes. PeerJ. 2023;11:e15326.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. In: Nucleic Acids Symposium Series. Oxford; 1999. p. 95–8.
Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.
Junier T, Zdobnov EM. The Newick utilities: high-throughput phylogenetic tree processing in the UNIX shell. Bioinformatics. 2010;26(13):1669–70.
Zhang C, Rabiee M, Sayyari E, Mirarab S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics. 2018;19(Suppl 6):153.
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9.
Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9(8):772.
Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. MrBayes 3.2: efficient bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61(3):539–42.
Rozas J, Ferrer-Mata A, Sanchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sanchez-Gracia A. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol Biol Evol. 2017;34(12):3299–302.
Martin W, Rujan T, Richly E, Hansen A, Cornelsen S, Lins T, Leister D, Stoebe B, Hasegawa M, Penny D. Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc Natl Acad Sci U S A. 2002;99(19):12246–51.
Li Y, Zhou JG, Chen XL, Cui YX, Xu ZC, Li YH, Song JY, Duan BZ, Yao H. Gene losses and partial deletion of small single-copy regions of the chloroplast genomes of two hemiparasitic Taxillus species. Sci Rep. 2017;7(1):12834.
Song Y, Yu WB, Tan Y, Liu B, Yao X, Jin J, Padmanaba M, Yang JB, Corlett RT. Evolutionary comparisons of the Chloroplast Genome in Lauraceae and insights into loss events in the magnoliids. Genome Biol Evol. 2017;9(9):2354–64.
Zhu W, Zhang H, Li Q, Cao Z, Song Y, Xin P. Complete plastid genome sequences of three Tropical African Beilschmiediineae Trees (Lauraceae: Crytocaryeae). Forests. 2024;15(5).
Xiao-Ming Z, Junrui W, Li F, Sha L, Hongbo P, Lan Q, Jing L, Yan S, Weihua Q, Lifang Z, et al. Inferring the evolutionary mechanism of the chloroplast genome size by comparing whole-chloroplast genome sequences in seed plants. Sci Rep. 2017;7(1):1555.
Wang RJ, Cheng CL, Chang CC, Wu CL, Su TM, Chaw SM. Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol Biol. 2008;8:36.
Wang X, Dorjee T, Chen Y, Gao F, Zhou Y. The complete chloroplast genome sequencing analysis revealed an unusual IRs reduction in three species of subfamily Zygophylloideae. PLoS ONE. 2022;17(2):e0263253.
Guo YY, Yang JX, Bai MZ, Zhang GQ, Liu ZJ. The chloroplast genome evolution of Venus slipper (Paphiopedilum): IR expansion, SSC contraction, and highly rearranged SSC regions. BMC Plant Biol. 2021;21(1):248.
Liu B, Yang Y, Xie L, Zeng G, Ma K. Beilschmiedia turbinata: a newly recognized but dying species of Lauraceae from tropical Asia based on morphological and molecular data. PLoS ONE. 2013;8(6):e67636.
Liu B. Systematics and Biogeography of the Subtribe Beilschmiediineae (Lauraceae) in China. Ph.D.: the Chinese Academy of Sciences; 2013.
Donoghue MJ, Sanderson MJ. The Suitability of Molecular and Morphological Evidence in Reconstructing Plant Phylogeny. In: Molecular Systematics of Plants. Edited by Soltis PS, Soltis DE, Doyle JJ. Boston, MA: Springer US; 1992: 340–368.
Patterson C, Williams DM, Humphries CJ. Congruence between Molecular and Morphological Phylogenies. Annu Rev Ecol Syst. 1993;24:153–88.
Oyston JW, Wilkinson M, Ruta M, Wills MA. Molecular phylogenies map to biogeography better than morphological ones. Commun Biol. 2022;5(1):521.
Li XW, Li J, Huang PH, Wei FN, Cui HB, van der Werff H. Lauraceae. In: Wu ZY, Raven PH, Hong DY, editors. Flora of China. Volume 7. Beijing: Science Press; St. Louis: Missouri Botanical Garden Press; 2008. p. 102–254.
Lee-Yaw JA, Grassa CJ, Joly S, Andrew RL, Rieseberg LH. An evaluation of alternative explanations for widespread cytonuclear discordance in annual sunflowers (Helianthus). New Phytol. 2019;221(1):515–26.
Xu Y, Wei Y, Zhou Z, Cai X, Boden SA, Umer MJ, Safdar LB, Liu Y, Jin D, Hou Y, et al. Widespread incomplete lineage sorting and introgression shaped adaptive radiation in the Gossypium Genus. Plant Commun. 2024;5(2):100728.
Bennett MD, Leitch IJ. CHAPTER 2 - Genome Size Evolution in Plants. In: The Evolution of the Genome. Edited by Gregory TR. Burlington: Academic Press; 2005: 89–162.
Dobrogojski J, Adamiec M, Luciński R. The chloroplast genome: a review. Acta Physiol Plant. 2020;42(6).
Hollingsworth PM, Forrest LL, Spouge JL, Hajibabaei M, Ratnasingham S, van der Bank M, Chase MW, Cowan RS, Erickson DL. A DNA barcode for land plants. Proc Natl Acad Sci U S A. 2009;106(31):12794–7.
Song Y, Dong W, Liu B, Xu C, Yao X, Gao J, Corlett RT. Comparative analysis of complete chloroplast genome sequences of two tropical trees Machilus yunnanensis and Machilus balansae in the family Lauraceae. Front Plant Sci. 2015;6:662.
Song Y, Yao X, Tan Y, Gan Y, Yang J, Corlett RT. Comparative analysis of complete chloroplast genome sequences of two subtropical trees, Phoebe sheareri and Phoebe omeiensis (Lauraceae). Tree Genet Genomes. 2017;13(6).
Shaw J, Lickey EB, Schilling EE, Small RL. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. Am J Bot. 2007;94(3):275–88.
Worberg A, Quandt D, Barniske A-M, Löhne C, Hilu KW, Borsch T. Phylogeny of basal eudicots: insights from non-coding and rapidly evolving DNA. Organisms Divers Evol. 2007;7(1):55–77.
Acknowledgements
The authors would like to thank Di Zhang at the Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, for checking the data. We sincerely thank two anonymous reviewers for their critical and invaluable comments that greatly improved our manuscript.
Funding
This work was supported by the Yunnan Province landscape architecture first-class discipline construction fund; Key Technologies Research for the Germplasm of Important Woody Flowers in Yunnan Province (No. 202302AE090018); Special Program for Technology Bases and Talents of Guangxi (No. GuikeAD23026281); the National Natural Science Foundation of China (No. 32260060 & 32060710); the Project of the Southeast Asia Biodiversity Research Institute, Chinese Academy of Sciences (No. Y4ZK111B01) and Yunnan Province Science and Technology Department (No. 202203AP140007).
Author information
Contributions
W.Z., Y.S., and P.Y.X. designed the work. J.R.M. prepared the datasets. W.Z., J.R.M., Y.H.T., Y.S., and P.Y.X. contributed materials/analysis tools. W.Z. wrote the manuscript. W.Z., Y.S., and P.Y.X. revised the manuscript. All authors approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This study’s material collections and experimental research complied with relevant institutional, national, and international guidelines and legislation. No specific permissions or licenses were required.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Material 1: Table S1: Chloroplast genomes obtained from NCBI and LCGDB for this study. Table S2: Genes present in the chloroplast genome of Beilschmiedia. Table S3: SSR analysis of six newly sequenced chloroplast genomes.
Supplementary Material 2: Fig. S1: Phylogeny of the Beilschmiediineae group integrating maximum likelihood (ML) and Bayesian inference (BI) trees based on complete chloroplast genomes. The numbers above the nodes of the ML tree (left) represent bootstrap percentages (BS, %) and those above the nodes of the BI tree (right) indicate Bayesian posterior probabilities (PP).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, W., Ma, J., Tan, Y. et al. Phylogenetic analysis of Asiatic species in the tropical genus Beilschmiedia (Lauraceae). BMC Genomics 26, 226 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12864-025-11354-x
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12864-025-11354-x