Fig. 3

Proportions of classified reads for different (A) basecalling models, (B) sequencing kits, and (C) DNA extraction kits. The Australian dataset shown here includes only the samples extracted with the PowerFecal kit. The Australian dataset used the ligation sequencing kit SQK-LSK109 and the rapid sequencing kit SQK-RBK110.96; the Spanish dataset used the ligation sequencing kit SQK-LSK109 and the rapid sequencing kit SQK-RBK004. The DNA extraction kit comparison used only samples from the Australian dataset that were sequenced with the ligation kit SQK-LSK109. Data for the sequencing and extraction kit analyses were basecalled in SUP mode. Dotted lines in the boxplots link measurements from the same DNA sample. A t-test was used to compare the means of the different basecalling, extraction, and sequencing protocols (an unpaired t-test for the Australian dataset and a paired t-test for the Spanish dataset). P-values < 0.05: *; P-values < 0.01: **; P-values < 0.001: ***. FAST: fast basecalling; HAC: high-accuracy basecalling; SUP: super-accurate basecalling.
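
The significance notation and the paired versus unpaired comparison described in the caption can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names are hypothetical, and the unpaired statistic assumes the pooled-variance (Student's) form, which the caption does not specify. Only the t statistics are computed here; obtaining p-values would additionally require the t distribution.

```python
import math
import statistics


def p_to_stars(p: float) -> str:
    """Map a p-value to the star notation used in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""  # not significant at the 0.05 level


def paired_t_statistic(x: list[float], y: list[float]) -> float:
    """Paired t statistic: x[i] and y[i] come from the same DNA sample."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Mean of the per-sample differences divided by its standard error.
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def unpaired_t_statistic(x: list[float], y: list[float]) -> float:
    """Unpaired t statistic (assumption: pooled-variance Student's form)."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled * (1 / n1 + 1 / n2)
    )
```

In the paired case each difference is taken within one DNA sample, which corresponds to the dotted lines linking points in the boxplots; the unpaired case treats the two groups as independent.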