Abstract:
Respiratory syncytial virus (RSV) is a leading cause of lower respiratory tract infections among children under the age of five. No effective vaccine against the virus exists, but there have been efforts to guide vaccine development by characterising the transmission and evolutionary patterns of the virus through targeted partial and whole genome sequencing studies. This project aimed to develop a viral metagenomics enrichment protocol for RSV whole genome sequencing (WGS) on the ONT MinION device, as a step towards unbiased sequencing of respiratory viruses. Nasopharyngeal samples, however, contain higher quantities of host and bacterial nucleic acid than of viral genetic material, which poses a challenge for viral metagenomic sequencing, the approach that underpins agnostic sequencing protocols. Two unbiased viral enrichment protocols were therefore assessed using a similar set of samples. Protocol 1 involved physical pre-treatment of samples by centrifugal processing before RNA extraction, while Protocol 2 entailed direct RNA extraction from samples without a pre-treatment step. In Protocol 1, samples were centrifuged at 8000 rpm for five minutes to obtain a pellet and a supernatant; the supernatant was then passed through 3 kDa centrifugal filters at 14000 rpm for one hour to obtain a concentrate and a filtrate, the concentrate being the main fraction of interest. Concentrates from Protocol 1 were divided into two fractions, one DNase-treated and one untreated, before RNA extraction. In Protocol 2, the extracted RNA was divided into two fractions, one DNase-treated and one untreated. RNA from both protocols was converted to cDNA and amplified using the sequence-independent single primer amplification (SISPA) approach, after which libraries were prepared and sequenced.
DNase-treated fractions from both protocols showed significantly less host and bacterial contamination than the untreated fractions (p<0.01 in each protocol). Additionally, DNase treatment after RNA extraction (Protocol 2) reduced host and bacterial reads more than DNase treatment before extraction (Protocol 1) (p<0.01). However, neither protocol yielded whole RSV genomes. Sequenced reads mapped to parts of the nucleoprotein (N gene) and polymerase complex (L gene) in Protocols 1 and 2, respectively. The incomplete genome coverage in both protocols was attributed to amplification bias introduced when part of the tag in the tagged Endoh primers anneals to the genome, which can happen because the random sequence is short (6 bases). This study recommends extending the random sequence in the tagged Endoh primers from six to around 9-12 bases, since the length of the random sequence in tagged random primers is an important factor in the success of SISPA.
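
As a rough illustration of the random-sequence lengths discussed above, the sketch below counts how many distinct sequences a fully random region of n bases can take (4^n) and assembles one tagged random primer as a fixed 5' tag followed by n random bases. It is only an arithmetic aid under stated assumptions: the tag shown is a placeholder and not the Endoh tag, the function names are hypothetical, and nothing here models annealing or amplification bias.

```python
import random

BASES = "ACGT"

def pool_complexity(n: int) -> int:
    """Number of distinct sequences in a fully random region of n bases (4**n)."""
    return len(BASES) ** n

def tagged_random_primer(tag: str, n: int, rng: random.Random) -> str:
    """Build one primer: a fixed 5' tag followed by n random bases at the 3' end."""
    random_part = "".join(rng.choice(BASES) for _ in range(n))
    return tag + random_part

# Placeholder tag for illustration only (assumption, NOT the Endoh tag sequence).
PLACEHOLDER_TAG = "ACGTACGTACGTACGTACGT"

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same primers
    for n in (6, 9, 12):
        primer = tagged_random_primer(PLACEHOLDER_TAG, n, rng)
        print(f"n={n:2d}  possible random sequences={pool_complexity(n):>10,d}  example primer={primer}")
```

A six-base random region can take 4,096 distinct sequences, a nine-base region 262,144, and a twelve-base region 16,777,216. These counts describe pool size only; whether a longer random region reduces tag-driven annealing in practice would need to be confirmed experimentally.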