Whole genome sequencing, genome mining and molecular networking constitute the basis for the mass spectrometry-guided genome mining protocol that we describe here, and are the most contemporary approach in natural product discovery. Genome mining and molecular networking match gene cluster-encoded annotations in whole genome sequences with chemical structure signatures from crude extracts, providing massive prospects for uncovering novel bioactive molecules.
In principle, genome mining targets a single or a small group of molecules per experiment, thereby resulting in a slow discovery rate. In this sense, using genome mining along with molecular networking represents an important advance for natural products research.
Mass spectrometry-guided genome mining is a versatile method which can provide structural information on several chemotypes contained in the large amount of data from a crude extract simultaneously. Today it is a wide-ranging methodology capable of exploring bacterial, fungal or plant sequences, providing chemotype-to-genotype and genotype-to-chemotype bioinformatics links between biosynthetic gene clusters and their small molecule products.
Here, the protocol is demonstrated to characterize novel cyclic depsipeptide products observed in metabolic extracts of Streptomyces species, but it is applicable to fungal and plant sequences and extracts as well. It is very important to have a nicely curated genome dataset, and nowadays fragmented draft sequences have increasingly been supplanted by near-finished sequence data.
Also, different strategies have been developed to access new bioactive natural products, and certainly combining them with the MS-guided approach will lead to improvements in natural product isolation and structure elucidation. As here we are using a genotype-to-chemotype approach, bioinformatics links between biosynthetic gene clusters and their small molecule products are crucial
to estimate the variety and class of compounds encoded by the genome, which means dereplication. Later, clustering based on ion similarities can be performed by using GNPS.
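The clustering by ion similarity mentioned above can be illustrated with a small sketch. This is a simplified greedy cosine score between two MS/MS spectra, not the scoring that GNPS itself runs; the function name `cosine_score`, the spectrum layout as `(mz, intensity)` tuples and the default tolerance are assumptions made for illustration.

```python
import math


def cosine_score(spec_a, spec_b, fragment_tol=0.05):
    """Greedy cosine similarity between two MS/MS spectra.

    Each spectrum is a list of (mz, intensity) tuples. Two peaks are
    considered a match when their m/z values differ by at most
    fragment_tol (in Da). This is an illustrative simplification.
    """
    # Collect every candidate peak pair within tolerance,
    # scored by the product of the two intensities.
    pairs = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= fragment_tol:
                pairs.append((int_a * int_b, i, j))

    # Greedily keep the strongest pairs, using each peak at most once.
    pairs.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical spectra, made up for the example only.
spectrum_1 = [(100.0, 10.0), (200.0, 5.0), (300.0, 1.0)]
spectrum_2 = [(100.02, 8.0), (200.01, 6.0), (350.0, 2.0)]
score = cosine_score(spectrum_1, spectrum_2)
```

Spectra that share most of their fragment ions score close to 1 and end up connected in the same network cluster, while unrelated spectra score close to 0.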
Demonstrating the procedure will be Dr. Renata Sigrist and Professor Angolini, my collaborators in this study. To obtain in silico information about secondary metabolism gene cluster annotations from a completely sequenced genome, submit the sequence file, in GenBank, EMBL or FASTA format, to the antiSMASH platform.
Select the gene clusters of interest from the output data based on the most similar known cluster. Based on the DNA sequence information of the BGC, design primers of 20 to 25 nucleotides flanking the gene cluster for E. coli-Streptomyces artificial chromosome library screening.
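As a rough illustration of the primer step, the sketch below pulls one candidate primer pair of 20 to 25 nucleotides from each flank of a cluster. The function names, the coordinate convention and the `flank` default are hypothetical, and the code does not check melting temperature, secondary structure or specificity, which a dedicated primer design tool would handle.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def flanking_primer_candidates(genome, bgc_start, bgc_end, flank=200, length=22):
    """Return one (forward, reverse) candidate primer pair per flank.

    genome: DNA string. bgc_start and bgc_end are 0-based, end-exclusive
    coordinates of the cluster (an assumed convention). flank is how many
    bases outside the cluster are considered (hypothetical default).
    length is kept within the 20 to 25 nt range given in the protocol.
    """
    if not 20 <= length <= 25:
        raise ValueError("primer length should be 20 to 25 nucleotides")
    genome = genome.upper()
    upstream = genome[max(0, bgc_start - flank):bgc_start]
    downstream = genome[bgc_end:bgc_end + flank]
    result = {}
    for name, region in (("upstream", upstream), ("downstream", downstream)):
        if len(region) < 2 * length:
            raise ValueError(f"{name} flank too short for a primer pair")
        forward = region[:length]  # same strand, start of flank
        reverse = reverse_complement(region[-length:])  # opposite strand, end of flank
        result[name] = (forward, reverse)
    return result
```

A clone that amplifies with both the upstream and the downstream pair is likely to carry the whole cluster, which is the reason for placing primers in both flanks.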
After obtaining the recombinant heterologous organism according to the manuscript, inoculate one hundredth of the strain's preculture in an appropriate fermentation medium, such as liquid ISP2 medium. Place the culture in an incubator at 30 degrees Celsius and 220 RPM for seven days.
After incubation, centrifuge the cultures at 2200 times g for 10 minutes. Discard the cells and transfer the supernatant into a separatory funnel. Add a one-to-one volume of ethyl acetate and shake.
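The bench figures in these steps (a one-hundredth inoculum and a spin at 2200 times g) can be turned into instrument settings with a little arithmetic. The sketch below assumes the standard relation RCF = 1.118e-5 × radius (cm) × RPM²; the culture volume and rotor radius are hypothetical values, so substitute those of your own flasks and centrifuge.

```python
import math


def inoculum_volume_ml(culture_volume_ml, dilution=100):
    """Volume of preculture needed for a 1:dilution inoculum."""
    return culture_volume_ml / dilution


def rpm_for_rcf(rcf, rotor_radius_cm):
    """RPM needed to reach a relative centrifugal force (times g).

    Uses RCF = 1.118e-5 * radius_cm * RPM**2.
    """
    return math.sqrt(rcf / (1.118e-5 * rotor_radius_cm))


# Hypothetical example: 50 mL culture, rotor radius of 10 cm.
preculture_ml = inoculum_volume_ml(50)
spin_rpm = rpm_for_rcf(2200, 10)
```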
Wait one to three minutes for the organic and aqueous layers to separate, and collect the organic layer in an Erlenmeyer flask. To acquire MS/MS data,
program suitable HPLC and mass spectrometry methods using the control software. It should be noted that the MS/MS network is the detectable molecular network under the given mass spectrometric conditions. Convert mass spectra to mzXML format using MSConvert from ProteoWizard.
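For batches of files, the conversion can also be scripted with the ProteoWizard command-line tool `msconvert`. The sketch below only builds the command; the file and folder names are hypothetical, and the centroiding filter is an assumption that is commonly applied before molecular networking rather than a setting stated in this protocol.

```python
import shlex
from pathlib import Path


def build_msconvert_command(raw_file, out_dir, peak_picking=True):
    """Build an msconvert command line that writes mzXML output."""
    cmd = ["msconvert", str(raw_file), "--mzXML", "-o", str(out_dir)]
    if peak_picking:
        # Assumed setting: centroid all MS levels during conversion.
        cmd += ["--filter", "peakPicking true 1-"]
    return cmd


# Hypothetical input file and output folder.
cmd = build_msconvert_command(Path("extract_01.raw"), Path("mzxml"))
print(shlex.join(cmd))
# To run it, pass the list to subprocess.run(cmd, check=True)
# on a machine where msconvert is installed and on the PATH.
```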
Adjust the input parameters for the conversion. Upload the converted LC-MS/MS files into the GNPS database. Two options are available:
using a file transfer protocol, or directly in a browser through the online platform. After creating an account in GNPS, log in to the created account and select Create Molecular Network. Add a job title.
To set the basic options, select the mzXML files to perform the molecular network. Organize them in up to six groups.
Select the libraries for the dereplication routine. Select a precursor ion mass tolerance and a fragment ion mass tolerance of 0.02 Dalton and 0.05 Dalton, respectively. To set the advanced network options, select the parameters which directly influence the network cluster size and form.
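To get a feel for what the two tolerances mean, the sketch below converts an absolute tolerance in Dalton into ppm at a given m/z and checks whether two ions fall within tolerance. The constant and function names are illustrative, and the m/z values are hypothetical.

```python
PRECURSOR_TOL_DA = 0.02  # precursor ion mass tolerance from the protocol
FRAGMENT_TOL_DA = 0.05   # fragment ion mass tolerance from the protocol


def da_to_ppm(tolerance_da, mz):
    """Express an absolute tolerance (Da) as ppm at a given m/z."""
    return tolerance_da / mz * 1e6


def within_tolerance(mz_observed, mz_reference, tolerance_da):
    """True when two m/z values differ by no more than the tolerance."""
    return abs(mz_observed - mz_reference) <= tolerance_da


# Hypothetical precursor at m/z 1000: 0.02 Da corresponds to about 20 ppm.
ppm_at_1000 = da_to_ppm(PRECURSOR_TOL_DA, 1000.0)
same_precursor = within_tolerance(1000.01, 1000.0, PRECURSOR_TOL_DA)
```

Because the tolerance is absolute, it is tighter in relative terms for heavier ions, which is worth keeping in mind when the instrument's accuracy is specified in ppm.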
Advanced parameters are described in the GNPS documentation. Choose an email address to receive an alert when the work is done and submit the job. To analyze GNPS results,
log in to GNPS. Select Jobs, Published Job, Done to open a job. A webpage opens with all results obtained from molecular networking displayed.
Select View Spectral Families, In Browser Network Visualizer, to see all network clusters. A list is displayed with all generated molecular networking clusters. Tentative molecular identifications are displayed under All IDs.
Select Show to visualize them. To analyze the molecular network cluster, select Visualize Network. In the Node Labels box, select Parent Mass.
In the Edge Labels box, select Cosine or DeltaMZ to observe node similarity or mass difference between nodes, respectively. In the case of multi-group analysis, click Draw Pies in the Node Coloring box
to observe the frequency at which each node appears in each group. To see all library hits, select View All Library Hits. Open the fragmentation spectra directly in the GNPS platform
or in the original raw files to manually confirm dereplicated compounds and elucidate the structures of related compounds. The protocol was successfully exemplified using a combination of genome mining, heterologous expression and MS-guided approaches to access new specialized valinomycin analog molecules.
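The mass-difference edge labels and the pie charts described for the network view can be reproduced offline from exported node data. The sketch below is a minimal illustration under assumed data layouts (a dictionary of parent masses and a list of group labels per node); the names and numbers are hypothetical and do not come from the GNPS export format.

```python
from collections import Counter


def edge_mass_differences(parent_masses, edges):
    """Absolute parent mass difference for each edge.

    parent_masses: dict mapping node id to parent mass.
    edges: iterable of (node_a, node_b) pairs.
    """
    return {(a, b): abs(parent_masses[a] - parent_masses[b]) for a, b in edges}


def group_fractions(spectrum_groups):
    """Fraction of a node's spectra that come from each group.

    spectrum_groups: one group label per spectrum merged into the node.
    """
    counts = Counter(spectrum_groups)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: count / total for group, count in counts.items()}


# Hypothetical nodes and groups, made up for the example only.
masses = {1: 500.0, 2: 514.0}
deltas = edge_mass_differences(masses, [(1, 2)])
pie = group_fractions(["G1", "G1", "G2", "G1"])
```

A recurring difference between neighbouring nodes often points to a series of related analogs, and the group fractions show which sample or condition contributes each ion.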
The genome-to-molecule workflow for the target valinomycin is presented here. The chromatogram showed that valinomycin, montanastatin and five analogs were produced by expression of the valinomycin biosynthetic gene cluster in a heterologous host. In the molecular network, ions corresponding to valinomycin, an already known compound with the corresponding biosynthetic gene cluster annotated in the Streptomyces species CBMAI 2042 genome,
are clustered with ions related to analogs first described for the valinomycin biosynthetic gene cluster. The MS spectra and chemical structures for valinomycin and related analogs are shown here. Performing whole genome sequencing and proper dataset curation is the first step to selecting a biosynthetic gene cluster for MS-guided genome mining.
The whole procedure can be performed with the wild-type strain crude extract, but you can opt to select the gene cluster of your interest and promote heterologous expression. Different methods are described to capture the whole biosynthetic gene cluster from a DNA sample. The strongest advantage of this protocol is its ability to rapidly dereplicate metabolic profiles and bridge genomic information with MS data
in order to elucidate the structures of new molecules. MS-guided genome mining was first described in the fields of peptidogenomics and glycogenomics by Dorrestein and colleagues. And after the introduction of GNPS, integrated metabolomics and genome mining approaches have become the most versatile avenue to connect molecular networks with biosynthetic capabilities.