In the context of chemical safety, the SeqAPASS protocol provides a rapid and streamlined process for evaluating protein conservation across the diversity of species, for extrapolation of both toxicity and biological pathway knowledge. SeqAPASS was developed as a simplified, transparent, and user-friendly web-based interface for rapid cross-species extrapolation, intended for use by both researchers and decision makers. The tool addresses the challenge of extrapolating chemical toxicity knowledge using available information about species sensitivity, to consistently predict chemical susceptibility for hundreds to thousands of species that could never be tested in a laboratory setting.
To begin working with the SeqAPASS tool, first navigate to seqapass.epa.gov. Existing users can select the login button to access their account. New users can create an account by selecting the link on the homepage.
After logging in to the SeqAPASS tool, navigate to the Request SeqAPASS Run tab. SeqAPASS includes several resources to help users identify a query protein target. These can be accessed by clicking the dropdown buttons under Identify a Protein Target.
Once a protein target is identified, click either By Species or By Accession under Compare Primary Amino Acid Sequences. For example, to analyze the conservation of the mu opioid receptor, click By Accession, and enter the accession number in the protein accession box. Unless a protein accession was already identified by the user, the species search option is recommended.
Select Request Run to initiate the query. Once your query has been submitted, select the SeqAPASS Run Status tab at the top of the page to view the status of the submitted run. The time to completion will depend on current tool usage.
In the View SeqAPASS Reports tab, all runs conducted under a given account are listed. Select View Report to view data in the web browser, and click Request Selected Report to open the level one Query Protein Information page, and view results and data customization options. To develop and run a level two SeqAPASS analysis, assessing the conservation of specific protein domains, click the plus sign next to the Level Two header on the Level One Query Protein Information page to populate the Level Two query menu.
Click the Select Domain box to populate a list of functional domains for the query protein. Once a domain is identified from the NCBI Conserved Domain Database, and selected in the SeqAPASS dropdown list, initiate the level two query. This is done by clicking the Request Domain Run button.
Click Refresh Level Two and Three Runs to populate level two data. Under View Level Two Data, click the Select Completed Domain box to select the completed domain accession. You can then click the View Level Two data button to open the results.
Data is displayed under the View SeqAPASS tab for the selected level. First, let's look at the level one data. On the Query Protein Information page, select the radio button next to the primary or full report to view the data in either a condensed or expanded format. Species will either have a susceptibility prediction of yes or no, indicating whether the protein is conserved relative to the query sequence.
Within the report, click on the appropriate protein accession or species ID to link out to the NCBI databases and access further information. To explore corresponding toxicity data for the species with susceptibility predictions, scroll to the right side of the Results table to view the ECOTOX column. Click links to open the species in question within the ECOTOXicology knowledge base, where you can find toxicity data. In addition to full and primary reports, a summary report of the data can be viewed by selecting View Level One Summary Report.
The summary report displays summary metrics and susceptibility predictions across taxonomic groups. Data in any report format can be easily downloaded. With the desired report type selected, click Download Table to save it as a spreadsheet file.
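Once a report has been downloaded, it can also be summarized outside the tool. The following is a minimal sketch, assuming the table has been saved as a CSV file with columns named Taxonomic Group and Susceptibility Prediction. Those column names, and the small table used here, are made up for illustration and should be checked against the actual downloaded file.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_predictions(csv_file,
                          group_col="Taxonomic Group",
                          prediction_col="Susceptibility Prediction"):
    """Count susceptibility predictions per taxonomic group.

    The default column names are hypothetical; adjust them to match
    the header row of the downloaded report.
    """
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        group = row[group_col].strip()
        prediction = row[prediction_col].strip()
        counts[group][prediction] += 1
    return {group: dict(c) for group, c in counts.items()}


# Made-up rows standing in for a downloaded report.
example = io.StringIO(
    "Taxonomic Group,Species,Susceptibility Prediction\n"
    "Mammalia,Species A,Yes\n"
    "Mammalia,Species B,Yes\n"
    "Insecta,Species C,No\n"
)
summary = summarize_predictions(example)
print(summary)
```

To use this with a real file, pass an open file handle, for example open("report.csv", newline=""), in place of the in-memory example.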
To view level two data, scroll to the top of the View SeqAPASS tab, and select Level Two. Level two SeqAPASS data is displayed in reports like those of level one, and is also available at the bottom of the Query Protein Information page. Select the radio button next to the desired report type to view the data.
Level two reports display similar information to that of level one, with added information on the protein domain. SeqAPASS data can be easily visualized to facilitate interpretation. Click the plus sign next to Visualization, followed by the Visualize Data button, to open an interactive BoxPlot in a new tab.
In the new tab, select the BoxPlot button. At the top of the BoxPlot page, control options can help the user customize the graph. Users can add or remove taxonomic groups from the graph, select species to be displayed in a legend, choose how the species' names are displayed, and highlight specific species subsets, such as threatened or endangered species.
Visualizations can be easily exported and saved. Click Download BoxPlot to select a file type and download the figure. Prior to downloading, users can customize the file type and resolution of the image.
Although default report settings are sufficient for most analyses, parameters can be changed using the sub menus at the top of the Query Protein Information menu. To record current report settings, click the Download Current Report Settings button to download a file capturing the current settings applied. This can be done for level one and level two.
Here the example is for level one. To initiate a level three SeqAPASS analysis, navigate to the Level One Query Protein Information page and click the plus sign next to the Level Three Header to populate the Level Three Query Menu that allows for individual amino acid residue comparisons. Before a level three analysis can be conducted, specific amino acid residues must be identified through a review of existing literature.
Click the plus sign next to the Reference Explorer to open the Reference Explorer Tool. This can help you generate a predefined Boolean string to query available literature. Click the Generate Google Scholar Link to generate a literature search string.
This string can be copied and used to search desired literature databases. Alternatively, selecting Search Google Scholar will automatically search Google Scholar using the predefined search string. Once amino acid residues have been selected, users can set up a level three analysis.
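To illustrate what a Boolean literature search string looks like, here is a minimal sketch that joins synonyms with OR and term groups with AND, then builds a Google Scholar search URL. The search terms and helper names are chosen for illustration only, and the string that the Reference Explorer actually generates may be structured differently.

```python
from urllib.parse import urlencode


def build_boolean_query(protein_terms, topic_terms):
    """Join synonyms with OR and the two term groups with AND."""
    def group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return f"{group(protein_terms)} AND {group(topic_terms)}"


def scholar_url(query):
    """Build a Google Scholar search URL for the given query string."""
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


# Illustrative terms only; choose terms suited to the protein of interest.
query = build_boolean_query(
    ["mu opioid receptor", "OPRM1"],
    ["binding site", "site-directed mutagenesis"],
)
print(query)
print(scholar_url(query))
```

The same query string can be pasted into other literature databases that accept Boolean operators.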
Select the template sequence to which the user-selected species will be aligned and compared. Optionally, additional template sequences can be entered in the Additional Comparisons box. Before aligning specific sequences, users must enter a user-defined run name.
This will be used to identify the run. Select the taxonomic group in the Choose Taxonomic Group box. Note that this process is repeated for each taxonomic group that the user would like to evaluate for conservation.
Once all desired species are selected for alignment to the template sequence, select Request Residue Run. Once all species have been aligned, click Refresh Level Two and Three Runs, to populate the Select Level Three Run Name menu with the completed level three jobs. Level three data can be viewed either by individual taxonomic group, or across multiple taxonomic groups.
Click Combine Level Three Data to open the combined level three dialogue box, and analyze multiple taxonomic groups together. Using the Combine Level Three Reports dialogue box, select the level three template to use as the basis for comparison. Next, choose which completed jobs to compare.
Order level three jobs as desired, and click View Level Three Data to produce a level three report page. In the Level Three Template Protein Information page, previously identified amino acid positions from the template sequence are displayed, and can be selected for alignment. Enter positions into the text box separated by commas, or alternatively, select from the residue list.
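Before entering positions, it can help to check the list taken from the literature. The following is a minimal sketch that parses a comma-separated list of residue positions, removes duplicates, and confirms each position falls within the template sequence. The function name, the position values, and the sequence length are all made up for illustration.

```python
def parse_positions(text, sequence_length):
    """Parse a comma-separated list of 1-based residue positions."""
    positions = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas or doubled separators
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not a position: {token!r}")
        pos = int(token)
        if not 1 <= pos <= sequence_length:
            raise ValueError(f"position {pos} outside 1..{sequence_length}")
        if pos not in positions:
            positions.append(pos)
    return sorted(positions)


# Made-up positions and template length, for illustration only.
positions = parse_positions("149, 235,  328, 149,", sequence_length=400)
print(",".join(str(p) for p in positions))
```

The printed string is in the comma-separated form described above.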
Once all positions are entered, select Copy to Residue list to shuttle residues into the selection box. Once all amino acid positions have been selected and checked for accuracy, click Update Report to update aligned sequences with the specified positions. Scroll to the bottom of the page to view a report of the results.
As with previous levels, select the radio button next to the primary or full report to select the report type. Level three reports display similar species and proteins, and include alignment and conservation information for each amino acid residue assessed. To save the report, click Download Table at the bottom of the report, and save it as a spreadsheet file.
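The idea of comparing individual residues between a template and an aligned species sequence can be sketched as follows. This is a simplified illustration, not the tool's own method: it assumes a pairwise alignment with gaps marked by a dash, and it treats only exact identity as a match, whereas the criteria SeqAPASS applies may differ. The sequences are made-up fragments, not real receptor sequences.

```python
def residues_at_positions(aligned_template, aligned_target, positions):
    """Report residues at 1-based template positions in a pairwise alignment.

    Returns {position: (template_residue, target_residue, identical)}.
    Exact identity is a simplification used here for illustration.
    """
    if len(aligned_template) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(positions)
    report = {}
    template_pos = 0
    for t_res, q_res in zip(aligned_template, aligned_target):
        if t_res == "-":
            continue  # gap in the template consumes no template position
        template_pos += 1
        if template_pos in wanted:
            report[template_pos] = (t_res, q_res, t_res == q_res)
    return report


# Made-up aligned fragments; '-' marks an alignment gap.
template = "MDS-AYTR"
target = "MDSQAFT-"
report = residues_at_positions(template, target, [3, 5, 7])
for pos, (t_res, q_res, same) in sorted(report.items()):
    print(pos, t_res, q_res, "match" if same else "no match")
```

Counting positions along the template, rather than along the alignment columns, keeps the numbering consistent with positions reported in the literature for the template sequence.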
Users can also select View Level Three Summary Report to view and download a summary report table. To view a visualization for level three data, click the plus sign next to the visualization header and select Visualize Data to open a new page. Click Heat Map on the visualization information page to open the interactive graphic and controls.
On the visualization page under Controls, select the taxonomic groups to be displayed in the heat map, shuttle them using the arrow button, and reorder if desired. By default, the heat map visualization displays the species' common name, the overall susceptibility predictions, and the match status of each amino acid residue. Options for manipulating heat map settings are in the expandable menus under the Control heading.
Under Report Options, a simple or a full report can be selected, and the displayed species can be toggled between common or scientific names. Under Optional Selections, subsets of species, such as ortholog candidates, threatened species, endangered species, and common model organisms, can be selected and highlighted in the heat map. Under Heat Map Settings, the information displayed on the heat map can be customized, by unchecking and checking boxes.
Susceptibility predictions and susceptibility text can be removed, as can amino acid alignments and amino acid information. To save the heat map visualization, click Download Heatmap and choose the desired file type. SeqAPASS features a decision summary report that allows users to quickly evaluate susceptibility predictions across levels and species simultaneously.
Clicking Push Level to DS Report from the Results or Data Visualization Pages will transfer the data to the DS Report tab, where all levels of analyses can be combined and displayed. The results of the level one analysis for the mu opioid receptor show a susceptibility cutoff of 55%, with percent similarities for mammals, birds, reptiles, amphibians, and most fish species falling above this cutoff and receiving a susceptibility prediction of Yes. Those boxes completely below the cutoff receive a susceptibility prediction of No.
The level two analysis for the mu opioid functional domain identified a higher susceptibility cutoff of 88% similarity, with mammals, birds, reptiles, amphibians, and most fish species above this cutoff, providing another line of evidence for conservation of the mu opioid receptor across vertebrates. Within the level three analysis, all species assessed resulted in a susceptibility prediction of Yes. As these amino acids are important in the binding of both strong mu opioid receptor agonists and strong antagonists, these data suggest that opioid compounds targeting human receptors may interact similarly with receptors across vertebrate species.
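The way a cutoff turns percent similarity into a prediction can be sketched in a few lines. This is a minimal illustration using the 55% level one cutoff mentioned above. The species labels and similarity values are made up, and treating a value exactly at the cutoff as Yes is an assumption, not something stated by the tool.

```python
def predict_susceptibility(percent_similarity, cutoff):
    """Return 'Yes' if percent similarity meets the cutoff, else 'No'.

    Counting a value equal to the cutoff as 'Yes' is an assumption.
    """
    return "Yes" if percent_similarity >= cutoff else "No"


CUTOFF = 55.0  # level one cutoff from the mu opioid receptor example

# Hypothetical percent similarities, for illustration only.
similarities = {"species_a": 92.4, "species_b": 61.0, "species_c": 38.7}

predictions = {
    name: predict_susceptibility(value, CUTOFF)
    for name, value in similarities.items()
}
print(predictions)
```

The same function applies at level two by passing the higher domain-level cutoff.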
Level three of SeqAPASS is useful for hypothesis generation. Even if critical amino acid residues are not known for the protein of interest, the tool can be used to look across species and identify where residues are conserved and where they are less conserved. These hypotheses can then be tested with molecular biology techniques, such as site-directed mutagenesis studies.
The initial focus of SeqAPASS was on chemical targets. However, our research is expanding to explore primary metabolizing enzymes, transport proteins, and proteins involved in bioaccumulation.