Frequently asked questions
I would like to check the SepsiTest™ BLAST output with an example, but I don't have a test query at hand.
You can copy and paste the following original EMBL entry into the "FASTA Sequence" box on the SepsiTest™ BLAST main page.
You can also try BLASTing only the beginning of the sequence or an internal fragment. Note, however, that at least 200 bases are required.
It doesn't matter if you include the header of the sequence or not.
>embl|AJ233425|AJ233425 Proteus vulgaris 16S rRNA gene (strain DSM 30118)
gattgaacgctggcggcaggcctaacacatgcaagtcgagcggtaacagaggaaagcttgctttcttgctgacgagcggcggacgggtgagtaatgtatg
gggatctgcccgatagagggggataactactggaaacggtagctaataccgcatgacgtctacggaccaaagcaggggctcttcggaccttgcgctatcg
gatgaacccatatgggattagctagtaggtgaggtaatggctcacctaggcgacgatctctagctggtctgagaggatgatcagccacactgggactgag
acacggcccagactcctacgggaggcagcagtggggaatattgcacaatgggcgcaagcctgatgcagccatgccgcgtgtatgaagaaggccttagggt
tgtaaagtactttcagcggggaggaaggtgttaagattaatactcttagcaattgacgttacccgcagaagaagcaccggctaactccgtgccagcagcc
gcggtaatacggagggtgcaagcgttaatcggaattactgggcgtaaagcgcacgcaggcggtcaattaagtcagatgtgaaagccccgagcttaacttg
ggaattgcatctgaaactggttggctagagtcttgtagaggggggtagaattccacgtgtagcggtgaaatgcgtagagatgtggaggaataccggtggc
gaaggcggccccctggacaaagactgacgctcaggtgcgaaagcgtggggagcaaacaggattagataccctggtagtccacgctgtaaacgatgtcgat
ttggaggttgtgcccttgaggcgtggcttccggagctaacgcgttaaatcgaccgcctggggagtacggccgcaaggttaaaactcaaatgaattgacgg
gggcccgcacaagcggtggagcatgtggtttaattcgatgcaacgcgaagaaccttacctactcttgacatccagagaatcctttagagatagaggagtg
ccttcgggaactctgagacaggtgctgcatggctgtcgtcagctcgtgttgtgaaatgttgggttaagtcccgcaacgagcgcaacccttatcctttgtt
gccagcacgtaatggtgggaactcaaaggagactgccggtgataaaccggaggaaggtggggatgacgtcaagtcatcatggcccttacgagtagggcta
cacacgtgctacaatggcagatacaaagagaagcgacctcgcgagagcaagcggaactcataaagtctgtcgtagtccggattggagtctgcaactcgac
tccatgaagtcggaatcgctagtaatcgtagatcagaatgctacggtgaatacgttcccgggccttgtacacaccgcccgtcacaccatgggagtgggtt
gcaaaagaagtaggtagcttaaccttcgggagggcgcttaccactttgtgattcatgactggggtgaagtcctaacaaggtaaccgtaggggaac
Alternatively, we offer a screenshot of the corresponding result. Because the example sequence is one of the entries in the SepsiTest™ BLAST database, the following page also shows the position of the sequence in a phylogenetic tree calculated for the complete data set.
Which sequence format is required for a SepsiTest™ BLAST analysis?
The system accepts the FASTA and ABI formats, as well as plain DNA sequence information.
Please note that a number of FASTA-like formats also exist. A description of the correct FASTA format is available here.
If the system cannot read the query, you will get a corresponding message. In addition, SepsiTest™ BLAST will tell you if the quality of the sequence is insufficient (<200 bases and/or >5% ambiguities). For sequences in ABI format, the ends are also checked and trimmed if required (removal of Ns and bases with low quality values).
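If you want to pre-check a query locally before submitting it, the two quality criteria above can be approximated with a few lines of Python. This is only a sketch, not the tool's actual implementation: the function names are hypothetical, and it assumes that every character other than A, C, G or T counts as an ambiguity and that a header, if present, sits on its own line starting with ">".

```python
def read_query(text):
    """Return the bare sequence from pasted text; a FASTA header line is optional."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Skip empty lines and any header line beginning with ">".
    sequence_lines = [line for line in lines if line and not line.startswith(">")]
    # Join the lines, drop embedded whitespace and normalise to upper case.
    return "".join("".join(sequence_lines).split()).upper()


def check_query(text, min_length=200, max_ambiguity=0.05):
    """Return a list of quality problems; an empty list means the query passes.

    Assumption: every symbol other than A, C, G, T counts as an ambiguity.
    """
    sequence = read_query(text)
    ambiguous = sum(1 for base in sequence if base not in "ACGT")
    fraction = ambiguous / len(sequence) if sequence else 1.0

    problems = []
    if len(sequence) < min_length:
        problems.append("too short")
    if fraction > max_ambiguity:
        problems.append("too many ambiguities")
    return problems


if __name__ == "__main__":
    query = ">example header\n" + "acgt" * 60
    print(check_query(query))
```

The defaults mirror the thresholds stated above (200 bases, 5% ambiguities). The end trimming applied to ABI files is not covered by this sketch.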
You can paste sequences in FASTA format (including the header) as well as plain DNA sequence information into the "FASTA Sequence" box.
What is special about the underlying rRNA gene database, and how extensive is it?
The reference database contains 7,042 quality-checked, full-length 16S rRNA gene sequences, exclusively from cultivated reference strains of the described species of the domains Bacteria and Archaea (Yarza et al., 2008). It is the freely available database of the "All-Species Living Tree" project (based on release 'LTP_s95' of October 2008, including a few documented modifications, see below). Detailed information and downloads are available on the official LTP project webpage.
The database was supplemented with 342 selected 18S rRNA gene sequences (Eukarya) of the cultivated members of the genera Candida, Aspergillus and Cryptococcus within the Fungi. If you are interested in additional eukaryotic species, please let us know (info@molzym.com). The database will be constantly updated and extended.
All sequences correspond to the original GenBank/EMBL entries.
For a quick overview and to search for particular organisms, we offer an alphabetically sorted list of all SepsiTest™ BLAST entries as an Excel file (open/download).
Compared to NCBI BLAST, SepsiTest™ BLAST does not contain any "dead weight" if your focus is on rRNA gene diagnostics. The presence of only high-quality and correctly classified rRNA gene sequences, in combination with an additional layer of interpretation, ensures that only highly relevant results are provided.
Which technology does SepsiTest™ BLAST use to identify similar sequences within the reference database?
SepsiTest™ BLAST is based on the common BLAST technology, which identifies regions of high local similarity within a given set of sequences (Altschul et al., 1990). In this way, sequences similar to a query sequence can be identified within a (large) reference database.
To avoid "false-positive" results, a minimum absolute length of the local alignment and a minimum coverage of the full query sequence by the alignment were defined. In combination with the high-quality reference database, only reasonable results are displayed.
I have uploaded a known sequence and do not expect any highly similar sequences within the reference database. But I would at least expect a number of less similar sequences in the result list.
SepsiTest™ BLAST only displays hits if the reference database contains sequences with an identity of at least 97% to the query sequence over the BLAST-aligned region. This approach is intended to make the relevant results easier to find, because the focus of SepsiTest™ BLAST is on the identification of described microorganisms.
If you do not get a positive result, this means that no similar, completely described microorganism is currently documented in the scientific literature (as of October 2008). For further investigation we recommend using NCBI BLAST in these cases. If you need professional support for the detailed classification of your sequences, feel free to contact us.
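As a rough illustration of the filtering described above (at least 97% identity, plus a minimum alignment length and a minimum coverage of the query), here is a minimal Python sketch. Only the 97% threshold comes from this FAQ. The `Hit` class, the function name and the default values for `min_align_length` and `min_coverage` are hypothetical placeholders, since the actual cut-offs used by SepsiTest™ BLAST are not stated here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    organism: str
    identity: float     # percent identity over the BLAST-aligned region
    align_length: int   # number of aligned positions


def reportable_hits(hits, query_length,
                    min_identity=97.0,      # threshold stated in this FAQ
                    min_align_length=200,   # placeholder value (assumption)
                    min_coverage=0.8):      # placeholder value (assumption)
    """Keep only hits that pass all three criteria; input order is preserved."""
    kept = []
    for hit in hits:
        coverage = hit.align_length / query_length
        if (hit.identity >= min_identity
                and hit.align_length >= min_align_length
                and coverage >= min_coverage):
            kept.append(hit)
    return kept


if __name__ == "__main__":
    hits = [
        Hit("Organism A", 99.2, 450),
        Hit("Organism B", 96.9, 450),   # fails the identity threshold
        Hit("Organism C", 99.0, 120),   # alignment too short
    ]
    for hit in reportable_hits(hits, query_length=460):
        print(hit.organism, hit.identity)
```

With filters like these, an empty result list is expected whenever no reference sequence reaches the identity threshold, which is the situation described in this answer.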
For a quick overview and to search for particular organisms, we offer an alphabetically sorted list of all SepsiTest™ BLAST entries as an Excel file (open/download).
What do I have to consider concerning the identities shown in the SepsiTest™ BLAST results?
1. Sequencing errors or unresolved bases within the query sequence will unavoidably and artificially reduce the reported identity. The real identity may be higher than the one displayed.
2. For partial query sequences, the identities displayed only apply to the part of the rRNA gene that is covered by the fragment. The values shown do not necessarily equal the values obtained from the corresponding full-length sequences. Variability differs strongly along the rRNA gene, with conserved and less conserved regions alternating. Thus, depending on the rRNA region sequenced, the identities can differ significantly even for closely related organisms.
The sorting of the results does not match my expectations. Is the tool working inaccurately?
Not at all! Within the limits of what is possible, SepsiTest™ BLAST provides very exact results.
An often completely underestimated aspect is the quality of the sequences deposited in public databases, especially the quality of their annotation (their description/classification). Here, single errors can lead to a severe snowball effect: new sequences are annotated based on previously misclassified sequences, and so on. In the worst case, after many years of using NCBI BLAST, expectations are built that are no longer supported by a high-quality phylogenetic analysis. Only a small part of the publicly available rRNA gene sequences is validly described from a taxonomic point of view. In comparison, SepsiTest™ BLAST only contains validly described type strains. Because these highly relevant results are often "hidden" in the long list of NCBI BLAST results, SepsiTest™ BLAST can produce results that seem unexpected at first glance.
Apart from the laborious polyphasic approach, there are also some very general limitations that cause problems in the identification of microorganisms. The resolution of each approach is limited, especially if identification is based on a single criterion (e.g. a selected genetic marker). However, this aspect is often not considered. Always keep in mind that the results contain unavoidable noise, originating from basic biological aspects (operon diversity etc.) through to the technical implementation itself, and caused by the complexity of the question (think of the open species concept in microbiology).
I still have open questions related to SepsiTest™ BLAST ...
No problem, send an email to info@molzym.com. We will help you!