!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! !!
!! N O T A B E N E !!
!! !!
!! The dFLASH server is still under development. If some similarities are !!
!! not detected or if some of the answers do not make sense, it is very !!
!! likely that this is due to a bug in our code. For example, the reported !!
!! moderate sensitivity that one observes with globins is the result of an !!
!! initial poor design choice; the problem will be alleviated in the next !!
!! release of the server that should be available sometime in September. !!
!! Clearly, reporting of all such cases will help us to incorporate all the !!
!! needed fixes. !!
!! !!
!! Until the end of the summer, we consider dFLASH to be in an experimental !!
!! state. As such, we ask that you please take this into account when !!
!! making comparisons to other existing similarity retrieval programs. !!
!! !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Dear User, welcome to the dFLASH server!
The dFLASH server is a "homologous sequence retrieval" program for
protein sequences (see also NOTES below). dFLASH is a distributed system
which runs on a small cluster of seven (7) NON-dedicated IBM/RS6000
workstations and has been implemented using the Concert/C language.
The server is now available 24 hours a day, 7 days a week.
Currently, you can submit retrieval requests to the dFLASH server only if
you have registered with us. However, the server will process and execute
requests from non-registered users for help and/or for on-line reprints of
relevant papers (see below). At the moment no sources or binaries are being
distributed.
If you wish to register, you need to send an email message to
"dflash@watson.ibm.com"
with your full name, affiliation, a daytime phone number, login name, and the
name of the host from which you wish to access the dFLASH server.
After we receive this information, we will process it as soon as possible.
You can begin using dFLASH as soon as we notify you that your registration has
been accepted. For the moment, we can process requests originating from email
addresses of the form
"user@[machine.]institution.type"
or
"user%machine@[machine.]institution.type"
We plan to further expand the accepted formats, depending on demand.
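The two address forms above can be checked before you register. The following
is only a sketch of one possible reading of those forms: the function name
looks_accepted, the characters allowed in each label, and the limit of two or
three domain labels are our assumptions, not the rules the server applies.

```python
import re

# One dot-separated label: letters, digits, hyphens (an assumption).
_LABEL = r"[A-Za-z0-9-]+"

# "user" or "user%machine", then "@", then "[machine.]institution.type".
_ADDRESS = re.compile(
    rf"^[A-Za-z0-9._-]+(?:%{_LABEL})?@(?:{_LABEL}\.)?{_LABEL}\.{_LABEL}$"
)


def looks_accepted(address: str) -> bool:
    """Return True if the address fits one of the two documented forms."""
    return _ADDRESS.fullmatch(address) is not None
```

Under this reading, an address with a bare host name or with more than three
domain labels would be rejected.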
Once you have registered, you can use the dFLASH facilities by sending an
email message to
"dflash@watson.ibm.com"
IMPORTANT: the message must contain one of dFLASH, dflash, or DFLASH in the
"Subject" line. Messages whose subject line does not contain one of these
words will be left unprocessed.
REQUEST FORMAT:
---------------
The typical message-body of an email request looks like:
PAM 250 (capitalized/mandatory)
ALIGNMENTS 50 (capitalized/optional)
THRESHOLD 30 (capitalized/optional)
BEGIN (capitalized/mandatory)
A_ONE_LINE_TEST_SEQ_LABEL (mandatory)
a_sequence_of_amino_acids_terminated_by_the_number_1
in this particular order. The test sequence must contain at least 30 and no
more than 1000 amino acids; it *may* contain CARRIAGE RETURNS and SPACES.
Neither the label nor the test sequence is case sensitive.
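The request layout described above can be assembled mechanically. The sketch
below is only an illustration: the function name build_request, its
parameters, and the choice of 60 residues per line are our own assumptions.
The keyword order, the one-line label, the 30 to 1000 residue limit, and the
terminating "1" come from the description above.

```python
def build_request(matrix, distance, label, sequence,
                  alignments=None, threshold=None):
    """Return a request body in the documented line order."""
    # Spaces and line breaks are allowed in the sequence, so drop them
    # before counting residues.
    residues = "".join(sequence.split())
    if not residues.isalpha():
        raise ValueError("sequence must contain only letters")
    if not 30 <= len(residues) <= 1000:
        raise ValueError("sequence must have 30 to 1000 amino acids")
    if "\n" in label or "\r" in label:
        raise ValueError("label must fit on one line")

    lines = [f"{matrix} {distance}"]                 # mandatory
    if alignments is not None:
        lines.append(f"ALIGNMENTS {alignments}")     # optional
    if threshold is not None:
        lines.append(f"THRESHOLD {threshold}")       # optional
    lines.append("BEGIN")                            # mandatory
    lines.append(label)                              # mandatory

    # Wrap at 60 residues per line (an arbitrary choice) and terminate
    # the sequence with the number 1.
    chunks = [residues[i:i + 60] for i in range(0, len(residues), 60)]
    chunks[-1] += "1"
    lines.extend(chunks)
    return "\n".join(lines) + "\n"
```

The returned string is meant to be used as the body of the email message; the
"Subject" line still has to be set separately.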
The ALIGNMENTS line allows one to restrict the reported alignments to the given
number. If no ALIGNMENTS line is given, up to 10,000 (ten thousand) result
alignments may be returned. If the requested ALIGNMENTS value is greater than
10,000 it is changed to 10,000.
The THRESHOLD line allows one to restrict the reported sequences (and thus
alignments) to only those whose Score exceeds the given THRESHOLD value.
Typical values for the score threshold are between 40 and 60. If no THRESHOLD
line is given, or if the requested THRESHOLD value is smaller than 30, the
threshold is set to 30. Notice: if the THRESHOLD value is too small, you run
the risk of upsetting your mailer program, since you will most likely receive
a very large file as a reply from the server.
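The defaulting rules for ALIGNMENTS and THRESHOLD amount to a simple clamp.
The sketch below restates them; the function and constant names are
hypothetical, and it is not the server's code.

```python
MAX_ALIGNMENTS = 10_000   # upper limit stated above
MIN_THRESHOLD = 30        # lower limit stated above


def effective_limits(alignments=None, threshold=None):
    """Return the (alignments, threshold) pair the rules above imply."""
    if alignments is None:
        eff_alignments = MAX_ALIGNMENTS
    else:
        eff_alignments = min(alignments, MAX_ALIGNMENTS)
    if threshold is None:
        eff_threshold = MIN_THRESHOLD
    else:
        eff_threshold = max(threshold, MIN_THRESHOLD)
    return eff_alignments, eff_threshold
```

For instance, a request with ALIGNMENTS 50000 and THRESHOLD 10 would be
treated as ALIGNMENTS 10000 and THRESHOLD 30.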
Two example requests follow:
Example 1:
PAM 120
ALIGNMENTS 30
THRESHOLD 50
BEGIN
>DBHB_ECOLI dna-binding protein hu-beta
MNKSQLIDKIAAGADISK
AAAGRALDAIIASVTESLKEG DDVALVgfgtfavkeraartg
rnpqtgkei
tiaaakvpsfragkalkdavn 1
Note: all amino acids from "MNKS" through "kdavn" will be used
in the search. The alignments for the top 30 scoring sequences
will be returned. No reported sequence will have a score that
is less than 50.
Example 2:
BLOSUM 62
BEGIN
Your-Favorite-Label Goes Here
MHYTKNIPLVMGYQYQVKGYILGVKQNKKLYEKMLDSFYKYFCNITQINSKTLN
FSNFVSTIVDSFL PKEYSQSISLEKK DSILELLLCDYISNLGTFITTEKMLPFIVKN
RKENYHKVTKEMQDYSLTFLLK KRMELYNKFLRKQAYVEPETELEETYA RLSSYNRSLLYQ
IEELTSEKKSFLEELSTLRKKYEKRQSEYRRLVQLLYQQIQRSSSSKTSYPLTKF
IETLPSEHFSNEE
YQKEASADQKVIL
EQEETELLREQELLASQEVTSKSPNNYPVPQSRTIVNKPSDNYPVPRSR
STKIDFDNSLQKQELHAKNGFS
EKAIVEFNQ
DKQPMFKEEAIVEFNQDKPEIKEETIVEFNQNKQPMFKEEAILEFNQDKQPEFKETI
LDNKEILDNKEDILEEENQDEPI
VQNPFLENFWKPEQKTFNQSGLFEESSDFSNDWSGGDVTLNFS1
Note: all amino acids from "MHYT" through "LNFS" will be used
in the search. The alignments for as many as 10,000 top scoring
sequences may be returned. The score threshold will default to
the value 30.
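To see which residues a request would submit, you can apply the same reading
to your own message body before sending it. This is a sketch under our own
assumptions (the function name extract_query and the upper-casing of the
result are illustrative); it is not how the server parses requests.

```python
def extract_query(body: str):
    """Return (label, residues) from a request body.

    The label is the line after BEGIN. The residues are everything after
    the label up to the first "1", with spaces and line breaks removed.
    """
    lines = body.splitlines()
    start = next(
        (i for i, line in enumerate(lines) if line.strip() == "BEGIN"), None
    )
    if start is None or start + 1 >= len(lines):
        raise ValueError("request needs a BEGIN line followed by a label")
    label = lines[start + 1].strip()

    raw = "".join(lines[start + 2:])
    end = raw.find("1")
    if end == -1:
        raise ValueError("sequence must be terminated by the number 1")
    # Case does not matter, so normalise to upper case for display.
    residues = "".join(raw[:end].split()).upper()
    return label, residues
```

Applied to Example 1, this yields the label line unchanged and a single
upper-case string running from "MNKS" through "KDAVN".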
SCORING MATRICES:
-----------------
You can use both PAM and BLOSUM scoring matrices. The currently supported
distances are
for BLOSUM: 30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 100
for PAM: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150,
160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280,
290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410,
420, 430, 440, 450, 460, 470, 480, 490, and 500.
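The supported distances listed above can be checked locally before a request
is sent. In this sketch the name is_supported is hypothetical, and requiring
the matrix keyword in capitals is an assumption carried over from the request
format.

```python
# Distances copied from the lists above.
BLOSUM_DISTANCES = {30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 100}
PAM_DISTANCES = set(range(10, 501, 10))   # 10, 20, ..., 500


def is_supported(matrix: str, distance: int) -> bool:
    """Return True if the matrix/distance pair appears in the lists above."""
    if matrix == "PAM":
        return distance in PAM_DISTANCES
    if matrix == "BLOSUM":
        return distance in BLOSUM_DISTANCES
    return False
```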
TO OBTAIN HELP:
---------------
You can obtain this message at any time by sending a message with one of:
{dFLASH, dflash, DFLASH} in the "Subject" line and a body containing the
phrase "send help". Notice that you do not have to be registered in order to
request the help file.
TO OBTAIN ON-LINE REPRINTS OF PAPERS
------------------------------------
You can obtain reprints (in PostScript) of relevant papers by sending a
message with one of: {dFLASH, dflash, DFLASH} in the "Subject" line and a body
containing
   the phrase "send FLASHpaper"    ---> returns to the originator of the
                                        request a copy of the FLASH paper

   the phrase "send CONCERTpaper"  ---> returns to the originator of the
                                        request a copy of a high-level paper
                                        describing the CONCERT/C language

   the phrase "send BAYESpaper"    ---> returns to the originator of the
                                        request a copy of a paper describing
                                        a computer-vision application based on
                                        a Bayesian geometric indexing framework
Notice that you do not have to be registered in order to request the reprints
of the above papers. Also, there can only be *one* such request per message!
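A help or reprint request is just a short message with the right "Subject"
line and a single phrase in the body. The sketch below composes such a message
with Python's standard email package; the function name build_info_request
and the short keys "help", "flash", "concert", and "bayes" are hypothetical
conveniences, while the address, the subject word, and the phrases are the
ones given above. Sending the message is left to your own mail setup.

```python
from email.message import EmailMessage

SERVER_ADDRESS = "dflash@watson.ibm.com"

# Hypothetical short keys mapped to the documented body phrases.
INFO_PHRASES = {
    "help": "send help",
    "flash": "send FLASHpaper",
    "concert": "send CONCERTpaper",
    "bayes": "send BAYESpaper",
}


def build_info_request(sender: str, kind: str) -> EmailMessage:
    """Compose one help or reprint request (only one per message)."""
    if kind not in INFO_PHRASES:
        raise ValueError(f"unknown request kind: {kind!r}")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg["Subject"] = "dFLASH"          # must contain dFLASH, dflash or DFLASH
    msg.set_content(INFO_PHRASES[kind])
    return msg
```

Because only one such request is allowed per message, asking for two papers
means composing two messages.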
IMPORTANT NOTES:
----------------
(1) our database search program currently operates with Swiss-Prot Release 25.
(2) at the moment we are putting together the version of the server that will
allow sequence searches in GenBank. The current projection is that the GenBank
search server will be available before the end of the summer.
Thank you for your interest in the dFLASH server.
Sincerely,
The dFLASH management
###############################################################################
COMMENTS??
----------
We would appreciate receiving your feedback, suggestions, comments, or bug
reports; all of these can be sent to "dflash@watson.ibm.com". Please make sure
your "Subject" line contains the word "comments".
###############################################################################
REFERENCES
----------
If you make use of the dFLASH server, please reference
A. Califano and I. Rigoutsos, "FLASH: A Fast Look-up Algorithm for String
Homology." In Proceedings of the First International Conference on
Intelligent Systems for Molecular Biology, July 1993, Bethesda, MD.
If you wish to find out more about the dFLASH server, you can contact Andrea
Califano (acal@watson.ibm.com) or Isidore Rigoutsos (rigoutso@watson.ibm.com).
###############################################################################
For more information on the Concert/C language, please refer to
J. Auerbach, D. Bacon, A. Goldberg, G. Goldszmidt, A. Gopal, M. Kennedy,
A. Lowry, J. Russell, W. Silverman, R. Strom, D. Yellin, and S. Yemini,
"High-level language support for programming reliable distributed
systems." In Proceedings of the International Conference on Computer
Languages, April 1992, Oakland, California.
or contact Josh Auerbach (jsa@watson.ibm.com).
###############################################################################