FLASH e-mail server

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!                                                                           !!
!!                             N O T A     B E N E                           !!
!!                                                                           !!
!! The dFLASH server is still under development.  If some similarities are   !!
!! not detected or if some of the answers do not make sense, it is very      !!
!! likely that this is due to a bug in our code.  For example, the reported  !!
!! moderate sensitivity that one observes with globins is the result of an   !!
!! initial poor design choice; the problem will be alleviated in the next    !!
!! release of the server that should be available sometime in September.     !!
!! Clearly,  reporting of all such cases will help us to incorporate all the !!
!! needed fixes.                                                             !!
!!                                                                           !!
!! Until the end of the summer, we consider dFLASH to be in an experimental  !!
!! state.  As such, we ask that you please take this into account when       !!
!! making comparisons to other existing similarity retrieval programs.       !!
!!                                                                           !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


    Dear User, welcome to the dFLASH server!


    The dFLASH server is a "homologous sequence retrieval" program for
protein sequences (see also IMPORTANT NOTES below).  dFLASH is a distributed
system that runs on a small cluster of seven (7) NON-dedicated IBM RS/6000
workstations and has been implemented using the Concert/C language.
The server is now available 24 hours a day, 7 days a week.


    Currently, you can submit retrieval requests to the dFLASH server only if
you have registered with us.  However, the server will process and execute
requests from non-registered users for help and/or for on-line reprints of
relevant papers (see below).  At the moment no sources or binaries are being
distributed.


    If you wish to register, you need to send an email message to
			"dflash@watson.ibm.com"
with your full name, affiliation, a daytime phone number, login name, and the
name of the host from which you wish to access the dFLASH server.


    After we receive this information, we will process it as soon as possible.
You can begin using dFLASH as soon as we notify you that your registration has
been accepted.  For the moment, we can process requests originating from email
addresses of the form
		"user@[machine.]institution.type"
			or
		"user%machine@[machine.]institution.type"
We plan to further expand the accepted formats, depending on demand.
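
    As an illustration only, the two address forms above can be checked with a
short Python sketch.  This is our reading of the notation, not the server's
actual code: we assume that the brackets mark an optional "machine." part and
that each component consists of letters, digits, and hyphens.  The names
ADDRESS_RE and looks_accepted are hypothetical.

```python
import re

# Assumption: each host component is made of letters, digits, and hyphens.
LABEL = r"[A-Za-z0-9-]+"

# Assumed reading of the two documented forms:
#   user@[machine.]institution.type
#   user%machine@[machine.]institution.type
ADDRESS_RE = re.compile(
    rf"^[^\s@%]+"            # user
    rf"(?:%{LABEL})?"        # optional "%machine" part
    rf"@(?:{LABEL}\.)?"      # optional "machine." prefix
    rf"{LABEL}\.{LABEL}$"    # institution.type
)


def looks_accepted(address: str) -> bool:
    """Return True if the address fits one of the two documented forms."""
    return ADDRESS_RE.match(address) is not None
```

Under these assumptions an address with more than three host components,
such as "user@a.b.c.d", would not fit either form.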

    Once you have registered, you can use the dFLASH facilities by sending an
email message to
			"dflash@watson.ibm.com"

IMPORTANT:      the message should contain one of: dFLASH, dflash, DFLASH
----------      in the "Subject" line.  Messages whose subject line does not
		contain one of these words will be left unprocessed.
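
    A minimal Python sketch of this rule, for users who generate their
requests from a script.  It assumes that a plain substring match on the three
listed spellings is what the server applies; the names ACCEPTED_WORDS and
subject_is_accepted are hypothetical.

```python
# The three spellings listed above; other capitalizations are assumed to
# be rejected because the help text does not mention them.
ACCEPTED_WORDS = ("dFLASH", "dflash", "DFLASH")


def subject_is_accepted(subject: str) -> bool:
    """Return True if the Subject line contains one of the three spellings."""
    return any(word in subject for word in ACCEPTED_WORDS)
```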


REQUEST FORMAT:
---------------
The typical message-body of an email request looks like:

     PAM   250  					(capitalized/mandatory)
     ALIGNMENTS 50  					(capitalized/optional)
     THRESHOLD  30  					(capitalized/optional)
     BEGIN 						(capitalized/mandatory)
     A_ONE_LINE_TEST_SEQ_LABEL               		(mandatory)
     a_sequence_of_amino_acids_terminated_by_the_number_1

in this particular order.  The test sequence should contain at least 30 and not
more than 1000 amino acids.  It *may*, however, contain CARRIAGE RETURNS and
SPACES.  Neither the label nor the test sequence itself is case sensitive.
The ALIGNMENTS line allows one to restrict the reported alignments to the given
number.  If no ALIGNMENTS line is given, up to 10,000 (ten thousand) result
alignments may be returned.  If the requested ALIGNMENTS value is greater than
10,000, it is changed to 10,000.
The THRESHOLD line allows one to restrict the reported sequences (and thus
alignments) to only those whose Score exceeds the given THRESHOLD value.
Typical values for the score threshold are between 40 and 60.  If no THRESHOLD
line is given, or if the requested THRESHOLD value is smaller than 30, the
value 30 is used.  Notice:  if the THRESHOLD value is too small, you run the
risk of upsetting your mailer program, since chances are that you will receive
a very big file as a reply from the server.
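
    The defaults and limits described above can be summarized in a short
Python sketch.  It only restates the rules of this section and is not the
server's actual code; all function and constant names are hypothetical.  The
length check assumes the sequence is passed without its terminating "1".

```python
from typing import Optional

MAX_ALIGNMENTS = 10_000             # cap and default for ALIGNMENTS
MIN_THRESHOLD = 30                  # floor and default for THRESHOLD
MIN_LENGTH, MAX_LENGTH = 30, 1000   # allowed number of amino acids


def effective_alignments(requested: Optional[int] = None) -> int:
    """Value in effect for a given (or missing) ALIGNMENTS line."""
    if requested is None:
        return MAX_ALIGNMENTS
    return min(requested, MAX_ALIGNMENTS)


def effective_threshold(requested: Optional[int] = None) -> int:
    """Value in effect for a given (or missing) THRESHOLD line."""
    if requested is None:
        return MIN_THRESHOLD
    return max(requested, MIN_THRESHOLD)


def sequence_length_ok(sequence: str) -> bool:
    """Check the length limits, ignoring spaces and line breaks."""
    residues = "".join(sequence.split())
    return MIN_LENGTH <= len(residues) <= MAX_LENGTH
```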

Two example requests follow:

Example 1:
		PAM   120
		ALIGNMENTS 30
		THRESHOLD  50
		BEGIN
		>DBHB_ECOLI dna-binding protein hu-beta
		MNKSQLIDKIAAGADISK
		AAAGRALDAIIASVTESLKEG  DDVALVgfgtfavkeraartg

		rnpqtgkei
		tiaaakvpsfragkalkdavn  1

     Note:  all amino acids  from "MNKS" through "kdavn"  will be used
	    in the search.  The alignments for the top 30 scoring sequences
	    will be returned.  No reported sequence will have a score that is
	    less than 50.

Example 2:
		BLOSUM 62
		BEGIN
		Your-Favorite-Label Goes Here
	        MHYTKNIPLVMGYQYQVKGYILGVKQNKKLYEKMLDSFYKYFCNITQINSKTLN
		FSNFVSTIVDSFL PKEYSQSISLEKK DSILELLLCDYISNLGTFITTEKMLPFIVKN
		RKENYHKVTKEMQDYSLTFLLK KRMELYNKFLRKQAYVEPETELEETYA RLSSYNRSLLYQ
		IEELTSEKKSFLEELSTLRKKYEKRQSEYRRLVQLLYQQIQRSSSSKTSYPLTKF
		IETLPSEHFSNEE

		YQKEASADQKVIL
		EQEETELLREQELLASQEVTSKSPNNYPVPQSRTIVNKPSDNYPVPRSR
		STKIDFDNSLQKQELHAKNGFS
		EKAIVEFNQ

		DKQPMFKEEAIVEFNQDKPEIKEETIVEFNQNKQPMFKEEAILEFNQDKQPEFKETI
		LDNKEILDNKEDILEEENQDEPI
		VQNPFLENFWKPEQKTFNQSGLFEESSDFSNDWSGGDVTLNFS1

     Note:  all amino acids  from "MHYT" through "LNFS"  will be used
	    in the search.  The alignments for as many as 10,000 top scoring
	    sequences may be returned.  The score threshold  will default to
	    the value 30.
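
     The way the two examples are read can be sketched in Python as follows.
This is an illustration under stated assumptions, not the server's parser: it
assumes that the first non-empty line after BEGIN is the label, that all
white space in the remaining lines is dropped, and that the sequence ends at
the first "1".  Converting to upper case is our own choice, reflecting that
the sequence is not case sensitive.  The name extract_query is hypothetical.

```python
from typing import Tuple


def extract_query(body: str) -> Tuple[str, str]:
    """Return (label, sequence) from the body of a retrieval request."""
    lines = body.splitlines()
    # Find the BEGIN keyword line.
    begin = next(
        (i for i, line in enumerate(lines) if line.strip() == "BEGIN"), None
    )
    if begin is None:
        raise ValueError("no BEGIN line found")
    # Keep only the non-empty lines that follow BEGIN.
    rest = [line for line in lines[begin + 1:] if line.strip()]
    if not rest:
        raise ValueError("no label line after BEGIN")
    label = rest[0].strip()
    # Drop spaces and line breaks, then cut at the terminating "1".
    packed = "".join("".join(rest[1:]).split())
    sequence, terminator, _ = packed.partition("1")
    if not terminator:
        raise ValueError("sequence is not terminated by the number 1")
    return label, sequence.upper()
```

Applied to the body of Example 1, this sketch returns the label line and a
sequence that runs from "MNKS" through "KDAVN".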



SCORING MATRICES:
-----------------
You can use either PAM or BLOSUM scoring matrices.  The currently supported
distances are

for BLOSUM:  30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 100

for PAM:     10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150,
	     160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280,
	     290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410,
	     420, 430, 440, 450, 460, 470, 480, 490, and 500.
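
The two lists above can be written down compactly, for example to check a
request before mailing it.  The following Python sketch is an illustration
only; the names SUPPORTED and matrix_is_supported are hypothetical, and the
keyword is assumed to be capitalized as in the request format.

```python
# Supported values, copied from the two lists above.
SUPPORTED = {
    "BLOSUM": {30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 100},
    "PAM": set(range(10, 501, 10)),   # 10, 20, ..., 500
}


def matrix_is_supported(family: str, distance: int) -> bool:
    """Return True if e.g. ("PAM", 250) appears in the lists above."""
    return distance in SUPPORTED.get(family, set())
```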




TO OBTAIN HELP:
---------------
    You can obtain this message at any time by sending a message with one of
{dFLASH, dflash, DFLASH} in the "Subject" line and a body containing the
phrase "send help".  Note that you do not have to be registered in order to
request the help file.


TO OBTAIN ON-LINE REPRINTS OF PAPERS
------------------------------------
    You can obtain reprints (in PostScript) of relevant papers by sending a
message with one of: {dFLASH, dflash, DFLASH} in the "Subject" line and a body
containing

the phrase 	"send FLASHpaper"  	---> returns to the originator of the
					request a copy of the FLASH paper

the phrase 	"send CONCERTpaper"  	---> returns to the originator of the
					request a copy of a high-level paper
					describing the CONCERT/C language

the phrase 	"send BAYESpaper"  	---> returns to the originator of the
					request a copy of a paper describing
					a computer-vision application based on
					a Bayesian geometric indexing framework

Notice that you do not have to be registered in order to request the reprints
of the above papers.  Also, there can only be *one* such request per message!
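
Because each message may carry only one reprint request, a script that wants
several papers has to produce several message bodies.  A minimal Python
sketch, with the hypothetical names REPRINTS and reprint_request_bodies:

```python
from typing import Iterable, List

# The three reprint names listed above.
REPRINTS = ("FLASHpaper", "CONCERTpaper", "BAYESpaper")


def reprint_request_bodies(wanted: Iterable[str]) -> List[str]:
    """Return one message body per requested reprint."""
    bodies = []
    for name in wanted:
        if name not in REPRINTS:
            raise ValueError(f"unknown reprint: {name}")
        bodies.append(f"send {name}")
    return bodies
```

Each returned body would then be mailed separately, with one of the accepted
words in the "Subject" line.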



IMPORTANT NOTES:
----------------
(1) Our database search program currently operates with Swiss-Prot Release 25.
(2) We are currently putting together the version of the server that will
allow sequence searches in GenBank.  We expect the GenBank search server to
be available before the end of the summer.

Thank you for your interest in the dFLASH server.

					Sincerely,

					The dFLASH management


###############################################################################

COMMENTS??
----------
We would appreciate receiving your feedback, suggestions, comments, or bug
reports; all of these can be sent to "dflash@watson.ibm.com".  Please make sure
your "Subject" line contains the word "comments".

###############################################################################

REFERENCES
----------

If you make use of the dFLASH server, please reference

     A. Califano and I. Rigoutsos, "FLASH: A Fast Look-up Algorithm for String
     Homology."  In Proceedings of the First International Conference on
     Intelligent Systems for Molecular Biology, July 1993, Bethesda, MD.

If you wish to find out more about the dFLASH server, you can contact Andrea
Califano (acal@watson.ibm.com) or Isidore Rigoutsos (rigoutso@watson.ibm.com).

###############################################################################


For more information on the Concert/C language, please refer to

     J. Auerbach, D. Bacon, A. Goldberg, G. Goldszmidt, A. Gopal, M. Kennedy,
     A. Lowry, J. Russell, W. Silverman, R. Strom, D. Yellin, and S. Yemini,
     "High-level language support for programming reliable distributed
     systems."  In Proceedings of the International Conference on Computer
     Languages, April 1992, Oakland, California.

or contact Josh Auerbach (jsa@watson.ibm.com).

###############################################################################
