!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!                                                                           !!
!!                             N O T A   B E N E                             !!
!!                                                                           !!
!! The dFLASH server is still under development. If some similarities are    !!
!! not detected or if some of the answers do not make sense, it is very      !!
!! likely that this is due to a bug in our code. For example, the reported   !!
!! moderate sensitivity that one observes with globins is the result of an   !!
!! initial poor design choice; the problem will be alleviated in the next    !!
!! release of the server that should be available sometime in September.     !!
!! Clearly, reporting of all such cases will help us to incorporate all the  !!
!! needed fixes.                                                             !!
!!                                                                           !!
!! Until the end of the summer, we consider dFLASH to be in an experimental  !!
!! state. As such, we ask that you please take this into account when        !!
!! making comparisons to other existing similarity retrieval programs.       !!
!!                                                                           !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Dear User,

welcome to the dFLASH server!

The dFLASH server is a "homologous sequence retrieval" program for
protein sequences (see also IMPORTANT NOTES below). dFLASH is a
distributed system which runs on a small cluster of seven (7)
NON-dedicated IBM/RS6000 workstations and has been implemented using
the Concert/C language. The server is now available 24 hours a day,
7 days a week.

Currently, you can submit retrieval requests to the dFLASH server only
if you have registered with us. However, the server will process and
execute requests from non-registered users for help and/or for on-line
reprints of relevant papers (see below). At the moment no sources or
binaries are being distributed.

If you wish to register, you need to send an email message to

        "dflash@watson.ibm.com"

with your full name, affiliation, a daytime phone number, login name,
and the name of the host from which you wish to access the dFLASH
server. After we receive this information, we will process it as soon
as possible.
You can begin using dFLASH as soon as we notify you that your
registration has been accepted.

For the moment, we can process requests originating from email
addresses of the form

        "user@[machine.]institution.type"
or
        "user%machine@[machine.]institution.type"

We plan to further expand the accepted formats, depending on demand.

Once you have registered, you can use the dFLASH facilities by sending
an email message to

        "dflash@watson.ibm.com"

IMPORTANT: the message should contain one of:  dFLASH, dflash, DFLASH
----------
in the "Subject" line. Messages whose subject line does not contain
one of these words will be left unprocessed.


REQUEST FORMAT:
---------------

The typical message-body of an email request looks like:

        PAM 250                        (capitalized/mandatory)
        ALIGNMENTS 50                  (capitalized/optional)
        THRESHOLD 30                   (capitalized/optional)
        BEGIN                          (capitalized/mandatory)
        A_ONE_LINE_TEST_SEQ_LABEL      (mandatory)
        a_sequence_of_amino_acids_terminated_by_the_number_1

in this particular order. The first line names the scoring matrix and
may be either PAM or BLOSUM (see SCORING MATRICES below).

The test sequence should contain at least 30 and not more than 1000
amino acids; it *may* contain CARRIAGE RETURNS and SPACES. Neither the
label nor the test sequence is case sensitive.

The ALIGNMENTS line allows one to restrict the reported alignments to
the given number. If no ALIGNMENTS line is given, up to 10,000 (ten
thousand) result alignments may be returned. If the requested
ALIGNMENTS value is greater than 10,000, it is changed to 10,000.

The THRESHOLD line allows one to restrict the number of reported
sequences (and thus alignments) to only those whose score exceeds the
given THRESHOLD value. Typical values for the score threshold are
between 40 and 60. If no THRESHOLD line is given, or if the requested
THRESHOLD value is smaller than 30, the threshold is set to 30.

Notice: if the THRESHOLD value is too small, you run the risk of
upsetting your mailer program, since chances are that you will receive
a very big file as a reply from the server.
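
The rules above can be sketched in a few lines of Python. This is an
illustration only, not part of the dFLASH distribution: the names
effective_limits, clean_sequence and build_request are hypothetical,
the letters-only check on the sequence is an assumption, and the
supported matrix values are copied from the SCORING MATRICES list in
this help file.

```python
# Hypothetical client-side helper; limits are the ones quoted in this help file.
MAX_ALIGNMENTS = 10_000   # cap applied when ALIGNMENTS is missing or larger
MIN_THRESHOLD = 30        # floor applied when THRESHOLD is missing or smaller
SUPPORTED = {
    "BLOSUM": {30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90, 100},
    "PAM": set(range(10, 501, 10)),   # 10, 20, ..., 500
}


def effective_limits(alignments=None, threshold=None):
    """Return the (alignments, threshold) pair the help text says will apply."""
    if alignments is None or alignments > MAX_ALIGNMENTS:
        alignments = MAX_ALIGNMENTS
    if threshold is None or threshold < MIN_THRESHOLD:
        threshold = MIN_THRESHOLD
    return alignments, threshold


def clean_sequence(raw):
    """Drop spaces and line breaks, then check the 30..1000 residue limit."""
    residues = "".join(raw.split())
    if not residues.isalpha():   # assumption: residues are letters only
        raise ValueError("sequence may contain only letters and whitespace")
    if not 30 <= len(residues) <= 1000:
        raise ValueError(f"{len(residues)} residues; expected 30 to 1000")
    return residues


def build_request(matrix, distance, label, sequence,
                  alignments=None, threshold=None, width=60):
    """Assemble a message body in the order given under REQUEST FORMAT."""
    if distance not in SUPPORTED.get(matrix, ()):   # keywords are capitalized
        raise ValueError(f"unsupported scoring matrix: {matrix} {distance}")
    if "\n" in label or not label.strip():
        raise ValueError("label must be a single non-empty line")
    residues = clean_sequence(sequence)
    lines = [f"{matrix} {distance}"]
    if alignments is not None:
        lines.append(f"ALIGNMENTS {alignments}")
    if threshold is not None:
        lines.append(f"THRESHOLD {threshold}")
    lines += ["BEGIN", label.strip()]
    chunks = [residues[i:i + width] for i in range(0, len(residues), width)]
    chunks[-1] += "1"   # the digit 1 terminates the sequence
    return "\n".join(lines + chunks) + "\n"
```

Calling build_request("PAM", 120, "my label", sequence, alignments=30,
threshold=50) yields a body laid out like the typical request shown
above; leaving out the two optional arguments omits the ALIGNMENTS and
THRESHOLD lines, in which case effective_limits() gives the values the
server is described as falling back to.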
Two example requests follow:

Example 1:

        PAM 120
        ALIGNMENTS 30
        THRESHOLD 50
        BEGIN
        >DBHB_ECOLI dna-binding protein hu-beta
        MNKSQLIDKIAAGADISK AAAGRALDAIIASVTESLKEG
        DDVALVgfgtfavkeraartg rnpqtgkei
        tiaaakvpsfragkalkdavn 1

Note: all amino acids from "MNKS" through "kdavn" will be used in the
search. The alignments for the top 30 scoring sequences will be
returned. No reported sequence will have a score that is less than 50.

Example 2:

        BLOSUM 62
        BEGIN
        Your-Favorite-Label Goes Here
        MHYTKNIPLVMGYQYQVKGYILGVKQNKKLYEKMLDSFYKYFCNITQINSKTLN
        FSNFVSTIVDSFL PKEYSQSISLEKK DSILELLLCDYISNLGTFITTEKMLPFIVKN
        RKENYHKVTKEMQDYSLTFLLK KRMELYNKFLRKQAYVEPETELEETYA RLSSYNRSLLYQ
        IEELTSEKKSFLEELSTLRKKYEKRQSEYRRLVQLLYQQIQRSSSSKTSYPLTKF
        IETLPSEHFSNEE YQKEASADQKVIL
        EQEETELLREQELLASQEVTSKSPNNYPVPQSRTIVNKPSDNYPVPRSR
        STKIDFDNSLQKQELHAKNGFS EKAIVEFNQ
        DKQPMFKEEAIVEFNQDKPEIKEETIVEFNQNKQPMFKEEAILEFNQDKQPEFKETI
        LDNKEILDNKEDILEEENQDEPI
        VQNPFLENFWKPEQKTFNQSGLFEESSDFSNDWSGGDVTLNFS1

Note: all amino acids from "MHYT" through "LNFS" will be used in the
search. The alignments for as many as 10,000 top scoring sequences may
be returned. The score threshold will default to the value 30.


SCORING MATRICES:
-----------------

You can use either PAM or BLOSUM scoring matrices. The currently
supported distances are

for BLOSUM: 30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90,
            100

for PAM:    10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130,
            140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240,
            250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350,
            360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460,
            470, 480, 490, and 500.


TO OBTAIN HELP:
---------------

You can obtain this message at any moment by sending a message with
one of:  {dFLASH, dflash, DFLASH}  in the "Subject" line and a body
containing the phrase "send help".

Notice that you do not have to be registered in order to request the
help file.
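
For readers who script their mail, a help request of the kind just
described could be composed with Python's standard email package. This
is a minimal sketch under stated assumptions: the function name
help_request and the sender address are hypothetical, and actually
handing the message to a mail transport (for example with smtplib) is
left out because it depends on the local mail setup.

```python
from email.message import EmailMessage

SERVER_ADDRESS = "dflash@watson.ibm.com"   # address given in this help file


def help_request(sender):
    """Build a message asking the server for its help file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg["Subject"] = "dFLASH"      # one of dFLASH, dflash, DFLASH is required
    msg.set_content("send help")   # the body carries the request phrase
    return msg


message = help_request("user@machine.institution.edu")  # hypothetical sender
```

The same shape works for a retrieval request: keep the Subject line and
replace the body with a request laid out as under REQUEST FORMAT.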


TO OBTAIN ON-LINE REPRINTS OF PAPERS
------------------------------------

You can obtain reprints (in PostScript) of relevant papers by sending
a message with one of:  {dFLASH, dflash, DFLASH}  in the "Subject"
line and a body containing

  the phrase "send FLASHpaper"
      ---> returns to the originator of the request a copy of the
           FLASH paper

  the phrase "send CONCERTpaper"
      ---> returns to the originator of the request a copy of a
           high-level paper describing the CONCERT/C language

  the phrase "send BAYESpaper"
      ---> returns to the originator of the request a copy of a
           paper describing a computer-vision application based on
           a Bayesian geometric indexing framework

Notice that you do not have to be registered in order to request the
reprints of the above papers. Also, there can only be *one* such
request per message!


IMPORTANT NOTES:
----------------

(1) our database search program currently operates with Swiss-Prot
    Release 25.

(2) at the moment we are putting together the version of the server
    that will allow sequence searches in GenBank. The current
    projection is that the GenBank search server will be available
    before the end of the summer.


Thank you for your interest in the dFLASH server.

Sincerely,

The dFLASH management

###############################################################################

COMMENTS??
----------

We will appreciate receiving your feedback, suggestions, comments, or
bug reports; all of these can be sent to

        "dflash@watson.ibm.com"

Please make sure your "Subject" line contains the word "comments".

###############################################################################

REFERENCES
----------

If you make use of the dFLASH server, please reference

  A. Califano and I. Rigoutsos, "FLASH: A Fast Look-up Algorithm for
  String Homology." In Proceedings of the First International
  Conference on Intelligent Systems for Molecular Biology, July 1993,
  Bethesda, MD.

If you wish to find out more about the dFLASH server, you can contact

        Andrea Califano    (acal@watson.ibm.com)
or
        Isidore Rigoutsos  (rigoutso@watson.ibm.com)

###############################################################################

For more information on the Concert/C language, please refer to

  J. Auerbach, D. Bacon, A. Goldberg, G. Goldszmidt, A. Gopal,
  M. Kennedy, A. Lowry, J. Russell, W. Silverman, R. Strom, D. Yellin,
  and S. Yemini, "High-level language support for programming reliable
  distributed systems." In Proceedings of the International Conference
  on Computer Languages, April 1992, Oakland, California.

or contact

        Josh Auerbach (jsa@watson.ibm.com)

###############################################################################