Changes between Version 2 and Version 3 of AnalyzingCEGSRaceProducts

Author:
titus (IP: 24.205.71.196)
Timestamp:
05/14/08 14:19:55 (10 years ago)
Comment:

--

  • AnalyzingCEGSRaceProducts

    v2 v3  
    1717 
    1818Note that for this kind of search, a '''good match''' should have an E-value of 1e-30 or lower (1e-40, 1e-50, etc.) -- you're searching with an ''actual zebrafish sequence'', not a cross-species query where you would expect lower-scoring hits to be relevant. 
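
As a rough illustration of the 1e-30 cutoff above, here is a minimal sketch that filters BLAST hits by E-value. It assumes the results were saved in the standard 12-column tab-separated format (subject ID in column 2, E-value in column 11); the function name `good_hits` and the sample rows are invented for the example and are not part of the original page.

```python
def good_hits(tabular_text, max_evalue=1e-30):
    """Return (subject_id, evalue) pairs for hits at or below max_evalue."""
    hits = []
    for line in tabular_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' comment lines some BLAST versions emit.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumed layout: 12 tab-separated columns, with the subject ID
        # in column 2 and the E-value in column 11.
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.append((subject_id, evalue))
    return hits


# Invented sample rows, for illustration only.
sample = (
    "trap1\tgeneA\t99.0\t300\t3\t0\t1\t300\t10\t309\t2e-45\t180\n"
    "trap1\tgeneB\t80.0\t60\t12\t0\t1\t60\t5\t64\t1e-05\t40\n"
)
print(good_hits(sample))
```

Only the first sample row passes the cutoff; the second is the kind of weak hit that might matter in a cross-species search but not when the query is a real zebrafish sequence.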
     19 
     20 
     21Once you have the whole-gene sequence, do further searches with ''that'' sequence, which won't include cloning sites or other trap sequence but '''will''' include all of the interesting protein domains etc. 
    1922 
    2023'''If blastn against known zebrafish messages doesn't work''' then (lucky you!) you may have found a new gene! Or (unlucky you) you may have a crappy sequence.  It can 
    4447== OK, so what does the gene ''do''? == 
    4548 
    46 You really have only two options for this.  First, you can hope that the gene itself is the subject of one or more publications; you can figure this out by going to the NCBI Web page and looking at the publication list.   In the (quite likely) case that the gene is from a large-scale cDNA collection or is just a predicted gene, it might be worth looking at other  
    47 for the message or its associated protein.   
     49You really have only two options for this.  First, you can hope that the gene itself is the subject of one or more publications; you can figure this out by going to the NCBI Web page and looking at the publication list associated with the RefSeq entry.   In the (quite likely) case that the gene is from a large-scale cDNA collection or is just a predicted gene, it might be worth looking at other species for matches to the message or its associated protein, but as a rule NCBI will already have annotated genes based on this information. 
     50 
     51The second (and much worse) option is to look at conserved protein domains.  I'm not really an expert in this, but I can suggest a few sites: [http://pfam.janelia.org/ the PFAM site] will do a search for known and interesting protein domains in your protein sequence (which you can always get from NCBI for a RefSeq gene).  To use PFAM, select "Sequence Search" and paste in your protein sequence.  You can also use NCBI's [http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi Conserved Domain Database] to do a similar search.
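
If you copy the protein from NCBI as FASTA, it can help to reduce it to a bare sequence before pasting it into a search form. The sketch below is one way to do that; the helper name `sequence_only` and the sample record are hypothetical, and it assumes you only want the first record in the text.

```python
def sequence_only(fasta_text):
    """Return the residues of the first FASTA record as one unbroken string."""
    residues = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A second header means a second record starts here: stop.
            if seen_header or residues:
                break
            seen_header = True
            continue
        # Drop any spaces or tabs inside the sequence line.
        residues.append("".join(line.split()))
    return "".join(residues).upper()


# Invented record, for illustration only.
record = ">demo_protein hypothetical example\nmkt ay\nIAK\n>second_record\nAAA\n"
print(sequence_only(record))
```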