Prediction of substrate for unknown genes
Amita Misra
Sunday, 11 January 2009 11:00 UTC
Hi All,
I have some novel genes from my plant and want to characterize them functionally in vitro, but I am unable to choose a substrate for them to act on. The genes belong to a superfamily of proteins and are very similar at the sequence level. The similar sequences that BLAST finds in public databases are also of unknown function. Can bioinformatics help with substrate prediction? Please suggest how I should proceed.
take care
Amita
Updated 11 January 2009 15:21 UTC
-
Replies
-
Hi Amita,
I come from a background in fly development and genetics, where we would approach this problem by disrupting the normal function of the gene and determining how this affects the organism at a morphological and molecular level. In flies, gene function can be disrupted by making mutants, by RNAi, and by ectopic expression using reporter constructs. I am sure there are other approaches, too, that I haven’t thought of!
You don’t mention if there are similar genes/proteins present in model systems other than your plant. It might be worth checking the sequence databases and literature to find out, as similar genes in other organisms may be better characterised and the function might be conserved?
I hope this helps and that you make progress with your project.
Best,
Dot
-
Hi Dorothy
Thanks for your response.
Actually, I am working on plant CYPs. These genes are present in model plants such as Arabidopsis, but the CYPs I have are of unknown function there too. I am trying RNAi, but CYPs form a superfamily of genes and are very similar at the subfamily level.
Amita
-
Hi Amita,
You might want to look for smaller motifs rather than looking at full-length similarities. BLAST is not the way to do that. A multiple sequence alignment might show regions of high similarity that could be functional, or you could try looking for known motifs.
Multiple sequence alignment:
You need a bunch of related sequences for this (the genes in the superfamily that you mentioned).
Use ClustalW or T-Coffee. They work differently, so try both. If you get similar alignments, don’t worry about it. If you don’t, go with the one that gives conserved local alignments: there might be long sections that are really badly aligned, but spots where it’s almost perfect. That’s what you’re looking for, because those regions might have a similar function, e.g. binding a substrate.
The sequences probably need to be in FASTA format before you can compare them.
The idea behind this (and BLAST doesn’t do that) is that it shows which regions are especially conserved, and therefore probably functional. Once you find something like that, you can do a BLAST search with only that region of your gene(s), and see what that picks up. There might be another group of proteins with a similar region and a known function.
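To make the "find the especially conserved regions" step concrete, here is a rough Python sketch (standard library only). It assumes you already have an alignment from ClustalW or T-Coffee saved as aligned FASTA. The function names, the 0.8 threshold, the minimum block length of 5, and the toy sequences are all made up for illustration, and the score used (fraction of sequences sharing the most common residue in a column) is only one simple way to measure conservation.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (gaps kept as '-')."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def column_conservation(aligned):
    """Per column: fraction of all sequences that carry the most common residue.

    Gaps never count as the most common residue, so gappy columns score low.
    """
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same aligned length")
    scores = []
    for i in range(length):
        residues = [s[i].upper() for s in seqs if s[i] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(seqs))
    return scores


def conserved_blocks(scores, threshold=0.8, min_len=5):
    """Return (start, end) column ranges (0-based, end exclusive) in which
    every column scores at least `threshold` and the run is >= `min_len`."""
    blocks, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                blocks.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        blocks.append((start, len(scores)))
    return blocks


# Toy aligned input (invented sequences, not real CYPs).
toy_alignment = """>cyp_a
MKTAFGGGRCLL
>cyp_b
MR-AFGGGRCIV
>cyp_c
-KSAFGGGRCVA
"""

aligned = parse_fasta(toy_alignment)
blocks = conserved_blocks(column_conservation(aligned))
for start, end in blocks:
    # Print the block from the first sequence; this is what you would BLAST.
    print(start, end, aligned["cyp_a"][start:end])
```

On the toy input this reports one block, columns 3 to 10. With a real alignment you would loosen or tighten the threshold until the blocks look sensible, then use the ungapped residues of a block as the query for the region-only BLAST search.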
This is pretty vague, because I don’t know what your sequences look like and if this is even feasible in this case.
Another tip: If some of the proteins in your set are much bigger than the rest, find the part of the protein sequence that corresponds to the rest of the family, and cut off the extension before doing the multiple sequence alignment. All the sequences should be approximately the same length; if they already are, don’t worry about this bit.
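One possible way to do the trimming, sketched in Python: run a quick preliminary alignment first, keep only the span of columns where at least half of the sequences have a residue, strip the gaps, and re-align the trimmed sequences. This is an assumption about how you might go about it, not a standard tool; `trim_to_family_core`, the 0.5 occupancy cutoff, and the toy sequences are hypothetical.

```python
def trim_to_family_core(aligned, min_occupancy=0.5):
    """Cut every sequence down to the column span shared by most of the family.

    `aligned` maps name -> gapped sequence from a preliminary alignment.
    Returns name -> ungapped, trimmed sequence, ready to be re-aligned.
    """
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same aligned length")

    # A column is "occupied" if enough sequences have a residue (not a gap).
    occupied = [
        sum(s[i] != "-" for s in seqs) / len(seqs) >= min_occupancy
        for i in range(length)
    ]
    if not any(occupied):
        return {name: "" for name in aligned}

    first = occupied.index(True)
    last = length - 1 - occupied[::-1].index(True)

    # Slice out the shared span and drop gap characters.
    return {
        name: seq[first:last + 1].replace("-", "")
        for name, seq in aligned.items()
    }


# Toy preliminary alignment: one protein has an N-terminal extension.
preliminary = {
    "long_one": "MAAAAKTAFGGRC",
    "short_1": "-----KTAFGGRC",
    "short_2": "-----RSAFGGRC",
}

trimmed = trim_to_family_core(preliminary)
for name, seq in trimmed.items():
    print(f">{name}\n{seq}")
```

The output is plain FASTA, so it can go straight back into ClustalW or T-Coffee for the real alignment.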
-
Hi Eva,
Thank you so much for your interest in the topic. I have already done a multiple sequence alignment of my genes using ClustalW. After alignment I got the conserved domains for CYPs. Most of the CYPs are 1.5 kb long.
Now, as per your suggestion, I’ll look for conserved regions other than the ones common to all CYPs as candidates for substrate binding.
Take care,
Amita
-