Look for gene upstream sequences-please help

debdee
Hi,

Could someone help me find the sequence upstream of the promoter of my gene of interest? It is a mouse gene, and I have been asked to find the regions of that upstream sequence that are not conserved across species.

Please help.

Thank you,

Deb

ryan_m
Hi Deb.
I can certainly give some suggestions, but first I'd like to clarify what you want to identify. You mention you want the sequence upstream of the promoter of your gene. Does this mean you want to exclude the promoter itself? If so, how do you plan to define the promoter (1kb upstream, 2kb, 5kb)? How much sequence do you want to extract? Or perhaps you just want to grab 5kb upstream of your gene and call that the promoter? Either way, my guess is that you don't want to write any code to do this. For non-programmers, I would suggest trying "Galaxy". There is a post about this tool, with a link to its homepage, elsewhere in this forum. Once I have more details on exactly what you want, I can walk you through how to do it.

Ryan

debdee
Hi Ryan

Thanks for your reply. I will try to clarify the details as far as I can. I am looking for the 2kb of sequence upstream of the promoter of my gene of interest; specifically, the 2kb upstream of the Rosa26 promoter. I then need to find the conserved and non-conserved parts of that 2kb upstream region across species, namely Drosophila, rat, human, C. elegans, and zebrafish. I hope that makes my question clear.

Waiting eagerly for your help.

Thanks again,

Deb

ryan_m
Hi Deb.
Has anyone characterised the promoter of this gene? If so, you can use their coordinates to find the edge of the promoter that is furthest from your gene. That position is the end of your 2kb piece; the start is another 2kb further from the gene. You can use the UCSC Genome Browser to extract this sequence. If you have the 'Conservation' track turned on, you should also be able to see how conserved that region is. Conservation is not a yes/no thing though, so you will need some criteria for what is and is not conserved (e.g. does it align? is the percent identity > X? is the phastCons score > 0.9?). Once you have extracted your 2kb, you may want to do your own alignments (to the genomes you are interested in) using multiz or another algorithm, to make sure you are happy with the UCSC alignments.
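
If you (or anyone reading later) would rather script the coordinate arithmetic, here is a rough Python sketch of the two steps described above. It is only an illustration under some assumptions: coordinates are 0-based and half-open (BED style), the promoter coordinates come from whatever annotation you trust, and the per-base conservation scores have already been downloaded into a plain list. The function names `upstream_window` and `conserved_runs` are made up for this example, and the 0.9 cutoff is just the example threshold mentioned above, not a recommended value.

```python
def upstream_window(promoter_start, promoter_end, strand, length=2000):
    """Return (start, end) of the window immediately upstream of a promoter.

    Coordinates are assumed to be 0-based, half-open (BED style).
    "Upstream" depends on the strand the gene is on:
      '+' strand: upstream is toward smaller coordinates
      '-' strand: upstream is toward larger coordinates
    """
    if promoter_start >= promoter_end:
        raise ValueError("promoter_start must be less than promoter_end")
    if strand == "+":
        # Clamp at 0 so the window never runs off the start of the chromosome.
        return max(0, promoter_start - length), promoter_start
    if strand == "-":
        return promoter_end, promoter_end + length
    raise ValueError("strand must be '+' or '-'")


def conserved_runs(scores, threshold=0.9, min_len=1):
    """Return half-open (start, end) index ranges where score > threshold.

    `scores` is a list of per-base conservation scores for the window
    (for example phastCons values); indices are offsets into the window.
    Runs shorter than `min_len` bases are dropped.
    """
    runs = []
    run_start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if run_start is None:
                run_start = i  # a new conserved run begins here
        elif run_start is not None:
            if i - run_start >= min_len:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the window.
    if run_start is not None and len(scores) - run_start >= min_len:
        runs.append((run_start, len(scores)))
    return runs
```

For example, with a hypothetical promoter at 5000-6000, `upstream_window(5000, 6000, "+")` gives the window 3000-5000, while the same promoter on the minus strand gives 6000-8000. Everything in the window that `conserved_runs` does not report is, by the chosen threshold, the "non-conserved" part.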