If you're interested, please contact Georg Fuellen by email (fuellen@techfak.uni-bielefeld.de), by phone (2903), or in person (M3-114).
These project proposals are intended to be ``Studienarbeiten'' (in accordance with the Bielefeld Curriculum in ``Naturwissenschaftliche Informatik'' -- NWI), organized by the Study Project Agency. However, if you are not a Bielefeld student and found this page on the WWW, please inquire, and we will probably be able to establish contact.

More about TRP:
Store-operated Ca2+ entry, a mode of Ca2+ influx activated by depletion of Ca2+ from the internal stores, has been
detected in a wide variety of cell types and may be the primary mechanism for Ca2+ entry in nonexcitable cells. TRP forms
a supramolecular complex, proposed to be critical for feedback regulation and/or activation, that includes rhodopsin,
phospholipase C, protein kinase C, calmodulin, and the PDZ domain-containing protein, INAD. INAD seems to be a scaffolding
protein that links TRP with several of these other proteins in the complex. In the Drosophila eye, another member of the
family, TRP-like, is expressed. It is suggested to form a heteromultimer with TRP with conductance characteristics distinct
from those of TRP or TRP-like homomultimers. A family of proteins related to TRP is conserved from Caenorhabditis elegans
to humans, and recent evidence indicates that at least some of these proteins are store-operated channels (SOCs). The human TRP-related proteins may
mediate many of the store-operated conductances that have been identified previously in a plethora of human cells.
Two new members of the family were shown to be involved in the pain pathway (VR1 in rats) and in olfaction,
mechanosensation, and olfactory adaptation (OSM-9 in C. elegans).

Project descriptions are still quite rough and preliminary. A more detailed specification will be given (or worked out together with the interested students) before the actual work begins.
All project results, if successful, will be used to improve the content and quality of the TREMBL database. Output quality, reliability, handling of noisy input, and speed are therefore crucial issues.
Initial project coordination and additional biological training will take place during a one-week stay at the European Bioinformatics Institute in Hinxton, Cambridge, UK. A similar stay at the end of the project will focus on project presentation, evaluation, and possibly the preparation of a publication.
The goal of the project is to filter out as many of these entries as possible and to provide a proper translation for them by using a combination of standard tools and additional scripts. After further postprocessing by SWISS-PROT these entries will then be included in TREMBL, the computer-annotated supplement of SWISS-PROT.
The corresponding SWISS-PROT entry for this entry is P26150. But for thousands of DDBJ/EMBL/GenBank entries the corresponding SWISS-PROT or TREMBL entries don't exist. The goal of the project is to enable the automatic generation of most of these entries.
As the alignments rarely produce exact matches, determining the exact location of the features in the new proteins is a nontrivial task that is to be resolved as far as possible in this project. Once the feature locations have been determined, the information is to be presented in TREMBL format.
The image below visualizes the task. The new proteins 1 and 2 have regions with high homology to the reference protein. By multiple alignment and subsequent processing, the locations of these features in the new proteins are to be determined.
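To make the coordinate problem concrete, here is a minimal sketch in Python of how a feature annotated on the reference protein might be projected onto a new protein through a gapped alignment. It is only an illustration under stated assumptions: the alignment is taken as two equal-length strings with "-" for gaps, positions are 1-based and inclusive, and the function names, the min_coverage threshold, and the example sequences are all hypothetical. The actual project may well need a more careful treatment of gaps at feature boundaries.

```python
def build_position_map(aligned_ref, aligned_new):
    """Map 1-based reference positions to 1-based positions in the new protein.

    Only alignment columns in which both sequences have a residue are mapped.
    """
    if len(aligned_ref) != len(aligned_new):
        raise ValueError("aligned sequences must have equal length")
    ref_to_new = {}
    ref_pos = new_pos = 0
    for ref_char, new_char in zip(aligned_ref, aligned_new):
        if ref_char != "-":
            ref_pos += 1
        if new_char != "-":
            new_pos += 1
        if ref_char != "-" and new_char != "-":
            ref_to_new[ref_pos] = new_pos
    return ref_to_new


def transfer_feature(ref_to_new, start, end, min_coverage=0.8):
    """Project the reference feature [start, end] onto the new protein.

    Returns (new_start, new_end, coverage), or None when fewer than
    min_coverage of the feature's residues are aligned (assumed cutoff).
    """
    mapped = [ref_to_new[p] for p in range(start, end + 1) if p in ref_to_new]
    coverage = len(mapped) / (end - start + 1)
    if not mapped or coverage < min_coverage:
        return None
    return mapped[0], mapped[-1], coverage


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any database entry.
    position_map = build_position_map("MKT-AYIAKQR", "MRTGAY--KQR")
    print(transfer_feature(position_map, 2, 5))   # fully aligned feature
    print(transfer_feature(position_map, 5, 8))   # half of it falls into a gap
```

Returning the coverage alongside the coordinates lets a later step decide whether a transferred feature is reliable enough to be written out, which matters when the input is noisy.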

As above, plus:
It should be possible to run the program from the command line as well as in interactive mode from a web interface. The interactive mode should also comprise graphic visualisation of the results.
Given a genome database, how do we assemble a list of proteins appropriate for construction of cellular level mechanistic models of particular mammalian cell types? From the perspective of bioinformatics, the most useful and practical definition of a cell type is a list of the proteins it expresses. This definition presupposes the central dogma of molecular cell biology:

There are several possible approaches to identifying the proteins that are expressed in a particular cell type. One is the EST approach. It relies on Soares-normalized cDNA libraries obtained from tissues of interest. This approach has two major problems for cell biologists and cell physiologists. First, the tissue samples almost always contain multiple cell types and the normalization procedure guarantees that mRNAs from every cell type will be recorded. Second, even when the original cDNA library is obtained from cultured cells, there is considerable uncertainty as to whether the same genes are expressed in vitro as are expressed in vivo. Moreover, genes expressed in one set of culture conditions may not be expressed in another.
Our project aims to overcome these difficulties by taking advantage of the rapidly increasing information on cell-specific promoter elements. In effect, we propose to construct Computational cDNA (CcDNA) libraries.
This approach has the great advantage that it will be easily generalized to other cell types once promoters have been identified, but we propose to begin with the vascular smooth muscle cell because of its tremendous physiological and pathophysiological significance. The vascular smooth muscle cell is essential for regulation of blood flow to all tissues and organs, as well as control of arterial and venous blood pressure. Aberrant behavior of this cell type is a key feature of atherosclerotic heart disease, hypertension and stroke. In the industrialized world, these diseases account for more deaths and disabilities than any other human affliction.
During the past few years a group of MADS-box transcription factors has been shown to control the expression of muscle-specific genes. In particular, the four members of the myocyte enhancer factor-2 (MEF2) family are expressed in developing cardiac, skeletal and smooth muscle cells. Very recently, Eric Olson's laboratory has identified several potential partners for the MEF2 family that may direct the specific program of vascular smooth muscle differentiation.
We propose to develop a WWW interface to the worldwide genome databases that permits the user to assemble a list of candidate genes containing user-specified upstream promoters or combinations of promoters that are known or hypothesized to control cell-specific expression in vascular smooth muscle. It may also be useful to include promoters that are known to be activated in all cell types so as to construct a full computational cDNA (CcDNA) library.
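The core of such a search can be sketched in a few lines of Python. The sketch below assumes that the upstream region of each gene is already available as a plain DNA string and that promoter elements are given as IUPAC consensus strings; both strands are scanned, and a gene is kept only if every required element occurs at least once. The function names and the toy sequences are hypothetical, and the two example motifs (an MEF2-like site written as CTAWWWWTAG and an E-box written as CANNTG) are commonly quoted consensus forms used here purely for illustration, not a specification from this project.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(upstream, motif):
    """Sorted 0-based forward-strand start positions of motif hits on either strand."""
    seq = upstream.upper()
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=" + body + ")")  # lookahead also finds overlapping hits
    hits = {m.start() for m in pattern.finditer(seq)}
    # A hit starting at i on the reverse complement covers the forward-strand
    # window that starts at len(seq) - (i + len(motif)).
    for m in pattern.finditer(reverse_complement(seq)):
        hits.add(len(seq) - (m.start() + len(motif)))
    return sorted(hits)


def candidate_genes(upstream_by_gene, required_motifs):
    """Names of genes whose upstream region contains every required motif."""
    return sorted(
        gene for gene, seq in upstream_by_gene.items()
        if all(find_sites(seq, motif) for motif in required_motifs)
    )


if __name__ == "__main__":
    # Hypothetical upstream sequences, invented for this example.
    upstream = {
        "geneA": "GGCTAAAAATAGCCCAGCTGAA",
        "geneB": "GGGGCCCCGGGG",
        "geneC": "TTCAGCTGTT",
    }
    print(candidate_genes(upstream, ["CTAWWWWTAG", "CANNTG"]))
```

Requiring all motifs corresponds to the "combinations of promoters" case; a single-element list gives the simple search.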
Upon completion of the project we can carry out two tests of its effectiveness. First, we can search the appropriate subsets of dbEST to determine if a significant number of our CcDNAs are known to be expressed in tissues containing vascular smooth muscle. Second, we can compare our list to a list compiled from the joint experience of a large group of investigators working in the fields of vascular smooth muscle cell physiology and cell biology.
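One possible way to make "a significant number" precise in the first test is a one-sided hypergeometric test on the overlap between the CcDNA list and the genes with EST evidence. This is our assumption for illustration, not a method prescribed by the project, and the function name and the numbers in the example are hypothetical. The sketch treats the CcDNA list as a random draw from a fixed catalogue of genes and asks how likely an overlap at least as large would be by chance.

```python
from math import comb


def overlap_p_value(n_catalogue, n_est_supported, n_candidates, n_overlap):
    """P(overlap >= n_overlap) when n_candidates genes are drawn at random.

    n_catalogue      total number of genes considered
    n_est_supported  genes with EST evidence in the tissues of interest
    n_candidates     size of the CcDNA candidate list
    n_overlap        candidates that also have EST evidence
    """
    total = comb(n_catalogue, n_candidates)
    upper = min(n_candidates, n_est_supported)
    favourable = sum(
        comb(n_est_supported, i) * comb(n_catalogue - n_est_supported, n_candidates - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical counts: 10 genes, 4 with EST support, 3 candidates, 2 shared.
    print(overlap_p_value(10, 4, 3, 2))
```

A small p-value would suggest that the promoter-based selection enriches for genes actually expressed in tissues containing vascular smooth muscle; the choice of catalogue strongly affects the result and would need to be justified.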
A working knowledge of elementary molecular cell biology. Good knowledge of C or Perl and knowledge of HTML are highly desirable.
