Vol.3 No.1 2010
Research paper: A bioinformatics strategy to produce a cyclically developing project structure (M. Suwa et al.), Synthesiology - English edition Vol.3 No.1 (2010)
peptide ligand search. In parallel, we developed a program to predict G protein activation. First, using sequences with known coupled G protein and binding ligand (Gi/o type: 61, Gq/11 type: 47, Gs type: 23), we determined the parameters effective for categorizing the type of coupled G protein, together with the optimal decision plane, from the physicochemical parameters of various sites of the ligand, GPCR, and G protein. For this we used the support vector machine (SVM) method, which is thought to have the highest identification performance. Using the optimized parameters[9] and decision plane, we created a hierarchical determination program (GRIFFIN). Given the ligand molecular weight and the GPCR as input, it first selects the Gs-coupled type and then makes a binary determination of Gi/o or Gq/11 for the remainder. Prediction was possible at a sensitivity and selectivity of 85 % or higher[10]. Using the above, together with the database of ligands that bind to GPCRs, it is possible to predict the G protein type that is activated downstream in signal transmission by the receptor to which a given peptide ligand binds. This will be useful in designing evaluation systems for receptor expression. GRIFFIN would be used for predicting GPCRs with unknown functions in the functional analysis phase of SEVENS.
4.4 Hop again: Type 1 Basic Research to up-scale the research
Up to this point, the research handled the human genome only, but the approach is, in principle, applicable to the genomes of other species. From 2005, we participated in the Scientific Research on Priority Area of the Ministry of Education, Culture, Sports, Science and Technology (MEXT) and started comparative genome research in earnest. This required altering the SEVENS pipeline for other organisms.
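As an aside on the hierarchical determination described above for GRIFFIN, the two-stage logic (select the Gs type first, then decide between Gi/o and Gq/11 for the remainder) can be illustrated with a minimal sketch. This is not the GRIFFIN implementation. The function names, the feature names `feature_a` and `feature_b`, and the weights are hypothetical placeholders, and each stage is assumed to be a simple linear decision function standing in for a trained SVM.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Classifier = Callable[[Features], bool]


def linear_decision(weights: Dict[str, float], bias: float) -> Classifier:
    """Build a linear decision function: True when w.x + b > 0.

    Stand-in for a trained SVM decision plane (assumption for illustration).
    """
    def decide(x: Features) -> bool:
        score = sum(w * x.get(name, 0.0) for name, w in weights.items()) + bias
        return score > 0.0
    return decide


def predict_g_protein(x: Features, is_gs: Classifier, is_gi_o: Classifier) -> str:
    """Two-stage call: Gs first, then Gi/o versus Gq/11 for the remainder."""
    if is_gs(x):
        return "Gs"
    return "Gi/o" if is_gi_o(x) else "Gq/11"


# Hypothetical, untrained parameters; real values would come from SVM training.
stage1 = linear_decision({"feature_a": 1.0}, bias=-0.5)
stage2 = linear_decision({"feature_b": 1.0}, bias=-0.5)

call_1 = predict_g_protein({"feature_a": 1.0, "feature_b": 0.0}, stage1, stage2)
call_2 = predict_g_protein({"feature_a": 0.0, "feature_b": 1.0}, stage1, stage2)
call_3 = predict_g_protein({"feature_a": 0.0, "feature_b": 0.0}, stage1, stage2)
```

Keeping the two stages as separate callables mirrors the hierarchy in the text: the second decision is only consulted for inputs the first stage did not assign to Gs.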
Based on the genome sequences of about a dozen eukaryotes and over 200 prokaryotes that were available at the time, we examined the sequence similarity expectation value (E-value) used when mapping known genes onto the genome sequence, and the additional extension upstream and downstream of the gene candidate regions. Using the improved pipeline, no GPCRs could be identified in prokaryotes, whereas in eukaryotes we found some GPCRs in yeasts, a dozen in plants, about 200 in insects, several hundred in fish and birds, and several hundred to thousands in mammals. Among insects, nematodes, and vertebrates, the minimal set of receptors necessary for life activities such as neurotransmission and intercellular interaction was retained in all organisms, while the types of receptors for complex functions increased dramatically in vertebrates. The receptors for chemical substances in the environment were distributed uniquely in different organisms according to atmospheric or aqueous environments. For example, in mammals, olfactory receptors made up a high percentage of the GPCR genes, about 70 %. This indicates that they increased rapidly through repeated high-density gene duplication[11]. The SEVENS pipeline for multiple organisms was almost completely automated at this point, so analysis could continue even as the number of organisms increased.
4.5 Step again: use of pipeline with new protocol
Our identification and publication of the GPCRs of various organisms were highly acclaimed, and from 2007 we participated in the Silkworm Genome Project, a joint research project of China and Japan. The silkworm genome was the first lepidopteran genome sequence to be completed. Its analysis may accelerate the development of production technology for medical proteins and silk with new functions, and thereby contribute to the development of new agrichemicals and the insect industry.
We collaborated with groups from the University of Tokyo and the Kyoto Institute of Technology, identified the seven transmembrane helix receptors in the silkworm genome, and clarified their family distribution. In particular, we found several characteristics of the olfactory and gustatory receptors that are unique to silkworms compared to other insects (Drosophila, Anopheles, and honeybee)[12]. For this project also, it was necessary to modify the SEVENS pipeline for insects. We studied the sequence similarity score used when mapping known genes onto the genome, surveyed the additional extension upstream and downstream, and built hidden Markov models for common sequences seen only in insect olfactory receptors. We also introduced a new protocol, since the aim was to maximize the number of identified genes. In ordinary pipelines, when the known genes are used as seeds, a greater number of gene candidates emerge, including new genes. Therefore, by using these new genes as initial seeds of the pipeline, the number of new genes will increase further. This is repeated sequentially until the number of predicted genes stops growing (recursive computation). We identified 66 olfactory receptors. Among these, we confirmed the expression of 18 new receptors, and the odorous substance (cis-Jasmone) that attracts the silkworm to mulberry leaves and its receptor were identified for the first time in the world. This was a world-class result in the field of biology and was published in Current Biology[13]. The pipeline for insects and the recursive computation protocol are reflected in the current SEVENS.
4.6 Current results, SEVENS and GRIFFIN
As of 2009, SEVENS stores 24,545 genes for 43 eukaryotes, under the support of a Grant-in-Aid for Scientific Research (Grant-in-Aid for Publication of Scientific Research Results). It is an integrated DB in which various kinds of functional and structural information are visually presented and organized in a hierarchical manner.
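The recursive computation protocol reflected in the current SEVENS can be illustrated with a minimal sketch: newly found genes are fed back as seeds until a round yields nothing new. This is an illustration under stated assumptions, not the SEVENS code. The function `recursive_gene_search`, the `search` callback, the `max_rounds` safety limit, and the toy hit table are all hypothetical. In a real pipeline the callback would wrap the similarity search and gene-model steps.

```python
from typing import Callable, Dict, Iterable, Set


def recursive_gene_search(
    initial_seeds: Iterable[str],
    search: Callable[[Set[str]], Set[str]],
    max_rounds: int = 50,
) -> Set[str]:
    """Repeat the search, reusing newly found genes as seeds,
    until a round finds no new genes (or max_rounds is reached)."""
    found: Set[str] = set(initial_seeds)
    seeds: Set[str] = set(found)
    for _ in range(max_rounds):
        new_genes = search(seeds) - found  # keep only genes not seen before
        if not new_genes:
            break  # the number of predicted genes has settled
        found |= new_genes
        seeds = new_genes  # next round is seeded by the new genes only
    return found


# Toy stand-in for the pipeline: which candidates each seed gene retrieves.
TOY_HITS: Dict[str, Set[str]] = {
    "known_A": {"cand_1", "cand_2"},
    "cand_1": {"cand_3"},
    "cand_2": {"cand_1"},
    "cand_3": set(),
}


def toy_search(seeds: Set[str]) -> Set[str]:
    """Return the union of candidates retrieved by any seed."""
    hits: Set[str] = set()
    for seed in seeds:
        hits |= TOY_HITS.get(seed, set())
    return hits


result = recursive_gene_search(["known_A"], toy_search)
```

In this toy run the search settles after three rounds, because the third round retrieves no candidate that is not already in the result set.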
The technologies improved in the