Vol.2 No.1 2009

Research paper: Advanced in-silico drug screening to achieve high hit ratio (Y. Fukunishi et al.) −68− Synthesiology - English edition Vol.2 No.1 (2009)

The method of discriminating equivalent atoms is as follows. Arbitrary atoms i and j are selected for testing, and atoms that are found to be equivalent are marked as “already checked.” If i = j, the atoms are equivalent and no further validation is necessary. If atoms i and j are bound by a single bond, both i and j bind to “already checked” atoms, and their atomic symbols are the same, then i and j are considered to be equivalent.

All atoms mi and mj binding to i and j, respectively, are temporarily marked as “already checked,” and their equivalency is tested by the procedure described above. If mi and mj are not equivalent, the “already checked” mark is removed. However, if all of mi and mj are determined to be equivalent, atoms i and j are considered to be equivalent.

When equivalency is tested starting from atoms i and j, the atoms that are to be tested are shown as gray-filled circles and the atoms that are tested last are shown as black-filled circles in Fig. 4. The equivalency test is necessary only up to the point where the routes from i and j meet each other (black-filled circles); the whole graph need not be tested.

4.10 Compilation of database and downloading of files

The compound DB is structured as a relational database. In addition to the information in the compound mol2 files (atomic names, 3D coordinates, atomic charges, chemical bond orders, etc.), the schema includes the molecular weight, the HOMO/LUMO energies in the MOPAC AM1 model, and the solvation free energy per molecule and per atom calculated by the GBSA model.
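To make the kind of schema described above concrete, the following is a minimal sketch of a relational layout using Python's built-in sqlite3 module. The paper does not give the actual schema, so every table name, column name, and the helper atoms_of are assumptions made for illustration. The water entry uses an approximate, illustrative geometry and illustrative charges; properties that would require real calculations (HOMO/LUMO energies, solvation free energies) are left as NULL rather than invented.

```python
import sqlite3

# Hypothetical schema: one row per compound, plus per-atom and per-bond rows
# mirroring the contents of a mol2 file. All names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compound (
    compound_id           INTEGER PRIMARY KEY,
    name                  TEXT NOT NULL,
    molecular_weight      REAL,
    homo_energy           REAL,  -- MOPAC AM1 model
    lumo_energy           REAL,  -- MOPAC AM1 model
    solvation_free_energy REAL   -- GBSA model, per molecule
);
CREATE TABLE atom (
    compound_id           INTEGER NOT NULL REFERENCES compound(compound_id),
    atom_index            INTEGER NOT NULL,
    atom_name             TEXT NOT NULL,
    x REAL, y REAL, z REAL,
    charge                REAL,
    solvation_free_energy REAL,  -- GBSA model, per atom
    PRIMARY KEY (compound_id, atom_index)
);
CREATE TABLE bond (
    compound_id INTEGER NOT NULL REFERENCES compound(compound_id),
    atom1       INTEGER NOT NULL,
    atom2       INTEGER NOT NULL,
    bond_order  TEXT NOT NULL    -- mol2 bond types are strings: 1, 2, ar, am, ...
);
""")

# Illustrative entry only; computed properties are left as NULL.
conn.execute(
    "INSERT INTO compound VALUES (1, 'water', 18.015, NULL, NULL, NULL)"
)
conn.executemany(
    "INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "O1", 0.000, 0.000, 0.000, -0.8, None),
        (1, 2, "H1", 0.957, 0.000, 0.000, 0.4, None),
        (1, 3, "H2", -0.240, 0.927, 0.000, 0.4, None),
    ],
)
conn.executemany(
    "INSERT INTO bond VALUES (?, ?, ?, ?)",
    [(1, 1, 2, "1"), (1, 1, 3, "1")],
)


def atoms_of(connection, compound_id):
    """Return (atom_name, x, y, z, charge) rows for one compound, in atom order."""
    cursor = connection.execute(
        "SELECT atom_name, x, y, z, charge FROM atom "
        "WHERE compound_id = ? ORDER BY atom_index",
        (compound_id,),
    )
    return cursor.fetchall()


atoms = atoms_of(conn, 1)
```

Keeping atoms and bonds in their own tables is one way to store per-atom quantities, such as the per-atom solvation free energy, alongside per-molecule properties; whether the actual DB is organized this way is not stated in the text.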
The solvation free energy per atom is useful for identifying the location of a compound in the chemical space of compounds (compound space), and is thus used as a parameter that indicates the diversity of the compounds collected in a DB[14]. Compound information can be downloaded from the compound DB in the form of mol2 files.

4.11 Computation of protein-compound affinity matrix

We selected a large number of proteins other than the target proteins, performed combinatorial docking calculations against the compound library in the compound DB, constructed a protein-compound affinity matrix, and compiled it into a database. This is the basic DB for the drug screening methods we developed, the multiple target screening (MTS) method[15] and the docking score index (DSI) method[14], which will be described later, and it is a crucial resource for our VS (Fig. 5).

If compounds are selected simply in descending order of the docking scores*term2 (scores) calculated by general VS against the target protein, the hit rate is low. A compound that exhibits a high score against a target protein occasionally also exhibits high scores against other proteins, which indicates that the binding of the compound to the target protein is not specific. In contrast, the MTS method focuses on one compound at a time: the proteins that bind to the compound are searched for, and the compounds for which the target protein gives the highest score are selected as candidate hit compounds.

The accuracy of the score can also be improved by using the protein-compound affinity matrix. The binding free energies of a particular compound to analogous proteins are considered to be close in value. Errors in the score can therefore be reduced by averaging the scores with weights that depend on the similarity of the proteins; the details are reported elsewhere[16].
In particular, the scores were corrected by the following equation:

$$s^{\mathrm{new}}_{ai} = \frac{\sum_b R_{ab}\, s_{bi}}{\sum_b R_{ab}} \qquad (1)$$

where $s^{\mathrm{new}}_{ai}$, $s_{bi}$, and $R_{ab}$ are the newly defined score between protein a and compound i, the score between protein b and compound i, and the correlation coefficient of protein a and protein b, respectively.

In addition, if known active compounds exist in the compound list, the scores can also be corrected so that the known active compounds will be preferentially predicted. As shown in the following equation, the corrected scores are described as a linear combination of the scores, and the coefficient $M_{ab}$ was evaluated so that the database enrichment*term3 was maximized:

$$s^{\mathrm{new}}_{ai} = \sum_b M_{ab}\, s_{bi} \qquad (2)$$

As a result of applying the MTS method to 12 target proteins including COX-2 and HIV-1 protease and selecting the top 1 % of compounds predicted from the compound library, the discovery rate was improved approximately 40-fold

[Fig. 4 Determination of equivalent atoms. Both panels are labeled “Atoms i and j are equivalent.” Marked atoms are “already checked” atoms.]
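The score correction of equation (1) and the MTS selection rule of Section 4.11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names correct_scores and mts_candidates are hypothetical, the affinity matrix is stored as scores[protein][compound], a higher score is taken to mean stronger predicted binding (following the wording of the text), the correlation coefficients R are supplied by the caller, and the small matrices are made-up toy numbers rather than docking results.

```python
def correct_scores(scores, R):
    """Equation (1): s_new[a][i] = sum_b R[a][b] * s[b][i] / sum_b R[a][b].

    scores[b][i] is the score of compound i against protein b, and R[a][b]
    is the correlation coefficient between proteins a and b. Assumes that
    each row of R has a nonzero sum.
    """
    n_proteins = len(scores)
    n_compounds = len(scores[0])
    corrected = []
    for a in range(n_proteins):
        norm = sum(R[a])
        row = [
            sum(R[a][b] * scores[b][i] for b in range(n_proteins)) / norm
            for i in range(n_compounds)
        ]
        corrected.append(row)
    return corrected


def mts_candidates(scores, target):
    """Simplified MTS reading: keep compounds whose best-scoring protein
    is the target, ordered by their target score (best first)."""
    n_proteins = len(scores)
    n_compounds = len(scores[0])
    hits = []
    for i in range(n_compounds):
        best_protein = max(range(n_proteins), key=lambda a: scores[a][i])
        if best_protein == target:
            hits.append(i)
    return sorted(hits, key=lambda i: scores[target][i], reverse=True)


# Toy affinity matrix (illustrative numbers only): 3 proteins x 3 compounds.
scores = [
    [9.0, 5.0, 7.0],  # protein 0, treated as the target
    [8.0, 6.0, 3.0],  # protein 1, similar to protein 0
    [2.0, 4.0, 1.0],  # protein 2, unrelated
]
R = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

corrected = correct_scores(scores, R)
candidates = mts_candidates(scores, target=0)
```

In this toy case compound 1 is dropped because it scores higher against protein 1 than against the target, which is the non-specific binding that the MTS method is meant to filter out. Equation (2) has the same form as the numerator of equation (1), with fitted coefficients in place of R and no normalization.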
