Vol.2 No.1 2009
Research paper : Advanced in-silico drug screening to achieve high hit ratio (Y. Fukunishi et al.)
−66− Synthesiology - English edition Vol.2 No.1 (2009)

NP-complete problems, which thus require a prolonged period of computation time[3]. Hence, we developed a method that compares the topology of molecules using a molecular edge matrix, M, where M(i, j) = 1 if atoms i and j are bonded to each other and M(i, j) = 0 if not (Fig. 2).

In molecular graphs, the order in which the atoms are numbered is arbitrary, so molecules should be compared through graph invariants. Since the edge matrix is a Hermitian matrix, its eigenvalues are real and do not change when the atoms are renumbered; they therefore serve as graph invariants. Although the Hosoya index is a known graph invariant, it is computationally cumbersome[4]. Eigenvalue evaluation is a practical approach because its computation time scales as N³ for the number of atoms, N. Protons were eliminated to reduce the matrix dimensions by half, and the atomic number of each atom was placed in the corresponding diagonal term in order to reflect the type of atom.

4.3.3 Identification of geometric isomers
Although the method described in Section 4.3.2 makes it possible to identify the graphical topology of molecules with reasonable accuracy, it cannot discriminate geometric isomers such as cis and trans isomers. Thus, we developed graph invariants that can discriminate geometric isomers. First, for atoms i and j joined by a double bond, the graph fragments at the ends of the four neighboring bonds are numbered 1, 2 and 1’, 2’ in sequence, starting from the maximum eigenvalue of the partial graph matrix (Fig. 3).
Geometric isomers can thus be identified from the eigenvalues of the whole graph matrix by assigning −2 to the (i, j) component if vectors 12 and 1’2’ are parallel and +2 if they are anti-parallel.

4.4 Protonation
The number of missing protons on atoms such as C, N, O, and S in the 2D structures was predicted from the bond orders. Plausible coordinates for these protons were derived from the positions of the adjacent atoms, and the protons were appended to the molecules. Although software that appends protons, such as babel[5] and openbabel[6], is available, such software is not necessarily accurate. We investigated the protonated states of various functional groups and devised an algorithm that reproduces the dominant ionic form of a molecule in a vacuum and in water (near pH 7.0). Since accurate prediction of the ionic configuration of a whole molecule is difficult, a representative ionic configuration was applied to each functional group. Moreover, since 2D chemical structures are simply diagrams, the drawn atomic distances are arbitrary and may be 1 Å or 10 Å. The coordinates were therefore scaled so that the average length of the chemical bonds was 1.5 Å.

4.5 Addition of force field parameters
The generation of 3D molecular structures from their 2D counterparts was conducted using a molecular force field. Our compound DB applied the general AMBER force field (GAFF)[7] to generate 3D structures. Since GAFF parameters are not available for most molecules, the structures of those molecules cannot be determined directly. We therefore obtained accurate molecular structures by optimization calculations based on ab initio quantum mechanics, using CSD[8], a crystal structure database, and manually constructed the structures of 660 molecules. We then developed an algorithm that assigns atom types and force field parameters to all atoms for which parameters are absent, thereby making it possible to handle more than 99.9 % of molecules.
Moreover, in addition to consolidating the force field parameters, we developed tplgeneL, software that assigns force field parameters to general compounds. This software can also assign parameters to the transition states of chemical reactions, which is useful for enzyme research.

4.6 Generation of 3D structures
Once force field parameters have been provided for the molecules, the 3D molecular structures can be generated. We applied cosgene[9], software that we had previously developed for molecular dynamics simulation, to generate 3D structures by energy optimization. A random displacement must be applied to the initial coordinates, because a 2D structure containing only X and Y coordinates produces no force in the Z-axis direction, and a 3D molecular structure therefore cannot be generated from it. The structural adequacy of the generated 3D molecular structures (such as atomic distances and bond angles) was assessed by software, and if a distorted structure was generated, the initial coordinates were

Fig. 2 Construction of a binding matrix, M, from a molecular graph.
[Figure: molecular graph with atoms numbered 1 to 5, and its edge matrix M:]
0 1 1 1 0
1 0 1 0 0
1 1 0 0 0
1 0 0 0 1
0 0 0 1 0

Fig. 3 Identification of geometric isomers. The thin black arrows indicate the sequence of eigenvalues assigned on the partial graphs.
[Figure: two structures with H and CH3 substituents on the double-bonded atoms i and j, fragments labeled 1, 2 and 1’, 2’. Panel labels: M(i, j) = −2, M(i, j) = 2; cis isomer, trans isomer.]
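As an illustration of the eigenvalue invariant of Section 4.3.2, the following is a minimal sketch and not the implementation used for the compound DB. It builds the matrix for the five-atom graph of Fig. 2, with atomic numbers on the diagonal, and checks that the eigenvalues are unchanged when the atoms are renumbered. The element assignment (four carbons and one oxygen), the function names, and the use of cyclic Jacobi rotations as the eigenvalue solver are all assumptions made for this example. The ±2 entries for geometric isomers (Section 4.3.3) are not included.

```python
import math

def edge_matrix(atomic_numbers, bonds):
    """Symmetric matrix M: atomic numbers on the diagonal, 1 for each bonded pair."""
    n = len(atomic_numbers)
    m = [[0.0] * n for _ in range(n)]
    for i, z in enumerate(atomic_numbers):
        m[i][i] = float(z)
    for i, j in bonds:
        m[i][j] = m[j][i] = 1.0
    return m

def eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations (descending)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Rotate columns p and q, then rows p and q, which zeroes a[p][q].
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def permute(atomic_numbers, bonds, order):
    """Renumber the atoms: new index k holds the old atom order[k]."""
    new_index = {old: new for new, old in enumerate(order)}
    zs = [atomic_numbers[old] for old in order]
    bs = [(new_index[i], new_index[j]) for i, j in bonds]
    return zs, bs

# Graph of Fig. 2 with 0-based indices: bonds 1-2, 1-3, 1-4, 2-3, 4-5.
bonds = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
zs = [6, 6, 6, 6, 8]  # hypothetical elements: four carbons and one oxygen

spec_a = eigenvalues(edge_matrix(zs, bonds))
zs_b, bonds_b = permute(zs, bonds, [4, 2, 0, 3, 1])
spec_b = eigenvalues(edge_matrix(zs_b, bonds_b))

# The two sorted spectra agree to within numerical tolerance.
same = all(abs(x - y) < 1e-8 for x, y in zip(spec_a, spec_b))
```

Comparing sorted eigenvalue lists avoids searching for an atom-to-atom mapping between two molecules. Identical spectra are a necessary condition for identical graphs, which is consistent with the "reasonable accuracy" stated in Section 4.3.3.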