Vol.4 No.1 2011
Research paper: Development and release of a spectral database for organic compounds (T. Saito et al.), Synthesiology - English edition Vol.4 No.1 (2011), p. 37

which is a unique number for each chemical identity. The spectral code for each kind of spectrum (MS, IR, 13C NMR and 1H NMR) is managed individually, so the database for any single kind of spectrum, and the chemical information, can each be separated out as an independent database. The compound and spectral information corresponding to these numbers is managed in independent relational databases, and these numbers allowed our work to proceed smoothly.

In particular, adopting the SDBS compound number as a unique identifier for each compound is a characteristic of this database. This number is not merely a management number for the compounds: it makes the compound dictionary independent of the spectral databases. As a result, the compound dictionary has remained flexible to change, and the compound information has remained usable up to now. The founders of this database had rich experience in constructing databases, gained in the gas chromatographic data committee, the infrared data committee, and the NMR data subcommittee during the era of the former Agency of Industrial Science and Technology. Based on that experience, the SDBS compound number was adopted for this database, and it has kept the database functional for some 30 years since its establishment.

In principle, we obtained all compounds and acquired their spectral data ourselves. A unique bottle number was assigned to each chemical bottle obtained, and an SDBS compound number was assigned to each compound. Each sample identified in a bottle was carefully checked to avoid assigning two or more different SDBS compound numbers to the same chemical compound. A new SDBS compound number was assigned only when the sample was judged to be a compound new to the database.
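The bottle and compound numbering described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the actual SDBS implementation: the class `CompoundRegistry`, the method `register_bottle`, and the use of a CAS registry number as the identity key are all hypothetical. In the paper, whether a sample is a new compound was judged by chemists, not by a simple key lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class CompoundRegistry:
    """Toy registry: every bottle gets its own number; every distinct compound gets one number."""

    # Hypothetical identity key (here a CAS registry number) -> compound number.
    by_identity: Dict[str, int] = field(default_factory=dict)
    # Bottle number -> compound number.
    by_bottle: Dict[int, int] = field(default_factory=dict)
    next_compound_no: int = 1
    next_bottle_no: int = 1

    def register_bottle(self, identity_key: str) -> Tuple[int, int]:
        """Record a newly obtained bottle and return (bottle_no, compound_no)."""
        # Each bottle always receives a fresh bottle number.
        bottle_no = self.next_bottle_no
        self.next_bottle_no += 1

        # A compound number is issued only if the compound is new to the registry.
        compound_no = self.by_identity.get(identity_key)
        if compound_no is None:
            compound_no = self.next_compound_no
            self.next_compound_no += 1
            self.by_identity[identity_key] = compound_no

        self.by_bottle[bottle_no] = compound_no
        return bottle_no, compound_no


registry = CompoundRegistry()
first = registry.register_bottle("64-17-5")   # ethanol, first bottle
second = registry.register_bottle("64-17-5")  # ethanol, second bottle
third = registry.register_bottle("67-64-1")   # acetone
```

In this sketch the two ethanol bottles receive different bottle numbers but share one compound number, while acetone receives a new compound number.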
This checking may not have been a difficult task in the early days of the database, but it became more difficult as time passed. In the past few years, we encountered questions and problems in the checking process. Assigning an SDBS compound number to a unique chemical became more complex and took longer than in the early days. This is because compound structures, and thus their names, have become more complex, and because the number of compiled compounds has grown to some 30,000. To solve this problem, we first collected as much information about a chemical compound as possible and then searched our compound database for each item of information. Several chemists then examined in depth only those compounds that might already have been compiled. This left us more time to acquire and evaluate the spectral data. As a result, the problem of assigning different SDBS compound numbers to identical compounds has occurred less frequently, and full supervision by chemists has become unnecessary. In recent years, the SDBS compound number assignment process has run smoothly.

2.2 Decision of the database operation and renewal of the database platform

The operation of this database started on a mainframe computer in the 1980s. The choice of such a computer made sense because the first Japanese Windows computer, the NEC PC-9800, had only just been released at about that time. However, operation of this mainframe computer (FACOM MSP) ended in March 1999 following a decision by the former Agency of Industrial Science and Technology. We had to decide whether to continue the database on another mainframe computer or on a personal computer (PC), or to terminate it altogether. At that point, we decided to continue our activities and to adopt a Windows PC, and a new data input tool was developed on the PC[7].
For much computer software, the shift of platform from MS-DOS to Windows was difficult, and the majority of a system had to be rebuilt. This database, however, was transferred to its new platform successfully and completely. If we had remained with the mainframe computer at that time, we might have had trouble adopting the many tools that assist data input to the database system. Adopting the Windows PC as the platform for data management made data collection and handling much easier.

3 Selection of data to be collected

3.1 Strategy for selecting compounds

This database was designed as a useful tool for compound identification. Data collection focused mainly on spectral data of commercial reagents that are frequently used by many people. Figure 1 shows the number of spectral

Fig. 5 Structure of the spectral database for organic compounds (SDBS)
SDBS in this figure contains information of the SDBS compound number that is the unique number for each compound, the number of elements for the compound, and other information. The structure of the 1H NMR database is shown as representative of the spectral databases. All information is correlated to the SDBS compound number via a relational database.
[Figure labels: MS DB; 13C NMR DB; Raman DB; IR DB; ESR DB; 1H NMR DB; Chemical name DB; SDBS; SDBS compound number; Bottle number; Chemical bottle information DB; Compound dictionary; 1H NMR spectral code; Measurement parameter DB; Spectral pattern DB; Peak DB; Shifts and assignments DB; Spectra; 1H NMR; CAS registry number DB; Molecular formula and molecular mass DB; Activity stopped]
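The relational linkage summarized in the Fig. 5 caption can be illustrated with a small in-memory example. This is only a sketch under stated assumptions: the table names (`compound`, `hnmr_spectrum`, `hnmr_peak`), the column names, and the inserted values are hypothetical placeholders, not the real SDBS schema or measured data. It shows a compound dictionary and a 1H NMR spectral table that are connected solely through the compound number and the spectral code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Compound dictionary: one row per compound number (hypothetical layout).
    CREATE TABLE compound (
        sdbs_no INTEGER PRIMARY KEY,
        molecular_formula TEXT NOT NULL
    );
    -- A spectral database refers to the dictionary only through sdbs_no.
    CREATE TABLE hnmr_spectrum (
        spectral_code INTEGER PRIMARY KEY,
        sdbs_no INTEGER NOT NULL REFERENCES compound(sdbs_no)
    );
    -- Peak data hang off the spectral code, not the compound number.
    CREATE TABLE hnmr_peak (
        spectral_code INTEGER NOT NULL REFERENCES hnmr_spectrum(spectral_code),
        shift_ppm REAL NOT NULL
    );
    """
)

# Placeholder rows for illustration only (not measured values).
conn.execute("INSERT INTO compound VALUES (1, 'C2H6O')")
conn.execute("INSERT INTO hnmr_spectrum VALUES (10, 1)")
conn.executemany("INSERT INTO hnmr_peak VALUES (?, ?)", [(10, 1.0), (10, 2.0)])

# Look up all 1H NMR peaks for one compound number via the two keys.
rows = conn.execute(
    """
    SELECT c.sdbs_no, c.molecular_formula, s.spectral_code, p.shift_ppm
    FROM compound AS c
    JOIN hnmr_spectrum AS s ON s.sdbs_no = c.sdbs_no
    JOIN hnmr_peak AS p ON p.spectral_code = s.spectral_code
    WHERE c.sdbs_no = ?
    ORDER BY p.shift_ppm
    """,
    (1,),
).fetchall()
```

Because the spectral tables hold only the compound number, the compound dictionary could in principle be revised or replaced without touching the spectral data, which is the flexibility the text attributes to the SDBS compound number.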