Vol.4 No.1 2011
Research paper : Development and release of a spectral database for organic compounds (T. Saito et al.)
Synthesiology - English edition Vol.4 No.1 (2011)

measurement condition. Only the best-quality MS spectrum was compiled for each molecule. For IR, different measurement conditions, such as the KBr plate method and the Nujol method for a solid sample, were adopted. In such cases, the spectra with the best quality and reliability under each condition were compiled. For 13C NMR, the 1H nuclei were decoupled, which in most cases reduced the spectrum to a simple singlet peak for each carbon. Only one 13C NMR spectrum was compiled for each compound.

For 1H NMR, the resonance peak pattern depends on the resonance frequency, so the spectral pattern of a given sample solution also depends on the resonance frequency. Many 1H NMR spectra were acquired at a resonance frequency of 90 MHz at the beginning of this database activity. When a molecule had a complex structure and it was hard to assign the chemical shifts at this frequency, a higher frequency of 400 MHz, which simplified the spectral patterns, was also adopted for that molecule. Simulation of spectral patterns was also important. This database had the capability of simulating spectra as observed at different resonance frequencies. To provide this capability, the chemical shifts and spin-spin coupling constants of a molecule were also compiled in the database independently of the spectral patterns.

For 1H and 13C NMR, chemical shift assignments were compiled in addition to the information above. Especially for 1H NMR, since the resonance peak pattern depends on the resonance frequency, it was indispensable to provide the chemical shift assignments as a universal property. Without such data in this database, two 1H NMR spectra obtained at different resonance frequencies could not be compared.
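The frequency dependence described above can be illustrated with a minimal sketch. It is not the simulation method used in the database, which the text does not describe. It assumes a simple first-order (weak-coupling) model in which a proton with chemical shift δ (in ppm) is split by each spin-1/2 neighbour into lines separated by the coupling constant J (in Hz). The function names `first_order_lines` and `shift_difference_hz` and the numerical values are hypothetical and chosen only for illustration.

```python
from itertools import product


def first_order_lines(shift_ppm, couplings_hz, spectrometer_mhz):
    """Return sorted (position_ppm, relative_intensity) pairs for one proton.

    Assumption: first-order model, valid only when the shift difference
    between coupled protons (in Hz) is much larger than J.
    """
    # 1 ppm corresponds to spectrometer_mhz Hz, so the centre moves with
    # the instrument frequency while the J splittings (in Hz) do not.
    center_hz = shift_ppm * spectrometer_mhz
    counts = {}
    # Each spin-1/2 neighbour shifts the line by +J/2 or -J/2.
    for signs in product((-0.5, 0.5), repeat=len(couplings_hz)):
        offset_hz = sum(s * j for s, j in zip(signs, couplings_hz))
        key = round(center_hz + offset_hz, 6)  # merge coincident lines
        counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return [(hz / spectrometer_mhz, n / total) for hz, n in sorted(counts.items())]


def shift_difference_hz(shift_a_ppm, shift_b_ppm, spectrometer_mhz):
    """Separation of two chemical shifts in Hz at a given instrument frequency."""
    return abs(shift_a_ppm - shift_b_ppm) * spectrometer_mhz


# Hypothetical proton: delta = 1.20 ppm, coupled to two neighbours with J = 7 Hz.
triplet_90 = first_order_lines(1.20, [7.0, 7.0], 90.0)
triplet_400 = first_order_lines(1.20, [7.0, 7.0], 400.0)

# Hypothetical pair of protons 0.10 ppm apart with J = 7 Hz: the ratio of
# shift difference to J grows with frequency, which is why patterns simplify.
ratio_90 = shift_difference_hz(1.20, 1.30, 90.0) / 7.0
ratio_400 = shift_difference_hz(1.20, 1.30, 400.0) / 7.0
```

In this sketch the same pair of stored values (δ and J) gives a multiplet that is wider on the ppm axis at 90 MHz than at 400 MHz, while the shift separation in Hz grows with frequency. This is consistent with the point made in the text: the spectral pattern changes with the instrument, whereas the chemical shifts and coupling constants do not. Strongly coupled systems, for which the ratio of shift difference to J is small, would require a full quantum-mechanical calculation that is beyond this example.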
The chemical shifts and their assignments were the most valuable information in the 1H NMR spectral database.

Information on each chemical compound was compiled as much as possible. Usually, the more complex the structure of a compound, the more chemical names or abbreviations are assigned to it. We thought that it would be beneficial for users of this database if they could easily find a compound under any of its names, and therefore a great deal of chemical information was compiled encyclopaedically.

3.5 Kinds of spectra collected

In the early days, six kinds of spectra were collected. Currently, four of the six kinds are still actively collected (Fig. 5). The spectra used in analysis in the 1980s were not limited to those six kinds. Spectra such as ultraviolet-visible spectra were not incorporated in this database. The selection of the six kinds of spectra for this database is thought to have depended on the instrumentation and researchers available in our institute at that time. The decision to terminate the collection of Raman and ESR spectra was based not only on instrumentation and staffing issues but also on what was then assumed to be relatively low potential demand from users of the database. However, the demand for Raman spectral data has since expanded both in academia and in industry. In this respect, the database may not have sufficient coverage to meet users' needs. On the other hand, MS, 13C NMR, 1H NMR, and IR have been in great demand from the 1980s up to the present day. Since the database was released to the public, user demand can be evaluated from the number of accesses to the spectral data (Fig. 2).

4 Policy of data release to the public

4.1 Releasing data on the Web

In 1997, the MS, 13C NMR, and 1H NMR spectral data were released on the AIST website, and the IR and ESR data followed a year later[10]. Currently, six kinds of spectral data, including Raman spectra, are open to the public through the Web.
When our data were opened to the public for the first time, people had started to use web browsers such as NCSA Mosaic and Netscape. Although more and more people were starting to use the Web, internet connections were slow and the functions of the browsers were limited compared with the current situation. Therefore, it was important to display the information efficiently on the Web. As a result, we decided to use the GIF image format, which placed the least load on the internet connection, for displaying spectral patterns and molecular structures. The speed of web access has increased rapidly in Japan, but this may not be the case in some other parts of the world. Therefore, this format is still kept in our system.

Another reason to use the GIF format is to protect the data. In other words, displaying the data as images can prevent unauthorized extraction of the coordinate data, which are the intellectual property of our database. It is easy to reproduce a high-resolution spectrum from digital data. In contrast, a spectrum reconstructed from an image cannot have a higher resolution than the original image. We have experienced several systematic download attempts aimed at taking all, or at least the majority, of our data. The protection measures that we took were effective in preventing the coordinate data from being inappropriately taken. In the future, with sufficient protection against unauthorized access, spectral expansion functions based on the coordinate data can be provided on the Web.

We chose English for the information on the Web. This choice was possible because the collected compound names were in English. The other information, namely the spectral data, did not depend on any language. At present, when users access the database from computers with Japanese language settings, the frame information is given in Japanese. To increase the users' convenience, links to other databases available on the Web have been created for the information