Vol.2 No.1 2009
Research paper: How Grid enables E-Science? (Y. Tanaka), Synthesiology - English edition Vol.2 No.1 (2009), p. 36

supercomputers.” Several issues must be solved for the computation grid. First, there are performance issues: the Internet operates on a “best effort” basis and offers no actual performance guarantee. Second, the current parallel programming methods are not suitable for grid use because of problems with fault tolerance and with securing multiple computers simultaneously. Third, there are technological issues: the scheduling technology for selecting optimal computing resources is still only in the research phase. Very few case study reports can say, “we were able to do this using the grid,” and the views of the application community are, “we would like to use it but don’t know how,” “it won’t work anyway,” or “I have no idea what to use it for.”

Not all the basic technologies required for the grid have reached a practical level, but it is possible to provide novel and realistic research methodologies for various science and technology fields by combining the technologies that have been developed so far. The objective of our research is to build an E-Science infrastructure using the grid as the basis of the GEO Grid, and thus to provide a research environment for earth science researchers, as well as to clarify and solve the issues standing in the way of full use of the grid in wide-ranging science and technology fields. The aim is to contribute to the creation of innovations in science and technology fields. To achieve this goal, the specifications required for an information infrastructure were analyzed based on the scenarios of the GEO Grid case studies, and the system was designed and implemented. The strategy taken was to actually build a system for distributing satellite data, and thereby to provide a research environment to earth science researchers using the grid.
Issues were identified as they arose from the findings and feedback of this implementation, and a strategy for realization was planned.

In this paper, the tasks undertaken to achieve system construction, the security issues in E-Science, and the problems that still need to be solved will be explained using the GEO Grid as an example. For researchers in application fields, the main objective of this paper is to promote diffusion of the grid by demonstrating the feasibility of the case study, and to enhance understanding of the grid by clarifying “what can be done and what cannot be done.” For researchers in IT fields, I will also explain the methodology used for constructing a system by combining multiple software components.

GEO Grid is composed of applications, content, and an information infrastructure, and this paper reports on the design and implementation of the information infrastructure. First, the methodology of system construction in the IT field will be discussed. Then, the requirements of the GEO Grid information infrastructure and a design policy based on these requirements will be presented. Finally, I will explain the implementation method as well as the findings and results obtained through the construction of the actual system.

2 Requirements of the information infrastructure

The requirements of the GEO Grid information infrastructure are summarized as follows.

(1) Provision of large-scale data
Satellite observation data accumulates to several hundred terabytes to petabytes over its operation period, so high scalability is required to let users quickly find the data they need within such large-scale data.

(2) Handling of diverse data
The ability to handle diverse data stored in diverse formats and provided by diverse organizations is required, including climate data for different physical quantities, such as temperature, humidity, and cloud cover, obtained at different time-space resolutions.

(3) Observation of data provision policy
While there are free data sets with no limitation on use, in general the data owner has the right to license the data, as well as the right to set and change the conditions under which the data can be provided, such as the authorized range of data access or the data format. Thus, it is necessary to achieve flexible access control based on the disclosure policy of the data owner.

(4) Integration of data and computation
It is necessary to integrate computation with data, covering both large-scale computations, such as data-based simulation of areas affected by pyroclastic flow, and simple computations, such as format conversion and preliminary processing of data.

(5) Support for a diverse community
It is necessary to set up a mechanism for sharing data, computation, tools, and process flows in the form of templates that can be altered flexibly, in order to support diverse communities and various earth science projects, such as environmental watch, disaster watch, and resource exploration.

(6) Ease of use
It is necessary to provide tools and interfaces that can be “easily used” by all participants, including users, data providers, and project administrators. Also, the system must allow easy management of tens of thousands of users.

3 Design

Based on the requirements mentioned in the previous