A Semantic Grid to Support Modelling of Land Use and Water Management
=====================================================================

Nick Gotts

Introduction
============

The title of this document is also the provisional title of a proposed FP6 Integrated Project (IP) in the thematic area of "Information Society Technologies" (IST), specifically aimed at the Strategic Objective "GRID-based systems for solving complex problems" (see http://www.cordis.lu/ist/activities/activities.htm#). The second call for the IST thematic area (to be issued on 2003.06.17 with a closure date of 2003.10.15) will invite proposals for Integrated Projects under this Strategic Objective (see the IST Work Programme 2003-2004, available at http://fp6.cordis.lu/fp6/call_details.cfm?CALL_ID=1). Those originating this initiative are Alun Preece and Pete Edwards of the University of Aberdeen Department of Computer Science, and Gary Polhill, Alistair Law and me from the Macaulay Institute.

What is a Semantic Grid?
========================

The concept of a "Semantic Grid" fuses ideas from two current directions in which the Web, and in particular its use for scientific work, is being developed: "Grids", and the "Semantic Web".

The general idea of a "Grid" is to make high-end computational resources and capabilities from remote sites available to the participants, without the users needing to concern themselves with where these resources are physically located - any more than we now need to consider where a web page is located. According to a 3-point definition by Foster (2002), a Grid:

* coordinates (computational) resources that are not subject to centralized control (and in particular are within different control domains),
* using standard, open, general-purpose protocols and interfaces (addressing issues such as authentication, authorization, resource discovery and resource access),
* to deliver nontrivial qualities of service (meeting specified criteria relating to response time,
  availability, security, etc.).

The idea of the "Semantic Web" is premised on the fact that the Web "has developed most rapidly as a medium of documents for people rather than of information that can be manipulated automatically" (Berners-Lee et al 2001, p.30). The idea is to annotate Web documents semantically (i.e. with tags indicating the structure and content of the documents, rather than how they should be displayed on browsers), facilitating various kinds of automated search and reasoning. Two important formalisms for the Semantic Web are XML (eXtensible Markup Language) and RDF (Resource Description Framework). XML allows users to add arbitrary structure to their documents; RDF expresses assertions as object-attribute-value triples. The two are compatible.

Combining the two concepts in the idea of a "Semantic Grid", as proposed by De Roure et al (2001), is intended to produce a Grid in which many of the processes necessary to "coordinated resource sharing and problem solving among dynamic collections of individuals" are automated or semi-automated (the authors note that full automation is extremely challenging technically, and may not be desired). The authors propose a three-layer conceptualisation of a Semantic Grid: data (uninterpreted bits), information (data with meaning) and knowledge (information used for a purpose), with annotations for use by automated processes at all levels. They discuss the knowledge layer primarily in terms of a "knowledge lifecycle" including knowledge acquisition, knowledge modelling, knowledge retrieval and reuse, knowledge dissemination, and knowledge maintenance. They regard XML and RDF as insufficient as tagging languages for a Semantic Grid, which would require ontologies created in more sophisticated languages, using concepts from AI knowledge representation work.
They suggest a "service-oriented" conceptualisation of the Semantic Grid, with service owners and users as autonomous agents ("each service owner will have one or more agents acting on its behalf.", p.20).

The Proposed Integrated Project
===============================

While De Roure et al talk about "The Semantic Grid", providing the kinds of capabilities they discuss for scientific research will require the development of detailed, domain-specific ontologies. Indeed, they suggest that the process driving the Semantic Grid's emergence will begin with the development of such ontologies, proceeding to the large-scale annotation of data, information and knowledge in terms of these ontologies, and then the exploitation of this enriched content by "knowledge technologies". Ontology development will, almost necessarily, begin separately in different scientific domains. A major component of the proposed Integrated Project will therefore be the development of ontologies relevant to the specific application domain chosen (currently suggested to be "Land Use and Water Management"), and to simulation modelling (probably further specified to agent-based modelling, possibly even to spatially explicit agent-based modelling). These ontologies will be used to describe models (in terms of inputs, outputs, internal structures, and intended interpretation) and data sources.

Once an annotated database of models and data sources has been created, it will be available for use by the IP participants and by other modellers, and the expectation is that it will continue to grow in size and sophistication over a period of years. Increasingly, as a grid infrastructure is established (probably using the Globus Toolkit, an "open architecture, open source software toolkit"), and as software tools for use with it are developed, it is intended that the incipient Semantic Grid will support (for example):

* Content-driven searches for data sources and models relevant to a planned project.
* Detailed comparisons of the structure and results of existing models.
* The reuse, adaptation, and composition of existing models and data sources.
* The use of distributed computational resources for simulation experiments and analysis of results.
* Availability of, and reasoning about, meta-information concerning the availability, costs, provenance, etc. of data sources, models, and computational resources.

At the Barcelona meeting, I hope to gather a core of agent-based modellers from the ABSS SIG, and possibly from other parts of AgentLink, who will in turn recruit others interested in the IP. It might seem that the IP would be of interest primarily to those in the SIG who are working in the areas of land use and water change, but at this stage the domain, the type of model, or both could be expanded to accommodate groups (here and elsewhere) interested in taking part. Furthermore, there are two possible roles for members of the ABSS SIG in this IP:

* As domain-level modellers, cooperating in the construction of domain ontologies.
* As modellers of possible Semantic Grid structures and procedures, given that the Semantic Grid is envisaged as operating via a dense web of complex interactions between human and (eventually) artificial agents. (Scott Moss in particular has emphasised the role of ABSS in increasing understanding of such dense webs of interaction.)

My talk will expand on the points above, and also discuss the relationship of the proposed IP to various existing initiatives relevant to the development of Semantic Grids.
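
As a rough illustration of how object-attribute-value annotations of the kind discussed above might support content-driven searches for models and data sources, here is a minimal Python sketch. It is not part of the proposal: the identifiers (`model:A`, `data:C`), the attribute names (`type`, `domain`, `input`, `provides`) and the helper functions are all hypothetical, and a real Semantic Grid would presumably rely on RDF tooling and a richer ontology language rather than in-memory tuples.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One assertion: an object, one of its attributes, and that attribute's value."""
    obj: str
    attribute: str
    value: str


# Hypothetical annotations for two models and one data source.
# None of these names come from the proposal; they are placeholders.
ANNOTATIONS = [
    Triple("model:A", "type", "agent-based model"),
    Triple("model:A", "domain", "land use"),
    Triple("model:A", "input", "soil map"),
    Triple("model:A", "output", "land-use map"),
    Triple("model:B", "type", "agent-based model"),
    Triple("model:B", "domain", "water management"),
    Triple("model:B", "input", "rainfall series"),
    Triple("data:C", "type", "data source"),
    Triple("data:C", "provides", "rainfall series"),
]


def match(triples, obj=None, attribute=None, value=None):
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in triples
        if (obj is None or t.obj == obj)
        and (attribute is None or t.attribute == attribute)
        and (value is None or t.value == value)
    ]


def find_by_content(triples, **required):
    """Return sorted object names annotated with every attribute=value pair given."""
    candidates = {t.obj for t in triples}
    for attribute, value in required.items():
        hits = match(triples, attribute=attribute, value=value)
        candidates &= {t.obj for t in hits}
    return sorted(candidates)


def sources_for(triples, model):
    """Return data sources that provide something the given model takes as input."""
    needed = {t.value for t in match(triples, obj=model, attribute="input")}
    return sorted(
        t.obj for t in match(triples, attribute="provides") if t.value in needed
    )
```

Under these assumptions, `find_by_content(ANNOTATIONS, type="agent-based model", domain="land use")` selects the models annotated with both properties, and `sources_for(ANNOTATIONS, "model:B")` looks for data sources whose declared content matches one of that model's inputs. The point is only that searches can be driven by what a resource is annotated as containing, not by its name or location.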

References
==========

Berners-Lee, Tim, James Hendler and Ora Lassila (2001) The Semantic Web. Scientific American 284(5), 28-37.

De Roure, David, Nicholas Jennings and Nigel Shadbolt (2001) Research Agenda for the Semantic Grid: A Future e-Science Infrastructure. Report commissioned for EPSRC/DTI Core e-Science Programme, available as http://www.semanticgrid.org/v1.9/semgrid.pdf

Foster, Ian (2002) What is the Grid? A Three Point Checklist. Grid Today 1(6).