Conceptual Design of the GIS
- Nature of Geographic Data
- Entity Relationship (E-R) Data Modeling
- Geographic Data Models
- Methodology for Modeling
- Developing a Spatial Data Model (Entity-Relationship Diagram)
- Summary of Conceptual Data Modeling
- Spatial Data Standards and Metadata Requirements
- Appendix A: Standards Development
- Additional Reading
PART 1 - DATA MODELING
This guide describes data modeling in general, spatial data modeling in particular, the setting of GIS specifications, and an introduction to spatial data and metadata standards. These activities are collectively called conceptual design of the GIS system. This activity takes the information developed during the Needs Assessment and places it in a structured format. The result of this activity will be a GIS data model and functional specifications for the GIS system.
Conceptual design is the first step in database design where the contents of the intended database are identified and described. Database design is usually divided into three major activities:
- Conceptual data modeling: identify data content and describe data at an abstract, or conceptual, level. This step is intended to describe what the GIS must do and does not deal with how the GIS will be implemented - the "how" question is the subject of logical and physical database design;
- Logical database design: translation of the conceptual database model into the data model of a specific software system; and
- Physical database design: representation of the logical data model in the schema of the software.
The conceptual design of the GIS system is primarily an exercise in database design. Database planning is the single most important activity in GIS development. It begins with the identification of the needed data and goes on to cover several other activities collectively termed the data life cycle - identification of data in the needs assessment, inclusion of the data in the data model, creation of the metadata, collection and entry into the database, updating and maintenance, and, finally, retention according to the appropriate record retention schedule. A complete data plan facilitates all phases of data collection, maintenance and retention. Because everything is considered in advance, data issues do not become major problems that must be addressed after the fact with considerable difficulty and aggravation.
The conceptual design of the GIS also includes identification of the basic GIS architecture (functions of hardware and GIS software), estimates of usage (derived from the needs assessment), and scoping the size of the GIS system. All of this is done with reference to the existing data processing environments (legacy systems) that must interface with the GIS.
Preparing A GIS Data Model
A data model is a formal definition of the data required in a GIS. The data model can take one of several forms; the two used in this guideline are a structured list and an entity-relationship diagram. The purpose of the data model, and the process of specifying the model, is to ensure that the data has been identified and described in a completely rigorous and unambiguous fashion and that both the user and GIS analyst agree on the data definitions. The data model is then the formal specification for the entities, their attributes and all relationships between the entities for the GIS.
Building a data model is not necessarily an easy task. Most professionals in local government will not have had experience in this task. The GIS analyst of the project is the individual who either should build the data model or acquire assistance, such as a qualified consultant, to complete this task. If the opportunity exists for the GIS analyst to attend a database design course or seminar, this would enhance this person's ability to build the model but, more importantly, provide the knowledge for using the final data model in building the GIS. To the extent that data models prepared for other local governments match the needs of a particular GIS development program, or can be easily adapted, they can be modified for use as the data model. However, the GIS analyst must have a good understanding of the resulting model and how it is used to build and manage the GIS database.
The next sections of the guideline first discuss the nature of geographic data, then present the methodology used for data modeling, and lastly describe the development of a GIS data model from the information collected during the Needs Assessment. The example provided in the last section is actually a sample local government GIS data model and is suitable for direct use, with appropriate modification to specific situations.
Geographic data describe entities which have a location. The geographic data includes the location information and other information about the entity of interest. This other information will be referred to as attributes of the entities. Historically several terms have been used to describe the data in a GIS database, among them features, objects, or entities. The term feature derives from cartography and is commonly used to identify "features shown on a map," while entity and object are terms from computer science used to identify the elements in a database. The normal dictionary definitions of these terms are:
Object: a thing that can be seen or touched; material thing that occupies space
Entity: a thing that has definite, individual existence in reality
Feature: the make, shape, form or appearance of a person or thing
A good GIS database design methodology requires the use of terms in a clear and unambiguous manner. This guideline will use the term entity to represent objects or things to be included in the database and attribute will be the term for representing the characteristics or measurements to be recorded for the entities. Other terms have commonly been used to describe the organization of entities and attributes in a GIS, such as layer, coverage, base map, theme, and others. Each of these will usually refer to a collection of one or more entities organized in some useful way which is specific to the GIS software in use. These terms will become important during the logical/physical database design activities where decisions about how the GIS data are to be stored in the GIS database are made. The conceptual database design activity is focused solely on specifying what is to be included in the GIS database and should provide clear and unambiguous representation of the entire GIS database.
In addition to a clear and concise definition of entities and their attributes, data modeling describes relationships between entities. An example of a relationship between an employee and a company would be "works for."
Employee - Works For - Company
Relationships may be bi-directional, thus:
Company - Has - Employees
An important aspect of a relationship is "cardinality," that is if the relationship is between only one of each entity or if either entity may be more than one. For example, one company usually has many employees whereas one employee works for only one company. The possible cardinalities are: one-to-one; one-to-many; and many-to-many. Thus:
Company (One) --- Has ---> (Many) Employees
Company (One) <--- Work For --- (Many) Employees
There are many variations of the notation used to express these facts. The notation recommended for local government will be described later.
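As a minimal sketch of cardinality, the company and employee example can be written in Python. The class names, field names, and the `hire` method are illustrative assumptions and are not part of the notation recommended in this guideline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Employee:
    name: str
    # "Works For" side: an employee works for at most one company.
    company: Optional["Company"] = field(default=None, repr=False, compare=False)


@dataclass
class Company:
    name: str
    # "Has" side: one company may have many employees.
    employees: List[Employee] = field(default_factory=list)

    def hire(self, employee: Employee) -> None:
        # Enforce the "one" end of the one-to-many relationship.
        if employee.company is not None:
            raise ValueError(f"{employee.name} already works for {employee.company.name}")
        employee.company = self
        self.employees.append(employee)


acme = Company("Acme")  # hypothetical sample data
for person in ("Ann", "Bob"):
    acme.hire(Employee(person))
```

A many-to-many relationship would instead need a list on both sides, or a separate structure holding the pairs.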
Geographic, or spatial data, differs from other "regular" data that are included in computer databases in how entities are defined and in the relationships between entities. Entity identification for spatial data includes the definition of a physical or abstract entity (e.g., a building) and the definition of a corresponding spatial entity (i.e., a polygon to represent the building footprint). This latter, or second entity does not exist for other types of computer databases. The existence of the corresponding spatial entity is one of the major factors that distinguishes GIS from other types of systems and is what makes it very important to utilize proper planning and design techniques when building a GIS. An example will be used to illustrate this difference.
To start the discussion of entity-relationship modeling, two examples will be shown: first a regular database and then a simple GIS database. The personnel database in any local government could have entities of employee, dependent and department. Relationships between these entities would be employee "works in" department and dependent "is a member of" the employee's family. Some of the attributes for each entity would be as follows:
Employee (name, age, sex, job title)
Dependent (name, age, relationship_to_employee, i.e., spouse, child, etc.)
Department (department_name, function, size)
An example of a simple spatial database would be as follows:
|Entity|Attributes|Corresponding spatial entity|
|Parcel|ID#, owner_name, owner_address, site_address|Polygon|
|Building|Building_name, height, floor_area|Footprint|
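The pairing of a regular entity with its corresponding spatial entity can be sketched in Python as follows. This is only an illustration: the class layout, the `area` helper, and the sample values are assumptions, and a real GIS package would store the geometry in its own structures.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class Polygon:
    # Ring of (x, y) vertices; the last vertex connects back to the first.
    ring: List[Coordinate]

    def area(self) -> float:
        # Shoelace formula for a simple (non-self-intersecting) polygon.
        total = 0.0
        for (x1, y1), (x2, y2) in zip(self.ring, self.ring[1:] + self.ring[:1]):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


@dataclass
class Parcel:
    parcel_id: str
    owner_name: str
    owner_address: str
    site_address: str
    shape: Polygon  # corresponding spatial entity


@dataclass
class Building:
    building_name: str
    height: float
    floor_area: float
    footprint: Polygon  # corresponding spatial entity


# Hypothetical sample record: a 40 x 30 rectangular parcel.
parcel = Parcel("P-1", "A. Owner", "1 Main St", "1 Main St",
                Polygon([(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)]))
```

The attributes live on the regular entity, while the polygon is carried as a single extra field, so the entity and its spatial representation are not two separate objects.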
This example has been presented using two standard notational forms for conceptual database design: a relation, the entity name followed by a list of attributes; and an entity-relationship diagram showing entities, their attributes, and the relationships between entities. There are two things to notice:
- The standard entity - relationship diagram has no provision for representing the corresponding spatial entity (point, line, polygon) of the data; and
- The representation of the attributes (ellipses) can be somewhat awkward due to different name lengths and the number of attributes to be shown.
The two notational forms, modified to accommodate GIS data, will be used as the primary tools for conceptual database design in this guideline. The next section provides the formal definition of the basic entity-relationship data modeling method and the modifications needed to represent GIS data, followed by examples of GIS data entities and attributes typical for local government and then by a description of how to model these data using the modified entity-relationship data modeling technique.
Basic Entity-Relationship Modeling
The basic entity-relationship modeling approach is based on describing data in terms of the three parts noted above (Chen 1976):
- Entities
- Relationships between entities
- Attributes of entities or relationships
Each component has a graphic symbol and there exists a set of rules for building a graph (i.e., an E-R model) of a database using the three basic symbols. Entities are represented as rectangles, relationships as diamonds and attributes as ellipses.
The normal relationships included in an E-R model are basically those of:
- Belonging to;
- Set and subset relationships;
- Parent-child relationships; and
- Component parts of an object.
The implementation rules for identifying entities, relationships, and attributes include an English language sentence structure analogy where the nouns in a descriptive sentence identify entities, verbs identify relationships, and adjectives identify attributes. These rules have been defined by Chen (1983) as follows:
Rule 1: A common noun (such as person, chair), in English corresponds to an entity type on an E-R diagram.
Rule 2: A transitive verb in English corresponds to a relationship type in an E-R diagram.
Rule 3: An adjective in English corresponds to an attribute of an entity in an E-R diagram.
English statement: Mr. Joe Jones resides in the Park Avenue Apartments which is located on land parcel #01-857-34 owned by the Apex Company.
Analysis: "Joe Jones," "Park Avenue Apartments," "land parcel" and "Apex Company" are nouns and therefore can be represented as entities "occupant," "building," "parcel," and "owner." "Resides," "located on" and "owned by" are transitive verbs (or verb phrases) and therefore define relationships.
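The result of this analysis can be recorded as a small set of entity-verb-entity statements. The Python sketch below is hand-coded from the example sentence (no language parsing is attempted), and the `Relationship` type is an illustrative assumption.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    subject: str  # entity type, from a noun (Rule 1)
    verb: str     # relationship type, from a transitive verb (Rule 2)
    obj: str      # entity type, from a noun (Rule 1)


relationships: List[Relationship] = [
    Relationship("occupant", "resides in", "building"),
    Relationship("building", "located on", "parcel"),
    Relationship("parcel", "owned by", "owner"),
]

# The entity list is every noun that takes part in a relationship,
# kept in the order in which it is first seen.
entities: List[str] = []
for rel in relationships:
    for name in (rel.subject, rel.obj):
        if name not in entities:
            entities.append(name)
```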
Example of Simple E-R diagrams
Many times it is possible to build different E-R diagrams for the same data. For example, instead of creating the entity "owner," the owner's name could be an attribute of parcel. During the process of building an E-R diagram (i.e., the conceptual model) for a database, the analyst must make decisions as to whether something is best represented as an entity or as an attribute of some other entity.
The process of constructing an E-R diagram uncovers many inconsistencies or contradictions in the definition of entities, relationships, and attributes. Many of these are resolved as the initial E-R diagram is constructed while others are resolved by performing a series of transformations on the diagram after its initial construction. The final E-R diagram should be totally free from definitional inconsistencies and contradictions. If properly constructed, an E-R diagram can be directly converted to the logical and physical database schema of the relational, hierarchical or network type database for implementation.
Unique Aspects of Geographic Data
In the simplest terms, we think of geographic data as existing on maps as points, lines and areas. Early GIS systems were designed to digitally encode these spatial objects and associate one or more feature codes with each spatial feature. Examples would be a map of land use polygons, a set of points showing well locations, a map of a stream shown as line segments. For the purposes of plotting (redrawing the map) a simple data structure linking (x,y) coordinates to a feature code was sufficient.
A distinguishing feature of a modern GIS is that some spatial relationships between spatial entities will be coded in the database. This coding is termed topological coding. Topology is based on graph theory, where a diagram can be expressed as a set of nodes and links in a manner that shows logical relationships. Applied to a map, this concept is used to abstract the features shown on the map and to represent these features as nodes and arcs (points and lines). Nodes are the end-points of arcs and areas are formed by a set of arcs. If the concept and definitions of topologic data structures are not familiar to the reader, the following readings are recommended:
- Geographic Information Systems: A Guide to the Technology, by John Antenucci, et al., pages 98-99.
- Fundamentals of Spatial Information Systems, by Robert Laurini and Derek Thompson, pages 206-211.
- ARC/INFO Data Model, Concepts, & Key Terms, by Environmental Systems Research Institute, Inc., pages 1-12 to 1-15.
Coordinate strings with associated feature codes but without topology were called "spaghetti" files because no relationship between any two coordinate strings was formally encoded in the database. For example, the "GIS system" would not "know" whether two lines intersect or whether they have common end points. These relationships could be seen by the human eye if a plot were made, or alternatively could be calculated (often a time consuming process). Typical of this type of geographic data file are those produced by computer-aided drafting (CAD) systems, known as .dxf, .dwg, or .dgn files.
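The difference between spaghetti strings and topologically coded data can be illustrated with a minimal Python sketch. The street names and coordinates are made up, the function names are assumptions, and end points are assumed to match exactly (a real system would have to snap nearly coincident points).

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Coordinate = Tuple[float, float]

# "Spaghetti": independent coordinate strings, each tagged with a feature code.
spaghetti: Dict[str, List[Coordinate]] = {
    "street_a": [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],
    "street_b": [(10.0, 0.0), (10.0, 8.0)],
    "street_c": [(20.0, 20.0), (25.0, 20.0)],
}


def build_node_index(lines: Dict[str, List[Coordinate]]) -> Dict[Coordinate, Set[str]]:
    """Map each end-point (node) to the arcs that start or end there."""
    node_to_arcs: Dict[Coordinate, Set[str]] = defaultdict(set)
    for arc_id, coords in lines.items():
        node_to_arcs[coords[0]].add(arc_id)
        node_to_arcs[coords[-1]].add(arc_id)
    return node_to_arcs


def connected_arcs(node_to_arcs: Dict[Coordinate, Set[str]],
                   lines: Dict[str, List[Coordinate]], arc_id: str) -> Set[str]:
    """Arcs sharing an end-point with arc_id, looked up rather than recalculated."""
    coords = lines[arc_id]
    neighbours: Set[str] = set()
    for node in (coords[0], coords[-1]):
        neighbours |= node_to_arcs[node]
    neighbours.discard(arc_id)
    return neighbours


nodes = build_node_index(spaghetti)
```

Once the node index exists, a question such as "which streets meet street_a?" is answered by lookup, which is the practical benefit of storing the relationship in the database.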
The data models in most contemporary GISs are still based on the cartographic view. Other data models have begun to evolve, but are still very limited. Current and potential geographic data models include:
- The cartographic data model: points, lines and polygons (topologically encoded) with one, or only a few, attached attributes, such as a land use layer represented as polygons with associated land use code;
- Extended attribute geographic data model: geometric objects as above but with many attributes, such as census tract data sets;
- Conceptual object/spatial data model: explicit recognition of user defined objects, zero or more associated spatial objects, and sets of attributes for each defined object (example: user objects of land parcel, building, and occupant, each having its own set of attributes but with different associated spatial objects: polygon for land parcel, footprint for building, and no spatial object for occupant);
- Conceptual objects/complex spatial objects: multiple objects and multiple associated spatial objects (example: a street network with street segments having spatial representations of both line and polygon type and street intersections having spatial representations of both point and polygon type).
Current GIS are based on the cartographic and extended attribute data models. The trend to object-oriented computer systems and databases will require that GIS planners view their databases from an "object viewpoint."
GISs also differ from other systems in that they include spatial relationships. These relationships are included in the GIS either by the topologic coding or by means of calculations based on the (x,y) coordinates. One common calculation is whether or not two lines intersect. Figure 7 shows the spatial relationships, associated descriptive verbs, and the common implementation of each relationship by a GIS.
Connectivity and contiguity are implemented through topology: the link-node structure for connectivity through networks and the arc-polygon structure for contiguity. Containment and proximity are implemented through x,y coordinates and related spatial operations: containment is determined using the point-, line-, and polygon-on-polygon overlay spatial operation and proximity is determined by calculating the coordinate distance between two or more x,y coordinate locations. The spatial relationship of coincidence may be complete coincidence or partial coincidence. The polygon-on-polygon overlay operation in ARC/INFO calculates partial coincidence of polygons in two different coverages. The System 9 Geographic Information System recognizes coincident features through a "shared primitive" concept (the geometry of a point or line is stored only once and then referenced by all features sharing that piece of geometry). Future versions of commercial GISs will likely implement coincident features through either the "belonging to" database relationship or through x,y coordinates and related spatial operations, whichever is more efficient within the particular GIS.
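Containment and proximity, the two relationships calculated from x,y coordinates, can be sketched in Python as follows. This is a simplified illustration, not the method of any particular GIS package: the function names and sample coordinates are assumptions, and the ray-casting test does not treat points lying exactly on the boundary specially.

```python
import math
from typing import List, Tuple

Coordinate = Tuple[float, float]


def distance(a: Coordinate, b: Coordinate) -> float:
    """Proximity: straight-line distance between two x,y locations."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def point_in_polygon(point: Coordinate, ring: List[Coordinate]) -> bool:
    """Containment: ray-casting test against a simple polygon ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does a horizontal ray running right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical sample data: a rectangular parcel and two point entities.
parcel_ring = [(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)]
well = (10.0, 10.0)
hydrant = (50.0, 10.0)
```

Neither result is stored in the database; each exists only after the calculation is made on the coordinates.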
In summary, there are three types of relationships that will be represented in a geographic database with an "object view" orientation:
- Normal database relationships, which are represented in a relational database by means of keys (primary and secondary)
- Spatial relationships represented in the GIS portion of the database by topology
- Spatial relationships that exist only after a calculation is made on the (x,y) coordinates
Modeling a geographic database using the E-R approach requires an expanded or extended concept for:
- Entity identification and definition; and
- Relationship types and alternate representational forms for spatial relationships.
There are three considerations in the identification and definition of entities in a geographic database:
Correct identification and definition of entities
Entities in a geographic database are defined as either discrete objects (e.g., a building, a bridge, a household, a business, etc.) or as an abstract object defined in terms of the space it occupies (e.g., a land parcel, a timber stand, a wetland, a soil type, a contour, etc.). In each of these cases we are dealing with entities in the sense of "things" which will have attributes and which will have spatial relationships between themselves. These "things" can be thought of as "regular" entities.
Defining a corresponding spatial entity for each "regular" entity
A corresponding spatial entity will be one of the spatial data types normally handled in a GIS, e.g., a point, line, area, volumetric unit, etc. The important distinction here is that we have a single entity, its spatial representation and a set of attributes; we do not have two separate objects. A limited and simple set of spatial entities may be used, or alternatively, depending on the anticipated complexity of the implemented geographic information system, an expanded set of spatial entities may be appropriate. The corresponding spatial entity for the regular entity may be implied in the definition of the regular entity, such as abstract entities like a wetland where the spatial entity would normally be a polygon, or a contour where the spatial entity would be a line. Other regular entities may have a less obvious corresponding spatial entity. Depending on the GIS requirements, the cartographic display needs, the implicit map scale of the database and other factors, an entity may be reasonably represented by one of several corresponding spatial entities. For example, a city in a small-scale database could have a point as its corresponding spatial entity, while the same city would have a polygon as its corresponding spatial entity in a large-scale geographic database.
Recognize multiple instances of geographic entities, both multiple spatial instances and multiple temporal instances
Multipurpose (or corporate) geographic databases may need to accommodate multiple corresponding spatial entities for some of the regular entities included in the GIS. For example, the representation of an urban street system may require that each street segment (the length of street between two intersecting streets) be held in the GIS as both a single-line street network to support address geocoding, network based transportation modeling, etc., and as a double-line (or polygon) street segment for cartographic display, or to be able to locate other entities within the street segment (such as a water line), etc. In each of these instances the "regular" entity is the street segment, although each instance may have a different set of attributes and different corresponding spatial entities. Also, there may be a need to explicitly recognize multiple temporal instances of regular entities. The simple case of multiple temporal instances will be where the corresponding spatial entity remains the same. However, future GISs will, in all likelihood, have to deal with multiple temporal instances where the corresponding spatial entity changes over time.
Three symbols are defined to represent entities: entity (simple); entity (multiple spatial representations); and entity (multiple time periods). The internal structure of the entity symbol contains the name of the entity and additional information indicating the corresponding spatial entity (point, line or polygon), a code indicating topology, and a code indicating encoding of the spatial entity by coordinates. The coordinate code is, at the present time, redundant in that all contemporary GISs represent spatial entities with x,y coordinates. However, it is possible that future geographic databases may include spatial entities where coordinates are not needed. Similarly, topological encoding is normally of only one type and can, for the present, be indicated by a simple code. However, different spatial topologies have been defined and may require different implementations in a GIS (Armstrong and Densham, 1990). In the future, the topology code may be expanded to represent a specific topologic structure particular to a GIS application.
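The information carried inside an entity symbol can be sketched as a small Python record. The type names are assumptions, and the consistency rule in `__post_init__` is an assumed reading of the three symbols rather than a rule stated in this guideline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SpatialType(Enum):
    POINT = "point"
    LINE = "line"
    POLYGON = "polygon"


class EntityKind(Enum):
    SIMPLE = "simple"
    MULTIPLE_SPATIAL = "multiple spatial representations"
    MULTIPLE_TIME = "multiple time periods"


@dataclass(frozen=True)
class EntitySymbol:
    name: str
    kind: EntityKind
    spatial_types: Tuple[SpatialType, ...]  # corresponding spatial entities
    topology: bool                          # topology code
    coordinates: bool = True                # coordinate code (currently always set)

    def __post_init__(self) -> None:
        # Assumed rule: only the "multiple spatial representations" symbol
        # may carry more than one corresponding spatial entity.
        if self.kind is not EntityKind.MULTIPLE_SPATIAL and len(self.spatial_types) > 1:
            raise ValueError(f"{self.name}: {self.kind.value} entity has several spatial types")


parcel = EntitySymbol("parcel", EntityKind.SIMPLE, (SpatialType.POLYGON,), topology=True)
street = EntitySymbol("street segment", EntityKind.MULTIPLE_SPATIAL,
                      (SpatialType.LINE, SpatialType.POLYGON), topology=True)
```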
Modeling Spatial Relationships
The spatial relationships are defined by three relationship symbols. The traditional diamond symbol can be used for normal database relationships. An elongated hexagon and a double elongated hexagon are defined to represent spatial relationships. The elongated hexagon represents spatial relationships defined through topology (connectivity and contiguity) and the double elongated hexagon represents spatial relationships defined through x,y coordinates and related spatial operations (coincidence, containment and proximity). The appropriate "verbs" to include in the hexagonal symbols are the descriptors of the spatial relationships. The spatial operation will be implicitly defined by the relationship symbol (double hexagon), the spatial entity and the topology code. For example, a spatial relationship named "coincident" between entities named "wetlands" and "soils," both of which carry topologic codes and x,y coordinates, indicates the spatial operation of topological overlay. If this does not sufficiently define the spatial operation needed, the name of the spatial operation can be used to describe the relationship, such as shortest path, point-in-polygon, radial search, etc.
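The choice of symbol described above amounts to a simple lookup, sketched here in Python. The enum and function names are illustrative assumptions; the groupings of relationships follow the text.

```python
from enum import Enum


class Symbol(Enum):
    DIAMOND = "diamond"
    HEXAGON = "elongated hexagon"
    DOUBLE_HEXAGON = "double elongated hexagon"


# Spatial relationships implemented through topology.
TOPOLOGICAL = {"connectivity", "contiguity"}
# Spatial relationships implemented through x,y coordinates and spatial operations.
COORDINATE_BASED = {"coincidence", "containment", "proximity"}


def symbol_for(relationship: str) -> Symbol:
    """Pick the E-R symbol for a relationship type."""
    name = relationship.lower()
    if name in TOPOLOGICAL:
        return Symbol.HEXAGON
    if name in COORDINATE_BASED:
        return Symbol.DOUBLE_HEXAGON
    # Anything else is treated as a normal database relationship.
    return Symbol.DIAMOND
```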
The information needed to develop the E-R diagram representing the spatial data model comes from the Needs Assessment activity as:
- The GIS application descriptions
- The master data list: lists entities, corresponding spatial entities and attributes
- The list of functional capabilities (spatial operations)
The process of building the E-R diagram involves taking entities from the master data list one at a time and placing each one on the diagram. For each new entity, any relationship to any previously entered entity should be entered. Relationships are found by examining the Application Descriptions and determining if the GIS processes require a specified operation. For example, if an Application Description indicated that land parcels needed to be compared to a flood plain area, then a spatial relationship of "coincident area" (or topological overlay operation) should be defined between the two entities.
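The one-at-a-time procedure can be sketched in Python. The entity names, the `required` lookup (standing in for what the analyst reads out of the Application Descriptions), and the function name are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Relationship = Tuple[str, str, str]  # (entity, verb, entity)

# Entities in the order they are taken from the master data list.
master_data_list: List[str] = ["parcel", "flood plain", "building"]

# Relationships implied by the Application Descriptions, keyed by entity pair.
required: Dict[FrozenSet[str], Relationship] = {
    frozenset({"parcel", "flood plain"}): ("parcel", "coincident area", "flood plain"),
    frozenset({"parcel", "building"}): ("building", "located on", "parcel"),
}


def build_er_diagram(entities: List[str],
                     required: Dict[FrozenSet[str], Relationship]
                     ) -> Tuple[List[str], List[Relationship]]:
    placed: List[str] = []
    relationships: List[Relationship] = []
    for entity in entities:
        # Check the new entity against every previously entered entity.
        for earlier in placed:
            rel = required.get(frozenset({entity, earlier}))
            if rel is not None:
                relationships.append(rel)
        placed.append(entity)
    return placed, relationships


placed, relationships = build_er_diagram(master_data_list, required)
```

Because each new entity is compared only with entities already on the diagram, every pair is examined exactly once.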
As each entity is added to the E-R diagram, the list of attributes should be reviewed and checked to determine if the attribute is appropriate for the entity, does not duplicate any other attribute or entity, and can be rigorously defined for entry to create the metadata (metadata is discussed in the next section of this guideline).
The E-R diagram will be used to verify with the expected users the data content of the GIS and, by additional reference to the GIS needs analysis, the required spatial operations. Once verified by the users, the E-R representation can be mapped into a detailed database design (as will be described in the Database Planning and Design Guideline) where:
- Each entity and its attributes map into:
  - One or more relational tables with appropriate primary and secondary keys (this assumes the desired level of normalization has been obtained);
  - The corresponding spatial entity for the "regular" entity. As most commercial GISs rely on fixed structures for the representation of geometric coordinates and topology, this step is simply reduced to ensuring that each corresponding spatial entity can be handled by the selected GIS package;
- Each relationship maps into:
  - Regular relationships (diamond) executed by the relational database system's normal query structure. Again, appropriate keys and normalization are required for this mapping.
  - Spatial relationships implemented through spatial operations in the GIS. The functionality of each spatial relationship needs to be described, and if not a standard operation of the selected GIS, specifications for the indicated operation need to be written.
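The mapping of entities and a regular relationship into relational tables can be sketched with Python's built-in sqlite3 module. The table layout, column names, and sample rows are assumptions made for illustration; a foreign key stands in here for the secondary key, and the geometry is left to the GIS package.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcel (
    parcel_id    TEXT PRIMARY KEY,
    owner_name   TEXT,
    site_address TEXT
);
CREATE TABLE building (
    building_id   INTEGER PRIMARY KEY,
    building_name TEXT,
    height        REAL,
    -- key that carries the regular "located on" relationship
    parcel_id     TEXT REFERENCES parcel(parcel_id)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO parcel VALUES (?, ?, ?)",
             ("P-1", "Apex Company", "1 Park Avenue"))
conn.execute("INSERT INTO building (building_name, height, parcel_id) VALUES (?, ?, ?)",
             ("Park Avenue Apartments", 30.0, "P-1"))

# The regular relationship is executed by a normal join on the keys.
rows = conn.execute("""
    SELECT b.building_name, p.owner_name
    FROM building AS b
    JOIN parcel AS p ON p.parcel_id = b.parcel_id
""").fetchall()
```

A spatial relationship such as "coincident area" would not appear in these tables at all; it would be carried out by a spatial operation in the GIS.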
Spatial data standards cover a variety of topics including the definition of spatial data entities (including a formal data model), methods of representation of the spatial entities in a GIS, specifications for the transfer of spatial data between different organizations, and the definition of the attributes of the spatial entities and the values these attributes may assume. Metadata is "information about data," and should describe the characteristics of the data such as identifying entities and attributes by their standard names and provide information on such items as data accuracy, data sources and lineage, and data archiving provisions.
Much of the work on spatial data standards to date has been done under the auspices of the Federal Geographic Data Committee and only concerns federal spatial data directly. The relationship between the existing federal data standards and state and local spatial data standards has yet to be developed. Appendix A contains a list of current and pending reports on federal spatial data standards. Work towards New York State spatial data standards will be conducted under the proposed GIS Standing Committee of the Information Resources Management Task Force.
Metadata for Local Governments in New York State
Metadata can serve many purposes. Some of the more important functions of metadata are:
- Provide a basic description of a data set
- Provide information for data transfers to facilitate data sharing
- Provide information for entries into clearinghouses to catalogue the availability of data
The metadata structure and content for local government recommended in this guideline has been prepared according to the following criteria:
- The metadata must first, and primarily, serve as a documentation and data management tool for the data administrator in an agency or department
- Secondly, the metadata must support the data manager and records management officer in a local agency in all aspects of data management, including data definition, source documentation, management and updating, and data archiving and retention requirements.
- Thirdly, the metadata information must be able to generate and supply database descriptions for spatial data clearinghouses such as the prototype New York State Spatial Data Clearinghouse developed under the GIS Demonstration Project conducted by the Center for Technology in Government, SUNY - Albany and any relevant federal spatial data clearinghouses.
The following metadata information is a prototype for a New York State Local Government Spatial Metadata Standard. This metadata is represented in a set of tables listed below and has been implemented in Microsoft Access. A working copy of this metadata program is available to all local governments in the state. The structure and information on how to use the software are described in a user's guide available with the program. The content of the metadata tables is as shown in the following lists.
- Organization Information
- Name Of Organization
- Room/Suite #
- Number And Street Names
- Zip Code
- Phone Number
- Fax Number
- Contact Person
- Phone Number/Extension
- Email Address
- Organization Internet Address
- Reference Information
- File Format
- File Internet Address
- Metadata Created By
- Date Metadata Created
- Metadata Updated By
- Date Metadata Updated
- Metadata Standard Name
- Object/File Name Information
- Data Object Name
- Data Object Information
- Distribution Filename (Same as Filename in Reference Information)
- Data Object Name
- Data Object Description
- Spatial Object Type
- Attribute Information
- Data Object Name
- Data Attribute Name
- Attribute Description
- Attribute Filename
- Code set Name/Description
- Measurement Units
- Accuracy Description
- Data Dictionary Information
- Data Object Name
- Data Attribute Name
- Data Type
- Field Length
- Spatial Object Information
- Data Object Name
- Spatial Object Type
- Place Name
- Projection Name/Description
- HCS Name
- HCS Datum
- HCS X-Offset
- HCS Y-Offset
- HCS Xmin
- HCS Xmax
- HCS Ymin
- HCS Ymax
- HCS Units
- HCS Accuracy Description
- VCS Name
- VCS Datum
- VCS Zmin
- VCS Zmax
- VCS Units
- VCS Accuracy Description
- Source document information
- Data Object Name
- Spatial Object Type
- Source Document Name
- Date Document Created
- Date Digitized/Scanned
- Digitizing/Scanning Method Description
- Accuracy Description
- Lineage Information
- Data Object Name
- Data Object 1
- Data Object 2
- Description of Spatial Operation and Parameters
- Accuracy Description
- Update Information
- Data Object Name
- Update Frequency
- Updated By
- Archive Information
- Data Object Name
- Retention Class
- Retention Period
- Data Archived
- Archived By
- Date to be Destroyed
- Source Documents
- Source Document Name
- Source Document ID#
- Source Organization
- Type of Document
- Number of Sheets (map, photo)
- Source Material (paper, mylar)
- Projection Name
- Coordinate System
- Date Created
- Last Updated
- Control/Accuracy (map, photo)
- Reviewed by
- Review date
- Spatial extent
- File format
- Entities Contained in Source
- Source ID#
- Entity Name
- Spatial Entity
- Estimated Volume of Spatial Entity
- Accuracy Description of Spatial Entity
- Reviewed By
- Review Date
- Scrub Needed (yes/no)
- Attributes by Entity
- Source ID#
- Entity Name
- Attribute Description
- Code Set Name
- Accuracy Description of Attribute
- Reviewed By
- Review Date
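The field lists above can be read as record layouts. As a minimal sketch, assuming Python dataclasses as the container and taking HCS to mean the horizontal coordinate system (an interpretation of the acronym, which this section does not define), the following models a subset of the Spatial Object Information fields and the Data Dictionary Information fields. The class names, attribute names, value vocabularies, the two checking methods, and all sample values are illustrative assumptions and are not prescribed by the guide.

```python
from dataclasses import dataclass


@dataclass
class SpatialObjectInfo:
    """A subset of the Spatial Object Information fields (attribute names are illustrative)."""
    data_object_name: str
    spatial_object_type: str  # assumed vocabulary, e.g. "point", "line", "polygon"
    hcs_name: str
    hcs_datum: str
    hcs_xmin: float
    hcs_xmax: float
    hcs_ymin: float
    hcs_ymax: float
    hcs_units: str

    def extent_is_valid(self) -> bool:
        """Return True when each minimum does not exceed its maximum."""
        return self.hcs_xmin <= self.hcs_xmax and self.hcs_ymin <= self.hcs_ymax


@dataclass
class DataDictionaryEntry:
    """The four Data Dictionary Information fields."""
    data_object_name: str
    data_attribute_name: str
    data_type: str  # assumed vocabulary, e.g. "text", "integer"
    field_length: int

    def fits(self, value: object) -> bool:
        """Return True when the value's text form fits within field_length characters."""
        return len(str(value)) <= self.field_length


# Hypothetical sample values, invented for illustration only.
parcels = SpatialObjectInfo(
    data_object_name="parcels",
    spatial_object_type="polygon",
    hcs_name="State Plane",
    hcs_datum="NAD83",
    hcs_xmin=400000.0, hcs_xmax=450000.0,
    hcs_ymin=1000000.0, hcs_ymax=1060000.0,
    hcs_units="feet",
)
parcel_id = DataDictionaryEntry("parcels", "parcel_id", "text", 12)

print(parcels.extent_is_valid())
print(parcel_id.fits("123-45-678"))
print(parcel_id.fits("123-45-678-90123"))
```

Keeping each metadata group as its own record type mirrors the grouping in the lists, and the shared Data Object Name field is what would link the groups to one another.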
(The following material is quite technical, but it is a good set of sources on conceptual database design.)
Armstrong, M.P. and P.J. Densham, 1990, "Database Organization Strategies for Spatial Decision Support Systems," International Journal of Geographical Information Systems, vol. 4, no. 1, pp. 3-20.
Calkins, H.W., 1996, "Entity Relationship Modeling of Spatial Data for Geographic Information Systems," International Journal of Geographical Information Systems, January 1996.
Chen, P.P., 1976, "The Entity-Relationship Model - Toward a Unified View of Data," ACM Transactions on Database Systems, vol. 1, no. 1, March 1976, pp. 9-36.
Chen, P.P., 1984, "English Sentence Structure and Entity-Relationship Diagrams," Information Sciences, vol. 29, pp. 127-149.
Davis, C., et al., eds., 1983, Entity-Relationship Approach to Software Engineering, Amsterdam, Netherlands: Elsevier Science Publishers B.V.
Elmasri, R. and S.B. Navathe, 1989, Fundamentals of Database Systems, Redwood City, California: The Benjamin/Cummings Publishing Company, Inc.
Jajodia, S. and P. Ng, 1983, "On Representation of Relational Structures by Entity-Relationship Diagrams," in Entity-Relationship Approach to Software Engineering, P. Ng and R. Yeh (eds.), Amsterdam, Netherlands: Elsevier Science Publishers B.V., pp. 249-263.
Liskov, B. and S. Zilles, 1977, "An Introduction to Formal Specifications of Data Abstractions," in Current Trends in Programming Methodology, Vol. 1: Software Specification and Design, R.T. Yeh (ed.), Prentice Hall, pp. 1-32.
Loucopoulos, P. and R. Zicari, 1992, Conceptual Modeling, Databases, and CASE: An Integrated View of Information Systems Development, New York: John Wiley & Sons, Inc.
Teorey, T.J. and J.P. Fry, 1982, Design of Database Structures, Englewood Cliffs, NJ: Prentice-Hall, Inc.
Ullman, J.D., 1988, Principles of Database and Knowledge-Base Systems, 2 vols., Rockville, Maryland: Computer Science Press, Inc.
Developing Standards for Spatial Data and Metadata
Spatial data standards are needed in order to facilitate the exchange of spatial data between geographic information systems. We refer to data as "spatial" because the common factor is a geographic reference (a reference in space) which allows the data to be accessed through a GIS. In order to accomplish the goal of facilitating data exchange, spatial data standards should provide:
- Definitions of terms for spatial objects or features included in GIS;
- A structure (or format) for the exchange of spatial data;
- A method for describing the accuracy and lineage of the data; and
- The definition of metadata (the data that describes the spatial data).
The primary purpose of spatial data standards is to facilitate data sharing and exchange; thus the focus is only on data issues. The Council concluded that it is not necessary to develop standards for GIS hardware or software at this time, as these standards are expected to evolve from groups such as the Open GIS Consortium, a non-profit trade association formed to implement the Open Geodata Interoperability Specification.
The Current Status of Standards
At present, spatial data standards exist only at the Federal government level. Under the Federal Geographic Data Committee, three standards documents have been prepared:
The Spatial Data Transfer Standard (SDTS - FIPS 173)
This standard defines a method for the exchange of spatial data between different GIS software systems. It also contains definitions of terms for the spatial objects of interest to Federal government agencies.
Content Standards for Digital Geospatial Metadata (proposed)
This standard defines the content for digital geospatial metadata, the information about spatial data that would be entered into a clearinghouse or repository to form a catalog of spatial data available to other users.
Cadastral Standards for the National Spatial Data Infrastructure (draft)
This is a draft standard for cadastral (land ownership) data, one of twelve theme standards documents under preparation.
The Federal Geographic Data Committee has also established a National Spatial Data Infrastructure (NSDI) for the purpose of coordinating geographic data acquisition and access. The mechanism for this will be a National Spatial Data Clearinghouse, a distributed network of geospatial data producers, managers, and users linked electronically. It is envisioned that this network of clearinghouses would contain information about available spatial data. Potential users would search the clearinghouse to find data of interest, read the metadata for a description of that data, and then acquire the data from the distributing agency. Spatial data may be deposited directly with a clearinghouse or retained by the originator.
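The clearinghouse workflow described above (search the catalog, read the metadata, then request the data from its distributor) can be sketched in a few lines. This is only an illustration under stated assumptions: the clearinghouse is modeled as an in-memory list, each metadata record carries a theme and a bounding box, and the record fields, function names, and sample entries are all hypothetical rather than part of any FGDC specification.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Bounding box as (xmin, ymin, xmax, ymax); this layout is an assumption.
BBox = Tuple[float, float, float, float]


@dataclass
class MetadataRecord:
    """A hypothetical catalog entry describing one available data set."""
    title: str
    theme: str
    bbox: BBox
    distributor: str  # the agency that holds and distributes the data


def boxes_overlap(a: BBox, b: BBox) -> bool:
    """Return True when two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog: List[MetadataRecord], theme: str, area: BBox) -> List[MetadataRecord]:
    """Return the records of the given theme whose extent overlaps the area."""
    return [r for r in catalog if r.theme == theme and boxes_overlap(r.bbox, area)]


# Invented sample entries, for illustration only.
catalog = [
    MetadataRecord("County tax parcels", "cadastral",
                   (-79.0, 42.0, -78.0, 43.0), "County Real Property Office"),
    MetadataRecord("State wetlands inventory", "wetlands",
                   (-80.0, 40.5, -71.8, 45.1), "State Environmental Agency"),
    MetadataRecord("Town tax parcels", "cadastral",
                   (-74.5, 41.0, -74.0, 41.5), "Town Assessor"),
]

hits = search(catalog, "cadastral", (-78.9, 42.5, -78.5, 42.9))
for record in hits:
    print(f"{record.title}: request data from {record.distributor}")
```

The catalog holds only metadata, which matches the arrangement in which the data itself may stay with the originator: a search result tells the user whom to contact, not where the data file is stored.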
The Federal effort towards standards development started in 1981, and the National Spatial Data Infrastructure and Federal spatial data standards are still evolving at this time. The remaining subject area (theme) standards reports are scheduled for release during the spring of 1996 (the themes are: base cartographic, bathymetric, cultural and demographic, geodetic, geologic, ground transportation, international boundaries, soils, vegetation, water, and wetlands). The table below shows the current status of Federal spatial data standards development.
The Federal geospatial data standards are implemented through Executive Order 12906, signed by the President on April 11, 1994. The FGDC is directed to "...seek to involve State, local, and tribal governments in the development and implementation of the initiatives contained in this order." The Order provides that:
"Federal agencies collecting or producing geospatial data, either directly or indirectly (e.g. through grants, partnerships, or contracts with other entities) shall ensure, prior to obligating funds for such activities, that data will be collected in a manner that meets all relevant standards adopted through the FGDC process."
Status of Federal Geographic Data Committee Standards
| Currently in development: | Completed public review: |
| National Spatial Data Accuracy Standard | Cadastral Content Standard |
| Standards for Digital Orthoimagery | Federal Domain of Values for Data Content Standard |
| Draft Standards for Digital Elevation Data | Cadastral Collection Standard (Cadastral) |
| Hydrographic and Bathymetric Accuracy Standard | Clearinghouse Metadata Profile (Cadastral) |
| Standards for Geodetic Control Networks | Classification of Wetlands and Deepwater Habitats of the United States |
| Transportation Network Profile for Spatial Data Transfer Standard | |
| Transportation-related Spatial Feature Dictionary | |
| Soils Data Transfer Standard | |
| Vegetation Classification Standards | |
| River Reach Standards and Spatial Feature | |
| Facility ID Code | |
| Content Standard for Cultural and Demographic Data | |
Source: Federal Geographic Data Committee Newsletter, November 1995.