3.04.01 Types of Data Used in a GIS
3.04.02 Data Preparation
3.04.03 Data Management
3.04.04 Legal Implications on Data Capturing and Storing
3.04.01 Types of Data Used in a GIS

Although the two terms, data and information, are often used interchangeably, they mean two different things. Data can be described as observations which are collected and stored. Information is processed data which is useful in answering queries or solving a problem.

“Analogue data,” “paper version” or “hard copy” are terms often used to denote any document or dataset produced on paper, while “digital data” or “soft copy” refer to files processed by GIS software in the computer. The result of the computer-manipulated data can be transformed into a paper format, such as the printout of a map.

Geographic data are inherently a form of spatial data organized in a geographic database. This database can be considered a collection of spatially referenced data that acts as a model of reality. The geographic database has two important components: the geographic position of a feature and its attributes or properties. In other words, spatial data (where is it?) and attribute data (what is it?).

Spatial data can be represented using two fundamental approaches, raster and vector.
Some basic properties of raster and vector data are as follows:
Comparison of Raster and Vector Data
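To make the distinction concrete, the following is a minimal sketch, not taken from the GIS Cookbook, of how the same feature (a school) might be held as vector data and as raster data. The coordinates, cell size, grid extent and attribute values are invented for illustration.

```python
# Hypothetical example: one school stored as vector data and as raster data.
# All coordinates, the cell size and the grid size are invented values.

# Vector: the feature is an explicit coordinate pair plus its attributes.
school_vector = {
    "geometry": {"type": "Point", "coordinates": (250.0, 150.0)},  # x, y in metres
    "attributes": {"name": "Example Elementary School", "classrooms": 12},
}

# Raster: space is divided into equal cells and each cell holds one value.
CELL_SIZE = 100.0      # metres per cell side
N_ROWS, N_COLS = 4, 5  # grid covers 500 m x 400 m, origin at lower-left (0, 0)


def point_to_cell(x, y):
    """Return (row, col) of the cell containing (x, y); row 0 is the top row."""
    col = int(x // CELL_SIZE)
    row = N_ROWS - 1 - int(y // CELL_SIZE)
    return row, col


# 0 = no school in the cell, 1 = a school falls inside the cell
school_raster = [[0] * N_COLS for _ in range(N_ROWS)]
x, y = school_vector["geometry"]["coordinates"]
row, col = point_to_cell(x, y)
school_raster[row][col] = 1

for line in school_raster:
    print(line)
```

In this sketch the vector record keeps the exact position and the attributes of the school, while the raster grid only records that a school lies somewhere within one 100 m cell.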
3.04.02 Data Preparation

Search for Data

Putting data into a GIS takes time. The process can be slow and laborious, and time equals money. Every year someone promises that next year there will be a faster, more intelligent scanning system that gets data into the system much more easily. Things are indeed getting better, and more and more data is becoming available in digital form, but building the database still typically represents 80% of the costs during the first five years of establishing a GIS. This is real expenditure, and it is where much of the time spent on a GIS will go.
Map Accuracy and Level of Acceptance
In preparing the CLUPs using GI Technology, secondary source data will be used. The LGU planner must rely on data captured by a national agency (e.g. geologic map, soil map, erosion map, flooding map, etc.). The source data will most likely be in paper format and produced using manual methods, scales may vary, and little is known about the accuracy (little metadata is attached).
Chapter 4 in the Toolbox provides metadata specifications for some of the data, but a lot more needs to be done to assist the planner. To be useful in a CLUP GIS database, the source maps must be transformed into digital layers. However, converting data from paper to digital format only changes the format; it does not improve the accuracy of the source. How much error (from the source, and from scanning and georeferencing) is acceptable? The answer depends on how much accuracy the secondary source can provide. If the accuracy of a secondary source is not known, the data can be compared with other secondary sources that contain comparable features. However, one must be cautious in comparing data. Most secondary source data produced manually will contain many errors. There may also be secondary sources that were produced digitally, such as orthophotos and GPS surveys. These sources would have greater accuracy than other secondary sources and will have to be evaluated differently.

Lessons Learned
The spatial data, especially the data for the Base Map:
A planner should not be expected to assess whether ‘technicalities regarding the cartography’ are properly set from the beginning. There should be enough guarantees for the planner that the data meets a workable standard, so that the planner can focus on the professional task, which is the actual planning and the preparation of the CLUP.

Metadata
A common perception of GIS data is that it consists of two parts: spatial data (coordinates and topology) and attribute data (descriptive information). However, without proper documentation, GIS data remains incomplete. It is thus equally important that GIS data also includes a metadata component. Metadata creation is typically considered to be an obligation of the data producer. The data user needs metadata to determine whether or not a particular dataset exists, and to decide whether or not the data is appropriate for use. Proper metadata should describe the who, what, when, where, why and how regarding all aspects of a GIS dataset.

The use or creation of metadata is often ignored or avoided. However, with the rise in use of digital data, the advantage of including metadata for datasets is increasingly recognized. Whereas cartographers rigidly provided metadata within a paper map’s legend, the evolution of computers and GIS has seen a decline in this practice. As organizations start to recognize the value of this ancillary information, they often begin to look at incorporating metadata collection within the data management process.

Metadata helps people who use geospatial data find the data they need and determine the best way to use it. Metadata benefits the data-producing organization as well. As personnel change in an organization, undocumented data may lose its value. New staff may have little understanding of the contents and uses of a digital database and may find they cannot trust results generated from the data. Lack of knowledge about other organizations' data can lead to duplication of effort. It may seem burdensome to add the cost of generating metadata to the cost of data collection, but in the long run the value of the data depends on its documentation.
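The who, what, when, where, why and how of a dataset can be captured in a simple record and checked for gaps. The following is a minimal sketch under stated assumptions: the field names and the sample values are hypothetical and do not reproduce the metadata specifications of the GIS Cookbook.

```python
# Hypothetical metadata record; the field names and values are illustrative
# only and do not reproduce the GIS Cookbook metadata specification.
REQUIRED_FIELDS = ("who", "what", "when", "where", "why", "how")


def missing_metadata(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


soil_map_metadata = {
    "who": "National agency that produced the source map",
    "what": "Soil map, scanned and georeferenced from a paper original",
    "when": "Year of the source survey and date of digitizing",
    "where": "Extent of the municipality",
    "why": "Input layer for the CLUP GIS",
    "how": "",  # capture method and source scale not yet documented
}

print(missing_metadata(soil_map_metadata))  # ['how']
```

A check of this kind could be run before a dataset is accepted into the database, so that undocumented layers are flagged while the producer can still be asked.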
In the GIS Cookbook there are Metadata Specifications and Standards for the attributes as well as the spatial datasets.

What Are Standards and Why Use Them?
Paper Maps Mean Conceptual Standards As Well

The printed map, in itself, represents a standardized way of describing geographic information. With our knowledge, experience and intuition we understand the meaning, image and properties of, for example, a road drawn with a certain symbol. This works well as long as we deal with a single map category. The problem is that the important aspects can easily drown in all the information on the maps when an analysis uses a number of different thematic maps.

Computer Assistance Will Increase the Demands for Systematic Management of Data

When changing to the digital world, tasks need to be described in a logical manner to get the computer to do what we want.

A Corporate Language

GIS, like our own language, is created to transfer and disseminate information. A corporate language consists of a corporate vocabulary and a corporate grammar. In the computer world we talk about corporate feature names, feature definitions, attribute lists, and uniformly defined data formats and database design. This is standardization.

Use of Geographic Data

Many organizations use many types of geographic data from numerous data vendors or producers. These data should be usable together. Standardization of geographic data, such as using the same projection, is an absolute prerequisite. As we use many data types from different producers, it is also necessary to have information about who is producing what, about data quality, about data capture methods, etc. This is metadata. A uniform metadata structure also requires standardization, so that the meaning of the metadata is easily understood.

Use of Geographic Data

A standard is agreed upon by a group of users who have cooperated in order to standardize a certain thing. The work is approved by the standardization organization and appointed an official standard.
In addition to the official standards for geographic data, a certain group can decide to apply a standardized data description for a certain purpose. In this case the result is a de facto standard. This needs no approval by a standardization organization, since it is only for the internal use of the organization that agreed on it.

Today there are a number of official standards concerning geographic data. These are developed within the International Organization for Standardization (ISO), for example by ISO/TC 211 (global level). There are also many unofficial standards. One example is the de facto product standard established by Microsoft, as this company dominates the software market for computers. Another strong player is Environmental Systems Research Institute (ESRI), the world-leading vendor of GIS software.

In the Philippines, the Inter Agency Task Force on Geographic Information (IATFGI) has made a serious effort to come up with technical standards for geodata. The preparation of the GIS Cookbook has been coordinated with their recommendations, and applicable metadata specifications have been adopted. However, the metadata specifications have been improved to focus not only on national government institutions but on the local government data environment as well.
Guidelines for File and Folder Management
To facilitate an overview of the folders, the subfolders should be organized in a specific order. Folders are automatically sorted first numerically and then alphabetically, so starting folder names with digits lets you decide the appropriate order. It might not be necessary to use digits for all folders, but this is preferable for the most used or most important folders. It is important to name the folders and files in a coherent way, so that it will be easier to view the content of the drive. Using meaningful names and abbreviations helps show at a glance what each dataset is.

The folder structure described below is a proposed setup that can be used in the preparation of the CLUP. It is recommended for better organization and management of files in case no previous standard has been used by the municipality.
All the files, such as written reports and other documents, graphs and photos used in the narrative part of the CLUP, and the geodata needed to build up the CLUP GIS, are organized into 4 folders, which are then divided into subfolders and sub-subfolders accordingly: 01_CLUPGIS; 02_CLUPdoc; 03_CLUPpic; 04_CLUPmix.

01_CLUPGIS – contains the data, mostly tables/spreadsheets, that are needed for the GIS. The building blocks of the GIS are basically spatial data (which defines the features on the map) and attribute data (which describes the specific map feature). For example, a school is represented as a point on the map (spatial data), and clicking on it shows information such as how many teachers, classrooms, etc. the school holds (attribute data).
Aside from the geodata there are also (Excel) table data that have no GIS representation but can be used in the narrative part of the CLUP report as tables or graphs originating from the spreadsheets. The components of the CLUP GIS data are divided into sector folders which follow IATFGI recommendations on metadata as shown below.
Each of the sector folders is divided by planning component subject (Housing, Education, etc.) in order to differentiate between table files used for preparatory activities (both for the GIS and for insertion in the CLUP narrative text) and files used in the GIS. Each planning component subject folder is further subdivided into two subfolders, namely ‘Tables’ and ‘GIS.’

A ‘Quick-look’ file placed together with the sector subfolders in the CLUPGIS folder describes important information about the data, which can help a new user or custodian understand it. Refer to Chapter 5.01.01 for more information about the ‘Quick-look’ file.

The GIS Cookbook does not give any recommendation on how the data used in the CLUP Report should be organized. However, below are some general suggestions:

02_CLUPdoc – contains drafts of the CLUP document, eventually divided into subfolders for drafts and the final version. Each subfolder is recommended to have numbered subfolders corresponding to the chapters of the document, such as 01_Introduction; 02_Baseline Studies; etc.

03_CLUPpic – contains all types of imagery, such as photos, satellite imagery, aerial photos, graphic illustrations, etc. For easy reference it is recommended that all imagery used in the final version be placed in a separate subfolder; if there are many images, these may be subdivided by chapter as in 02_CLUPdoc.

04_CLUPmix – contains miscellaneous files, such as minutes from meetings and consultations, correspondence, etc., preferably organized into subfolders according to the steps in Volume 1.
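The folder layout described above can be set up with a short script. This is a minimal sketch, not part of the GIS Cookbook: the four top-level folder names and the ‘Tables’ and ‘GIS’ subfolders come from the text, while the sector folder name `Social_Sector` and the function name are hypothetical placeholders to be replaced with the sector folders actually used.

```python
import tempfile
from pathlib import Path

# The four top-level folders named in the text.
TOP_LEVEL = ["01_CLUPGIS", "02_CLUPdoc", "03_CLUPpic", "04_CLUPmix"]

# Hypothetical sector folder; Housing and Education are the subjects
# mentioned in the text. Replace with the sector folders actually used.
SECTORS = {"Social_Sector": ["Housing", "Education"]}


def build_clup_folders(root):
    """Create the CLUP folder tree under `root` and return the folders made."""
    root = Path(root)
    folders = [root / name for name in TOP_LEVEL]
    for sector, subjects in SECTORS.items():
        for subject in subjects:
            for leaf in ("Tables", "GIS"):
                folders.append(root / "01_CLUPGIS" / sector / subject / leaf)
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)  # safe to run repeatedly
    return folders


# Demonstration in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    for folder in build_clup_folders(tmp):
        print(folder.relative_to(tmp).as_posix())
```

Running the function against the real project drive instead of a temporary directory would create the same tree there; because existing folders are left untouched, it can be rerun when new sectors or subjects are added.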
Guidelines for Naming of Files

It is important to name the folders and files in a coherent way, so that it will be easier to view the content of the drive. Using meaningful names and abbreviations helps show at a glance what each dataset is. The following guidelines are recommended, where the name of the folder or file should be:
The following table sets out the characters that may NOT be used in file or folder names, as they are generally reserved by the operating system and will cause file retrieval problems if used:
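A name can be checked against such a list before a file is saved. The sketch below is illustrative only: the original table of characters is not reproduced here, so the set used is an assumption, namely the characters that Windows reserves in file names (`< > : " / \ | ? *`). Adjust the set to match the table in the GIS Cookbook.

```python
# Assumption: the characters Windows reserves in file names. This set stands
# in for the table in the text and should be adjusted to match it.
RESERVED_CHARS = set('<>:"/\\|?*')


def invalid_characters(name):
    """Return the reserved characters found in a file or folder name, sorted."""
    return sorted(set(name) & RESERVED_CHARS)


# Hypothetical file names used only to show the check.
for candidate in ("landuse_2008.shp", "roads: draft?.shp"):
    bad = invalid_characters(candidate)
    print(candidate, "->", "ok" if not bad else f"contains {bad}")
```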
It is recommended that the geodata files be named as follows:
Data Sharing

GIS and supporting technologies will lead to the development of decision support systems that facilitate the municipal planning process. By using indicators and alternative development scenarios it is possible to measure the performance of the LGU and of future land use. Planning support systems like the CLUP GIS can measure and compare the performance of different planning scenarios according to planner- or citizen-defined indicators for land use, transportation, education, natural resources, and employment, to name a few. However, the ultimate goal is to bring together all potential players to work collaboratively on a common vision for their community.

GIS-based planning support systems allow planners to quickly and efficiently create and test alternative development scenarios and determine their likely impacts on future land use patterns and associated population and employment trends, thus allowing public officials to make informed planning decisions.

With a basic understanding and implementation of data sharing, one can provide more information to local residents and the municipality without increasing capital or personnel costs. Employing these techniques will actually reduce the amount of time spent updating municipal management and planning data, and increase accuracy and timeliness.

The idea advocated in the GIS Cookbook is that much of the data presented in the CLUP tables (see Chapter 5 in the Toolbox) can be designed and formatted so that it can be used both in the CLUP preparation and in the day-to-day work of the sector office (health, education, social welfare, building and business permits, etc.) that is responsible for providing the specific municipal service. Once municipal offices (and other government agencies interacting with the LGUs) agree to share or replicate the data, they face the challenge of maintaining up-to-date datasets.
Both attribute and spatial data are changing continuously as new social services, infrastructure, etc. are provided, or more accurate data is collected. To maintain up-to-date databases the various data “owners” (custodians) must exchange their most current datasets with those they share their data with. This can be done in two ways:
Corporate datasets and working databases may also have different data models (or schemas). Posting scripts are used to control the transfer of the data between the different databases, and these scripts must be capable of handling these different configuration issues and formats, as shown in the figure below.
Unique Feature Identifiers: To simplify the update process, unique IDs are used to keep track of table joins, which features have changed, etc. Consequently, all CLUP GIS tables (see Chapter 5) have been given a field for a unique ID. For example, a school unit will always be identified with a unique alphanumeric ID which is referred to by all users and used when joining tables in a GIS. A good starting point for a unique ID is the coding of municipalities (and barangays) used by NSO (see Chapter 5.09.01 for more detailed information).

Data Ownership: It is important to clarify data ownership to eliminate potential conflicts. For example, who ‘owns’ the table data for education? Which department is responsible for maintaining the school unit locations and the attribute data about enrolment? Data ownership may also have to be shared. For example, in a low-income municipality the best solution might be that the planning unit takes responsibility for managing the spatial data and sees to it that the locations of schools are properly identified, while the school unit keeps records of attributes such as the number of classrooms and teachers.

However, aside from agreeing on unique IDs and data custodianship, municipal offices that share data with external users (those outside their administrative sphere of influence) will find that “change only updates” bring a number of potential challenges, which may include versioning, data transactions, data validation, coordinate systems and accuracy. Sometimes the CLUP/corporate datasets (shape files, Excel) are in a different format from the external databases (ESRI Geodatabase, Oracle Spatial, MapInfo TAB, GeoMedia, AutoCAD, etc.). Coping with these issues requires special GIS and IT knowledge. In the Toolbox (Chapter 4.18), some examples illustrate the benefit of data sharing.
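The role of the unique ID in a “change only update” can be sketched as follows. This is a simplified illustration, not the posting scripts referred to in the text: the function name, the field names and the `SCH-…` ID pattern are hypothetical, and the sketch assumes every row carries an ID that is unique within its table.

```python
def change_only_update(old_rows, new_rows, id_field="unique_id"):
    """Compare two versions of a table and return only what changed.

    Rows are matched on the unique ID, so the order of rows does not matter.
    """
    old = {row[id_field]: row for row in old_rows}
    new = {row[id_field]: row for row in new_rows}
    return {
        "added": [new[k] for k in new if k not in old],
        "removed": [old[k] for k in old if k not in new],
        "changed": [new[k] for k in new if k in old and new[k] != old[k]],
    }


# Invented school records; the ID pattern is a placeholder, not the NSO coding.
previous = [
    {"unique_id": "SCH-001", "classrooms": 10, "teachers": 12},
    {"unique_id": "SCH-002", "classrooms": 6, "teachers": 7},
]
current = [
    {"unique_id": "SCH-001", "classrooms": 12, "teachers": 12},  # classrooms added
    {"unique_id": "SCH-003", "classrooms": 4, "teachers": 5},    # new school
]

update = change_only_update(previous, current)
for kind, rows in update.items():
    print(kind, [row["unique_id"] for row in rows])
```

Only the three affected records would need to be exchanged between custodians, rather than the full table. Without a stable unique ID, there would be no reliable way to tell a changed record from a new one.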
Data Security

The possibility of the system and data being destroyed or severely damaged is real and deserves attention. The system is vulnerable to both deliberate and accidental damage. A disgruntled employee might purposely corrupt data, hackers may steal information, or a computer virus could find its way into the server. Natural disasters also pose a threat. Earthquakes, floods, fires, hurricanes, tornadoes, and lightning are all examples of natural hazards that could disrupt a GIS. While deviant behavior and natural disasters are intriguing subjects, more common threats are found in day-to-day operations. Examples include coffee being spilt in the wrong place, a well-intentioned employee who accidentally deletes or corrupts a database, or a power disruption with no automatic battery backup.

When conducting a security review, the physical, logical, and archival security of the databases are examined.

Physical security measures protect and control access to the computer equipment containing the databases. Protection of database storage includes guarding against human intrusions (such as unauthorized personnel) and environmental factors (such as fire, flood, or earthquake).

Logical security measures protect and control access to the data itself. For example, users may be restricted to certain types of terminals, particular datasets, and particular functions. One common security measure is to ensure that only database management staff have editing and update rights to particular datasets.

Archival security is essential for many applications. Metadata, information about past coding and updating practices, the location of data, and the type of media on which data is stored must be tracked to allow for data recovery.
The table below illustrates the sections and subsections that might be included in a document that describes the security recommendations of systems and databases for a municipality. Recommendations are made that affect the current and future operations. This document will also help set priorities for actions and costs involved. Further, the security recommendations should be approved and a budget allocated to put the measures into effect.
Backup Basics

There are many ways one can unintentionally lose information on a computer: a power surge, lightning or floods, for instance. Sometimes the equipment just fails. Keeping backup copies of files in a separate place is good practice to ensure that the information is still there when something happens to the original files on the computer.

Before making backup copies, a checklist of files for backup should be made. This will help determine which files to back up, and also provide a reference list which will be useful in retrieving backed-up files. Backup copies should be stored on external storage media, such as an external hard disk drive, flash drive, CDs, DVDs, or some other storage format.

The size of the files needed for the CLUP database will be relatively modest, provided that not much raster data is included. Consequently, the recommendation is that the CLUP folder be written to a DVD/CD at a regular interval (such as once a month) and the backup be kept in a safe environment outside the office.
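The checklist and the dated backup copy can be produced together with a short script. The following is a minimal sketch under stated assumptions: it packs the CLUP folder into a dated zip archive, which would then be copied or burned to the chosen external medium. The function name, the archive naming pattern and the sample file are hypothetical.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def backup_folder(source, backup_dir, today=None):
    """Zip `source` into `backup_dir` as <folder>_<YYYY-MM-DD>.zip.

    Returns the archive path and a sorted checklist of the files it contains.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).isoformat()
    # Checklist of files, relative to the backed-up folder.
    checklist = sorted(
        p.relative_to(source).as_posix() for p in source.rglob("*") if p.is_file()
    )
    archive = shutil.make_archive(
        str(backup_dir / f"{source.name}_{stamp}"),
        "zip",
        root_dir=source.parent,
        base_dir=source.name,
    )
    return Path(archive), checklist


# Demonstration with a throwaway folder; a real run would point `source` at
# the CLUP folder and `backup_dir` at the external drive.
with tempfile.TemporaryDirectory() as tmp:
    clup = Path(tmp) / "CLUP"
    (clup / "01_CLUPGIS").mkdir(parents=True)
    (clup / "01_CLUPGIS" / "schools.csv").write_text("unique_id,name\n")
    archive, checklist = backup_folder(clup, Path(tmp) / "backups")
    print(archive.name)
    print(checklist)
```

Keeping the date in the archive name makes it easy to see which monthly copy is the most recent, and the checklist can be stored with the backup as the reference list for retrieval.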
Intellectual Property Rights (IPR)

IPR consists of Copyright and Related Rights, Trademarks and Service Marks, Geographical Indications, Industrial Designs, Patents, Layout-Designs (Topographies) of Integrated Circuits, and Protection of Undisclosed Information.

Copyright and Related Rights

Related Rights – the protection extended to derivative works, including, among others, dramatizations, translations, adaptations, abridgements, arrangements, and other alterations of literary or artistic works.

Programs / Software
The IPC allows reproduction of backup copies or adaptation of a computer program without authorization of the author / copyright owner, provided that the copy is necessary for:
Such a copy must be destroyed in the event that continued possession of the copy of the computer program ceases to be lawful.

Enforcement
This means that one may only copy, adapt or rent a computer program if the copyright owner gives permission to do so. This permission is given in the form of a license. Every purchase of a legitimate copy of a computer program entitles one to receive a license agreement.
| Attachment | Size |
|---|---|
| 03.04_DataPreparation.pdf | 1.67 MB |










