Page 1
1 1
A GENTLE INTRODUCTION TO GIS
Unit Structure :
1.0 Objectives
1.1 Introduction
1.2 The Nature of GIS
1.3 Some Fundamental Observations
1.4 The Real World and Representation of it
1.5 Summary
1.6 References
1.7 Questions
1.0 OBJECTIVES
Illustrate how we think geographically and spatially every day using mental
maps, to highlight the importance of asking geographic questions.
Explain how the fundamental concepts of scale, location, direction,
distance, space, and navigation are relevant to geography and
geographic information systems.
Describe how a geographic information system is applied, how it has
developed, its future, and how it represents the real world.
1.1 INTRODUCTION
The purpose of this chapter is to provide a general overview of some of
the terms, concepts and ideas that will be covered in detail in later
sections. The acronym GIS stands for geographic information system. As
the name suggests, a GIS is a tool for working with geographic
information. The chapter gives a formal definition of GIS and describes
the key functions that set GIS apart from other kinds of information
systems. GIS have developed rapidly since the late 1970s in terms of both
technical and processing capabilities, and today they are used all over
the world for a wide range of purposes.
1.2 THE NATURE OF GIS
A geographic information system (GIS) is a computer system for
capturing, storing, checking, and displaying data related to positions on
Earth’s surface. By relating seemingly unrelated data, GIS can help
individuals and organizations better understand spatial patterns and
relationships.
munotes.in
Principles of
Geogrphics
Information Systems
The following are some examples of how GIS is used:
• A biologist might be interested in the impact of slash-and-burn practices
on the populations of amphibian species in the forests of a mountain range,
to obtain a better understanding of long-term threats to those populations.
• A natural hazard analyst might like to identify the high-risk areas of
annual monsoon-related flooding by investigating rainfall patterns and
terrain characteristics.
• A geological engineer might want to identify the best localities for
constructing buildings in an earthquake-prone area by looking at rock
formation characteristics.
• A mining engineer could be interested in determining which prospective
copper mines should be selected for future exploration, taking into account
parameters such as extent, depth and quality of the ore body, among
others.
• A geoinformatics engineer hired by a telecommunications company may
want to determine the best sites for the company’s relay stations, taking
into account various cost factors such as land prices, undulation of the
terrain, etc.
• A forest manager might want to optimize timber production using data on
soil and current tree stand distributions, in the presence of a number of
operational constraints, such as the need to preserve species diversity in
the area.
• A hydrological engineer might want to study a number of water quality
parameters of different sites in a freshwater lake to improve understanding
of the current distribution of Typha reed beds, and why it differs from that
of a decade ago.
1.3 SOME FUNDAMENTAL OBSERVATIONS
Our world is dynamic. Many aspects of our daily lives and our
environment are constantly changing, and not always for the better. Some
of these changes appear to have natural causes (e.g. volcanic eruptions,
meteorite impacts), while others are the result of human modification of
the environment (e.g. land use changes or land reclamation from the sea, a
favourite pastime of the Dutch).
There are also a large number of global changes for which the cause
remains unclear: these include global warming, the El Niño/La Niña
events, or at smaller scales, landslides and soil erosion. In summary, we
can say that changes to the Earth’s geography can have natural or
man-made causes, or a mix of both. If it is a mix of causes, we usually do
not fully understand the changes.
For background information on El Niño, please refer to Figure 1.1.
This figure presents information related to a study area (the equatorial
Pacific Ocean), with positional data taking a prominent role. Although it is
quite a complex phenomenon, we will use the study of El Niño as an
example application of GIS in the remainder of this chapter.
In order to understand what is going on in our world, we study the
processes or phenomena that bring about geographic change. In many
cases, we want to broaden or deepen our understanding to help us make
decisions, so that we can take the best course of action. For instance, if we
understand El Niño better, and can forecast that another event may take
place in the year 2012, we can devise an action plan to reduce the
expected losses in the fishing industry, to lower the risks of landslides
caused by heavy rains or to build up water supplies in areas of expected
droughts.
Figure 1.1: Sea-surface temperature anomalies during a strong El Niño
event (top) and La Niña event (bottom). Figures made in the IRI ENSO
Maproom.
The El Niño event is a good example of a phenomenon that varies in both
space and time, because sea surface temperatures differ between locations
and change from one week to the next.
1.3.1 Defining GIS
A GIS is a computer-based system that provides the following four sets of
capabilities to handle geo-referenced data:
1. Data capture and preparation
2. Data management
3. Data manipulation and analysis
4. Data presentation
1. Data Capture and Preparation
Data capture is a tedious job in GIS. A GIS can be used to emphasize the
spatial relationships among the objects being mapped. If the data to be
used are not already in digital form, that is, in a form a computer can
understand and recognize, various techniques are available to capture the
information. Maps can be digitized, or hand traced with a computer
mouse, to collect the coordinates of the features. Electronic scanning
devices will also convert map lines and points to digital form.
In the El Niño case, data capture refers to the collection of sea water
temperatures and wind speed measurements. This is achieved by placing
buoys with measuring equipment at various places in the ocean. Each
buoy measures a number of things: wind speed and direction; air
temperature and humidity; and sea water temperature at the surface and at
various depths down to 500 metres. For the sake of our example we will
focus on sea surface temperature (SST) and wind speed (WS).
A typical buoy is illustrated in Figure 1.2, which shows the placement of
various sensors on the buoy.
Figure 1.2: Schematic overview of an ATLAS type buoy for monitoring sea
water temperatures in the El Niño project. Labels in the figure: WS sensor;
Argos antenna; 3.8 m above sea; humidity sensor; logger; toroidal buoy
Ø 2.3 m.
2. Data Management
This phase requires a decision to be made on how best to represent our
data, both in terms of their spatial properties and the various attribute
values which we need to store. Data management includes data
verification, attribute data management, insertion, updating, deletion and
retrieval in different forms. For our example, data management refers to
the storage and maintenance of the data transmitted by the buoys via
satellite communication.
3. Data Manipulation and analysis
Data analysis can be carried out once the data have been collected and
organized in the computer system. In the example above, the data generated
at the buoys were processed before map production. Figure 1.1 reveals that
the data being presented are based on the monthly averages for SST and WS
(for two months), not on single measurements for a specific date.
1. For each buoy, the average SST for each month was computed, using
the daily SST measurements for that month. This is a simple
computation.
2. For each buoy, the monthly average SST was taken together with the
geographic location, to obtain a georeferenced list of averages, as
illustrated in the following table:
Buoy    Geographic position    Dec. 1997 avg. SST
B0789   (165° E, 5° N)         28.02 °C
B7504   (180° E, 0° N)         27.34 °C
B1882   (110° W, 7°30' S)      25.28 °C
...     ...                    ...
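The two processing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the daily readings, the decimal-degree positions and the function name monthly_average_sst are hypothetical and are not taken from the buoy project itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical buoy positions (longitude, latitude) in decimal degrees;
# western longitudes and southern latitudes would be negative.
BUOY_POSITIONS = {
    "B0789": (165.0, 5.0),
    "B7504": (180.0, 0.0),
}

# Hypothetical daily readings: (buoy id, date as YYYY-MM-DD, SST in deg C).
DAILY_SST = [
    ("B0789", "1997-12-01", 28.0),
    ("B0789", "1997-12-02", 28.2),
    ("B0789", "1997-12-03", 27.9),
    ("B7504", "1997-12-01", 27.3),
    ("B7504", "1997-12-02", 27.5),
]


def monthly_average_sst(readings, positions, month):
    """Return {buoy: (position, average SST)} for one month given as 'YYYY-MM'."""
    per_buoy = defaultdict(list)
    for buoy, date, sst in readings:
        if date.startswith(month):
            per_buoy[buoy].append(sst)
    # Step 1: average per buoy; step 2: attach the geographic position.
    return {
        buoy: (positions[buoy], round(mean(values), 2))
        for buoy, values in per_buoy.items()
    }


averages = monthly_average_sst(DAILY_SST, BUOY_POSITIONS, "1997-12")
for buoy, (position, avg) in sorted(averages.items()):
    print(buoy, position, avg)
```

The result is a georeferenced list of averages: each value stays tied to a position, which is what makes it usable for mapping in the next phase.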
4. Data Presentation
After the data manipulations, our data is prepared for producing output.
This data presentation phase deals with putting it all together into a format
that communicates the result of data analysis in the best possible way.
Before data is presented, we need to consider what the message is that we
want to portray, who the audience is, what kind of presentation medium
will be used, which rules of aesthetics apply, and what techniques are
available for representation.
1.3.2 GI System, GI Science and GI Applications
A geographic information system (GIS) is a type of database containing
geographic data (that is, descriptions of phenomena for which location is
relevant), combined with software tools for managing, analyzing, and
visualizing those data. GIS software is a general-purpose application
program that is intended to be used in many individual geographic
information systems in a variety of application domains. Starting in the
late 1970s, many software packages have been created specifically for
GIS applications, including commercial programs such as ArcGIS (from
Esri), Autodesk's products and MapInfo Professional, and open source
programs such as QGIS, GRASS GIS and MapGuide.
Geo-Information (GI) Science is the scientific field that attempts to
integrate different disciplines studying the methods and techniques of
handling spatial information. Geographical information science
(GIScience or GISc) is the scientific discipline that studies geographic
information, including how it represents phenomena in the real world,
how it represents the way humans understand the world, and how it can
be captured, organized, and analyzed.
Project-based GIS applications usually have a clear-cut purpose, and
these applications can be short-lived: the research is carried out by
collecting data, entering data in the GIS, analysing the data, and
producing informative maps. An example is rapid earthquake damage
assessment.
1.3.3 Spatial data and geo-information
By spatial data we mean data that contains positional values, such as (x, y)
co-ordinates. Sometimes the more precise phrase geospatial data is used as
a further refinement, which refers to spatial data that is georeferenced.
In this sense, ‘spatial data’ is also known as ‘georeferenced data’.
Geo-information is a specific type of information resulting from the
interpretation of spatial data.
In recent years, the increasing availability and decreasing cost of data
capture equipment have resulted in many users collecting their own data.
However, the collection and maintenance of ‘base’ data remain the
responsibility of the various governmental agencies, such as National
Mapping Agencies (NMAs), which are responsible for collecting topographic
data for the entire country following preset standards. Key components of
spatial data quality include positional accuracy (both horizontal and
vertical), temporal accuracy (that the data is up to date), attribute
accuracy (e.g. in labelling of features or of classifications), lineage
(history of the data including sources), completeness (whether the data set
represents all related features of reality), and logical consistency (that
the data is logically structured).
1.4 THE REAL WORLD AND REPRESENTATIONS OF
IT
Sometimes we need to represent some part of the real world as it is, as
it was, or perhaps as we think it will be, even though it is not possible
to represent the real world exactly.
1.4.1 Models and Modelling
‘Modelling’ is a term used in many different ways and which has many
different meanings. A representation of some part of the real world can be
considered a model because the representation will have certain
characteristics in common with the real world. Models as representations
come in many different flavours. In the GIS environment, the most
familiar model is that of a map. A map is a miniature representation of
some part of the real world.
A ‘real world model’ is a representation of a number of phenomena that
we can observe in reality, usually to enable some type of study,
administration, computation and/or simulation. In this book we will use
the term application models to refer to models with a specific application,
including real-world models and so-called analytical models. The phrase
‘data modelling’ is the common name for the design effort of structuring a
database. This process involves the identification of the kinds of data that
the database will store, as well as the relationships between these kinds of
data.
Most maps and databases can be considered static models. At any point in
time, they represent a single state of affairs. Usually, developments or
changes in the real world are not easily recognized in these models.
Dynamic models or process models address precisely this issue. They
emphasize changes that have taken place, are taking place or may take
place sometime in the future. Dynamic models are
inherently more complicated than static models, and usually require much
more computation. Simulation models are an important class of dynamic
models that allow the simulation of real world processes. Observe that our
El Niño system can be called a static model as it stores state-of-affairs
data such as the average December 1997 temperatures. But at the same
time, it can also be considered a simple dynamic model, because it allows
us to compare different states of affairs.
1.4.2 Maps
Maps have been used for thousands of years to represent information
about the real world, and continue to be extremely useful for many
applications in various domains. Their conception and design has
developed into a science with a high degree of sophistication. A
disadvantage of the traditional paper map is that it is generally
restricted to two-dimensional static representations, and that it is
always displayed in a fixed scale. The map scale
determines the spatial resolution of the graphic feature representation.
The smaller the scale, the less detail a map can show. The accuracy of
the base data, on the other hand, puts limits to the scale in which a map
can be sensibly drawn. Hence, the selection of a proper map scale is
one of the first and most important steps in map design.
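A small worked example may help with the statement that a smaller scale shows less detail. The sketch below converts a distance measured on the map into a ground distance; the function name ground_distance_m and the two scales are hypothetical choices made for illustration. Note that 1:250,000 is the smaller scale of the two, because the same ground distance is drawn smaller on the map.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) to ground distance (m)."""
    return map_distance_cm * scale_denominator / 100.0


# The same 4 cm line on maps at two hypothetical scales.
for denominator in (25_000, 250_000):
    metres = ground_distance_m(4.0, denominator)
    print(f"1:{denominator}: 4 cm on the map = {metres:.0f} m on the ground")
```

At the smaller scale, each centimetre of paper has to cover ten times more ground, so small features can no longer be drawn individually.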
A map is always a graphic representation at a certain level of detail,
which is determined by the scale. Map sheets have physical
boundaries, and features spanning two map sheets have to be cut into
pieces. Cartography, as the science and art of map making, functions
as an interpreter, translating real world phenomena (primary data) into
correct, clear and understandable representations for our use. Maps
also become a data source for other applications, including the
development of other maps. With the advent of computer systems,
analogue cartography developed into digital cartography, and
computers play an integral part in modern cartography.
Cartography, the art of creating maps, deals with interpreted data. A
cartographer, or map-maker, creates a visual hierarchy when he or she
decides how features appear on a map to illustrate data. Map making
can be either subjective or objective, but its goal is always the
visualization of data with some spatial dimension.
1.4.3 Databases
A database is a repository for storing large amounts of data. It comes
with a number of useful functions:
1. A database can be used by multiple users at the same time, i.e. it
allows concurrent use.
2. A database offers a number of techniques for storing data and allows
the use of the most efficient one, i.e. it supports storage
optimization.
3. A database allows the imposition of rules on the stored data; rules
that will be automatically checked after each update to the data, i.e.
it supports data integrity.
4. A database offers an easy to use data manipulation language, which
allows the execution of all sorts of data extraction and data
updates, i.e. it has a query facility.
5. A database will try to execute each query in the data manipulation
language in the most efficient way, i.e. it offers query
optimization.
Databases can store almost any kind of data in different forms, such as
tables.
For the El Niño project, one may assume that the buoys report their
measurements on a daily basis and that these measurements are stored in a
single, large table, e.g. DayMeasurements:
Buoy    Date         SST       WS        Humid   Temp10    ...
B0749   1997/12/03   28.2 °C   NNW 4.2   72%     22.2 °C   ...
B9204   1997/12/03   26.5 °C   NW 4.6    63%     20.8 °C   ...
B1686   1997/12/03   27.8 °C   NNW 3.8   78%     22.8 °C   ...
B0988   1997/12/03   27.4 °C   N 1.6     82%     23.8 °C   ...
B3821   1997/12/03   27.5 °C   W 3.2     51%     20.8 °C   ...
B6202   1997/12/03   26.5 °C   SW 4.3    67%     20.5 °C   ...
B1536   1997/12/03   27.7 °C   SSW 4.8   58%     21.4 °C   ...
B0138   1997/12/03   26.2 °C   W 1.9     62%     21.8 °C   ...
B6823   1997/12/03   23.2 °C   S 3.6     61%     22.2 °C   ...
...     ...          ...       ...       ...     ...       ...
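To make the ideas of data integrity and a query facility concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the system used in the El Niño project: the reduced column set, the plausibility limits in the CHECK rules and the choice of SQLite are all assumptions made for illustration. The three rows are taken from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DayMeasurements (
        buoy  TEXT NOT NULL,
        date  TEXT NOT NULL,
        sst   REAL CHECK (sst BETWEEN -5 AND 40),   -- hypothetical plausibility rule
        humid REAL CHECK (humid BETWEEN 0 AND 100), -- humidity is a percentage
        PRIMARY KEY (buoy, date)                    -- one row per buoy per day
    )
    """
)
rows = [
    ("B0749", "1997-12-03", 28.2, 72),
    ("B9204", "1997-12-03", 26.5, 63),
    ("B1686", "1997-12-03", 27.8, 78),
]
conn.executemany("INSERT INTO DayMeasurements VALUES (?, ?, ?, ?)", rows)

# Data integrity: a row that violates a rule is rejected by the database.
try:
    conn.execute(
        "INSERT INTO DayMeasurements VALUES (?, ?, ?, ?)",
        ("B0988", "1997-12-03", 27.4, 182),  # deliberately invalid humidity
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Query facility: buoys warmer than 27 deg C on the given day.
warm = [
    buoy
    for (buoy,) in conn.execute(
        "SELECT buoy FROM DayMeasurements WHERE date = ? AND sst > ? ORDER BY buoy",
        ("1997-12-03", 27.0),
    )
]
print(rejected, warm)
```

The rules are declared once, with the table, and the database checks them on every insert, so no application can store an impossible humidity value by accident.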
1.4.4 Spatial databases and spatial analysis
A spatial database is a general-purpose database (usually a relational
database) that has been enhanced to include spatial data that represents
objects defined in a geometric space, along with tools for querying and
analyzing such data. The SQL/MM Spatial ISO/IEC standard is a part of the
SQL/MM multimedia standard and extends the Simple Features standard
with data types that support circular interpolations.
A geodatabase (also geographical database and geospatial database) is a
database of geographic data, such as countries, administrative divisions,
cities, and related information. Such databases can be useful for websites
that wish to identify the locations of their visitors for customization
purposes. A geodatabase is not the same thing as a GIS, though both
systems share a number of characteristics. These include the functions
listed above for databases in general: concurrency, storage, integrity, and
querying of, specifically but not only, spatial data.
A GIS, on the other hand, is tailored to operate on spatial data. It ‘knows’
about spatial reference systems, and supports all kinds of analyses that are
inherently geographic in nature, such as distance and area computations
and spatial interpolation. This is probably GIS’s main strength: providing
various ways to combine representations of geographic phenomena.
Spatial analysis or spatial statistics includes any of the formal techniques
that study entities using their topological, geometric, or geographic
properties. Spatial analysis includes a variety of techniques, many still in
their early development, using different analytic approaches and applied in
a wide range of fields, including astronomy. For example, in the El Niño
case, we may want to identify the steepest gradient in water temperature.
The aim of spatial analysis is usually to gain a better understanding of
geographic phenomena through discovering patterns that were previously
unknown to us, or to build arguments on which to base important decisions.
It should be noted that some GIS functions for spatial analysis are simple
and easy to use, while others are much more sophisticated and demand higher
levels of analytical and operating skills. Successful spatial analysis
requires appropriate software, hardware, and perhaps most importantly, a
competent user.
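As a rough illustration of the ‘steepest gradient in water temperature’ question, the sketch below scans a small grid of sea surface temperatures and reports the pair of neighbouring cells with the largest temperature change per kilometre. The grid values, the 100 km cell size and the function name steepest_gradient are hypothetical; a real GIS would work on interpolated buoy data and use its own analysis functions.

```python
# Hypothetical 3 x 4 grid of sea surface temperatures (deg C); rows run north
# to south, columns west to east, and the cell size is assumed to be 100 km.
SST_GRID = [
    [28.0, 27.8, 27.5, 27.4],
    [27.9, 27.6, 26.0, 25.9],
    [27.7, 27.5, 26.4, 25.8],
]
CELL_SIZE_KM = 100.0


def steepest_gradient(grid, cell_size):
    """Return (gradient, cell_a, cell_b) for the largest change between neighbours."""
    best = (0.0, None, None)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            # Compare with the eastern and southern neighbour only, so each
            # pair of adjacent cells is visited exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < len(grid) and nc < len(row):
                    gradient = abs(grid[nr][nc] - value) / cell_size
                    if gradient > best[0]:
                        best = (gradient, (r, c), (nr, nc))
    return best


gradient, cell_a, cell_b = steepest_gradient(SST_GRID, CELL_SIZE_KM)
print(f"steepest gradient: {gradient:.3f} deg C per km between {cell_a} and {cell_b}")
```

Only horizontally and vertically adjacent cells are compared here; including diagonal neighbours would need the longer diagonal distance in the denominator.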
1.5 SUMMARY
This chapter gives a ‘gentle’ introduction to GIS. It introduces GI
systems, GI Science and GIS applications, and shows how geographic
information systems and their software are applied in real life. A GIS
provides four basic sets of capabilities: data capture and preparation, data
management, data manipulation and analysis, and data presentation. The
chapter also gives a brief introduction to data modelling, maps, databases,
geo-referencing and spatial databases.
1.6 REFERENCES
1. Otto Huisman and Rolf A. de By, Principles of Geographic Information
Systems: An Introductory Textbook
2. Kang-tsung (Karl) Chang, Introduction to Geographic Information
Systems, McGraw-Hill
3. Michael N. DeMers, Fundamentals of Geographic Information Systems,
Wiley
4. https://www.educba.com/applications-of-gis/
1.7 QUESTIONS
1. Define GIS. Briefly explain any two capabilities of GIS.
2. What are GI Systems, GI Science and GIS applications? Explain.
3. How does modelling help in representing the real world? Explain.
4. Write a short note on the nature of GIS.
5. What are geospatial data and geo-information?
2
GEOGRAPHIC INFORMATION AND
SPATIAL DATABASE
Unit Structure :
2.1 Models and Representations of the Real World
2.2 Geographic Phenomena
2.3 Computer Representations of Geographic Information
2.4 Organizing and Managing Spatial Data
2.5 The Temporal Dimension
2.6 Summary
2.7 References
2.8 Questions
2.1 MODELS AND REPRESENTATIONS OF THE
REAL WORLD
GIS helps to analyse and understand more about processes and phenomena
in the real world. Section 1.4.1 referred to the process of modelling, or
building a representation which has certain characteristics in common with
the real world. In practical terms, this refers to the process of representing
key aspects of the real world digitally (inside a computer). These
representations are made up of spatial data, stored in memory in the form
of bits and bytes, on media such as the hard drive of a computer. This
digital representation can then be subjected to various analytical functions
(computations) in the GIS, and the output can be visualized in various
ways.
Modelling is the process of producing an abstraction of the ‘real world’ so
that some part of it can be more easily handled.
Depending on the application domain of the model, it may be necessary to
manipulate the data with specific techniques. To investigate the geology
of an area, we may be interested in obtaining a geological classification.
This may result in additional computer representations, again stored in bits
and bytes. To examine how the data is stored inside the GIS, one could
look into the actual data files, but this information is largely meaningless
to a normal user.
In order to better understand both our representation of the phenomena,
and our eventual output from any analysis, we can use the GIS to create
visualizations f rom the computer representation, either on -screen, printed
on paper, or otherwise. It is crucial to understand the fundamental
differences between these notions. The real world, after all, is a
completely different domain than the ‘GIS’ world, in which we build
models or simulations of the real world. The above two are types of
representations of real world using vector and raster representation
methods.
2.2 GEOGRAPHIC PHENOMENA
GIS operates under the assumption that the relevant spatial phenomena
occur in a two- or three-dimensional Euclidean space, unless otherwise
specified. Euclidean space can be informally defined as a model of
space in which locations are represented by coordinates, (x, y) in 2D and
(x, y, z) in 3D, and distance and direction can be defined with geometric
formulas. In the 2D case, this is known as the Euclidean plane, which is the
most common Euclidean space in GIS use. In order to be able to represent
relevant aspects of real world phenomena inside a GIS, we first need to
define what it is we are referring to. We might define a geographic
phenomenon as a manifestation of an entity or process of interest that:
• Can be named or described,
• Can be georeferenced, and
• Can be assigned a time (interval) at which it is/was present.
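The ‘geometric formulas’ for distance and direction in Euclidean space, mentioned at the start of this section, can be written out as a short sketch. The function names and the sample coordinates are hypothetical, and the bearing convention (degrees clockwise from the positive y-axis) is one common choice rather than the only one.

```python
import math


def distance_2d(p, q):
    """Euclidean distance between two points (x, y) in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def distance_3d(p, q):
    """Euclidean distance between two points (x, y, z) in space."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


def bearing_deg(p, q):
    """Direction from p to q in degrees clockwise from the positive y-axis ('north')."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360.0


# Hypothetical planar coordinates in metres.
a, b = (0.0, 0.0), (300.0, 400.0)
print(distance_2d(a, b), bearing_deg(a, (100.0, 0.0)))
print(distance_3d((0.0, 0.0, 0.0), (2.0, 3.0, 6.0)))
```

These formulas only hold for planar coordinates; positions given as longitude and latitude on the curved Earth need a different distance computation.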
2.2.1 Types of Geographic Phenomena
1. Geographic Fields
A (geographic) field is a geographic phenomenon for which, for every
point in the study area, a value can be determined. Some common
examples of geographic fields are air temperature, barometric pressure and
elevation. These fields are in fact continuous in nature. Examples of
discrete fields are land use and soil classifications. For these too, any
location in the study area is attributed a single land use class or soil class.
A field is a geographic phenomenon that has a value ‘everywhere’ in the
study area. We can therefore think of a field as a mathematical function f
that associates a specific value with any position in the study area. Hence
if (x, y) is a position in the study area, then f(x, y) stands for the value of
the field f at locality (x, y).
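The idea of a field as a function f(x, y) can be sketched directly in code. Both functions below are hypothetical: the elevation formula and the land use boundary at x = 500 are invented purely to show that a field returns a value for every position, whether that value is a number or a class.

```python
def elevation(x, y):
    """Hypothetical continuous field: elevation (m) changes gradually with position."""
    return 200.0 + 0.05 * x + 0.02 * y


def land_use(x, y):
    """Hypothetical discrete field: one land use class for every position."""
    return "forest" if x < 500.0 else "agriculture"


# Both fields return a value for any position (x, y) in the study area.
for position in [(0.0, 0.0), (499.0, 100.0), (501.0, 100.0)]:
    print(position, elevation(*position), land_use(*position))
```

Moving from x = 499 to x = 501 changes the elevation only slightly, while the land use class switches abruptly, which is the contrast between continuous and discrete fields.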
Fields can be discrete or continuous. In a continuous field, the underlying
function is assumed to be ‘mathematically smooth’, meaning that the field
values along any path through the study area do not change abruptly, but
only gradually. Good examples of continuous fields are air temperature,
barometric pressure, soil salinity and elevation. Continuity means that all
changes in field values are gradual. A continuous field can even be
differentiable, meaning we can determine a measure of change in the field
value per unit of distance anywhere and in any direction. For example, if
the field is elevation, this measure would be slope, i.e. the change of
elevation per metre distance; if the field is soil salinity, it would be salinity
gradient, i.e. the change of salinity per metre distance. The figure illustrates
the variation in elevation in a study area in Spain. A colour scheme has
been chosen to depict that variation. This is a typical example of a
continuous field.
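For a differentiable field, the ‘measure of change per unit of distance’ can be approximated numerically. The sketch below estimates slope with central differences; the elevation function is a hypothetical tilted plane, and the step size h = 1 m is an arbitrary assumption.

```python
import math


def elevation(x, y):
    """Hypothetical differentiable elevation field (metres), x and y in metres."""
    return 100.0 + 0.03 * x + 0.04 * y


def slope(field, x, y, h=1.0):
    """Approximate the steepest change of `field` per metre at (x, y).

    Central differences estimate the partial derivatives in x and y; the
    length of that gradient vector is the slope in the steepest direction.
    """
    dfdx = (field(x + h, y) - field(x - h, y)) / (2.0 * h)
    dfdy = (field(x, y + h) - field(x, y - h)) / (2.0 * h)
    return math.hypot(dfdx, dfdy)


print(round(slope(elevation, 250.0, 250.0), 4))
```

For this plane the partial derivatives are 0.03 and 0.04, so the slope is 0.05 m of elevation per metre of distance everywhere; on real terrain it would vary from place to place.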
Discrete fields divide the study space in mutually exclusive, bounded
parts, with all locations in one part having the same field value. Typical
examples are land classifications, for instance, using either geological
classes, soil type, land use type, crop type or natural vegetation type. An
example of a discrete field (in this case identifying geological units in the
Falset study area) is provided in Figure 2.3. Observe that locations on the
boundary between two parts can be assigned the field value of the ‘left’
or ‘right’ part of that boundary. One may note that discrete fields are a
step from continuous fields towards geographic objects: discrete fields as
well as objects make use of ‘bounded’ features. Observe, however, that a
discrete field still assigns a value to every location in the study area,
something that is not typical of geographic objects.
Essentially, these two types of fields differ in the type of cell values. A
discrete field like land use type will store cell values of the type ‘integer’.
Therefore it is also called an integer raster. Discrete fields can be easily
converted to polygons, since it is relatively easy to draw a boundary line
around a group of cells with the same value. A continuous raster is also
called a ‘floating point’ raster. A field-based model consists of a finite
collection of geographic fields: we may be interested in elevation,
barometric pressure, mean annual rainfall, and maximum daily
evapotranspiration, and thus use four different fields to model the relevant
phenomena within our study area.
2.2.2 Data types and values
Since we have now differentiated between continuous and discrete fields,
we may also look at different kinds of data values which we can use to
represent our ‘phenomena’. It is important to note that some of these data
types limit the types of analyses that we can do on the data itself:
1. Nominal data values are values that provide a name or identifier so that
we can discriminate between different values, but that is about all we
can do. Specifically, we cannot do true computations with these values.
An example is the names of geological units. This kind of data value
is called categorical data when the values assigned are sorted according
to some set of non-overlapping categories. For example, we might
identify the soil type of a given area to belong to a certain (pre-defined)
category.
2. Ordinal data values are data values that can be put in some natural
sequence but that do not allow any other type of computation.
Household income, for instance, could be classified as being either
‘low’, ‘average’ or ‘high’. Clearly this is their natural sequence, but this
is all we can say: we cannot say that a high income is twice as high as
an average income.
3. Interval data values are quantitative, in that they allow simple forms of
computation like addition and subtraction. However, interval data has
no arithmetic zero value, and does not support multiplication or
division. For instance, a temperature of 20 °C is not twice as warm as
10 °C, and thus centigrade temperatures are interval data values, not
ratio data values.
4. Ratio data values allow most, if not all, forms of arithmetic
computation. Ratio data have a natural zero value, and multiplication and
division of values are possible operators (distances measured in metres
are an example). Continuous fields can be expected to have ratio data
values, and hence we can interpolate them.
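The four kinds of data values differ in which operations are meaningful, and a short sketch can make that concrete. The ranking dictionary, the two rock names and the use of kelvin as the ratio-scale counterpart of centigrade temperature are illustrative assumptions that go beyond the examples in the text.

```python
# Ordinal: an explicit ranking allows comparison, but not arithmetic.
INCOME_RANK = {"low": 0, "average": 1, "high": 2}


def is_higher(a, b):
    """True if income class a ranks above income class b."""
    return INCOME_RANK[a] > INCOME_RANK[b]


# Interval: differences are meaningful, ratios are not.
t1_c, t2_c = 10.0, 20.0
difference_c = t2_c - t1_c      # a valid statement: 10 degrees warmer
naive_ratio = t2_c / t1_c       # 2.0, but NOT "twice as warm"

# Ratio: kelvin has a natural zero, so the ratio is meaningful.
ratio_k = (t2_c + 273.15) / (t1_c + 273.15)

# Nominal: only equality tests make sense (hypothetical unit names).
same_unit = "granite" == "basalt"

print(is_higher("high", "low"), difference_c, naive_ratio, round(ratio_k, 3), same_unit)
```

The programming language will happily divide two centigrade values; it is the data type of the phenomenon, not the software, that tells us the result has no physical meaning.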
We usually refer to nominal and categorical data values as ‘qualitative’
data, because we are limited in terms of the computations we can do on
this type of data. Interval and ratio data are known as ‘quantitative’ data,
as they refer to quantities.
However, ordinal data does not seem to fit either of these data types.
Often, ordinal data refers to a ranking scheme or some kind of hierarchical
phenomena. Road networks, for example, are made up of motorways,
main roads, and residential streets. We might expect roads classified as
motorways to have more lanes and carry more traffic than a residential
street.
2.2.3 Geographic objects
When a geographic phenomenon is not present everywhere in the study
area, but somehow ‘sparsely’ populates it, we look at it as a collection of
geographic objects. Such objects are usually easily distinguished and
named, and their position in space is determined by a combination of one
or more of the following parameters:
• Location (where is it?),
• Shape (what form is it?),
• Size (how big is it?), and
• Orientation (in which direction is it facing?).
How we want to use the information about a geographic object determines
which of the four above parameters is required to represent it. For
instance, in an in-car navigation system, all that matters about geographic
objects like petrol stations is where they are. Thus, location alone is
enough to describe them in this particular context, and shape, size and
orientation are not necessarily relevant. In the same system, however,
roads are important objects, and for these some notion of location (where
does it begin and end), shape (how many lanes does it have), size (how far
can one travel on it) and orientation (in which direction can one travel on
it) seem to be relevant information components.
Shape is usually important because one of its factors is dimension. This
relates to whether an object is perceived as a point feature, or a linear, area
or volume feature. The petrol stations mentioned above apparently are
zero-dimensional, i.e. they are perceived as points in space; roads are
one-dimensional, as they are considered to be lines in space. In another use
of road information (for instance, in multi-purpose cadastre systems where
precise location of sewers and manhole covers matters) roads might well
be considered to be two-dimensional entities, i.e. areas within which a
manhole cover may fall.
Collections of geographic objects can be interesting phenomena at a
higher aggregation level: forest plots form forests, groups of parcels form
suburbs, streams, brooks and rivers form a river drainage system, roads
form a road network, and SST buoys form an SST sensor network. It is
sometimes useful to view geographic phenomena at this more aggregated
level and look at characteristics like coverage, connectedness, and
capacity. For example:
Which part of the road network is within 5 km of a petrol station? (A
coverage question)
What is the shortest route between two cities via the road network? (A
connectedness question)
How many cars can optimally travel from one city to another in an hour?
(A capacity question)
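The connectedness question above can be sketched in code. The following is a minimal illustration only, assuming the road network is stored as a dictionary of nodes and road lengths; the network `ROADS`, its node names and its distances are hypothetical, and Dijkstra's algorithm is one common way to answer such a question, not necessarily what any particular GIS uses.

```python
import heapq

def shortest_route(network, start, goal):
    """Return (total_km, route) for the shortest path, or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]   # (distance so far, current node, route so far)
    visited = set()
    while queue:
        dist, node, route = heapq.heappop(queue)
        if node == goal:
            return dist, route
        if node in visited:
            continue
        visited.add(node)
        for neighbour, km in network.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + km, neighbour, route + [neighbour]))
    return float("inf"), []

# Hypothetical road network: node -> {neighbouring node: road length in km}
ROADS = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "C": 1.0, "D": 5.0},
    "C": {"A": 2.0, "B": 1.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}

distance_km, route = shortest_route(ROADS, "A", "D")
```

Here the route A, C, B, D (8 km) is shorter than the more direct-looking A, B, D (9 km), which is why the whole network has to be considered rather than individual roads.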
2.2.4 Boundaries
Where shape and/or size of contiguous areas matter, the notion of
boundary comes into play. This is true for geographic objects but also for
the constituents of a discrete geographic field. Location, shape and size are
fully determined if we know an area’s boundary, so the boundary is a good
candidate for representing it. This is especially true for areas that have
naturally crisp boundaries. A crisp boundary is one that can be determined
with almost arbitrary precision, dependent only on the data acquisition
technique applied. Fuzzy boundaries contrast with crisp boundaries in that
the boundary is not a precise line, but rather itself an area of transition.
As a general rule of thumb, crisp boundaries are more common in
man-made phenomena, whereas fuzzy boundaries are more common with
natural phenomena. In recent years, various research efforts have
addressed the issue of explicit treatment of fuzzy boundaries, but there is
still limited support for these in existing GIS software. The areas identified
in a geological classification, like that of Figure 2.3, are typically vaguely
bounded in reality, but applications of this geological information
probably do not require high positional accuracy of the boundaries
involved. Therefore, an assumption that they are actually crisp boundaries
will have little influence on the usefulness of the data.
2.3 COMPUTER REPRESENTATIONS OF GEOGRAPHIC
INFORMATION
Many geographic phenomena have the characteristics of continuous functions
over space. Elevation, for instance, can be measured at many locations,
even within one’s own backyard, and each location may give a different
value. In order to represent such a phenomenon faithfully in computer
memory, we could either:
Try to store as many (location, e levation) observation pairs as possible,
or
Try to find a symbolic representation of the elevation field function, as a
formula in x and y, like (3.0678x² + 20.08x − 7.34y) or so, which can be
evaluated to give us the elevation at any given (x, y) location.
Both of these approaches have their drawbacks. The first suffers from the
fact that we will never be able to store all elevation values for all
locations; after all, there are infinitely many locations. The second
approach suffers from the fact that we do not know just what this function
should look like, and that it would be extremely difficult to derive such a
function for larger areas. In GISs, typically a combination of both
approaches is taken. We store a finite, but intelligently chosen set of
(sample) locations with their elevation. This gives us the elevation for
those stored locations, but not for others. We can use an interpolation
function that allows us to infer a reasonable elevation value for locations
that are not stored. A simple and commonly used interpolation function
takes the elevation value of the nearest location that is stored. But smarter
interpolation functions (involving more than a single stored value) can
be used as well, as may be understood from the SST interpolations.
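The two kinds of interpolation function mentioned above can be sketched as follows. This is an illustration under stated assumptions: the sample set `SAMPLES` is hypothetical, and inverse distance weighting is chosen here merely as one well-known example of a function that uses more than a single stored value; the text does not name a specific method.

```python
import math

# Hypothetical stored samples: (x, y, elevation in metres)
SAMPLES = [(0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0)]

def nearest_neighbour(samples, x, y):
    """Take the elevation of the nearest stored location."""
    nearest = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return nearest[2]

def inverse_distance_weighted(samples, x, y, power=2.0):
    """Weighted mean of all stored values; closer samples weigh more."""
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return value          # the location itself is stored
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

step_value = nearest_neighbour(SAMPLES, 8.0, 1.0)             # value of sample (10, 0)
smooth_value = inverse_distance_weighted(SAMPLES, 5.0, 5.0)   # equidistant from all three
```

Nearest-neighbour interpolation produces a stepped surface, whereas the weighted variant changes gradually between the stored locations.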
1. Regular Tessellations
A tessellation (or tiling) is a partitioning of space into mutually exclusive
cells that together make up the complete study space. With each cell, some
(thematic) value is associated to characterize that part of space. In a regular
tessellation, the cells are the same shape and size. The simplest example is
a rectangular raster of unit squares, represented in a computer in the 2D
case as an array of n × m elements. The three types of regular tessellation
are as follows:
The three most common types of regular tessellation: from left to right,
square cells, hexagonal cells and triangular cells.
In all regular tessellations, the cells are of the same shape and size, and
the field attribute value assigned to a cell is associated with the entire
area occupied by the cell. The square cell tessellation is by far the
most used, mainly because georeferencing a cell is so straightforward.
These tessellations are known under various names in different GIS
packages, but most frequently as rasters.
A raster is a set of regularly spaced (and contiguous) cells with
associated (field) values. The associated values represent cell values,
not point values. This means that the value for a cell is assumed to be
valid for all locations within the cell. The location associated with a
raster cell is fixed by convention and may be the cell centroid
(mid-point) or, for instance, its left lower corner. Values for other
positions than these must be computed through some form of
interpolation function, which will use one or more nearby field values
to compute the value at the requested position. This allows us to
represent continuous, even differentiable, functions.
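To illustrate why georeferencing a square raster cell is straightforward, the sketch below maps a location to its cell and a cell to its centroid with simple arithmetic. The class `Raster`, its parameters and the sample grid are hypothetical, and treating row 0 as the northernmost row is an assumed convention rather than one stated in the text.

```python
import math

class Raster:
    """Square-cell raster; row 0 is the northernmost row (assumed convention)."""

    def __init__(self, values, x_min, y_max, cell_size):
        self.values = values          # list of rows, each a list of cell values
        self.x_min = x_min            # x of the left edge
        self.y_max = y_max            # y of the top edge
        self.cell_size = cell_size
        self.n_rows = len(values)
        self.n_cols = len(values[0])

    def cell_of(self, x, y):
        """Return the (row, col) of the cell that contains location (x, y)."""
        col = math.floor((x - self.x_min) / self.cell_size)
        row = math.floor((self.y_max - y) / self.cell_size)
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise ValueError("location lies outside the raster")
        return row, col

    def value_at(self, x, y):
        """The cell value is assumed valid for every location within the cell."""
        row, col = self.cell_of(x, y)
        return self.values[row][col]

    def centroid(self, row, col):
        """Location associated with a cell when the centroid convention is used."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return x, y

# Hypothetical 2 x 3 elevation raster with 100 m cells
ELEVATION = Raster([[10, 20, 30], [40, 50, 60]],
                   x_min=1000.0, y_max=2000.0, cell_size=100.0)
```

Because the cells are regular, no search is needed: a division and a rounding step are enough to find the cell for any location.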
An important advantage of regular tessellations is that we know how
they partition space, and we can make our computations specific to
this partitioning. This leads to fast algorithms. An obvious
disadvantage is that they are not adaptive to the spatial phenomenon
we want to represent. The cell boundaries are both artificial and fixed:
they may or may not coincide with the boundaries of the phenomena of
interest. For example, suppose we use any of the above regular
tessellations to represent elevation in a perfectly flat area. In this case
we need just as many cells as in a strongly undulating terrain: the data
structure does not adapt to the lack of relief. We would, for instance,
still use the m × n cells for the raster, although the elevation might be
1500 m above sea level everywhere.
2. Irregular Tessellations
Irregular tessellations are more complex than the regular ones, but they
are also more adaptive, which typically leads to a reduction in the
amount of memory used to store the data. A well-known data structure
in this family (upon which many more variations have been based) is
the region quadtree. It is based on a regular tessellation of square cells
but takes advantage of cases where neighbouring cells have the same
field value, so that they can together be represented as one bigger cell.
A simple illustration is provided in the figure.
It shows a small 8×8 raster with three possible field values: white,
green and blue. The quadtree that represents this raster is constructed
by repeatedly splitting up the area into four quadrants, which are called
NW, NE, SE, SW for obvious reasons. This procedure stops when all
the cells in a quadrant have the same field value. The procedure
produces an upside-down, tree-like structure, known as a quadtree. In
main memory, the nodes of a quadtree (both circles and squares in the
figure below) are represented as records. The links between them are
pointers, a programming technique to address (i.e. to point to) other
records.
An 8×8, three-valued raster (here: colours) and its representation as a
region quadtree. To construct the quadtree, the field is successively
split into four quadrants until parts have only a single field value. After
the first split, the southeast quadrant is entirely green, and this is
indicated by a green square at level two of the tree. Other quadrants
had to be split further.
Quadtrees are adaptive because they apply the spatial autocorrelation
principle, i.e., that locations that are near in space are likely to have
similar field values. When a conglomerate of cells has the same value,
they are represented together in the quadtree, provided boundaries
coincide with the predefined quadrant boundaries. Therefore, we can also
state that a quadtree provides a nested tessellation: quadrants are only split
if they have two or more values. The square nodes at the same level
represent equal area sizes, allowing quick computation of the area
associated with some field value. The top node of the tree represents the
complete raster.
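The splitting procedure described above can be sketched as a short recursive function. This is a minimal illustration, assuming a square raster whose side is a power of two; the 4×4 grid `RASTER` is a hypothetical example (not the 8×8 raster of the figure), and nested dictionaries stand in for the records and pointers mentioned in the text.

```python
def build_quadtree(raster, row=0, col=0, size=None):
    """Return a leaf value, or a dict with NW, NE, SE and SW subtrees."""
    if size is None:
        size = len(raster)   # assumes a square raster, side a power of two
    first = raster[row][col]
    uniform = all(raster[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform:
        return first         # stop: all cells in this quadrant share one value
    half = size // 2
    return {
        "NW": build_quadtree(raster, row, col, half),
        "NE": build_quadtree(raster, row, col + half, half),
        "SE": build_quadtree(raster, row + half, col + half, half),
        "SW": build_quadtree(raster, row + half, col, half),
    }

def count_leaves(node):
    """Number of leaf nodes, i.e. cells actually stored."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

def area_of(node, value, size):
    """Cells covered by `value`, derived from node levels, not single cells."""
    if isinstance(node, dict):
        return sum(area_of(child, value, size // 2) for child in node.values())
    return size * size if node == value else 0

# Hypothetical 4 x 4 raster: W = white, G = green, B = blue
RASTER = [
    ["W", "W", "G", "G"],
    ["W", "W", "G", "B"],
    ["B", "B", "G", "G"],
    ["B", "B", "G", "G"],
]

TREE = build_quadtree(RASTER)
```

In this example only the NE quadrant has to be split further, so the tree stores 7 leaves instead of 16 cells, and the area per value follows directly from the level at which each leaf sits.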
3. Vector Representations
In vector representations, an attempt is made to explicitly associate
georeferences with the geographic phenomena. A georeference is a
coordinate pair from some geographic space and is also known as a vector.
This explains the name. Below, we discuss various vector representations.
We start our discussion with the TIN, a representation for
geographic fields that can be considered a hybrid between tessellations
and vector representations.
Input locations and their (elevation) values for a TIN construction. The
location P is an arbitrary location that has no associated elevation
measurement.
A commonly used data structure in GIS software is the triangulated
irregular network, or TIN. It is one of the standard implementation
techniques for digital terrain models, but it can be used to represent any
continuous field. The principles behind a TIN are simple. It is built from a
set of locations for which we have a measurement, for instance an
elevation. The locations can be arbitrarily scattered in space and are
usually not on a nice regular grid. Any location together with its elevation
value can be viewed as a point in three-dimensional space.
4. Point representations
Points are defined as single coordinate pairs (x, y) when we work in 2D, or
coordinate triplets (x, y, z) when we work in 3D. Points are used to
represent objects that are best described as shape- and sizeless,
zero-dimensional features. For a tourist city map, a park will not usually be
considered a point feature, but perhaps a museum will, and certainly a
public phone booth might be represented as a point.
5. Line representations
Line data are used to represent one-dimensional objects such as roads,
railroads, canals, rivers and power lines. Again, there is an issue of
relevance for the application and the scale that the application requires.
For the example application of mapping tourist information, bus, subway
and streetcar routes are likely to be relevant line features. Some cadastral
systems, on the other hand, may consider roads to be two-dimensional
features, i.e. having a width as well.
The two end nodes and zero or more internal nodes or vertices define a
line. Other terms for ‘line’ that are commonly used in some GISs are
polyline, arc or edge. A node or vertex is like a point (as discussed above)
but it only serves to define the line, and provide shape in order to obtain a
better approximation of the actual feature.
The straight parts of a line between two consecutive vertices or end nodes
are called line segments. Many GISs store a line as a simple sequence of
coordinates of its end nodes and vertices, assuming that all its segments
are straight. This is usually good enough, as cases in which a single
straight-line segment is considered an unsatisfactory representation can
be dealt with by using multiple (smaller) line segments instead of only
one.
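As a small illustration of storing a line as a sequence of coordinates, the sketch below derives the line segments and the total length from such a sequence. The line `ROAD` and the function names are hypothetical; the segments are assumed to be straight, as in the text.

```python
import math

def segments(line):
    """Pairs of consecutive vertices: the straight line segments."""
    return list(zip(line, line[1:]))

def line_length(line):
    """Length of the line as the sum of its segment lengths."""
    return sum(math.dist(p, q) for p, q in segments(line))

# Hypothetical road: two end nodes and one internal vertex, coordinates in km
ROAD = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

end_nodes = (ROAD[0], ROAD[-1])
internal_vertices = ROAD[1:-1]
total_km = line_length(ROAD)   # segment lengths 5 km and 6 km
```

Adding more internal vertices to the sequence gives a closer approximation of a curved feature without changing the data structure.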
6. Area representations
When area objects are stored using a vector approach, the usual technique
is to apply a boundary model. This means that each area feature is
represented by some arc/node structure that determines a polygon as the
area’s boundary. Common sense dictates that area features of the same
kind are best stored in a single data layer, represented by mutually
non-overlapping polygons. In essence, what we then get is an
application-determined (i.e. adaptive) partition of space.
Observe that a polygon representation for an area object is yet another
example of a finite approximation of a phenomenon that inherently may
have a curvilinear boundary. In the case that the object can be perceived
as having a fuzzy boundary, a polygon is an even worse approximation,
though potentially the only one possible. Such information could be stored
in database tables.
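Since a boundary polygon determines the size of an area, the sketch below computes area and perimeter directly from the boundary vertices. The parcel `PARCEL` is hypothetical, and the shoelace formula used here is one standard way to do this for a simple (non-self-intersecting) polygon; the text itself does not prescribe a method.

```python
import math

def ring_area(ring):
    """Area of a simple polygon given as (x, y) vertices, first vertex not repeated."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1      # shoelace cross term per boundary segment
    return abs(total) / 2.0

def ring_perimeter(ring):
    """Length of the closed boundary."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:] + ring[:1]))

# Hypothetical rectangular parcel, coordinates in metres
PARCEL = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

area_m2 = ring_area(PARCEL)
perimeter_m = ring_perimeter(PARCEL)
```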
7. Topology and Spatial Relationships
Topology deals with spatial properties that do not change under certain
transformations. For example, features drawn on a sheet of rubber (as in
the figure) can be made to change in shape and size by stretching and pulling
the sheet. However, some properties of these features do not change:
Area E is still inside area D,
The neighbourhood relationships between A, B, C, D, and E stay intact,
and their boundaries have the same start and end nodes, and
The areas are still bounded by the same boundaries, only the shapes and
lengths of their perimeters have changed.
Topology refers to the spatial relationships between geographical
elements in a data set that do not change under a continuous
transformation.
Topological relationships are built from simple elements into more
complex elements: nodes define line segments, and line segments
connect to define lines, which in turn define polygons. The fundamental
issues relating to order, connectivity and adjacency of geographical
elements form the basis of more sophisticated GIS analyses. These
relationships (called topological properties) are invariant under a
continuous transformation, referred to as a topological mapping.
The mathematical properties of the geometric space used for spatial data
can be described as follows:
The space is a three-dimensional Euclidean space where for every point
we can determine its three-dimensional coordinates as a triple (x, y, z) of
real numbers. In this space, we can define features like points, lines,
polygons, and volumes as geometric primitives of the respective
dimension. A point is zero-dimensional, a line one-dimensional, a polygon
two-dimensional, and a volume is a three-dimensional primitive.
The space is a metric space, which means that we can always compute
the distance between two points according to a given distance function.
Such a function is also known as a metric.
The space is a topological space, of which the definition is a bit
complicated. In essence, for every point in the space we can find a
neighbourhood around it that fully belongs to that space as well.
Interior and boundary are properties of spatial features that remain
invariant under topological mappings. This means that under any
topological mapping, the interior and the boundary of a feature remain
unbroken and intact.
There are a number of advantages when our computer representations of
geographic phenomena have built-in sensitivity to topological issues.
Questions related to the ‘neighbourhood’ of an area are a case in point.
Simplices (0-simplex, 1-simplex, 2-simplex, 3-simplex) and a simplicial
complex. Features are approximated by a set of points, line segments,
triangles, and tetrahedrons.
We can use the topological properties of interior and boundary to define
relationships between spatial features. Since the properties of interior and
boundary do not change under topological mappings, we can investigate
their possible relations between spatial features. We can define the interior
of a region R as the largest set of points of R for which we can construct a
disk-like environment around it (no matter how small) that also falls
completely inside R. The boundary of R is the set of those points belonging
to R but that do not belong to the interior of R, i.e. one cannot construct a
disk-like environment around such points that still belongs to R completely.
The five rules of topological consistency in two-dimensional space:
   1. Every 1-simplex (‘arc’) must be bounded by two 0-simplices
   (‘nodes’, namely its begin and end node).
   2. Every 1-simplex borders two 2-simplices (‘polygons’, namely its
   ‘left’ and ‘right’ polygons).
   3. Every 2-simplex has a closed boundary consisting of an
   alternating (and cyclic) sequence of 0- and 1-simplices.
   4. Around every 0-simplex exists an alternating (and cyclic) sequence
   of 1- and 2-simplices.
   5. 1-simplices only intersect at their (bounding) nodes.
8. Scale and Resolution
Map scale can be defined as the ratio between the distance on a paper
map and the distance of the same stretch in the terrain. A 1:50,000 scale
map means that 1 cm on the map represents 50,000 cm, i.e., 500 m, in
the terrain. ‘Large-scale’ means that the ratio is large, so typically it
means there is much detail, as in a 1:1,000 paper map. ‘Small-scale’
in contrast means a small ratio, hence less detail, as in a 1:2,500,000
paper map. When applied to spatial data, the term resolution is
commonly associated with the cell width of the tessellation applied.
When digital spatial data sets have been collected with a specific
map-making purpose in mind, and these maps were designed to be of a single
map scale, like 1:25,000, we might suppose that the data carries the
characteristics of “a 1:25,000 digital data set.”
9. Representation of Geographic Fields
A geographic field can be represented through a tessellation, through a
TIN or through a vector representation. The choice between them is
determined by the requirements of the application at hand. It is more
common to use tessellations, notably rasters, for field representation, but
vector representations are in use too. We have already looked at TINs. We
provide an example of the other two below.
10. Representation of Geographic objects
The representation of geographic objects is most naturally supported with
vectors. After all, objects are identified by the parameters of location,
shape, size and orientation and many of these parameters can be expressed
in terms of vectors. However, tessellations are still commonly used for
representing geographic objects.
11. Tessellations to represent geographic objects
Remotely sensed images are an important data source for GIS
applications. Unprocessed digital images contain many pixels, with each
pixel carrying a reflectance value. Various techniques exist to process
digital images into classified images that can be stored in a GIS as a raster.
Image classification attempts to characterize each pixel into one of a finite
list of classes, thereby obtaining an interpretation of the contents of the
image.
Line and point objects are more awkward to represent using rasters. After
all, we could say that rasters are area-based, and geographic objects that
are perceived as lines or points are perceived to have zero area size.
Standard classification techniques, moreover, may fail to recognize these
objects as points or lines.
12. Vector representations of geographic objects
A vector-based GIS is defined by the vectorial representation of its
geographic data. In accordance with the characteristics of this data model,
geographic objects are explicitly represented, and the thematic aspects are
related to their spatial characteristics.
2.4 ORGANIZING AND MANAGING SPATIAL DATA
The main principle of data organization applied in a GIS is the spatial
data layer. A spatial data layer is either a representation of a continuous or
discrete field, or a collection of objects of the same kind. Usually, the data
is organized so that similar elements are in a single data layer. For
example, all telephone booth point objects would be in one layer, and all
road line objects in another.
A data layer contains spatial data and attribute (or: thematic) data, which
further describes the field or objects in the layer. Attribute data is quite
often arranged in tabular form, maintained in a geodatabase. Data layers
can be overlaid with each other, inside the GIS package, to study
combinations of geographic phenomena. We shall see later that a GIS
can be used to study the spatial relationships between different
phenomena, requiring computations which overlay one data layer with
another.
Two different object layers can be overlaid to look for spatial
correlations, and the result can be used as a separate (object) layer.
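The idea of overlaying one data layer with another can be sketched with two small raster layers. This is an illustration only: it assumes both layers are rasters covering the same cells, whereas the caption above speaks of object layers; the layers `LAND_COVER` and `SLOPE_DEG`, their values and the 20-degree threshold are hypothetical.

```python
def overlay(layer_a, layer_b, combine):
    """Combine two equally sized raster layers cell by cell into a new layer."""
    same_shape = (len(layer_a) == len(layer_b) and
                  all(len(row_a) == len(row_b)
                      for row_a, row_b in zip(layer_a, layer_b)))
    if not same_shape:
        raise ValueError("layers must share the same raster dimensions")
    return [[combine(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]

# Hypothetical layers covering the same 2 x 3 cells
LAND_COVER = [["forest", "forest", "grass"],
              ["grass", "forest", "water"]]
SLOPE_DEG = [[5, 25, 30],
             [10, 40, 0]]

# Result layer: cells that are forested and steeper than 20 degrees
STEEP_FOREST = overlay(LAND_COVER, SLOPE_DEG,
                       lambda cover, slope: cover == "forest" and slope > 20)
```

The result is itself a layer with the same dimensions, so it can be stored or overlaid again with further layers.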
2.5 THE TEMPORAL DIMENSION
Besides having geometric, thematic, and topological properties,
geographic phenomena are also dynamic; they change over time. For an
increasing number of applications, these changes themselves are the key
aspect of the phenomenon to study. Examples include identifying the
owners of a land parcel in 1972, or how land cover in a certain area
changed from native forest to pastures over a specific time. We can note
that some features or phenomena change slowly, such as geological
features, or as in the example of land cover given above. Other phenomena
change very rapidly, such as the movement of people or atmospheric
conditions. For different applications, different scales of measurement
will apply.
Examples of the kinds of questions involving time include:
Where and when did something happen?
How fast did this change occur?
In which order did the changes happen?
1. Discrete and continuous time
Time can be measured along a discrete or continuous scale. Discrete time
is composed of discrete elements (seconds, minutes, hours, days, months,
or years). In continuous time, no such discrete elements exist, and for any
two different points in time, there is always another point in between. We
can also structure time by events (points in time) or periods (time
intervals). When we represent time periods by a start and end event, we
can derive temporal relationships between events and periods such as
‘before’, ‘overlap’, and ‘after’.
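When periods are represented by a start and an end event, the three relationships named above can be derived by comparing those events. The sketch below is a minimal illustration; the function `relate` and the example periods are hypothetical, and treating periods that merely touch as overlapping is an assumption made here.

```python
from datetime import date

def relate(period_a, period_b):
    """Classify period_a relative to period_b as 'before', 'after' or 'overlap'.

    Each period is a (start, end) pair with start <= end. Periods that share
    even a single day are reported as 'overlap' (an assumed convention).
    """
    start_a, end_a = period_a
    start_b, end_b = period_b
    if end_a < start_b:
        return "before"
    if start_a > end_b:
        return "after"
    return "overlap"

# Hypothetical periods
LEASE = (date(1990, 1, 1), date(1995, 12, 31))
SURVEY = (date(1996, 6, 1), date(1996, 6, 30))
CONSTRUCTION = (date(1995, 1, 1), date(1997, 1, 1))

lease_vs_survey = relate(LEASE, SURVEY)
```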
2. Valid time and transaction time
Valid time (or world time) is the time when an event really happened, or a
string of events took place. Transaction time (or database time) is the time
when the event was stored in the database or GIS. Observe that the time at
which we store something in the database/GIS typically is (much) later
than when the related event took place.
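The distinction can be made concrete by storing both times with each record. The sketch below is illustrative only: the record type `ParcelTransfer`, its field names and the example values are hypothetical and not taken from any particular GIS or database.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class ParcelTransfer:
    parcel_id: str
    new_owner: str
    valid_time: datetime         # when the transfer really happened
    transaction_time: datetime   # when it was stored in the database/GIS

    def recording_lag(self):
        """How much later the event was stored than it took place."""
        return self.transaction_time - self.valid_time

# Hypothetical record: transfer on 1 March 1972, entered two weeks later
RECORD = ParcelTransfer(
    parcel_id="P-17",
    new_owner="J. Smith",
    valid_time=datetime(1972, 3, 1, 10, 0),
    transaction_time=datetime(1972, 3, 15, 9, 0),
)

lag = RECORD.recording_lag()
```

A question such as "who owned the parcel in 1972" is answered from the valid time, while "what did the database say on a given day" is answered from the transaction time.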
3. Linear, branching, and cyclic time
Time can be linear, extending from the past to the present (‘now’), and
into the future. This view gives a single timeline. For some types of
temporal analysis, branching time (in which different timelines from a
certain point in time onwards are possible) and cyclic time (in which
repeating cycles such as seasons or days of a week are recognized) make
more sense and can be useful.
4. Time granularity
When measuring time, we speak of granularity as the precision of a time
value in a GIS or database (e.g., year, month, day, second, etc.).
Different applications may obviously require different granularity. In
cadastral applications, time granularity might well be a day, as the law
requires deeds to be date-marked; in geological mapping applications,
time granularity is more likely in the order of thousands or millions of
years.
5. Absolute and relative time
Time can be represented as absolute or relative. Absolute time marks a
point on the timeline where events happen (e.g., ‘6 July 1999 at 11:15
p.m.’). Relative time is indicated relative to other points in time (e.g.,
‘yesterday’, ‘last year’, ‘tomorrow’, which are all relative to ‘now’, or
‘two weeks later’, which is relative to some other arbitrary point in time).
Temporal data is simply data that represents a state in time, such as the
land-use patterns of Hong Kong in 1990, or total rainfall in Honolulu on
July 1, 2009. Temporal data is collected to analyze weather patterns and
other environmental variables, monitor traffic conditions, study
demographic trends, and so on. This data comes from many sources
ranging from manual data entry to data collected using observational
sensors or generated from simulation models. Below are some examples
of temporal data.
2.6 SUMMARY
Geographic phenomena are present in the real world that we study; their
computer representations only live inside computer systems. This chapter
has discussed different types of geographic phenomena and examined the
ways that these can be represented in a computer system, such as a GIS.
The first type of phenomena we called fields, the second we called objects.
Amongst fields, we identified continuous and discrete phenomena.
Continuous phenomena could even be differentiable, meaning that for
locations factors such as gradient and aspect can be determined. Amongst
objects, important classification parameters include location, shape, size
and orientation. This chapter also elaborated on the techniques with which
the above phenomena are stored in a computer system. It closed with
topology, spatial relationships and the temporal dimension.
2.7 REFERENCES
1. Principles of Geographic Information Systems: An Introductory
Textbook, by Otto Huisman and Rolf A. de By
2. Introduction to Geographic Information Systems, by Kang-tsung (Karl)
Chang, McGraw-Hill
3. Fundamentals of Geographic Information Systems, by Michael N.
DeMers, Wiley
4. https://www.educba.com/applications-of-gis/
5. https://desktop.arcgis.com/en/arcmap/10.3/map/time/what-is-temporal-data.htm
2.8 QUESTIONS
1. Write a short note on geographic phenomena.
2. How are real-world objects represented using models in GIS? Explain.
3. Define geographic field. Explain its different data types and values.
4. What are regular and irregular tessellations? Explain.
5. Explain topology and the spatial representations of geographic objects.
6. Explain the temporal dimension in brief with an example.
3
HARDWARE AND SOFTWARE TRENDS
IN GEOGRAPHIC INFORMATION
SYSTEM
Unit Structure :
3.0 Objectives
3.1 Introduction
3.2 Hardware and Software trends
3.3 Geographic Information Systems
3.3.1 GIS software
3.3.2 GIS architecture and functionality
3.3.3 Spatial data infrastructure
3.4 Stages of spatial data handling
3.4.1 Spatial data capture and preparation
3.4.2 Spatial data storage and maintenance
3.4.3 Spatial query and analysis
3.4.4 Spatial data presentation
3.5 Summary
3.6 Questions
3.7 MCQ Questions
3.8 References
3.0 OBJECTIVES
The objective of this chapter is to help students understand the following
concepts:
Capability of GIS
Architecture of GIS
Functionality of GIS
Spatial Data infrastructure
Stages of Spatial data handling
3.1 INTRODUCTION
The abbreviation GIS stands for geographic information system.
A GIS is a tool for working with geographic information.
A system is a set of things that work together as parts of a
mechanism or an interconnecting network.
A GIS also contains software and hardware.
Spatial data refers to “where” things are, or perhaps, where they were
or will be.
In other words, spatial data means data that contains positional values,
such as (x, y) coordinates.
Spatial data is also known as geospatial data.
A working GIS requires both hardware and software, and also people
such as the database creators or administrators, analysts
who work with the software, and the users of the end product.
A data processing system refers to the hardware and software components
which are able to process, store and transfer data, or the components of
systems that facilitate the management and processing of
geoinformation.
3.2 HARDWARE AND SOFTWARE TRENDS
Computer hardware has changed tremendously and continues to advance
at an ever-increasing rate.
A faster, more powerful processor generation replaces the previous one
every few months.
Furthermore, computers are becoming increasingly portable, while
offering increased performance.
Compared with the first PCs introduced in the early 1980s, today's
handheld computers have a multiple of the computing power.
Current PCs are thousands of times faster than 25-year-old
"minicomputers".
To illustrate this trend: compare an early 1980s PC with a 2 MHz CPU,
128 Kbytes of main memory, and a 10 Mbyte hard drive with today's
desktop PC.
The cost of computers is also decreasing.
Nowadays, handheld computers are commonplace in business and
personal use, providing field surveyors with powerful tools, including
GPS capabilities.
As a result of these hardware trends, software providers continue to
create application programs and operating systems with an increasing
amount of functionality while consuming an increasing amount of
memory as well.
It is generally believed that software technology has developed
somewhat slower than hardware technology and as a result cannot fully
utilise the capabilities offered by the ever-expanding hardware
capabilities.
Existing software obviously performs better when run on faster
computers.
Along with these trends, there have also been significant developments
in computer networks.
Nowadays, almost any computer on Earth can connect to some
network, and contact computers virtually anywhere else, allowing fast
and reliable exchange of (spatial) data.
Mobile phones are more and more frequently being used to connect to
computers on the Internet.
The UMTS protocol (Universal Mobile Telecommunication s System),
allows digital communication of text, Mobile communication audio,
and video at a rate of approximately 2 Mbps.
The new HSDPA (High -Speed Downlink Packet Access) protocol
offers up to 10 times the speed of UMTS.
Bluetooth version 2.0 is a standa rd that offers up to 3 Mbps
connections, especially between palm - and laptop computers and their
peripheral devices, such as a mobile phone, GPS or printer at short
range.
Wireless LANs (Local Area Networks), under the so -called Wi -Fi
standard, nowadays o ffer a bandwidth of up to 108 Mbps on a single
connection point, to be shared between computers.
They are increasingly used to build computer networks in office
buildings and in private homes.
When the medium of communication is not the air but copper or
fibre-optic cables, that is, in wired (structured) networks, the
achievable speeds are different.
Standard ‘dial-up’ telephone modems allow rates up to 56 kbps.
Digital telephone links (ISDN) support much higher rates: up to 1.5
Mbps.
ADSL (Asymmetric Digital Subscriber Line) technology, widely
available through telephone companies on standard copper-wire
networks, supports transfer rates anywhere between 2 and 20 Mbps
towards the customer (downstream), and between 1 and 8 Mbps
towards the network (upstream), depending on the internet provider and
the quality of the network infrastructure.
Wide-area computer networks (national, continental, global) have a
capacity of several Gbps.
ITC’s dedicated Local Area Network (LAN), which is partially fibre
optics-based, supports a local transmission rate of 1 Gbps.
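To put these rates in perspective, the following minimal Python sketch estimates the ideal time needed to move a spatial data set over each kind of link. The 100 MB data set size and the function name are illustrative assumptions, the HSDPA value of 20 Mbps simply applies the "up to 10 times the speed of UMTS" statement above, and real throughput will be lower because of protocol overhead and shared bandwidth.

```python
# Nominal link rates quoted in the text, in megabits per second (Mbps).
LINK_RATES_MBPS = {
    "dial-up modem": 0.056,
    "ISDN": 1.5,
    "UMTS": 2.0,
    "Bluetooth 2.0": 3.0,
    "HSDPA": 20.0,        # assumption: 10 x the UMTS rate
    "Wi-Fi": 108.0,
    "LAN (1 Gbps)": 1000.0,
}


def transfer_seconds(size_megabytes: float, rate_mbps: float) -> float:
    """Ideal transfer time, taking 1 byte = 8 bits and ignoring all overhead."""
    return size_megabytes * 8.0 / rate_mbps


if __name__ == "__main__":
    size_mb = 100.0  # hypothetical data set size in megabytes
    for name, rate in LINK_RATES_MBPS.items():
        print(f"{name:>14}: {transfer_seconds(size_mb, rate):10.1f} s")
```

Under these assumptions the same data set that needs several hours over a dial-up modem takes under a second on the 1 Gbps LAN.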
3.3 GEOGRAPHIC INFORMATION SYSTEMS
A GIS is a computer-based system that provides the following four sets
of capabilities to handle georeferenced data:
1. Data capture and preparation
2. Data management, including storage and maintenance
3. Data manipulation and analysis
4. Data presentation
For many years, analogue data sources were used, processing was done
manually, and paper maps were produced.
The introduction of modern techniques has led to an increased use of
computers and digital information in all aspects of spatial data
handling.
Spatial data refers to where things are, or perhaps where they were or
will be. To be more precise, the professionals who work with such data
deal with questions related to geographic space, which we define as
space with positional data relative to the Earth’s surface. Spatial
data thus represent positional data.
Most planning projects require both spatial and non-spatial data from a
number of national institutes, such as national mapping agencies, soil
and forest survey institutes, and national census bureaus.
The data sources obtained may be from different time periods, and the
spatial data may be in different scales or projections.
With the help of a GIS, the spatial data can be stored in digital form in
world coordinates.
With this software, scaling transformations can be avoided, and map
projections can be converted easily.
With the spatial data thus prepared, spatial analysis functions of the
GIS can then be applied to perform the planning tasks.
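As an illustration of what a projection conversion involves, the sketch below converts longitude/latitude values to spherical Mercator coordinates and back. The choice of this particular projection, the sphere radius and the function names are assumptions made for the example; the text does not prescribe a projection, and a GIS would normally rely on its built-in projection library.

```python
import math

# Assumption: sphere radius in metres, as used by the common spherical
# Mercator convention.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_mercator(lon_deg, lat_deg):
    """Project geographic coordinates (degrees) to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    """Inverse projection: spherical Mercator metres back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


if __name__ == "__main__":
    x, y = lonlat_to_mercator(72.8777, 19.0760)  # hypothetical point
    print(x, y)
    print(mercator_to_lonlat(x, y))
```

Because the stored coordinates are world coordinates, the same point can be re-projected on demand without redigitizing or rescaling the source map.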
3.3.1 GIS software
GIS can be considered as a system that stores spatial data (positional
data), a toolbox, a technology, an information source or a field of
science.
The main characteristics of a GIS software package are its analytical
functions, which provide means for obtaining new geoinformation from
existing spatial and attribute data.
Geographic information science is driven by the use of GIS tools,
which in turn are improved by the insights and information gained from
their application to various scientific fields.
Spatial information theory is one such field, which focuses specifically
on providing the background for the production of tools for the
handling of spatial data.
All GIS packages available on the market have their strengths and
weaknesses, typically resulting from the development history and/or
intended application domain(s) of the package.
There are some GIS tools that are designed to support raster-based
functionality, while others focus on (vector-based) spatial objects.
We can state that any package that provides support for only rasters or
only objects is not a complete GIS. Well-known, full-fledged GIS
packages include ILWIS, Intergraph’s GeoMedia, ESRI’s ArcGIS,
and MapInfo from MapInfo Corp.
Generally, no one GIS package is 'better' than another: it depends on
factors such as the intended application and the user's expertise.
ILWIS’s traditional strengths are in raster processing and scientific
spatial data analysis, especially in project-based GIS applications.
Intergraph, ESRI and MapInfo products have been known better for
their support of vector-based spatial data and their operations, user
interface and map production.
[Figure: Software Development in GIS (Shared Geo Data; Professional GIS;
Embedded GIS; Open GIS; CAD Based GIS; Internet GIS; Desktop GIS)]
3.3.2 GIS architecture and functionality
As we know, a geographic information system in the wider sense
consists of software, data, people, and an organization in which it is
used.
We should also note that organizational factors will define the context
and rules for the capturing, processing and sharing of geoinformation,
as well as the role which GIS plays in the organization as a whole.
A GIS consists of several functional components, which support key GIS
functions.
These functional components are data capture and preparation, data
storage, data analysis, and presentation of spatial data.
Figure 3.3.2 shows the diagram of these components, with arrows
indicating the data flow in the system.
For a particular GIS, each of these components may provide many or
only a few functions.
If any of these components is missing, the system cannot be called a
geographic information system.
However, it is important to note that the same function may be offered
by different components of the GIS: for instance, data capture and data
storage may have functions in common, and the same holds for data
preparation and data analysis.
3.3.3 Spatial data infrastructure
For reasons of efficiency and legislation, organizations are
increasingly cooperating to obtain geographic information and to
provide it to other organizations as well as the general public.
Data dissemination, security, copyright, and pricing must all be
addressed when spatial data is shared between the GISs of those
organizations.
The design and maintenance of a Spatial Data Infrastructure (SDI)
deals with these issues.
An SDI is defined as “the relevant base collection of technologies,
policies and institutional arrangements that facilitate the availability of
and access to spatial data”.
A fundamental component of those arrangements is agreement on how to
share geographic data: in a broad sense between organizations, and in a
narrower sense between software systems.
[Figure 3.3.2: Functional components of a GIS (data capture and
preparation; data storage and maintenance; manipulation and analysis;
data presentation)]
In SDI, standards are often the starting point for those agreements.
Standards exist for all facets of GIS, ranging from data capture to data
presentation.
They are developed by different organizations, of which the most
prominent are the International Organization for Standardization
(ISO) and the Open Geospatial Consortium (OGC).
Typically, an SDI provides its users with different facilities for finding,
viewing, downloading and processing data.
Because the organizations in an SDI are normally widely distributed
over space, computer networks are used as the means of
communication.
With the development of the internet, the functional components of GIS
have gradually become available as web-based applications.
Much of the functionality is provided by so-called geo-webservices,
software programs that act as an intermediary between geographic
databases and the users of the web.
Geo-webservices can vary from a simple map display service to a
service which involves complex spatial calculations.
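A simple map display service of this kind is typically called through a parameterised web request. As a hedged sketch, the function below assembles a GetMap request in the style of the OGC Web Map Service (WMS) 1.3.0 standard. The endpoint URL, layer name and bounding box are hypothetical, and the text itself does not name WMS; it is used here only as one familiar example of a geo-webservice.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap request URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); with CRS:84 the axis
    order is longitude first.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                 # empty value selects the default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical server and layer, for illustration only.
    print(build_getmap_url("https://example.org/wms", "landuse",
                           (72.7, 18.8, 73.1, 19.3), 800, 600))
```

The client only needs to know the request parameters; where and how the server stores the data stays hidden behind the service.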
Basic software components of an SDI
Software client: to display, query and analyse spatial data (web or desktop GIS)
Catalogue service: for discovering, browsing and querying metadata or spatial data (datasets)
Spatial data service: allows delivery of data via the internet
Processing service: data, projection and scale transformation
Spatial data repository: to store data
GIS software: to create and update data
3.4 STAGES OF SPATIAL DATA HANDLING
The various stages of spatial data handling are as follows:
1. Spatial data capture and preparation
2. Spatial data storage and maintenance
3. Spatial query and analysis
4. Spatial data presentation
3.4.1 Spatial data capture and preparation
Data capture is closely related to surveying engineering,
photogrammetry, remote sensing, and digitization, i.e., the conversion
of analogue data into digital representations.
Remote sensing is the field that provides photographs and images as the
raw base data from which spatial data sets are derived.
Field surveys are often needed to collect data that cannot be obtained
through remote sensing, or to validate data thus obtained.
Traditional techniques for obtaining spatial data, typically from paper
sources, included manual digitizing and scanning.
Table 3.4.1 lists the main methods and devices used for data capture.
Method                           Devices
Manual digitizing                • coordinate entry via keyboard
                                 • digitizing tablet with cursor
                                 • mouse cursor on the computer monitor
                                   (heads-up digitizing)
                                 • (digital) photogrammetry
Automatic digitizing             • scanner
Semi-automatic digitizing        • line-following software
Input of available digital data  • CD-ROM or DVD-ROM
                                 • via computer network or internet
                                   (including geo-webservices)
Table 3.4.1: Spatial data input methods and devices used
In recent years there has been a significant increase in the availability
and sharing of digital geospatial data.
Computer networks and media play an important role in disseminating
this data, particularly the internet.
In some cases, the data may not yet be ready for use in the system when
it is obtained in some digital format.
This may be because the format obtained from the capturing process is
not quite the format required for storage and further use, which means
that some type of data conversion is required.
This problem may also arise if the captured data is only raw base data,
from which the real data objects of interest to the system will need to
be constructed.
For example, semi-automatic digitizing may produce line segments,
while the application requires non-overlapping polygons. A
build-and-verification phase would then be needed to obtain these from
the captured lines.
3.4.2 Spatial data storage and maintenance
Data organization
The way that data is stored plays a central role in the processing and the
eventual understanding of that data.
In most of the available systems, spatial data is organized in layers by
theme and/or scale.
For instance, the data may be organized in thematic categories, such as
land use, topography and administrative subdivisions, or according to
map scale.
An important underlying principle is that the representation of the real
world should be designed to reflect phenomena and their
relationships as naturally as possible.
In a GIS, features are represented with their geometric and
non-geometric attributes and relationships.
The geometry of features is represented using primitives of the relevant
dimension: a windmill might be a point; a field of crops might be a
polygon.
The primitives follow either the vector or the raster approach.
Cells, pixels and voxels
As we know, vector data types describe an object through its boundary,
thus dividing the space into parts that are occupied by the respective
objects.
The raster approach subdivides space into regular cells, mostly as a
square tessellation of dimension two or three.
These cells are called cells or pixels in 2D, and voxels in 3D.
If the raster represents a discrete field, each cell contains a
description of the real-world feature it represents.
For continuous fields, the cell holds a representative value.
Table 3.4.2 lists advanta ges and disadvantages of raster and vector
representations.
               Raster representation              Vector representation
Advantages     • simple data structure            • efficient representation of topology
               • simple implementation            • adapts well to scale changes
                 of overlays                      • allows representing networks
               • efficient for image processing   • allows easy association
                                                    with attribute data
Disadvantages  • less compact data structure      • complex data structure
               • difficulties in                  • overlay more difficult to implement
                 representing topology            • inefficient for image processing
               • cell boundaries independent      • more update-intensive
                 of feature boundaries
Raster encoding
The storage of a raster is straightforward. It is stored in a file as a long
list of values, one for each cell, preceded by a small list of extra data
called the ‘file header’, which specifies how to interpret the long list.
The order of the cell values in the list can be, but need not be,
left-to-right, top-to-bottom.
This simple encoding scheme is known as row ordering.
The header of the raster file will typically state how many rows and
columns the raster has, which encoding scheme is used, and what sort
of values are stored for each cell.
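Row ordering can be sketched in a few lines of Python. The header fields and function names below are illustrative assumptions rather than an actual raster file format; the point is only that the header tells a reader how to turn a position in the long list back into a row and column.

```python
def encode_row_order(grid):
    """Flatten a 2D grid (a list of rows) into a header plus a value list."""
    rows, cols = len(grid), len(grid[0])
    # Hypothetical header: real raster formats define their own fields.
    header = {"rows": rows, "cols": cols, "order": "row", "dtype": "int"}
    values = [value for row in grid for value in row]  # left-to-right, top-to-bottom
    return header, values


def cell_value(header, values, row, col):
    """Look up one cell: its position in the list is row * cols + col."""
    return values[row * header["cols"] + col]


if __name__ == "__main__":
    land_use = [[1, 2, 3],
                [4, 5, 6]]  # hypothetical class codes
    header, values = encode_row_order(land_use)
    print(header, values)
    print(cell_value(header, values, 1, 2))
```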
Raster files can be quite big data sets. Computationally, it makes sense
to arrange the long list of cell values in such a way that spatially nearby
cells are also close together in the list.
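One well-known way to keep neighbouring cells close together in the list is Morton (Z-order) ordering, which interleaves the bits of the row and column numbers. The text does not name a specific scheme, so the sketch below is offered only as one possible example, with hypothetical function names.

```python
def morton_key(row, col, bits=16):
    """Interleave the bits of row and col into a single sort key."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)       # column bits on even positions
        key |= ((row >> i) & 1) << (2 * i + 1)   # row bits on odd positions
    return key


def morton_order(rows, cols):
    """Return all (row, col) cells of a raster sorted by their Morton key."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return sorted(cells, key=lambda rc: morton_key(rc[0], rc[1]))


if __name__ == "__main__":
    # The four cells of the top-left 2 x 2 block come first.
    print(morton_order(4, 4)[:4])
```

With row ordering, vertically adjacent cells are a full row length apart in the list; with this ordering, each 2 x 2 block of cells is stored contiguously.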
Low-level storage structures for vector data are much more
complicated. The best understanding can be obtained from
Figure 3.4.2, where a boundary model for polygon objects is illustrated.
Similar structures are in use for line objects.
Figure 3.4.2: A simple boundary model for the polygons A, B and C.
For each arc, we store the start and end node and its left and right
polygon. The ‘polygon’ W denotes the outside world polygon.
The boundary model is sometimes also called the topological data
model as it captures some topological information, such as polygon
neighbourhood.
DBMS and Spatial Database
The GIS software packages support both spatial and attribute data, i.e.,
spatial data storage using vectors and attribute data storage using tables.
Historically, database management systems (DBMSs) have been based
on tables for data storage.
For some time, many GIS applications have been able to access external
databases to store attribute data and to make use of their superior data
management capabilities.
Currently, all major GIS packages provide facilities to link with a
DBMS and exchange attribute data with it.
Spatial (vector) and attribute data are still sometimes stored in
separate structures, although they can now be stored directly in a spatial
database.
Spatial data is associated with geographic locations such as cities and
towns. A spatial database is optimized to store and query data
representing objects defined in a geometric space.
Arc table for the boundary model of Figure 3.4.2:
line  from  to  left  right  vertexlist
b1    4     1   W     A      . . .
b2    1     2   B     A      . . .
b3    1     3   W     B      . . .
b4    2     4   C     A      . . .
b5    3     4   W     C      . . .
b6    3     2   C     B      . . .
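The arc table of the boundary model makes topological questions cheap to answer. As a minimal sketch, the Python code below stores the six arcs as tuples (the vertex lists are omitted, as in the table) and derives which polygons are neighbours. The function names are hypothetical, not part of any GIS package.

```python
# (arc, from node, to node, left polygon, right polygon); W is the outside world.
ARCS = [
    ("b1", 4, 1, "W", "A"),
    ("b2", 1, 2, "B", "A"),
    ("b3", 1, 3, "W", "B"),
    ("b4", 2, 4, "C", "A"),
    ("b5", 3, 4, "W", "C"),
    ("b6", 3, 2, "C", "B"),
]


def polygon_neighbours(arcs):
    """Two polygons are neighbours if they lie on either side of the same arc."""
    neighbours = {}
    for _name, _start, _end, left, right in arcs:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)
    return neighbours


def boundary_arcs(arcs, polygon):
    """List the arcs that form the boundary of one polygon."""
    return [name for name, _s, _e, left, right in arcs if polygon in (left, right)]


if __name__ == "__main__":
    print(polygon_neighbours(ARCS))
    print(boundary_arcs(ARCS, "A"))
```

No coordinate geometry is needed for either question, which is why the boundary model is said to capture topological information.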
Data maintenance
Maintenance of spatial data is the process of keeping the data set
current and supportive to the user community.
It deals with obtaining new data and entering them into the system,
possibly replacing outdated data.
The purpose is to have an up-to-date stored data set available.
For example, after a major earthquake, we may need to update our
road network data to reflect that roads have been destroyed or have
become blocked.
Spatial data must be updated both because the requirements of data
users change and because many aspects of the real world are
constantly changing.
These data updates can take different forms. It may be that a complete,
new survey has been carried out, from which an entirely new data set is
derived that will replace the current set.
Such a situation is common if the spatial data originate from remotely
sensed data, such as a new vegetation cover set or digital elevation
model.
It may also be that local ground surveys have revealed local changes,
for instance, new constructions, or changes in land use or ownership. In
such cases, local changes to the large spatial data set are more typically
required. Such local changes should respect matters of data
consistency, i.e., they should leave other spatial data within the same
layer intact and correct.
3.4.3 Spatial query and analysis
SDSS
The most distinguishing parts of a GIS are its functions for spatial
analysis, i.e., operators that use spatial data to derive new
geoinformation.
Spatial queries and process models play an important role in this
functionality.
One of the key uses of GISs has been to support spatial decisions.
Spatial decision support systems (SDSS) are a category of information
systems composed of a database, GIS software, models, and a so-called
knowledge engine, which allow users to deal specifically with locational
problems.
Spatial data analysis
In a GIS, data are usually grouped into layers (or themes).
Usually, several themes are part of a project.
The analysis functions of a GIS use the spatial and non-spatial
attributes of the data in a spatial database to provide answers to user
questions.
GIS functions are used for maintenance of the data, and for analysing
the data in order to infer information from it.
Analysis of spatial data can be defined as computing new information
that provides new insight from the existing, stored spatial data.
Consider an example from the domain of road construction. In
mountainous areas this is a complex engineering task with many cost
factors, which include the number of tunnels and bridges to be
constructed, the total length of the road, and the volume of rock and
soil to be moved.
GIS can help to compute such costs on the basis of an up-to-date digital
elevation model and soil map.
The exact nature of the analysis will depend on the application
requirements, but computations and analytical functions operate on
both spatial and non-spatial data.
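One of the cost factors named above, the total length of the road, can be approximated from elevations sampled along a planned route. The sketch below is a simplified illustration: the profile points are hypothetical, and in a real analysis the z values would be read from the digital elevation model by the GIS.

```python
import math


def road_length_3d(profile):
    """Sum the 3D distances between consecutive (x, y, z) points, in metres."""
    return sum(math.dist(p, q) for p, q in zip(profile, profile[1:]))


def total_climb(profile):
    """Sum of all upward elevation differences along the route, in metres."""
    return sum(max(q[2] - p[2], 0.0) for p, q in zip(profile, profile[1:]))


if __name__ == "__main__":
    # Hypothetical route: (x, y) map coordinates with an elevation z per point.
    route = [(0.0, 0.0, 100.0), (300.0, 400.0, 100.0), (300.0, 400.0, 112.0)]
    print(road_length_3d(route), total_climb(route))
```

Combining such geometric measures with non-spatial attributes, for example a cost per metre that depends on soil type, is what turns stored data into new information.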
3.4.4 Spatial data presentation
The presentation of spatial data, whether in print or on-screen, in maps
or in tabular displays, or as ‘raw data’, is closely related to the
disciplines of cartography, printing and publishing.
The presentation may either be an end-product, for example as a
printed atlas, or an intermediate product, as in spatial data made
available through the internet.
Method                       Devices
Hard copy                    • printer
                             • plotter (pen plotter, ink-jet printer, thermal
                               transfer printer, electrostatic plotter)
                             • film writer
Soft copy                    • computer screen
Output of digital data sets  • magnetic tape
                             • CD-ROM or DVD
                             • the internet
Table 3.4.4: Spatial data presentation
Table 3.4.4 lists several different methods and devices used for the
presentation of spatial data.
Cartography and scientific visualization make use of these methods and
devices to produce their products.
3.5 SUMMARY
A system is a set of things that work together as parts of a
mechanism or an interconnecting network.
A GIS also contains software and hardware.
Spatial data means data that contains positional values, such as (x, y)
co-ordinates.
A working GIS requires both hardware and software, and also people
such as the database creators or administrators, analysts
who work with the software, and the users of the end product.
A faster, more powerful processor generation replaces the previous one
every few months.
Nowadays, handheld computers are commonplace in business and
personal use, providing field surveyors with powerful tools, including
GPS capabilities.
It is generally believed that software technology has developed
somewhat more slowly than hardware technology.
The UMTS protocol (Universal Mobile Telecommunications System)
allows digital communication of text, audio, and video at a rate of
approximately 2 Mbps.
The new HSDPA (High-Speed Downlink Packet Access) protocol
offers up to 10 times the speed of UMTS.
Bluetooth version 2.0 is a standard that offers connections of up to
3 Mbps at short range, especially between palm- and laptop computers
and their peripheral devices, such as a mobile phone, GPS or printer.
Wireless LANs (Local Area Networks), under the so-called Wi-Fi
standard, nowadays offer a bandwidth of up to 108 Mbps on a single
connection point, to be shared between computers.
Standard ‘dial-up’ telephone modems allow rates up to 56 kbps.
Digital telephone links (ISDN) support much higher rates: up to 1.5
Mbps.
ADSL (Asymmetric Digital Subscriber Line) technology, widely
available through telephone companies on standard copper-wire
networks, supports transfer rates anywhere between 2 and 20 Mbps
towards the customer (downstream), and between 1 and 8 Mbps
towards the network (upstream), depending on the internet provider and
the quality of the network infrastructure.
A GIS is a computer-based system that provides the following four sets
of capabilities to handle georeferenced data:
1. Data capture and preparation
2. Data management, including storage and maintenance
3. Data manipulation and analysis
4. Data presentation
We can state that any package that provides support for only rasters or
only objects is not a complete GIS. Well-known, full-fledged GIS
packages include ILWIS, Intergraph’s GeoMedia, ESRI’s ArcGIS,
and MapInfo from MapInfo Corp.
The functional components of a GIS are data capture and
preparation, data storage, data analysis, and presentation of spatial
data.
The various stages of spatial data handling are as follows:
2. Spatial data storage and maintenance
3. Spatial query and analysis
4. Spatial data presentation
3.6 QUESTIONS
1. List the functional components of a GIS. Explain any two of them in detail.
2. Differentiate between vector data and raster data.
3. Write a note on Spatial data infrastructure.
4. What are the different ways of spatial data capture and preparation?
Explain.
5. Write a note on spatial data presentation.
3.7 MCQ QUESTIONS
1. Which of the following is not a component of a GIS?
a) Hardware
b) Software
c) Data
d) Compiler
2. GIS uses the information from which of the following sources?
a) Non-spatial Information System
b) Spatial information System
c) Global Information System
d) Position Information System
3. Which of the following doesn't determine the capability of GIS?
a) Defining a map
b) Representing cartographic features
c) Retrieving data
d) Transferring data
4. The boundary model is also known as _____.
a) Topological data model
b) Topological discrete model
c) Temporal data model
d) Temporal continuous model
5. Which of the following is not a full-fledged GIS package?
a) ILWIS
b) GeoMedia
c) ArcGIS
d) AutoCAD
6. SDI stands for
a) Spatial Data Interface
b) Spatial Data Infrastructure
c) Spatial Data Intention
d) Spatial Data International
7. ArcGIS is a product of ____.
a) Environmental Systems Research Institute
b) Caliper Corporation
c) Autodesk
d) Clark Labs
8. Spatial data capture involves ______.
a) surveying engineering, photogrammetry, remote sensing and
digitization
b) digitization, finding statistical values, creating maps
c) rasterization, creating maps and presenting on output device
d) surveying engineering and digitization
9. UMTS protocol (Universal Mobile Telecommunications System)
allows digital communication of text, audio and video at a rate of
approximately ______
a) 8 Mbps
b) 6 Mbps
c) 2 Mbps
d) 4 Mbps
10. Which protocol allows digital communication of text, audio and video?
a) GIS
b) HSDPA
c) UMTS
d) SDI
11. Digital telephone links (ISDN) support network speed rates up to
_____.
a) 8 Mbps
b) 2.5 Mbps
c) 1.5 Mbps
d) 5 Mbps
3.8 REFERENCES
Principles of Geographic Information Systems, Otto Huisman, Rolf
A. de By (eds.)
4
DBMS, GIS AND SPATIAL SYSTEM
Unit Structure :
4.0 Objective
4.1 Introduction
4.2 Database management systems
4.2.1 Reasons for using a DBMS
4.2.2 Alternatives for data management
4.2.3 The relational data model
4.2.4 Querying a relational database
4.3 GIS and spatial databases
4.3.1 Linking GIS and DBMS
4.3.2 Spatial database functionality
4.4 Summary
4.5 Questions
4.6 MCQ Question
4.7 References
4.0 OBJECTIVE
The objective of this chapter is to understand the following concepts:
• Database management system
• Reasons for using a DBMS
• Alternatives to a DBMS
• Spatial database
4.1 INTRODUCTION
A database is an organized collection of structured information, or data,
typically stored electronically in a computer system.
A database is usually controlled by a database management system
(DBMS).
The data, the DBMS, and the associated applications together make up a
database system, often simply called a database.
Spatial data are geographically referenced information about the Earth
and its features.
A specific location on Earth is defined by a pair of latitude and longitude
coordinates.
There are two types of spatial data, raster data and vector data, depending
on how they are stored.
Raster data is made up of grid cells that are identified by row and column.
The entire geographic area is divided into individual cells, which together
form an image.
Points, polylines, and polygons make up vector data. Points represent wells,
houses, and so on. Polylines are used to represent roads, rivers, and streams,
among other things. Polygons represent villages and towns.
The purpose of a spatial database is to store and retrieve information about
objects that are defined in a geometric space.
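The difference between the two kinds of spatial data can be made concrete with plain Python data structures. This is only a sketch with made-up coordinates and class codes; real GIS packages use dedicated geometry and raster types.

```python
import math

# Vector data: geometry is stored as coordinates (hypothetical values).
well = (2.0, 1.0)                                         # point
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]              # polyline
village = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]  # closed polygon ring

# Raster data: a grid of cells addressed by row and column.
land_cover = [[1, 1, 2],
              [1, 2, 2]]


def polyline_length(points):
    """Length of a polyline as the sum of its segment lengths."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a closed ring using the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


def raster_value(grid, row, col):
    """Value stored in the cell at the given row and column."""
    return grid[row][col]


if __name__ == "__main__":
    print(polyline_length(road), polygon_area(village), raster_value(land_cover, 1, 0))
```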
4.2 DATABASE MANAGEMENT SYSTEMS
A database is a large, computerized collection of structured data.
Since the 1960s, databases have been used for non-spatial purposes such as
managing bank accounts, tracking stock, managing salaries, order
bookkeeping, and booking flights.
In all these applications, the amount of data is usually quite large, but the
data itself is simple and regular in structure.
Database design and maintenance
Designing a database is not an easy task.
Before creating a database, one should carefully consider what its purpose
is, and who its users will be.
To organize the data within the database, one must identify the data sources
and define their format.
This format is usually called the database structure.
Lastly, data can be entered into the database.
It is important to keep the data up-to-date, and it is therefore wise to set up
the processes for this, and make someone responsible for regular
maintenance of the database.
Documentation of the database design and set-up is crucial for an extended
database life.
Many enterprise databases tend to outlive the professional careers of their
original designers.
A database management system (DBMS) is a software package that
allows the user to set up, use and maintain a database.
Just as a GIS allows the set-up of a GIS application, a DBMS offers generic
functionality for database organization and data handling.
Many standard PCs are equipped with a DBMS called MS Access.
This package offers a useful set of functions for small databases (a single
Access database file is limited to about 2 GB), whereas server-based DBMSs
have the capacity to store terabytes of information.
4.2.1 Reasons for using a DBMS
There are several reasons why one might want to use a DBMS for data storage
and processing.
• A DBMS supports the storage and manipulation of very large data sets.
Some data sets are so big that storing them in text files or spreadsheet
files becomes too difficult for use in practice. The result may be that
finding simple facts takes minutes, and performing simple calculations
perhaps even hours. A DBMS is specifically designed to handle such volumes.
• A DBMS can be instructed to guard data correctness.
For example, an important aspect of data correctness is data entry
checking: ensuring that the data entered into the database does not
contain obvious errors.
For instance, since we know the study area we are working in, we also
know the range of possible geographic coordinates, so we can have the
DBMS check them.
This is a simple example of the type of rules, generally known as
integrity constraints, that can be defined in and automatically checked by
a DBMS.
More complex integrity constraints are certainly possible, and their
definition is part of the design of a database.
• A DBMS supports the concurrent use of the same data set by many users.
Large data sets are built up over time, which means that substantial
investments are required to create and maintain them, and that probably
many people are involved in the data collection, maintenance and
processing.
These data sets are often considered to be of a high strategic value for the
owner(s), which is why many may want to make use of them within an
organization.
Moreover, for different users of the database, different views on the data
can be defined.
In this way, users will be under the impression that they operate on their
personal database, and not on one shared by many people.
They may all be using the database at the same time, without affecting
each other’s activities. This DBMS function is called concurrency
control.
• A DBMS provides a high-level, declarative query language.
The most important use of the language is the definition of queries.
A query is a computer program that extracts data from the database that
meet the conditions indicated in the query.
• A DBMS supports the use of a data model.
A data model is a language with which one can define a database
structure and manipulate the data stored in it.
The most prominent data model is the relational data model. Its
primitives are tuples (also known as records, or rows) with attribute
values, and relations, being sets of similarly formed tuples.
• A DBMS includes data backup and recovery functions to ensure data
availability at all times.
As potentially many users rely on the availability of the data, the data
must be safeguarded against possible calamities.
Regular back-ups of the data set, and automatic recovery schemes
provide an insurance against loss of data.
• A DBMS allows the control of data redundancy.
A well-designed database takes care of storing single facts only once.
Storing a fact multiple times gives rise to a phenomenon known as data
redundancy.
Data redundancy can lead to situations in which stored facts may
contradict each other, causing reduced usefulness of the data.
Redundancy, however, is not necessarily always problematic, as long as
we specify where it occurs so that it can be controlled.
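Two of the reasons listed above, integrity constraints and a declarative query language, can be sketched with the SQLite DBMS that ships with Python. The table layout, station names and coordinate range of the study area are assumptions invented for this example.

```python
import sqlite3


def create_station_db():
    """Create an in-memory database whose table checks coordinate ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE station (
            name TEXT PRIMARY KEY,
            lon  REAL NOT NULL CHECK (lon BETWEEN 72.0 AND 81.0),
            lat  REAL NOT NULL CHECK (lat BETWEEN 15.0 AND 23.0)
        )
        """
    )
    return conn


def try_insert(conn, name, lon, lat):
    """Insert one station; return False if the DBMS rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO station VALUES (?, ?, ?)", (name, lon, lat))
        return True
    except sqlite3.IntegrityError:
        return False


def stations_north_of(conn, lat):
    """Declarative query: state which rows are wanted, not how to find them."""
    rows = conn.execute(
        "SELECT name FROM station WHERE lat > ? ORDER BY name", (lat,)
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = create_station_db()
    print(try_insert(db, "A", 72.9, 19.1))    # inside the study area
    print(try_insert(db, "B", 172.9, 19.1))   # longitude typo, rejected
    print(stations_north_of(db, 19.0))
```

The range check lives in the database definition, so every application that enters data is subject to the same rule without repeating the check in its own code.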
4.2.2 Alternatives for data management
The decision to use a DBMS will depend, among other things, on how
much data there is or will be, what type of use it will be put to, and how
many users will be involved.
At the small-scale end of the spectrum, when the data set is small, its use
is relatively simple, and there is just one user, we might use simple text files
and a text processor.
field observations.
Text files does not o ffer support for data analysis, except maybe sorting in
alphabetical order.
If our data set is still small and numeric by nature, and we have a single
type of use in mind, a spread sheet program will be sufficient.
This might be the case if we have a nu mber of field observations with
measurements that we want to prepare for statistical analysis.
However, if we carry out region or nationwide censuses, with many
observation stations and/or field observers and all sorts of different
measurements, one quic kly needs a database to keep track of all the data.
It should also be noted that spreadsheets do not accommodate concurrent
use of the data set well, although they do support some data analysis,
especially when it comes to calculations over a single tabl e, like averages,
sums, minimum and maximum values.
All such computations are usually restricted to just a single table of data.
When one wants to relate the values in the table with values of another
nature in some other table, some expertise and signi ficant amounts of time
are usually required to make this happen.
4.2.3 The relational data model
A data model is a language that allows the definition of:
The structures that will be used to store the base data,
The integrity constraints that the stored data has to obey at all moments
in time, and
The computer programs used to manipulate the data.
For the relational data model, the structures used to define the database are
attributes, tuples, and relations.
Computer programs either perform data extraction from the database
without altering it, in which case we call them queries, or they change the
database contents, and we speak of updates or transactions.
The technical terms surrounding database technology are defined below.
Let us look at a tiny database example from a cadastral setting. It is
illustrated in Figure 4.2.3. This database consists of three tables, one for
storing people's details, one for storing parcel details and a third one for
storing details concerning title deeds. Various items of information are
kept in the database, such as a taxation identifier (TaxId) for people, a
parcel identifier (PId) for parcels and the date of a title deed (DeedDate).
Relations, tuples and attributes
In the relational data model, a database is considered as a collection of
relations, commonly also known as tables.
A table or relation is itself a collection of tuples (or records). In fact, each
table is a collection of tuples that are similarly shaped.
This means that a tuple has a fixed number of named fields, also known as
attributes (or columns). All tuples (rows) in the same relation have the
same named fields.
As in Figure 4.2.3, relations can be displayed as tabular form data.
An attribute is a named field of a tuple, with which each tuple associates a
value, the tuple's attribute value.

PrivatePerson
TaxId     Surname   BirthDate
101-367   Georgia   10/05/1952
134-788   Wilson    26/01/1964
101-490   Thomas    14/09/1931

Parcel
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290

TitleDeed
Plot   Owner     DeedDate
2109   101-367   10/12/1996
8871   101-490   10/01/1984
1515   134-788   1/09/1991
3421   101-367   25/9/1996

Figure 4.2.3: A small example database consisting of three relations
(tables), all with three attributes, and three, four and four tuples
respectively. PrivatePerson, Parcel and TitleDeed are the names of the three
tables. Surname is an attribute of the PrivatePerson table; the Surname
attribute value for the person with TaxId '101-367' is 'Georgia'.
The example provided in Figure 4.2.3 shows that the PrivatePerson table
has three tuples; the Surname attribute value for the first tuple illustrated is
'Georgia'.
The phrase 'that are similarly shaped' requires that all values for the same
attribute come from a single domain of values.
An attribute’s domain is a (possibly infinite) set of atomic values such as
the set of integer number values, the set of real number values, etc.
In our example cadastral database, the domain of the Surname attribute, for
instance, is string, so any surname is represented as a sequence of text
characters, i.e., as a string. The availability of other domains depends on the
DBMS, but usually integer (the whole numbers), real (numbers that may have
a fractional part), date, yes/no and a few more are included.
When a relation is created, we need to indicate what type of tuples it will store.
This means that we must
1. Provide a name for the relation,
2. Indicate which attributes it will have, and
3. Set the domain of each attribute.
A relation definition obtained in this way is known as the relation schema of
that relation.
The definition of relation schemas is an important part of database design.
Our example database has three relation schemas; one of them is
TitleDeed.
The relation schemas together make up the database schema.
For the database of Figure 4.2.3, the relation schemas are given in Table
4.2.3.
Underlined attributes (and their domains) indicate the primary key of the
relation.
Relation schemas are stable, and will hardly change over time.
The tuples stored in a table, on the other hand, change often, either
because new ones are added, others are removed, or their attribute values
change.

PrivatePerson (TaxId: string, Surname: string, BirthDate: date)
Parcel (PId: number, Location: polygon, AreaSize: number)
TitleDeed (Plot: number, Owner: string, DeedDate: date)

Table 4.2.3: The relation schemas for the three tables of the database in
Figure 4.2.3.
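The three steps for defining a relation (name it, list its attributes, set each attribute's domain) map fairly directly onto SQL's CREATE TABLE statement. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS. SQLite has no polygon domain, so Location is declared as an INTEGER reference here; that substitution is an assumption of this sketch, not part of Table 4.2.3.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One CREATE TABLE statement per relation schema of Table 4.2.3.
# Assumption: dates are kept as TEXT and the polygon domain of Location
# is replaced by an INTEGER reference, because plain SQLite has neither.
conn.executescript("""
CREATE TABLE PrivatePerson (TaxId TEXT, Surname TEXT, BirthDate TEXT);
CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER);
CREATE TABLE TitleDeed (Plot INTEGER, Owner TEXT, DeedDate TEXT);
""")

# Read the attribute names of each relation back from the database schema.
schema = {}
for table in ("PrivatePerson", "Parcel", "TitleDeed"):
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    schema[table] = [column[1] for column in columns]  # column[1] is the name
    print(table, schema[table])
```

The three statements together play the role of the database schema; the tables are still empty, so each relation instance has zero tuples at this point.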
The set of tuples in a relation at some point in time is called the relation
instance at that moment.
This tuple set is always finite: It is possible to count how many tuples there
are.
Figure 4.2.3 gives us a single database instance, i.e., one relation instance
for each relation. One relation instance has three tuples; the other two have
four.
Any relation instance always contains only tuples that comply with the
relation schema of the relation.
Finding tuples and building links between them
The database system is particularly useful for storing large amounts of data,
as we have already discussed. Note: our example database is not even small,
it is tiny.
The DBMS must support quick searches amongst many tuples.
This is why the relational data model uses the notion of a key.
A key of a relation comprises one or more attributes. A value for these
attributes uniquely identifies a tuple.
In other words, if we have a value for each key attribute, there will be at
most one tuple in the table with that combination of values.
It remains possible that there is no tuple for the given combination.
In our example database, the set {TaxId, Surname} is a key of the relation
PrivatePerson: if we know both a TaxId and a Surname value, we will find
at most one tuple with that combination of values.
Every relation has a key, though possibly it is the combination of all
attributes.
When searching for tuples, however, such a large key is not useful since we
must supply a value for each of its attributes.
A key should contain as few attributes as possible: the fewer, the better.
If a key has just one attribute, it obviously cannot have fewer attributes.
Some keys have two attributes; an example is the key {Plot, Owner} of
relation TitleDeed.
We need both attributes because there can be many title deeds for a single
plot (in case of plots that are sold often) but also many title deeds for a
single person (in case of wealthy persons).
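The 'at most one tuple per key value' rule can be checked with a small experiment. The sketch below assumes Python's built-in sqlite3 module as the DBMS and declares {Plot, Owner} as the primary key of TitleDeed. The third INSERT uses a made-up deed date purely to provoke the key violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE TitleDeed (
        Plot INTEGER,
        Owner TEXT,
        DeedDate TEXT,
        PRIMARY KEY (Plot, Owner)   -- the two-attribute key from the text
    )""")

# Two deeds of the same owner for different plots: the key values differ.
conn.execute("INSERT INTO TitleDeed VALUES (2109, '101-367', '10/12/1996')")
conn.execute("INSERT INTO TitleDeed VALUES (3421, '101-367', '25/9/1996')")

# Hypothetical third deed that repeats an existing (Plot, Owner) pair.
try:
    conn.execute("INSERT INTO TitleDeed VALUES (2109, '101-367', '01/01/2001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Looking up a full key value yields at most one tuple.
matches = conn.execute(
    "SELECT * FROM TitleDeed WHERE Plot = ? AND Owner = ?", (2109, "101-367")
).fetchall()
print(duplicate_rejected, matches)
```

Supplying only one of the two key attributes (for example, just Owner) may return several tuples, which is why both attributes are needed in this key.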
When we provide a value for a key, we can look up the corresponding tuple
in the table (if such a tuple exists).
A tuple can refer to another tuple by storing that other tuple’s key value.
For instance, a TitleDeed tuple refers to a Parcel tuple by including that
tuple’s key value.
The TitleDeed table has a special attribute Plot for storing such values.
The Plot attribute is called a foreign key because it refers to the primary key
(PId) of another relation (Parcel). This is illustrated in Figure
4.2.3.1.
Two tuples of the same relation instance can have identical foreign key
values: for instance, two TitleDeed tuples may refer to the same Parcel
tuple.
A foreign key, therefore, is not a key of the relation in which it appears,
despite its name.
A foreign key must have as many attributes as the primary key that it refers
to.
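A foreign key can also be declared to the DBMS, which then refuses references to tuples that do not exist. The following sketch assumes Python's built-in sqlite3 module; in SQLite, foreign key checking has to be switched on per connection with PRAGMA foreign_keys. The parcel number 9999 and the second deed for parcel 2109 are hypothetical values chosen for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request

conn.executescript("""
CREATE TABLE Parcel (PId INTEGER PRIMARY KEY, Location INTEGER, AreaSize INTEGER);
CREATE TABLE TitleDeed (
    Plot INTEGER REFERENCES Parcel(PId),   -- foreign key to Parcel's primary key
    Owner TEXT,
    DeedDate TEXT,
    PRIMARY KEY (Plot, Owner)
);
INSERT INTO Parcel VALUES (2109, 2323, 1090);
""")

# Two title deeds may refer to the same parcel: a foreign key is not a key.
conn.execute("INSERT INTO TitleDeed VALUES (2109, '101-367', '10/12/1996')")
conn.execute("INSERT INTO TitleDeed VALUES (2109, '134-788', '01/01/2001')")  # hypothetical

# A deed for a parcel that does not exist is rejected.
try:
    conn.execute("INSERT INTO TitleDeed VALUES (9999, '101-490', '10/01/1984')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

deeds_for_parcel = conn.execute(
    "SELECT COUNT(*) FROM TitleDeed WHERE Plot = 2109"
).fetchone()[0]
print(dangling_rejected, deeds_for_parcel)
```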
Parcel
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290

TitleDeed
Plot   Owner     DeedDate
2109   101-367   10/12/1996
8871   101-490   10/01/1984
1515   134-788   1/09/1991
3421   101-367   25/9/1996

Figure 4.2.3.1: The table TitleDeed has a foreign key in its attribute Plot.
This attribute refers to key values of the Parcel relation, as indicated for
two TitleDeed tuples. The table TitleDeed actually has a second foreign key
in the attribute Owner, which refers to PrivatePerson.
4.2.4 Querying a relational database
There are three elementary query operators. They are quite powerful
because they can be combined to create more complex queries.
The three query operators have some characteristics in common.
First, all of them require input and produce output, and both input and
output are relations.
This guarantees that the output of one query (a relation) can be the input of
another query, and this gives us the possibility to build more and more
complex queries, if we want.
The three query operators are:
1. Tuple Selection
2. Attribute Projection
3. Join
Tuple Selection
The first query operator is called tuple selection; it is illustrated in Figure
4.2.4(a).
Tuple selection works like a filter: it allows tuples (rows) that meet the
selection condition to pass, and disallows tuples that do not meet the
condition.
The operator is given some input relation, as well as a selection condition
about tuples in the input relation.
A selection condition is a truth statement about a tuple's attribute values
such as: AreaSize > 1000. For some tuples in Parcel this statement will be
true, for others it will be false.
Tuple selection on the Parcel relation with this condition will result in a set
of Parcel tuples for which the condition is true.
The most common way of defining queries in a relational database is
through the SQL language. SQL stands for Structured Query Language.
The query of Figure 4.2.4(a) is given by:
SELECT * FROM Parcel WHERE AreaSize > 1000;
Above is the tuple selection from the Parcel relation, using the condition
AreaSize > 1000. The * indicates that we want to extract all attributes of the
input relation, and only tuples that satisfy the condition are included
in the output relation.
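To see the tuple selection run, the statement can be executed against the eight Parcel tuples of Figure 4.2.4(a). This is a minimal sketch assuming Python's built-in sqlite3 module as the DBMS; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER)")

# The eight Parcel tuples shown as input in Figure 4.2.4(a).
parcels = [
    (3421, 2001, 467), (8871, 1462, 550), (2109, 2323, 1090), (1515, 2003, 290),
    (3434, 2020, 486), (6371, 1802, 950), (2209, 3542, 1840), (1505, 2609, 145),
]
conn.executemany("INSERT INTO Parcel VALUES (?, ?, ?)", parcels)

# Tuple selection: all attributes (*), but only tuples meeting the condition.
selected = conn.execute("SELECT * FROM Parcel WHERE AreaSize > 1000").fetchall()
for row in selected:
    print(row)
```

Every output tuple still has all three attributes; only the number of tuples shrinks, from eight to two.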
Attribute Projection
A second operator is shown in Figure 4.2.4(b).
It is called attribute projection.
This operator requires an input relation, along with a list of attributes, all of
which should be attributes of the input relation.
The output relation of this operator has as its schema only the list of
attributes given, and we say that the operator projects onto these attributes.
Contrary to the first operator, which produces fewer tuples, this operator
produces fewer attributes compared to the input relation.

Parcel (input)
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290
3434   2020       486
6371   1802       950
2209   3542       1840
1505   2609       145

Tuple selection (output)
PId    Location   AreaSize
2109   2323       1090
2209   3542       1840

Figure 4.2.4(a): Tuple selection has a single table as input and produces
another table with fewer tuples. Here, the condition was that AreaSize must
be over 1000.
The query of Figure 4.2.4(b) is given by:
SELECT PId, Location FROM Parcel;
Above is the attribute projection from the Parcel relation. The SELECT-
clause indicates that we only want to extract the two attributes PId and
Location. There is no WHERE-clause in this query.
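The projection can be run in the same way. The sketch below again assumes Python's built-in sqlite3 module and, to keep it short, loads only the four Parcel tuples of Figure 4.2.3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER)")
conn.executemany(
    "INSERT INTO Parcel VALUES (?, ?, ?)",
    [(3421, 2001, 467), (8871, 1462, 550), (2109, 2323, 1090), (1515, 2003, 290)],
)

# Attribute projection: every tuple is kept, but only two attributes remain.
projected = conn.execute("SELECT PId, Location FROM Parcel").fetchall()
for row in projected:
    print(row)
```

Here the number of tuples is unchanged (four in, four out), while each output tuple has two attributes instead of three.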
Virtual tables
The two queries above do not create stored tables in the database.
This is why the result tables have no name: they are virtual tables.
The result of a query is a table that is shown to the user who executed the
query. Whenever the user closes her/his view on the query result, that
result is lost.
However, the SQL code for the query is stored for future use. The user
can re-execute the query to obtain a view on the result once more.

Parcel (input)
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290
3434   2020       486
6371   1802       950
2209   3542       1840
1505   2609       145

Attribute projection (output)
PId    Location
3421   2001
8871   1462
2109   2323
1515   2003
3434   2020
6371   1802
2209   3542
1505   2609

Figure 4.2.4(b): Attribute projection has a single table as input and
produces another table with fewer attributes. Here, the projection is onto
the attributes PId and Location.
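One common way in which a DBMS keeps the query text while discarding the result is a view. The text above does not name this mechanism, so treating 'stored SQL code' as a view is an assumption of this sketch, which uses Python's built-in sqlite3 module; the view name LargeParcel is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER)")
conn.executemany(
    "INSERT INTO Parcel VALUES (?, ?, ?)",
    [(3421, 2001, 467), (8871, 1462, 550), (2109, 2323, 1090), (1515, 2003, 290)],
)

# Store the query text under a name; no result rows are stored with it.
conn.execute("CREATE VIEW LargeParcel AS SELECT * FROM Parcel WHERE AreaSize > 1000")
first_run = conn.execute("SELECT PId FROM LargeParcel").fetchall()

# Add another parcel (taken from Figure 4.2.4(a)) and run the stored query again.
conn.execute("INSERT INTO Parcel VALUES (2209, 3542, 1840)")
second_run = conn.execute("SELECT PId FROM LargeParcel").fetchall()

print(first_run)
print(second_run)
```

Because the result is recomputed on every execution, the second run reflects the newly added parcel without any change to the stored query.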
Join
The third type of query operator differs from the two above in that it
requires two input relations.
The operator is called the join, and is illustrated in Figure 4.2.4(c).
The join operator combines two input relations to form one output relation,
gluing two tuples (one from each input relation) together to form a bigger
tuple, if they meet a specified condition.
The output relation of this operator has the attributes of both the first and
the second input relation.
The number of attributes therefore increases.
The output tuples are obtained by taking a tuple from the first input relation
and ‘gluing’ it to a tuple from the second input relation.
The join operator uses a condition that expresses which tuples from the first
relation are combined with which tuples from the second.
The example of Figure 4.2.4(c) combines TitleDeed tuples with Parcel
tuples, but only those for which the foreign key Plot matches with primary
key PId.
The above join query is also easily expressed in SQL as follows.
SELECT * FROM TitleDeed, Parcel WHERE TitleDeed.Plot = Parcel.PId;
The FROM-clause identifies the two input relations; the WHERE-clause
states the join condition.
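The join statement can be executed on the two input relations of Figure 4.2.4(c). This is a minimal sketch assuming Python's built-in sqlite3 module; dates are stored as plain text exactly as printed in the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER);
CREATE TABLE TitleDeed (Plot INTEGER, Owner TEXT, DeedDate TEXT);
INSERT INTO Parcel VALUES (3421, 2001, 467), (8871, 1462, 550),
                          (2109, 2323, 1090), (1515, 2003, 290);
INSERT INTO TitleDeed VALUES (2109, '101-367', '10/12/1996'),
                             (8871, '101-490', '10/01/1984'),
                             (1515, '134-788', '1/09/1991'),
                             (3421, '101-367', '25/9/1996');
""")

# Join: glue each TitleDeed tuple to the Parcel tuple it refers to.
joined = conn.execute(
    "SELECT * FROM TitleDeed, Parcel WHERE TitleDeed.Plot = Parcel.PId"
).fetchall()
for row in joined:
    print(row)
```

Each output tuple has 3 + 3 = 6 attribute values, and in every output tuple the Plot value (first position) equals the PId value (fourth position), as the join condition demands.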
It is often not sufficient to use just one operator for extracting sensible
information from a database.
The strength of the above operators lies in the fact that they can be
combined to produce more advanced and useful query definitions.
Suppose that in Figure 4.2.4(c) we really wanted to obtain combined
TitleDeed/Parcel information, but only for parcels with a size over 1000,
and we only wanted to see the owner identifier and deed date of such title
deeds.
We can take the result of the above join, and select the tuples that show a
parcel size over 1000. The result of this tuple selection can then be taken as
the input for an attribute projection that only leaves Owner and DeedDate.
This is illustrated in Figure 4.2.4(d).
The SQL statement that would give us the result of Figure 4.2.4(d) can be
written as
SELECT Owner, DeedDate FROM TitleDeed, Parcel WHERE
TitleDeed.Plot = Parcel.PId AND AreaSize > 1000;
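Running the combined statement on the example data shows all three operators at work in a single query. The sketch assumes Python's built-in sqlite3 module and the tuples of Figure 4.2.4(c).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parcel (PId INTEGER, Location INTEGER, AreaSize INTEGER);
CREATE TABLE TitleDeed (Plot INTEGER, Owner TEXT, DeedDate TEXT);
INSERT INTO Parcel VALUES (3421, 2001, 467), (8871, 1462, 550),
                          (2109, 2323, 1090), (1515, 2003, 290);
INSERT INTO TitleDeed VALUES (2109, '101-367', '10/12/1996'),
                             (8871, '101-490', '10/01/1984'),
                             (1515, '134-788', '1/09/1991'),
                             (3421, '101-367', '25/9/1996');
""")

# Join (FROM + first WHERE condition), tuple selection (AreaSize > 1000)
# and attribute projection (SELECT Owner, DeedDate) in one statement.
result = conn.execute("""
    SELECT Owner, DeedDate
    FROM TitleDeed, Parcel
    WHERE TitleDeed.Plot = Parcel.PId AND AreaSize > 1000
""").fetchall()
print(result)
```

Only parcel 2109 is larger than 1000, so a single tuple with the owner identifier and deed date of its title deed remains.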
Parcel (input)
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290

TitleDeed (input)
Plot   Owner     DeedDate
2109   101-367   10/12/1996
8871   101-490   10/01/1984
1515   134-788   1/09/1991
3421   101-367   25/9/1996

Join (output)
Plot   Owner     DeedDate     PId    Location   AreaSize
2109   101-367   10/12/1996   2109   2323       1090
8871   101-490   10/01/1984   8871   1462       550
1515   134-788   1/09/1991    1515   2003       290
3421   101-367   25/9/1996    3421   2001       467

Figure 4.2.4(c): The essential binary query operator: join. The join
condition for this example is TitleDeed.Plot = Parcel.PId, which expresses a
foreign key/key link between TitleDeed and Parcel. The result relation has
3 + 3 = 6 attributes.
4.3 GIS AND SPATIAL DATABASES
4.3.1 Linking GIS and DBMS
Storing spatial and attribute data
GIS software provides support for spatial data and thematic or attribute
data.
GISs have traditionally stored spatial data and attribute data separately.
This required the GIS to provide a link between the spatial data that is
represented with rasters or vectors, and their non-spatial attribute data.

Parcel (input)
PId    Location   AreaSize
3421   2001       467
8871   1462       550
2109   2323       1090
1515   2003       290

TitleDeed (input)
Plot   Owner     DeedDate
2109   101-367   10/12/1996
8871   101-490   10/01/1984
1515   134-788   1/09/1991
3421   101-367   25/9/1996

Join
Plot   Owner     DeedDate     PId    Location   AreaSize
2109   101-367   10/12/1996   2109   2323       1090
8871   101-490   10/01/1984   8871   1462       550
1515   134-788   1/09/1991    1515   2003       290
3421   101-367   25/9/1996    3421   2001       467

Tuple selection
Plot   Owner     DeedDate     PId    Location   AreaSize
2109   101-367   10/12/1996   2109   2323       1090

Attribute projection
Owner     DeedDate
101-367   10/12/1996

Figure 4.2.4(d): A combined selection/projection/join query, selecting
owners and deed dates for parcels with a size larger than 1000. The join is
carried out first, then follows a tuple selection on the result tuples of the
join. Finally, an attribute projection is carried out.
Geographic information systems are strong because they have built-in
capabilities for analyzing, storing, and producing maps that are derived
from their understanding of geographical space.
GIS packages themselves can store tabular data; however, they do not
always provide a full-fledged query language to operate on the tables.
External DBMS
DBMSs have a long tradition in handling attribute data (that is,
administrative, non-spatial, tabular, thematic data) in a secure way, for
multiple users at the same time.
Arguably, DBMSs offer much better table functionality, since they are
specifically designed for this purpose.
A lot of the data in GIS applications is attribute data, so it made sense to use
a DBMS for it.
For this reason, many GIS applications have made use of external DBMSs
for data support.
In this role, the DBMS serves as a centralized data repository for all users,
while each user runs her/his own GIS software that obtains its data from the
DBMS.
This meant that a GIS had to link the spatial data represented with rasters or
vectors, and the attribute data stored in an external DBMS.
Linking objects and tables
With raster representations, each raster cell stores a characteristic value.
This value can be used to look up attribute data in an accompanying
database table.
For instance, the land use raster of Figure 4.3.1 indicates the land use class
for each of its cells, while an accompanying table provides full descriptions
for all classes, including perhaps some statistical information for each of the
types.
Observe the similarity with the key/foreign key concept in relational
databases.
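The raster-to-table link can be sketched in a few lines. The example below uses the cell values and class table of Figure 4.3.1; the function name describe_cell and the row/column indexing convention (both starting at 0, from the top-left cell) are assumptions made for illustration.

```python
# Raster of Figure 4.3.1: one land use class code per cell, five rows of six cells.
raster = [
    "AAABBB",
    "AAFBBB",
    "FFABCE",
    "FFCCEE",
    "FFCEEA",
]

# Accompanying attribute table: class code -> (description, percentage).
land_use_class = {
    "A": ("Primary forest", 11.3),
    "B": ("Secondary vegetation", 25.5),
    "C": ("Pasture", 31.2),
    "E": ("Built-up area", 25.5),
    "F": ("Rivers, lakes", 4.1),
}


def describe_cell(row, col):
    """Look up the description of the class stored in one raster cell."""
    code = raster[row][col]          # the cell value acts like a foreign key
    return land_use_class[code][0]   # the class table is keyed on that value


print(describe_cell(0, 0))
print(describe_cell(2, 4))
```

The cell value plays the role of a foreign key, and the class code in the table plays the role of the primary key it refers to.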
With vector representations, our spatial objects (whether they are points,
lines, or polygons) are automatically given a unique identifier by the
system.
This identifier is usually just called the object ID or feature ID and is used
to link the spatial object as represented in vectors with its attribute data in
an attribute table.
The principle applied here is similar to that in raster settings, but in this
case the ID in the vector system functions as a key, and any reference to an
ID value in the attribute database is a foreign key reference.
For example, in Figure 4.3.2, Parcel is a table with attributes, linked to the
spatial objects stored in a GIS by the Location column. Obviously, several
tables may make references to the vector system, but it is not uncommon to
have some main table for which the ID is also the key.
A A A B B B
A A F B B B
F F A B C E
F F C C E E
F F C E E A

LandUseClass
Id   Description            Perc
A    Primary forest         11.3
B    Secondary vegetation   25.5
C    Pasture                31.2
E    Built-up area          25.5
F    Rivers, lakes          4.1

Figure 4.3.1: A raster representing land use and a related table providing
full text descriptions (amongst others) of each land use class.
4.3.2 Spatial database functionality
Spatial DBMS
Over the last two decades, DBMS vendors have recognized the need to
store more complex data, like spatial data.
The main problem was that additional functionality is needed by a
DBMS in order to process and manage spatial data.
As the capabilities of our hardware to process information have increased, so
too has the desire for better ways to represent and manage spatial data.
During the 1990s, object-oriented and object-relational data models were
developed for just this purpose.
These extend standard relational models with support for objects, including
'spatial' objects.
Currently, GIS software packages can store spatial data using a range of
commercial and open-source DBMSs such as Oracle, Informix, IBM DB2,
Sybase, and PostgreSQL, with the help of spatial extensions.
Some GIS software packages have integrated database 'engines', and
therefore do not need these extensions.
Parcel
PId    Location   OwnerID
3421   2001       435
8871   1462       550
2109   2323       1040
1515   2003       245
3434   2020       486
6371   1802       950

Figure 4.3.2: Storage and linking of vector attribute data between GIS and
DBMS (figure panels: Spatial, Attribute).
ESRI's ArcGIS, for example, has the main components of the MS Access
database software built in.
This means that the designer of a GIS application can choose whether to
store the application data in the GIS or in the DBMS.
Spatial databases, also known as geodatabases, are implemented directly on
existing DBMSs, using extension software to allow them to handle spatial
objects.
A spatial database allows users to store, query and manipulate collections of
spatial data.
Spatial data can be stored in a special database column, known as the
geometry column (or feature or shape column, depending on the specific
software package).
This means GISs can rely fully on DBMS support for spatial data, making
use of a DBMS for data query and storage and also for multi-user support,
and GIS for spatial functionality.
Small-scale GIS applications may not require a multi-user capability, and
can be supported by spatial data support from a personal database.
A geodatabase allows a wide variety of users to access large data sets that
include both geographic and alphanumeric data, and supports the
management of their relations, guaranteeing their integrity.
The Open Geospatial Consortium (OGC) has released a series of standards
relating to geodatabases that define:
Which tables must be present in a spatial database, i.e., the geometry
columns table and the spatial reference system table.
The data formats, called 'Simple Features', i.e., point, line, polygon, etc.
A set of SQL-like instructions for geographic analysis.
The architecture of a spatial database differs from a standard RDBMS not
only because it can handle geometry data and manage projections, but also
because it offers a larger set of commands that extend standard SQL, e.g.,
distance calculations, buffers, overlay, conversion between coordinate
systems, etc.
The capabilities of spatial databases will continue to evolve over time.
ArcGIS geodatabases can now store topological relationships directly in the
database, supporting different kinds of features (objects) and their behavior
(relationships with other objects) and ways to validate these relationships
and behaviors.
Querying a spatial database
Spatial query
A spatial DBMS provides support for geographic coordinate systems and
transformations.
It also provides storage of the relationships between features, including the
creation and storage of topological relationships.
As a result, one is able to use functions for ‘spatial query’ (exploring spatial
relationships). To illustrate, a spatial query using SQL to find all the Thai
restaurants within 2 km of a given hotel would look like this:
SELECT R.Name FROM Restaurants AS R, Hotels AS H WHERE R.Type =
'Thai' AND H.Name = 'Hilton' AND ST_Intersects(R.Geometry,
ST_Buffer(H.Geometry, 2000));
In the above query the WHERE clause uses the ST_Intersects function to
perform a spatial join between a 2000 m buffer of the selected hotel and the
selected subset of restaurants. The Geometry column carries the spatial data.
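The logic of this spatial query can be imitated without a spatial DBMS when all geometries are points in a planar coordinate system measured in metres: a point intersects a circular buffer around another point exactly when the two points are no further apart than the buffer radius. The sketch below rests on that simplifying assumption; the restaurant names, the coordinates and the helper within_buffer are all made up for illustration and are not part of any spatial SQL standard.

```python
import math

# Hypothetical point geometries in a projected coordinate system (metres).
hotels = {"Hilton": (1000.0, 1000.0)}
restaurants = [
    ("Bangkok Garden", "Thai", (1500.0, 2200.0)),   # 1300 m from the hotel
    ("Siam Corner", "Thai", (4000.0, 1000.0)),      # 3000 m from the hotel
    ("Pasta Place", "Italian", (1200.0, 1100.0)),   # close by, but not Thai
]


def within_buffer(point, center, radius_m):
    """For point geometries, intersecting a buffer is a plain distance test."""
    return math.dist(point, center) <= radius_m


hotel_location = hotels["Hilton"]
thai_nearby = [
    name
    for name, cuisine, location in restaurants
    if cuisine == "Thai" and within_buffer(location, hotel_location, 2000)
]
print(thai_nearby)
```

A spatial DBMS generalises this test to lines and polygons and evaluates it inside the database, which is what ST_Buffer and ST_Intersects provide in the query above.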
4.4 SUMMARY
A database is an organized collection of structured information, or data,
typically stored electronically in a computer system.
A database is usually controlled by a database management system
(DBMS).
Spatial data are relative geographic information about the Earth and its
features.
A database management system (DBMS) is a software package that
allows the user to set up, use and maintain a database.
For the relational data model, the structures used to define the database are
attributes, tuples, and relations.
In the relational data model, a database is considered as a collection of
relations, commonly also known as tables.
A key of a relation comprises one or more attributes. A value for these
attributes uniquely identifies a tuple.
The three query operators are:
1. Tuple Selection
2. Attribute Projection
3. Join
GIS software provides support for spatial data and thematic or attribute
data.
GISs have traditionally stored spatial data and attribute data separately.
Spatial databases, also known as geodatabases, are implemented directly on
existing DBMSs, using extension software to allow them to handle spatial
objects.
A spatial database allows users to store, query and manipulate collections of
spatial data.
4.5 QUESTIONS
1. Explain various reasons for using DBMS in GIS.
2. Explain Relational data model using suitable example.
3. Write a note on Spatial Data Functionality.
4. Explain the linking of GIS with Database.
5. Explain spatial Database querying with suitable example.
4.6 MCQ QUESTIONS
1. Which of the following is not a reason for which DBMS is used with GIS?
a) A DBMS supports the storage and manipulation of very large data sets.
b) A DBMS can be instructed to guard over data correctness.
c) A DBMS can also be used to represent graphics.
d) A DBMS supports the concurrent use of the same data set by many
users.
2. Attribute projection operation can work on ______ input relation/relations.
a) Four
b) three
c) two
d) one
3. A spatial database allows users to ______, _______ and ______ collections of
spatial data.
a) analyze, create map, query
b) create graph, map, analysis
c) store, represent, create graph
d) store, query, manipulate
4. What is a 'tuple'?
a) A row or record in a database table.
b) Another name for the key linking different tables in a database.
c) An attribute attached to a record.
d) Another name for a table in an RDBMS.
5. Which one of the following is a correct query?
a) select * where population>100000 from census
b) from census select * where population >100000
c) population >100000 select * from census
d) select * from census where population >100000
4.7 REFERENCES
Principles of Geographic Information Systems, Otto Huisman, Rolf A. de
By (eds.).
5
SPATIAL REFERENCING AND
POSITIONING
Unit Structure :
5.0 Objective
5.1 Spatial Referencing
5.1.1 Reference surfaces for mapping
5.1.2 Coordinate Systems
5.1.3 Map Projections and Coordinate Transformations
5.2 Satellite-based Positioning
5.2.1 Absolute positioning
5.2.2 Errors in absolute positioning
5.2.3 Relative positioning
5.2.4 Network positioning
5.2.5 Code versus phase measurements
5.2.6 Positioning technology
5.3 Summary
5.4 Exercise
5.0 OBJECTIVE
In this chapter we are going to explore more on spatial referencing and
positioning systems.
Spatial references are important when building applications that use
geographic data.
A spatial reference defines the coordinate system used to locate the geometry
for a feature.
It controls how and where features are displayed in a map or scene.
A coordinate system is a method for identifying the location of a point on the
earth. Most coordinate systems use two numbers, called coordinates, to
identify the location of a point. Each of these numbers indicates the distance
between the point and some fixed reference point, called the origin.
The main objectives of this section are to:
● Understand the relevance and actual use of reference surfaces, coordinate
systems, and coordinate transformations in mapping.
● Describe and differentiate between coordinate systems and map projections.
● Grasp the logic of map projection equations and the principles of
transforming maps from one projection system to another.
We are going to see more about:
● Spatial reference surfaces and datums
○ The Geoid – vertical (height) datum
○ The Ellipsoid – horizontal (geodetic) datum
○ Local and global datums
● Map projections
○ Classification of map projections
○ Map projection selection
○ Map coordinate systems
○ Coordinate transformations
5.1 SPATIAL REFERENCING
● A GIS is to be created from available maps of different thematic layers like
soils, land use, temperature, etc.
● The maps are two-dimensional, whereas the earth's surface is approximated
by a three-dimensional ellipsoid.
● Every map has a projection and scale.
● To understand how maps are created by projecting the 3-D surface of the
earth onto the 2-D plane of an analogue map, we need to understand the
georeferencing concepts.
5.1.1 Reference Surfaces for Mapping:
● The surface of the Earth is anything but uniform.
● The oceans can be treated as reasonably uniform, but the surface or
topography of the land masses exhibits large vertical variations between
mountains and valleys.
● These variations make it impossible to approximate the shape of the Earth
with any reasonably simple mathematical model.
● Two main reference surfaces have been established to approximate the
shape of the Earth.
○ One reference surface is called the Geoid,
○ The other reference surface is the ellipsoid.
● These are illustrated in the figure below.
The Earth's surface, and two reference surfaces used to approximate it:
the Geoid, and a reference ellipsoid.
The deviation between the Geoid and a reference ellipsoid is called geoid
separation (N).
➔ Due to irregularities or mass anomalies in the distribution of the Earth's
mass, the 'global ocean' forms an undulated surface.
➔ This surface is called the Geoid .
➔ The plumb line through any surface point is always perpendicular to it.
➔ Where a mass deficiency exists, the Geoid will dip below the mean
ellipsoid.
➔ Conversely, where a mass surplus exists, the Geoid will rise above the mean
ellipsoid.
➔ These influences cause the Geoid to deviate from a mean ellipsoidal shape
by up to +/- 100 meters.
➔ The deviation between the Geoid and an ellipsoid is called the Geoid
Separation (N) or Geoid Undulation.
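The geoid separation N is what connects heights measured above the ellipsoid to heights measured above the Geoid. The text above does not give the formula; the sketch below uses the standard geodetic approximation h = H + N, where h is the ellipsoidal height and H the height above the Geoid (orthometric height). The function name and the sample values are hypothetical.

```python
def orthometric_height(ellipsoidal_height_m, geoid_separation_m):
    """Height above the Geoid, H = h - N (rearranged from h = H + N)."""
    return ellipsoidal_height_m - geoid_separation_m


# Hypothetical point: 120 m above the ellipsoid, at a place where the Geoid
# dips 35 m below the ellipsoid (N = -35 m, a mass deficiency).
height_above_geoid = orthometric_height(120.0, -35.0)
print(height_above_geoid)
```

Where the Geoid lies below the ellipsoid (negative N), the height above the Geoid is larger than the ellipsoidal height, and the reverse holds where the Geoid rises above the ellipsoid.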
5.1.2 Coordinate System (CS):
a) The Geographic Coordinate System (GCS), which specifies the
three-dimensional coordinate system used for locating points on the
earth's surface, and
b) The Projected Coordinate System, which is used for projecting into two
dimensions for creating analogue maps.
a) Geographic Coordinate System (GCS):
● Geographic coordinate systems use a traditional way of representing
locations on the surface of the earth.
● This three-dimensional coordinate system uses latitude and longitude to
measure and locate features on the globe.
● The GCS defines a position as a function of direction and distance from a
center point of the globe, where the units of measurement are degrees.
● Any location on earth can be referenced by a point with longitude and
latitude coordinates.
● The ellipsoid model that is used to calculate latitude and longitude is called
the datum.
● Changing the datum, therefore, changes the values of the latitude and
longitude.
For example, the figure below shows a geographic coordinate system where a
location is represented by the coordinates longitude 80 degrees East and latitude
55 degrees North.
In surveying and geodesy, a datum is a set of reference points on the
earth's surface against which position measurements are made, and (often)
an associated model of the shape of the earth (reference ellipsoid) to define
a geographic coordinate system.
Figure: A geographic coordinate system.
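The dependence of coordinates on the chosen ellipsoid can be illustrated with a short calculation. This is a minimal sketch, not a full datum transformation: it converts the example location (longitude 80 East, latitude 55 North) to earth-centred Cartesian coordinates on two ellipsoids, WGS 84 and Clarke 1866, and assumes that both share the same centre and axes. Real datums also differ in origin and orientation, which is ignored here. The function name is illustrative.

```python
import math

# Ellipsoid parameters: semi-major axis a (metres) and flattening f.
WGS84 = (6378137.0, 1 / 298.257223563)
# Clarke 1866 is defined by a and b; the flattening is derived from them.
CLARKE_1866 = (6378206.4, (6378206.4 - 6356583.8) / 6378206.4)


def geodetic_to_ecef(lat_deg, lon_deg, height_m, ellipsoid):
    """Convert latitude, longitude and ellipsoidal height to X, Y, Z in metres."""
    a, f = ellipsoid
    e2 = f * (2 - f)  # first eccentricity squared
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical.
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z


p_wgs84 = geodetic_to_ecef(55.0, 80.0, 0.0, WGS84)
p_clarke = geodetic_to_ecef(55.0, 80.0, 0.0, CLARKE_1866)
# The same latitude and longitude describe different points in space.
offset_m = math.dist(p_wgs84, p_clarke)
print(round(offset_m, 1))
```

The offset comes out well above typical GPS accuracy, which is why coordinates should always be stored together with their datum.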
● The equidistant lines that run east and west each have a constant latitude
value and are called parallels.
● The equator is the largest circle and divides the earth in half.
● It is equal in distance from each of the poles, and the value of this latitude
line is zero.
● Locations north of the equator have positive latitudes that range from 0 to
+90 degrees, while locations south of the equator have negative latitudes
that range from 0 to -90 degrees.
● The lines that run north and south each have a constant longitude
value and form circles of the same size around the earth. They are known as
meridians.
● The prime meridian is the line of longitude that defines the origin (zero
degrees) for longitude coordinates.
● The latitude and longitude lines cover the globe to form a grid that is known
as a graticule. The point of origin of the graticule is (0,0), where the
equator and the prime meridian intersect.
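The sign conventions above can be sketched as a small conversion routine. This is a minimal sketch: the source states that northern latitudes are positive and southern latitudes negative, and the sketch assumes the common companion convention that eastern longitudes are positive and western longitudes negative. The function name is illustrative.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # Latitudes run from 0 to 90 degrees, longitudes from 0 to 180 degrees.
    limit = 90 if hemisphere in ("N", "S") else 180
    if value > limit:
        raise ValueError("coordinate out of range")
    # South latitudes and west longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


# The example location used earlier: latitude 55 North, longitude 80 East.
latitude = dms_to_decimal(55, 0, 0, "N")
longitude = dms_to_decimal(80, 0, 0, "E")
print(latitude, longitude)
```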
b) Projected Coordinate System (PCS):
● A projected coordinate system is defined on a flat, two-dimensional surface.
● In contrast to a geographic coordinate system, a projected coordinate system
has constant lengths, angles, and areas across the two dimensions.
● A projected coordinate system is always based on a geographic coordinate
system that is based on a sphere or spheroid.
● In a projected coordinate system, locations are identified by x,y coordinates
on a grid, with the origin at the center of the grid.
● Each position has two values that reference it to that central location.
● One specifies its horizontal position and the other its vertical position.
● The two values are called the x-coordinate and y-coordinate.
● Using this notation, the coordinates at the origin are x = 0 and y = 0.
● Mathematical formulas are used to convert a three-dimensional geographic
coordinate system to a two-dimensional flat projected coordinate system.
● The transformation is referred to as a map projection.
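As an illustration of such formulas, the following minimal sketch implements the spherical Mercator projection in both directions. It assumes a spherical earth with a mean radius of 6,371 km, and the function names are illustrative. Production GIS software uses ellipsoidal formulas and additional parameters such as false easting and false northing.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth model


def mercator_forward(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude and latitude (degrees) to x, y (metres)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_inverse(x, y, radius=EARTH_RADIUS_M):
    """Recover longitude and latitude (degrees) from x, y (metres)."""
    lon = math.degrees(x / radius)
    lat = math.degrees(2 * math.atan(math.exp(y / radius)) - math.pi / 2)
    return lon, lat


x, y = mercator_forward(80.0, 55.0)
lon, lat = mercator_inverse(x, y)
print(round(x), round(y), round(lon, 6), round(lat, 6))
```

The point where the equator meets the prime meridian maps to the grid origin, x = 0 and y = 0, which matches the description of the origin above.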
5.1.3 Map Projections and Coordinate Transformations:
● A map projection is one of many methods used to represent the 3-dimensional
surface of the earth or other round body on a 2-dimensional
plane in cartography (mapmaking).
● This process is typically, but not necessarily, a mathematical procedure
(some methods are graphically based).
● The creation of a map projection involves three steps in which information
is lost in each step:
○ Selection of a model for the shape of the earth or round body (choosing
between a sphere or ellipsoid)
○ Transform geographic coordinates (longitude and latitude) to plane
coordinates (eastings and northings).
○ Reduce the scale (in manual cartography this step came second, in
digital cartography it comes last)
● Metric properties of maps
○ Maps assume that the viewer has an orthogonal view of the map (they
are looking straight down on every point).
○ This is also called a perpendicular view or normal view.
○ The metric properties of a map are:
■ Area
■ Shape
■ Direction
■ Distance
■ Scale
● There are several different types of projections that aim to accomplish
different goals while sacrificing data in other areas through distortion.
○ Area preserving projection – equal area or equivalent projection
○ Shape preserving – conformal, orthomorphic
○ Direction preserving – conformal, orthomorphic, azimuthal (only from
the central point)
○ Distance preserving – equidistant (shows the true distance between one
or two points and every other point)
The most common types of map projections include:
1) Equal area projections
● These projections preserve the area of specific features.
● These projections distort shape, angle, and scale.
● The Albers Equal Area Conic projection is an example of an equal area
projection.
2) Conformal projections
● These projections preserve local shape for small areas.
● These projections preserve individual angles to describe spatial
relationships by showing perpendicular graticule lines that intersect at
90-degree angles on the map.
● All the angles are preserved; however, the area of the map is distorted.
● The Mercator and Lambert Conformal Conic projections are examples of
conformal projections.
3) Equidistant projections
● These projections preserve the distances between certain points by
maintaining the scale of a given data set.
● Some of the distances will be true distances, which are the same distances at
the same scale as the globe.
● If you go outside the data set, the scale will become more distorted.
● The Sinusoidal projection and the Equidistant Conic projection are
examples of equidistant projections.
4) True-direction or azimuthal projections
● These projections preserve the direction from one point to all other points
by maintaining some of the great circle arcs.
● These projections give the directions or azimuths of all points on the map
correctly with respect to the center.
● Azimuthal maps can be combined with equal area, conformal, and
equidistant projections.
● The Lambert Equal Area Azimuthal projection and the Azimuthal
Equidistant projection are examples of azimuthal projections.
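The trade-off between preserving area and preserving shape can be checked numerically. This is a minimal sketch on a spherical earth (an assumption) that compares the true area of a longitude/latitude cell with its projected area. For simplicity it uses the Lambert cylindrical equal-area projection, which is not one of the examples named above, alongside the spherical Mercator projection. All function names are illustrative.

```python
import math

R = 6371000.0  # assumed spherical earth radius in metres


def sphere_cell_area(lon1, lon2, lat1, lat2):
    """True area of a longitude/latitude cell on the sphere (inputs in degrees)."""
    dlon = math.radians(lon2 - lon1)
    return R * R * dlon * (math.sin(math.radians(lat2)) - math.sin(math.radians(lat1)))


def equal_area_xy(lon, lat):
    """Lambert cylindrical equal-area projection with the equator as standard parallel."""
    return R * math.radians(lon), R * math.sin(math.radians(lat))


def mercator_xy(lon, lat):
    """Spherical Mercator projection (conformal)."""
    phi = math.radians(lat)
    return R * math.radians(lon), R * math.log(math.tan(math.pi / 4 + phi / 2))


def projected_cell_area(project, lon1, lon2, lat1, lat2):
    """Area of the projected cell; both projections map the cell to a rectangle."""
    x1, y1 = project(lon1, lat1)
    x2, y2 = project(lon2, lat2)
    return abs(x2 - x1) * abs(y2 - y1)


cell = (10.0, 20.0, 60.0, 70.0)  # lon1, lon2, lat1, lat2 in degrees
true_area = sphere_cell_area(*cell)
ratio_equal_area = projected_cell_area(equal_area_xy, *cell) / true_area
ratio_mercator = projected_cell_area(mercator_xy, *cell) / true_area
print(round(ratio_equal_area, 6), round(ratio_mercator, 2))
```

The equal-area ratio stays at one, while the conformal Mercator projection enlarges the same high-latitude cell several times over.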
● An overview of coordinate transformation is given in the diagram below:
5.2 SATELLITE-BASED POSITIONING
● Satellites are the cornerstones of modern positioning and navigation.
● They are used by vehicle navigation systems, smartphones and land
surveyors alike.
● Based on satellite signals, the receiver can define its position anywhere in
the world with an accuracy of a few meters in less than a minute.
● In addition, the time can be defined as a by-product with an accuracy of
approximately one hundred nanoseconds.
● Using assistance systems, the position can be pinpointed with an accuracy
of a few centimeters.
● For a long time, the US-based GPS was synonymous with satellite positioning.
● However, the Russian GLONASS, as well as the European Galileo and the
Chinese BeiDou, which are at the deployment stage, are also openly
accessible to all.
● For this reason, satellite positioning is currently referred to as the
Global Navigation Satellite System (GNSS).
● Satellite positioning is ultimately based on the accurate transfer of time.
● Each of the four satellite positioning systems consists of some 20 to 30
satellites, orbiting at an altitude of approximately 20,000 kilometers.
● All of these satellites are equipped with a precise atomic clock, on the basis
of which they transmit a time signal down to Earth, as well as other data
that, for example, indicates the satellite position.
● In addition to three-dimensional position coordinates, the fourth factor to be
solved is the difference between the receiver clock and the satellite clocks.
● This can be done by using at least four satellites, the positions of which are
known – the use of several satellites improves the accuracy and reliability of
positioning.
● Because positioning is linked to time, timekeepers also make up a
significant group of GNSS users.
● Using GNSS signals, it is possible to synchronize devices and clocks that
are located far from each other.
● GNSS receivers do not automatically reveal their location.
● To use GNSS positioning, signals only need to be received.
● This means that users do not automatically reveal their location to the
system administrator or anyone else.
● This is a particularly important feature considering the original user of
satellite positioning, i.e. the military.
● However, many devices that use satellite positioning, such as smartphones,
tracking devices and autonomous vehicles, can transmit their location via
other channels, for example, to a cloud service or the device manufacturer,
for their purpose of use, to use performance-enhancing services or for
crowdsourcing.
Term | Description
Global Navigation Satellite System (GNSS) | A global satellite positioning system.
Global Positioning System (GPS) | A GNSS system maintained by the US Department of Defense.
Geosynchronous satellite | A satellite with an orbital period of 24 hours, i.e. a single satellite can be seen in the same place at the same time, every day, when viewed from the Earth’s surface. The orbit’s altitude is approximately 35,000 kilometers.
Geostationary orbit | A geosynchronous orbit following the Equator. A geostationary satellite seems to be stationary when viewed from the Earth’s surface.
Medium Earth orbit (MEO) | An orbit with a lower altitude than a geosynchronous orbit. Most positioning satellites use an MEO at an altitude of approximately 20,000 kilometers with an orbital period of 12 hours.
Satellite-based augmentation system (SBAS) | An augmentation system which does not produce any navigation signal but provides information about the reliability of GNSS signals.
European Space Agency (ESA) | ESA is responsible for the technical development of the Galileo system, together with the European Commission.
European GNSS Agency (GSA) | GSA is responsible for the services offered by the Galileo and EGNOS systems.
European Geostationary Navigation Overlay Service (EGNOS) | Europe’s SBAS.
COSPAS–SARSAT | A satellite system that receives emergency signals transmitted from the Earth, locates their sender, and transmits data to rescue authorities. The 406 MHz frequency is reserved only for this system.
Application fields of Satellite-based Positioning are:
➔ Surveying
➔ Military operations
➔ Engineering
➔ Vehicle tracking
➔ Flight navigation
➔ Car navigation
➔ Ship navigation
➔ Agriculture
➔ Mapping
Global Positioning System (GPS):
● The Global Positioning System (GPS) is a satellite-based navigation system
made up of a network of 24 satellites placed into orbit by the U.S.
Department of Defense.
● It is one of the global navigation satellite systems (GNSS) that provides
geolocation and time information to a GPS receiver anywhere on or near the
Earth where there is an unobstructed line of sight to four or more GPS
satellites.
● GPS is operated and maintained by the Department of Defense (DoD).
● The National Space-Based Positioning, Navigation, and Timing (PNT)
Executive Committee (EXCOM) provides guidance to the DoD on GPS-related
matters impacting federal agencies to ensure the system addresses
national priorities as well as military requirements.
● The Federal Aviation Administration oversees the use of GPS in civil
aviation and receives problem reports from aviation users.
● The Global Positioning System has been successful in virtually all
navigation and timing applications, and because its capabilities are
accessible using small, inexpensive equipment, GPS is being used in a wide
variety of applications across the globe.
Components of a GPS system
● GPS is a system made up of three parts: satellites, ground
stations, and receivers.
● The functions of each of these parts are:
● Satellites act like the stars in constellations. We know where they are
because they constantly send out signals.
● The ground stations use radar to make sure the satellites are
where we think they are.
● A receiver is a device that you might find in your phone or in your car,
and it constantly listens for the signals from the satellites.
● The receiver figures out how far away it is from some of the satellites.
● Once the receiver calculates its distance from four or more satellites, it
can work out where you are.
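How a receiver turns signal travel time into distance can be sketched as follows. This is a minimal sketch, assuming the signal travels at the speed of light in a vacuum and ignoring atmospheric delay. The function name is illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_travel_time(travel_time_s):
    """Distance in metres covered by a radio signal in the given time."""
    return SPEED_OF_LIGHT_M_S * travel_time_s


# A signal from a satellite about 20,000 km away needs roughly 67 ms.
travel_time_s = 20_000_000.0 / SPEED_OF_LIGHT_M_S
# A receiver clock that is wrong by one microsecond shifts every
# measured range by about 300 m, so the clock error must be solved for.
range_error_m = range_from_travel_time(1e-6)
print(round(travel_time_s * 1000, 1), round(range_error_m, 1))
```

This is why the receiver clock offset is treated as a fourth unknown next to the three position coordinates.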
Global Navigation Satellite System (GLONASS)
● Russia started developing GLONASS in 1976 as an experimental military
communications system.
● They launched the first GLONASS satellite in 1982 and the constellation
became fully operational in 1995.
● The Russian Global Navigation Satellite System (GLONASS) was
developed contemporaneously with GPS but suffered from incomplete
coverage of the globe until the mid-2000s.
● GLONASS reception can be combined with GPS in a receiver,
making additional satellites available to enable faster position
fixes and improved accuracy, to within two meters (6.6 ft).
● GLONASS or Global Navigation Satellite System is the satellite navigation
system developed by Russia that consists of 24 satellites, in three orbital
planes, with eight satellites per plane.
● The satellites are placed into nominally circular orbits with target
inclinations of 64.8 degrees and an orbital altitude of 19,140 km, about 1,060
km lower than GPS satellites, with an orbit period of 11 hrs and 15 minutes.
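The quoted orbit period can be cross-checked with Kepler's third law. This is a minimal sketch, assuming a circular orbit around a spherical earth of mean radius 6,371 km and treating the 19,140 km figure as a height above the surface. The GPS height of 20,200 km follows from the stated 1,060 km difference. The function name is illustrative.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # earth's gravitational parameter
EARTH_RADIUS_KM = 6371.0       # assumed mean radius of a spherical earth


def orbital_period_hours(altitude_km):
    """Period of a circular orbit at the given height above the surface."""
    semi_major_axis_km = EARTH_RADIUS_KM + altitude_km
    period_s = 2 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return period_s / 3600


for name, altitude_km in [("GLONASS", 19140.0), ("GPS", 20200.0)]:
    print(name, round(orbital_period_hours(altitude_km), 2))
```

Under these assumptions the GLONASS period comes out close to 11 hours 15 minutes and the GPS period close to 12 hours.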
Versions of GLONASS:
○ GLONASS - These satellites were launched in 1982 for the military and
official organizations. They were intended for weather, positioning,
timing and velocity measurements.
○ GLONASS-M - These satellites were launched in 2003 to add a second civil
code, which is important for GIS mapping receivers.
○ GLONASS-K - These satellites were launched in 2011 to add a third civil
frequency. These are of 3 types - K1, K2 and KM.
○ GLONASS-K2 - These satellites will be launched after 2015 (currently in
design phase).
○ GLONASS-KM - These satellites will be launched after 2025 (currently in
research phase).
Galileo
● Galileo is Europe’s Global Navigation Satellite System (GNSS), providing
improved positioning and timing information with significant positive
implications for many European services and users.
● Galileo allows users to know their exact position with greater precision than
what is offered by other available systems.
● The products that people use every day, from the navigation device in your
car to a mobile phone, benefit from the increased accuracy that Galileo
provides.
● Critical emergency-response services benefit from Galileo.
● Galileo’s services will make Europe’s roads and railways safer and more
efficient.
● It boosts European innovation, contributing to the creation of many new
products and services, creating jobs.
● Until now, GNSS users have had to depend on non -civilian American GPS
or Russian GLONASS signals.
● With Galileo, users now have a new, reliable alternative that, unlike these
other programmes, remains under civilian control.
● While European independence is a principal objective of the programme,
Galileo also gives Europe a seat at the rapidly expanding GNSS global
table.
● The programme is designed to be compatible with all existing and planned
GNSS and interoperable with GPS and GLONASS.
● In this sense, Galileo is positioned to enhance the coverage currently
available – providing a more seamless and accurate experience for
multi-constellation users around the world.
5.2.1 Absolute positioning
Positioning with GPS can be performed in either of two ways: point
positioning or relative positioning. GPS absolute positioning, that is, single
point positioning, uses one receiver to determine the spatial distances
between the satellites and the receiver antenna, and then determines the point
coordinates in the WGS-84 system through the rear intersection (resection) of
those spatial distances.
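The resection of distances can be sketched as a small least-squares problem with four unknowns: three coordinates and the receiver clock offset. This is a minimal sketch, not a receiver algorithm: it assumes error-free measurements, uses made-up satellite coordinates in an earth-centred frame, and applies a plain Gauss-Newton iteration. All names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_position(satellites, pseudoranges, iterations=10):
    """Estimate receiver x, y, z (metres) and clock offset (seconds)."""
    est = [0.0, 0.0, 0.0, 0.0]  # x, y, z and c * clock offset, all in metres
    for _ in range(iterations):
        rows, residuals = [], []
        for sat, rho in zip(satellites, pseudoranges):
            d = math.dist(sat, est[:3])
            # Partial derivatives of the predicted pseudorange.
            rows.append([(est[i] - sat[i]) / d for i in range(3)] + [1.0])
            residuals.append(rho - (d + est[3]))
        # Normal equations (G^T G) delta = G^T r give the least-squares step.
        gtg = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
        gtr = [sum(r[i] * res for r, res in zip(rows, residuals)) for i in range(4)]
        delta = solve_linear(gtg, gtr)
        est = [e + d for e, d in zip(est, delta)]
    return est[0], est[1], est[2], est[3] / C


# Synthetic check with made-up coordinates (metres).
true_position = (6371000.0, 0.0, 0.0)
true_clock_offset_s = 1e-3
satellites = [
    (26571000.0, 0.0, 0.0),
    (20000000.0, 17000000.0, 0.0),
    (20000000.0, -17000000.0, 0.0),
    (20000000.0, 0.0, 17000000.0),
    (20000000.0, 0.0, -17000000.0),
]
pseudoranges = [math.dist(s, true_position) + C * true_clock_offset_s for s in satellites]
x, y, z, clock_offset_s = solve_position(satellites, pseudoranges)
print(round(x, 3), round(y, 3), round(z, 3), clock_offset_s)
```

With more than four satellites the same normal equations yield a least-squares fit instead of an exact solution.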
5.2.2 Errors in absolute positioning
● Receiver users are required to be familiar with the technology to avoid real
operating blunders such as
○ poor receiver placement or
○ incorrect receiver software settings,
which can render positioning results virtually useless.
● Another error results from the combination of signals from the satellites used
for positioning. It is related to the relative geometry of the satellites and
the receiver.
● The following figure illustrates absolute and relative position errors.
● The blue and red points are true positions and absolute position estimations,
respectively.
● The red points are transformed as close to the blue points as possible by
minimizing a norm of their differences.
● The optimal transformed positions are the pink points, which are defined as
relative position estimations.
● The constellation of satellites in the sky from the receiver’s perspective is
the controlling factor in these cases.
● This source of error is known as geometric dilution of precision (GDOP).
● GDOP is lower when satellites are just above the horizon in mutually
opposed compass directions.
● However, such satellite positions have bad atmospheric delay
characteristics, so in practice it is better if they are at least 15° above the
horizon.
● When more than four satellites are in view, modern receivers use
“least-squares” adjustment to calculate the best possible positional fix from
all the signals.
● An overview of some typical values (without selective availability) is
provided in the Table below.
● GDOP functions not so much as an independent error source but rather as a
multiplying factor, decreasing the precision of position and time values
obtained.
● The procedure that we discussed above is known as absolute, single-point
positioning based on code measurement.
● It is the fastest and simplest, yet least accurate, means of determining a
position using satellites.
● It suffices for recreational purposes and other applications that require
horizontal accuracies to within 5–10 m.
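The effect of satellite geometry can be sketched by computing GDOP directly. This is a minimal sketch, assuming the common definition GDOP = sqrt(trace((GᵀG)⁻¹)), where each row of G holds the unit line-of-sight vector to one satellite plus a clock term. The satellite directions below are made up, and all names are illustrative.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def gdop(azimuth_elevation_deg):
    """GDOP for satellites given as (azimuth, elevation) pairs in degrees."""
    rows = []
    for az, el in azimuth_elevation_deg:
        az, el = math.radians(az), math.radians(el)
        # East, north and up components of the line of sight, plus clock term.
        rows.append([math.cos(el) * math.sin(az),
                     math.cos(el) * math.cos(az),
                     math.sin(el),
                     1.0])
    gtg = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    q = invert(gtg)
    return math.sqrt(sum(q[i][i] for i in range(4)))


# Four satellites 15 degrees above the horizon in opposed directions plus one overhead.
spread = [(0, 15), (90, 15), (180, 15), (270, 15), (0, 90)]
# Five satellites bunched together in one part of the sky.
clustered = [(0, 60), (20, 70), (40, 65), (10, 80), (30, 75)]
print(round(gdop(spread), 2), gdop(clustered) > gdop(spread))
```

Multiplying the ranging error by GDOP gives a rough estimate of the resulting position and time error, which is the multiplying role described above.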
5.2.3 Relative positioning
● In relative positioning, one of the two receivers involved occupies a known
position during the session, known as the base.
● The objective of the work is the determination of the position of the other,
the rover, relative to the base.
● Both receivers observe the same constellation of satellites at the same time.
● In typical applications, the vector between the base and the rover, known as
a baseline, is short compared with the roughly 20,000 km altitude of the
satellites.
● The two receivers therefore record very similar errors (biases), and since the
base’s position is known, corrections can be generated there that can be used
to improve the solution at the rover.
● Example:
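The idea of correcting a rover with the error seen at a base can be sketched in a few lines. This is a minimal sketch that works on positions in a projected grid and assumes that base and rover share exactly the same error at the same moment. Real systems usually correct the individual range measurements. The coordinates and the function name are made up.

```python
def differential_correction(base_known, base_measured, rover_measured):
    """Correct a rover fix using the error observed at a base station.

    All arguments are (easting, northing) pairs in metres.
    """
    error = tuple(m - k for m, k in zip(base_measured, base_known))
    return tuple(r - e for r, e in zip(rover_measured, error))


base_known = (500000.0, 4100000.0)
base_measured = (500003.0, 4099998.0)   # 3 m east and 2 m south of the truth
rover_measured = (500853.0, 4100398.0)
rover_corrected = differential_correction(base_known, base_measured, rover_measured)
# The baseline is the vector from the base to the corrected rover position.
baseline = tuple(r - b for r, b in zip(rover_corrected, base_known))
print(rover_corrected, baseline)
```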
5.2.4 Network positioning
● Network and multireceiver positioning are obvious extensions of relative
positioning.
● Both the creation of a closed network of points by combining individually
observed baselines and the operation of three or more receivers
simultaneously have advantages.
● For example, the baselines have redundant measurements and similar, if not
identical, range errors (biases).
● The processing methods in such an arrangement can nearly eliminate many
of the biases introduced by imperfect clocks and the atmosphere.
● These processing strategies are based on computing the differences between
simultaneous GPS carrier phase observations.
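How differencing removes clock biases can be sketched with a double difference between two receivers and two satellites. This is a minimal sketch that uses ranges in metres instead of carrier phase cycles and ignores atmospheric effects and phase ambiguities. The coordinates and clock errors are made up, and all names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def observed_range(receiver, satellite, receiver_clock_s, satellite_clock_s):
    """Geometric range plus the range errors caused by both clocks."""
    return math.dist(receiver, satellite) + C * (receiver_clock_s - satellite_clock_s)


def double_difference(obs):
    """obs[r][s] is the range from receiver r to satellite s (two of each)."""
    return (obs[0][0] - obs[1][0]) - (obs[0][1] - obs[1][1])


receivers = [(6371000.0, 0.0, 0.0), (6371000.0, 5000.0, 0.0)]
satellites = [(20000000.0, 17000000.0, 0.0), (20000000.0, 0.0, 17000000.0)]
receiver_clocks = [2e-4, -5e-4]    # seconds, made-up receiver clock errors
satellite_clocks = [1e-6, -3e-6]   # seconds, made-up satellite clock errors

with_clock_errors = [
    [observed_range(r, s, rc, sc) for s, sc in zip(satellites, satellite_clocks)]
    for r, rc in zip(receivers, receiver_clocks)
]
geometry_only = [[math.dist(r, s) for s in satellites] for r in receivers]

dd_observed = double_difference(with_clock_errors)
dd_geometry = double_difference(geometry_only)
print(abs(dd_observed - dd_geometry) < 1e-6)
```

Differencing between receivers cancels the satellite clock errors, and differencing between satellites cancels the receiver clock errors, so only geometry remains.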
5.2.5 Code versus phase measurements
Code phase
● Code phase is one processing technique that gathers data via a C/A (coarse
acquisition) code receiver.
● It uses the information contained in the satellite signals (aka the
pseudo-random code) to calculate positions.
● After differential correction, this processing technique results in 1–5 meter
accuracy.
● 100-meter accuracy can improve to between 5 meters and sub-meter,
depending on the data collection technique and the data receiver used.
● Differential correction is a data collection technique that removes errors in
GPS data created by selective availability, atmospheric delay, and
ephemeris errors.
● Some of the other factors (less correctable) that create error in GPS data
include:
1. multipath
2. low number of visible satellites
3. large distance between the rover and the base station (10 mm
degradation with every kilometer away from base station)
4. high PDOP (position dilution of precision - a measure of the
current satellite geometry)
5. low SNR (signal to noise ratio)
6. low satellite elevation
7. short occupation period
● Pathfinder Office software comes with a differential correction utility that
calculates error at a known point (base station) and uses this error to correct
(or improve the accuracy of) the rover file data you collected with your GPS
unit.
Carrier phase
● Carrier phase is another processing technique that gathers data via a carrier
phase receiver, which uses the radio signal (aka carrier signal) to calculate
positions.
● The carrier signal, which has a much higher frequency than the
pseudo-random code, is more accurate than using the pseudo-random code alone.
● The pseudo-random code narrows the reference, and then the carrier code
narrows the reference even more.
● After differential correction, this processing technique results in sub-meter
accuracy.
● The carrier phase receivers are much more accurate than C/A code receivers
but require more involved post -processing and stricter data collection
requirements.
➔ Code phase processing - GPS measurements based on the pseudo-random
code (C/A or P) as opposed to the carrier of that code. (1–5 meter
accuracy)
➔ Carrier phase processing - GPS measurements based on the L1 or L2
carrier signal. (sub-meter accuracy)
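The difference in precision between code and carrier measurements can be related to the length of one code chip and one carrier cycle. This is a minimal sketch that uses the standard published GPS values for the L1 and L2 carrier frequencies and the C/A code chipping rate, which are not given in the text above. The function name is illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
L1_CARRIER_HZ = 1575.42e6        # GPS L1 carrier frequency
L2_CARRIER_HZ = 1227.60e6        # GPS L2 carrier frequency
CA_CODE_CHIP_RATE_HZ = 1.023e6   # C/A code chipping rate


def wavelength_m(frequency_hz):
    """Length in metres of one cycle (or one code chip) at the given rate."""
    return SPEED_OF_LIGHT_M_S / frequency_hz


l1_wavelength = wavelength_m(L1_CARRIER_HZ)
l2_wavelength = wavelength_m(L2_CARRIER_HZ)
ca_chip_length = wavelength_m(CA_CODE_CHIP_RATE_HZ)
print(round(l1_wavelength, 4), round(l2_wavelength, 4), round(ca_chip_length, 1))
```

One C/A code chip spans about 293 m while one L1 carrier cycle spans about 19 cm, so resolving the same fraction of a cycle gives a far finer measurement on the carrier.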
5.2.6 Positioning technology
● A positioning system is a mechanism for determining the position of an
object in space.
● Technologies for this task exist ranging from worldwide coverage with
meter accuracy to workspace coverage with sub-millimetre accuracy.
● Multiple technologies exist to determine the position and orientation of an
object or person in a room, building or in the world. They are as follows:
○ Time of flight
■ Time of flight systems determine the distance by measuring the time
of propagation of pulsed signals between a transmitter and receiver.
■ When distances of at least three locations are known, a fourth position
can be determined using trilateration.
■ Global Positioning System is an example.
○ Spatial scan
■ A spatial scan system uses (optical) beacons and sensors. Two
categories can be distinguished:
● Inside out systems where the beacon is placed at a fixed position in
the environment and the sensor is on the object
● Outside in systems where the beacons are on the target and the
sensors are at a fixed position in the environment
■ By aiming the sensor at the beacon, the angle between them can be
measured.
○ Inertial sensing
■ The main advantage of inertial sensing is that it does not require an
external reference.
■ Instead, it measures rotation with a gyroscope or position with an
accelerometer with respect to a known starting position and
orientation.
○ Mechanical linkage
■ This type of tracking system uses mechanical linkages between the
reference and the target.
■ Two types of linkages have been used.
● One is an assembly of mechanical parts that can each rotate,
providing the user with multiple rotation capabilities.
● The orientation of the linkages is computed from the various
linkage angles measured with incremental encoders or
potentiometers.
● Other types of mechanical linkages are wires that are rolled in
coils.
● A spring system ensures that the wires are tensed in order to
measure the distance accurately.
○ Phase difference
■ Phase difference systems measure the shift in phase of an incoming
signal from an emitter on a moving target compared to the phase of an
incoming signal from a reference emitter.
■ With this the relative motion of the emitter with respect to the receiver
can be calculated.
○ Direct field sensing
■ Direct field sensing systems use a known field to derive orientation or
position:
● A simple compass uses the Earth's magnetic field to know its
orientation in two directions.
● An inclinometer uses the earth's gravitational field to know its
orientation in the remaining third direction.
■ The field used for positioning does not need to originate from nature.
○ Optical systems
■ Optical positioning systems are based on optics components, such as
in total stations.
○ Magnetic positioning
■ Magnetic positioning is an IPS (Indoor positioning system) solution
that takes advantage of the magnetic field anomalies typical of indoor
settings by using them as distinctive place recognition signatures.
○ Hybrid systems
■ Because every technology has its pros and cons, most systems use
more than one technology.
■ A system based on relative position changes, like the inertial system,
needs periodic calibration against a system with absolute position
measurement.
■ Systems combining two or more technologies are called hybrid
positioning systems.
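The trilateration step mentioned under time of flight can be sketched in two dimensions. This is a minimal sketch, assuming three beacons at known positions in a plane and error-free distances. Subtracting the first circle equation from the other two gives two linear equations in x and y. The beacon layout and the function name are made up.

```python
import math


def trilaterate(beacons, distances):
    """Locate a point in the plane from its distances to three known beacons."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = distances
    # Linear system obtained by subtracting circle 1 from circles 2 and 3.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1 ** 2 - r2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = r1 ** 2 - r3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons must not lie on one line")
    # Cramer's rule for the 2 x 2 system.
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
target = (3.0, 4.0)
distances = [math.dist(target, b) for b in beacons]
x, y = trilaterate(beacons, distances)
print(round(x, 6), round(y, 6))
```

Satellite positioning applies the same idea in three dimensions, with the receiver clock offset as an additional unknown.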
5.3 SUMMARY
● This chapter has set out to demonstrate the spatial position element of GIS.
● Many complexities and gross errors may occur unless some fundamental
principles are understood by the user of the system.
● Thus care in the selection and use of referencing and coordinate systems is
always essential.
● The most common difficulties arise when information derived by the use of
GPS is combined with that from maps produced using different datums.
● The combined effects of the use of different geodetic reference systems
(including different datums), projection change, grid shifts, and any
embedded errors in the maps can cause great practical problems.
● As indicated in the chapter, maps are the basis for much current and most
historical data in GIS.
5.4 EXERCISE
1. What is the Geographic Coordinate System?
2. Write a short note on Datum.
3. What is Map Projection? List and explain types of Map Projections.
4. Explain the terms false origin, false easting and false northing with the help
of diagrams.
5. List and explain commonly used map projections.
6. Explain different Coordinate systems used in GIS.
7. Explain Spatial framework for mapping locations.
6
DATA ENTRY AND PREPARATION
Unit Structure :
6.0 Objective
6.1 Spatial Data Input
6.1.1 Direct spatial data capture
6.1.2 Indirect spatial data capture
6.1.3 Obtaining spatial data elsewhere
6.2 Data Quality
6.3 Data Preparation
6.3.1 Data checks and repairs
6.3.2 Combining data from multiple sources
6.4 Point Data Transformation
6.4.1 Interpolating discrete data
6.4.2 Interpolating continuous data
6.5 Summary
6.6 Exercise
6.0 OBJECTIVE
● The most important component of a GIS is the data.
● Geographic data or Spatial data and related tabular data can be collected
in-house or bought from a commercial data provider.
● Spatial data can be in the form of a map/remotely -sensed data such as
satellite imagery and aerial photography.
● These data forms must be properly georeferenced (latitude/longitude).
● Tabular data can be in the form of attribute data that is in some way related
to spatial data.
● Most GIS software comes with inbuilt Database Management Systems
(DBMS) to create and maintain a database to help organize and manage
data.
● Relevant data must be input before analysis and modeling can be carried out
in a GIS.
● In this chapter we will see:
○ What is data?
○ How is data prepared and entered in the system?
○ Why is it important to see the quality of data?
○ What is data transformation and how is it implemented?
○ For point data transformation, we outline four techniques:
1. Trend surface fitting using regression,
2. Triangulation,
3. Spatial moving averages using inverse distance weighting,
4. Kriging.
6.1 SPATIAL DATA INPUT
● Spatial data is any type of data that directly or indirectly references a
specific geographical area or location.
● It is also known as geospatial data or geographic information. Spatial data
can also numerically represent a physical object in a geographic coordinate
system.
● However, spatial data is much more than a spatial component of a map.
● The data consist of two types:
○ data representing geographic features such as points, lines and areas, and
○ attribute data, i.e. descriptive information.
● Data input should be done with utmost care, as the results of analyses
heavily depend on the quality of the input data.
Types of spatial data:
In a GIS, the two primary types of spatial data are:
1) Vector data and
2) Raster data.
Let's discuss each in more detail:
Vector data model:
● Vector data is not made up of a grid of pixels.
● Instead, vector graphics are composed of vertices and paths.
● The three basic symbol types for vector data are:
○ Points:
■ Vector points are simply XY coordinates.
■ Generally, they are latitude and longitude with a spatial reference frame.
■ When features are too small to be represented as polygons, points are
used.
■ For example, you can’t see city boundary lines on a global scale. In this
case, maps often use points to display cities.
○ Lines:
■ Vector lines connect each vertex with paths.
■ Basically, you’re connecting the dots in a set order and it becomes a
vector line with each dot representing a vertex.
■ Lines usually represent features that are linear in nature.
■ For example, maps show rivers, roads, and pipelines as vector lines.
Often, busier highways have thicker lines than abandoned roads.
○ Polygons or areas:
■ When you join a set of vertices in a particular order and close it, this is
now a vector polygon feature.
■ When you create a polygon, the first and last coordinate pairs are the
same.
■ Cartographers use polygons to show boundaries and they all have an
area.
■ For example, a building footprint has square footage, and agricultural
fields have acreage.
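The three vector types can be sketched as plain coordinate lists. This is a minimal sketch, assuming planar x, y coordinates in metres for the line and the polygon so that length and area are meaningful. The sample coordinates and function names are made up.

```python
import math

# A point is a single coordinate pair (here longitude, latitude in degrees).
point = (80.0, 55.0)
# A line is an ordered list of vertices.
line = [(0.0, 0.0), (2.0, 1.0), (5.0, 1.0)]
# A polygon is a closed ring: the first and last coordinate pairs are the same.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a closed ring using the shoelace formula."""
    if ring[0] != ring[-1]:
        raise ValueError("ring must be closed: first and last vertex must match")
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


print(round(line_length(line), 3), polygon_area(polygon))
```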
Raster data model:
● Raster data is made up of pixels (also referred to as grid cells).
● They are usually regularly spaced and square but they don’t have to be.
● Rasters often look pixelated because each pixel has its own value or
class.
● Raster data models consist of two categories:
○ Discrete:
■ Discrete rasters have distinct themes or categories.
■ For example, one grid cell represents a land cover class or a soil type.
■ In other words, each land cover cell is definable and it fills the entire area
of the cell.
■ Discrete data usually consists of integers to represent classes.
■ For example, the value 1 might represent urban areas, the value 2
represents forest, and so on.
○ Continuous:
■ Continuous rasters (non-discrete) are grid cells with gradually changing
data such as elevation, temperature, or an aerial photograph.
■ A continuous raster surface can be derived from a fixed registration
point.
■ For example, digital elevation models use sea level as a registration
point.
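The difference between the two raster categories can be made concrete with two small grids. The sketch below is illustrative only: the legend codes, the elevation values and the helper names `class_counts` and `mean_value` are assumptions, not part of any GIS package.

```python
# Discrete raster: each cell holds an integer class code (hypothetical legend).
LEGEND = {1: "urban", 2: "forest", 3: "water"}
land_cover = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]

# Continuous raster: each cell holds a measured value, here elevation in
# metres above sea level (made-up numbers).
elevation = [
    [12.5, 14.0, 18.2],
    [11.9, 15.3, 19.7],
    [0.4, 0.8, 21.1],
]


def class_counts(raster):
    """Count cells per class; a typical summary for discrete data."""
    counts = {}
    for row in raster:
        for code in row:
            counts[code] = counts.get(code, 0) + 1
    return counts


def mean_value(raster):
    """Average cell value; meaningful for continuous data only."""
    cells = [value for row in raster for value in row]
    return sum(cells) / len(cells)


print({LEGEND[code]: n for code, n in class_counts(land_cover).items()})
print(round(mean_value(elevation), 2))
```

Averaging the class codes of `land_cover` would run without error but would be meaningless, which is why the distinction between discrete and continuous rasters matters in analysis.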
Spatial data can be obtained from various sources. It can be collected from
scratch, using direct spatial-data acquisition techniques, or indirectly, by
making use of existing spatial data collected by others.
6.1.1 Direct spatial data capture:
● We can obtain spatial data by direct observation of relevant geographic
phenomena.
● This can be done through ground-based field surveys or by using remote
sensors on satellites or aircraft.
● Many Earth science disciplines have developed specific survey techniques
as ground-based approaches remain the most important source of reliable
data in many cases.
● Data that are captured directly from the environment are called primary
data.
● With primary data, the core concern in knowing their properties is to know
the process by which they were captured, the parameters of any instruments
used, and the rigor with which quality requirements were observed.
● In practice, it is not always feasible to obtain spatial data by direct capture.
● Factors of cost and available time may be a hindrance, and sometimes
previous projects have acquired data that may fit a current project’s
purpose.
6.1.2 Indirect spatial data capture:
● The other way to obtain spatial data is indirectly, from existing sources.
● This includes data derived by:
○ scanning existing printed maps,
○ digitizing features from a satellite image,
○ purchasing processed data from data-capture firms or international
agencies, and so on.
● This type of data is known as secondary data.
● Secondary data are derived from existing sources and have been collected
for other purposes, often not connected with the investigation at hand.
● Spatial data have been collected in digital form at an increasing rate and
stored in various databases by the individual producers for their own use
and for commercial purposes.
● More and more of these data are being shared among GIS users.
● There are several reasons for this, such as:
○ Some data are freely available,
○ Other data are only available commercially, as is the case for most
satellite imagery.
○ High quality data remains both costly and time consuming to collect and
verify.
6.1.3 Obtaining spatial data elsewhere:
● New technologies have played a key role in the increasing availability of
geospatial data.
● As a result, a much wider variety of data is now available.
● However, we have to be more careful that the data we have acquired are of
sufficient quality to be used in analyses and decision-making.
● There are several related initiatives in the world to supply base data sets at
national, regional and global levels, as well as those aiming to harmonize
data models and definitions of existing data sets.
● Global initiatives include, for example,
○ the Global Map,
○ the USGS Global GIS database and
○ the Second Administrative Level Boundaries (SALB) project.
● SALB, for instance, is a UN project aiming at improving the availability of
information about administrative boundaries in developing countries.
6.2 DATA QUALITY
● Data quality is a measure of the condition of data based on factors such as,
○ Accuracy:
■ Accuracy can be described as the discrepancy between the actual attribute
value and the coded attribute value.
■ Positional accuracy is the quantifiable value that represents the
positional difference between two geospatial layers or between a
geospatial layer and reality.
■ Attribute accuracy indicates whether the attributes attached to the point,
line and polygon features of the spatial database are reliable and
reasonably correct or free from bias.
■ Temporal accuracy applies if the GIS data set has a temporal dimension,
so that the spatial information takes the form (x, y, z, t).
It concerns the correctness of that time component.
○ Completeness:
■ It is basically the measure of totality of features.
■ A data set with a minimal amount of missing features can be termed
complete data.
○ Relevancy:
■ You must consider whether you really need this information, or whether
you’re collecting it just for the sake of it.
■ Why does relevance matter as a data quality characteristic?
■ If you’re gathering irrelevant information, you’re wasting time as well as
money. Your analyses won’t be as valuable.
○ Validity:
■ Validity means that a piece of information doesn’t contradict another
piece of information in a different source or system.
■ When pieces of information contradict each other, you can’t trust the
data. You could make a mistake that could cost your firm money and
reputational damage.
○ Timeliness:
■ Timeliness refers to the time expectation for accessibility and availability
of information.
■ Timeliness can be measured as the time between when information is
expected and when it is readily available for use.
○ Consistency:
■ Data consistency can be termed as the absence of conflicts in a particular
database.
■ Logical consistency describes the fidelity of relationships encoded in the
data structure of the digital spatial data.
■ For example, a street arc that does not have a street name should not
have a street type.
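Positional accuracy, described in the list above as a quantifiable positional difference between a layer and reality, is often summarised as a root-mean-square error (RMSE) over matched check points. The sketch below assumes that measure; the text does not prescribe one, and the coordinates are made up.

```python
import math


def positional_rmse(measured, reference):
    """Root-mean-square of point-to-point distances between matched coordinate lists."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points: digitized positions versus surveyed ground truth.
digitized = [(100.0, 200.0), (153.0, 204.0), (210.0, 250.0)]
surveyed = [(100.0, 200.0), (150.0, 200.0), (210.0, 250.0)]
print(round(positional_rmse(digitized, surveyed), 3))
```

Here only the second point is displaced (by 5 units), so the RMSE is the square root of 25/3, a single figure that can be reported alongside the data set.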
Characteristic   How it’s measured
Accuracy         Is the information correct in every detail?
Completeness     How comprehensive is the information?
Validity         Does the information contradict other trusted resources?
Relevance        Do you really need this information?
Timeliness       How up-to-date is the information? Can it be used for real-time reporting?
● Data created from different channels with different techniques can have
discrepancies in terms of resolution, orientation and displacements.
● Improved data quality leads to better decision-making across an
organization.
● Good data decreases risk and can result in consistent improvements in
results.
● Data quality is the degree of data excellence that satisfies the given
objective.
Data lineage:
○ It is the process of understanding, recording, and visualizing data as it flows
from data sources to consumption.
○ This includes all transformations the data underwent along the way: how the
data was transformed, what changed, and why.
○ Data lineage describes:
■ the history of the spatial data,
■ including descriptions of the source material from which the data were
derived, and
■ the methods of derivation.
■ It also contains the dates of the source material, and
■ all transformations involved in producing the final digital files or map
products.
6.3 DATA PREPARATION
● Data preparation is the process of cleaning and transforming raw data prior
to processing and analysis.
● It is an important step prior to processing and often involves reformatting
data, making corrections to data and the combining of data sets to enrich
data.
● Data preparation is often a lengthy undertaking for data professionals or
business users, but it is essential as a prerequisite to put data in context in
order to turn it into insights and eliminate bias resulting from poor data
quality.
● For example, the data preparation process usually includes standardizing
data formats, enriching source data, and/or removing outliers.
● Data preparation helps:
○ Fix errors quickly
■ Data preparation helps catch errors before processing.
■ After data has been removed from its original source, these errors
become more difficult to understand and correct.
○ Produce top-quality data
■ Cleaning and reformatting datasets ensures that all data used in
analysis will be high quality.
○ Make better business decisions
■ Higher quality data that can be processed and analyzed more quickly
and efficiently leads to more timely, efficient and high-quality
business decisions.
Data Preparation Steps
The specifics of the data preparation process vary by industry, organization
and need, but the framework remains largely the same.
1. Gather data
● The data preparation process begins with finding the right data.
● This can come from an existing data catalog or can be added ad hoc.
2. Discover and assess data
● After collecting the data, it is important to discover each dataset.
● This step is about getting to know the data and understanding what has to
be done before the data becomes useful in a particular context.
3. Cleanse and validate data
● Cleaning up the data is traditionally the most time consuming part of the
data preparation process, but it’s crucial for removing faulty data and
filling in gaps.
● Important tasks here include:
○ Removing extraneous data and outliers.
○ Filling in missing values.
○ Conforming data to a standardized pattern.
○ Masking private or sensitive data entries.
● Once data has been cleansed, it must be validated by testing for errors in
the data preparation process up to this point.
● Oftentimes, an error in the system will become apparent during this step
and will need to be resolved before moving forward.
4. Transform and enrich data
● Transforming data is the process of updating the format or value entries
in order to reach a well-defined outcome, or to make the data more easily
understood by a wider audience.
● Enriching data refers to adding and connecting data with other related
information to provide deeper insights.
5. Store data
● Once prepared, the data can be stored.
● Or it can be channeled into a third party application, such as a business
intelligence tool, clearing the way for processing and analysis to take
place.
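The cleansing tasks listed under step 3 (removing outliers, filling in missing values, conforming data to a standardized pattern) can be sketched as follows. This is a minimal illustration under stated assumptions: outliers are defined by a fixed plausible range, gaps are filled with the median, and the function names `cleanse` and `standardize_name` are hypothetical.

```python
import statistics


def cleanse(values, low, high):
    """Drop out-of-range values as outliers and fill missing entries with the median."""
    valid = [v for v in values if v is not None and low <= v <= high]
    fill = statistics.median(valid)  # raises StatisticsError if nothing is valid
    cleaned = []
    for v in values:
        if v is None:
            cleaned.append(fill)  # fill in a missing value
        elif low <= v <= high:
            cleaned.append(v)  # keep a plausible value
        # anything outside [low, high] is dropped as an outlier
    return cleaned


def standardize_name(name):
    """Conform a place name to one pattern: single spaces, title case."""
    return " ".join(name.split()).title()


elevations = [12.0, None, 14.0, 9999.0, 13.0]  # made-up readings in metres
print(cleanse(elevations, 0, 9000))        # [12.0, 13.0, 14.0, 13.0]
print(standardize_name("  new   delhi "))  # New Delhi
```

Validation, the next task in the step, would then re-check the cleaned values, for example by asserting that no `None` entries remain.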
6.3.1 Data checks and repairs:
● Data checks and repairs refers to the step when acquired data sets must be
checked for quality in terms of accuracy, consistency and completeness.
● Errors can be identified automatically, after which manual editing methods
can be used to correct the errors.
● Alternatively, some software may identify and automatically correct certain
types of errors.
● The geometric, topological, and attribute components of spatial data can be
distinguished.
● Errors can be injected at many points in a GIS analysis, and one of the
largest sources of error is the data collected.
● Each time a new dataset is used in a GIS analysis, new error possibilities are
also introduced.
● One of the key benefits of GIS is the ability to use information from
many sources, so the need to have an understanding of the quality of the
data is extremely important.
● Accuracy in GIS is the degree to which information on a map matches
real-world values. It is an issue that pertains both to the quality of the data
collected and the number of errors contained in a dataset or a map.
● Precision refers to the level of measurement and exactness of description in
a GIS database. Precise location data may measure position to a fraction of
a unit (meters, feet, inches, etc.).
● The more accurate and precise the data, the higher the cost to obtain and
store it because it can be very difficult to obtain and will require larger data
files.
● Highly precise data does not necessarily correlate to highly accurate data
nor does highly accurate data imply high precision data.
● Clean -up operations are often performed in a standard sequence.
● For example, crossing lines are split before dangling lines are erased, and
nodes are created at intersections before polygons are generated.
● These steps are illustrated in the diagram below:
● With polygon data, one usually starts with many polylines, in an unwieldy
format known as spaghetti data, which are combined in a first step.
● This results in fewer polylines with more internal vertices.
● Then, polygons can be identified.
● Sometimes, polylines that should connect to form closed boundaries do not,
and therefore must be connected; this step is not indicated in the figure.
● In a final step, the elementary topology of the polygons can be derived (d).
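The statement earlier in this subsection that highly precise data is not necessarily highly accurate (and the reverse) can be illustrated numerically. The sketch below treats accuracy as the mean error against a known true value and precision as the spread of repeated measurements; that reading, the function name `bias_and_spread` and the sample numbers are assumptions for illustration.

```python
import statistics


def bias_and_spread(measurements, true_value):
    """Return (mean error, standard deviation) for repeated measurements of one quantity."""
    mean_error = statistics.fmean(measurements) - true_value
    spread = statistics.pstdev(measurements)
    return mean_error, spread


true_x = 100.0  # hypothetical true easting of a survey marker, in metres

# Tightly clustered but about 5 m off: precise, not accurate.
precise_but_inaccurate = [105.01, 105.02, 104.99, 105.00]
# Centred on the true value but scattered: accurate on average, not precise.
accurate_but_imprecise = [98.0, 102.0, 99.0, 101.0]

print(bias_and_spread(precise_but_inaccurate, true_x))
print(bias_and_spread(accurate_but_imprecise, true_x))
```

The first data set has a large mean error with a tiny spread, the second has no mean error with a spread of more than a metre, so neither property implies the other.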
Associating attributes
Attributes may be automatically associated with features that have unique
identifiers.
In the case of vector data, attributes are assigned directly to the features, while
in a raster the attributes are assigned to all cells that represent a feature.
Rasterization or vectorization
● Vectorization produces a vector data set from a raster.
● Another form of vectorization takes place when we want to identify features
or patterns in remotely sensed imagery.
● The keywords here are feature extraction and pattern recognition, which are
dealt with in Remote Sensing.
● Rasterization is a process of converting vector data sets to raster data when
some or all of the subsequent spatial data analysis is to be carried out on
raster data.
● It involves assigning point, line and polygon attribute values to raster cells
that overlap with the respective point, line or polygon.
● A cell size which is too large may result in cells that cover parts of multiple
vector features.
● Rasterization itself could be seen as a ‘backwards step’:
○ Firstly, raster boundaries are only an approximation of the objects’
original boundary.
○ Secondly, the original ‘objects’ can no longer be treated as such, as they
have lost their topological properties.
● An alternative to rasterization is to not perform it during the data
preparation phase, but to use GIS rasterization functions on-the-fly, that is
when the computations call for it.
● This allows keeping the vector data and generating raster data from them
when needed. Obviously, the issue of performance trade-off must be looked
into.
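Rasterization of a polygon, as described above, can be sketched by testing each cell centre against the polygon. This is one simple assignment rule among several (real GIS packages offer others, such as dominant-area rules); the grid origin at the lower-left corner, the function names and the example triangle are all assumptions.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a closed list of (x, y) pairs."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, value, n_rows, n_cols, cell_size, nodata=0):
    """Assign `value` to every cell whose centre lies inside the polygon.

    Row 0 is the bottom row (origin at the lower-left corner, an assumption).
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            row.append(value if point_in_polygon(cx, cy, ring) else nodata)
        grid.append(row)
    return grid


triangle = [(0, 0), (4, 0), (0, 4), (0, 0)]
grid = rasterize(triangle, 7, 4, 4, 1.0)
for row in reversed(grid):  # print with the top row first
    print(row)
```

The sloping edge of the triangle becomes a staircase of cells, which shows why raster boundaries are only an approximation of the original boundary; a smaller `cell_size` reduces the effect at the cost of more cells.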
Topology generation
● We have already discussed derivation of topology from vectorized data
sources.
● However, more topological relations may sometimes be needed, for
instance in networks, e.g. the questions of line connectivity, flow direction,
and which lines have over- and underpasses.
6.3.2 Combining data from multiple sources:
● A GIS project usually involves multiple data sets.
● Hence it is important to consider how these data sets relate to each
other and how they can be integrated.
● There are four fundamental cases to be considered in the combination of
data from different sources:
○ They may be about the same area, but differ in accuracy,
○ They may be about the same area, but differ in choice of representation,
○ They may be about adjacent areas, and have to be merged into a single
data set.
○ They may be about the same or adjacent areas, but referenced in different
coordinate systems.
Differences in accuracy:
Figure: The integration of two vector datasets, which represent the same
phenomenon, may lead to sliver polygons
● In the above Figure the polygons of two digitized maps at different scales
are overlaid.
● Due to scale differences in the sources, the resulting polygons do not
perfectly coincide, and polygon boundaries cross each other.
● This causes small artefact polygons in the overlay, known as sliver
polygons.
● If the map scales involved differ significantly, the polygon
boundaries of the large-scale map should probably take priority, but when
the differences are slight, we need interactive techniques to resolve the
issues.
Differences in representation
● When points need to be translated into rasters, we need to perform
something known as point data transformation.
● Some advanced GIS applications require the possibility of representing the
same geographic phenomenon in different ways. These are called
multirepresentation systems.
● The commonality is that phenomena must sometimes be
viewed as points, and at other times as polygons.
● For example, a small-scale national road
network analysis may represent villages as point objects, but a nation-wide
urban population density study should regard all municipalities as
represented by polygons.
● The links between various representations for the same object maintained
by the system allow switching between them, and many fancy applications
of their use seem possible.
● A comparison is illustrated in the Figure below:
Figure: Multi-scale and multi-representation systems compared; the main
difference is that multi-representation systems have a built-in ‘understanding’
that different representations belong together.
Merging data sets of adjacent areas:
● Sometimes data sets have to be matched into a single ‘seamless’ data set,
ensuring that the appearance of the integrated geometry is as homogeneous
as possible.
● Edge matching is the process of joining two or more map sheets.
● Merging adjacent data sets can be a major problem.
● Some GIS functions, such as line smoothing and data clean-up (removing
duplicate lines) may have to be performed.
● Following Figure illustrates a typical situation.
Figure: Multiple adjacent data sets, after cleaning, can be matched and merged
into a single one.
● Some GISs have merge or edge-matching functions to solve the problem
arising from merging adjacent data.
● At the map sheet edges, feature representations have to be matched in order
for them to be combined.
● Coordinates of the objects along shared borders are adjusted to match those
in the neighboring data sets.
● Mismatches may still occur, so a visual check, and interactive editing is
likely to be required.
Differences in coordinate systems:
● Map projections provide means to map geographic coordinates onto a flat
surface and vice versa.
● It may be the case that data layers which are to be combined or merged in
some way are referenced in different coordinate systems, or are based upon
different datums.
● As a result, data may need a coordinate transformation, or
both a coordinate transformation and a datum transformation.
● It may also be the case that data has been digitized from an existing map or
data layer.
● In this case, geometric transformations help to transform device coordinates
(coordinates from digitizing tablets or screen coordinates) into world
coordinates (geographic coordinates, meters, etc.).
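A geometric transformation from device coordinates to world coordinates, as mentioned in the last bullet, is often modelled as a six-parameter affine transformation. The sketch below assumes that model and uses made-up parameters; in practice the parameters would be estimated from control points, which is not shown here.

```python
def affine_transform(points, a, b, c, d, e, f):
    """Map device (x, y) to world (X, Y): X = a*x + b*y + c, Y = d*x + e*y + f."""
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]


# Hypothetical parameters: 0.5 m per device unit, no rotation, y axis flipped
# (device rows grow downward), upper-left corner at (500000, 4200000).
params = dict(a=0.5, b=0.0, c=500000.0, d=0.0, e=-0.5, f=4200000.0)

device = [(0, 0), (100, 0), (100, 200)]
world = affine_transform(device, **params)
print(world)
# [(500000.0, 4200000.0), (500050.0, 4200000.0), (500050.0, 4199900.0)]
```

Note that this only places digitized coordinates into a planar world system; moving between map projections or datums requires the projection and datum transformations discussed above, which an affine model does not provide.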
Other data preparation functions:
A range of other data preparation functions exist that support conversion or
adjustment of the acquired data to format requirements that have been defined
for data storage purposes.
These include:
● Format transformation functions
○ These convert between data formats of different systems or
representations, e.g. reading a DXF file into a GIS.
○ The user should be warned that conversions from one format to another
may cause problems.
○ The reason is that not all formats can capture the same information, and
therefore conversions often mean loss of information.
● Graphic element editing
○ Manual editing of digitized features so as to correct errors, and to prepare
a clean data set for topology building.
● Coordinate thinning
○ A process that is often applied to remove redundant or excess vertices
from line representations, as obtained from digitizing.
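Coordinate thinning can be illustrated with the Douglas-Peucker algorithm, a widely used line simplification method. The text does not name a specific algorithm, so this choice, the function names and the tolerance value are assumptions.

```python
import math


def _point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def thin(line, tolerance):
    """Remove vertices that deviate less than `tolerance` from the simplified line."""
    if len(line) < 3:
        return list(line)
    distances = [_point_line_distance(p, line[0], line[-1]) for p in line[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [line[0], line[-1]]  # every interior vertex is redundant
    split = worst + 1  # index of the farthest vertex in `line`
    left = thin(line[: split + 1], tolerance)
    right = thin(line[split:], tolerance)
    return left[:-1] + right  # do not duplicate the shared vertex


# A digitized line with small wobbles along two nearly straight stretches.
digitized = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (4, 3), (5, 3.02), (6, 3)]
print(thin(digitized, 0.1))  # [(0, 0), (3, 0), (4, 3), (6, 3)]
```

The wobbling vertices are dropped while the corners that define the shape are kept; a larger tolerance removes more vertices at the cost of positional detail.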
6.4 POINT DATA TRANSFORMATION
We will see several methods of transforming point data in a GIS:
Interpolation
● We may want to transform our points into other representations in order to
facilitate interpretation and/or integration with other data.
● Examples include defining homogeneous areas (polygons) from our point
data, or deriving contour lines.
● This is generally referred to as interpolation, i.e. the calculation of a
value from ‘surrounding’ observations.
● The principle of spatial autocorrelation plays a central part in the process of
interpolation.
Nearest-neighbour interpolation
● In order to predict the value of a point for a given (x, y) location, we could
simply find the ‘nearest’ known value to the point, and assign that value.
● This is the simplest form of interpolation, known as nearest-neighbour
interpolation.
● We might instead choose to use the distance that points are away from (x, y)
to weight their importance in our calculation.
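Nearest-neighbour interpolation as described above amounts to a minimum-distance search. A minimal sketch, assuming planar coordinates and a hypothetical list of soil samples:

```python
import math


def nearest_neighbour(samples, x, y):
    """samples: list of ((x, y), value). Return the value of the closest sample."""
    _, value = min(samples, key=lambda s: math.hypot(s[0][0] - x, s[0][1] - y))
    return value


# Made-up qualitative measurements at three locations.
soil_samples = [((0, 0), "clay"), ((10, 0), "sand"), ((5, 8), "loam")]

print(nearest_neighbour(soil_samples, 2, 1))  # clay
print(nearest_neighbour(soil_samples, 6, 6))  # loam
```

Because the result is always one of the measured values, the method also works for qualitative data such as soil type, where averaging would make no sense.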
Discrete and continuous fields
● How we represent a field constructed from point measurements in the GIS
also depends on the above distinction.
● A discrete field can either be represented as a classified raster or as a
polygon data layer, in which each polygon has been assigned a (constant)
field value.
● A continuo us field can be represented as an un -Discrete and continuous
classified raster, as an isoline (thus, vector) data layer, or perhaps as a TIN.
● Some field GIS software only provides the option of generating raster
output, requiring an intermediate step of r aster to vector conversion.
● The choice of representation depends on what will be done with the data in
the analysis phase.
6.4.1 Interpolating discrete data
● If we are dealing with discrete data, we are effectively restricted to using
nearest-neighbour interpolation.
● In a nearest-neighbour interpolation, each location is assigned the value of
the closest measured point.
● Effectively, this technique will construct ‘zones’ around the
points of measurement, with each point belonging to a zone assigned the
same value.
● Effectively, this represents an assignment of an existing value to a location.
● If the desired output was a polygon layer, we could construct Thiessen
polygons around the points of measurement.
● The boundaries of such polygons, by definition, are the locations for which
more than one point of measurement is the closest point.
● If the desired output was in the form of a raster layer, we could rasterize the
Thiessen polygons.
Figure: Generation of Thiessen polygons for qualitative point measurements.
The measured points are indicated in dark green; the darker area indicates all
locations assigned with the measurement value of the central point.
6.4.2 Interpolating continuous data
There are many continuous geographic fields; elevation, temperature and
groundwater salinity are just a few examples.
Commonly, continuous fields are represented as rasters, and we will almost by
default assume that they are.
The main alternative for continuous field representation is a polyline vector
layer, in which the lines are isolines.
The aim is to use measurements to obtain a representation of the entire field
using point samples.
In this section we outline four techniques to do so:
1. Trend surface fitting using regression,
2. Triangulation,
3. Spatial moving averages using inverse distance weighting,
4. Kriging.
1. Trend surface fitting using regression,
● A surface interpolation method that fits a polynomial surface by
least-squares regression through the sample data points.
● This method results in a surface that minimizes the variance of the
surface in relation to the input values.
● The resulting surface rarely goes through the sample data points.
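2. Triangulation,
● In interpolation, triangulation means constructing a network of
non-overlapping triangles (a triangulated irregular network, or TIN) whose
vertices are the measured points.
● A Delaunay triangulation is commonly preferred because it tends to avoid
long, thin triangles.
● The value at any other location is estimated from the three vertices of the
triangle that contains it, typically by linear interpolation across the
triangle, so the resulting surface passes through the sample data points.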
3. Spatial moving averages using inverse distance weighting,
● Inverse distance weighted (IDW) interpolation determines cell values
using a linearly weighted combination of a set of sample points.
● The weight is a function of inverse distance. The surface being
interpolated should be that of a locationally dependent variable.
4. Kriging.
● Kriging is a multistep process; it includes exploratory statistical analysis
of the data, variogram modeling, creating the surface, and (optionally)
exploring a variance surface.
● Kriging is most appropriate when you know there is a spatially correlated
distance or directional bias in the data.
● Kriging predicts the value of a function at a given point by computing a
weighted average of the known values of the function in the
neighborhood of the point.
● The method is closely related to regression analysis.
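Of the four techniques, inverse distance weighting is the easiest to sketch directly from its description: a weighted average in which each weight is a function of inverse distance. The version below assumes weights of the form 1 / distance^power with a default power of 2, a common choice that the text does not specify; the sample values are made up.

```python
import math


def idw(samples, x, y, power=2.0):
    """samples: list of ((x, y), value). Weighted mean with weights 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (sx, sy), value in samples:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return float(value)  # the location coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical elevation samples, 10 units apart.
elev = [((0, 0), 10.0), ((10, 0), 20.0)]

print(idw(elev, 5, 0))  # halfway: both samples weigh equally
print(idw(elev, 2, 0))  # closer to the first sample, so closer to 10
print(idw(elev, 0, 0))  # on a sample point: returns its value
```

Unlike nearest-neighbour interpolation, the result changes gradually between sample points, which suits continuous fields; a higher `power` makes nearby samples dominate more strongly.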
6.5 SUMMARY
● GIS will help to ascertain the ground level realities with the help of spatial
data obtained from various resources.
● In GIS one can integrate data from various sources, such as remote sensing
data and imagery, with data from land records and the agricultural census.
● It would be more appropriate to use GIS applications in agro -based
enterprise to ascertain the scope of activities and monitoring of activities.
● Digital data can be obtained directly from spatial data providers, or from
pre-existing GIS application projects.
● Sometimes, however, the data must be obtained from non -digital sources
such as paper maps. In all of these cases, data quality is a key consideration.
● Data cleaning and preparation involves checking for errors and
inconsistencies, and simplifying and merging existing spatial data sets.
● The problems that one may encounter may be caused by differences in
resolution and differences in representation.
● We have discussed various methods to address these issues in this chapter.
6.6 EXERCISE
1. Distinguish between primary and secondary data and give examples of
each.
In what circumstances is this distinction difficult to maintain?
2. Rasterization of vector data is sometimes required in data preparation. What
reasons may exist for this?
3. What is secondary data in GIS? Explain any two ways to obtain secondary
data in GIS.
4. List the four issues in obtaining data from multiple sources. Explain any
two of them.
Reference:
1) https://www.techtarget.com/searchdatamanagement/definition/spatial-data
2) https://gisgeography.com/spatial-data-types-vector-raster/
3) https://www.talend.com/resources/what-is-data-preparation/
7
SPATIAL DATA ANALYSIS
Unit Structure :
7.0 Objectives
7.1 Introduction
7.2 Classification of analytical GIS capabilities
7.3 Retrieval, classification and measurement
7.3.1 Measurement
7.3.2 Spatial selection queries
7.3.3 Classification
7.4 Overlay functions
7.4.1 Vector overlay operators
7.4.2 Raster overlay operators
7.4.3 Overlays using a decision table
7.5 Neighbourhood functions
7.5.1 Proximity computations
7.5.2 Computation of diffusion
7.5.3 Flow computation
7.5.4 Raster based surface analysis
7.6 Summary
7.7 References
7.8 Unit End Questions
7.0 OBJECTIVES
After going through this chapter, you will be able to:
● Describe the analytical functions that can form the building blocks for
application models.
● Explain how these operations can be combined in various ways for
increasingly complex analyses.
● Give an overview of different types of analytical models and related concepts.
7.1 INTRODUCTION
The discussion up until this point has sought to prepare the reader for the ‘data
analysis’ phase. So far, we have discussed the nature of spatial data,
georeferencing, notions of data acquisition and preparation, and issues relating
to data quality and error.
Before we move on to discuss a range of analytical operations, we should
begin with some clarifications. We know from preceding discussions that the
analytical capabilities of a GIS use spatial and non-spatial (attribute) data to
answer questions and solve problems that are of spatial relevance. It is
important to make a distinction between analysis, as discussed here, and
analytical models. By analysis we mean only a
subset of what is usually implied by the term: we do not specifically deal with
statistical analysis.
7.2 CLASSIFICATION OF ANALYTICAL GIS CAPABILITIES
There are many ways to classify the analytical functions of a GIS. The
classification used for this chapter is essentially the one put forward by
Aronoff [3]. It makes the following distinctions, which are addressed in
subsequent sections of the chapter:
1. Classification, retrieval, and measurement functions. All functions in this
category are performed on a single (vector or raster) data layer, often
using the associated attribute data.
● Classification allows the assignment of features to a class on the basis of
attribute values or attribute ranges (definition of data patterns). On the basis
of reflectance characteristics found in a raster, pixels may be classified as
representing different crops, such as potato and maize.
● Retrieval functions allow the selective search of data. We might thus retrieve
all agricultural fields where potato is grown.
● Generalization is a function that joins different classes of objects
with common characteristics to a higher level (generalized) class.
● Measurement functions allow the calculation of distances, lengths, or areas.
2. Overlay functions. These belong to the most frequently used functions in a
GIS application. They allow the combination of two (or more) spatial data
layers, comparing them position by position, and treating areas of overlap
and of non-overlap in distinct ways. Many GISs support overlays through
an algebraic language, expressing an overlay function as a formula in which
the data layers are the arguments. In this way, we can find:
● The potato fields on clay soils (select the ‘potato’ cover in the crop data
layer and the ‘clay’ cover in the soil data layer and perform an intersection
of the two areas found),
● The fields where potato or maize is the crop (select both areas of ‘potato’
and ‘maize’ cover in the crop data layer and take their union),
● The potato fields not on clay soils (perform a difference operator of areas
with ‘potato’ cover with the areas having clay soil),
● The fields that do not have potato as crop (take the complement of the
potato areas).
3. Neighbourhood functions. Whereas overlays combine features at the same
location, neighbourhood functions evaluate the characteristics of an area
surrounding a feature’s location. A neighbourhood function ‘scans’ the
neighbourhood of the given feature(s), and performs a computation on it.
● Search functions allow the retrieval of features that fall within a given search
window. This window may be a rectangle, circle, or polygon.
● Buffer zone generation (or buffering) is one of the best known
neighbourhood functions. It determines a spatial envelope (buffer) around
(a) given feature(s). The created buffer may have a fixed width, or a
variable width that depends on characteristics of the area.
● Interpolation functions predict unknown values using the known values at
nearby locations. This typically occurs for continuous fields, like elevation,
when the data actually stored does not provide the direct answer for the
location(s) of interest. Interpolation of continuous data was discussed in
Section 6.4.2.
● Topographic functions determine characteristics of an area by looking at the
immediate neighbourhood as well. Typical examples are slope computations
on digital terrain models (i.e. continuous spatial fields). The slope in a
location is defined as the plane tangent to the topography in that location.
Various computations can be performed, such as:
○ determination of slope angle,
○ determination of slope aspect,
○ determination of slope length.
4. Connectivity functions. These functions work on the basis of networks,
including road networks, water courses in coastal zones, and
communication lines in mobile telephony. These networks represent spatial
linkages between features. Main functions of this type include:
● Contiguity functions evaluate a characteristic of a set of connected spatial
units. One can think of the search for a contiguous area of forest of certain
size and shape in a satellite image.
● Network analytic functions are used to compute over connected line
features that make up a network. The network may consist of roads, public
transport routes, high voltage lines or other forms of transportation
infrastructure. Analysis of such networks may entail shortest path
computations (in terms of distance or travel time) between two points in
a network for routing purposes. Other forms are to find all points reachable
within a given distance or duration from a start point for allocation purposes,
or determination of the capacity of the network for transportation between an
indicated source location and sink location.
● Visibility functions also fit in this list as they are used to compute the points
visible from a given location (viewshed modelling or viewshed
mapping) using a digital terrain model.
7.3 RETRIEVAL, CLASSIFICATION AND MEASUREMENT
7.3.1 Measurement
Geometric measurement on spatial features includes counting, distance and
area size computations. For the sake of simplicity, this section discusses such
measurements in a planar spatial reference system. We limit ourselves to
geometric measurements, and do not include attribute data measurement. In
general, measurements on vector data are more advanced,
thus, also more complex, than those on raster data. We discuss each group.
Measurements on vector data. The primitives of vector data sets are point,
(poly)line and polygon. Related geometric measurements are location, length,
distance and area size. Some of these are geometric properties of a feature in
isolation (location, length, area size); others (distance) require two features to
be identified. The location property of a vector feature is always stored by the
GIS: a single coordinate pair for a point, or a list of pairs for a polyline or
polygon boundary. Occasionally, there is a need to obtain the location of the
centroid of a polygon; some GISs store these also, others compute them
‘on-the-fly’. Length is a geometric property associated with polylines, by
themselves, or in their function as polygon boundary. It can obviously be
computed by the GIS, as the sum of lengths of the constituent line
segments, but it quite often is also stored with the polyline.
Area size is associated with polygon features. Again, it can be computed,
but usually is stored with the polygon as an extra attribute value. This speeds
up the computation of other functions that require area size values.
The attentive reader will have noted that all of the above ‘measurements’ do
not actually require computation, but only retrieval of stored data.
Measuring distance between two features is another important function. If both
features are points, say p and q, the computation in a Cartesian spatial
reference system is given by the well-known Pythagorean distance function:
dist(p, q) = √((xp − xq)² + (yp − yq)²).
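The measurements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coordinates are plain (x, y) tuples in a planar reference system, the function names are hypothetical, and the polygon area uses the standard shoelace formula, which the text does not name but which is one common way to compute area from a boundary.

```python
import math

def point_distance(p, q):
    """Pythagorean distance between two points in a planar system."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def polyline_length(vertices):
    """Length of a polyline: the sum of the lengths of its line segments."""
    return sum(point_distance(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_area(ring):
    """Area of a simple polygon from its boundary vertices (shoelace formula).
    The ring is a list; the first vertex need not be repeated at the end."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

print(point_distance((0.0, 0.0), (3.0, 4.0)))            # 5.0
print(polyline_length([(0, 0), (3, 4), (3, 10)]))        # 11.0
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))    # 12.0
```

A GIS that stores length and area as attributes would compute them once in this way and afterwards only retrieve the stored values.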
7.3.2 Spatial selection queries
Interactive spatial selection. In interactive spatial selection, one defines the
selection condition by pointing at or drawing spatial objects on the screen
display, after having indicated the spatial data layer(s) from which to select
features. The interactively defined objects are called the selection objects; they
can be points, lines, or polygons. The GIS then selects the
features in the indicated data layer(s) that overlap (i.e. intersect, meet, contain,
or are contained in) the selection objects.
7.3.3 Classification
Classification is a technique of purposefully removing detail from an input data
set, in the hope of revealing important patterns (of spatial distribution). In the
process, we produce an output data set, so that the input set can be left intact.
We do so by assigning a characteristic value to each element in the input set,
which is usually a collection of spatial features that can be raster cells or
points, lines or polygons. If the number of characteristic values is small in
comparison to the size of the input set, we have classified the input set.
The pattern that we look for may be the distribution of household income in a
city. Household income is called the classification parameter. If we know for
each ward in the city the associated average income, we have many different
values. Subsequently, we could define five different categories (or: classes) of
income: ‘low’, ‘below average’, ‘average’, ‘above
average’ and ‘high’, and provide value ranges for each category. If these five
categories are mapped in a sensible colour scheme, this may reveal interesting
information. The input data set may have itself been the result of a
classification, and in such a case we call it a reclassification. For example, we
may have a soil map that shows different soil type units and we would like to
show the suitability of units for a specific crop. In this case, it is better to
assign to the soil units an attribute of suitability for the crop.
Since different soil types may have the same crop suitability, a classification
may merge soil units of different type into the same category of crop
suitability.
In classification of vector data, there are two possible results. In the first,
the input features may become the output features in a new data layer, with an
additional category assigned. In other words, nothing changes with respect to
the spatial extents of the original features. The accompanying figure is an
illustration of this first type of output.
A second type of output is obtained when adjacent features with the
same category are merged into one bigger feature. Such post-processing
functions are called spatial merging, aggregation or dissolving. An illustration
of this second type is found in the accompanying figure. Observe that this type
of merging is only an option in vector data, as merging cells in an output raster
on the basis of a classification makes little sense. Vector data classification
can be performed on point sets, line sets or polygon sets; the optional merge
phase is sensible only for lines and polygons.
Automatic classification
User-controlled classifications require a classification table or user interaction.
GIS software can also perform automatic classification, in which a user only
specifies the number of classes in the output data set. The system automatically
determines the class break points. Two main techniques of determining break
points are in use.
1. Equal interval technique: The minimum and maximum values vmin and
vmax of the classification parameter are determined and the (constant) interval
size for each category is calculated as (vmax − vmin)/n, where n is the number
of classes chosen by the user. This classification is useful in revealing the
distribution patterns as it determines the number of features in each category.
2. Equal frequency technique: This technique is also known as quantile
classification. The objective is to create categories with roughly equal numbers
of features per category. The total number of features is determined first and,
dividing by the required number of categories, the number of features per
category is calculated. The class break points are then determined by counting
off the features in order of classification parameter value.
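The two techniques can be compared on a small data set. The sketch below is a minimal illustration, not the method of any particular GIS package: the function names and the sample values are hypothetical, break points are taken to be upper class limits, and the rounding used when counting off features is one of several reasonable choices.

```python
def equal_interval_breaks(values, n):
    """Upper class limits for n classes of constant width (vmax - vmin) / n."""
    vmin, vmax = min(values), max(values)
    width = (vmax - vmin) / n
    return [vmin + width * k for k in range(1, n + 1)]

def equal_frequency_breaks(values, n):
    """Upper class limits found by counting off the sorted values, so that
    each class holds roughly len(values) / n features."""
    ordered = sorted(values)
    total = len(ordered)
    breaks = []
    for k in range(1, n + 1):
        # Index of the last feature counted into class k (at least the first).
        last_index = max(round(k * total / n), 1) - 1
        breaks.append(ordered[last_index])
    return breaks

def classify(value, breaks):
    """0-based index of the first class whose upper limit is >= value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guards against rounding in the last break point

incomes = [0, 10, 20, 30, 40, 50, 60, 100]  # hypothetical ward averages
for name, breaks in (("equal interval", equal_interval_breaks(incomes, 4)),
                     ("equal frequency", equal_frequency_breaks(incomes, 4))):
    counts = [0] * 4
    for v in incomes:
        counts[classify(v, breaks)] += 1
    print(name, breaks, counts)
```

With these values the equal interval classes hold 3, 3, 1 and 1 features, while the equal frequency classes hold 2 features each, which shows how the first technique exposes the distribution and the second evens it out.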
7.4 OVERLAY FUNCTIONS
In the previous section, we saw various techniques of measuring and selecting
spatial data. We also discussed the generation of a new spatial data layer from
an old one, using classification. In this section, we look at techniques of
combining two spatial data layers and producing a third from them. The binary
operators that we discuss are known as spatial overlay operators. We will
firstly discuss vector overlay operators, and then focus on the raster case.
Standard overlay operators take two input data layers, and assume they are
georeferenced in the same system, and overlap in study area. If either of these
requirements is not met, the use of an overlay operator is senseless. The
principle of spatial overlay is to compare the characteristics of the same
location in both data layers, and to produce a result for each location in the
output data layer. The specific result to produce is determined by the user. It
might involve a calculation, or some other logical function to be applied to
every area or location.
In raster data, as we shall see, these comparisons are carried out between pairs
of cells, one from each input raster. In vector data, the same principle of
comparing locations applies, but the underlying computations rely on
determining the spatial intersections of features from each input layer.
7.4.1 Vector overlay operators
In the vector domain, overlay is computationally more demanding than in the
raster domain. Here we will only discuss overlays from polygon data layers,
but we note that most of the ideas also apply to overlay operations with point
or line data layers.
The standard overlay operator for two layers of polygons is the polygon
intersection operator. It is fundamental, as many other overlay operators
proposed in the literature or implemented in systems can be defined in terms of
it. The result of this operator is the collection of all possible polygon
intersections; the attribute table result is a join, in the relational database
sense.
This output attribute table only contains one tuple for each intersection polygon
found, and this explains why we call this operator a spatial join. A more
practical example is provided in Figure, which was produced by polygon
intersection of the ward polygons with land use polygons classified as in
Figure. This has allowed us to select the residential areas in Ilala District.
Two more polygon overlay operators are illustrated in Figure. The first is known as
the polygon clipping operator. It takes a polygon data layer and restricts its
spatial extent to the generalized outer boundary obtained from all (selected)
polygons in a second input layer. Besides this generalized
outer boundary, no other polygon boundaries from the second layer play a role
in the result.
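The ‘spatial join’ character of the polygon intersection operator can be shown with a deliberately simplified sketch. To keep the geometry trivial we assume that every polygon is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax); real polygon intersection is considerably more involved. The layer contents, attribute names and function names are hypothetical.

```python
def intersect_rectangles(a, b):
    """Intersection of two axis-aligned rectangles (xmin, ymin, xmax, ymax),
    or None when they share no area."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin < xmax and ymin < ymax:
        return (xmin, ymin, xmax, ymax)
    return None

def polygon_intersection_overlay(layer_a, layer_b):
    """Spatial join: one output tuple per non-empty intersection, carrying
    the attributes of both input features."""
    result = []
    for geom_a, attrs_a in layer_a:
        for geom_b, attrs_b in layer_b:
            geom = intersect_rectangles(geom_a, geom_b)
            if geom is not None:
                result.append((geom, {**attrs_a, **attrs_b}))
    return result

wards = [((0, 0, 4, 4), {"ward": "A"}), ((4, 0, 8, 4), {"ward": "B"})]
land_use = [((2, 0, 6, 2), {"use": "residential"}),
            ((0, 2, 8, 4), {"use": "commercial"})]

for geom, attrs in polygon_intersection_overlay(wards, land_use):
    print(geom, attrs)
```

Each of the four output features carries both a ward and a land use attribute, so selecting the residential areas per ward becomes a simple attribute query on the result.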
7.4.2 Raster overlay operators
Vector overlay operators are useful, but geometrically complicated, and this
sometimes results in poor operator performance. Raster overlays do not suffer
from this disadvantage, as most of them perform their computations cell by
cell, and thus they are fast. GISs that support raster processing (as most do)
usually have a language to express operations on rasters. These languages are
generally referred to as map algebra, or sometimes raster calculus. They allow
a GIS to compute new rasters from existing ones, using a range of functions
and operators. Unfortunately, not all implementations of map algebra offer the
same functionality. The discussion below is to a large extent based on general
terminology, and attempts to illustrate the key operations using a logical,
structured language. Again, the syntax often differs for different GIS software
packages. When producing a new raster we must provide a name for it, and
define how it is computed. This is done in an assignment statement of the
following format:
Output raster name := Map algebra expression
Arithmetic operators
Various arithmetic operators are supported. The standard ones are
multiplication (×), division (/), subtraction (−) and addition (+). Obviously,
these arithmetic operators should only be used on appropriate data values, and
for instance, not on classification values. Other arithmetic operators may
include modulo division (MOD) and integer division (DIV).
Modulo division returns the remainder of division: for instance, 10 MOD 3
will return 1 as 10 − 3 × 3 = 1. Similarly, 10 DIV 3 will return 3. More
operators are goniometric: sine (sin), cosine (cos), tangent (tan), and their
inverse functions asin, acos, and atan, which return radian angles as real
values.
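The cell-by-cell nature of these operators can be sketched as follows. This is not the syntax of any specific map algebra implementation: rasters are assumed to be plain lists of rows of equal shape, and map_algebra is a hypothetical helper. Python's % and // operators correspond to MOD and DIV for the non-negative integers used here.

```python
import math

def map_algebra(function, *rasters):
    """Apply `function` cell by cell to rasters of identical shape and
    return the result as a new raster (a list of rows)."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    return [[function(*(r[i][j] for r in rasters)) for j in range(cols)]
            for i in range(rows)]

raster_a = [[10, 20], [30, 40]]
raster_b = [[3, 3], [7, 6]]

difference = map_algebra(lambda a, b: a - b, raster_a, raster_b)
remainder = map_algebra(lambda a, b: a % b, raster_a, raster_b)    # MOD
quotient = map_algebra(lambda a, b: a // b, raster_a, raster_b)    # DIV
sine = map_algebra(math.sin, [[0.0, math.pi / 2]])                 # radians in

print(difference)  # [[7, 17], [23, 34]]
print(remainder)   # [[1, 2], [2, 4]]
print(quotient)    # [[3, 6], [4, 6]]
print(sine)        # [[0.0, 1.0]]
```

In the notation of the text, the first result would be written as an assignment such as Difference := RasterA − RasterB.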
Comparison and logical operators
7.4.3 Overlays using a decision table
Conditional expressions are powerful tools in cases where multiple criteria
must be taken into account. A small example may illustrate this. Consider
a suitability study in which a land use classification and a geological
classification must be used. The respective rasters are illustrated in Figure on
the left. Domain expertise dictates that some combinations of land use and
geology result in suitable areas, whereas other combinations do not. In our
example, forests on alluvial terrain and grassland on shale are considered
suitable combinations, while the others are not.
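The suitability example can be written down as a lookup in a decision table. The sketch below assumes small hypothetical rasters (including a ‘lake’ class that is not in the text) stored as lists of rows; only the two suitable combinations come from the example above.

```python
SUITABLE, UNSUITABLE = "suitable", "unsuitable"

# Combinations named in the text; every other combination is unsuitable.
DECISION_TABLE = {
    ("forest", "alluvial"): SUITABLE,
    ("grassland", "shale"): SUITABLE,
}

def overlay_with_decision_table(land_use, geology, table, default=UNSUITABLE):
    """Compare the two input rasters cell by cell and look up the outcome
    for each (land use, geology) pair in the decision table."""
    return [[table.get((lu, geo), default) for lu, geo in zip(lu_row, geo_row)]
            for lu_row, geo_row in zip(land_use, geology)]

land_use = [["forest", "forest", "grassland"],
            ["lake", "grassland", "grassland"]]
geology = [["alluvial", "shale", "shale"],
           ["alluvial", "alluvial", "shale"]]

for row in overlay_with_decision_table(land_use, geology, DECISION_TABLE):
    print(row)
```

Adding a criterion then only means adding rows to the table, rather than nesting further conditional expressions.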
7.5 NEIGHBOURHOOD FUNCTIONS
In our section on overlay operators, the guiding principle was to compare or
combine the characteristic value of a location from two data layers, and to do
so for all locations. This is what map algebra, for instance, gave us: cell by cell
calculations, with the results stored in a new raster. There is another guiding
principle in spatial analysis that can be equally useful. The principle here is to
find out the characteristics of the vicinity, here called neighbourhood, of a
location. After all, many suitability questions, for instance, depend not only on
what is at the location, but also on what is near the location.
Thus, the GIS must allow us ‘to look around locally’. To perform
neighbourhood analysis, we must:
1. State which target locations are of interest to us, and define their spatial
extent,
2. Define how to determine the neighbourhood for each target,
3. Define which characteristic(s) must be computed for each neighbourhood.
For instance, our target might be a medical clinic. Its neighbourhood could be
defined as:
An area within 2 km distance as the crow flies, or
An area within 2 km travel distance, or
All roads within 500 m travel distance, or
All other clinics within 10 minutes travel time, or
All residential areas, for which the clinic is the closest clinic.
The alert reader will note the increasingly complex definitions of ‘neighbourhood’
used here. This is to illustrate that different ways of measuring
neighbourhoods exist, and some are better (or more representative of real
neighbourhoods) than others, depending on the purpose of the analysis.
Then, in the third step we indicate what it is we want to discover about the
phenomena that exist or occur in the neighbourhood. This might simply be
its spatial extent, but it might also be statistical information like:
The total population of the area,
Average household income, or
The distribution of high-risk industries located in the neighbourhood.
7.5.1 Proximity computations
In proximity computations, we use geometric distance to define the
neighbourhood of one or more target locations. The most common and useful
technique is buffer zone generation. Another technique based on geometric
distance that we discuss is Thiessen polygon generation.
Buffer zone generation
The principle of buffer zone generation is simple: we select one or more target
locations, and then determine the area around them, within a certain distance.
In Figure a number of main and minor roads were selected as targets, and a 75
m (resp., 25 m) buffer was computed from them. In some case studies, zonated
buffers must be determined, for instance in assessments of traffic noise effects.
Most GISs support this type of zonated buffer computation. An illustration is
provided in Figure.
In vector-based buffer generation, the buffers themselves become polygon
features, usually in a separate data layer, that can be used in further spatial
analysis.
Buffer generation on rasters is a fairly simple function. The target
location or locations are always represented by a selection of the raster’s cells,
and geometric distance is defined, using cell resolution as the unit. The
distance function applied is the Pythagorean distance between the cell centres.
The distance from a non-target cell to the target is the minimal distance one
can find between that non-target cell and any target cell.
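Raster buffer generation as just described can be sketched directly. The code is a brute-force illustration that assumes a small raster, target cells given as (row, column) pairs, and hypothetical function names; production software uses much faster distance transforms.

```python
import math

def distance_to_targets(rows, cols, targets, cell_size=1.0):
    """For each cell, the minimal Pythagorean distance between its centre
    and the centre of any target cell, expressed in map units."""
    return [[min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
             for c in range(cols)] for r in range(rows)]

def buffer_raster(rows, cols, targets, width, cell_size=1.0):
    """Boolean raster: True for cells within `width` of a target cell."""
    distance = distance_to_targets(rows, cols, targets, cell_size)
    return [[d <= width for d in row] for row in distance]

# A 3 x 3 raster with 10 m cells, one target cell in the top-left corner,
# and a 15 m buffer.
for row in buffer_raster(3, 3, [(0, 0)], width=15.0, cell_size=10.0):
    print(row)
```

A zonated buffer would follow from the same distance raster by classifying the distances into several ranges instead of applying a single threshold.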
Thiessen polygon generation
Thiessen polygon partitions make use of geometric distance for determining
neighbourhoods. This is useful if we have a spatially distributed set of points
as target locations, and we want to know for each location in the study area to
which target it is closest. This technique will generate a polygon around each
target location that identifies all those locations that ‘belong to’ that target. We
have already seen the use of Thiessen polygons in the context of interpolation
of point data, as discussed in an earlier section. Given an input point set that
will be the polygons’ midpoints, it is not difficult to construct such a partition.
It is even easier to construct if we already have a Delaunay triangulation for the
same input point set.
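The ‘closest target’ idea behind a Thiessen partition can be approximated on a raster by assigning to every cell the identifier of its nearest target. This is a sketch under assumptions: targets are hypothetical clinics given as (identifier, row, column), distances are measured between cell centres, and the result is a raster approximation rather than true vector polygons.

```python
import math

def thiessen_allocation(rows, cols, targets):
    """Raster approximation of a Thiessen partition: each cell stores the
    identifier of the closest target (ties go to the first one listed)."""
    def closest(r, c):
        return min(targets, key=lambda t: math.hypot(r - t[1], c - t[2]))[0]
    return [[closest(r, c) for c in range(cols)] for r in range(rows)]

clinics = [("A", 0, 0), ("B", 0, 4), ("C", 3, 2)]
for row in thiessen_allocation(4, 5, clinics):
    print(" ".join(row))
```

Cells that carry the same identifier together form the raster counterpart of one Thiessen polygon.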
7.5.2 Computation of diffusion
The determination of neighbourhood of one or more target locations may
depend not only on distance (cases which we discussed above) but also on
direction and differences in the terrain in different directions. This typically is
the case when the target location contains a ‘source material’ that spreads over
time, referred to as diffusion. This ‘source material’ may be air, water or soil
pollution, commuters exiting a train station, people from an opened-up refugee
camp, a water spring uphill, or the radio waves emitted from a radio relay
station. In all these cases, one will not expect the spread to occur evenly in all
directions. There will be local terrain factors that influence the spread, making
it easier or more difficult. Many GISs provide support for this type of
computation, and we discuss some of its principles here, in the context of
raster data. Diffusion computation involves one or more target locations, which
are better called source locations in this context. They are the locations of the
source of whatever spreads.
The computation also involves a local resistance raster, which for each cell
provides a value that indicates how difficult it is for the ‘source material’ to
pass by that cell. The value in the cell must be normalized: i.e. valid
for a standardized length (usually the cell’s width) of spread path.
From the source location(s) and the local resistance raster, the GIS will be able
to compute a new raster that indicates how much minimal total resistance the
spread has witnessed for reaching a raster cell. This process is illustrated in
Figure. While computing total resistances, the GIS takes proper care of the
path lengths. Obviously, the diffusion from a cell csrc to its neighbour cell to
the east is shorter than to the cell that is its northeast neighbour. The distance
ratio between these two cases is 1 : √2. Here val(c) denotes the local resistance
value of cell c.
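A minimal total resistance raster can be sketched with a shortest-path search over the cells. The step cost used here is an assumption, since the text does not spell it out: a step between two neighbouring cells costs the average of their val(c) values, multiplied by 1 for N, S, E, W steps and by √2 for diagonal steps. The search is Dijkstra's algorithm, and the function name is hypothetical.

```python
import heapq
import math

def minimal_total_resistance(resistance, sources):
    """For each cell, the minimal accumulated resistance needed to reach it
    from any source cell, moving between the eight neighbours."""
    rows, cols = len(resistance), len(resistance[0])
    total = [[math.inf] * cols for _ in range(rows)]
    queue = []
    for r, c in sources:
        total[r][c] = 0.0
        heapq.heappush(queue, (0.0, r, c))
    while queue:
        cost, r, c = heapq.heappop(queue)
        if cost > total[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                step = math.sqrt(2) if dr and dc else 1.0
                # Assumed cost: mean resistance of both cells times path length.
                new_cost = cost + step * (resistance[r][c] + resistance[nr][nc]) / 2
                if new_cost < total[nr][nc]:
                    total[nr][nc] = new_cost
                    heapq.heappush(queue, (new_cost, nr, nc))
    return total

resistance = [[1, 1, 1],
              [1, 9, 1],
              [1, 1, 1]]
for row in minimal_total_resistance(resistance, [(0, 0)]):
    print([round(v, 2) for v in row])
```

In this small example the spread reaches the far corner more cheaply by going around the high-resistance centre cell than by passing through it.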
7.5.3 Flow computation
Diffusion computations determine how a phenomenon spreads over the area, in
principle in all directions, though with varying difficulty or resistance. There
are also cases where a phenomenon does not spread in all directions, but
moves or ‘flows’ along a given, least-cost path, determined again by local
terrain characteristics. The typical case arises when we want to determine the
drainage patterns in a catchment: the rainfall water ‘chooses’ a way to leave
the area. This principle is illustrated with a simple elevation raster, in Figure.
For each cell in that raster, the steepest downward slope to a neighbour cell is
computed, and its direction is stored in a new raster. This computation
determines the elevation difference between the cell and a neighbour cell, and
takes into account cell distance: 1 for neighbour cells in N–S or W–E
direction, √2 for cells in NE–SW or NW–SE direction.
Among its eight neighbour cells, it picks the one
with the steepest path to it. The directions in raster (b), thus obtained, are
encoded in integer values, and we have ‘decoded’ them for the sake of
illustration. Raster (b) can be called the flow direction raster. From raster (b),
the GIS can compute the accumulated flow count raster, a raster that for each
cell indicates how many cells have their water flow into the cell.
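Both rasters can be derived with a short sketch. It assumes a small hypothetical elevation raster stored as a list of rows, encodes a flow direction as a (row, column) offset rather than an integer code, and leaves cells without a lower neighbour (pits and outlets) without a direction.

```python
import math

NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def flow_direction(elevation):
    """For each cell, the offset of the neighbour reached by the steepest
    downward slope, or None when no neighbour is lower."""
    rows, cols = len(elevation), len(elevation[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            steepest = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distance = math.sqrt(2) if dr and dc else 1.0
                    slope = (elevation[r][c] - elevation[nr][nc]) / distance
                    if slope > steepest:
                        steepest, direction[r][c] = slope, (dr, dc)
    return direction

def flow_accumulation(elevation, direction):
    """For each cell, the number of cells whose water flows into it."""
    rows, cols = len(elevation), len(elevation[0])
    count = [[0] * cols for _ in range(rows)]
    cells = sorted(((elevation[r][c], r, c)
                    for r in range(rows) for c in range(cols)), reverse=True)
    for _, r, c in cells:  # highest first, so upstream counts are complete
        if direction[r][c] is not None:
            dr, dc = direction[r][c]
            count[r + dr][c + dc] += count[r][c] + 1
    return count

elevation = [[9, 8, 7],
             [8, 6, 4],
             [7, 4, 1]]
directions = flow_direction(elevation)
for row in flow_accumulation(elevation, directions):
    print(row)
```

In this example all water ends up in the lowest, bottom-right cell, whose accumulated flow count equals the number of other cells in the raster.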
7.5.4 Raster based surface analysis
Continuous fields have a number of characteristics not shared by discrete
fields. Since the field changes continuously, we can talk about slope angle,
slope aspect and concavity/convexity of the slope. These notions are not
applicable to discrete fields.
The discussions in this section use terrain elevation as the prototypical example
of a continuous field, but all issues discussed are equally applicable to other
types of continuous fields. Nonetheless, we regularly refer to the continuous
field representation as a DEM, to conform with the most common situation.
Throughout the section we will assume that the DEM is represented as a raster.
Applications
There are numerous examples where more advanced
computations on continuous field representations are needed. A short list is
provided below.
Slope angle calculation. The calculation of the slope steepness, expressed as
an angle in degrees or percentages, for any or all locations.
Slope aspect calculation. The calculation of the aspect (or orientation) of the
slope in degrees (between 0 and 360 degrees), for any or all locations.
Slope convexity/concavity calculation. Slope convexity, defined as the
change of the slope (negative when the slope is concave and positive when
the slope is convex), can be derived as the second derivative of the field.
Slope length calculation. With the use of neighbourhood operations, it is
possible to calculate for each cell the nearest distance to a watershed
boundary (the upslope length) and to the nearest stream (the downslope
length). This information is useful for hydrological modelling.
Hillshading is used to portray relief difference and terrain morphology in
hilly and mountainous areas. The application of a special filter to a DEM
produces hillshading. Filters are discussed in Section 6.4.4. The colour tones
in a hillshading raster represent the amount of reflected light in each
location, depending on its orientation relative to the illumination source.
This illumination source is usually chosen at an angle of 45° above the
horizon in the north-west.
Three-dimensional map display. With GIS software, three-dimensional
views of a DEM can be constructed, in which the location of the viewer, the
angle under which s/he is looking, the zoom angle, and the amplification
factor of relief exaggeration can be specified. Three-dimensional views can
be constructed using only a predefined mesh, covering the surface, or using
other rasters (e.g. a hillshading raster) or images (e.g. satellite images)
which are draped over the DEM.
Determination of change in elevation through time. The cut-and-fill volume
of soil to be removed or to be brought in to make a site ready for
construction can be computed by overlaying the DEM of the site before the
work begins with the DEM of the expected modified topography. It is also
possible to determine landslide effects by comparing DEMs of before and
after the landslide event.
Automatic catchment delineation. Catchment boundaries or drainage lines
can be automatically generated from a good quality DEM with the use of
neighbourhood functions. The system will determine the lowest point in the
DEM, which is considered the outlet of the catchment. From there, it will
repeatedly search the neighbouring pixels with the highest altitude. This
process is continued until the highest location (i.e. cell with highest value) is
found, and the path followed determines the catchment boundary. For
delineating the drainage network, the process is reversed. Now, the system
will work from the watershed downwards, each time looking for the lowest
neighbouring cells, which determines the direction of water flow.
Dynamic modelling. Apart from the applications mentioned above, DEMs
are increasingly used in GIS-based dynamic modelling, such as the
computation of surface run-off and erosion, groundwater flow, the
delineation of areas affected by pollution, the computation of areas that will
be covered by processes such as debris flows and lava flows.
Visibility analysis. A viewshed is the area that can be ‘seen’ (i.e. is in the
direct line-of-sight) from a specified target location. Visibility analysis
determines the area visible from a scenic lookout, the area that can be
reached by a radar antenna, or assesses how effectively a road or quarry will
be hidden from view.
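As an illustration of the first two applications in the list above, slope angle and slope aspect can be estimated from a raster DEM with finite differences. The sketch rests on several assumptions that the text does not fix: central differences over the four direct neighbours, row 0 as the northern edge of the raster, and aspect reported as the compass direction of steepest descent, in degrees clockwise from north. The function name and the sample DEM are hypothetical.

```python
import math

def slope_and_aspect(dem, row, col, cell_size):
    """Slope angle (degrees) and aspect (degrees clockwise from north) at an
    interior cell of a raster DEM, estimated with central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)  # eastward
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size)  # northward
    gradient = math.hypot(dz_dx, dz_dy)
    slope = math.degrees(math.atan(gradient))
    if gradient == 0:
        return slope, None  # flat cell: aspect is undefined
    # The downslope direction is opposite to the gradient.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    return slope, aspect

# A plane that rises by 10 m per 10 m cell towards the east.
dem = [[0, 10, 20],
       [0, 10, 20],
       [0, 10, 20]]
print(slope_and_aspect(dem, 1, 1, cell_size=10.0))
```

For this plane the slope angle is 45 degrees and the slope faces west, that is, an aspect of 270 degrees. A slope percentage follows from the same gradient as 100 times its value.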
7.6 SUMMARY
Spatial analysis allows you to solve complex location-oriented problems and
better understand where and what is occurring in your world. It goes beyond
mere mapping to let you study the characteristics of places and the
relationships between them. Spatial analysis lends new perspectives to your
decision-making.
Spatial analysis is the most intriguing and remarkable aspect of GIS. Using
spatial analysis, you can combine information from many independent sources
and derive new sets of information (results) by applying a sophisticated set of
spatial operators. This comprehensive collection of spatial analysis tools
extends your ability to answer complex spatial questions.
Statistical analysis can determine if the patterns that you see are significant.
You can analyze various layers to calculate the suitability of a place for a
particular activity. And by employing image analysis, you can detect change
over time. These tools and many others, which are part of ArcGIS, enable you
to address critically important questions and decisions that are beyond the
scope of simple visual analysis. Here are some of the foundational spatial
analyses and examples of how they are applied in the real world.
7.7 REFERENCES
1. www.esri.com
2. www.nationalgeographic.org
3. Principles of Geographic Information Systems (reference book)
7.8 UNIT END QUESTIONS
1. What are overlay functions?
2. Explain neighbourhood functions.
3. Explain spatial queries.
4. Explain the difference between raster and vector overlay.
8
GIS APPLICATION MODELS
Unit Structure :
8.0 Objectives
8.1 Introduction
8.2 Applications
8.2.1 Purpose of the model
8.2.2 Methodology
8.2.3 Scale
8.2.4 Dimensions
8.2.5 Implementation logic
8.3 Error propagation in spatial data processing
8.4 Some common causes of error propagation
8.5 Quantifying error propagation
8.6 Summary
8.7 References
8.8 Unit End Questions
8.0 OBJECTIVES
After going through this chapter, you will be able to:
1. Describe different applications of GIS models
2. Analyse errors and their propagation
8.1 INTRODUCTION
Geographic Information Systems is a vast field in Information Technology
and, like any other booming technology, has various applications in
multiple domains. GIS is used to create awareness and to share knowledge
regarding the environment, natural resources, potential disasters and risks and
planned urban routes. Organizations like ESRI, Here Maps, and Leidos group
are working on various models concerning natural resources, advanced driving
systems, and even national defense systems. Applications of GIS allow
people and organizations to do geological observations and analyze the spatial
data in a granular format.
We have discussed the notion that real world processes are often highly
complex. Models are simplified abstractions of reality representing or
describing its most important elements and their interactions.
Modelling and GIS are more or less inseparable, as GIS is itself a tool for
modelling ‘the real world’ (or at least some part of it). The solution to a
(spatial) problem usually depends on a (large) number of parameters. Since
these parameters are often interrelated, their interaction is made more precise
in an application model.
8.2 APPLICATIONS
Representing the “real world” in a data model has been a challenge for GIS
since their inception in the 1960s. A GIS data model enables a computer to
represent real geographical elements as graphical elements. Two
representational models are dominant: raster (grid-based) and vector
(line-based):
Raster. Based on a cellular organization that divides space into a series of
units. Each unit is generally similar in size to another. Grid cells are the
most common raster representation. Features are divided into cellular arrays
and a coordinate (X,Y) is assigned to each cell, as well as a value. This
allows for registration with a geographic reference system. A raster
representation also relies on tessellation: geometric shapes that can
completely cover an area. Although many shapes are possible (e.g. triangles
and hexagons), the square is the most commonly used. Resolution is an
important concern in raster representations. For a small grid (few, large
cells), the resolution is coarse but the required storage space is limited. For a
large grid (many, small cells) the resolution is fine, but at the expense of a
much larger storage space. In the figure above, the real world (shown as an
aerial photograph) is simplified as a grid where the color of each cell relates
to an entity such as road, highway or river.
Vector. The concept assumes that space is continuous, rather than discrete,
which gives an infinite (in theory) set of coordinates. A vector
representation is composed of three main elements: points, lines and
polygons. Points are spatial objects with no area but can have attached
attributes since they are a single set of coordinates (X and Y) in a coordinate
space. Lines are spatial objects made up of connected points (nodes) that
have no width. Polygons are closed areas that can be made up of a circuit of
line segments. In the figure above, the real world is represented by a series
of lines (roads and highway) and one polygon (the river). A real-world
entity could be represented by different types of vector features depending
on the map scale used in an application (e.g. a road can be represented as a
line at a smaller scale or as a polygon at a larger scale).
Application models include any kind of GIS-based model (including so-called
analytical and process models) for a specific real-world application.
Such a model, in one way or other, describes as faithfully as possible how the
relevant geographic phenomena behave, and it does so in terms of the
parameters.
The nature of application models varies enormously. GIS applications for
famine relief programs, for instance, are very different from earthquake risk
assessment applications, though both can make use of GIS to derive a solution.
Many kinds of application models exist, and they can be classified in many
different ways.
Here we identify five characteristics of GIS-based application models:
1. The purpose of the model,
2. The methodology underlying the model,
3. The scale at which the model works,
4. Its dimensionality, i.e. whether the model includes spatial, temporal or
spatial and temporal dimensions, and
5. Its implementation logic, i.e. the extent to which the model uses existing
knowledge about the implementation context.
It is important to note that the categories above are merely different
characteristics of any given application model.
8.2.1 Purpose of the model
It refers to whether the model is descriptive, prescriptive or predictive in
nature. Descriptive models attempt to answer the “what is” question.
Prescriptive models usually answer the “what should be” question by
determining the best solution from a given set of conditions. Models for
planning and site selection are usually prescriptive, in that they quantify
environmental, economic and social factors to determine ‘best’ or optimal
locations. Predictive models focus upon the “what is likely to be” questions,
and predict outcomes based upon a set of input conditions. Examples of
predictive models include forecasting models, such as those attempting to
predict landslides or sea-level rise.
8.2.2 Methodology
It refers to the operational components of the model. Stochastic models use
statistical or probability functions to represent random or semi-random
behaviour of phenomena. In contrast, deterministic models are based upon a
well-defined cause-and-effect relationship.
Examples of deterministic models include hydrological flow and pollution
models, where the ‘effect’ can often be described by numerical methods and
differential equations.
1. Rule-based models attempt to model processes by using local (spatial) rules.
Cellular Automata (CA) are examples of models in this category. These are
often used to understand systems which are generally not well understood, but
whose local processes are well known. For example, the characteristics of
neighbourhood cells (such as wind direction and vegetation type) in a
raster-based CA model might be used to model the direction of spread of a fire
over several time steps.
2. Agent-based models (ABM) attempt to model the movement and development of
multiple interacting agents (which might represent individuals), often using
sets of decision rules about what the agent can and cannot do. Complex
agent-based models, which also incorporate stochastic components, have been
developed to understand aspects of travel behaviour and crowd interactions.
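The fire-spread idea mentioned under rule-based models can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
three cell states, the rule that the downwind neighbour always ignites, and the
ignition probability p_spread are all invented for the example and are not
taken from any particular fire model. The random ignition of the other
neighbours is the stochastic component; setting p_spread to 0 makes the model
fully deterministic.

```python
import random

def step(grid, wind=(0, 1), p_spread=0.6, rng=random):
    """Advance a toy fire-spread cellular automaton by one time step.

    Cell states (assumed for this sketch):
      0 = unburnt vegetation, 1 = burning, 2 = burnt out.
    wind is a (row, col) offset. The downwind neighbour of a burning cell
    always ignites; the other 4-connected neighbours ignite with
    probability p_spread.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]          # copy, so rules read the old state
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            new[r][c] = 2                   # a burning cell burns out
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == 0:
                    if (dr, dc) == wind or rng.random() < p_spread:
                        new[nr][nc] = 1
    return new

# Usage: a single row of vegetation with the fire starting at the left edge
# and the wind blowing to the right (towards increasing column index).
grid = [[1, 0, 0, 0]]
for _ in range(3):
    grid = step(grid, wind=(0, 1), p_spread=0.0)
```

Each call applies the same local rule to every cell, which is what makes the
model rule-based; repeated calls give the sequence of time steps.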
8.2.3 Scale
It refers to whether the components of the model are individual or aggregate in
nature. Essentially this refers to the ‘level’ at which the model operates.
Individual-based models are based on individual entities, such as the
agent-based models described above, whereas aggregate models deal with
‘grouped’ data, such as population census data. Aggregate models may operate on
data at the level of a city block (for example, using population census data
for particular social groups), at the regional, or even at a global scale.
8.2.4 Dimensions
It is the term chosen to refer to whether a model is static or dynamic, and
spatial or aspatial. Some models are explicitly spatial, meaning they operate
in some geographically defined space. Some models are aspatial, meaning they
have no direct spatial reference. Models can also be static, meaning they do
not incorporate a notion of time or change. In dynamic models, time is an
essential parameter. Dynamic models include various types of models referred to
as process models or simulations. These types of models aim to generate future
scenarios from existing scenarios, and might include deterministic or
stochastic components, or some kind of local rule (for example, to drive a
simulation of urban growth and spread). The fire spread example given above is
a good example of an explicitly spatial, dynamic model which might incorporate
both local rules and stochastic components.
8.2.5 Implementation logic
It refers to how the model uses existing theory or knowledge to create new
knowledge. Deductive approaches use knowledge of the overall situation in
order to predict outcome conditions. This includes models that have some kind
of formalized set of criteria, often with known weightings for the inputs, and
existing algorithms are used to derive outcomes. Inductive approaches, on the
other hand, are less straightforward, in that they try to generalize (often
based upon samples of a specific data set) in order to derive more general
models. While an inductive approach is useful if we do not know the general
conditions or rules which apply in a given domain, it is typically a
trial-and-error approach which requires empirical testing to determine the
parameters of each input variable.
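A deductive model with "a formalized set of criteria with known weightings" can
be illustrated with a weighted linear combination of raster layers. The sketch
below is an assumption-laden example rather than a prescribed method: the layer
names (slope, rainfall), the scores and the weights are all hypothetical, and
each layer is assumed to be already rescaled to scores between 0 and 1.

```python
def weighted_suitability(layers, weights):
    """Combine criterion rasters cell by cell into one score raster.

    layers:  dict mapping a criterion name to a 2D list of scores in [0, 1]
    weights: dict mapping the same names to weights that sum to 1
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    return [
        [sum(weights[n] * layers[n][r][c] for n in names) for c in range(cols)]
        for r in range(rows)
    ]

# Usage with two hypothetical criteria on a 1 x 2 raster.
layers = {
    "slope":    [[0.2, 1.0]],
    "rainfall": [[0.5, 1.0]],
}
weights = {"slope": 0.6, "rainfall": 0.4}
score = weighted_suitability(layers, weights)
```

In an inductive approach the weights would not be known in advance; they would
have to be estimated from sample data and tested empirically.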
Most GIS only come equipped with a limited range of tools for modelling. For
complex models, or functions which are not natively supported in our GIS,
external software environments are frequently used. In some cases, GIS and
models can be fully integrated (known as embedded coupling) or linked through
data and interface (known as tight coupling). If neither of these is possible,
the external model might be run independently of our GIS, and the output
exported from our model into the GIS for further analysis and visualization.
This is known as loose coupling. It is important to compare our model results
with previous experiments and to examine the possible causes of inconsistency
between the output of our models and the expected results.
8.3 ERROR PROPAGATION IN SPATIAL DATA
PROCESSING
There are a number of sources of error that may be present in source data. It
is important to note that the acquisition of base data to a high standard of
quality still does not guarantee that the results of further, complex
processing can be treated with certainty. As the number of processing steps
increases, it becomes difficult to predict the behaviour of error propagation.
These various errors may affect the outcome of spatial data manipulations. In
addition, further errors may be introduced during the various processing steps
discussed earlier in this chapter, as illustrated in Figure.
One of the most commonly applied operations in geographic information systems
is analysis by overlaying two or more spatial data layers. As discussed above,
each such layer will contain errors, due to both inherent inaccuracies in the
source data and errors arising from some form of computer processing, for
example, rasterization.
During the process of spatial overlay, all the errors in the individual data
layers contribute to the final error of the output. The amount of error in the
output depends on the type of overlay operation applied. For example, errors in
the results of overlay using the logical operator AND are not the same as those
created using the OR operator.
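The difference between AND and OR overlays can be made concrete with a small
simulation. The sketch below rests on simplifying assumptions that are not part
of the text: two independent Boolean layers, each cell "true" with probability
0.2, and each cell independently misclassified with probability 0.1. Under
exactly these assumptions the expected share of wrong output cells is about 4%
for AND and about 15% for OR; with other prevalences or error rates the
ordering can change.

```python
import random

def overlay_error_rates(n=100_000, p_true=0.2, err=0.1, seed=42):
    """Estimate the share of wrong cells after AND and OR overlay.

    n:      number of simulated cells
    p_true: probability that a cell is truly 'true' in each layer
    err:    probability that a cell value is recorded wrongly in a layer
    """
    rng = random.Random(seed)
    wrong_and = 0
    wrong_or = 0
    for _ in range(n):
        a = rng.random() < p_true           # true value in layer A
        b = rng.random() < p_true           # true value in layer B
        a_obs = a ^ (rng.random() < err)    # recorded value, possibly flipped
        b_obs = b ^ (rng.random() < err)
        wrong_and += (a_obs and b_obs) != (a and b)
        wrong_or += (a_obs or b_obs) != (a or b)
    return wrong_and / n, wrong_or / n

and_err, or_err = overlay_error_rates()
```

The same input errors therefore lead to different output errors depending on
the logical operator, which is the point made in the paragraph above.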
The table lists common sources of error introduced into GIS analyses. Note that
these are from a wide range of sources, and include various common tasks
relating to both data preparation and data analysis. It is the combination of
different errors generated at each stage of preparation and analysis that may
bring about various errors and uncertainties in the eventual outputs.
Consider another example. A land use planning agency is faced with the problem
of identifying areas of agricultural land that are highly susceptible to
erosion. Such areas occur on steep slopes in areas of high rainfall. The
spatial data used in a GIS to obtain this information might include:
• A land use map produced five years previously from 1:25,000 scale aerial
photographs,
• A DEM produced by interpolating contours from a 1:50,000 scale topographic
map, and
• Annual rainfall statistics collected at two rainfall gauges.
8.4 SOME COMMON CAUSES OF ERROR PROPAGATION
Below is a list of some common causes of error in spatial data handling.
8.5 QUANTIFYING ERROR PROPAGATION
Chrisman noted that “the ultimate arbiter of cartographic error is the real
world, not a mathematical formulation”. It is an unavoidable fact that we will
never be able to capture and represent everything that happens in the real
world perfectly in a GIS. Hence there is much to recommend the use of testing
procedures for accuracy assessment.
Various perspectives, motives and approaches to dealing with uncertainty have
given rise to a wide range of conceptual models and indices for the description
and measurement of error in spatial data. All these approaches have their
origins in academic research and have strong theoretical bases in mathematics
and statistics. Here we identify two main approaches for assessing the nature
and amount of error propagation:
1. Testing the accuracy of each state by measurement against the real world,
and
2. Modelling error propagation, either analytically or by means of simulation
techniques.
Modelling of error propagation has been defined by Veregin as: “the application
of formal mathematical models that describe the mechanisms whereby errors in
source data layers are modified by particular data transformation operations.”
In other words, we would like to know how errors in the source data behave
under manipulations that we subject them to in a GIS. If we are able to
quantify the error in the source data as well as their behaviour under GIS
manipulations, we have a means of judging the uncertainty of the results.
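The two modelling routes named above, analytical and simulation-based, can be
compared on a deliberately simple operation: computing the area of a
rectangular parcel from a measured width and height. Everything in this sketch
is an assumption made for illustration: the parcel dimensions, the independent
normally distributed measurement errors, and the use of the standard
first-order (Taylor series) approximation for the analytical result.

```python
import math
import random
import statistics

def analytical_sd(w, h, sd_w, sd_h):
    """First-order standard deviation of the area A = w * h.

    For independent errors: var(A) ~ (h * sd_w)^2 + (w * sd_h)^2.
    """
    return math.sqrt((h * sd_w) ** 2 + (w * sd_h) ** 2)

def simulated_sd(w, h, sd_w, sd_h, n=50_000, seed=1):
    """Monte Carlo estimate: perturb the inputs, recompute, measure spread."""
    rng = random.Random(seed)
    areas = [rng.gauss(w, sd_w) * rng.gauss(h, sd_h) for _ in range(n)]
    return statistics.stdev(areas)

# Hypothetical parcel: 100 m x 50 m, each side measured with sd = 0.5 m.
ana = analytical_sd(100.0, 50.0, 0.5, 0.5)   # sqrt(25^2 + 50^2), about 55.9 m^2
sim = simulated_sd(100.0, 50.0, 0.5, 0.5)
```

The simulation route needs no formula for the operation, so it also works for
chains of GIS operations where an analytical expression is hard to derive, at
the cost of many repeated runs.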
Error propagation models are very complex and valid only for certain data
types (e.g. numerical attributes). Initially, they described only the
propagation of attribute error.
More recent research has addressed the spatial aspects of error propagation and
the development of models incorporating both attribute and locational
components. These topics are outside the scope of this book, and readers are
referred to the specialist literature for more detailed discussions. Rather
than explicitly modelling error propagation, it is often more practical to test
the results of each step in the process against some independently measured
reference data.
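Testing a processing result against independently measured reference data
needs some summary measure of the differences. The text does not prescribe one;
the sketch below uses the root mean square error (RMSE), a widely used choice,
and the elevation values in the usage example are made up.

```python
import math

def rmse(measured, reference):
    """Root mean square error between paired measured and reference values."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Usage: interpolated DEM heights versus surveyed check points (hypothetical).
dem_heights = [10.0, 12.0, 9.0]
check_points = [10.0, 11.0, 11.0]
error = rmse(dem_heights, check_points)   # sqrt((0 + 1 + 4) / 3)
```

Repeating such a check after each processing step shows where in the chain the
accuracy degrades, without having to model the propagation explicitly.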
The applications and uses of GIS are as open-ended as the geospatial data sets
and databases it works with. Analysts and researchers devise new applications
of the technology every day, and the number of applications is unlikely to
fall. GIS use is not limited to the handful of applications mentioned above; it
has well over a thousand uses and applications in various fields, including
archaeology, geology, waste management, natural resources management, asset
management and even aviation and banking. It seems that in the near future GIS
will be integrated with almost everything, which is why companies such as HERE
Maps and Google are working on concepts like the Internet of Things, where
everything will be interconnected.
8.6 SUMMARY
A GIS data model enables a computer to represent real geographical elements as
graphical elements. Two representational models are dominant: raster
(grid-based) and vector (line-based).
Raster. Based on a cellular organization that divides space into a series of
units. Each unit is generally similar in size to another. Grid cells are the
most common raster representation. Features are divided into cellular arrays
and a coordinate (X,Y) is assigned to each cell, as well as a value. This
allows for registration with a geographic reference system. A raster
representation also relies on tessellation: geometric shapes that can
completely cover an area.
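The link between a grid cell and its coordinate can be written down directly.
The sketch below assumes a common but not universal convention: square cells,
the grid origin at the upper-left corner, and row numbers increasing downwards.
The origin coordinates and the 30 m cell size in the usage lines are arbitrary
example values.

```python
def cell_center(row, col, x_min, y_max, cell_size):
    """Return the (x, y) coordinate of the centre of a grid cell."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size   # rows count downwards from the top
    return x, y

def cell_of(x, y, x_min, y_max, cell_size):
    """Return the (row, col) of the grid cell that contains point (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col

# Usage with a hypothetical grid whose upper-left corner is (500000, 4000000).
centre = cell_center(0, 0, 500000.0, 4000000.0, 30.0)
cell = cell_of(500061.0, 3999939.0, 500000.0, 4000000.0, 30.0)
```

Storing only the origin and cell size is enough to register every cell with
the reference system, which is why a raster does not need explicit coordinates
per cell.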
Vector. The concept assumes that space is continuous, rather than discrete,
which gives an infinite (in theory) set of coordinates. A vector representation
is composed of three main elements: points, lines and polygons. Points are
spatial objects with no area; each is a single set of coordinates (X and Y) in
a coordinate space and can have attributes attached. Lines are spatial objects
made up of connected points (nodes) and have no width. Polygons are closed
areas bounded by a circuit of line segments.
8.7 REFERENCES
1. www.esri.com
2. www.nationalgeographic.org
3. Principles of Geographic Information Systems (reference book)
8.8 UNIT END QUESTIONS
1. What are the different GIS models?
2. What are the different causes of error propagation?
3. How does error propagate in spatial queries?
4. How can error propagation be quantified?
9
DATA VISUALIZATION
Unit Structure :
9.0 Objectives
9.1 Introduction
9.2 GIS And Maps
9.3 The Visualization Process
9.4 Visualization Strategies: Present Or Explore?
9.5 The Cartographic Toolbox
9.6 How To Map
9.7 Map Cosmetics
9.8 Map Dissemination
9.9 Summary
9.10 References
9.11 Questions
9.0 OBJECTIVES
After studying this chapter, the students will be able to:
Understand the importance of GIS maps for spatial data.
Describe the visualization process along with its strategies.
Summarize the cartographic toolbox.
Infer the way to map quantitative data, qualitative data, terrain elevation
and time series data.
Compare and contrast map cosmetics and map dissemination.
9.1 INTRODUCTION
1. Visualization through maps is not only a highly efficient way to transfer
information to an audience; it also gives the information aesthetic appeal
and can make otherwise dull content eye-catching.
2. Maps are therefore now widely used to visualize information in data
analysis reports, geo-based satellite images, etc.
3. Visualizing data with a map makes it easier to see the distribution or
proportion of data in each region, which helps to mine deeper information
and make better decisions about the data under study.
4. There are many types of map visualization, such as administrative maps,
heat maps, statistical maps, trajectory maps and bubble maps.
5. The types of maps are as follows:
5.1 Point Map - Point maps are straightforward and are especially suited to
displaying data with a wide geographic distribution. For example, some
companies have many business locations spread across different parts of a
country. Showing these on a general map is more complicated and less
accurate, so a point map is better suited to precise and fast positioning.
Point maps can also be used to track accidents in particular geographic
areas.
Fig 1 Point map in which dark blue circles represent company sales in
different regions
5.2 Line Map - A line map sometimes contains not only spatial but also temporal
information, which is useful in applications that need to analyse particular
scenarios, e.g. the route distribution of cycling or driving trips, or the
distribution of bus or subway lines.
Fig 2 Line map showing taxi routes in New York City
5.3 Regional Map - This map is also called a filled map. It can be displayed by
country, province, city or district, or even on customized maps. It uses
different shades of color to show data of different magnitudes, e.g. to show
the sales in different regions, with the option of drilling down into the
features of a region.
Fig 3 Regional map showing the sales of each city in different shades of
color; the larger the sales, the darker the color
5.4 Flow Map - In this type of map, the interaction between an origin and a
destination is usually expressed by a line that connects the geometric
centers of gravity of the spatial units. The magnitude of the flow between
origin and destination is represented by the width or the color of the line.
This type of map is used for traffic flow analysis, population migration,
shopping and consumption behavior, communication and information flows,
aviation routes, etc.
Fig 4 Flow map showing inter-regional trade between countries by means of
line width and color
5.5 Time Space Distribution Map - Such maps visualize trajectories using both
temporal and spatial information. They can record the time and the spatial
position of each point. This type of map is used in GPS geographic
tracking, etc.
5.6 Data Space Distribution Map - A concrete example explains this type of map.
The figure below is a spatial distribution map of passenger flow in rail
transit. Different colors identify different lines, and the thickness of a
line indicates the traffic volume at different stations: the thicker the
line, the larger the traffic. The map can also indicate the direction of
the track line.
Fig 5 Data space distribution map showing the distribution of passenger flow
in a certain period of time, which helps to arrange operations (such as the
number of employees) rationally
5.7 Other types of maps also exist, such as heat maps, heat point maps, custom
maps and three-dimensional rectangular maps.
9.2 GIS AND MAPS
1. There is a strong relationship between maps and GIS. More specifically,
maps can be used as input for a GIS.
2. As soon as a question contains a “where?”, a map is often the most suitable
tool for solving it and providing the answer. “Where do I find Enschede?”
and “Where did ITC’s students come from?” are both examples.
3. A map puts these answers in a spatial context. It could show where in the
Netherlands Enschede is to be found and where it is located with respect to
Schiphol Airport (Amsterdam), where most students arrive, as shown in
Fig 6.
Fig 6 This map reveals that most students arrive from Africa and Asia
and only a few come from the Americas, Australia, and Europe.
4. As soon as the location of geographic objects (“where?”) is involved, a
map becomes useful. However, maps can do more than just provide
information on location. They can also inform about the thematic
attributes of the geographic objects located in the map. An example would
be “What is the predominant land use in southeast Twente?”.
Fig 7 Land use in southeast Twente; the map reveals a dominant
northwest-southeast urban buffer
5. A third type of question that can be answered from maps is related to
“when?”, for instance “When did the Netherlands have its longest
coastline?”. The answer might be “1600”, and this will probably be
satisfactory to most people. This is shown in Fig 8.
Fig 8 The coastline of the Netherlands
6. Maps can deal with questions and answers related to the basic components of
spatial or geographic data: location (geometry), characteristics (thematic
attributes) and time, and their combinations.
7. As such, maps are the most efficient and effective means to transfer spatial
information. The map user can locate geographic objects, while the shape
and colour of signs and symbols representing the objects inform about
their characteristics. On-screen maps are often interactive and have a link
to a database, and as such allow for more complex queries.
8. The figure below shows an aerial photograph of the ITC building and a map
of the same area. The photograph shows all visible objects, including parked
cars and small temporary buildings.
Fig 9 Comparing an aerial photograph and a map of the same area (panels a
and b)
9. The map scale is the ratio between a distance on the map and the
corresponding distance in reality. Maps that show much detail of a small
area are called large-scale maps. A broad definition of a map is: “A
representation or abstraction of geographic reality. A tool for presenting
geographic information in a way that is visual, digital or tactile.”
10. Traditionally, maps are divided into topographic maps and thematic maps.
A topographic map visualizes, limited by its scale, the earth’s surface as
accurately as possible. This may include infrastructure (e.g. railroads and
roads), land use (e.g. vegetation and built-up area), relief, hydrology,
geographic names and a reference grid.
Fig 10 A topographic map of the province of Overijssel. Geographic names
and a reference grid have been omitted for reasons of clarity.
11. Thematic maps represent the distribution of particular themes. One can
distinguish between socio-economic themes and physical themes. The map in
figure a) below, showing population density in Overijssel, is an example of
the first, and the map in figure b), displaying the province’s drainage
areas, is an example of the second.
Fig 11 a) Socio-economic thematic map showing population density of the
province of Overijssel (higher densities in darker tints); b) physical
thematic map showing watershed areas of Overijssel.
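The definition of map scale in point 9 above (map distance divided by real
distance) translates into a simple conversion. The helper names and the two
example scales below are chosen for illustration only.

```python
def to_ground_m(map_cm, scale_denominator):
    """Ground distance in metres for a distance measured on the map in cm."""
    return map_cm * scale_denominator / 100.0

def to_map_cm(ground_m, scale_denominator):
    """Map distance in cm for a ground distance in metres."""
    return ground_m * 100.0 / scale_denominator

# 4 cm on a 1:25,000 map corresponds to 1000 m on the ground.
ground = to_ground_m(4, 25_000)

# The same 1000 m takes 4 cm at 1:25,000 but only 2 cm at 1:50,000,
# so 1:25,000 is the larger scale of the two and can show more detail.
at_25k = to_map_cm(1000, 25_000)
at_50k = to_map_cm(1000, 50_000)
```

This also shows why "large scale" means a small denominator: the ratio
1/25,000 is a larger number than 1/50,000.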
9.3 THE VISUALIZATION PROCESS
1. The cartographic visualization process is considered to be the translation
or conversion of spatial data from a database into graphics.
2. During the visualization process, cartographic methods and techniques are
applied. These can be considered to form a kind of grammar that allows
for the optimal design, production and use of maps, depending on the
application. The process is depicted in the figure below.
Fig 12 The cartographic visualization process
3. The producer of these visual products may be a professional cartographer,
but may also be a discipline expert, for instance someone mapping vegetation
stands using remote sensing images, or health statistics in the slums of a
city.
4. The visualization process can vary greatly depending on where in the
spatial data handling process it takes place and the purpose for which it is
needed. The process can be simple or complex, while the production time
can be short or long.
5. Examples include the creation of a full, traditional topographic map sheet,
a newspaper map, a sketch map, a map from an electronic atlas, an animation
showing the growth of a city, a three-dimensional view of a building or a
mountain, or even a real-time map display of traffic conditions.
6. Visualization can also be used for checking the consistency of the
acquisition process or even the database structure. The environment in
which the visualization process is executed can vary considerably. It can
be done on a stand-alone personal computer, on a networked computer, or via
the WWW/internet.
7. The visualization process is guided by the question “How do I say what to
whom?”. “How” refers to cartographic methods and techniques. “I” refers to
the cartographer or map maker. “Say” deals with communicating in graphics
the semantics of the spatial data. “What” refers to the spatial data and its
characteristics. “Whom” refers to the map audience and the purpose of the
map; for instance, a map for scientists requires a different approach than
a map on the same topic aimed at children.
8. In the past, the cartographer was often solely responsible for the whole
map compilation process. During this process, incomplete and uncertain
data often still resulted in an authoritative map.
9. The visualization process should also be tested for its effectiveness. To
the proposition “How do I say what to whom?” we have to add “and is it
effective?”. Based on feedback from map users, or on knowledge about the
effectiveness of cartographic solutions, one can decide whether
improvements are needed and derive recommendations for future application
of these solutions.
10. The visualization process is always influenced by several factors. Some of
these can be identified by just looking at the content of the spatial
database:
    10.1 What will be the scale of the map: large, small or other? This
    introduces the problem of generalization. Generalization addresses
    the meaningful reduction of the map content during scale reduction.
    10.2 Are we dealing with topographic or thematic data? These two
    categories traditionally resulted in different design approaches.
    10.3 Are the data to be represented of a quantitative or qualitative
    nature?
11. One should understand that the impact of these factors may increase, since
the compilation of maps by spatial data handling is often the result of
combining data sets of different quality and from different sources,
collected at different scales and stored in different map projections.
9.4 VISUALIZATION STRATEGIES: PRESENT OR
EXPLORE?
1. The main function of maps is to communicate geographic information, i.e.
to inform the map user about the location and nature of geographic
phenomena and spatial patterns. Well-trained cartographers design and
produce maps, supported by a well-developed set of cartographic tools.
2. The widespread use of GIS has increased the number of maps tremendously.
Even the spreadsheet software commonly used in offices today has mapping
capabilities. Many of these maps are not produced as final products,
however, but act as intermediaries to support users in their work with
spatial data.
3. The map has started to play a completely new role: it is not only a
communication tool, but has also become an aid in the user’s (visual)
thinking process.
4. This thinking process is accelerated by the continued developments in
hardware and software. Media such as DVD-ROMs and the WWW allow dynamic
presentation and also user interaction.
5. Users now expect immediate and real-time access to the data, which have
become abundant in many sectors of the geoinformation world. This abundance
is itself a problem: user-friendly tools for querying and retrieval are
lacking when studying the massive amount of data produced by sensors, which
is now available via the internet.
6. A new branch of science, known in the geo-disciplines as visual data
mining, is currently evolving to deal with this problem of abundance.
7. Specific software toolboxes have been developed, and their functionality is
based on two key words: interaction and dynamics. A separate discipline
called scientific visualization has developed around it and has also had an
important impact on cartography. It offers the user the possibility of
instantaneously changing the appearance of a map.
8. Interaction with the map will stimulate the user’s thinking and will add a
new function to the map: as well as communicating, it will prompt
thinking and decision making.
9. Developments in scientific visualization stimulated DiBiase to define a
model for map-based scientific visualization, also known as
geovisualization. It covers both the presentation and exploration functions
of the map. Presentation refers to public visual communication, i.e. maps
aimed at an audience, whereas exploration refers to private visual
thinking, in which an individual works with as yet unknown data.
Fig 13 Private visual thinking and public visual communication.
10. On the democratization of cartography, Morrison explains that “using
electronic technology, no longer does the map user depend on what the
cartographer decides to put on a map. Today the user is the cartographer;
users are now able to produce analyses and visualizations at will to any
accuracy standard that satisfies them.”
11. Exploration means searching for spatial, temporal or spatio-temporal
patterns, relationships between patterns, or trends. A search for
relationships between patterns could include changes in vegetation
indices and climatic parameters, or the location of deprived urban areas
and their distance to educational facilities. A search for trends could,
for example, focus on the development in distribution and frequency of
landslides.
12. Maps not only enable these types of searches; findings may also trigger
new questions and lead to new visual exploration acts.
13. Cartographic knowledge is incorporated in the program, resulting in
pre-designed maps. To create a map, one selects relevant geographic data
and converts these into meaningful symbols for the map. Paper maps had a
dual function: they acted as a database of the objects selected from
reality and communicated information about these geographic objects.
14. The sentence “How do I say what to whom, and is it effective?” guides the
cartographic visualization process and summarizes the cartographic
communication principle.
15. In 1967, the French cartographer Bertin developed the basic concepts of
the theory of map design with his publication Sémiologie graphique. He
provided guidelines for making good maps. If ten professional
cartographers were given the same mapping task and each applied Bertin’s
rules, this would still result in ten different maps, each of good quality.
Fig 14 The cartographic communication process, based on “How do I say
what to whom, and is it effective?”
16. From the above figure it is clear that the boxes with information and
information retrieved do not overlap, which means the information
derived by the map user is not the same as the information that the
cartographic communication process started with. There may be several
causes. Possibly the original information was not all used, or additional
information was added during the process. Another possibility is that the
map user did not fully understand the map, or that the cartographer added
extra information to strengthen the information already available. It is
also possible that the map user has some prior knowledge of the topic or
area, which allows them to combine this prior knowledge with the knowledge
retrieved from the map.
9.5 THE CARTOGRAPHIC TOOLBOX
9.5.1 What kind of data do I have?
1. To derive the proper symbology for a map, one has to execute a cartographic
data analysis. The core of this analysis is to assess the characteristics of
the data to find out how they can be visualized so that the map user
interprets them properly. First, a common theme shared by all the data is
identified and used as the title of the map. For example, if all the data
are related to land use and were collected in 2005, the title could be
“Land use of 2005”. Secondly, the individual components, such as land use
and probably relief, should be analysed and their nature described. Later
these components should be visible in the map legend.
2. Data will be of a qualitative or quantitative nature. Qualitative data is
also called nominal or categorical data. This data exists as discrete, named
values without a natural order amongst the values. Examples are different
languages (e.g. English, Swahili, Dutch), different soil types (e.g. sand,
clay, peat) or different land use categories (e.g. arable land, pasture).
3. Quantitative data can be measured, either along an interval or a ratio
scale. For data measured on an interval scale, the exact distance between
values is known but there is no absolute zero on the scale.
4. Temperature is an example: 40 degrees Celsius is not twice as warm as 20
degrees Celsius, and 0 degrees Celsius is not an absolute zero.
5. Quantitative data with a ratio scale does have a known absolute zero. An
example is income: someone earning $100 earns twice as much as someone
with an income of $50. In order to generate maps, quantitative data are
often classified into categories according to some mathematical method.
6. In between qualitative and quantitative data, one can distinguish ordinal
data. These data are measured along a relative scale, based on hierarchies.
For example, one knows that one value is ‘more’ than another value, such as
warm versus cool. Another example is a hierarchy of road types: ‘highway’,
‘main road’, ‘road’ and ‘track’.
7. The different types of data are summarized as follows:
Sr.No  Measurement scale     Nature of data
1      Nominal, categorical  Data of different nature/identity of things (qualitative)
2      Ordinal               Data with a clear element of order, though not
                             quantitatively determined (ordered)
3      Interval              Quantitative information with arbitrary zero
4      Ratio                 Quantitative data with absolute zero
Table 1 Differences in the nature of data and their measurement scales
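Point 5 above notes that quantitative data are often classified into categories
by some mathematical method before mapping. The text does not say which method;
the sketch below uses equal-interval classification as one simple possibility,
and the population density values are invented for the example.

```python
def equal_interval_breaks(values, n_classes):
    """Return the n_classes - 1 inner class boundaries of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + i * width for i in range(1, n_classes)]

def classify(value, breaks):
    """Return the class index (0 = lowest); a boundary value falls in the lower class."""
    return sum(value > b for b in breaks)

# Usage: hypothetical population densities (persons per square km).
densities = [10, 50, 120, 300, 410]
breaks = equal_interval_breaks(densities, 4)       # [110.0, 210.0, 310.0]
classes = [classify(d, breaks) for d in densities]
```

The resulting class indices are ordinal, so on a map they would typically be
shown with an ordered visual variable such as lightness (light to dark tints).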
9.5.2 HOW TO MAP
1. Basic elements of a map, irrespective of the medium on which it is
displayed, are point symbols, line symbols, area symbols and text. The
appearance of point, line and area symbols can vary depending on their
nature.
2. Points can vary in form or colour to represent the location of shops, or they
can vary in size to represent aggregated values for an administrative area.
3. Lines can vary in colour to distinguish between administrative boundaries
and rivers, or vary in shape to show the difference between railroads and
roads. Areas follow the same principles: difference in colour distinguishes
between different vegetation stands.
4. Although the variations in symbol appearance are only limited by the
imagination, they can be grouped together in a few categories. Bertin
distinguished six categories, which he called the visual variables and which
may be applied to point, line and area symbols. As illustrated in the
figure below, they are size, value (lightness), texture, colour, orientation and
shape.
Fig 15 Bertin’s Six visual variables
5. These visual variables can be used to make one symbol different from
another. In doing this, map makers in principle have free choice, provided
they do not violate the rules of cartographic grammar.
6. The symbol should be located where features belong. Visual variables
influence the map user’s perception in different ways. What is perceived
depends on the human capacity to see or perceive:
6.1 What is of equal importance (eg all red symbols represent danger).
6.2 Order (eg the population density varies from low to high, represented
by light and dark colour tints, respectively).
6.3 An instant overview of the mapped theme.
7. There is an obvious relationship between the nature of the data to be
mapped and the ‘perception properties’ of visual variables. The dimensions of
the plane are added to the list of visual variables; they are the basis used for the
proper location of symbols on the plane (map).
Table 2 Measurement scales linked to visual variables based on perception
properties.
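The link between the nature of the data and suitable visual variables can be written down as a simple lookup. The sketch below is not a reproduction of Table 2; it is a hypothetical lookup built only from the examples given in this chapter (colour and shape to tell categories apart, value for order and density, size for aggregated amounts), and the names used are assumptions.

```python
# Hypothetical lookup derived from the examples in this chapter only;
# it does not reproduce the contents of Table 2.
SUGGESTED_VARIABLES = {
    "nominal": ["colour", "shape"],
    "ordinal": ["value"],
    "relative quantitative": ["value"],
    "absolute quantitative": ["size"],
}


def suggest_visual_variables(data_type):
    """Return the visual variables suggested for a kind of data."""
    key = data_type.strip().lower()
    if key not in SUGGESTED_VARIABLES:
        raise ValueError("unknown data type: " + data_type)
    return SUGGESTED_VARIABLES[key]


watershed_variables = suggest_visual_variables("Nominal")
density_variables = suggest_visual_variables("relative quantitative")
```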
9.6 HOW TO MAP
1. This part deals with characteristic mapping problems. We first describe a
problem and briefly discuss a solution based on cartographic rules and
guidelines. The need to follow these rules and guidelines is illustrated by
some maps that have been wrongly designed but are nevertheless commonly
found.
9.6.1 How to map qualitative data
1. If, after a long fieldwork period, one has finally delineated the boundaries of
a province’s watersheds, one is likely to be interested in a map showing these
areas. The geographic units in the map will have to represent the individual
watersheds.
2. In such a map, each of the watersheds should get equal attention and none
should stand out above the others.
3. The application of colour would be the best solution since it has
characteristics that allow one to quickly differentiate between geographic
units. However, since none of the watersheds is more important than the
others, the colours used have to be of equal visual weight or brightness.
Fig 16 A good example of mapping qualitative data
4. The readability is influenced by the number of displayed geographic units.
In the above example there are 15; with many more than this, the map at the
scale displayed here would become too cluttered.
5. The map can also be made by filling the watershed areas with different
forms (like small circles, squares or triangles) in one colour, as an
application of the visual variable shape.
6. Figure 17 shows several tints of black as an application of the visual
variable value. Looking at the map may cause perceptual confusion since
the map image suggests differences in importance that are not there in reality.
Fig 17 Misuse of tints of black
7. In figure 18, different colours are used instead of tints of black, but whereas
most watersheds are represented in pastel tints, one of them stands out by its
bright colour. This gives the map an unbalanced look. The viewer’s eye will
be distracted by the bright colour, resulting in unjustifiably weaker
attention for the other areas.
Fig 18 Misuse of bright colours
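One way to obtain colours of roughly equal visual weight is to vary only the hue while keeping lightness and saturation fixed. The sketch below uses Python's standard `colorsys` module; the function names and the default lightness and saturation are assumptions, and HLS lightness is only an approximation of perceived brightness, so the result should still be judged by eye.

```python
import colorsys


def qualitative_palette(n_colours, lightness=0.7, saturation=0.5):
    """Return n hex colours that differ in hue but share lightness and saturation."""
    palette = []
    for i in range(n_colours):
        hue = i / n_colours  # evenly spaced around the colour wheel
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        palette.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return palette


def hls_lightness(hex_colour):
    """Return the HLS lightness (0 to 1) of a colour written as '#rrggbb'."""
    r, g, b = (int(hex_colour[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return colorsys.rgb_to_hls(r, g, b)[1]


# One pastel colour per watershed, none brighter than the others.
watershed_colours = qualitative_palette(15)
```

A palette built this way avoids both problems shown in figures 17 and 18: no unit is darker than another, and no single bright colour stands out among pastel tints.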
munotes.in
Page 150
Principles of
Geogrphics
Information Systems
150 9.6.2 How to map quantitative data
1. One deals with absolute quantitative data when, after executing a census, one
would for example like to create a map of the number of inhabitants per
municipality. The geographic units will logically be the municipalities. The
final map should allow the user to determine the amount per municipality and
also offer an overview of the geographic distribution of the phenomenon. The
figure below shows the final map for the province of Overijssel.
Fig 19 Mapping absolute quantitative data
2. Imagine a small and a large unit having the same number of inhabitants. The
large unit would visually attract more attention, giving the impression there
are more people than in the small unit.
3. Another issue is that the population is not necessarily homogeneously
distributed within the geographic units. Colour has also been misused, as
shown in the figure below.
Fig 20 Poorly designed maps displaying absolute quantitative data: a) wrong
use of green tints for absolute population b) incorrect use of colour.
4. On the basis of absolute population numbers per municipality and their
geographic size, we can also generate a map that shows population density
per municipality. We then deal with relative quantitative data.
5. The aim of the map is to give an overview of the distribution of the
population density from low (light tints) to high (dark tints). The map reader
automatically and at a glance associates the dark colours with high density and
the light values with low density, as shown in the figure below.
Fig 21 Mapping relative quantitative data
6. If one studies the badly designed maps carefully, the information can be
derived, in one way or another, but it would take quite some effort. Proper
application of cartographic guidelines will guarantee that this will go much
more smoothly (eg faster and with less chance of misunderstanding).
Fig 22 a) Badly designed maps representing relative quantitative data -
lightness values used out of sequence b) Colour should not be used
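The step from absolute to relative quantitative data, and the rule that tints must run in sequence from light to dark, can be sketched as follows. The municipality figures, class breaks and function names are hypothetical and serve only as an illustration.

```python
from bisect import bisect_right


def population_density(population, area_km2):
    """Relative quantitative value: inhabitants per square kilometre."""
    return population / area_km2


def lightness_for_density(density, class_breaks, lightest=0.9, darkest=0.2):
    """Return a lightness between lightest and darkest for a density class.

    class_breaks must be sorted in ascending order and contain at least one
    value; higher classes always receive darker tints, never out of sequence.
    """
    class_index = bisect_right(class_breaks, density)  # 0 .. len(class_breaks)
    n_classes = len(class_breaks) + 1
    step = (lightest - darkest) / (n_classes - 1)
    return lightest - class_index * step


# Two hypothetical municipalities with the same number of inhabitants.
small_unit = population_density(20000, 10)    # small area, high density
large_unit = population_density(20000, 100)   # large area, low density

breaks = [250, 500, 1000]  # hypothetical class limits in inhabitants per km2
small_tint = lightness_for_density(small_unit, breaks)
large_tint = lightness_for_density(large_unit, breaks)
```

Although both units have the same absolute population, the small unit ends up in the darkest class and the large unit in the lightest, which is the impression a density map is meant to give.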
munotes.in
Page 152
Principles of
Geogrphics
Information Systems
152 9.6.3 How to map the terrain elevation
1. Terrain elevation can be mapped using different methods. Often one will
have collected an elevation data set for individual points like peaks or other
characteristic points in the terrain.
2. A contour map, in which the lines connect points of equal elevation, is
generally used. To visually improve the information content of such a map,
the space between the contour lines can be filled with colour and value
information following a convention, eg green for low elevation and brown
for high elevation areas. This technique is known as hypsometric or layer
tinting.
3. The shaded relief map uses the full three-dimensional information to create
shading effects. This map, represented on a two-dimensional surface, can
also be floated in three-dimensional space to give it a real three-dimensional
appearance of a ‘virtual world’, as shown in the figure below.
Fig 23 Visualization of terrain elevation: a) Contour map b) Map with layer
tints c) Shaded relief map d) 3D view of the terrain
4. Interactive functions are required to manipulate the map in three-dimensional
space in order to look behind some objects. These manipulations include
panning, zooming, rotating and scaling. Scaling is needed particularly along
the Z-axis, since some maps require small-scale elevation resolution while
others require large-scale resolution, ie vertical exaggeration.
5. Of course, one can also visualize objects below the surface in a similar way,
but this is more difficult because the data to describe underground objects
are sparsely available.
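Layer tinting and vertical exaggeration both reduce to simple operations on elevation values. In the sketch below, only the convention of green for low and brown for high elevations comes from the text; the class limits, the intermediate colours and the function names are hypothetical.

```python
# Hypothetical upper limits (in metres) and tints; only "green for low,
# brown for high" follows the convention mentioned in the text.
LAYER_TINTS = [
    (200, "green"),
    (500, "yellow"),
    (1000, "light brown"),
]


def layer_tint(elevation_m, tints=LAYER_TINTS, top_colour="brown"):
    """Return the tint of the elevation band that contains elevation_m."""
    for upper_limit, colour in tints:
        if elevation_m < upper_limit:
            return colour
    return top_colour


def exaggerate(points, factor):
    """Scale only the Z value of (x, y, z) points by the given factor."""
    return [(x, y, z * factor) for x, y, z in points]


lowland = layer_tint(50)
summit = layer_tint(1500)
stretched = exaggerate([(0, 0, 10), (1, 1, 25)], 2)
```

Because `exaggerate` leaves X and Y untouched, the horizontal scale of the 3D view stays the same while relief differences become easier to see.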
9.6.4 How to map time series
1. Advances in spatial data handling have not only made the third dimension
part of GIS routines. Nowadays, the handling of time-dependent data is also
part of these routines. This has been caused by the increasing availability of
data captured at different periods in time.
2. Mapping time means mapping change. This may be change in a feature’s
geometry, in its attributes or both. Examples of changing geometry are the
evolving coastline of the Netherlands, the location of Europe’s national
boundaries or the position of weather fronts.
3. In urban growth, both change: the urban boundaries expand and
simultaneously the land use shifts from rural to urban. If maps are to
represent events like these, they should be suggestive of such change.
4. This implies the use of symbols that are perceived as representing change.
Examples of such symbols are arrows that have an origin and a destination. They
are used to show movement, and their size can be an indication of the
magnitude of change.
5. Specific point symbols such as crossed swords (battle) or lightning (riots)
can be found to represent dynamics in historic maps. Another alternative is
the use of the visual variable value.
6. In a map showing the development of a town, dark tints represent old built-up
areas, while new built-up areas are represented by light tints. It is
possible to distinguish between three temporal cartographic techniques:
7. Types of map
7.1 Single static map: Specific graphic variables and symbols are used to
indicate change or represent an event.
7.2 Series of static maps: A single map in the series represents a ‘snapshot’
in time. Together, the maps depict a process of change. Change is
perceived by the succession of individual maps depicting the situation
in successive snapshots. It could be said that the temporal sequence is
represented by a spatial sequence, which the user has to follow to
perceive the temporal variation. The number of images should be
limited since it is difficult for the human eye to follow long series of
maps.
7.3 Animated map: Change is perceived to happen in a single image by
displaying several snapshots after each other, just like a video cut with
successive frames. The difference with the series of maps is that the
variation can be deduced from real ‘change’ in the image itself, not
from a spatial sequence.
Fig 24 Mapping change, example of the urban growth of the city of
Maastricht, the Netherlands: a) Single map, in which tints represent age of
the built-up area b) Series of maps c) An animation.
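The single static map and the series of static maps can both be derived from the same time-stamped data. The sketch below assumes a hypothetical dictionary of built-up areas with the year in which each was built; the area names, years, lightness range and function names are illustrative assumptions.

```python
def built_up_lightness(year_built, oldest_year, newest_year,
                       darkest=0.2, lightest=0.9):
    """Single static map: old built-up areas get dark tints, new ones light tints."""
    share = (year_built - oldest_year) / (newest_year - oldest_year)
    return darkest + share * (lightest - darkest)


def snapshot(areas, year):
    """Series of static maps: names of the areas already built up in a given year."""
    return sorted(name for name, year_built in areas.items() if year_built <= year)


# Hypothetical town: area name -> year in which the area was built up.
areas = {"centre": 1900, "ring": 1950, "suburb": 2000}

tints = {name: built_up_lightness(year, 1900, 2000) for name, year in areas.items()}
series = [snapshot(areas, year) for year in (1900, 1950, 2000)]
```

Showing the entries of `series` side by side gives the series of static maps; displaying them one after another in the same image frame would give the animated map.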
9.7 MAP COSMETICS
1. Each map should have, next to the map image, a title, informing the user
about the topic visualized. A legend is necessary to understand how the
topic is depicted.
2. Additional marginal information to be found on a map is a scale indicator, a
north arrow for orientation, the map datum and map projection used and
some lineage information (such as data sources, dates of data collection,
methods used etc).
3. Further information can be added that indicates when the map was issued
and by whom (author/publisher). All this information allows the user to
obtain an impression of the quality of the map and is comparable with
metadata describing the contents of a database or data layer.
4. On paper maps, these elements have to appear next to the map itself.
5. Maps presented on screen often go without marginal information, partly
because of space constraints. However, on-screen maps are often interactive,
and clicking on a map element may reveal additional information from the
database. Legends and titles are often available on demand as well.
6. Text is used to transfer information in addition to the symbols used. This can
be done by the application of the visual variables to the text as well.
7. A common example is the use of colour to differentiate between hydrographic
names (in blue) and other names (in black). The text should also be placed in
a proper position with respect to the object to which it refers.
8. The design aspect of creating appealing maps also has to be included in the
visualization process. ‘Appealing’ does not only mean having nice colours.
One of the keywords here is ‘contrast’.
9. Contrast will increase the communicative role of the map since it creates a
hierarchy in the map contents, assuming that not all information has equal
importance. This design trick is known as visual hierarchy or the figure-ground
concept. The need for visual hierarchy in a map is best understood
when looking at the maps shown in the figures below.
Fig 24 The paper map and its (marginal) information.
Fig 25 Visual hierarchy and the location of the ITC building: a) hierarchy not
applied b) hierarchy applied
9.8 MAP DISSEMINATION
1. The map design will not only be influenced by the nature of the data to
be mapped or the intended audience (the ‘what’ and ‘whom’ from “how
do I say what to whom and is it effective?”); the output medium also plays
a role. Traditionally, maps were produced on paper and many still are.
2. Compared to maps on paper, on-screen maps have to be smaller and
therefore their contents should be carefully selected. This might seem a
disadvantage, but presenting maps on-screen offers very interesting
alternatives.
3. A mouse click could also open the link to a database and reveal much
more information than a paper map could ever offer. Links to other than
tabular or map data could also be made available.
4. Maps and multimedia (photography, sound, video, animation) can be
integrated. Some of today’s electronic atlases, such as the Encarta world
atlas, are good examples of how multimedia elements can be integrated
with the map.
5. Pointing to a country on a world map starts the national anthem of the
country or shows its flag. The map can also be used to explore a country’s
language: moving the mouse would start a short sentence in the region’s dialects.
6. The WWW is nowadays a common medium used to present and
disseminate spatial data. Here maps can play their traditional role, for
instance to show the location of objects or provide insight into spatial
patterns, but because of the nature of the internet, the map can also
function as an interface to additional information.
7. Maps can also be used as ‘previews’ of spatial data products to be
acquired through a spatial data clearing house that is part of a spatial data
infrastructure. For that purpose we can make use of geo-webservices,
which can provide interactive map views as an intermediate between data
and web browser.
Fig 26 Classification of maps on the WWW.
8. An important distinction is the one between static and dynamic maps.
Many static maps on the web are view-only. Organizations such as map
libraries or tourist information providers often make their maps available
in this way.
9. Static, view-only maps can also serve to give web surfers a
preview of the products that are available from organizations such as
national mapping agencies.
10. When static maps offer more than view-only functionality, they may
present an interactive view to the user by offering zooming, panning or
hyperlinking to other information. Clicking on geographic objects may
lead the user to quantitative data, photographs, sound or video or other
information sources on the web.
11. Dynamic maps are about change: change in one or more of the spatial
data components. On the WWW, several options to play animations are
available. The so-called animated GIF can be seen as a view-only version
of a dynamic map. A sequence of bitmaps, each representing a frame of
an animation, is positioned one after another and the WWW browser
will continuously repeat the animation. This can be used, for example, to
show the change of weather over the last day.
12. More interactive versions of this type of map are those to be played by
media players, for instance those in QuickTime format or as a Flash movie.
Plug-ins to the web browser define the interaction options, which are
often limited to simple pause, backward and forward play.
13. The WWW also allows for the fully interactive presentation of 3D
models. The Virtual Reality Modeling Language (VRML), for instance, can
be used for this purpose. It stores a true 3D model of the objects, not just
a series of 3D views.
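The distinctions made in the list above can be read as a two-way classification: static versus dynamic, and view-only versus interactive. The sketch below encodes that reading; it is an interpretation of the text rather than a reproduction of Fig 26, and the function name and labels are assumptions.

```python
def classify_web_map(shows_change, interactive):
    """Classify a web map along the two distinctions made in the text."""
    kind = "dynamic" if shows_change else "static"
    mode = "interactive" if interactive else "view-only"
    return kind + ", " + mode


# Examples taken from the descriptions in the list above.
scanned_tourist_map = classify_web_map(shows_change=False, interactive=False)
zoomable_map = classify_web_map(shows_change=False, interactive=True)
animated_gif = classify_web_map(shows_change=True, interactive=False)
media_player_animation = classify_web_map(shows_change=True, interactive=True)
```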
9.9 SUMMARY
In this chapter we studied the following:
1. Maps are the most efficient and effective means to inform us about spatial
information. They locate geographic objects, while the shape and colour of
signs and symbols representing the objects inform about their
characteristics.
2. Maps are the result of the visualization process. Their design is guided by
“how do I say what to whom and is it effective?”. Executing the sentence
will inform the map maker about the characteristics of the data to be
mapped as well as the purpose of the map.
3. Map design will not only depend on the nature of the data to be mapped or
the intended audience but also on the output medium.
9.10 REFERENCES
[1] Principles of Geographic Information Systems: An Introductory Textbook,
by Otto Huisman and Rolf A. de By, published by The International Institute for
Geo-Information Science and Earth Observation (ITC).
9.11 QUESTIONS
1. Suppose one has two maps, one at scale 1:10,000 and another at scale
1:1,000,000. Which of the two maps can be called a large-scale map and
which is a small-scale map?
2. Describe the difference between a topographic map and a thematic map.
3. Describe in one sentence or in one question, the main problem of the
cartographic visualization process.
4. Which four main types of thematic data can be distinguished on the basis of
their measurement scales?
5. Which are the six visual variables that allow cartographic symbols to be
distinguished from each other?
6. Describe a number of ways in which a three-dimensional terrain can be
represented on a flat map display.
7. Describe different techniques of cartographic output from the user’s
perspective.
8. Explain the difference between static maps and dynamic maps.