Page 1: Spatial Data Infrastructures - University at Buffaloyhu42/papers/2017_GISBOK_SDI.pdf · Spatial Data Infrastructures Yingjie Hu, Department of Geography, University of Tennessee,

Spatial Data Infrastructures

Yingjie Hu, Department of Geography, University of Tennessee, Knoxville, TN 37996
Wenwen Li, School of Geographical Sciences and Urban Planning, Arizona State University, Tempe, AZ 85287

Abstract: Spatial data infrastructure (SDI) is the infrastructure that facilitates the discovery, access, management, distribution, reuse, and preservation of digital geospatial resources. These resources may include maps, data, geospatial services, and tools. As cyberinfrastructures, SDIs are similar to other infrastructures, such as water supplies and transportation networks, since they play fundamental roles in many aspects of society. These roles have become even more significant in today’s big data age, when a large volume of geospatial data and Web services is available. From a technological perspective, SDIs mainly consist of data, hardware, and software. However, a truly functional SDI also needs the efforts of people, support from organizations, government policies, data and software standards, and many others. In this chapter, we will present the concepts and values of SDIs, as well as a brief history of SDI development in the U.S. We will also discuss the components of a typical SDI, and will specifically focus on three key components: geoportals, metadata, and search functions. Examples of the existing SDI implementations will also be discussed.

Definitions

1. Spatial data infrastructure: The technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data, services, and other digital resources.
2. Geoportal: A gateway website through which people can search, discover, access, and visualize the geospatial resources within an SDI.
3. Metadata: Documentation about who, when, how, what, why, and many other facets of the data and the data production process. Metadata can be used for describing not only data, but also tools, services, and other geospatial resources.
4. Data standard: A commonly agreed specification on how data should be recorded and described.
5. Geospatial interoperability: The ability of different geographic information systems to share, exchange, and operate (heterogeneous) geospatial data and functions.
6. Web service: A Web application that provides standardized application programming interfaces to allow remote access to data and functions over the Internet.

_____________________________________
Hu, Y. & Li, W. (2017). "Spatial Data Infrastructures", The Geographic Information Science & Technology Body of Knowledge, John P. Wilson (ed.). http://dx.doi.org/10.22224/gistbok/2017.2.1


1. Spatial Data Infrastructure and its Values

The emergence of spatial data infrastructures (SDIs) is closely associated with the efforts of collecting and producing geospatial data, as well as the advancement of surveying and computer technologies. In the past decades, a large amount of geospatial data, such as remote sensing images and GPS locations, has been collected by government agencies such as the U.S. Geological Survey (USGS) and the National Oceanic and Atmospheric Administration (NOAA). Meanwhile, the fast development of geographic information systems facilitates the derivation of various data products from the collected data, such as topographic maps, land cover data, transportation networks, and hydrographic features. As location-based services are becoming increasingly popular, vast amounts of volunteered geographic information (VGI) (Goodchild 2007) have also been contributed by the general public through smart mobile devices and social media platforms. In addition, the componentization of GIS brings geospatial services that provide data processing and spatial analysis functions in the general Web environment.

The sheer number of geospatial datasets, services, maps, and other resources, however, does not make these resources easier to use. On one hand, it is challenging to find and access these digital resources, which are widely distributed across different government agencies and websites (Li, Wang and Bhatia 2016). On the other hand, a lot of data redundancies exist, and money and human resources have been wasted in duplicated data collection and maintenance efforts (Rajabifard and Williamson 2001, Maguire and Longley 2005).

These problems were recognized by governments around the world, and many spatial data infrastructures have been constructed since the 1990s (Masser 1999). In the U.S., a national spatial data infrastructure (NSDI) initiative was started in 1993 to provide standardized access to geographic information resources (National Research Council 1993). An official definition of NSDI, according to Executive Order 12906, is “the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data.” The Federal Geographic Data Committee (FGDC) is charged with coordinating the efforts to develop the NSDI in the U.S. The naming also indicates that SDI is recognized as an infrastructure similar to other types of infrastructures, such as electricity grids and water supplies, and that it plays fundamental roles in the socioeconomic and environmental developments of a country. Three parallel fronts were developed in the NSDI program: 1) a set of data standards for formalizing data and metadata; 2) a clearinghouse network providing data storage and online access; and 3) a set of framework data for the entire country, such as administrative boundaries (Longley et al. 2001, Maguire and Longley 2005).

Spatial data infrastructure presents a solution to the problems of resource discovery and data redundancy. It provides a unified platform where people can go and search for geospatial data, maps, services, and other digital resources. As multiple government agencies are sharing their data on one platform, SDI reduces data redundancy and the extra efforts in collecting duplicated geospatial data. From a cost/benefit perspective, SDI allows geospatial data to be collected once and reused multiple times in different applications. More generally, SDI can be considered an important element in the e-government (Georgiadou, Rodriguez-Pabón and Lance 2006) and open-government movement to increase the transparency of governmental activities and to enhance public participation. Better access to geospatial data also stimulates the growth of new businesses which may not be possible otherwise (Ralston 2004).


2. Key Components of an SDI

An SDI consists of many components. In addition to the digital geospatial resources, an SDI also needs hardware, software, people, organizations, standards, policies, and many others to function properly. Constructing an SDI also needs effective communications between communities, and negotiations among organizations and even countries to reach agreements. While an SDI has many components, this chapter will particularly focus on geoportals, metadata, and search functions, which are three key components of a typical SDI.

2.1 Geoportals

Geoportals are Web gateways that provide one-stop access to geospatial resources (Tait 2005). Geoportals are probably the most visible part of SDIs, since they are the main interfaces through which people can search and find geospatial resources. Figure 1 shows the user interface of Data.gov, a geoportal for the U.S. government open data. Figure 1(a) is the initial interface of the geoportal, which contains a search bar that allows users to type in some keywords and find relevant resources. Figure 1(b) shows a list of the results when the keyword “watershed” is searched.

(a) (b)

Figure 1. Two screenshots of the U.S. government open data geoportal Data.gov.

Geoportals are typically developed using Web-based technologies and off-the-shelf GIS software packages. A database management system (DBMS) is used to store and manage the metadata of the geospatial resources contained in the SDI. A Web interface, which often contains a map, enables end users to interact with the system and to conduct searches (Figure 2). When a search is performed, an HTTP (Hypertext Transfer Protocol) request will be sent to the Web server which hosts the geoportal. After querying the metadata stored in the database, the geoportal will then send back the result to the client through an HTTP response. Geoportals are typically designed to be used by both GIS professionals and the general public.

One important function of geoportals is helping users discover the existing geospatial resources. This resource discovery process often follows the publish-find-bind pattern (Rose 2004, Maguire and Longley 2005), in which: 1) providers publish the metadata of their data and services to a geoportal; 2) users perform a search on the geoportal and potentially find the data; 3) users consume the data and services from the providers. Figure 2 illustrates these three steps.
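
The request handling described above, in which a search request reaches the server, the metadata database is queried, and the result is sent back, can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation of Data.gov or any other geoportal: the one-table schema, the `q` query parameter, and the function names are hypothetical, and an in-memory SQLite database stands in for a production DBMS. The function `handle_search` plays the role of the server-side code that takes the query string of an HTTP request and produces the body of the HTTP response.

```python
import json
import sqlite3
from urllib.parse import parse_qs


def build_catalog():
    """Create a hypothetical one-table metadata store (real schemas are far richer)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata (title, description) VALUES (?, ?)",
        [
            ("Watershed boundaries", "Polygons of watershed units"),
            ("Road centerlines", "Transportation network lines"),
            ("Stream gauges", "Gauge locations within each watershed"),
        ],
    )
    return conn


def handle_search(conn, query_string):
    """Simulate the server side of one search: query string in, JSON body out."""
    params = parse_qs(query_string)
    keyword = params.get("q", [""])[0]
    pattern = f"%{keyword}%"
    # Parameterized query: the keyword is never pasted into the SQL text.
    rows = conn.execute(
        "SELECT id, title FROM metadata "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY id",
        (pattern, pattern),
    ).fetchall()
    results = [{"id": row[0], "title": row[1]} for row in rows]
    return json.dumps({"count": len(results), "results": results})


conn = build_catalog()
response_body = handle_search(conn, "q=watershed")
print(response_body)
```

In a deployed geoportal the query string would arrive through a Web framework and the response body would be wrapped in HTTP headers; the sketch leaves those layers out to keep the database lookup visible.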


Figure 2. Some key components of an SDI and the publish-find-bind pattern.
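
The three steps of the publish-find-bind pattern can also be written out as a small program. The sketch below is a simplified illustration: the class names, the keyword-matching rule, and the `example.org` endpoints are all invented for this purpose, and the bind step only returns the provider's address instead of contacting it.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    title: str
    keywords: list
    service_url: str  # where the provider actually serves the resource


@dataclass
class Geoportal:
    records: list = field(default_factory=list)

    def publish(self, record):
        """Step 1: a provider registers metadata (not the data itself)."""
        self.records.append(record)

    def find(self, keyword):
        """Step 2: a user searches the registered metadata."""
        kw = keyword.lower()
        return [
            r for r in self.records
            if kw in r.title.lower() or any(kw == k.lower() for k in r.keywords)
        ]


def bind(record):
    """Step 3: the user turns to the provider's endpoint to consume the resource.

    Only the endpoint is returned here; a real client would send a request to it.
    """
    return record.service_url


portal = Geoportal()
portal.publish(MetadataRecord(
    "County watershed boundaries", ["hydrology", "watershed"],
    "https://provider-a.example.org/wms"))
portal.publish(MetadataRecord(
    "State road network", ["transportation"],
    "https://provider-b.example.org/wfs"))

hits = portal.find("watershed")
endpoints = [bind(r) for r in hits]
print(endpoints)
```

The point of the split is that the geoportal holds only metadata: the data stay with the providers, and the portal's job ends once the user has been pointed to the right endpoint.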

The resource discovery process heavily relies on the quality of the metadata and the performance of the search function. In the following, we will discuss these two components.

2.2 Metadata

Metadata provide documentation on the content and the production process of geospatial resources. Metadata are often called the data about data, and include information such as titles, descriptions, data categories, the locations and time of the data collection, the data collectors, the coordinate systems and map projections used, and the data cleaning and processing procedures. Metadata can also be used for describing geospatial services by providing information about the data and functions offered by the services, the input and output, the developers, the development time, and others. In short, metadata are about all aspects of digital geospatial resources. Figure 3 shows the metadata fragment of a geospatial service provided by the NOAA, which describes the capabilities (e.g., getting a map based on the served geospatial data) and the data layers provided by this service.

Figure 3. The metadata fragment (in the format of XML) of a geospatial service provided by the NOAA.
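
To give a sense of how software reads such service metadata, the sketch below parses a small XML fragment and lists the operations and data layers it advertises. The fragment is invented and heavily simplified; it is not the NOAA document shown in Figure 3, and its element names are assumptions loosely modeled on the capabilities documents that map services publish.

```python
import xml.etree.ElementTree as ET

# Invented, simplified fragment; NOT the NOAA document shown in Figure 3.
CAPABILITIES_XML = """
<Service_Capabilities>
  <Service>
    <Title>Example Weather Map Service</Title>
  </Service>
  <Capability>
    <Request>
      <GetCapabilities/>
      <GetMap/>
    </Request>
    <Layer>
      <Title>All layers</Title>
      <Layer><Name>radar</Name><Title>Radar reflectivity</Title></Layer>
      <Layer><Name>warnings</Name><Title>Active warnings</Title></Layer>
    </Layer>
  </Capability>
</Service_Capabilities>
"""


def summarize_service(xml_text):
    """Extract the service title, supported operations, and named layers."""
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("Service/Title")
    # Each child element of <Request> names one operation the service supports.
    operations = [op.tag for op in root.find("Capability/Request")]
    # The outer <Layer> is only a container; layers with a <Name> can be requested.
    layers = [
        layer.findtext("Name")
        for layer in root.iter("Layer")
        if layer.find("Name") is not None
    ]
    return {"title": title, "operations": operations, "layers": layers}


summary = summarize_service(CAPABILITIES_XML)
print(summary)
```

Real capabilities documents usually declare XML namespaces and carry many more elements (bounding boxes, coordinate systems, contact information), so production code would need namespace-aware lookups.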


Metadata are fundamentally important for SDIs. When data and services leave their original data production context and are integrated into an SDI, metadata provide the primary information based on which GIS users can understand and use digital geospatial resources. Without metadata or with only poorly constructed metadata, it is very difficult, if not impossible, for data and services to be reused. The quality of metadata also determines the result of resource discovery. Many geoportals rank the relevance of geospatial resources to user queries based on the information contained in their metadata. Complete and accurate metadata allow geoportals to find and rank geospatial resources based on locations, time, thematic attributes, data types, published years, data collectors, and many other conditions explicitly or implicitly specified in the user queries. To ensure the quality of metadata, standards are established to define the necessary elements that should be included in metadata. In the U.S., the FGDC is responsible for coordinating the development of metadata standards, and its Content Standard for Digital Geospatial Metadata (CSDGM) has been used by many U.S. government agencies to formalize metadata. Since 2010, FGDC has endorsed a series of international metadata standards (e.g., the ISO 191** standards) to promote Global Spatial Data Infrastructure (GSDI).
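
The role of a standard in defining necessary elements can be illustrated with a simple completeness check. The list of required elements below is an assumption made for this sketch; it is not the mandatory element set of CSDGM or of any ISO standard, and the record and its values are invented.

```python
# Illustrative list of required elements; an assumption for this sketch,
# not the mandatory element set of CSDGM or any ISO standard.
REQUIRED_ELEMENTS = ("title", "abstract", "bounding_box", "time_period", "originator")


def missing_elements(record):
    """Return the required elements that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


def completeness(record):
    """Fraction of required elements that are filled in, from 0.0 to 1.0."""
    filled = len(REQUIRED_ELEMENTS) - len(missing_elements(record))
    return filled / len(REQUIRED_ELEMENTS)


# Invented record: the abstract is empty and the time period is not given.
record = {
    "title": "Land cover 2011",
    "abstract": "",
    "bounding_box": (-90.3, 34.9, -81.6, 36.7),
    "originator": "Example agency",
}

print(missing_elements(record))
print(completeness(record))
```

A geoportal could run a check of this kind when providers publish their metadata, rejecting or flagging records that lack the elements its search and ranking functions depend on.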

2.3 Search Function

Search functions are the major means through which users discover geospatial resources in an SDI. Without an effective search function, relevant geospatial resources in an SDI can hardly be found by the users. Two types of search functions are often adopted in geoportals: text-based search and map-based search. Text-based search is similar to Web search engines, in which a user types in some keywords and receives the results based on the matched text. Map-based search allows users to find geospatial resources by interacting with a map, and a user can pan, zoom in and out, and draw polygons to specify their areas of interest. Using either of these two methods alone has some limitations. The text-based search enables general users, especially the users who are unfamiliar with a GIS, to find geospatial resources in a way similar to how they would use a general search engine, such as Google. However, it can be challenging to identify the suitable keywords, or the keywords may not accurately describe the geographic areas that the user is interested in. The map-based search, on the other hand, provides convenience for the users who are already familiar with a map interface, and allows users to specify geographic locations in a more accurate manner (e.g., by drawing polygons). However, not everyone feels comfortable using a map-based interface. Geoportals often offer both search functions so that the two complement each other and accommodate the needs of different users.

The search functions in SDIs are being improved by researchers. One of these improvements lies in text-based search: there is a transition from keyword-based search to semantic search. Keyword-based search examines the matching between the keywords input by the users and the textual descriptions of the geospatial resources. Thus, if a user types in “road”, the search function will not be able to find the resources labeled as “street”. Semantic search aims to match digital geospatial resources to user queries based on the meaning (semantics) of the queries, and therefore can identify relevant data and services even though they are not labeled with exactly the same words. Early attempts at semantic search include the use of ontology-based query expansion (Lutz and Klien 2006), rule-based semantic reasoning (Li, Zhou and Wu 2016), and faceted search, e.g., Apache Solr (Hostetter 2006). Despite the existing research efforts, additional work is still necessary to integrate semantic search into current SDIs. Figure 4 shows examples of searching “earthquake” (Figure 4(a)) and “natural disasters” (Figure 4(b)) respectively on Data.gov. Although an earthquake is generally considered a type of natural disaster, the search for “earthquake” returned more matching records than the search for “natural disasters”, suggesting that the underlying search function is more likely based on simple keyword matching. In addition to text-based search, map-based search is also supported by Data.gov. Figure 4(c) shows a spatial filtering effect by narrowing down the region of interest to the contiguous United States using the map interface.
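
The difference between keyword matching and query expansion, and the idea of a map-based spatial filter, can be made concrete with a toy catalog. Everything in the sketch below is an assumption made for illustration: the three records, their rough bounding boxes, and the hand-written synonym table that stands in for an ontology. It does not reproduce how Data.gov or any cited system implements search.

```python
# Tiny invented catalog. Bounding boxes are rough, illustrative values
# given as (min_lon, min_lat, max_lon, max_lat).
CATALOG = [
    {"title": "Street centerlines of Knoxville", "bbox": (-84.2, 35.8, -83.7, 36.1)},
    {"title": "Road network of Arizona", "bbox": (-114.8, 31.3, -109.0, 37.0)},
    {"title": "Earthquake epicenters of Japan", "bbox": (129.0, 30.0, 146.0, 46.0)},
]

# Hand-written synonym table standing in for an ontology (an assumption).
SYNONYMS = {"road": {"road", "street"}, "street": {"street", "road"}}


def keyword_search(query):
    """Plain keyword matching: the query word must appear in the title."""
    q = query.lower()
    return [r["title"] for r in CATALOG if q in r["title"].lower().split()]


def expanded_search(query):
    """Query expansion: also match titles that contain a related term."""
    q = query.lower()
    terms = SYNONYMS.get(q, {q})
    return [r["title"] for r in CATALOG if terms & set(r["title"].lower().split())]


def bbox_search(area):
    """Map-based filter: keep records whose bounding box intersects the area."""
    min_x, min_y, max_x, max_y = area
    return [
        r["title"] for r in CATALOG
        if r["bbox"][0] <= max_x and r["bbox"][2] >= min_x
        and r["bbox"][1] <= max_y and r["bbox"][3] >= min_y
    ]


print(keyword_search("road"))    # misses the record labeled "street"
print(expanded_search("road"))   # finds both the road and the street record
# Rough box around the contiguous United States (illustrative coordinates).
print(bbox_search((-125.0, 24.0, -66.0, 50.0)))
```

Query expansion is only one of the semantic approaches mentioned above, and a synonym table is a much weaker resource than an ontology; the sketch is meant to show why “road” and “street” need to be connected somewhere before a search function can treat them as related.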

(a) (b) (c)

Figure 4. Search examples based on Data.gov.

3. Examples of SDIs

There are many spatial data infrastructures established at different geographic levels. At the global level, there is the Global Earth Observation System of Systems, which combines the efforts from more than 70 countries to share environmental data. At the continental level, there is the Infrastructure for Spatial Information in the European Community (INSPIRE), which enables the sharing of geospatial information among public organizations across Europe. At the national level, there is Data.gov, which provides access to a large number of United States government open datasets. The National Map (TNM) is a USGS SDI project which supports easy access and downloading of topographic information about elevation, geographic names, hydrology, boundaries, transportation, and so forth. There are also national SDIs in Australia, China, Japan, Malaysia, the Netherlands, Portugal, and other countries. At the state level, there is the Tennessee GIS portal, which offers the geospatial datasets, services, and Web applications related to the State of Tennessee. At the city level, there is the New York City Open Data portal. There are also spatial data infrastructures dedicated to specific domains, such as disaster response, public health, and climate change.

4. Current and Possible Future Developments of SDIs

Spatial data infrastructures heavily rely on computer and information technologies, and are continuously evolving with the technological advancements. Techniques commonly used in today’s SDIs, such as Asynchronous JavaScript and XML (AJAX), which enables asynchronous processing to speed up searches and improve the user experience, were not available for the first generation of SDIs in the 1990s. Similarly, we may see the emergence of new technologies that can improve SDIs in various aspects, and some of these technologies are already being tested in research labs. For example, graph databases, such as Neo4j, may be employed to augment the existing relational databases that have been widely used in SDIs to store metadata. Semantic Web (or the third generation of the Web) and Linked Data technologies may be leveraged to turn SDIs into local Semantic Webs (Athanasis et al. 2009).
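
The idea of holding metadata as a graph instead of rows in relational tables can be sketched with subject-predicate-object triples, the basic unit of Linked Data. The example below uses plain Python lists in place of a graph database or an RDF store; the identifiers, predicates, and facts are invented, and the traversal assumes that the `partOf` relation contains no cycles.

```python
# Each fact is a (subject, predicate, object) triple.
# Identifiers and predicates are invented for illustration.
TRIPLES = [
    ("dataset:tn_watersheds", "covers", "place:Tennessee"),
    ("dataset:tn_roads", "covers", "place:Tennessee"),
    ("dataset:tn_watersheds", "hasTheme", "theme:Hydrology"),
    ("place:Tennessee", "partOf", "place:UnitedStates"),
    ("place:Knoxville", "partOf", "place:Tennessee"),
    ("dataset:knox_parcels", "covers", "place:Knoxville"),
]


def objects(subject, predicate):
    """All objects linked from a subject by a predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects linked to an object by a predicate."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def datasets_within(place):
    """Datasets covering a place or any place that is, transitively, part of it.

    Assumes the partOf relation is acyclic; a cycle would recurse forever.
    """
    found = subjects("covers", place)
    for child in subjects("partOf", place):
        found.extend(datasets_within(child))
    return found


print(objects("dataset:tn_watersheds", "hasTheme"))
print(datasets_within("place:UnitedStates"))
```

Following links in this way lets a query about the United States reach a dataset that is labeled only with Knoxville, which is the kind of inference that is awkward to express when metadata records are isolated rows.
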
The concept of knowledge graph (Singhal 2012) may be applied and integrated into SDIs to construct knowledge bases that link the geospatial entities and concepts contained in the data. Machine learning and data mining methods, such as latent semantic analysis and labeled latent Dirichlet allocation, may help automatically infer and generate high-quality metadata from the content of maps and services (Li, Bhatia and Cao 2015, Hu et al. 2015). While there can be many possible technological improvements, future SDIs also need critical support from people, organizations, and governments. With these important components, a future SDI will contribute more to the development of society.

Additional Resources

1. The Federal Geographic Data Committee: http://www.fgdc.gov
2. Open government movement: https://obamawhitehouse.archives.gov/open
3. Global Earth Observation System of Systems: http://www.geoportal.org
4. Infrastructure for Spatial Information in the European Community: http://inspire.ec.europa.eu
5. Data.gov: https://www.data.gov
6. The National Map: https://nationalmap.gov
7. Tennessee GIS portal: http://tnmap.tn.gov
8. New York City Open Data portal: https://nycopendata.socrata.com

Learning Objectives

1. Define spatial data infrastructure.
2. Describe the major components of a typical SDI.
3. Describe the main functions of geoportals.
4. Define metadata, and describe the types of information that may be included in metadata.
5. Differentiate text-based search and map-based search.
6. List some of the widely-recognized SDIs.
7. Describe the role of standards in ensuring the quality of metadata.


Instructional Questions

1. What are the societal benefits of spatial data infrastructures? In which aspects can SDIs improve the sharing and reusing of geospatial data and services?
2. What are the advantages and disadvantages of text-based and map-based searches? What are the differences between keyword-based and semantic searches?
3. How can metadata benefit the understanding and use of geospatial data and services? How can metadata facilitate the search and discovery of digital geospatial resources?

Bibliography

Athanasis, N., K. Kalabokidis, M. Vaitis & N. Soulakellis (2009) Towards a semantics-based approach in the development of geographic portals. Computers & Geosciences, 35, 301-308.

Georgiadou, Y., O. Rodriguez-Pabón & K. T. Lance (2006) Spatial data infrastructure (SDI) and e-governance: A quest for appropriate evaluation approaches. URISA Journal, 18, 43.

Goodchild, M. F. (2007) Citizens as sensors: the world of volunteered geography. GeoJournal, 69, 211-221.

Hostetter, C. (2006) Faceted searching with Apache Solr. ApacheCon US, 2006.

Hu, Y., K. Janowicz, S. Prasad & S. Gao (2015) Metadata Topic Harmonization and Semantic Search for Linked-Data-Driven Geoportals: A Case Study Using ArcGIS Online. Transactions in GIS, 19, 398-416.

Li, W., V. Bhatia & K. Cao (2015) Intelligent polar cyberinfrastructure: enabling semantic search in geospatial metadata catalogue to support polar data discovery. Earth Science Informatics, 8, 111-123.

Li, W., S. Wang & V. Bhatia (2016) PolarHub: A large-scale web crawling engine for OGC service discovery in cyberinfrastructure. Computers, Environment and Urban Systems, 59, 195-207.

Li, W., X. Zhou & S. Wu (2016) An Integrated Software Framework to Support Semantic Modeling and Reasoning of Spatiotemporal Change of Geographical Objects: A Use Case of Land Use and Land Cover Change Study. ISPRS International Journal of Geo-Information, 5, 179.

Longley, P. A., M. F. Goodchild, D. J. Maguire & D. W. Rhind (2001) Geographic Information Systems and Science. England: John Wiley & Sons, Ltd, 327-329.

Lutz, M. & E. Klien (2006) Ontology-based retrieval of geographic information. International Journal of Geographical Information Science, 20, 233-260.

Maguire, D. J. & P. A. Longley (2005) The emergence of geoportals and their role in spatial data infrastructures. Computers, Environment and Urban Systems, 29, 3-14.

Masser, I. (1999) All shapes and sizes: the first generation of national spatial data infrastructures. International Journal of Geographical Information Science, 13, 67-84.

National Research Council (1993) Toward a coordinated spatial data infrastructure for the nation. National Academies Press.


Rajabifard, A. & I. P. Williamson (2001) Spatial data infrastructures: concept, SDI hierarchy and future directions. In GEOMATICS'80 Conference. Tehran, Iran.

Ralston, B. A. (2004) GIS and public data. Thomson/Delmar Learning.

Rose, L. (2004) Geospatial portal reference architecture: a community guide to implementing standards-based geospatial portals. OpenGIS Discussion Paper, OGC, 04-039.

Singhal, A. (2012) Introducing the knowledge graph: things, not strings. Official Google blog.

Tait, M. G. (2005) Implementing geoportals: applications of distributed GIS. Computers, Environment and Urban Systems, 29, 33-47.

