

Spatial Data Extraction and Multi Query Optimization for Accessing Location Based Services for Everyday Essentials

R. Jeberson Retna Raj1, T. Sasipraba2, Surya Narayanan3, Sundeep Teja4, Farhaan5 and Ankit Kumar6

1Department of IT, Sathyabama University, Chennai 600119, India.

[email protected]

January 5, 2018

Abstract

Location Based Services for "Everyday" Essentials (LBSEE) is a GIS-based information system that provides the details of all the essential services available in Chennai city. The system is accessible online for looking up information on schools & colleges, banks & ATMs, theatres & malls, and Government services, which include post offices, Taluk offices, RTO offices, etc. The system presents these service details in a general view and visualizes them on a map. Route information for the requested service is provided so that the user can reach the intended destination without difficulty. The user searches for a service by entering a query in the query interface; the system analyzes the query semantically and returns the relevant spatial data. Hence, the user is provided with the service details and location details through the detailed presentation views of the system. By deploying this web application, the time a user spends wandering around to locate services is significantly reduced. The shortest route provided for each location is an added feature that lets users know exactly where the service is available.

Key Words: Spatial Data Extraction, GIS, LBS

International Journal of Pure and Applied Mathematics, Volume 118 No. 17 2018, 485-497. ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version). url: http://www.ijpam.eu. Special Issue, ijpam.eu.

1 Introduction

Chennai is a densely populated city covering 600 square kilometres of urban landscape, with integrated transport and other essential networks. It has well-connected road and rail networks that give the public access to various parts of the city. The city is the headquarters of the state of Tamil Nadu, and the Government, corporate and business houses conduct their day-to-day business in Chennai. The city acts as a hub for sectors such as industry, manufacturing, IT, automobiles, handloom, power, education, higher studies, and research & development, so the general public visits Chennai to make use of these facilities. Surveys show that, on average, one lakh (100,000) people per day, from both Tamil Nadu and other parts of the country, visit Chennai to use Government services such as passport, registration and birth/death certificate services, and to visit tourist places, corporate offices and hospitals. It is always difficult for a person who is new to Chennai to identify these services. Furthermore, it is of paramount importance to know where and how to get the right essential services and how to reach them without others' help. Therefore, the need of the hour is a semi-automated system for finding and locating these services through the web and mobile devices. To meet this demand, the proposed Location Based Services for Everyday Essentials (LBSEE) system has been introduced.

1.1 Objectives

• To develop a good classification technique for automatic classification of unstructured input from the volunteered data.


• To design a query processing technique to execute the user request.

• To devise a data visualization mechanism to present the data to the user.

• To propose a novel knowledge discovery technique to extract the essence of the database.

1.2 Unique Features of the Project

• Provides detailed information about all the schools & colleges, universities, banks & ATMs, post offices, Taluk offices, and theatres & malls in the 600 square kilometre area of Chennai city.

• Displays the shortest path between two specified locations.

• Locates all the essential services, including schools, colleges, universities and banks, based on the facilities available.

• Searches for any essential service based on the user's locality.

• A detailed record set of the users using the service is maintained at the back end.

• Currently the project is in the process of being hosted for public access through the website www.chennaiessentials.co.in.

The system is designed in such a way that it automatically identifies the current location of the user by using the user's network portal. To make the system more convenient for the user, it offers two kinds of search:

• Basic search

• Advanced search


1.3 Basic Search

The basic search caters to users who are not familiar with IT. The user interface contains a drop-down list of all the major places in Chennai city, such as T. Nagar, Adyar and Velachery. Basic search works in two ways. Initially, the system automatically identifies the user's location whenever the application is loaded on the user's PC or mobile device. The user then selects the service to be accessed. The system lists the details of all relevant providers, and from that list the user navigates to the required service. The detailed information for a service includes route information and a location marker that marks the exact location of the service on the map. When the user drags or clicks the location marker, a message box on the map shows details such as address, phone number, website link and available facilities.

2 Proposed Architecture

2.1 Methodology

• Collection of service providers' details from the volunteered data sources.

• Construction of GIS database.

• Classification of unstructured data input.

• Development of data mining algorithm for automatic classification.

• Query processing technique for executing users' complex query requests.

• Development of knowledge discovery technique for extracting the essence of the data.

• Design of a user-friendly GUI for user interaction.

• Data visualization mechanism to present the retrieved spatial data to the end user through the map.


• Testing and validation of the communication ability, spatial query processing and response generation.

Figure 1: Schematic view of LBSEE system

The architecture of the proposed system is shown in Figure 1. The system consists of the following modules:

• Data Collection

• Integration of various layers

• Query processing

• Spatial data retrieval

• Visualization on the map

2.2 Data collection

The service providers' details are collected from open sources such as spotico, skyhook and Google. The implementation covers 600 sq. km of greater Chennai city. The essential service providers include all Government and private schools and colleges, all nationalized banks and ATMs, theatres and malls, Government Taluk and Tahsildar offices, registration offices, courts, electronics shops, hotels and restaurants, etc. Detailed information about each service is collected, and a database is constructed to store this large volume of data. For a bank, for example, the stored details include the name of the bank, whether it is public or private, address, locality, area landmark, latitude and longitude, loan availability, locker facility, website and phone number. The system processes this unstructured data and transforms it into structured data [7].
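As an illustration of the structured form described above, the bank attributes listed in the text can be held in a typed record. This is only a sketch: the class name `BankRecord`, the helper `parse_bank`, the input keys and the sample values are hypothetical and are not taken from the LBSEE database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BankRecord:
    """One structured row for a bank, using the attributes listed in the text."""
    name: str
    is_public: bool          # True for a public bank, False for a private one
    address: str
    locality: str
    landmark: str
    latitude: float          # decimal degrees
    longitude: float         # decimal degrees
    loan_available: bool
    locker_facility: bool
    website: Optional[str] = None
    phone: Optional[str] = None


def parse_bank(raw: dict) -> BankRecord:
    """Convert a loosely typed dict of strings into a BankRecord."""
    def yes(value) -> bool:
        return str(value).strip().lower() in {"yes", "y", "true", "1"}

    return BankRecord(
        name=raw["name"].strip(),
        is_public=raw.get("sector", "").strip().lower() == "public",
        address=raw.get("address", "").strip(),
        locality=raw.get("locality", "").strip(),
        landmark=raw.get("landmark", "").strip(),
        latitude=float(raw["latitude"]),
        longitude=float(raw["longitude"]),
        loan_available=yes(raw.get("loan", "no")),
        locker_facility=yes(raw.get("locker", "no")),
        website=raw.get("website") or None,
        phone=raw.get("phone") or None,
    )


# Made-up sample input for illustration only.
sample = {
    "name": " Example Bank ", "sector": "Public",
    "address": "1 Example Road", "locality": "Adyar", "landmark": "Near signal",
    "latitude": "13.0067", "longitude": "80.2570",
    "loan": "Yes", "locker": "no",
}
record = parse_bank(sample)
```

Keeping latitude and longitude as numeric fields means the same record can be used directly for distance calculations and for placing a marker on the map.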

2.3 Integration of various layers

The satellite image is digitized into various layers, including the road and street network, and a separate layer is created for each service so that the service details are shown on the map at the service's latitude and longitude. The latitude and longitude values of service providers are collected from volunteered data sources and stored in the GIS database. Where a provider's latitude and longitude are not available, they are collected manually at ground level with a Trimble GPS and stored in the GIS database.

2.4 Query processing

The user query is processed semantically in several steps, such as tokenization, named entity extraction and parsing. The query is first pre-processed by removing stop words and connectives so that a meaningful query remains. The system uses the spherical law of cosines for data extraction, based on a Euclidean distance technique, to find the nearest and neighbouring providers, and a LIKE query technique to filter the data in the GIS database [1].
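The combination of a LIKE filter and a spherical-law-of-cosines distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual code: Python's built-in `sqlite3` stands in for the MySQL back end, and the table `providers`, its columns, the function names and the sample coordinates are all hypothetical.

```python
import math
import sqlite3

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def cosine_law_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines (degrees in, km out)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def nearest_providers(conn, keyword, user_lat, user_lon, limit=3):
    """Filter providers by name with LIKE, then rank them by distance."""
    rows = conn.execute(
        "SELECT name, latitude, longitude FROM providers WHERE name LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()
    ranked = sorted(
        ((name, cosine_law_distance_km(user_lat, user_lon, lat, lon))
         for name, lat, lon in rows),
        key=lambda pair: pair[1],
    )
    return ranked[:limit]


# In-memory table with made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE providers (name TEXT, latitude REAL, longitude REAL)")
conn.executemany("INSERT INTO providers VALUES (?, ?, ?)", [
    ("RTO Office A", 13.00, 80.25),
    ("RTO Office B", 13.10, 80.25),
    ("Post Office C", 13.01, 80.25),
])
results = nearest_providers(conn, "RTO", 13.01, 80.25)
```

Here the LIKE clause narrows the candidates to the requested service type, and the distance function orders the remaining rows so that the nearest provider comes first.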

Stemming is the process of removing prefixes and suffixes from the query terms; unwanted stop words are removed as well. The stemming algorithm is applied to the user query, and the noise (e.g. a, was, need, value, of, from, to) is removed before the query is processed. Stemming matters because it reduces the time needed to index the results, aids file compression and increases search efficiency [2][3].


E.g. input query: find RTO office near adyar.

// List of stop words to be removed after tokenization.

StopWords = "a", "able", "about", "above", "according", "accordingly", "across", "actually", "after", "afterwards", ..., I, ..., need, ..., etc.

After tokenization and removal of stop words, the input query becomes: RTO, office, adyar. It is therefore easy for the discovery agent to select the relevant service for the end user.
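The tokenization and stop-word removal in this example can be sketched in a few lines. The stop-word set below is a small illustrative subset, since the paper's full list is elided; in particular, including "find" and "near" is an assumption made so that the sketch reproduces the result stated above. The function name `extract_key_tokens` is hypothetical.

```python
import re

# Small illustrative stop-word set; "find" and "near" are assumed entries.
STOP_WORDS = {"a", "able", "about", "above", "i", "need", "find", "near",
              "was", "of", "from", "to", "the", "in"}


def extract_key_tokens(query):
    """Split the query into alphanumeric tokens and drop stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+", query)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


print(extract_key_tokens("find RTO office near adyar."))
# prints ['RTO', 'office', 'adyar']
```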

2.5 Processing of unstructured data

The system is designed in such a way that it can process unstructured data input. For present and future data processing, the proposed mechanism helps to update the database automatically. For example, news from online sources, such as theatres opening in Chennai or other items related to the essential services, is processed and used to update the GIS database. Figure 2 shows the systematic view of processing the unstructured data input.

The unstructured information is found in the public domain in the form of text, PDF or spatial data. The rule engine holds all the rules for extracting key information in the information domain. The key pattern for data extraction is clearly defined, and equivalent rules are framed so that the relevant key information can be extracted from the information sources. For example,

Rule No: 1
Pattern: Theatre in Chennai.
Key phrase: Theatre in Chennai opened.
Key phrase: new theatre is inaugurated.

Rule No: 2
Pattern: Bank in Chennai.
Key phrase: new bank branch opened.
Key phrase: ATM opened.

The results are divided into three groups: exactly matched, partially matched and no match.
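One way to picture the three result groups is a small matcher over the rules listed above. The paper does not define how a partial match is decided, so the criterion used here (at least half of a key phrase's words appear in the sentence) is an assumption made for illustration, as are the names `RULES`, `words` and `classify`.

```python
import re

# Key phrases taken from the two example rules in the text.
RULES = {
    "Theatre in Chennai": ["theatre in chennai opened", "new theatre is inaugurated"],
    "Bank in Chennai": ["new bank branch opened", "atm opened"],
}


def words(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(sentence, key_phrases, partial_threshold=0.5):
    """Return 'exact', 'partial' or 'no match' for a sentence against one rule."""
    sent = words(sentence)
    sent_set = set(sent)
    best = "no match"
    for phrase in key_phrases:
        p = words(phrase)
        # Exact: the key phrase occurs as a contiguous run of words.
        if any(sent[i:i + len(p)] == p for i in range(len(sent) - len(p) + 1)):
            return "exact"
        # Partial (assumed criterion): enough of the phrase's words are present.
        overlap = sum(1 for w in p if w in sent_set) / len(p)
        if overlap >= partial_threshold:
            best = "partial"
    return best
```

For example, `classify("A new theatre is inaugurated in Chennai", RULES["Theatre in Chennai"])` falls in the exact group, while a sentence such as "A bank opened in Adyar" only partially matches the bank rule.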

The extracted tokens are parsed and the key tokens are generated. A clustering technique is applied to the tokens and they are classified. Initially, the data is clustered with the K-means clustering algorithm, and an integrated hierarchical algorithm is then applied for efficient clustering [4].

The mean similarity of the list of terms is taken into account: terms related to a similar domain are clustered together based on the mean value. After the tokenization process, a domain-specific mechanism is applied [5].
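The idea of grouping terms around a mean value can be illustrated with a one-dimensional K-means (Lloyd's algorithm). This is a sketch only: the paper does not specify the similarity measure, so the scores below are made-up numbers, the function name `kmeans_1d` is hypothetical, and the subsequent hierarchical step is not shown.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Cluster scalar values around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [sum(c) / len(c) if c else centroids[k]
                         for k, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical similarity scores of six terms to some reference domain.
scores = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
centres, groups = kmeans_1d(scores, centroids=[0.0, 1.0])
```

With these inputs the low-similarity and high-similarity terms end up in separate groups, each summarized by its mean.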

Figure 2: Pre-processing of unstructured input into structured data

The score value can be used in the data extraction process for defining the fully matched, partially matched and no match sequences.

$$\mathrm{Weight}_i = \sum_{j=1}^{n} \mathrm{Score}(\mathrm{Sentence}_i, \mathrm{Rule}_j)$$

$$\mathrm{Score}(\mathrm{Sentence}_i, \mathrm{Rule}_j) = \begin{cases} 1 & \text{if } \mathrm{Sentence}_i \text{ is exactly matched with } \mathrm{Rule}_j \\ 0 & \text{if } \mathrm{Sentence}_i \text{ is not exactly matched with } \mathrm{Rule}_j \end{cases}$$

where n is the total number of rules.
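The weight defined above is a count of the rules that a sentence matches exactly, which the following sketch computes. The text does not spell out what "exactly matched" means, so it is interpreted here as equality after trimming whitespace and ignoring case; that interpretation and the function names are assumptions.

```python
def score(sentence, rule):
    """Return 1 if the sentence exactly matches the rule text, else 0."""
    return 1 if sentence.strip().lower() == rule.strip().lower() else 0


def weight(sentence, rules):
    """Sum the scores of one sentence over all n rules."""
    return sum(score(sentence, rule) for rule in rules)


# Key phrases from the example rules in Section 2.5.
rules = ["new theatre is inaugurated", "ATM opened", "new bank branch opened"]
```

For instance, `weight("ATM opened", rules)` evaluates to 1 because exactly one rule matches, while a sentence that matches no rule has weight 0.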

3 Experimental Results

The system is implemented as both a web-based service and an Android mobile application. The mobile implementation of the proposed system is shown in Figures 3 and 4(a) to 4(d). Google Maps is used for updating the spatial and attribute data for all essential services. A GUI is provided for specifying the user query; the system processes the request semantically and the result is visualized on the map, so that the user gets directions as well as details from the system. For mobile users, a separate Android application has been created; the user can download the application and access the services. Two modes of access are provided: in the first, the user specifies a query; in the second, the user locates the place on the map and selects the required service from the list. The web-based system is implemented with PHP as the front end and MySQL as the back-end storage. The shortest distance between source and destination is shown on the map.

Figure 3 shows the system listing the banks available in the vicinity of the selected destination. The user then selects the required bank.


Figure 3: Route detail from user's current location to the service location

Figure 4(a) shows the current location of the user. The system automatically detects the latitude and longitude of the user's location, and the interface is designed in such a way that the user can select the required services. Figure 4(b) shows the list of essential services selected by the user, and Figure 4(c) shows the list of schools in and around the user's location. Figure 4(d) shows the route map for the selected service.

4 Conclusion

LBSEE is presented in this paper. The system provides the details of all the essential services available in Chennai city. It is accessible online for looking up information on schools & colleges, banks & ATMs, theatres & malls, and Government services, which include post offices, Taluk offices, RTO offices, etc. The system presents these service details in a general view and visualizes them on a map. Route information for the requested service is provided so that the user can reach the intended destination without difficulty. The user searches for a service by entering a query in the query interface; the system analyzes the query semantically and returns the relevant spatial data. The shortest route provided for each location is an added feature that lets users know exactly where the service is available.

5 Acknowledgement

We thank the Department of Science and Technology, NRDMS, Government of India, for the financial assistance provided to implement the project.

References

[1] Wenwen Li, Michael F. Goodchild, Richard L. Church, and Bin Zhou, Geospatial Data Mining on the Web: Discovering Locations of Emergency Service Facilities, Springer-Verlag Berlin Heidelberg, 2012.

[2] Y Jaya Babu, G J Phani Bala, Siva RamaKrishna T, Extracting Spatial Association Rules from the Maximum Frequent Item sets Based on Boolean Matrix, International Journal of Engineering Science & Advanced Technology, 2012.

[3] Jingtian Jiang, Nenghai Yu, Chin-Yew Lin, FoCUS: Learning to Crawl Web Forums, WWW 2012 Industrial Track, Lyon, France, 2012.

[4] Hongsheng Li, Jiping Liu, Yong Wang, Qingyuan Li, Research on Problem-Based Spatial and Non Spatial Information Search Methods.

[5] Weiwei Sun, Chunan Chen, Baihua Zheng, An Air Index for Spatial Query Processing in Road Networks, IEEE Transactions on Knowledge and Data Engineering, VOL. X, NO. X, X 2013.

[6] H. Kriegel, P. Kroger, P. Kunath, M. Renz, and T. Schmidt, Proximity queries in large traffic networks, GIS, 2007, p. 21.

[7] R. Jeberson Retna Raj, T. Sasipraba, Semantic Retrieval of Spatial Objects on Location Based Services For Everyday Essentials, ARPN Journal of Engineering and Applied Sciences, ISSN 1819-6608, VOL. 10, NO. 5, MARCH 2015, pp. 2033-2036.
