http://www.iaeme.com/IJARET/index.asp 737 [email protected]

International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 11, Issue 10, October 2020, pp. 737-750, Article ID: IJARET_11_10_076

Available online at

http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10

ISSN Print: 0976-6480 and ISSN Online: 0976-6499

DOI: 10.34218/IJARET.11.10.2020.076

© IAEME Publication Scopus Indexed

INTEGRATION OF TECHNIQUES WITH DATA MINING FOR COMMUNITY PROJECT DATA ANALYSIS FROM PROJECT SPECIFIC CHARACTERISTICS

Tipaporn Supamid

School of Information Technology, Sripatum University, India

Surasak Mungsing

School of Information Technology, Sripatum University, India

ABSTRACT

Thailand's development under the "Digital Thailand" strategy and the Thailand 4.0 policy is prompting a large number of government agencies to initiate and push forward projects. However, the needs of communities and the readiness of the proposing agencies differ, and because of the restrictions of each department, not all requested projects can be presented and carried out at the same time. Projects that exhibit the characteristics of the Smart City / Smart Community concept are in line with the national strategy and development policies and therefore have the opportunity to be considered for support. This research presents methods for analyzing project data and classifying which projects have characteristics according to the Smart City / Smart Community concept, and of which type. The study uses data from 30 projects with 520 attributes. The data mining techniques are Decision Tree, Rule-Based, Naïve Bayesian and Decision Tree-Rule-Based (DT-RB). The results show that projects can be classified according to the characteristics used as indicators with 86.78% accuracy.

Keywords: Smart City, Smart Community, Rule mining, Project Attributes.

Cite this Article: Tipaporn Supamid and Surasak Mungsing, Integration of techniques

with data mining for community project data analysis from project specific

characteristics, International Journal of Advanced Research in Engineering and

Technology, 11(10), 2020, pp. 737-750

http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10

1. INTRODUCTION

A Smart Community is a community developed around intelligent community management systems. It uses information and communication technology, Internet of Things technology, and Big Data collected by the system to develop various aspects of the community, including energy, environment, health, and community management. These areas must be systematically linked in order to maximize efficiency. A key component is energy: the development of an intelligent electric network, also known as a Smart Grid, which is an electrical network that uses information and communication technology to control power distribution efficiently and to increase security and safety.

The 20-year strategy builds on the development of Thailand since the 1st National Economic and Social Development Plan, which has advanced the country in every dimension. Economically, Thailand has been elevated to the upper group of middle-income countries; socially, the quality of life of the people has improved; and environmentally, Thailand has an advantage in ecological diversity.

The Thailand 4.0 policy focuses on creating competitiveness in the main industries in line with Thailand's capabilities and the needs of the world market, with a civil-state mechanism as the driving mechanism. Under this policy, the private sector and the public must adapt to be able to compete with maximum efficiency. The policy includes community development focusing on:

1. Participation in the community and eating well

2. Education in the Smart Living Center community

3. Health and wellness

4. Security and safety

5. Healthy design

2. LITERATURE REVIEW AND RELATED RESEARCH

2.1. Smart Community

Smart community development follows concepts consistent with smart city development, for which the Digital Economy Promotion Agency (DEPA) has classified Smart City into 7 categories (DEPA, 2562 B.E.):

1. Smart Mobility is the use of information and communication technology to assist in managing transportation and traffic systems, in order to increase the efficiency of the road system and of public transportation, increase security, and reduce traffic congestion.

2. Smart Economy is the creation of a strategy to make Khon-Kaen the economic and production center of the region, using smart solutions to support tourists through facilities and public services that connect to the transportation networks of the surrounding countries. City information is provided thoroughly through smart security to build confidence among domestic and international tourists, for example by applying smart solutions to promote local traditions and festivals in order to spread the culture and promote tourism.

3. Smart Environment is a measure consisting of preserving the environment, forests, plants and ecosystems; agricultural promotion; in-city food production areas; parks and green space; water management; and the handling of water pollution, air pollution and the heat island phenomenon.

4. Smart Governance is a measure consisting of the principles of intelligent city leadership, strategy and organization structure, management processes, and a success measurement system.

5. Smart Energy balances the energy system of the society, especially clean energy such as solar and wind energy, which can be stored in batteries for backup and used immediately when there is a shortage. In addition, the society has an energy monitoring center to ensure proper use according to demand and supply at that time.

6. Smart Living is a way to improve the quality of life of people in the community. This category contains many interesting projects. There are currently many elderly people in society, and because of changing lifestyles many seniors have to live at home alone while their children are out working. Smart living platforms should therefore be developed, focusing on a complete ecosystem that can continuously create new innovative services. This will encourage entrepreneurs to form a value chain together to address the problem of how to care for the elderly efficiently and to strengthen the health services of the population sustainably. In addition, it will create an environment and innovations conducive to living in an aging society and create opportunities for social equality.

7. Smart People refers to a way of life that changes continuously. As modern technology is used continuously in daily life, digital technology becomes an important mechanism for development in various fields, which requires constant learning. A smart person must be analytical, be able to solve problems, have information technology skills and creative and innovative skills, know how to integrate science and knowledge from various fields for benefit, and be able to adjust to living with others happily in a digital society.

2.2. Identifying which projects are smart city projects

2.2.1. Determining the type of project characteristics with data mining techniques

Data mining is a process that searches a large amount of data for hidden patterns and relationships. It has been applied to many types of work: in business, where it supports executive decisions, in science and medicine, and in economy and society. It can be seen as an evolution of data storage and interpretation, from simple data storage to storage in databases from which information can be retrieved and mined to find the knowledge hidden in the data.

2.3. Data mining classification helps to discover the relationship rules of the data used in the analysis

Finding relationship rules is known as association rule mining (Rule Mining Algorithm). It searches a large data set for relationships among the data in order to find frequent patterns, which are then used to analyze the relationships in the data and to predict various phenomena. The data used in association mining are stored as a transaction database, from which the associated results are obtained. A rule can be written as one set of items (the cause) that leads to another set of items (the effect). In this study, the database comes from local administrators' analyses of the population, covering items such as the development of village routes and the installation of smart light bulbs.
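As a rough illustration of how a rule of the form "item set X leads to item set Y" can be scored, the sketch below computes support and confidence, the two standard measures in association rule mining. The source does not name these measures, and the transactions and item names are hypothetical placeholders, not the study data.

```python
# Hypothetical transactions: each set lists items proposed together in one
# community project (item names are illustrative, not from the study data).
transactions = [
    {"village_road", "smart_light"},
    {"village_road", "smart_light", "cctv"},
    {"village_road"},
    {"smart_light", "cctv"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


print(support({"village_road", "smart_light"}, transactions))
print(confidence({"village_road"}, {"smart_light"}, transactions))
```

A frequent pattern is then simply an item set whose support exceeds a chosen threshold; the threshold itself would have to be set for the data at hand.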

2.4. Decision Tree

Decision Tree is a technique for estimating a discrete-valued function by using a tree. The resulting model consists of a set of rules in "if-then" form. To build the tree, the attribute most strongly related to the class is selected as the top of the tree, or root node. The next attributes are then found in turn by measuring the relationship between attribute and class with a measure called Information Gain (IG), which is calculated as:

\[ IG(\mathit{parent}, \mathit{child}) = \mathrm{entropy}(\mathit{parent}) - \left[ p(c_1)\,\mathrm{entropy}(c_1) + p(c_2)\,\mathrm{entropy}(c_2) + \cdots \right] \]

where

\[ \mathrm{entropy}(c_1) = -p(c_1) \log p(c_1) \]

and $p(c_1)$ is the probability of $c_1$. The IG of each attribute with respect to the class can then be calculated, and the attribute with the highest IG becomes the root of the decision tree.
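The Information Gain calculation can be sketched in a few lines. This is a minimal illustration, assuming base-2 logarithms and the usual definition of entropy as a sum over all classes in a node (the text leaves both unspecified); the labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """entropy(parent) minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(c) / total * entropy(c) for c in children)
    return entropy(parent) - weighted


# Hypothetical class labels before and after splitting on one attribute.
parent = ["smart", "smart", "other", "other"]
perfect_split = [["smart", "smart"], ["other", "other"]]
useless_split = [["smart", "other"], ["smart", "other"]]
print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

An attribute that separates the classes completely yields the largest gain, while one that leaves each child as mixed as the parent yields none, which is why the attribute with the highest IG is placed at the root.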

2.5. Rule-Based

A large decision tree can be complex to interpret. In such cases, rules can be derived from the constructed decision tree and expressed in "IF-THEN" form so that they are easy to understand. One rule is created for each path from the root node to a leaf node: the attributes on the branches form the "IF" part, and the leaf node forms the "THEN" part.
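The root-to-leaf conversion can be sketched as a short recursive walk. The tree below, including its attribute names and class labels, is a hypothetical example and not one learned from the study data.

```python
# A hypothetical tree: an internal node is (attribute, {value: subtree}),
# a leaf is a class label string. Attribute names are illustrative only.
tree = ("uses_clean_energy", {
    "yes": "Smart Energy",
    "no": ("manages_waste", {
        "yes": "Smart Environment",
        "no": "Not classified",
    }),
})


def extract_rules(node, conditions=()):
    """Walk every root-to-leaf path and return one IF-THEN rule per leaf."""
    if isinstance(node, str):  # leaf: emit the accumulated conditions
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node}"]
    attribute, branches = node
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + (f"{attribute} = {value}",)))
    return rules


for rule in extract_rules(tree):
    print(rule)
```

Each leaf produces exactly one rule, so the number of rules equals the number of leaves in the tree.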

2.6. Naïve Bayesian

Naïve Bayesian is a classification technique that predicts results quickly and easily while still giving good results. Based on the training data, it applies Bayes' theorem to calculate $P(H|X)$ from the values $P(H)$, $P(X|H)$ and $P(X)$ as follows:

\[ P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)} \]
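A small worked example of the formula may help. The probabilities below are hypothetical and chosen only for illustration; $P(X)$ is obtained with the law of total probability, which the text does not spell out.

```python
def posterior(prior_h, likelihood_x_given_h, evidence_x):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence_x <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood_x_given_h * prior_h / evidence_x


# Hypothetical numbers: H = "project is Smart Energy",
# X = "project installs solar cells".
p_h = 0.2              # P(H)
p_x_given_h = 0.9      # P(X|H)
p_x_given_not_h = 0.1  # P(X|not H)
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)  # total probability
print(posterior(p_h, p_x_given_h, p_x))
```

With several attributes, the naïve assumption is that they are conditionally independent given the class, so $P(X|H)$ becomes a product of per-attribute likelihoods.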

2.7. Decision Tree - Rule-Based (DT-RB)

Decision Tree - Rule-Based (DT-RB) is a hybrid technique for estimating the value of a discrete function. It builds the tree structure and expresses it in rule-based form, using the instruction set, or rules, of the tree structure to make the rules. The rule-based structure is built with the following formulas, which measure the relationship between attributes:

Formula from the Decision Tree:

\[ t_i = 1 \sum_{i=1}^{N} [P(t_i)]^2 \]

Formula from Rule-Based:

\[ t_i = 1 \sum_{i=0}^{N} [P(t_i)]^2 \log_2 P(t_i) \]

Therefore, the pattern of Decision Tree - Rule-Based (DT-RB) can be set up as the combination (+) of the two, with the following formula:

\[ DT\text{-}RB(t_i) = 1 \sum_{i=1}^{N} \sum_{i=0}^{N} [P(t_i)]^2 \log_2 P(t_i) \]

Here $P(t_i)$ is the probability of $t_i$. Each attribute can then be evaluated against the class to find the attribute with the highest value, called Information Gain (IG), which becomes the root of DT-RB.

2.8. Related research

Many studies address the study and development of smart communities, with a focus on public participation with government, energy, safety and development patterns, both in Thailand and abroad. To know which type of smart community a project belongs to, and to what degree, indicators are needed to evaluate the project. Although the concepts and forms of the global community have been applied to the national economic and social development plan 2015 - 2018, smart community development at the local level still lacks accepted and standardized indicators (Suk Sawat Natthawuttisit, Thanasukwaree and Thitaporn Sinjaroonsak, 2018).

To create participation in knowledge management, public libraries should adopt innovative approaches that allow government and communities to take part in new ways of thinking about managing public libraries (Shannon Mersand, Mila Gasco-Hernandez, Emmanuel Udoh and J Ramon Gil-Garcia, 2019).

Public participation is entering a new era of change. An increasingly connected environment and the rapid change of intelligent technology have affected city administration and community management with regard to data gathering. Data quality, privacy and security, together with public participation, are problems and challenges in making a smarter community (Mila Gasco-Hernandez, Manuel Pedro Rodríguez Bolívar and Taewoo Nam, 2019).

The management of community electric power is also challenging. To balance the demand and supply of electric energy within a smart community, guidelines have been proposed for using solar photovoltaic cells to produce electric power, together with load balancers based on various algorithms, in order to reduce the cost of electricity (Ghulam Hafeez, Nadeem Javaid, Sohail Iqbal, Farman Khan, 2017).

Studies on the energy and social policy of communities show that communities play an important role in Europe in saving energy and addressing local problems. Real-time data analysis and aggregate time spent were examined by simulating project management with a large number of small or large projects, in which duplicated management and complex environments may occur. Although the introduction of technology creates economic differences among community members and policies, these can be addressed through the creation of management models for local audits (J. Snape, 2019).

Despite the increased demand for a sustainable society and the urgency of intelligent community development, the definition of what a smart community should be, and of the way to develop one, is still insufficient. Case studies of Japanese smart community development measures have therefore been used to suggest a general development process for Japanese smart communities. The study covered five representative cases and collected 14 measures, which were classified into three categories according to the development process (M. Gondokusuma, Y. Kitagawa and Y. Shimoda, 2019).

3. RESEARCH METHODS

The research process can be divided into the following steps:

Step 1: Extract the characteristics of the project that are associated with smart city properties.

Step 2: Prepare data for project classification based on the concept of Smart City.

Step 3: Process the data with the WEKA program.

To analyze which projects qualify for classification under the smart city concept, the factors (attributes) that indicate each smart city category must be consulted. The details to be analyzed from a community project are its objectives, scope, conceptual framework and operation methods.

3.1. Data preparation

Data cleaning is the initial data processing that searches for information on the characteristics of community projects, such as project scope, project objectives, project framework and project operation methods. If the project proposal document lacks some of this information, the researcher can analyze data from other parts, namely the principles and reasons, the goals and the conclusion of the community project.

Data integration examines the attributes for irrelevant or duplicate data. The community project data set usually contains duplicate attributes. To avoid such redundancy, correlation analysis can be applied as a statistical check of whether any 2 attributes are duplicated, considering each pair of attributes from all 423 attributes.
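The pairwise redundancy check could be sketched as follows. This is a minimal illustration that assumes the Pearson coefficient as the correlation measure and a cut-off of 0.95; the text specifies neither, and the attribute columns are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def redundant_pairs(columns, threshold=0.95):
    """Return attribute pairs whose absolute correlation meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(columns), 2)
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]


# Hypothetical attribute columns (one value per project).
columns = {
    "attr_a": [1, 0, 1, 1, 0],
    "attr_b": [1, 0, 1, 1, 0],  # duplicate of attr_a
    "attr_c": [0, 1, 1, 0, 1],
}
print(redundant_pairs(columns))
```

One attribute of each reported pair can then be dropped before classification. With 423 attributes this loop examines 89,253 pairs, which is still small enough to run directly.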

Start program
Scanf("Input data of community project");
For (i = 0; i <= 7; i++)
{
    If (A = S Enr)
        If (S-PF1)
            If (Enr = Enr_1) then
            Else If (Enr_1 = Enr_1.1) then
            {
                System.out.printf("S-PA11-1n", "Smart Environment");
                Break;
            }

    If (A = S Ene)
        If (S-PF2)
            If (Ene = Ene_1) then
            Else If (Ene_1 = Ene_1.1) then
            {
                System.out.printf("S-PA11-1n", "Smart Energy");
                Break;
            }

    If (A = S Eco)
        If (S-PF3)
            If (Eco = Eco_1) then
            Else If (Eco_1 = Eco_1.1) then
            {
                System.out.printf("S-PA21-1n", "Smart Economy");
                Break;
            }

    If (A = S Liv)
        If (S-PF4)
            If (Liv = Liv_1) then
            Else If (Liv_1 = Liv_1.1) then
            {
                System.out.printf("S-PA31-1n", "Smart Living");
                Break;
            }

    If (A = S Mob)
        If (S-PF5)
            If (Mob = Mob_1) then
            Else If (Mob_1 = Mob_1.1) then
            {
                System.out.printf("S-PA41-1n", "Smart Mobility");
                Break;
            }

    If (A = S Peo)
        If (S-PF6)
            If (Peo = Peo_1) then
            Else If (Peo_1 = Peo_1.1) then
            {
                System.out.printf("S-PA51-1n", "Smart People");
                Break;
            }

    If (A = S Gov)
        If (S-PF7)
            If (Gov = Gov_1) then
            Else If (Gov_1 = Gov_1.1) then
            {
                System.out.printf("S-PA61-1n", "Smart Governance");
                Break;
            }
}
Stop program
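One possible reading of the pseudocode above is a lookup from an attribute's prefix to its Smart City category. The sketch below follows that reading. The prefixes (Enr, Ene, Eco, Liv, Mob, Peo, Gov) and the category names come from the pseudocode, while the lookup logic and the function names are assumptions made for illustration.

```python
# Prefix-to-category table; the prefixes (Enr, Ene, ...) come from the
# pseudocode, while the lookup-by-prefix logic is an assumed reading of it.
CATEGORY_BY_PREFIX = {
    "Enr": "Smart Environment",
    "Ene": "Smart Energy",
    "Eco": "Smart Economy",
    "Liv": "Smart Living",
    "Mob": "Smart Mobility",
    "Peo": "Smart People",
    "Gov": "Smart Governance",
}


def classify_attribute(code):
    """Map an attribute code such as 'Enr_1.1' to its Smart City category."""
    prefix = code.split("_", 1)[0].strip()
    return CATEGORY_BY_PREFIX.get(prefix)  # None when the prefix is unknown


def classify_project(attribute_codes):
    """Return the sorted list of categories matched by a project's attributes."""
    matched = {classify_attribute(code) for code in attribute_codes}
    matched.discard(None)
    return sorted(matched)


print(classify_project(["Enr_1.1", "Ene_1", "Xyz_2"]))
```

A table-driven lookup keeps the seven categories in one place, so adding or renaming a category does not require another nested conditional.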

3.2. Data selection

Data selection chooses the columns whose data are reasonably complete. Each column should hold the same type of value in every row, and the values in each column should not be redundant. The information should be corrected so that it is accurate and complete, and the data should be adjusted and grouped to reduce data dispersion.

A sample is then collected with a questionnaire before estimating the population mean $\mu$. For the mean with known variance, the confidence interval is $l \le \mu \le u$, where $l$ is the lower limit and $u$ is the upper limit. This is called a two-sided confidence interval, and the population mean lies within the limits with probability $P(L \le \mu \le U) = 1 - \alpha$. Here $\alpha$ is the risk that the estimate of $\mu$ is in error, and $100(1 - \alpha)\%$ is the confidence level. There are also one-sided confidence intervals: $l \le \mu$ is the lower confidence interval with lower limit $l$, and $\mu \le u$ is the upper confidence interval with upper limit $u$.

By the theory of the sampling distribution of the mean, the sample mean $\bar{x}$ has mean $\mu$ and variance $\sigma^2/n$, so that

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

When the distribution is normal, $\alpha$ is divided by 2 for the two-sided confidence interval, which can be summarized as follows:

\[ P\{-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\} = 1 - \alpha \]

Substituting for $Z$:

\[ P\left\{-Z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \le Z_{\alpha/2}\right\} = 1 - \alpha \]

Rearranging gives:

\[ P\left\{\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right\} = 1 - \alpha \]
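The final interval can be evaluated numerically. The sketch below uses only the Python standard library; the sample mean, standard deviation and sample size are hypothetical inputs, not values reported in the study.

```python
import math
from statistics import NormalDist


def two_sided_ci(sample_mean, sigma, n, alpha=0.05):
    """Return (l, u) with P(l <= mu <= u) = 1 - alpha for known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical inputs: mean score 4.0, known sigma 0.5, 30 projects.
lower, upper = two_sided_ci(4.0, 0.5, 30)
print(lower, upper)
```

Lowering `alpha` widens the interval, and increasing `n` narrows it in proportion to the square root of the sample size.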

3.3. Preparation of training and test data sets

The data are prepared to be as complete as possible and are then divided into 2 groups, training data and test data, as follows:

1. The training data set is 2/3 of the total amount of data.

2. The test data set is 1/3 of the total amount of data.
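The 2/3 and 1/3 division can be sketched as below. Shuffling before the split and the fixed seed are assumptions made so that the example is reproducible; the text does not say how records are assigned to each group, and the project names are placeholders.

```python
import random


def split_train_test(records, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of records and split it into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


projects = [f"project_{i}" for i in range(1, 31)]  # 30 placeholder records
train, test = split_train_test(projects)
print(len(train), len(test))
```

For the 30 projects mentioned in the study this gives 20 training records and 10 test records.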

Data processing: the program used in this research is WEKA, which is applied for prediction with the Decision Tree, Rule-Based, Naïve Bayesian and Decision Tree-Rule-Based (DT-RB) techniques. WEKA is popular, widely accepted data mining software and is used here to process both the training and test sets.

Data collection is the gathering of new information, whereas data compilation means using information that others have already collected or reported in various documents for further study and analysis. The data collected by the researchers are basic project data from local communities in Prachuap Khiri Khan; 30 projects are used for testing.

Data analysis is carried out by various methods, such as calculation and presentation of data, in order to achieve the objectives. The analysis separates the relevant material into sub-sections so that each part is clearly understood, and it examines the relationships between the parts to see whether the more detailed components are compatible.

Symmetric matrix and skew-symmetric matrix features are used to increase the size of the classification and to obtain highly accurate results for attributes and classes. A symmetric matrix is one that equals its own transpose:

\[ A^{T} = A \]

The members of a symmetric matrix can be examined along the main diagonal, from the top left to the bottom right: the members above and below the diagonal are equal, like a reflection in a mirror. The definition of a symmetric matrix can therefore be summarized as

\[ a_{ij} = a_{ji} \]

for every index $i$ and $j$, as detailed below:

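\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}^{T}
=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
= [a_{i,j}]_{m \times m}
\]

for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.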

Matrix $A$ has $m$ rows and $n$ columns. The size of the matrix is given by its number of rows and columns and is written $m \times n$; a matrix $A$ of size $m \times n$ has $mn$ members.

The member $a_{i,j}$ is the entry in row $i$ and column $j$ of the matrix.

By definition, the matrix $A$ is called a symmetric matrix if $A^{T} = A$.

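Let

\[
A = \begin{bmatrix} 4 & 3 & 5 & 4 \\ 5 & 5 & 4 & 3 \\ 2 & 3 & 4 & 4 \\ 4 & 4 & 3 & 5 \end{bmatrix},
\qquad
A^{T} = \begin{bmatrix} 4 & 3 & 5 & 4 \\ 5 & 5 & 4 & 3 \\ 2 & 3 & 4 & 4 \\ 4 & 4 & 3 & 5 \end{bmatrix}
\]

It can be seen that $A^{T} = A$, so $A$ is a symmetric matrix and $A - A^{T}$ is a symmetric matrix.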
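The definition $A^{T} = A$ can be checked mechanically by comparing a matrix with its transpose. The sketch below also checks the general fact that $A - A^{T}$ is skew-symmetric for any square matrix. The matrices `s` and `b` are hypothetical examples chosen for illustration and are not taken from the study.

```python
def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*matrix)]


def is_symmetric(matrix):
    """True when the matrix equals its transpose (A^T = A)."""
    return matrix == transpose(matrix)


def is_skew_symmetric(matrix):
    """True when the transpose equals the negated matrix (A^T = -A)."""
    negated = [[-value for value in row] for row in matrix]
    return transpose(matrix) == negated


def minus_transpose(matrix):
    """Return A - A^T for a square matrix."""
    t = transpose(matrix)
    return [[a - b for a, b in zip(row, t_row)] for row, t_row in zip(matrix, t)]


# Hypothetical matrices, chosen only for illustration.
s = [[4, 3, 5], [3, 5, 4], [5, 4, 4]]
b = [[1, 2], [0, 3]]
print(is_symmetric(s), is_symmetric(b), is_skew_symmetric(minus_transpose(b)))
```

For a symmetric matrix, $A - A^{T}$ is the zero matrix, which is both symmetric and skew-symmetric.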

3.4. Qualitative research

This research uses the forty-nine project references shown in Table 1 as analysis data for the classification of smart cities. The classification according to the smart city concept was then tested with input data from thirty projects, as detailed below:

Table 1 Format of the indicators used to assess status as a smart community.

Evaluation Topic    Details of Smart City indicators
1     Community solid waste is managed correctly.
2     Water is sufficient and standardized.
3     Air quality does not affect the community.
4     Environmental quality is green enough to meet standards.
5     City to Safety
6     The technology is used for waste treatment and recycling.
7     The technology is used to manage electrical energy, clean energy and renewable energy.
8     Promote inter-governmental participation in accessing government information services
9     Promote single point access to government information and services
10    Promote good governance and encourage cooperation between the state and the public sector
11    Promote the use of technology to create service innovation.
12    Promote participation of all sectors in the development of smart cities.
13    People are the center of development.
14    Smart technology development for use in public services
15    The technology is used to plan strategies and implement relevant policies under local management.
16    Information and communication technology is used in the digital infrastructure of local agencies.
17    Promote the use of transportation system services
18    Facilitating transportation services
19    Providing information to passengers
20    Traffic management
21    Providing travel information
22    Safety services in the use of public services
23    Security in the transportation network
24    The promotion of renewable energy production
25    Energy storage
26    Promoting the use of environmentally friendly vehicles
27    All buildings or establishments must have an energy index of not less than 75%.
28    Promoting the cooling system
29    The heat in the area has reduced the amount of greenhouse gases.
30    Promoting business expansion
31    Encouraging the creation of a skill level
32    Technology is used to develop the local economy.
33    Technology is used in agricultural production or to promote local sales channels
34    Promote a comprehensive health service system
35    Promote the development of a safe and systematic city system

4. RESEARCH RESULTS

The techniques were compared by testing with the WEKA program. The results are divided into 3 parts as follows:

Part 1: Technical analysis

Part 2: Evidence of DT-RB

Part 3: Proof of Symmetry Matrix

4.1. Part 1: Technical analysis

This part is a comparative analysis of the proposed technique against the Decision Tree, Rule-Based and Naïve Bayesian techniques. The proposed DT-RB technique is evaluated for the accuracy of its data mining results for Smart Community.

Figure 4.1 Smart Environment

Figure 4.1 shows the analysis results for Smart Environment. The Decision Tree technique has 92 percent accuracy, compared to 89 percent for the Rule-Based technique. Similarly, the DT-RB technique is 96 percent accurate, compared to 80 percent for the Naïve Bayesian technique.

Figure 4.2 Smart Energy.

Figure 4.2 compares the analysis results for Smart Energy. The Decision Tree technique had an accuracy of 81 percent, compared to 80 percent for the Rule-Based technique. In addition, the DT-RB technique had an accuracy of 84 percent, compared to 80 percent for the Naïve Bayesian technique.

Figure 4.3 Smart living.

Figure 4.3 compares the analysis results for Smart Living. The Decision Tree technique had an accuracy of 71 percent, compared to 64 percent for the Rule-Based technique. Similarly, the DT-RB technique had an accuracy of 76 percent, compared to 66 percent for the Naïve Bayesian technique.
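The three comparisons above can be collected into one small table and checked programmatically. The following is a minimal sketch, not part of the WEKA experiment: the dictionary `reported` and the helper `best_technique` are hypothetical names, the figures are the rounded percentages quoted in the text, and the 81 percent Smart Energy value is assumed to belong to the Decision Tree technique.

```python
# Accuracy (percent) per technique, as quoted in the text for Figures 4.1 to 4.3.
# Assumption: the 81 percent Smart Energy figure is the Decision Tree result.
reported = {
    "Smart Environment": {"Decision Tree": 92, "Rule-Based": 89,
                          "Naive Bayesian": 80, "DT-RB": 96},
    "Smart Energy": {"Decision Tree": 81, "Rule-Based": 80,
                     "Naive Bayesian": 80, "DT-RB": 84},
    "Smart Living": {"Decision Tree": 71, "Rule-Based": 64,
                     "Naive Bayesian": 66, "DT-RB": 76},
}


def best_technique(scores):
    """Return the technique name with the highest reported accuracy."""
    return max(scores, key=scores.get)


for domain, scores in reported.items():
    best = best_technique(scores)
    # Margin of the best technique over a plain Decision Tree.
    margin = scores[best] - scores["Decision Tree"]
    print(f"{domain}: {best} ({scores[best]} percent, +{margin} over Decision Tree)")
```

With these figures, DT-RB is the most accurate technique in each of the three domains.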

In addition, when the results are used to make predictions for each data set, each method is applied to assess quality for the Smart Community policy service data, and the comparison gives good accuracy, as shown in Table 1.

[Figure: bar chart of accuracy (percent); plotted values 81.44, 80.18, 80.80, 84.33]


Figure 4.4 Concluding Result of Prediction for Smart Community.

According to Figure 4.4, the smart community data are compared using 3 techniques on several data sets, such as 6 smart environment data sets and 8 other smart data sets. The technique used to predict project quality is called DT-RB. It is a hybrid algorithm that combines a decision schema with basic rules. Based on this analysis, the combination of Decision Tree and Rule-Based is predicted to be the most accurate method. Therefore, it can increase predictive accuracy for software engineering and scaling. After the symmetric matrix is formed, the data should be separated into 2 sets, a training set and a test set, which are used to analyze the prediction results for the smart community.
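The separation into a training set and a test set described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' WEKA workflow: the function names `split_train_test` and `accuracy`, the 70/30 ratio, the fixed random seed, and the toy records are all hypothetical.

```python
import random


def split_train_test(records, test_ratio=0.3, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical toy records: (attribute flag, class label).
records = [(i % 2, "smart" if i % 2 else "other") for i in range(30)]
train, test = split_train_test(records)

# A trivial hand-written rule stands in for a trained classifier here.
predicted = ["smart" if flag else "other" for flag, _ in test]
actual = [label for _, label in test]
print(len(train), len(test), accuracy(predicted, actual))
```

Only the test set is used to report accuracy, so the figure reflects records that the classifier did not see during training.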

4.2. Part 2: Proof of DT-RB

If $L$ is a regular language on the alphabet $\Sigma$, then there exists a right-linear grammar $G = (V, \Sigma, S, P)$ such that $L = L(G)$.

Proof: Let $M = (Q, \Sigma, \delta, q_0, F)$ be a DFA that accepts $L$. Assume $Q = \{q_0, q_1, \ldots, q_n\}$ and $\Sigma = \{a_1, a_2, \ldots, a_m\}$. Construct the right-linear grammar $G = (V, \Sigma, S, P)$ with $V = \{q_0, q_1, \ldots, q_n\}$ and $S = q_0$. For each transition

$$\delta(q_i, a_j) = q_k$$

of $M$, add to $P$ the production

$$q_i \to a_j q_k.$$

In addition, if $q_k$ is in $F$, add to $P$ the production

$$q_k \to \lambda.$$

First, we show that $G$ defined in this manner can generate every string in $L$. Consider $\omega \in L$ of the form

$$\omega = a_i a_j \cdots a_k a_l.$$

For $M$ to accept this string, it must make the moves

$$\delta(q_0, a_i) = q_p,$$

$$\delta(q_p, a_j) = q_r,$$

$$\vdots$$

$$\delta(q_s, a_k) = q_t,$$

$$\delta(q_t, a_l) = q_f \in F.$$

By construction, the grammar has one production for each of these $\delta$'s. Therefore, we can make the derivation

$$q_0 \Rightarrow a_i q_p \Rightarrow a_i a_j q_r \Rightarrow^{*} a_i a_j \cdots a_k q_t \Rightarrow a_i a_j \cdots a_k a_l q_f \Rightarrow a_i a_j \cdots a_k a_l$$

with the grammar $G$, and so $\omega \in L(G)$. Conversely, if $\omega \in L(G)$, then its derivation must have the form above, which implies

$$\delta^{*}(q_0, a_i a_j \cdots a_k a_l) = q_f \in F,$$

so $M$ accepts $\omega$.

The proof is complete.
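The construction used in the proof can be tried on a small automaton. The sketch below is an illustration only: the function names and the example DFA (strings over {a, b} that end in b) are assumptions introduced here and do not come from the project data. It adds one production per transition and one λ-production per final state, then checks that the grammar and the DFA agree on all short strings.

```python
from itertools import product


def dfa_to_right_linear(delta, start, finals):
    """Build the productions of a right-linear grammar from a DFA.

    delta maps (state, symbol) to the next state. A production is a pair
    (head, body), where body is (symbol, next_state), or () for lambda.
    """
    productions = []
    for (q_i, a_j), q_k in sorted(delta.items()):
        productions.append((q_i, (a_j, q_k)))  # q_i -> a_j q_k
    for q_f in sorted(finals):
        productions.append((q_f, ()))          # q_f -> lambda
    return start, productions


def dfa_accepts(delta, start, finals, word):
    """Run the DFA on the word and report whether it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


def grammar_generates(start, productions, word):
    """Follow leftmost derivations symbol by symbol, tracking reachable variables."""
    current = {start}
    for symbol in word:
        current = {body[1] for head, body in productions
                   if head in current and body and body[0] == symbol}
    # The derivation must finish with a production of the form q -> lambda.
    return any(head in current and body == () for head, body in productions)


# Hypothetical example DFA: accepts strings over {a, b} that end in b.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
finals = {"q1"}
start, productions = dfa_to_right_linear(delta, "q0", finals)

# Compare the grammar and the DFA on every string of length 0 to 4.
agree = all(
    dfa_accepts(delta, start, finals, w) == grammar_generates(start, productions, w)
    for n in range(5) for w in product("ab", repeat=n)
)
print(len(productions), agree)
```

Because the automaton is deterministic, at most one variable is reachable after each symbol, which mirrors the single derivation written out in the proof.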

4.3. Part 3: Proof of Symmetry Matrix

By definition, a matrix $A$ is called a symmetric matrix if $A^T = A$, and a skew-symmetric matrix if $A^T = -A$.

Let

$$A = \begin{bmatrix} 4 & 3 & 5 & 4 \\ 5 & 5 & 4 & 3 \\ 2 & 3 & 4 & 4 \\ 4 & 4 & 3 & 5 \end{bmatrix}, \qquad A^T = \begin{bmatrix} 4 & 5 & 2 & 4 \\ 3 & 5 & 3 & 4 \\ 5 & 4 & 4 & 3 \\ 4 & 3 & 4 & 5 \end{bmatrix}.$$

For this example $A^T \neq A$, so $A$ itself is not symmetric. However, for any square matrix $A$, $A + A^T$ is a symmetric matrix and $A - A^T$ is a skew-symmetric matrix.

To be proved:

Let $A$ be a square matrix, so that $A$ and $A^T$ can be added and subtracted.

Let $B = A + A^T$. Then

$$B^T = (A + A^T)^T = A^T + (A^T)^T = A^T + A = A + A^T = B.$$

So $A + A^T$ is a symmetric matrix.

By the same method, let $C = A - A^T$. Then

$$C^T = (A - A^T)^T = A^T - (A^T)^T = A^T - A = -(A - A^T) = -C.$$

So $A - A^T$ is a skew-symmetric matrix.

The proof is complete.
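The identities $B^T = B$ and $C^T = -C$ derived above can be checked numerically on the example matrix. The following is a minimal sketch in plain Python. It assumes the 16 printed entries of $A$ are read row by row into a 4 x 4 matrix, and the helper names are hypothetical.

```python
# Example matrix, assuming the printed entries are read row by row (4 x 4).
A = [
    [4, 3, 5, 4],
    [5, 5, 4, 3],
    [2, 3, 4, 4],
    [4, 4, 3, 5],
]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


def add(x, y):
    """Entry-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def negate(m):
    """Entry-wise negation of a matrix."""
    return [[-a for a in row] for row in m]


At = transpose(A)
B = add(A, At)          # B = A + A^T
C = add(A, negate(At))  # C = A - A^T

print(transpose(B) == B)          # checks B^T = B
print(transpose(C) == negate(C))  # checks C^T = -C
```

The same check works for any square matrix, since the proof does not depend on the particular entries of $A$.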

5. DISCUSSION OF RESEARCH RESULTS

The analysis covered project data of 30 units with 520 attributes, classified using data mining techniques. This gives comparative analysis results for the smart community project data. Classification of projects with Decision Tree - Rule Based (DT-RB) reaches an accuracy of 86.78% when project data are classified according to the concept of smart communities, which is greater than 80%. The analysis results can be a guideline for determining project data groups and can be used as a tool to help local executives decide on initiatives and projects that match the needs of the community and respond to the country's digital project development strategy.

Related research on smart community development, focusing on the participation of people, energy, safety and development models both at home and abroad, shows that it is necessary to have indicators for evaluating projects in accordance with the concepts and forms of the global community, applied to the national economic and social development of 2015 - 2018 and the strengthening of community development at the local level, and that acceptable and standardized indicators are still lacking (Suksawat Nattawut Sanit and Thitaporn Sincharoensak, 2018). On participation, other work asks how public library management should be used to create innovations so that government and communities can contribute new ideas about managing public libraries (Shannon Mersand, Mila Gasco-Hernandez, Emmanuel Udoh and J Ramon Gil-Garcia, 2019). Finally, as people enter a new era of changes in the surrounding environment and rapid change of technology, data collection, data quality, privacy and security, as well as public participation, are the problems that city administration and community management face in building more intelligent communities (Mila Gasco-Hernandez, Manuel Pedro Rodríguez Bolívar and Taewoo Nam, 2019).

6. SUMMARY OF RESEARCH FINDINGS

This research compares community project data in order to find effective indicators, using data mining techniques for project classification based on the concept of smart community. The technique that the researchers used to create the classification model is Decision Tree - Rule Based (DT-RB). The purpose of this research is to test whether the model is suitable for data classification in local projects. The experiment showed that, among the methods compared, the most effective algorithm for classifying the project data is Decision Tree - Rule Based (DT-RB), with 86.78% accuracy. Future research will use ensemble techniques that combine the Decision Tree method and the rule-based method.

REFERENCES

[1] Suksawat Natthawutisit, Thanasukwaree and Titaporn Sinjaroonsak. (2016). Analysis and design of indicators for smart community in Thailand. Bangkok. National and international conference Sripatum University, 11. 11. December 21. 2559. 1729-1738.

[2] Mersand S., Hernandez M., Udoh E., and Garcia J. "Public libraries as anchor institutions in smart communities: Current practices and future development". Proceedings of the 52nd Hawaii International Conference on System Sciences, 2019.

[3] Hernandez M., Bolívar M., and Nam T. "Introduction to the Minitrack on Smart and Connected Cities and Communities". Proceedings of the 52nd Hawaii International Conference on System Sciences, 2019.

[4] Hafeez G., Javaid N., Iqbal S., and Khan F. "Optimal residential load scheduling under utility and rooftop photovoltaic units". Energies 11 (3), 611, 2018.

[5] Snape J. "Smart community energy schemes: a case study-based model". European Council for an Energy Efficient Economy (eceee), 2019.

[6] Gondokusuma M., Kitagawa Y., and Shimoda Y. "Smart community guideline: case study on the development process of smart communities in Japan". IOP Conference Series: Earth and Environmental Science 294 (1), 012017, 2019.

[7] Office of Digital Economy Promotion (depa). (16 October 2019). Retrieved from http://www.depa.or.th

[8] Thailand 4.0, model for driving Thailand to prosperity, stability and sustainability. (16 October 2019). Retrieved from http://www.libarts.up.ac.th/v2/img/Thailand-4.0.pdf

[9] Smart Community. (16 October 2019). Retrieved from http://www.i=neighbour.com/smart_community? SLang-TH.

[10] Han J., Kamber M., and Pei J. (2016). "Data mining concepts and techniques". Third edition. pp. 1-703.

[11] National Strategy 20 years, 2018-2037. (16 October 2019). Retrieved from https://www.nesdb.go.th/download/document/SAC/NS_SumPlanOct2018.pdf