Chapter 6 A Prototype System for Distributive Regulation Compliance Checking
6.1 Introduction
To demonstrate the methodology developed in this thesis, we implemented an Internet-enabled
prototype information management system to facilitate hazardous waste compliance checking.
The prototype system is developed using a mediation model for organizing information
sources and a distributed collaborative framework for coordinating information flow among
the waste generators, TSDFs, and regulators. We focus the discussion on the underlying
information system architecture. Examples are provided to demonstrate hazardous waste
compliance checking for waste generators. The fictitious waste generator and its waste
stream data are derived from information obtained in interviews in the hazardous waste
TSDF industry. Fictitious data are used because all actual waste compliance checking data
contain private information about waste generators and TSDFs, which may not be disclosed
to researchers without the owners’ permission.
6.2 Information System Architecture
The information flow for the hazardous waste regulation compliance checking prototype
system is depicted in Figure 6.1. The regulation code is downloaded from a U.S.
government regulation repository site, in this case the official repository site
http://www.access.gpo.gov/ for the 40 CFR regulation code files. To illustrate the
distributed framework, we employ two computer servers: one for the necessary regulation
code information processing and organization, and the other for compliance assistance
designed for waste generators. The first server stores the original 40 CFR part 261 and
part 262 regulation codes and converts the plain text code files to an XML encoded
semi-structured representation format, as discussed in Section 2.2. The second server
translates the XML based CFR regulations into the corresponding regulation code objects,
which are then stored in an Oracle database for further regulation compliance checking and
information retrieval. The procedure for organizing the code objects is implemented using
the mediation approach discussed in Section 5.2. The second server also implements the
business logic for a checking service typically employed for hazardous waste generators.
The first server, i.e., Server 1 as illustrated in Figure 6.1, provides three major
functionalities: (1) retrieving the original hazardous waste regulations from one of the
U.S. government regulation repository sites, in this case the official web site
http://www.access.gpo.gov/ as the information source for the 40 CFR files, (2)
preprocessing the original files for the CFR regulation codes, and (3) transferring the
preprocessed XML formatted regulation code documents to its client, Server 2, for
compliance checking purposes. The underlying transfer mechanism used in this
system is a HyperText Transfer Protocol (HTTP) based client-server data transport
model. The procedure for the information processing on Server 1 is shown in Figure 6.2.
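The preprocessing step on Server 1 can be illustrated with a minimal sketch. It is not the thesis implementation: the input layout (one provision per line, a section index followed by the provision text) and the element and attribute names `regulation`, `provision`, `part`, and `index` are assumptions made for illustration only; the actual tag set is the one discussed in Section 2.2.

```java
import java.util.List;

// Illustrative sketch only: element and attribute names are hypothetical.
final class RegulationXmlEncoder {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Each non-empty input line is assumed to be "<index> <provision text>",
    // for example "262.11 A person who generates a solid waste ...".
    static String toXml(String part, List<String> lines) {
        StringBuilder xml = new StringBuilder();
        xml.append("<regulation part=\"").append(escape(part)).append("\">\n");
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines in the plain text file
            }
            int split = trimmed.indexOf(' ');
            String index = split < 0 ? trimmed : trimmed.substring(0, split);
            String text = split < 0 ? "" : trimmed.substring(split + 1).trim();
            xml.append("  <provision index=\"").append(escape(index)).append("\">")
               .append(escape(text))
               .append("</provision>\n");
        }
        return xml.append("</regulation>\n").toString();
    }
}
```

In a deployment along the lines described above, the resulting string would be the document that Server 1 serves to Server 2 over HTTP.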
The second server, i.e., Server 2 as illustrated in Figure 6.1, implements a number of
facilities to provide the functionalities for hazardous waste compliance checking. The
functionalities implemented include: (1) storing the preprocessed XML encoded
regulation code provision documents, translating them into SQL statements using the
Data Manipulation Language (DML) of the Oracle database, and then importing
them into the Oracle database tables as described in Section 2.4, (2) building regulation
code objects from the Oracle database tables, (3) generating compliance checking
procedures from the regulation code objects, (4) retrieving waste lists from waste
generators, (5) normalizing particular waste lists to a canonical form suitable for
compliance checking, (6) providing compliance checking for the input waste lists, and (7)
returning the compliance result to the client user, who is either a waste generator or a
TSDF. The information flow for compliance checking implemented in Server 2 is
illustrated in Figure 6.3.
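Step (1) above, translating XML encoded provisions into SQL DML, might look like the following sketch. It assumes a hypothetical XML layout of the form `<regulation part="..."><provision index="...">text</provision></regulation>` and a hypothetical table `provision(part, idx, body)`; neither is taken from the thesis, whose actual schema is described in Section 2.4. The column is named `idx` because `INDEX` is a reserved word in Oracle SQL.

```java
import java.io.IOException;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative sketch only: XML layout, table, and column names are hypothetical.
final class ProvisionSqlTranslator {

    static final String INSERT_SQL =
            "INSERT INTO provision (part, idx, body) VALUES (?, ?, ?)";

    // Parse the XML document and return one {part, index, text} row per provision.
    static List<String[]> toRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String part = root.getAttribute("part");
            NodeList nodes = root.getElementsByTagName("provision");
            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                rows.add(new String[] {
                        part, p.getAttribute("index"), p.getTextContent().trim() });
            }
            return rows;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("malformed regulation XML", e);
        }
    }

    // Import the rows with a parameterized INSERT executed as one JDBC batch.
    static void insertAll(Connection con, List<String[]> rows) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setString(3, row[2]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

Binding the values as statement parameters, rather than concatenating them into the SQL text, avoids quoting problems with provision text that contains apostrophes.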
Figure 6.1: System architecture for the prototype system
[Figure 6.1 components: original regulation codes; Server 1 (server for preprocessing regulation codes); Server 2 (server for providing hazardous waste regulation compliance); web based compliance checking service for generators. The components are connected through the Internet.]
Figure 6.2: Process and information flow on Server 1 for the prototype system
[Figure 6.2 components, on Server 1: process for retrieving regulation codes; process for converting the original file to formal XML tagged regulation codes; process for sending out the formal XML tagged regulation codes to Server 2. The processes are linked by inter-process communications, and Server 1 is linked to Server 2 by an Internet connection.]
Figure 6.3: Process and information flow on Server 2 for the prototype system
[Figure 6.3 components, on Server 2, with input from Server 1: process for accepting the XML tagged regulation codes; process for converting XML tagged codes to SQL statements; Oracle DB tables for regulation codes; process for generating compliance checking procedures based on codes; process for providing web based compliance checking information to generators; process for accepting waste lists from the waste generator; process for converting waste lists to an XML based canonical form; process for checking waste compliance based on checking procedures.]
6.3 Implementation and Features
The current implementation of the prototype system uses recent, proven technologies:
• The prototype takes advantage of the ubiquity of the World Wide Web and adopts
the web browser as the user interface.
• The modules are implemented using object-oriented programming languages.
• Extensible Markup Language (XML) is used to represent and organize compliance
information.
• A mediation based information model is employed for information integration.
The prototype is implemented using Java and SQL. Java is selected as the programming
language for its rich object-oriented features and cross-platform server support. SQL is a
standard database query language that can be employed for efficiently manipulating
regulation code provisions stored in the Oracle database. The processes shown in Figures
6.2 and 6.3 are implemented in Java. SQL statements are embedded in Java programs
through the Java Database Connectivity Application Programming Interface (JDBC API) to
retrieve regulation code objects stored in the database tables.
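As a minimal sketch of SQL embedded in Java through JDBC, the following class runs a keyword lookup with the SQL `LIKE` operator. The table `definition` and its columns `term` and `explanation` are hypothetical names chosen for illustration, not the schema used in the prototype.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: table and column names are hypothetical.
final class DefinitionQuery {

    // The Java literal '\\' yields the single-character SQL escape '\'.
    static final String SQL =
            "SELECT term, explanation FROM definition "
            + "WHERE explanation LIKE ? ESCAPE '\\'";

    // Build a "contains" pattern; the LIKE wildcards % and _ in the user's
    // word are escaped so that they match literally.
    static String containsPattern(String word) {
        String escaped = word.replace("\\", "\\\\")
                             .replace("%", "\\%")
                             .replace("_", "\\_");
        return "%" + escaped + "%";
    }

    // Return "term: explanation" for every definition containing the word.
    static List<String> findDefinitions(Connection con, String word) throws SQLException {
        List<String> results = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            ps.setString(1, containsPattern(word));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    results.add(rs.getString("term") + ": " + rs.getString("explanation"));
                }
            }
        }
        return results;
    }
}
```

A call such as `DefinitionQuery.findDefinitions(con, "EPA")` would bind the pattern `%EPA%`, assuming `con` is an open connection to a database that contains such a table.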
The prototype implementation follows a three-tier architecture. The first tier is the user
interface for the waste generators; the second tier implements the business logic and
the compliance checking related functionalities; and the third tier holds the regulation
code provisions stored in the database.
The user interface, the web-based interactive checking environment shown in
Figure 6.4, is implemented using Java servlet technology. Java servlets provide a
component-based, platform-independent method for building web-based applications,
without the performance limitations of other web technologies such as Common Gateway
Interface (CGI) programs. Unlike proprietary server extension mechanisms
(such as the Netscape Server API or Apache modules), the servlet technology is
server- and platform-independent, and thus greatly reduces the software development,
deployment, and maintenance efforts. The servlet technology is therefore well suited for
building the prototype compliance checking system.
The business logic, which includes the procedures for processing the waste stream
data from the waste generators, for retrieving information from the database, and for
conducting rule-based decision making for compliance checking, is implemented as
Java classes. The Java classes are contained and run within an application server. In this
research, the WebLogic application server is used for running the Java servlets and the
related Java classes.
There are three basic modules for the compliance checking environment. Each of
them is implemented as a dedicated Java servlet. The first module, which is illustrated on
the upper frame of the web interface shown in Figure 6.4, is used for obtaining the waste
stream data from the user. The second module, which is shown on the lower right part of
the frame, is used for information retrieval for regulation codes and related information.
The third module, which is shown on the lower left part of the frame, is used for the
interactive compliance checking process. The modules are described in detail in the
following sections.
Figure 6.4: Web browser based user interface for the hazardous waste compliance checking information management system
6.3.1 Module for retrieval of waste stream data from Generators
This module is implemented based on a Java servlet component model. The servlet
consists of the following basic components:
• The component for dealing with user input, such as (1) the location of the waste
stream file, and (2) the related regulation code the generator wishes to comply with.
• The component for downloading the waste stream file from the generator’s machine
to the compliance checking server.
• The component for converting the proprietary waste stream data into a canonical form
that can be processed by the compliance checking engine. The data conversion
process is based on a one-to-one mapping from the generator’s waste stream file to a
standard XML file. Each generator is assigned an identification number, which is
used as a key for storing and retrieving the information from the waste generator.
When a generator submits a waste compliance checking request, the identifier for this
particular generator is first retrieved and checked against the
generator’s identifier stored in the system. Once the identifier is matched, the
proprietary waste stream data are converted to the canonical form.
• The component for communicating with the compliance checking engine module.
This component sends the waste stream data for compliance checking.
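The conversion component described above (identifier check followed by a one-to-one mapping to XML) can be sketched as follows. The proprietary input format (comma-separated `name,quantity` records), the element names `wasteStream` and `waste`, and the in-memory set of registered identifiers are all assumptions made for illustration; the thesis does not specify them here.

```java
import java.util.List;
import java.util.Set;

// Illustrative sketch only: record format and element names are hypothetical.
final class WasteStreamNormalizer {

    // Escape characters that are special inside XML attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Check the generator identifier first, then map each proprietary record
    // to exactly one <waste> element (a one-to-one mapping).
    static String toCanonicalXml(String generatorId,
                                 Set<String> registeredIds,
                                 List<String> records) {
        if (!registeredIds.contains(generatorId)) {
            throw new IllegalArgumentException("unknown generator: " + generatorId);
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<wasteStream generator=\"").append(escape(generatorId)).append("\">\n");
        for (String record : records) {
            String[] fields = record.split(",", -1);
            if (fields.length != 2) {
                throw new IllegalArgumentException("malformed record: " + record);
            }
            xml.append("  <waste name=\"").append(escape(fields[0].trim()))
               .append("\" quantity=\"").append(escape(fields[1].trim()))
               .append("\"/>\n");
        }
        return xml.append("</wasteStream>\n").toString();
    }
}
```

Rejecting an unknown identifier before any conversion mirrors the order of operations described in the text: the identifier is matched first, and only then are the data converted.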
6.3.2 Module for information retrieval for a regulation code
This module is implemented using the Java servlet component model for the user
interface, and Java Database Connectivity (JDBC) technology for connecting to the
database that stores the regulation code objects.
The Java servlet produces standard HTML pages that can be viewed by the users
in a web browser. The lower right part of Figure 6.4 illustrates the
implemented regulation code information retrieval interface. This module serves as an
information retrieval help desk for the users. To facilitate the retrieval of complex
information embedded in a regulation code, we provide multiple information retrieval
functionalities, which are classified into six categories:
• Retrieval of the definitions defined in a regulation code. As discussed in Section 2.3,
the definitions within a regulation code are used to resolve the vagueness of certain
terms in the regulation code. The user can use the definition retrieval functionality of
this system to obtain the concepts needed for interpreting the provisions when conducting
compliance checking. Figure 6.5 illustrates a user’s retrieval request for all the
definitions that have the word “EPA” in them. The result, which is depicted in Figure
6.6, shows all the definitions that contain the term “EPA” in the definition sections of
40 CFR, and the explanations for the term or the phrase. The retrieval mechanism
used for obtaining the definitions from the regulation code objects is the
pattern-matching feature of the Oracle database. This technique searches
through the regulation code objects that are stored as rows in a regulation definition
database table. The technique has been described in Section 2.4.1.1.
Figure 6.5: User input for retrieval of all the definitions that have the word “EPA” in regulation code 40 CFR
Figure 6.6: Result for retrieval of all the definitions that have the word “EPA” in regulation code 40 CFR
• Retrieval of regulation provisions by index. Plain text formatted regulation codes are
generally organized by index. As discussed in Section 2.1.2, the regulation code index
is the explicit organization of the code. With the retrieval facility based on the
index, the user can quickly find related provisions. Figure 6.7 illustrates the result of
the query for all the provisions within section “40 CFR 262.11”.
Figure 6.7: Result for retrieval of all the provisions that are in section 40 CFR 262.11 in regulation code 40 CFR
• Retrieval of related provisions. Regulation code provisions contain complex linkage
structures in the form of references embedded in the provisions themselves. To obtain a
comprehensive understanding of a regulation code for a certain compliance check,
the user has to search back and forth to follow the complex references embedded in
the provisions. To simplify this type of information retrieval, we provide a retrieval
mechanism for obtaining all the linked provisions starting from a particular indexed
provision. Figure 6.8 illustrates the web interface for submitting a query for obtaining
all the linked provisions starting from 40 CFR 262.11. Figure 6.9 shows the result of
the retrieval. All the linked provisions are retrieved and presented in a sequence
following the embedded references in the regulation code. In this example, the
embedded link 40 CFR 261.2 inside provision 40 CFR 262.11(0) is followed after the
provision 40 CFR 262.11. The procedure for traversing all the linkages embedded in
provisions has been described in Section 2.4.2.
Figure 6.8: User input for retrieval of all the linked provisions that start from section 40 CFR 262.11 in regulation code 40 CFR
Figure 6.9: Result for retrieval of all the linked provisions that start from section 40 CFR 262.11 in regulation code 40 CFR
• Retrieval of typical normative provisions. As discussed in Section 2.3, normative
provisions are the provisions that give detailed descriptions of compliance and
the related rules. Most provisions inside a regulation code are normative provisions.
This retrieval method allows a user to find all the normative provisions that contain a
certain word or words. Figure 6.10 illustrates the web interface for user input, and
Figure 6.11 shows the resulting normative provisions that contain the phrase
“hazardous waste” in 40 CFR 261 and 40 CFR 262. The procedure that establishes the
meta-information for provision classification, which supports the retrieval of
normative provisions, has been described in Sections 2.3 and 2.4.
Figure 6.10: User input for retrieval of all the normal provisions that contain “hazardous waste” in section 40 CFR 261 and 40 CFR 262
Figure 6.11: Result for retrieval of all the normal provisions that contain “hazardous waste” in section 40 CFR 261 and 40 CFR 262
• Retrieval of exceptional provisions. The exceptional provisions contain exception
rules to the typical normative provisions. Retrieving the exceptional provisions in a
regulation code is important for users to gain a better understanding of the
exceptional rules or scenarios for a certain compliance request. Figure 6.12 illustrates
the web interface for a user query for the exceptional provisions that contain the phrase
“hazardous waste” in 40 CFR 261 and 40 CFR 262. Figure 6.13 shows the result of
this query, which includes all the provisions that contain exception rules to the typical
normative provisions described in Section 2.3. The classification rules and the
meta-information for provision classification, which support the retrieval of
exceptional provisions, have been described in Sections 2.3 and 2.4.
Figure 6.12: User input for retrieval of all the exceptional provisions that contain “hazardous waste” in section 40 CFR 261 and 40 CFR 262
Figure 6.13: Result for retrieval of all the exceptional provisions that contain “hazardous waste” in section 40 CFR 261 and 40 CFR 262
• Retrieval of both typical normative and exceptional provisions. The result of this
retrieval includes both the typical normative provisions and the exceptional
provisions. This retrieval function allows a user to query all the provisions that
contain a certain word or words, similar to a full text search within a regulation code.
Information retrieval for a regulation code is an important part of hazardous waste
regulation compliance checking. It provides the users with a tool to obtain the related
regulation codes that they need to comprehend during a compliance checking process. The
retrieval functionalities implemented in the prototype system give users the
flexibility to obtain regulation code provisions through different kinds of retrieval requests.
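The retrieval of linked provisions described above can be sketched as a depth-first traversal over the embedded references. This is one plausible reading of "presented in a sequence following the embedded references", not necessarily the procedure of Section 2.4.2. The in-memory reference map is an assumption; in the prototype the references would come from the regulation code objects in the database.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: the reference map stands in for the database.
final class LinkedProvisionFinder {

    // Return the start provision followed by every provision reachable from
    // it, in the order in which the embedded references are followed.
    static List<String> linkedFrom(String start, Map<String, List<String>> references) {
        Set<String> visited = new LinkedHashSet<>(); // keeps first-visit order
        visit(start, references, visited);
        return new ArrayList<>(visited);
    }

    private static void visit(String index,
                              Map<String, List<String>> references,
                              Set<String> visited) {
        if (!visited.add(index)) {
            return; // already listed; this also stops circular references
        }
        List<String> targets = references.get(index);
        if (targets == null) {
            return; // provision contains no embedded references
        }
        for (String target : targets) {
            visit(target, references, visited);
        }
    }
}
```

The visited set matters because regulation provisions can reference each other circularly; without it the traversal would not terminate.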
6.3.3 Module for compliance checking
This module is implemented using a Java component model. The Java component
model used for this compliance checking module is based on the “model-view-controller”
(MVC) approach, which is a standard software design for handling interactive user input
and interaction with the underlying business logic [Alur, D., Crupi, J. and Malks, D.,
2001]. For this prototype system, the approach is employed to present a user interface
(the “view”) for receiving input, to run a compliance checking procedure (the “model”) for
traversing a relevant rule for a related compliance code provision, and to select the next
user interface (the “controller”) for the checking rules based on a user’s response to the
current rule.
We implement the user interface using a Java servlet, and the compliance
checking procedure using a decision-tree based rule traversal strategy to find the related
regulation provisions. The logic is implemented in Java classes. The Java servlet acts
as an information controller for creating the user interface layout, for retrieving user input,
and for invoking the backend regulation code traversal component. More specifically, the
servlet (1) receives user compliance checking requests from the client, (2) selects the
proper regulation code to use for compliance checking, (3) performs decision-tree based
traversal through the regulation compliance rules for clients based on the procedures
developed in Section 3.3, and (4) selects and delivers relevant rules for regulation code
provisions to the clients for further actions. These tasks are performed iteratively until all
the related regulation rules are traversed and checked. In Figure 6.14, the lower left part
illustrates the user interface of the compliance checking process module. The user
interface is composed of two parts. The first part displays the retrieved regulation
code together with the explanation of the provision. The second part presents the decision
choices for the displayed provision. There are three decision choices, namely
“yes”, “no”, and “unknown”. The Java
servlet retrieves the choice from the user and performs further rule traversal for the
particular compliance check.
Figure 6.14: User interface for conducting compliance checking rules for waste generators
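The decision-tree based traversal with the three answer choices can be illustrated by the sketch below. The class and method names are hypothetical, and the behaviour on an answer with no outgoing branch (the traversal simply stops) is an assumption, since the text does not state how an "unknown" answer is resolved. In the servlet setting one answer arrives per request; here a list of answers stands in for that interaction.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and stopping behaviour are assumptions.
final class ComplianceDecisionTree {

    enum Answer { YES, NO, UNKNOWN }

    static final class Node {
        final String text; // a checking rule, or a conclusion for a leaf
        final Map<Answer, Node> next = new EnumMap<>(Answer.class);

        Node(String text) {
            this.text = text;
        }

        // Register the node to visit when the user gives this answer.
        Node on(Answer answer, Node child) {
            next.put(answer, child);
            return this;
        }

        boolean isLeaf() {
            return next.isEmpty();
        }
    }

    // Walk from the root, consuming one answer per rule, and return the
    // texts of all nodes presented to the user in order.
    static List<String> traverse(Node root, List<Answer> answers) {
        List<String> presented = new ArrayList<>();
        Node current = root;
        int i = 0;
        while (current != null) {
            presented.add(current.text);
            if (current.isLeaf() || i >= answers.size()) {
                break; // conclusion reached, or waiting for the next answer
            }
            current = current.next.get(answers.get(i++)); // null if no branch
        }
        return presented;
    }
}
```

A tree for a hazardous waste determination would chain rules such as "is the waste excluded" and "is the waste a listed hazardous waste", with each answer selecting the next rule to display.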
Following the above example, the compliance checking module presents the
user with the checking sequence for determining whether the waste in the lists submitted
by the user is a hazardous waste, based on the interactive input from the user. The
complete checking process is depicted in Figure 6.15.
Figure 6.15: Sequence of checking rules for determining a hazardous waste for a generator
[Figure 6.15 steps: Step 1: beginning the check process for the waste. Step 2: checking if the waste is excluded from being a waste. Step 3: checking if the waste is a simple waste. Step 4: checking if the waste is a solid waste. Step 5: checking if the waste is a listed hazardous waste. Step 6: concluding that it is a hazardous waste since at least one component of the waste is a hazardous waste.]
6.4 Summary
This chapter provides a description of the prototype developed in this research for an
Internet based information management system for hazardous waste regulation
compliance checking. The focus of the prototype system is to demonstrate the feasibility
of using the methodology developed in this research for the implementation of the
information management system. In addition, we use the prototype system to obtain
insight and experience in developing software components for distributive information
systems using the current state of practice.
In the current implementation, information retrieval for regulation codes and the
related information organization and management for code objects are emphasized. The
rule-based checking procedure, however, is implemented using a simple decision-tree
based traversal to obtain the relevant rules for a certain compliance check, and lacks a
formal rule-based system for conducting inference and conflict resolution over complex rule
sets for more general compliance checking cases. In future work, investigating the
integration of a rule-based engine with the information management system for compliance
checking would benefit both research on regulation code based compliance checking
systems and the practice of regulation compliance services.