MELJUN CORTES RESEARCH PAPERS DSS Dss for the Academic Library Acquisition Budget Allocation via Circulation Database Mining


    Decision support for the academic library acquisition budget allocation via circulation database mining

    S.-C. Kao a, H.-C. Chang b, C.-H. Lin c

    a Department of Information Management, Kun Shan University of Technology, 949 Da Wan Road, Yung Kung, Tainan 710, Taiwan, ROC

    b Department of Business Administration, National Cheng Kung University, 1 University Road, Tainan 710, Taiwan, ROC

    c Department of Industrial Management Science, National Cheng Kung University, 1 University Road, Tainan 710, Taiwan, ROC

    Received 21 September 2001; accepted 13 December 2001

    Abstract

    Many approaches to decision support for academic library acquisition budget allocation have been proposed to reflect diverse management requirements. Unlike these methods, which focus mainly on either statistical analysis or goal programming, this paper introduces a model (ABAMDM, acquisition budget allocation model via data mining) that explicitly uses descriptive knowledge discovered in historical circulation data to support the allocation of the library acquisition budget. The major concern of this study is that the budget allocation should reflect the requirement that the more a department makes use of its acquired materials in the present academic year, the more budget it can get for the coming year. The primary output of the ABAMDM, which is used to derive the weights of acquisition budget allocation, contains two parts: the descriptive knowledge via utilization concentration, and the suitability via utilization connection for the departments concerned. An application to the library of Kun Shan University of Technology is described to demonstrate the ABAMDM in practice.

    © 2002 Elsevier Science Ltd. All rights reserved.

    Keywords:  Decision support; Acquisition budget allocation; Data mining

    E-mail address:  [email protected] (S.-C. Kao).

    0306-4573/03/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0306-4573(02)00019-5

    Information Processing and Management 39 (2003) 133–147

    www.elsevier.com/locate/infoproman

    1. Introduction

    The decision on budget allocation for academic libraries is a fairly important but complex task. Greaves (1974) indicated that eight variables play the most important role in the acquisition allocation operation. These include (1) the size of faculty, (2) the size of students or size of student credit hours, (3) the cost of library material, (4) the adequacy of the library collection in an academic discipline, (5) the size of type of courses, (6) the amount of conducted research, (7) the past record in use of allocated funds, and (8) the circulation statistics. Many ensuing studies in the literature have made increasing use of these factors in the past few years (Budd & Adams, 1989; Crotts, 1999; Decroos et al., 1997; Evans, 1996; Hamaker, 1995; Lafouge & Laine-Cruzel, 1997; Sorgenfrei, 1999; Wise & Perushek, 1996, 2000). They have made a meaningful contribution to the enhancement of library management. Although the decision on acquisition budget allocation also involves many other issues such as priorities, strategic plans, and programs, it is believed that suitable material utilization should be considered as well. The survey presented by Tuten and Lones (1995) and the research conducted by Budd and Adams (1989) emphasized that circulation data is one of the most extensively referenced factors when deciding the allocation budget for libraries. Therefore, information discovered in the circulation database would be valuable for reflecting the utilization of material in a library.

    The techniques used to support the decision on acquisition budget allocation are mostly statistics-based models and goal-programming-based paradigms. Based on quantified data, information provided by the management, or both, the former focuses on statistical analysis to derive a hierarchical decision tree that contains the factors concerned along with their corresponding shared ratios (Anderson, Sweeney, & Williams, 1994). The latter deals mainly with the development of mathematical models that can offer optimal solutions to problems with contradictory or incommensurable goals by giving the rank order of the goals and the constraints of the factors concerned in advance. Wise and Perushek (1996, 2000) have comprehensively demonstrated its use for the acquisition allocation problem. Although both techniques have made a meaningful contribution to decision support for library acquisition budget allocation, a drawback is that the historical circulation data is hardly ever taken into account in depth while budgeting. In other words, the final allocated acquisition budget should reflect the utilization of the materials acquired by a department; this is the motivation of this study.

    Analyzing the circulation data in depth to reveal the material utilization of a department is, to all intents and purposes, a complex task. It is not simply a matter of taking the ratio of the number of a department's records to the total number of records in the circulation database for a period of time. The data collection, the definition of the degree to which a material belongs to a department, and the complexity of the entropy computation can all make the task enormously difficult. The data collection needs to gather the circulation data from daily operations and store it in a database, clean out unnecessary attributes (or fields) and missing data if any, and reconstruct the created database if necessary. The degree to which a material belongs to a department needs to be defined so that the total entropy of performance can be computed for a department. For example, a material classified under the subject of accounting may be defined as related to the department of accounting with the semantic strength of "absolutely matching", to information management with "matching", and to mechanical engineering with "not matching". This implies that when a material on accounting is utilized, the department of accounting performs better than both information management and mechanical engineering because it uses its acquired materials more appropriately. Although this task is highly subjective and time consuming, it is necessary for this study.


    Data mining is the process of discovering implicit knowledge in large databases. It has the capability to uncover hidden relationships, patterns, and trends in historical databases. With modern information technology, data mining has placed increased emphasis on the value of past data, in applications such as personal bankruptcy prediction (Donato et al., 1999), a hotel data mart (Sung & Sang, 1998), customer service support (Hui & Jha, 2000), and knowledge generation in finance (Dhar, 1998). It can perform association, classification, regression, clustering, or summarization to reveal patterns in large databases that are interesting, meaningful, interpretable, and able to support decisions (Han & Fu, 1999; Hirota & Pedrycz, 1999; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Fayyad & Stolorz, 1997). More importantly, the discovered knowledge, in descriptive or predictive form, can be used to support domain-related decisions. For example, the descriptive knowledge "according to the circulation data collected for the last academic year, the department of information management made much more use of materials in its subject than others in theirs" can support budget allocation decisions, and so can the predictive knowledge "IF the department is information management THEN the utilization of materials in its subject is 92.54%."

    To achieve the objective of this research, we study the construction of a model, ABAMDM (acquisition budget allocation model via data mining), that is based on circulation database mining and provides decision support for academic library acquisition budget allocation. This research also provides a library budget allocation solution model with a mechanism that a designer may use in developing a decision support system. The rest of this paper is organized as follows. Section 2 describes the ABAMDM, including the definition of membership, the descriptive knowledge discovery, the entropy computation, and the weights of budget allocation for departments. An illustrated example in the same section demonstrates the ABAMDM. Section 3 describes a practical application for the Library of KSUT (LKSUT), where an allocation table that contains 17 departments with their corresponding weights is presented. The conclusion and future research issues are addressed in the final section.

    2. The ABAMDM

     2.1. The architecture of ABAMDM 

    The architecture of ABAMDM is illustrated in Fig. 1. It contains two stages. The first stage preprocesses the circulation data, and the second derives the circulation performance and descriptive knowledge, which are used to decide the weights of acquisition budget allocation for departments.

     2.1.1. Preprocess of circulation data

    In general, the original circulation database contains several attributes. The only ones required in this study are the identifiers of departmental members (member_ID) and of materials (material_ID). In Fig. 1, the data table Circulation contains these two attributes. However, what this study needs are department identifiers (dept_ID) and category identifiers (category_ID) to reflect the performance of departments. Therefore, two tables, DeptMember and Material, have to be generated, where DeptMember contains the attributes dept_ID and member_ID, and Material contains material_ID and category_ID. The circulation data table Circulation_I, containing the attributes dept_ID and category_ID, can then be generated. Notice that this generation process can be omitted if Circulation_I can be obtained directly from the library information system in use (the dotted rectangular part in Fig. 1).

    Fig. 1. The architecture of ABAMDM.

    Eventually, the objective of preprocessing the circulation data is to derive the final circulation table (Circulation_II), which includes dept_ID, category_ID, and the corresponding semantic strength. The semantic strength represents the degree of the relation between a department and a material category. It is management definable and can be divided into several levels. Basically, a semantic strength takes on a linguistic value and is therefore not calculable. However, it can be assigned a numeric value when measurement is required. For example, "absolutely matching" is a defined semantic strength with a numeric value of 0.6, indicating that a department and a category are perfectly related, while "absolutely not matching" indicates no relation at all. Nevertheless, from a systematic point of view, it is very tedious to define the semantic strength for all categories every academic year. For example, if a circulation database has 35 departments and 500 categories, the number of definition units will be 35 × 500, that is, 17,500. Therefore, it is necessary to create a data table (Membership) that contains all departments, categories, and corresponding semantic strengths, and then perform a Structured Query Language (SQL) operation (SQL_FC) to retrieve Circulation_II (Connolly, Begg, & Strachan, 1996). The SQL_FC is given as follows.

    SQL_FC:
    Create table Circulation_II (dept_ID, category_ID, Strength);
    Insert into Circulation_II (dept_ID, category_ID, Strength)
    Select Circulation_I.dept_ID, Circulation_I.category_ID, Membership.Strength
    From Membership, Circulation_I
    Where Circulation_I.dept_ID = Membership.dept_ID and Circulation_I.category_ID = Membership.category_ID;

    To reduce the evaluation differences among librarians, departmental faculty, and specialists, the table Membership can be created via group assessment. However, this study does not go into this point. Part of the Membership table for the department of Information Management is illustrated in Fig. 2 as an example. On the vertical dimension, 6 represents "absolutely matching", 5 "extremely matching", 4 "matching", 3 "ordinarily matching", 2 "likely matching", 1 "slightly matching", and 0 "absolutely not matching". On the horizontal dimension, the category codes are based on the New Classification Scheme for Chinese Libraries. For example, code "480" is "trade" and is defined to be related to Information Management with the semantic strength of "absolutely matching".
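The preprocessing stage described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database accessed through Python's standard sqlite3 module; the column types, the toy rows, and the helper name build_circulation_tables are illustrative assumptions and are not taken from the system used in the paper, which only gives the SQL statement itself.

```python
import sqlite3


def build_circulation_tables(conn: sqlite3.Connection) -> None:
    """Derive Circulation_I and Circulation_II from the raw tables (illustrative)."""
    conn.executescript(
        """
        -- Step 1: map each loan record to its department and material category.
        CREATE TABLE Circulation_I AS
        SELECT d.dept_ID, m.category_ID
        FROM Circulation AS c
        JOIN DeptMember AS d ON c.member_ID = d.member_ID
        JOIN Material AS m ON c.material_ID = m.material_ID;

        -- Step 2: attach the semantic strength from Membership.
        CREATE TABLE Circulation_II AS
        SELECT c1.dept_ID, c1.category_ID, ms.Strength
        FROM Circulation_I AS c1
        JOIN Membership AS ms
          ON c1.dept_ID = ms.dept_ID AND c1.category_ID = ms.category_ID;
        """
    )


# Hypothetical toy data; every identifier value below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Circulation (member_ID TEXT, material_ID TEXT);
    CREATE TABLE DeptMember (dept_ID TEXT, member_ID TEXT);
    CREATE TABLE Material (material_ID TEXT, category_ID TEXT);
    CREATE TABLE Membership (dept_ID TEXT, category_ID TEXT, Strength TEXT);
    """
)
conn.executemany("INSERT INTO Circulation VALUES (?, ?)",
                 [("u1", "b1"), ("u1", "b2"), ("u2", "b1")])
conn.executemany("INSERT INTO DeptMember VALUES (?, ?)",
                 [("IM", "u1"), ("ME", "u2")])
conn.executemany("INSERT INTO Material VALUES (?, ?)",
                 [("b1", "480"), ("b2", "310")])
conn.executemany("INSERT INTO Membership VALUES (?, ?, ?)",
                 [("IM", "480", "A"), ("IM", "310", "M"), ("ME", "480", "N")])

build_circulation_tables(conn)
rows = conn.execute(
    "SELECT dept_ID, category_ID, Strength FROM Circulation_II "
    "ORDER BY dept_ID, category_ID"
).fetchall()
print(rows)
```

Because both steps use inner joins, a circulation record whose department and category pair has no row in Membership is dropped, so in practice Membership would need to cover every pair that occurs.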

    Fig. 2. Part of Membership for the department of Information Management.

     2.1.2. Generation of decisional knowledge

    In the stage of generating decisional knowledge, the aim is to obtain descriptive knowledge and utilization gain. The descriptive knowledge is stored in the data table DeptConcent and the utilization gain in DeptConnect. DeptConcent contains the attributes dept_ID and the corresponding degree of concentration, which represents the category distribution of the materials that have been used. DeptConnect includes the attributes dept_ID and connection, which represents the utilization suitability of a department. For example, the descriptive knowledge "the department of Information Management made much use of materials, and most of them are in its subject" can relevantly explain the utilization of the department of Information Management. Therefore, the values of concentration and connection are derived from the number of records, the distribution of material categories used, and the links between categories and subjects.

    How to obtain the degree of concentration for a department remains a critical issue that needs to be solved in this study. The ID3 algorithm introduced by Quinlan (1986) has been widely used to measure the information entropy of a set of data with multiple classes (Quinlan, 1987; Sestito & Dillon, 1994). Based on information theory, it adopts a top-down induction method and returns the degree of ability (or purity) with which one variable can separate another. The higher this degree, the less evenly the data is distributed. For a department (denoted by D), the expected information (I), the expected entropy (E(D)), and the concentration (Concentration(D)) are expressed in formulas (1)-(3), respectively.

    $$I(n_{C_1}, n_{C_2}, \ldots, n_{C_n}) = -\left[\frac{n_{C_1}}{M}\log_2\frac{n_{C_1}}{M} + \cdots + \frac{n_{C_n}}{M}\log_2\frac{n_{C_n}}{M}\right] \tag{1}$$

    where $n_{C_i}$ is the number of records that return to class $C_i$, $i = 1, 2, \ldots, n$, and $M$ is the total number of records.

    $$E(D) = \sum_{i=1}^{t}\left[\frac{n_{V_i}}{M}\, I(a_{V_iC_1}, a_{V_iC_2}, \ldots, a_{V_iC_m})\right] \tag{2}$$

    where $t$ is the number of different values that the department $D$ can take on; $n_{V_i}$ is the total number of records for which the department $D$ takes value $V_i$, $i = 1, 2, \ldots, t$; $a_{V_iC_j}$ is the total number of records for which the department $D$ takes value $V_i$ and returns to class $C_j$, $i = 1, 2, \ldots, t$, $j = 1, 2, \ldots, m$; and $M$ is the total number of records.

    $$\mathrm{Concentration}(D) = I(n_{C_1}, n_{C_2}, \ldots, n_{C_n}) - E(D) \tag{3}$$
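Formulas (1)-(3) can be sketched in a few lines of Python. The function names are illustrative assumptions. The sketch evaluates the summand of formula (2) for a single department, which is how the per-department entropy values in Table 1 of Section 2.2 appear to be obtained, and it takes the overall expected information of 1.8805 reported in that section as a given input rather than recomputing it.

```python
from math import log2


def expected_information(counts):
    """Formula (1): base-2 entropy of a list of class counts."""
    total = sum(counts)
    # Classes with zero records contribute nothing (0 * log2(0) is taken as 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def entropy_term(dept_counts, total_records):
    """One summand of formula (2): (n_V / M) * I(counts of one department)."""
    return (sum(dept_counts) / total_records) * expected_information(dept_counts)


def concentration(total_information, dept_entropy):
    """Formula (3): expected information minus the department's entropy."""
    return total_information - dept_entropy


# Dept01 in Table 1: record counts for levels A, H, M, L, N; M = 302 records overall.
dept01_counts = [18, 1, 0, 23, 30]
e_dept01 = entropy_term(dept01_counts, total_records=302)
# 1.8805 is the expected information reported in Section 2.2, used here as an input.
c_dept01 = concentration(1.8805, e_dept01)

print(f"E(Dept01) = {e_dept01:.4f}")              # E(Dept01) = 0.3905
print(f"Concentration(Dept01) = {c_dept01:.4f}")  # Concentration(Dept01) = 1.4900
```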

    To obtain the utilization connection for a department, the semantic strength has to be defined in advance. Assume there are $n$ levels defined for the semantic strength, denoted by $SS(L_1, L_2, \ldots, L_n)$, and that the corresponding importance is represented by $SI(x_1, x_2, \ldots, x_n)$, where $L_i$ is the $i$th level and $x_i$ is the importance of $L_i$, with a numeric value ranging from 0.00 to 1.00; the sum of the $x_i$ is 1.00, $i = 1, 2, \ldots, n$. The averaged connection for a department $D$, Connection($D$), is defined by formula (4), where $N_D$ is the total number of members of $D$.

    $$\mathrm{Connection}(D) = \frac{\sum_{i=1}^{n} n_{L_i}\, x_{L_i}}{N_D} \tag{4}$$

    where $n_{L_i}$ is the number of records of which the category is $L_i$ and $x_{L_i}$ is the importance of $L_i$.

    Although Concentration($D$) in formula (3) and Connection($D$) in formula (4) can be obtained as the decision base for acquisition budget allocation, there is no established guidance that can be relied on to derive budget allocation weights from them. Some decision makers may put more emphasis on connection than on concentration, while others may do the reverse. Ultimately, budgeting has to deal with subjective opinions as usual, and it will never be conducted without "numbers". In other words, budget allocation has to rely on things that are countable, measurable, and quantifiable. Therefore, ABAMDM assumes that the importance of concentration is $a$ and that of connection is $1 - a$, both of which are management definable. Consequently, the final weight, Weight($D$), for the department $D$ can be determined by formula (5), where $i = 1, 2, \ldots, m$ and $m$ is the number of departments.

    $$\mathrm{Weight}(D) = \frac{a\,\mathrm{Concentration}(D) + (1 - a)\,\mathrm{Connection}(D)}{\sum_{i=1}^{m}\left[a\,\mathrm{Concentration}(D_i) + (1 - a)\,\mathrm{Connection}(D_i)\right]} \tag{5}$$
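Formulas (4) and (5) amount to a weighted average followed by a normalization, which the following minimal sketch makes concrete. The function names, the two departments X and Y, and all numbers are hypothetical and chosen only to show the mechanics; they are not data from the paper.

```python
def connection(level_counts, importance, members):
    """Formula (4): importance-weighted record count per department member."""
    if len(level_counts) != len(importance):
        raise ValueError("one importance value is required per semantic level")
    return sum(n * x for n, x in zip(level_counts, importance)) / members


def weights(concentrations, connections, a):
    """Formula (5): normalize a*Concentration + (1 - a)*Connection over departments."""
    scores = {
        dept: a * concentrations[dept] + (1 - a) * connections[dept]
        for dept in concentrations
    }
    total = sum(scores.values())
    return {dept: score / total for dept, score in scores.items()}


# Hypothetical inputs: four semantic levels whose importance values sum to 1.
importance = [0.5, 0.3, 0.2, 0.0]
conn_x = connection([10, 5, 5, 20], importance, members=20)
conn_y = connection([2, 2, 6, 30], importance, members=10)

w = weights(
    concentrations={"X": 1.2, "Y": 1.6},
    connections={"X": conn_x, "Y": conn_y},
    a=0.3,
)
for dept, value in w.items():
    print(f"{dept}: {value:.4f}")
```

Since the weights are normalized, they always sum to 1, and a larger value of a shifts the allocation toward departments with high concentration while a smaller value favors high connection.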

     2.2. An illustrated example

    Assume that the preprocessed circulation data table (Table 1) used to demonstrate the ABAMDM includes the following information: (1) four departments are considered: Dept01, Dept02, Dept03, and Dept04; (2) Dept01 has 35 members, Dept02 31, Dept03 42, and Dept04 38; (3) there are 302 records in total, of which Dept01 has 72, Dept02 52, Dept03 79, and Dept04 99; (4) there are five levels of semantic strength, SS(A, H, M, L, N), where A is absolutely matching, H highly matching, M matching, L likely matching, and N absolutely not matching, with corresponding importance IS(0.4, 0.3, 0.2, 0.1, 0.0); (5) a is 0.3.

    Table 1
    A collected circulation database for departments

    Dept      Category     Count   Entropy   Importance
    Dept01    A            18      0.5000    0.4000
              H            1       0.0857    0.3000
              M            0       –         0.2000
              L            23      0.5259    0.1000
              N            30      0.5263    0.0000
              E(Dept01)            0.3905
    Dept02    A            13      0.5000    0.4000
              H            0       –         0.3000
              M            2       0.1808    0.2000
              L            6       0.3595    0.1000
              N            31      0.4449    0.0000
              E(Dept02)            0.3541
    Dept03    A            42      0.4846    0.4000
              H            2       0.1343    0.3000
              M            8       0.3346    0.2000
              L            2       0.1343    0.1000
              N            25      0.5253    0.0000
              E(Dept03)            0.4219
    Dept04    A            9       0.3145    0.4000
              H            36      0.5307    0.3000
              M            20      0.4661    0.2000
              L            18      0.4472    0.1000
              N            16      0.4249    0.0000
              E(Dept04)            0.7158

    Table 1 contains the following columns: dept, category, count, entropy, and importance. Dept is the identifier of the department and category is the level of semantic strength. Count is the number of records observed for a level of matching, and entropy is the expected entropy computed via formula (2). Importance is the numeric value, ranging from 0.00 to 1.00, that represents the importance of a level of matching to a department. To achieve the objective, the concentration of material categories via the ID3 algorithm, the connection via formula (4), and the weight via formula (5) are obtained for the departments.

    The expected information via formula (1) for Table 1 is I(82, 39, 30, 49, 102), that is, 1.8805. The entropy for Dept01 is 0.3905, Dept02 0.3541, Dept03 0.4219, and Dept04 0.7158. Subsequently, by formula (3), the concentration for Dept01 is 1.4900, Dept02 1.5264, Dept03 1.4585, and Dept04 1.1647. Fig. 3 shows the concentration against the total records for the four departments. In particular, the concentration in terms of category observed in Dept04 is lower than that of any other department, in spite of its having the largest use of materials. This implies that Dept04 makes use of materials in various subjects.

    The connections via formula (4) for Dept01, Dept02, Dept03, and Dept04 are 0.2800, 0.2000, 0.4571, and 0.5316, respectively. Consequently, by formula (5), the weight for Dept01 is 0.2380, Dept02 0.2233, Dept03 0.2752, and Dept04 0.2635. In addition to these numeric data, which can be used to support acquisition budget allocation, some descriptive knowledge can be derived on the basis of comparison as follows. Notice that the value of the use of materials is the averaged record per member (ARPM).

    •   For Dept01: adequate use of materials (2.0571), not diverse categories observed (1.4900), low utilization connection (0.2800).

    •   For Dept02: little use of materials (1.6774), not diverse categories observed (1.5264), very low utilization connection (0.2000).

    •   For Dept03: ordinary use of materials (1.8810), diverse categories observed (1.4585), high utilization connection (0.4571).

    •   For Dept04: high use of materials (2.6053), very diverse categories observed (1.1647), very high utilization connection (0.5316).

    Fig. 3. The concentration against size for four departments.
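The ARPM and connection values quoted in the list above follow directly from the counts and member numbers of the illustrated example. A short sketch, in which the dictionary layout and function names are illustrative assumptions, reproduces them:

```python
# Importance of the five semantic levels, IS(A, H, M, L, N), from the example.
IMPORTANCE = (0.4, 0.3, 0.2, 0.1, 0.0)

# Members and per-level record counts (A, H, M, L, N) taken from Table 1.
TABLE_1 = {
    "Dept01": {"members": 35, "counts": (18, 1, 0, 23, 30)},
    "Dept02": {"members": 31, "counts": (13, 0, 2, 6, 31)},
    "Dept03": {"members": 42, "counts": (42, 2, 8, 2, 25)},
    "Dept04": {"members": 38, "counts": (9, 36, 20, 18, 16)},
}


def arpm(counts, members):
    """Averaged record per member: total records divided by members."""
    return sum(counts) / members


def connection(counts, members, importance=IMPORTANCE):
    """Formula (4): importance-weighted record count per member."""
    return sum(n * x for n, x in zip(counts, importance)) / members


summary = {
    dept: (arpm(d["counts"], d["members"]), connection(d["counts"], d["members"]))
    for dept, d in TABLE_1.items()
}
for dept, (use, conn) in summary.items():
    print(f"{dept}: ARPM = {use:.4f}, connection = {conn:.4f}")
```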

    3. An application case

    3.1. Application characteristics

    Before ABAMDM was placed in service, the LKSUT took the ARPM as the basis for deriving the weights of partial acquisition budget allocation for departments. The ABAMDM was demonstrated to librarians at LKSUT to assess the possibility of adopting it. A concise questionnaire was designed to elicit information for evaluating the model. The questions posed to the librarians covered the process of circulation data analysis, the usability for acquisition budget allocation, the validity of the outputs, and the applicability. The feedback is summarized in Table 2. It was found that the ABAMDM was adequate for acquisition budget allocation. However, computerization of ABAMDM was strongly recommended. This will be put on the list of future research issues.

    Table 2
    Reviewers' results

    Review scenarios                              Reviewer 1   Reviewer 2                  Reviewer 3   Reviewer 4
    Process of circulation data analysis          Adequate     Need to be simpler          Good         Better if simpler
    Usability for acquisition budget allocation   Adequate     Acceptable                  Helpful      Decision supportable
    Validity of the outputs                       Acceptable   Agreeable                   Good         Acceptable, but need clearer description
    Applicability                                 High         High only if computerized   Adequate     Better if computerized

    The LKSUT then employed the proposed ABAMDM to support the acquisition budget allocation for the 2001 academic year in the context of material utilization. The total budget was 14,673,500 new Taiwan dollars (NTD). It is the LKSUT's policy that 10% of the total budget (1,467,350 NTD) is shared by 17 departments based upon the circulation in the last academic year. It took LKSUT about three months to create the data table Membership, which contains the semantic strengths indicating the relations between departments and categories. The semantic strength was defined at five levels: absolutely matching, highly matching, matching, slightly matching, and absolutely not matching. The librarians, departmental faculty, and specialists in the library domain all participated, and they frequently discussed in groups on a department-by-department basis to exchange opinions so that the differences could be reduced. They evaluated the semantic strength individually and reached the final conclusion in groups if any conflict occurred.

    The LKSUT has been using a Windows-based information system (named T2), developed by Transtech Information Co. Ltd., to support circulation operations for 4 years. The circulation data (Circulation_I) for a period of time can be easily collected via T2. However, T2 does not include a function for creating Circulation_II, which includes the three attributes dept_ID, category_ID, and Strength. Therefore, it is necessary to perform SQL_FC externally to generate Circulation_II. Table 3 lists the characteristics of this application case, including the time period for collecting circulation data, the definition of semantic strength, the value of a, the number of members of the 17 departments, the total number of records observed, and the ARPM.
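The ARPM-based allocation that LKSUT used before ABAMDM can be sketched from the member and record counts listed in Table 3. The sketch assumes that each department's share is its ARPM divided by the sum of all ARPM values, which is consistent with the ARPM weights reported in Table 4; the variable and function names are illustrative.

```python
SHARED_BUDGET = 1_467_350  # NTD, 10% of the total budget of 14,673,500 NTD

# (total members N_D, total records N_R) per department, from Table 3.
TABLE_3 = {
    "Mechanical Engineering": (1041, 10169),
    "Electronic Engineering": (806, 9552),
    "Environmental Engineering": (565, 10723),
    "Electrical Engineering": (907, 13498),
    "Fiber Engineering": (228, 6384),
    "Information Management": (667, 8633),
    "Accounting": (471, 8390),
    "Industrial Management": (402, 6244),
    "Real Estate Management": (452, 7940),
    "Early Childhood Care and Education": (220, 1878),
    "International Trade": (476, 10740),
    "Finance and Banking": (477, 4945),
    "Public Communication": (166, 701),
    "Applied English": (629, 10174),
    "Visual Communication Design": (579, 11041),
    "Motion Picture Design": (233, 2172),
    "Space Design": (301, 4312),
}


def arpm_weights(table):
    """Normalize each department's averaged record per member (N_R / N_D)."""
    arpm = {dept: records / members for dept, (members, records) in table.items()}
    total = sum(arpm.values())
    return {dept: value / total for dept, value in arpm.items()}


arpm_w = arpm_weights(TABLE_3)
arpm_budget = {dept: w * SHARED_BUDGET for dept, w in arpm_w.items()}
for dept in ("Mechanical Engineering", "Fiber Engineering"):
    print(f"{dept}: {100 * arpm_w[dept]:.4f}% -> {arpm_budget[dept]:,.0f} NTD")
```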

    3.2. The results and findings

    By utilizing ID3, formula (4), and formula (5), the results shown in Table 4 were obtained, including the concentration, the connection, the weights for ABAMDM and ARPM, and the allocated acquisition budget for ABAMDM. For comparison, the results produced by ARPM are also included in Table 4. Fig. 4 illustrates the concentration against the number of records for the departments. The relationship between concentration and connection is shown in Fig. 5, and the number of members, the number of records, and the allocated budget are shown in Fig. 6. Notice that, in order to show the results at an appropriate scale, the number of records was multiplied by 1/10 and the budget by 1/100 in Fig. 6.

    Table 3
    The characteristics for the ABAMDM applied in LKSUT

    Time period for collecting circulation data: 4/1/2000 to 3/31/2001
    Semantic strength (a = 0.2): SS(A, H, M, L, N), where A: absolutely matching, H: highly matching, M: matching, L: likely matching, N: absolutely not matching; IS(A, H, M, L, N) = IS(0.4, 0.3, 0.2, 0.1, 0.0)

    Departments                         Total members (N_D)  Total records (N_R)  N_R/N_D
    Mechanical Engineering              1041                 10,169               9.7685
    Electronic Engineering              806                  9552                 11.8511
    Environmental Engineering           565                  10,723               18.9788
    Electrical Engineering              907                  13,498               14.8820
    Fiber Engineering                   228                  6384                 28.0000
    Information Management              667                  8633                 12.9430
    Accounting                          471                  8390                 17.8132
    Industrial Management               402                  6244                 15.5323
    Real Estate Management              452                  7940                 17.5664
    Early Childhood Care and Education  220                  1878                 8.5364
    International Trade                 476                  10,740               22.5630
    Finance and Banking                 477                  4945                 10.3669
    Public Communication                166                  701                  4.2229
    Applied English                     629                  10,174               16.1749
    Visual Communication Design         579                  11,041               19.0691
    Motion Picture Design               233                  2172                 9.3219
    Space Design                        301                  4312                 14.3256

    Table 4
    The results via ABAMDM and ARPM

    Code  Departments                         Concentration  Connection  Weight ABAMDM (%)  Weight ARPM (%)  Budget ABAMDM (NTD)  Budget ARPM (NTD)
    01    Mechanical Engineering              2.0733         3.7414      6.3069             3.8777           92,545               56,899
    02    Electronic Engineering              1.9758         2.6979      4.7258             4.7044           69,344               69,030
    03    Environmental Engineering           1.9923         6.3602      10.1544            7.5338           149,001              110,547
    04    Electrical Engineering              1.8863         2.8191      4.8721             5.9075           71,491               86,684
    05    Fiber Engineering                   2.0271         3.2114      5.5051             11.1148          80,779               163,093
    06    Information Management              2.0691         4.7061      7.7337             5.1378           113,481              75,390
    07    Accounting                          2.0020         2.5503      4.5170             7.0711           66,280               103,758
    08    Industrial Management               2.0299         2.3211      4.1880             6.1657           61,453               90,472
    09    Real Estate Management              1.9865         3.3142      5.6422             6.9731           82,791               102,320
    10    Early Childhood Care and Education  2.1071         0.6514      1.7443             3.3886           25,596               49,722
    11    International Trade                 1.9506         4.9603      8.0662             8.9566           118,359              131,424
    12    Finance and Banking                 2.0591         3.3212      5.6795             4.1152           83,338               60,385
    13    Public Communication                2.1175         1.5175      3.0305             1.6763           44,469               24,597
    14    Applied English                     1.9664         4.6698      7.6419             6.4208           112,134              94,215
    15    Visual Communication Design         1.9654         5.7908      9.3014             7.5696           136,484              111,073
    16    Motion Picture Design               2.0964         3.0785      5.3340             3.7000           78,269               54,298
    17    Space Design                        2.0581         3.2385      5.5567             5.6867           81,537               83,443

    Fig. 4. The concentration against the number of records for 17 departments.
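As a check on how the ABAMDM columns of Table 4 relate to formula (5), the weights and budgets can be recomputed from the concentration and connection columns with a = 0.2 and the shared budget of 1,467,350 NTD. The following sketch does this; the names are illustrative, and small differences in the last digit are to be expected because the table values are rounded.

```python
A = 0.2                    # importance of concentration (Table 3)
SHARED_BUDGET = 1_467_350  # NTD shared by the 17 departments

# (concentration, connection) per department, from Table 4.
TABLE_4 = {
    "Mechanical Engineering": (2.0733, 3.7414),
    "Electronic Engineering": (1.9758, 2.6979),
    "Environmental Engineering": (1.9923, 6.3602),
    "Electrical Engineering": (1.8863, 2.8191),
    "Fiber Engineering": (2.0271, 3.2114),
    "Information Management": (2.0691, 4.7061),
    "Accounting": (2.0020, 2.5503),
    "Industrial Management": (2.0299, 2.3211),
    "Real Estate Management": (1.9865, 3.3142),
    "Early Childhood Care and Education": (2.1071, 0.6514),
    "International Trade": (1.9506, 4.9603),
    "Finance and Banking": (2.0591, 3.3212),
    "Public Communication": (2.1175, 1.5175),
    "Applied English": (1.9664, 4.6698),
    "Visual Communication Design": (1.9654, 5.7908),
    "Motion Picture Design": (2.0964, 3.0785),
    "Space Design": (2.0581, 3.2385),
}


def abamdm_weights(table, a):
    """Formula (5): normalized a*Concentration + (1 - a)*Connection."""
    scores = {dept: a * conc + (1 - a) * conn for dept, (conc, conn) in table.items()}
    total = sum(scores.values())
    return {dept: score / total for dept, score in scores.items()}


abamdm_w = abamdm_weights(TABLE_4, A)
abamdm_budget = {dept: w * SHARED_BUDGET for dept, w in abamdm_w.items()}

# The department with the largest weight receives the largest budget.
top = max(abamdm_w, key=abamdm_w.get)
print(f"{top}: {100 * abamdm_w[top]:.4f}% -> {abamdm_budget[top]:,.0f} NTD")
```

With a = 0.2, connection carries four times the importance of concentration in each score, and the connection values also vary far more across departments than the concentration values do, so the ranking is driven almost entirely by connection.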

    From the tables and figures given above, it was found that

    1. In Table 4, although the department of Fiber Engineering showed the highest ARPM(28.0000 in Table 3), it neither obtained the highest connection, nor the budget.

    Fig. 5. The concentrations and connections for 17 departments.

    Fig. 6. Number of members, number of records, and final budget for 17 departments.

    144   S.-C. Kao et al. / Information Processing and Management 39 (2003) 133–147 

  • 8/17/2019 MELJUN CORTES RESEARCH PAPERS DSS Dss for the Academic Library Acquisition Budget Allocation via Circulation …

    13/15

    2. In Fig. 4, the department of Public Communication showed the highest concentration (2.1175 in Table 4). This implied that the categories it used were not equally distributed. However, Fig. 5 indicated that it did not obtain a high value of connection, which suggests that most of the materials it made use of were not in its subject. This was supported by the result shown in Fig. 6 that it obtained a low budget. A similar result was found for the department of Early Childhood Care and Education.

    3. In Fig. 4, the department of Electrical Engineering obtained the lowest value of concentration (1.8863 in Table 4), but the largest number of records (13,498 in Table 3). Accordingly, it was found that it made use of materials in a variety of categories, which implied that some of them were in its subject and some were not. Therefore, the total connection it obtained was not high (in Fig. 5), and neither was the allocated acquisition budget (in Fig. 6).

    4. In Fig. 5, the department of Environmental Engineering obtained the highest value of connection (6.3602 in Table 4), but its observed concentration was low in comparison to the others (in Fig. 4) and its number of members was not large. Nevertheless, it finally obtained the highest allocated budget (149,001 in Table 4). It seems that the value of a played a considerable role in this case.

    5. In Fig. 6, the department of Mechanical Engineering showed the largest number of members (1041 in Table 3), but did not obtain the largest budget. This implied that the size of a department was not the only factor determining the final acquisition budget.

    6. The acquisition budget allocated via ARPM depended entirely on the ARPM. Fig. 7 shows that the result produced by ABAMDM was fairly different from that produced by ARPM.
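Finding 6 can be illustrated with a minimal sketch of an allocation driven by ARPM alone. It assumes that ARPM is the number of circulation records divided by the number of members and that each department receives a share of the total proportional to its ARPM. Neither the exact ARPM formula nor the allocation rule appears in this part of the paper, so both are assumptions. The departments, counts, and total budget are hypothetical and are not taken from Table 3 or Table 4.

```python
# Hypothetical total acquisition budget (not the figure used in the paper).
TOTAL_BUDGET = 1_000_000

# Hypothetical departments: name -> (number of members, number of records).
departments = {
    "Large dept": (1000, 8000),
    "Mid dept": (400, 6000),
    "Small dept": (50, 1400),
}


def allocate_by_arpm(depts, total):
    """Split `total` in proportion to records per member (assumed ARPM)."""
    arpm = {name: records / members for name, (members, records) in depts.items()}
    arpm_sum = sum(arpm.values())
    return {name: total * value / arpm_sum for name, value in arpm.items()}


budgets = allocate_by_arpm(departments, TOTAL_BUDGET)
for name, amount in budgets.items():
    print(f"{name}: {amount:,.0f}")
```

In this sketch the smallest department receives the largest share because only the ratio of records to members matters. Such a rule ignores whether the borrowed materials are relevant to the department, which is the aspect ABAMDM adds through connection.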

    From the findings described above, a notable implication is that a high acquisition budget depends not only upon the number of records and the number of members, but also upon the suitability of use. Although it is difficult to evaluate ABAMDM against ARPM from the resulting allocated acquisition budgets alone, it is believed that examining material utilization with respect to relevance and bringing it into the process of acquisition budget allocation is a useful way to make evident the value of the materials for which we budget.

    Fig. 7. The allocated acquisition budget via ABAMDM and ARPM for 17 departments.

    4. Concluding remarks

    This paper has addressed the importance of circulation data processing in basic detail, introduced a budget allocation model using data mining techniques, illustrated the use of ABAMDM, and demonstrated an application. The proposed ABAMDM employs SQL to help preprocess the circulation data if necessary, information theory to measure the concentration of categories observed in the circulation data table, and the utilization connection to derive the weights as a decisional base for acquisition budget allocation. It offers a new way of processing the circulation data at hand to elicit information that can interpret the data in an appropriate mode. The knowledge discovered by ABAMDM can be used to support decisions regarding the library acquisition budget allocation. However, although ABAMDM can provide the budget allocation operation with helpful information by mining the circulation data table at hand, subjective information will greatly influence the final results. For example, the definition of semantic strength for categories and departments and the value of a are factors that need to be determined carefully. Furthermore, based on the reviewers' comments listed in Table 2, this model may be more applicable if computerized, which would be an extension of this study.
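The first two stages summarized above, SQL preprocessing followed by an information-theoretic measure over categories, can be sketched as follows. This is only an illustration under stated assumptions: the table name `circulation`, its two columns, and the sample rows are hypothetical, and plain Shannon entropy is used as a stand-in measure. The paper's own concentration formula is defined in an earlier section and may be scaled or oriented differently.

```python
import math
import sqlite3


def category_entropy(counts):
    """Shannon entropy (in bits) of a list of per-category usage counts."""
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


# Hypothetical circulation table: one row per loan record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE circulation (department TEXT, category TEXT)")
sample = [("Dept A", cat) for cat in ("000", "100", "200", "300")]
sample += [("Dept B", "400")] * 4
conn.executemany("INSERT INTO circulation VALUES (?, ?)", sample)

# SQL preprocessing step: count records per department and category.
query = (
    "SELECT department, category, COUNT(*) "
    "FROM circulation GROUP BY department, category"
)
counts_by_dept = {}
for dept, _category, n in conn.execute(query):
    counts_by_dept.setdefault(dept, []).append(n)

# Stand-in information measure per department (not the paper's formula).
entropy_by_dept = {d: category_entropy(c) for d, c in counts_by_dept.items()}
print(entropy_by_dept)
```

In this toy data, Dept A spreads its four loans evenly over four categories and Dept B borrows from a single category, so the two departments sit at opposite ends of the measure. Deriving the weights would additionally require the utilization connection, which is not sketched here.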

    The availability of materials via the Internet is rapidly changing library strategy, driving a transition from print to electronic forms. On-line materials (or electronic materials) are very expensive at this time. Importantly, the large amount of expenditure during the past decade has revealed that a rather swift shift or reallocation of the collection budget from print to electronic publications makes the budget allocation decision more complex and difficult (Miller, 1999). For example, how should comprehensive preservation of electronic materials be budgeted without running into copyright and migration problems? How can the most advantageous on-line database licenses be negotiated for users, and how should these titles be cataloged? What can be relied on when deciding which electronic journals or e-books are good for our library? We believe that data collection via daily circulation work will be greatly influenced by the way a user makes use of on-line materials, which in consequence makes the budget allocation operation even more difficult. Although many issues and arguments have been brought onto the discussion and research platform, it is believed that “how to use right money to buy right things” remains a core question while budgeting. It will be valuable to discover unknown information in historical data to support budget allocation decisions.

    References

    Anderson, D. R., Sweeney, D. J., & Williams, T. A. (1994). An introduction to management science: quantitative approaches to decision making (pp. 593–622). New York: West Publishing Company.

    Budd, J. M., & Adams, K. (1989). Allocation formulas in practice. Library Acquisitions: Practice & Theory, 13, 381–390.

    Connolly, T. M., Begg, C. E., & Strachan, A. D. (1996). Database Systems: A Practical Approach to Design, Implementation, and Management. New York: Addison-Wesley.

    Crotts, J. (1999). Subject usage and funding of library monographs. College & Research Libraries, 60, 261–273.

    Decroos, F., Dierckens, K., Poller, V., Rousseau, R., Tassignon, H., & Verweyen, K. (1997). Spectral method for detecting periodicity in library circulation data: a case study. Information Processing & Management, 33(3), 393–403.

    Dhar, V. (1998). Data mining in finance: using counterfactuals to generate knowledge from organizational information systems. Information Systems, 23(7), 423–437.

    Donato, J. M., Schryver, J. C., Hinkel, G. C., Schmoyer, R. L., Jr., Leuze, M. R., & Grandy, N. W. (1999). Mining multi-dimensional data for decision support. Future Generation Computer Systems, 15(3), 433–441.

    Evans, M. (1996). Library acquisitions formulae: the Monash experience. Australian Academic & Research Libraries, 27, 47–57.

    Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17, 37–54.

    Fayyad, U., & Stolorz, P. (1997). Data mining and KDD: promise and challenge. Future Generation Computer Systems, 13(2–3), 99–115.

    Greaves, F. L., Jr. (1974). The allocation formula as a form of book fund management in selected state-supported academic libraries, Florida State University, unpublished doctoral dissertation.

    Hamaker, C. (1995). Time series circulation data for collection development or: you can’t intuit that. Library Acquisitions: Practice & Theory, 19(2), 191–195.

    Han, J., & Fu, Y. (1999). Mining multiple-level association rules in large databases. IEEE Transactions on Knowledge and Data Engineering, 11(5), 798–805.

    Hirota, K., & Pedrycz, W. (1999). Fuzzy computing for data mining. Proceedings of the IEEE, 87(9), 1575–1600.

    Hui, S. C., & Jha, G. (2000). Data mining for customer service support. Information and Management, 38(1), 1–13.

    Lafouge, T., & Laine-Cruzel, S. (1997). A new explanation of the geometric law in the case of library circulation data. Information Processing & Management, 33(4), 523–527.

    Miller, R. G. (1999). Electronic journals and the scholarly communication process: present and future. In C.-C. Chen (Ed.), IT and Global Digital Library Development (pp. 293–300). Massachusetts: MicroUse Information.

    Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.

    Quinlan, J. R. (1987). Simplifying decision trees. International Journal of Man-Machine Studies, 27, 221–234.

    Sestito, S., & Dillon, T. (1994). Automated knowledge acquisition. Englewood Cliffs, NJ: Prentice Hall.

    Sorgenfrei, R. (1999). Slicing the pie: implementing and living with a journal allocation formula. Library Collections, Acquisitions & Technical Services, 23(11), 39–45.

    Sung, H. H., & Sang, C. P. (1998). Application of data mining tools to hotel data mart on the Intranet for database marketing. Expert Systems with Applications, 15(1), 1–31.

    Tuten, J. H., & Lones, B. (1995). Allocation Formulas in Academic Libraries. Chicago, IL: Association of College and Research Libraries.

    Wise, K., & Perushek, D. E. (1996). Linear goal programming for academic library acquisition allocation. Library Acquisitions: Practice & Theory, 20(3), 311–327.

    Wise, K., & Perushek, D. E. (2000). Goal programming as a solution technique for the acquisition allocation problem. Library & Information Science Research, 22(2), 165–183.
