QueryEngine (1)

Embed Size (px)

Citation preview

  • 8/17/2019 QueryEngine (1)

    1/19

    Query Engine

    A Pattern for Performing Dynamic Searchesin Information Systems

    Tim Wellhausen

    [email protected]

    http://www.tim-wellhausen.de

     Jan 24, 2006

    Abstract: This paper presents an architecture pattern for information sstemsthat re!uire comple" search capa#ilities. The pattern includes means to $ener%icall descri#e search re!uests, a ser&ice that interprets search re!uests and e"%ecutes them on data sources, and strate$ies for transmittin$ results #ack to there!uestin$ clients.

    - ' -

  • 8/17/2019 QueryEngine (1)

    2/19

    Context

    ( tpical information sstem retrie&es and manipulates #usiness data in man different was.)earch re!uests to fetch #usiness data are often created and e"ecuted in the middle tier of athree-tier sstem architecture* sometimes the are dnamicall created in a client application.These are a few e"amples +see i$. ':

    ● usiness lo$ic has to #e applied to a set of #usiness o#ects. The o#ects ha&e to #e foundand retrie&ed #efore the can #e modified.

    ● ( client application pro&ides a search dialo$ that lets users dnamicall compose comple"search re!uests. (n application ser&er e"ecutes the !ueries on a data source and returnsthe results to the client application, where the are shown.

    ● ( reportin$ tool has to find and process data from different sources. reatin$ the reportsma in&ol&e operations that are not directl supported # all underlin$ data sources.

    These e"amples ha&e in common that a fle"i#le mechanism for e"ecutin$ search re!uests isneeded. )ince the same search re!uirements tpicall appl to different tpes of #usiness data, a

    $eneric mechanism is helpful to a&oid redundancies.

    Problem

    (n information sstem needs comple" search capa#ilities. 1t ma #e possi#le to manuall write

    nati&e !uer statements, e"ecute those !ueries, e&aluate the results, and send the results #ack tothe re!uestin$ clients. This approach has se&ere draw#acks if an of the followin$ forces appl:

    ● Dynamics. )ome features of an information sstem re!uire dnamic !ueries at runtime. 1na search dialo$ for e"ample, users ma create !ueries # ar#itraril com#inin$ $i&ensearch criteria. (lthou$h the list of search criteria ma #e fi", there ma #e too man &alidcom#inations to implement static, parameteried ser&ices for e"ecutin$ the !ueries.

    ● Heterogeneity. 3ata sources from different &endors must #e supported, sometimes e&endifferent tpes of data sources: relational data#ases, 5 data#ases, 3(7 ser&ers, full-te"t search en$ines, or le$ac sstems. 8sin$ separate search li#raries with different (71sadds e"tra comple"it to a proect.

    - 2 -

     Fig. 1: A system architecture that includes a Query Engine

  • 8/17/2019 QueryEngine (1)

    3/19

    ● Impedance mismatch. 9ueries that refer directl to a data model ma #e inappropriate. 1nan o#ect-oriented sstem, for e"ample, !ueries should refer to the classes and attri#utesof an o#ect-oriented domain model that is mapped to ta#les and columns of a relationaldata#ase.

    ● Loose coupling. (lthou$h client applications ma create search re!uests, the applications

    should not #e #ound to data#ase internals.

    ● Query syntax. ritin$ correct !uer statements in the nati&e !uer lan$ua$e ma #e diffi%cult for an application de&eloper. The data source ma ha&e an uncommon !uer lan$ua$eor there ma #e too few e"perts a&aila#le for writin$ nati&e !ueries.

    ● Distriuted !ueries. 1n a distri#uted en&ironment, a sin$le !uer ma span o&er multipledata sources. 1nstead of e"ecutin$ a !uer on each sstem separatel and com#in$ the res%ults manuall, e"ecutin$ such a !uer should appear to a client as if there were a sin$ledata source.

    Solution

    reate a Query Engine, that is a ser&ice that takes a description of a search re!uest, e&aluatesand e"ecutes the re!uest, and returns the results #ack to the caller. This ser&ice acts as an inter%mediate laer #etween clients and the underlin$ data sources # interpretin$ search re!uestsand shieldin$ the clients from details on how to access the data sources.

    ( 9uer ;n$ine separates the formulation of indi&idual search re!uests from their e"ecution. 1tencapsulates the process how data sources are accessed, how nati&e !uer statements are formu%lated, and how those statements are e"ecuted. 9ueries ma #e created at runtime if the are#ased on dnamic user input or the ma #e created at compile-time if the are #ased on static#usiness re!uirements.

    ( 9uer ;n$ine handles the indi&idual features and limitations of each data source in&ol&edwhile processin$ search re!uests and creatin$ nati&e !uer statements. 1t ma also translate ref%erences to a domain model into references to a data model.

    ( 9uer ;n$ine mi$ht #e #uilt as a reusa#le component or as a customied ser&ice. 1n theformer case, a 9uer ;n$ine can #e made independent of specific data sources, specific !uerlan$ua$es, and application-specific domain models. 1n the latter case, some parts of the 9uer;n$ine ma #e li#rar code, whereas other parts ma #e proect code.

    lients ma access a 9uer ;n$ine # performin$ local procedure calls, remote procedure calls,or # callin$ a e# )er&ice. The interface of a 9uer ;n$ine contains classes that representsearch re!uests and classes that wrap results.

    1n the #asic case, the 9uer ;n$ine pattern has three participants as shown in i$. 2: 9uer,9uer ;n$ine, and

  • 8/17/2019 QueryEngine (1)

    4/19

    ( Query descri#es a search re!uest. 1t includes the search tar$et, a condition that refers to thedata records to #e searched, as well as additional options.

    The Query Engine performs a search re!uest: it translates a 9uer into a nati&e !uer state%ment, e"ecutes it on a data source, and returns a

    no 9ueries need to #e chan$ed.

    ( 9uer ;n$ine separates #usiness lo$ic from technical details on how search re!uests are e"%ecuted. (pplication de&elopers ma concentrate on implementin$ #usiness re!uirementswithout knowin$ details a#out the data sources.

    ( 9uer ;n$ine ma pro&ide features that are missin$ in a specific data source. 1t ma emulatethose features so that the application de&elopers don?t need to implement manual workarounds.1f search re!uests span o&er multiple data#ases, a 9uer ;n$ine is responsi#le for e"ecutin$ sep%arate nati&e !ueries and com#inin$ their results.

    1n o#ect-oriented proects, sometimes it is necessar to e"ecute nati&e !uer statements directlon a data#ase. ( 9uer ;n$ine is a#le to transparentl support #oth 9ueries that refer to ano#ect-oriented domain model and 9ueries that refer directl to data#ase ta#les and columns.

    There are also the followin$ dra'ac(s:

    3e&elopers that oin a proect ha&e to learn how a 9uer ;n$ine works and how the imple%ment search re!uests. ;&en if the know the nati&e !uer lan$ua$e well, it takes time to $et usedto new search facilities.

    1t mi$ht #e difficult and time-consumin$ to pro&ide a#stractions for all features a&aila#le in adata#ase sstem. 1f there is onl a sin$le data#ase sstem in&ol&ed that is unlikel to chan$e, the#enefit of a#straction is reduced.

    )ome features of a 9uer ;n$ine are costl to implement. 1f those features are needed onl a

    couple of times, the ma not #e worth the effort.

    3e&elopers ma create 9ueries that show performance pro#lems #ecause the de&elopers are notcon&ersant enou$h with limitations of the underlin$ data#ase sstem.

    (ddin$ a#stractions to features alread pro&ided # and optimied for a data#ase sstem maadditionall de$rade performance.

    1n some cases, usin$ a data warehouse sstem ma sol&e the pro#lem #etter. )uch a sstempro&ides a common laer for re!uestin$ information from different data sources.

    - 4 -

  • 8/17/2019 QueryEngine (1)

    5/19

    Known uses

    The Query Engine pattern has #een used in se&eral industr proects, #oth for rich client andfor we# applications. The author has worked with two 9uer ;n$ine implementations and hashimself de&eloped another one.

    1n a proect for a erman tele&ision compan, a 9uer ;n$ine was #uilt as a transparent laerto hide disparate data sources from client applications. 9ueries were performed on two differentrelational data#ase sstems and on a full-te"t search en$ine. The implementation facilitated9ueries that spanned o&er multiple data sources.

    1n a proect for another tele&ision compan, a 9uer ;n$ine was used #oth as a #ack-end fordnamic search dialo$s and for retrie&in$ #usiness o#ects from a relational data#ase. The searchdialo$ presented a simplified structure of the #usiness data that could #e !ueried. The 9uer;n$ine modified the search re!uests appropriatel #efore e"ecutin$ them.

    Related Work

    1n his #ook A7atterns of ;nterprise (pplication (rchitectureA BowlerC, 5artin owler de%scri#es the Query Object pattern. 1n his words, Aa 9uer D#ect is an interpreter BoC, thatis, a structure of o#ects that can form itself into a )9 !uer. Eou can create this !uer # re%ferrin$ to classes and fields rather than ta#les and columns.A

    ( shortcomin$ of that approach is that the 9uer D#ect itself creates the nati&e !uer state%ment. Therefore, the pattern shouldn?t #e used in client applications #ecause it would #ind theapplications to the !uer lan$ua$e of the data source. urthermore, this approach does not sup%port hetero$eneous data sources and ma cause redundancies if different kinds of 9ueries areneeded.

    The term F9uer ;n$ineG is widel used in the area of searchin$ in 5 data. There it de%

    scri#es mechanisms that ena#le searchin$ # applin$ an 5 search lan$ua$e such as 7ath.These 9uer ;n$ines normall do not address the forces that are essential to this pattern.

    - H -

  • 8/17/2019 QueryEngine (1)

    6/19

    Implementation

    This section descri#es the implementation of a 9uer ;n$ine. 7lease note that there are manwas how a 9uer ;n$ine can #e #uilt. This paper presents one implementation that resol&esmost of the forces. This section introduces the implementation for a #asic 9uer ;n$ine* addi%tional &ariations follow later on.

    Running Example

    (ll e"amples are #ased on a simple case stud: a compan needs a customer relationship man%a$ement +

  • 8/17/2019 QueryEngine (1)

    7/19

    The class dia$ram in i$. 4 shows the structure of a 9uer implementation that follows the lat%ter case'.

    (n o#ect of the main class, Query, takes the name of the search tar$et, the ma"imum num#er ofresults, a reference to the condition of the search re!uest, a filter, and sort options. (Condition o#ect is either a comparison of a field with a &alue +FieldCondition, a collectionof conditions com#ined # a oolean operator +CombinedCondition, or a su# !uer+SubQuery. ( Filter instance takes a list of those columns the client e"pects to retrie&e fromthe ser&er for each #usiness data record.

    FieldCondition  instances refer to columns of the search tar$et ta#le. (ssociations to otherta#les are handled # SubQuery instances. )u# !ueries are an a#straction of an association in thedata model. The do not determine whether the are resol&ed # usin$ an )9 su# select state%ment or # oinin$ the in&ol&ed ta#les.

    The

  • 8/17/2019 QueryEngine (1)

    8/19

      query#setSort(ne Sort(!due!, false))"

      CombinedCondition condition = ne CombinedCondition(-perator#./D)"  condition#addCondition(  ne FieldCondition(!balanced!, Comparator#0Q1.2S, ne Inteer()))"  condition#addCondition(

      ne FieldCondition(!due!, Comparator#3&0.40&50Q1.2S, firstDate))"  condition#addCondition(  ne FieldCondition(!due!, Comparator#20SS50Q1.2S, lastDate))"  query#setCondition(condition)"

      return query"

    (s an e"ample for a su# !uer, suppose that onl those in&oices should #e listed that are notfrom premium customers. To fulfill this re!uirement, the application de&eloper adds a su#!uer to include onl customers that are not premium customers:

    private Query createQueryForDueInvoices/o6remiumCustomers(  Date firstDate, Date lastDate)

    {  Query invoiceQuery = createQueryForDueInvoices(firstDate, lastDate)"

      Query query = ne Query(!customer!)"  query#setFilter(ne Filter(ne Strin*+{!id!))"  query#setCondition(  ne FieldCondition(!premium!, Comparator#0Q1.2S, !!))"  SubQuery subQuery = ne SubQuery(!customer5id!, query)"

      invoiceQuery#addCondition(subQuery)"

      return invoiceQuery"

    The 9uer implementation presented in this section has #oth ad&anta$es and draw#acks. or a Ja&a pro$rammer, writin$ 9ueries ma #e easier than writin$ )9 code. The de&eloper doesnot need to know the escape snta" for passin$ &alues likes 3ate o#ects to the data#ase if hedoes not use 6reparedStatements , for instance. Dn the other hand, FieldCondition o#ectsrefer to data#ase columns and therefore couple the code to the data model as does plain )9. 1nthis case, the &ariation /ect0riented Domain +odel  helps.

    The e"ample 9ueries map easil to )9. ( 9uer ;n$ine ma ne&ertheless pro&ide operationsthat do not map directl to )9 #ut in&ol&e additional source code, for e"ample to performcomple" calculations #efore or after a )9 statement is e"ecuted. 1t is the responsi#ilit of the9uer ;n$ine to hide such comple"it from the client and to pro&ide hooks to inte$rate those

    functions.

    Query Engine

    The 9uer ;n$ine takes 9ueries, creates e!ui&alent )9 statements, e"ecutes the )9 state%ments, retrie&es the results, and returns the results wrapped in a

  • 8/17/2019 QueryEngine (1)

    9/19

    i&en the Query o#ect for due in&oices that was created in the pre&ious section, the translationto )9 is strai$htforward. The 9uer ;n$ine inspects each part of the 9uer o#ect and $ener%ates correspondin$ )9 code.

    ● The tar$et of the Query #ecomes part of the ?from? clause: from invoice.

    ● The Query has a filter that restricts the results to three columns. (dditionall, a result setlimit is defined. The followin$ ?select? clause is created: select top ' id, amount,due.

    Transformin$ the Condition o#ect in&ol&es &isitin$ the nodes of the Condition o#ecttree, usin$ either the Iterator or the Visitor pattern BoC. The e"ample contains threeFieldCondition o#ects that are com#ined # an (3 operator. Transformed into)9, a ?where? clause is $enerated: 7ere balanced = and due 8= 9:;

  • 8/17/2019 QueryEngine (1)

    10/19

      &esultSet resultSet = ne Simple&esultSet(results)" return resultSet"

    private $ap retrieveData&ecord(@ava#sql#&esultSet resultSet,

    Filter filter) {

      BB 6ut all columns9 values into a 7as7 map as defined by t7e filter  $ap data4ransferas7$ap = ne as7$ap()"  for (int i = " i > filter#siEe()" i) {  Strin column/ame = filter#et(i)"  -b@ect value = resultSet#et(column/ame)"  data4ransferas7$ap#put(column/ame, value)" 

      return data4ransferas7$ap"

    The

  • 8/17/2019 QueryEngine (1)

    11/19

    Result Set

    (

  • 8/17/2019 QueryEngine (1)

    12/19

    ariations

    This section introduces four &ariations=:

    ● ( Query 2ransformer  encapsulates how the 9uer ;n$ine creates a nati&e !uer state%ment.

    ● ( Query #erformer  encapsulates how the 9uer ;n$ine e"ecutes a nati&e !uer statementand how it creates a

  • 8/17/2019 QueryEngine (1)

    13/19

    ates a fra$ment of the whole )9 statement, com#ines it with the fra$ments of its child o#ects,and returns the com#ined phrase.

    To support transformations into dialects of )9 that differ onl sli$htl from standard )9,create su# classes of indi&idual transformer classes that perform the re!uired #eha&ior. 1f, for e"%ample, creatin$ a su# !uer statement for data#ase ( is different from the standard case, a classSQ2.SubQuery4ransformer   inherits from SQ2SubQuery4ransformer . 1n that case, thebstractFactory pattern +BoC simplifies the construction of the transformer o#ect tree.

    1f a !uer should #e automaticall optimied #efore it is e"ecuted on a data source, the trans%former o#ects ma #e manipulated #efore the create the nati&e !uer statement. (n kind ofmodification is possi#le. Eou could, for e"ample, anale a transformer tree and modif eachsu# !uer to use oins instead of su# select statements.

    9uer Transformer o#ects are created # a 9uer Transformer actor, as can #e seen in i$.L. )uch a factor e"amines a Query  o#ect and creates and returns an appropriateQuery4ransformer   instance. 1n addition to a )9 9uer Transformer, man other trans%formers are possi#le, for e"ample for the Mi#ernate !uer lan$ua$e M9H or for 3(7 !ueries.

    8sin$ this approach, the same !uer could #e transformed transparentl into different nati&e!uer lan$ua$es. The factor chooses the appropriate 9uer Transformer either # applin$heuristics +if the !uer tar$et is a class name then return the Mi#ernate transformer, etc. or #specific settin$s.

    usin$ 9uer Transformers, the core of the 9uer ;n$ine is separated from details of creat%in$ nati&e !uer statements. 1f new data sources with different !uer lan$ua$es or !uer dialects

    ha&e to #e inte$rated, new 9uer Transformer implementations can #e added without modif%in$ the core.

    Query Per"ormer

    1f there is onl a sin$le data source, the code for e"ecutin$ a nati&e !uer statement can reason%a#l #e part of the core of the 9uer ;n$ine. 8se distinct 9uer 7erformers instead when oneor more of the followin$ constraints appl:

    ● The proect in&ol&es data sources that pro&ide different (71s for !uerin$.

    H Mi#ernate is an open source o#ect/relational mappin$ tool. The &ariation /ect0riented Domain+odel  e"plains the inte$ration of such a tool in-depth.

    - '= -

     Fig. 5: Query 2ransformer Factory

  • 8/17/2019 QueryEngine (1)

    14/19

    ● 3ifferent persistence strate$ies ha&e to #e inte$rated.

  • 8/17/2019 QueryEngine (1)

    15/19

    The implementation of a 9uer ;n$ine that uses #oth 9uer Transformers and 9uer 7er%formers is simplified:

    Query0nine?ean#@avaA

    public &esultSet e%ecuteQuery(Query query) {

      Query4ransformer query4ransformer =Query4ransformerFactory#createQuery4ransformer(query)"

      Strin queryStatement = query4ransformer#createStatement()" Query6erformer query6erformer =Query6erformerFactory#etQuery6erformer(query)"

     &esultSet resultSet = performer#performQuery(queryStatement)"

     return resultSet"

    #b$ect%#riented &omain 'odel

    5an information sstems ha&e a persistence laer that supports an o#ect-oriented domainmodel. These laers map #etween #usiness o#ects and their correspondin$ data#ase ta#les andcolumns and retrie&e and store #usiness o#ects. i$. '' shows an updated reference architecturefrom i$. that includes a persistence laer.

    (n application that uses an o#ect-oriented domain model should a&oid 9ueries that refer todata#ase ta#les and columns. 1nstead, 9ueries should refer to #usiness classes and their attri#%utes. The 9uer implementation introduced earlier in this paper does not make an assump%tions a#out the domain model #ecause all references are $i&en as plain strin$s. 1t can therefore#e used unmodified.

    i$. '2 shows the class dia$ram of the #usiness domain model of the

  • 8/17/2019 QueryEngine (1)

    16/19

    or a client, writin$ a 9uer #ased on this domain model is strai$htforward. The followin$code creates a !uer for due in&oices, similar to the e"ample $i&en earlier:

    private Query createQueryForDueInvoices(Date firstDate, Date lastDate) {

      Query query = ne Query(Invoice#class#et/ame())"  query#set$a%imum&esults(')" query#setFilter(ne Filter(ne Strin*+{Invoice#class#et/ame()))"

      query#setSort(ne Sort(Invoice#D10, false))"

      CombinedCondition condition = ne CombinedCondition(-perator#./D)"  condition#addCondition(  ne FieldCondition(Invoice#?.2./C0D, Comparator#0Q1.2S,?oolean#F.2S0))"  condition#addCondition(  ne FieldCondition(Invoice#D10, Comparator#3&0.40&50Q1.2S, firstDate))"  condition#addCondition(  ne FieldCondition(Invoice#D10, Comparator#20SS50Q1.2S, lastDate))"

      query#setCondition(condition)" return query"

    This code differs from the earlier e"ample, which was #ased on data#ase ta#les and columns:

    ● The names of #usiness fields are not hard coded. 1nstead, class names and constants areused.

    ● The filter does not contain indi&idual field names #ecause #usiness o#ects are alwasloaded completel. Therefore, the name of the tar$et #usiness class is set.

    ● The data#ase column balanced was compared to the inte$er &alue . That column is nowmapped to the #oolean attri#ute Invoice#?.2./C0D, and it is compared to the &alue?oolean#F.2S0.

    or a client, e"ecutin$ a 9uer referrin$ to an o#ect-oriented domain model is the same as e"%ecutin$ a 9uer referrin$ to a relational data model. The 9uer ;n$ine must ha&e #oth a spe%cialied 9uer Transformer and a 9uer 7erformer: the 9uer Transformer creates a statementin the nati&e !uer lan$ua$e of the persistence laer and the 9uer 7erformer e"ecutes the nat%i&e !uer statement # callin$ the persistence (71.

    The open source persistence li#rar Mi#ernate, for e"ample, could #e inte$rated into a 9uer;n$ine # creatin$ a Q2Query4ransformer  and a ibernate6erformer .

    5an persistence laers support loadin$ o#ect $raphs. This indicates that not onl the tar$eto#ect is retrie&ed #ut also associated o#ects. 1n most cases, thou$h, not all associated o#ectsare needed. ( hierarchical filter allows to e"plicitl define a su# set of the o#ect $raph that

    - '6 -

     Fig. 1": An o/ect0oriented domain model for the *$+ application

  • 8/17/2019 QueryEngine (1)

    17/19

    should #e retrie&ed. 1f, for e"ample, #oth the customer and the in&oice address are re!uired, thefollowin$ filter ma #e set:

    ne Filter(ne Strin*+{!Invoice!,  !Invoice#customer!,  !Invoice#customer#invoice.ddress!)"

    The $i&en filter lists all classes and the fields that form the associations #etween the classes. Mowsuch hierarchical filters are implemented depends on the persistence laer.

    Dnl those persistence li#raries and components can #e inte$rated into a 9uer ;n$ine thatsupport dnamicall created !uer statements. ;ntit eans that are part of the ;J frameworkcannot #e used: 9ueries in ;J-9 ha&e to #e staticall defined in descriptors* onl parameterscan #e set. Therefore, a J2;; en&ironment re!uires a more fle"i#le o#ect/relational mappin$tool.

    ithin an application, se&eral approaches to persistence ma co-e"ist and #e com#ined. 1n the

  • 8/17/2019 QueryEngine (1)

    18/19

    The class -nDemand&esultSet forwards each re!uest for result set elements to a ser&er com%ponent. The ser&ice &esultSet.ccessService  has to #e stateful and must #e #ound to the-nDemand&esultSet o#ect.

    There are two alternati&es on how the ser&ice ma fulfill the re!uests: it ma permanentl holda connection to the nati&e result set o#ect, for e"ample to a @ava#sql#&esultSet . Dr it ma

    keep a list of identifiers of those #usiness o#ects that were retrie&a#le when the nati&e !uerwas e"ecuted. The first alternati&e mi$ht cause resource shorta$es when man search results arekept open. The second alternati&e ields further data#ase !ueries whene&er the client applicationcalls the

  • 8/17/2019 QueryEngine (1)

    19/19

    (cknowledgements

    1?d like to thank m ;uro7o7 shepherd (llan Oell who helped a lot to impro&e the paper. Mepro&ided man useful comments, especiall re$ardin$ the forces of the pattern and the structureof the paper.

    1 would also like to thank the participants of ;uro7o7 2004 who workshoped the paper. The$a&e &alua#le feed#ack that helped me to put the paper into its final form.

    ast #ut not least, 1 would like to thank ut Mankewit, 3ominik 3unekamp, and (stridauereisen. ut in particular $a&e important feed#ack on earl drafts.

    Re"erences

    BowlerC #atterns of Enterprise Application Architecture. 5artin owler. 2002.(ddison esle 7rofessional.

    BoC Design #atterns. Elements of $eusale /ect0riented %oft'are.;rich amma,