Daffodil Replica Tor Testing

    Long Running User Transactions in Database Systems

    Master-Master Database Replication

    Fabian Merki, merkisoft informatik

    Table of Contents

    1 Preface
    2 Application
    3 Concept overview
        3.1 Long running user transactions
        3.2 Master-master or multi master replication
    4 Use Case
    5 Evaluation
        5.1 Oracle Workspace
            5.1.1 Example
            5.1.2 Conflict resolution
            5.1.3 Conclusion of Oracle's Workspace
        5.2 Daffodil Replicator
            5.2.1 Testing Daffodil Replicator
            5.2.2 Conclusion
        5.3 Hibernate
            5.3.1 What does Hibernate offer?
            5.3.2 Replication modes
            5.3.3 What is missing in Hibernate?
        5.4 Others
            5.4.1 Microsoft SQL Server
            5.4.2 Slony I / II
        5.5 Conclusion
    6 Design and implementation of replication with Hibernate
        6.1 Methods of replication
        6.2 Replication Framework
        6.3 Algorithm
        6.4 Additional replication table
        6.5 Conflict resolution
        6.6 Dependencies
        6.7 Additional unique constraint vs. UUID
        6.8 Transaction handling
    7 Testing
    8 Conclusion
    9 References

    Fabian Merki, merkisoft informatik 5.11.2006 2 / 26

    1 Preface

    The subject "Long Running User Transactions in Database Systems" was selected for this assignment because of a requirement that arose from Course Administration software that was under development at that time. Refer to http://kursweb.merkisoft.ch for more information on the Course Administration application.

    This application had specific database replication requirements, the likes of which I had not encountered before. The assignment was a perfect opportunity to investigate what solutions are available on the market and then to test a solution in a real-world scenario.

    I decided to write the document in English because I feel the Course Administration application is not the only software that could benefit from such a solution. Documenting my findings in English on the web opens up the results to a broader spectrum of people than if they were written in my native German. It also gave me the opportunity to practise writing technical documentation in English, which is a requirement in my current employment.

    Acknowledgements

    I would like to thank my lecturer Mr. Herbet Bitto for his help and support with the content of this assignment, Mr. Steven Hawkes for reviewing the document and the SBB for offering a comfortable environment where, due to time pressures, most of this assignment was conducted. I found it a challenging experience researching and developing this solution whilst commuting on a daily basis.

    I hope you will enjoy reading this document.

    Fabian Merki

    I hereby confirm that everything in this assignment was created, written and drawn by myself unless otherwise stated.

    ________________ __________________________

    Date / Place Fabian Merki

    2 Application

    The following diagram outlines a problem scenario for an application that uses multiple databases. Customers subscribe to courses via the internet. The administrator manages courses, subscriptions, teachers and additional data.

    Figure 1

    A local database was selected because the administrators have a slow internet connection but require fast data access. The centralised database on the internet can be modified at the same time as the local database. At some point the local and the central database must be replicated. Because both databases are master databases, such a replication is called master-master replication. The term master means that the database is updated by a user and therefore becomes the master of the modified data. Multiple local databases might exist, since more than one administrator can manage the data. An organisational process must be established by the administrators so that changes do not get overwritten by others.

    A further requirement of this architecture is that changes may be stored before they are published. Such time-consuming changes by a user are called long-running user transactions.

    Because a course is not a single database record (it is actually a complex structure), the replication process must replicate the whole database as a single entity. Replicating single tables would most likely fail because of the foreign key constraints that exist within the database.

    This document outlines how the previous requirements can be addressed using existing off-the-shelf products together with custom software.

    3 Concept overview

    3.1 Long running user transactions

    Databases in general only support the concept of all-or-nothing transactions. Either the whole transaction completes and is visible to everyone, or the transaction has to be rolled back because of an error condition.

    If a user works for several hours or days on a specific task, they will most certainly require data persistence to protect themselves from data loss. Most database vendors offer such functionality, e.g. Oracle with its Workspace feature. Oracle originally developed this feature for geographical data management. Generating a map from the raw data probably takes months, requiring many people to work in parallel on subsets of the data. From time to time the changes from the team need to be brought together.

    3.2 Master-master or multi master replication

    When using local databases for faster performance, the data has to be replicated (synchronized) with the central database. If the local database is only used for queries, the replication is called master-slave replication, i.e. changes from the master are replicated to all slaves. This is simple, and almost every database vendor offers a product that provides this feature.

    Master-master replication is where the local database is used for updates, inserts or deletes and these modifications must be replicated to other databases which themselves are concurrently serving user requests (including data modifications). In this scenario there are many master databases which update each other.

    This concept is well known from source control systems such as CVS or Subversion, where files are modified by several programmers concurrently. Once a developer has made a change, the work is committed to a central repository, from where others merge the latest version with their own changes. Because most of the time changes are made in different lines within a file, no conflicts occur and the source control system performs these merges automatically. Sometimes two developers change a file in the same area (i.e. the same line). In this scenario, when the second developer checks in their changes, they are asked to resolve the conflict: either their own version or the latest version in the repository is chosen. Very seldom does it happen that one developer deletes a method, variable etc. while another newly refers to it in new code, or that two developers introduce the same new symbol in the same namespace. The source control system does not check for such problems, whereas a database will always perform constraint checks.

    4 Use Case

    Title          Database replication

    Precondition   Local and remote database exist.
                   At least the local database is filled with data.

    Description    For all tables which need to be replicated, the versions in both
                   systems are checked and the corresponding action for each row is
                   applied.

    Postcondition  The local and the remote database are identical in terms of the data
                   in the replicated tables.
                   The versions of the replicated objects are stored.

    Variations     Database download:
                   1. clean / delete local tables
                   2. start replication

    Actors         Administrator

    5 Evaluation

    A major proportion of the available time for this assignment was allocated to locating and evaluating existing products offering solutions for the specified requirements. This chapter provides detailed information on the key features of the products considered.

    It was found that some replication products only cover the simple case of master-slave replication and therefore do not fit the requirements of this assignment. Others do have master-master concepts, but most of them only work if the second instance of the database is always available and are not able to accommodate a dynamic set of master databases.

    The evaluation below focuses on Oracle Workspace, Daffodil Replicator and Hibernate and provides an insight into how these products can be used to solve the master-master replication problem, covering issues such as merging and conflict resolution.

    5.1 Oracle Workspace

    Since version 9i, the Oracle DBMS has had built-in support for long-running user transactions through the use of stored procedures. The basic operations are to create, switch to, refresh, merge back and finally delete a Workspace. The following code illustrates how this can be realised using Workspace functionality.

    5.1.1 Example

    Session 1 Session 2

    execute DBMS_WM.EnableVersioning('emp');

    execute DBMS_WM.CreateWorkspace('NEWWORKSPACE');
    execute DBMS_WM.GotoWorkspace('NEWWORKSPACE');

    update emp set ename='-' || ename;
    commit;

    select ename from emp;

    select ename from emp;

    execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');

    select ename from emp;
    update emp set ename=substr(ename,2);
    commit;

    execute DBMS_WM.RefreshWorkspace('NEWWORKSPACE');

    select ename from emp;

    execute DBMS_WM.RemoveWorkspace('NEWWORKSPACE');

    Example 1

    5.1.2 Conflict resolution

    What happens when user A updates (and commits) the same record in their Workspace as user B does in theirs? When user A merges the changes into the Workspace 'LIVE', nothing special happens: the change is stored successfully (assuming there is no constraint violation) and everything looks as expected. When user B merges this table, a conflict becomes visible in the view <table_name>_CONF and the merge aborts. The conflicting rows then have to be handled manually by invoking DBMS_WM.BeginResolve, DBMS_WM.ResolveConflicts and DBMS_WM.CommitResolve. Finally a second merge must be performed. Because new conflicts may have occurred in the meantime, additional merges might be required. The following example illustrates resolving a conflict in a very simple manner.

    Session 1

    assuming the NEWWORKSPACE exists

    Session 2

    execute DBMS_WM.GotoWorkspace('NEWWORKSPACE');

    update emp set ename='*' || ename;

    commit;

    update emp set ename='X' || ename;

    commit;

    execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');

    --- merge fails because

    --- session 1 & 2 updated the records

    --- overwrite the changes

    --- with the ones of NEWWORKSPACE

    execute DBMS_WM.BeginResolve ('NEWWORKSPACE');

    execute DBMS_WM.ResolveConflicts('NEWWORKSPACE','emp', 'empno>=0','child');

    execute DBMS_WM.CommitResolve('NEWWORKSPACE');
    execute DBMS_WM.MergeWorkspace('NEWWORKSPACE');

    select ename from emp;

    Example 2

    More information is available in [RES-ORACLE-WS].

    5.1.3 Conclusion of Oracle's Workspace

    Oracle's solution meets almost all requirements. The stored procedures of the package dbms_wm make it possible to manage two different states of the data in a database. As long as all users use the same database and different workspaces, the workspace concept works very well. It becomes harder when each user has their own database. In this case the central repository has to reference the remote database. Additionally the merge becomes more complex: copy operations must be performed from the remote table into the central table in a different workspace and vice versa, so that the workspace operations can be performed within single tables.

    However, there is one major drawback: the price! Therefore Oracle Workspace was not an option for my solution.

    5.2 Daffodil Replicator

    Daffodil Replicator is a Java-based data replication tool available in two versions: an open-source and an enterprise version. It offers the following features:

    Bi-directional data synchronization
    Supports two merge strategies, merges single columns of a row
    Supports replication across heterogeneous databases
    Conflict detection and resolution
    Partial data (tables, rows and columns) replication
    Large datatype support
    Scheduling
    Platform independent synchronization
    Debugging

    This product supports bi-directional data replication by either capturing a data source snapshot or by synchronizing the changes. It monitors the tables for data changes and synchronizes all data changes made by the subscriber and the publisher on a periodic basis or on demand by the subscriber. While synchronizing with one or more target data sources, Replicator uses pre-defined conflict resolution algorithms to resolve conflicts between the publisher and subscriber. The publications and subscriptions are defined using a GUI or APIs on existing database servers.

    Figure 2

    Source: [RES-DAFFODIL]

    5.2.1 Testing Daffodil Replicator

    I wrote a small Java tool to test Daffodil Replicator. To simulate a production environment, two processes run separately and perform database manipulations: one is the publisher, the other the subscriber. The subscriber, which is also the test driver, communicates with the publisher via a socket to issue commands. This setup helped to create simple and complex tests.

    Unfortunately Daffodil Replicator in the current version (2.1) has problems deleting rows. The following exception is thrown after a course with its subscriptions was deleted in the publisher database and the synchronize method was called.

    Caught exception: com.daffodilwoods.replication.RepException:
    Problem in synchronizing data due to -- 'DELETE on table 'COURSE' caused a
    violation of foreign key constraint 'SQL060919051050792' for key (2). The
    statement has been rolled back.'.

    The problem only occurs when a course row is deleted on the publisher side, and there is no clear reason why.

    A more minor issue is that Daffodil Replicator adds triggers to tables, and sometimes in my tests the replication completed without actually performing any changes on the other side. The reason for this was that the test cases drop and recreate tables for a clean test setup, which also deletes the triggers. Therefore tables should never be dropped and recreated, because Daffodil will not recreate the triggers.

    5.2.2 Conclusion

    It was quite easy to use the open-source edition of Daffodil Replicator. Apart from the problem with deletes, the replication process works very well. One useful feature is that the smallest unit of the merge operation is a single cell and not an entire row.

    The delete bug rendered this product in its current version unusable for my application.

    5.3 Hibernate

    Hibernate is an open-source object-relational mapping framework for Java. It is able to create a database schema to persist Java objects and to query the database with either SQL or HQL (Hibernate's object-oriented query language). Model classes have to be annotated with @Entity (and optionally @Table) and fields with @Id, @Basic, @OneToMany etc. Alternatives to annotated classes exist but have not been considered in this research.

    The following code illustrates the usage of model classes with annotations from the javax.persistence package:

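    package model;

    import java.util.ArrayList;
    import java.util.List;
    import javax.persistence.*;

    @Entity
    public class Student extends BaseEntity {

        // Mandatory column
        @Column(nullable = false)
        private String name;

        // All operations on a student cascade to its address
        @ManyToOne(cascade = CascadeType.ALL)
        private Address address = new Address();

        // Deleting a student also deletes its subscriptions
        @OneToMany(targetEntity = Subscription.class,
                   cascade = {CascadeType.REMOVE}, mappedBy = "student")
        private List<Subscription> subscription = new ArrayList<Subscription>();

        // [...]
    }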

    Hibernate can automatically create or alter tables of model classes.

    5.3.1 What does Hibernate offer?

    Database access is performed via the Session interface, which contains methods to save, update, saveOrUpdate and delete objects as well as to obtain a transaction or to query data.

    The following extract illustrates how a student is persisted.

    Session s = ...;
    Student student = new Student();
    student.setName("Merki");
    // [...]

    s.saveOrUpdate(student);

    Apart from this basic database access, the Session also has support for replication. The method replicate can take an object from another database and persist it into the current database. To maintain key constraints, Hibernate retains the primary key id even if a key generator is used. It works best when using the UUID key generator. It is very important to have keys that are unique across more than one database, therefore UUIDs are considered reasonable.

    In contrast to Oracle's Workspace, Hibernate is able to manage the full object relationship model and will replicate related objects or cascade deletes to child objects.
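
The need for keys that are unique across databases can be illustrated with a minimal sketch. It uses only java.util.UUID from the Java standard library; the class KeyDemo and its members are hypothetical names chosen for this illustration and are not part of Hibernate or the course application.

```java
import java.util.UUID;

// Hypothetical illustration: sequence-based ids collide across independent
// databases, while random UUIDs need no coordination between them.
class KeyDemo {

    // Simulates a per-database counter, as an auto-increment column would behave.
    static final class SequenceDb {
        private long next = 1;

        long newId() {
            return next++;
        }
    }

    // A random (type 4) UUID in its 36-character textual form.
    static String newUuidKey() {
        return UUID.randomUUID().toString();
    }
}
```

Two SequenceDb instances both hand out the id 1 for their first row, which is exactly the clash that would occur when such rows are replicated into one table. Two calls to newUuidKey() are, for all practical purposes, never equal.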

    5.3.2 Replication modes

    When objects are replicated by invoking Session.replicate(), a replication mode can be passed to tell Hibernate what to do when a conflict is detected. The replication modes are:

    EXCEPTION       Throw an exception when a row already exists.

    IGNORE          Ignore replicated entities when a row already exists.

    LATEST_VERSION  When a row already exists, choose the latest version.

    OVERWRITE       Overwrite the existing row when a row already exists.
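
The effect of the four modes on a row that already exists in the target database can be summarised in a small sketch. The enum Mode below is a hypothetical stand-in written for this illustration, not Hibernate's own ReplicationMode class, and it assumes that versions are plain numbers where a higher number is newer.

```java
// Hypothetical re-statement of the four modes for a row that already exists
// in the target database. This is not Hibernate's ReplicationMode class.
enum Mode {
    EXCEPTION, IGNORE, LATEST_VERSION, OVERWRITE;

    // Returns true if the incoming row should replace the existing one.
    boolean shouldOverwrite(long existingVersion, long incomingVersion) {
        switch (this) {
            case EXCEPTION:
                throw new IllegalStateException("row already exists");
            case IGNORE:
                return false;
            case LATEST_VERSION:
                // Assumption of this sketch: only a strictly newer version wins.
                return incomingVersion > existingVersion;
            case OVERWRITE:
                return true;
            default:
                throw new AssertionError(this);
        }
    }
}
```

With Hibernate itself the mode is passed as the second argument of the call, e.g. session.replicate(student, ReplicationMode.LATEST_VERSION).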

    5.3.3 What is missing in Hibernate?

    Hibernate offers less functionality than the previous products. Each object needs to be replicated individually; no method to replicate everything exists. If an object was deleted and the delete has to be replicated, different logic and methods must be called.

    When UUIDs are not used, the framework has difficulty informing the programmer of replication problems. This issue is neither solved nor documented yet, but it must be considered when using Hibernate for replication.

    5.4 Others

    5.4.1 Microsoft SQL Server

    Microsoft SQL Server offers a 'Merge Replication' feature which sounds promising but is quite expensive to use because it is part of MS SQL Server. Details about Microsoft's solution can be found at [RES-MSSQL]. The free version cannot be used in commercial applications.

    5.4.2 Slony I / II

    Slony-I is a "master to multiple slaves" replication system supporting cascading and slave promotion.

    [..]

    But Slony-I, by only having a single origin for each set, is quite unsuitable for really asynchronous multi-way replication. For those that could use some sort of "asynchronous multi master replication with conflict resolution" akin to what is provided by Lotus Notes or the "syncing" protocols found on Palm OS systems, you will really need to look elsewhere. These sorts of replication models are not without merit, but they represent different replication scenarios that Slony-I does not attempt to address.

    (Source: [RES-SLONY])

    It looks as if Slony would not fit my requirements because it only provides a single origin for each set. Therefore I did not look more deeply into it.

    Nevertheless the following paragraph states very well the issues of conflict resolution:

    Some async multimaster systems try to resolve conflicts by finding ways to apply partial record updates. For instance, with an address update, one user, on one node, might update the phone number for an address, and another user might update the street address, and the conflict resolution system might try to apply these updates in a non-conflicting order.

    Conflict resolution systems almost always require some domain knowledge of the application being used.

    It is absolutely true that domain-specific knowledge is needed and that a general conflict-resolving mechanism (most likely) does not exist.
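
The partial record update mentioned in the quotation can be sketched as a column-level three-way merge. This is only an illustration under assumptions: rows are modelled as maps from column name to value, all three maps contain the same columns, and the class ColumnMerge is a hypothetical name. It is not how Slony-I or any of the evaluated products is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical column-level three-way merge of one row.
class ColumnMerge {

    // base: the row as last replicated; a and b: the row on each node.
    // Returns the merged row. Throws if both nodes changed the same column
    // to different values, because that needs a domain-specific decision.
    static Map<String, String> merge(Map<String, String> base,
                                     Map<String, String> a,
                                     Map<String, String> b) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String column : base.keySet()) {
            String original = base.get(column);
            String left = a.get(column);
            String right = b.get(column);
            if (Objects.equals(left, right)) {
                merged.put(column, left);      // same value on both nodes
            } else if (Objects.equals(left, original)) {
                merged.put(column, right);     // only node B changed it
            } else if (Objects.equals(right, original)) {
                merged.put(column, left);      // only node A changed it
            } else {
                throw new IllegalStateException("conflict in column " + column);
            }
        }
        return merged;
    }
}
```

If one node changes the phone number and the other the street, both changes survive the merge. If both nodes change the phone number to different values, the sketch gives up, which is the point where domain knowledge is required.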

    5.5 Conclusion

    Because the primary goal of this assignment is a solution to replicate multiple master databases within Java applications, and no standard product fits the requirements exactly, a custom solution must be developed. Hibernate was chosen as the basis for this solution.

    The reasons for this choice:

    Successful proof of concept

    Database independent (JDBC)

    Free, open-source and 100% pure Java

    No additional server is required

    No additional database mapping required (This was already created for the course application. No redundancy; reduces the number of required changes when adding, renaming or removing columns or tables.)

    6 Design and implementation of replication with Hibernate

    In this chapter the techniques and the code to implement a generic replication framework using Hibernate are described.

    6.1 Methods of replication

    The replication process can be designed in a number of different ways:

    Data manipulations could be captured in order to perform the replication later. This is preferably achieved by the use of triggers or, better still, by using Hibernate's trigger alternative (the Interceptor interface).

    One could record changes in a transaction log, as database systems do. This log would include all inserts, updates and deletes. On synchronisation, the transaction log of one database has to be applied to the other database. This can become very complex, since the other database might already have undergone changes. Updating deleted rows or inserting child records without a parent row will most certainly occur.

    Another possible solution would be to update a version column on modification. On synchronisation, the replicated ids (rows), the system id and the current version would be stored in a replication table and kept for the next replication. The system id would be a unique identifier across all databases and enables the central database to support more than one replicated database. Once a row is deleted, the corresponding id is still available in the replication table, and therefore it can be determined whether a row has to be inserted or deleted.

    The version-column approach simplifies the replication because the whole row is replicated, whereas merging a transaction log is not simple. Therefore the version-column approach was chosen.

    6.2 Replication Framework

    The framework should be designed in such a way that the calling application has little, if any, knowledge of the replication process. The following code starts the process:

    Replication.mergeAll(false);

    The boolean argument freshDownload determines whether a cleanup is required prior to the replication (see chapter 4, Use Case). This corresponds to the snapshot operation used by the Daffodil Replicator.

    The following class diagram provides an overview of the core classes of the developed framework.

    Figure 3

    6.3 Algorithm

    After performing the cleanup (only if freshDownload is set to true), the order of the classes to be processed is evaluated and stored in a list. Classes which do not reference other model classes are at the start of this list, while classes that reference other classes follow after the classes they depend on. The algorithm that generates this order from the dependency graph is explained in section 6.6.

    For each class in the list a replication object is created. In the construction phase the following query is performed on both databases:

    select x.id,
           (select r.replicatedVersion from ReplicationVersion r
            where r.id = x.id and r.system = :SYSTEM),
           x.version
    from x order by id

    Now the results are processed simultaneously to determine if a row has to be inserted, updated or deleted in one or the other database. Because both results are ordered by id, the algorithm is quite simple.

    The next step is to perform the delete operations for each replication object. It is important to perform the delete operations before the insert operations, otherwise newly inserted objects may conflict with objects that have already been deleted on the other side.
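
Processing two id-ordered results in a single pass can be sketched as follows. The class SortedWalk and its Result type are hypothetical. The sketch assumes that both lists are sorted in the same order as String.compareTo and contain no duplicate ids, and it only classifies the ids without touching a database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: walk two id-ordered result lists in one pass.
class SortedWalk {

    // Ids present only locally, only remotely, or on both sides.
    static final class Result {
        final List<String> onlyLocal = new ArrayList<>();
        final List<String> onlyRemote = new ArrayList<>();
        final List<String> both = new ArrayList<>();
    }

    // Both lists must be sorted ascending and free of duplicates.
    static Result walk(List<String> localIds, List<String> remoteIds) {
        Result result = new Result();
        int i = 0;
        int j = 0;
        while (i < localIds.size() && j < remoteIds.size()) {
            int cmp = localIds.get(i).compareTo(remoteIds.get(j));
            if (cmp == 0) {
                result.both.add(localIds.get(i));   // candidate for an update
                i++;
                j++;
            } else if (cmp < 0) {
                result.onlyLocal.add(localIds.get(i++));
            } else {
                result.onlyRemote.add(remoteIds.get(j++));
            }
        }
        // Whatever is left on one side has no counterpart on the other.
        while (i < localIds.size()) {
            result.onlyLocal.add(localIds.get(i++));
        }
        while (j < remoteIds.size()) {
            result.onlyRemote.add(remoteIds.get(j++));
        }
        return result;
    }
}
```

Ids that appear on one side only are candidates for an insert or a delete; which of the two applies depends on whether the row has been replicated before.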

    6.4 Additional replication table

    Because delete operations are allowed on both sides (local and remote) and should be managed, the replication process must remember what has already been replicated. If a record exists on side A but is missing on side B, the program must determine whether the record has to be deleted on side A or inserted on side B.

    The class ReplicationVersion holds the following information about a replicated object: the UUID of the replicated object, the system code to enable multiple master databases and the replicated version id. Consider the example where a record exists on side A but is missing on side B: if the corresponding ReplicationVersion does not yet exist, the record must be inserted on side B. If it does exist, the record must be deleted on side A, since it was once replicated and has since been deleted on side B.

    The ReplicationVersion can also be used to determine which side was updated. The object's current version is compared with the one in the ReplicationVersion. If it is not the same, the object needs to be replicated to the other side.
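
The rules of this section can be condensed into one decision function. The sketch below is a simplification under assumptions: versions are modelled as Long values (null meaning that the row or the ReplicationVersion entry is missing), a single replicated version is used for both sides, and the names ReplicationDecision and Action are hypothetical. The case where both sides changed is only flagged, not resolved.

```java
// Hypothetical decision logic for one object id, seen from both sides.
class ReplicationDecision {

    enum Action {
        NONE, INSERT_ON_A, INSERT_ON_B, DELETE_ON_A, DELETE_ON_B,
        COPY_A_TO_B, COPY_B_TO_A, CONFLICT
    }

    // versionA / versionB: current version on each side, null if the row is missing.
    // replicated: version stored in ReplicationVersion, null if never replicated.
    static Action decide(Long versionA, Long versionB, Long replicated) {
        if (versionA == null && versionB == null) {
            return Action.NONE;
        }
        if (versionB == null) {
            // Never replicated: new on A. Replicated before: deleted on B.
            return replicated == null ? Action.INSERT_ON_B : Action.DELETE_ON_A;
        }
        if (versionA == null) {
            return replicated == null ? Action.INSERT_ON_A : Action.DELETE_ON_B;
        }
        boolean changedA = !versionA.equals(replicated);
        boolean changedB = !versionB.equals(replicated);
        if (changedA && changedB) {
            return Action.CONFLICT;
        }
        if (changedA) {
            return Action.COPY_A_TO_B;
        }
        if (changedB) {
            return Action.COPY_B_TO_A;
        }
        return Action.NONE;
    }
}
```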

    6.5 Conflict resolution

    What if on both sides the object's current version id differs from the ReplicationVersion? Such a case can occur when users on both sides update the same records. In the course application where this framework will be used, it is wise to overwrite the local changes from the administrator with those from the customers, because the customers might make changes which should not be lost. In other situations the overwrite might be performed in the opposite direction or might even need to be decided on a case-by-case basis. As already discussed, conflict resolution requires domain-specific logic.

    Currently the replication code contains no hook or callback method to allow this, but it could easily be extended when needed.
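
One possible shape for such a hook is sketched below. Everything in it is hypothetical: the names ConflictHook, ConflictResolver and Winner do not exist in the replication code, and the sketch only shows how a domain-specific policy could be plugged in as a callback.

```java
// Hypothetical callback for conflicts; the framework has no such hook yet.
class ConflictHook {

    enum Winner { LOCAL, REMOTE }

    // Called when both sides changed the same object since the last replication.
    interface ConflictResolver {
        Winner resolve(String entityName, String id, long localVersion, long remoteVersion);
    }

    // Policy described for the course application: customer (remote) changes win.
    static final ConflictResolver REMOTE_WINS =
            (entityName, id, localVersion, remoteVersion) -> Winner.REMOTE;

    // Example of a domain-specific policy: local changes win for one entity type only.
    static ConflictResolver localWinsFor(String protectedEntity) {
        return (entityName, id, localVersion, remoteVersion) ->
                entityName.equals(protectedEntity) ? Winner.LOCAL : Winner.REMOTE;
    }
}
```

A replication run could then receive a ConflictResolver as a parameter and consult it whenever both versions differ from the stored one.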

    Fabian Merki, merkisoft informatik 5.11.2006 19 / 26

    6.6 Dependencies

    To be able to perform the replication, the relationships between entities must be evaluated. No child record can be inserted unless the corresponding parent record exists.

    The framework should determine how the classes, which are mapped to database tables by using annotations, are related to each other. The following Java code performs a dependency sort. A list of classes which require replication is passed in. The method returns a list in which classes without references to other classes come first, followed by classes whose references point only to already processed classes.

    Figure 4

    This class diagram can be converted into the following list: B, E, C, A, F, D

    It is very important to perform merge and insert operations in the correct order (no child record shall exist without a parent record).

    Initially the idea was to perform the deletes in the opposite order. Due to Hibernate's cascading of deletes, however, this was not required: child records may already have been deleted when their parent was deleted.

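    public static List<Class> getClassStack(List<Class> l) {
        // Map each class to the classes it references. getClassStack(Class) is
        // the single-class overload, which is not shown in this listing.
        Map<Class, List<Class>> graph = new HashMap<Class, List<Class>>();
        for (Class clazz : l) {
            graph.put(clazz, getClassStack(clazz));
        }

        List<Class> classStack = new ArrayList<Class>();

        while (!graph.isEmpty()) {
            // Move every class without unprocessed references to the result.
            for (Iterator<Class> iterator = graph.keySet().iterator();
                    iterator.hasNext();) {
                Class clazz = iterator.next();
                List<Class> list = graph.get(clazz);
                if (list.isEmpty()) {
                    classStack.add(clazz);
                    iterator.remove();
                }
            }
            // The original listing is cut off here. The remaining lines are a
            // reconstruction from the description above, not the author's code:
            // drop references to classes that have already been processed.
            for (List<Class> list : graph.values()) {
                list.removeAll(classStack);
            }
        }
        return classStack;
    }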
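
    The listing calls a single-class overload, getClassStack(Class), that is not shown. As a rough sketch of how the referenced classes of one entity might be collected, the code below scans fields by reflection. The annotation Reference is a hypothetical stand-in for a mapping annotation such as JPA's @ManyToOne, ReferenceScanner is an invented name, and the relationships between Course, Student and Subscription are assumed for illustration only.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mapping annotation such as JPA's @ManyToOne. */
@Retention(RetentionPolicy.RUNTIME)
@interface Reference {}

final class ReferenceScanner {
    private ReferenceScanner() {}

    /** Returns the distinct classes the given class references through annotated fields. */
    static List<Class<?>> referencedClasses(Class<?> clazz) {
        List<Class<?>> result = new ArrayList<>();
        for (Field field : clazz.getDeclaredFields()) {
            if (field.isAnnotationPresent(Reference.class)
                    && !result.contains(field.getType())) {
                result.add(field.getType());
            }
        }
        return result;
    }
}

// Assumed entities for illustration: a Subscription points to a Student and a Course.
class Course {}

class Student {}

class Subscription {
    @Reference Student student;
    @Reference Course course;
}
```

    Classes whose result list is empty, such as Course and Student here, are the ones the dependency sort can place at the top of its output.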

    6.8 Transaction handling

    The replication process acts on two databases concurrently. A perfect solution in terms of committing would be to start a transaction on a central transaction manager, run the process and finally commit the transaction. The transaction manager would then commit the changes to both databases and, in case of a problem, roll back both. The Java API contains a javax.transaction.xa package with the XAResource interface, and javax.sql provides XAConnection. Database vendors offer implementations of these to communicate with a transaction manager.

    The current implementation of the replication process does not make use of this feature, but the solution could be extended to use it as required.
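
    To make the role of the transaction manager concrete, the sketch below shows the two-phase idea in miniature: all participants are asked to prepare, and only if every one agrees are the changes committed. Participant, TwoPhaseCoordinator and InMemoryParticipant are hypothetical names. The interface loosely mirrors the prepare, commit and rollback methods of XAResource but is not the XA API, and a real transaction manager additionally handles transaction ids, logging and recovery.

```java
import java.util.List;

/** Hypothetical participant, loosely mirroring prepare/commit/rollback of XAResource. */
interface Participant {
    boolean prepare(); // true means "ready to commit"
    void commit();
    void rollback();
}

final class TwoPhaseCoordinator {
    private TwoPhaseCoordinator() {}

    /** Commits everywhere only if every participant votes yes in the prepare phase. */
    static boolean commitAll(List<? extends Participant> participants) {
        // Phase 1: collect votes. A thrown exception counts as a "no" vote.
        for (Participant p : participants) {
            boolean ready;
            try {
                ready = p.prepare();
            } catch (RuntimeException e) {
                ready = false;
            }
            if (!ready) {
                // One refusal is enough: undo the work on all participants.
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        // Phase 2: every participant voted yes, so commit everywhere.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

/** In-memory stand-in for a database, used only to exercise the coordinator. */
final class InMemoryParticipant implements Participant {
    private final boolean canCommit;
    private String state = "ACTIVE";

    InMemoryParticipant(boolean canCommit) {
        this.canCommit = canCommit;
    }

    @Override public boolean prepare() {
        if (canCommit) {
            state = "PREPARED";
        }
        return canCommit;
    }

    @Override public void commit() { state = "COMMITTED"; }

    @Override public void rollback() { state = "ROLLED_BACK"; }

    String state() { return state; }
}
```

    For the replication process the two participants would be the local and the remote database, so that a failure on one side can never leave the other side with half of a replication run.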

    7 Testing

    JUnit, a test framework for Java, was used to verify the functionality and correctness of the replication framework. This approach helped during development and additionally allowed quick regression testing.

    A very simple data model was used for the tests. The following diagram shows the relationships between the classes.

    The following extract of the test class shows how the tests are written:

    public void testSubscription() throws ParseException {
        Session local = DAO.getLocal().openSession();
        Session remote = DAO.getRemote().openSession();

        subscribe(local, "Anna", "Football");
        subscribe(local, "Hans", "Football");
        subscribe(local, "Hans", "Diving");

        checkSubscription(0, 0, 0, 3, 0, 0);

        megaSubscriptionTest(local, remote);
        checkSubscription(1, 0, 0, 1, 1, 0);

        initLocalDatabase();

        megaSubscriptionTest(remote, local);
        checkSubscription(0, 1, 1, 0, 0, 1);

        subscribe(local, "Anna", "Diving");
        subscribe(remote, "Anna", "Diving");

        checkSubscription(0, 0, 1, 0, 0, 0);
    }

    protected void setUp() throws Exception {
        initLocalDatabase();
    }

    protected void tearDown() throws Exception {
        checkCleanup();
    }

    private void checkSubscription(int dlStudent, int drStudent, int ulSubscription,
            int urSubscription, int dlSubscription, int drSubscription) {
        Replication[] rr = Replication.mergeAll(false);
        check(rr[0], Course.class, 0, 0, 0, 0);
        check(rr[1], City.class, 0, 0, 0, 0);
        check(rr[2], Student.class, 0, 0, dlStudent, drStudent);
        check(rr[3], Subscription.class, ulSubscription, urSubscription,
                dlSubscription, drSubscription);
    }

    The checkSubscription method replicates the databases and then checks whether the expected number of records was changed. With this approach it is simple to write complex test cases, in which both sides insert, update and delete records, without writing much code.

    Please see code for full details of test cases.

    8 Conclusion

    Depending on the requirements, many products are available to perform replication. For the course application, the Hibernate solution was a good choice (see chapter 5.5).

    The complete course administration software, including a homepage and the replication process described in this document, was successfully deployed to administer more than 700 children in three regions. In this production environment the software worked very well, and a few minor bugs were quickly fixed. The performance of the application was good: a replication regularly completed within 5-20 seconds. In scenarios where the number of records exceeded 1000, e.g. after a mass update, it took up to 5 minutes.

    To address this performance issue, I undertook another project in parallel to this assignment to develop a zipped tunnel solution. Early tests look promising and show that a two- to five-fold reduction in communication load can be achieved. If successful, the zipped tunnel will be integrated with the work of this assignment and used in the course administration software.

    9 References

    [HIBERNATE]

    http://www.hibernate.org

    [JAVA]

    http://java.sun.com

    [ORACLE-WS]

    http://www.oracle.com/technology/products/database/workspace_manager/index.html

    [DAFFODIL]

    http://www.daffodildb.com/replicator/dbreplicator.html

    [RES-SLONY]

    http://developer.postgresql.org/~wieck/slony1/adminguide-1.1.rc1/slonyintro.html

    [RES-MSSQL]

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/replsql/repltypes_30z7.asp

    [RES-DAFODIL]

    http://opensource.replicator.daffodilsw.com/what-is-replicator.html

    [RES-ORACLE-WS]

    http://www.adp-gmbh.ch/blog/2006/05/09.php

    http://www.idevelopment.info/data/Oracle/DBA_tips/Workspace_Manager/WM_1.shtml
