Data Cube and Data Mining - Duke University · 2018. 11. 27. · •big “Petabyte...

Preview:

Citation preview

CompSci 516DatabaseSystems

Lecture23DataCube

andDataMining

Instructor:Sudeepa Roy

DukeCS,Fall2018 CompSci516:DatabaseSystems 1

Announcements

• HW3dueonNov30(Fri),5pm

• FinalreportdueonDec12,Wed– Seefeedbackonpiazza

• Pleasefilloutthecourseevaluations!–Weneedtohearfromeachofyou–Weneedyourfeedback/suggestionstoimprovethisclass

DukeCS,Fall2018 CompSci516:DatabaseSystems 2

Announcements• Classpresentationnextlecture(Thurs,Nov29)– Intheorderofyourgroupnumber– 4minpresentation(willbetimed!)

• Presentation– 14projectsin75mins– 4minsperproject!– Noteveryonehastopresent(uptoyou)

• everyoneinagroupgetsthesamegrade– Youpresentthecurrentstatusoftheproject

• problem,example,yourapproach,whatyouplan– Besttoshowplots/screenshots/results/demo!– Trytoshowthemostinterestingobservation/findingsin4mins!

– Telluswhatyouwanttodobeforeyousubmitthefinalreport(ifanything)

DukeCS,Fall2018 CompSci516:DatabaseSystems 3

DataWarehousing

DukeCS,Fall2018 CompSci516:DatabaseSystems 4

ReadingMaterial

• [RG]– Chapter25

• Gray-Chaudhuri-Bosworth-Layman-Reichart-Venkatrao-Pellow-Pirahesh,ICDE1996“DataCube:ARelationalAggregationOperatorGeneralizingGroup-By,Cross-Tab,andSub-Totals”

• Harinarayan-Rajaraman-Ullman,SIGMOD1996“Implementingdatacubesefficiently”

DukeCS,Fall2018 CompSci516:DatabaseSystems 5

Acknowledgement:• Thefollowingslideshavebeencreatedadaptingtheinstructormaterialofthe[RG]bookprovidedbytheauthorsDr.RamakrishnanandDr.Gehrke.• SomeslideshavebeenpreparedbyProf.Shivnath Babu

6

Warehousing

• Growingindustry:$8billionwaybackin1998• DatawarehousevendorlikeTeradata

• big“Petabytescale”customers• Apple,Walmart(2008-2.5PB),eBay(2013-primaryDW9.2PB,otherbigdata40PB,singletablewith1trillionrows),Verizon,AT&T,BankofAmerica

• supportsdataintoandoutofHadoop

• Lotsofbuzzwords,hype– slice&dice,rollup,MOLAP,pivot,...

Ack:SlidebyProf.Shivnath Babu

https://gigaom.com/2013/03/27/why-apple-ebay-and-walmart-have-some-of-the-biggest-data-warehouses-youve-ever-seen/

DukeCS,Fall2018 CompSci516:DatabaseSystems

Introduction• Organizationsanalyzecurrentandhistoricaldata

– toidentifyusefulpatterns– tosupportbusinessstrategies

• Emphasisisoncomplex,interactive,exploratoryanalysisofverylargedatasets

• Createdbyintegratingdatafromacrossallpartsofanenterprise

• Dataisfairlystatic

• Relevantonceagainfortherecent“BigDataanalysis”– tofigureoutwhatwecanreuse,whatwecannot

DukeCS,Fall2018 CompSci516:DatabaseSystems 7

OLTP DataWarehousing/OLAPMostlyupdates Mostlyreads

Applications:Orderentry,salesupdate,bankingtransactions

Applications:Decisionsupportinindustry/organization

Detailed,up-to-datedata Summarized, historicaldata(frommultipleoperationaldb,growsovertime)

Structured,repetitive,shorttasks Queryintensive,adhoc,complexqueries

Eachtransactionreads/updatesonlyafewtuples(tensof)

Eachquerycanaccesses manyrecords,andperformmanyjoins,scans,aggregates

MB-GBdata GB-TB data

Typicallyclericalusers Decisionmakers,analysts asusers

Important:Consistency,recoverability,Maximizingtr.throughput

Important:QuerythroughputResponse times

8DukeCS,Fall2018 CompSci516:DatabaseSystems

ThreeComplementaryTrends

• DataWarehousing(DW):– Consolidatedatafrommanysourcesinonelargerepository– Loading,periodicsynchronizationofreplicas– Semanticintegration

• OLAP:– ComplexSQLqueriesandviews.– Queriesbasedonspreadsheet-styleoperationsand“multidimensional”viewofdata.

– Interactiveand“online”queries.

• DataMining:– Exploratorysearchforinterestingtrendsandanomalies

DukeCS,Fall2018 CompSci516:DatabaseSystems 9

ROLAPandMOLAP

• RelationalOLAP(ROLAP)– OntopofstandardrelationalDBMS– DataisstoredinrelationalDBMS– SupportsextensionstoSQLtoaccessmulti-dimensionaldata

• MultidimensionalOLAP(MOLAP)– Directlystoresmultidimensionaldatainspecialdatastructures(e.g.arrays)

10DukeCS,Fall2018 CompSci516:DatabaseSystems

ROLAP:StarSchema

• Toreflectmulti-dimensionalviewsofdata

• Singlefacttable

• Singletableforeverydimension

• Eachtupleinthefacttableconsistsof– pointers(foreignkey)toeach

ofthedimensions(multi-dimensionalcoordinates)

– numericvalueforthosecoordinates

• Eachdimensiontablecontainsattributesofthatdimension

11

NosupportforattributehierarchiesDukeCS,Fall2018 CompSci516:DatabaseSystems

DimensionHierarchies

• Foreachdimension,thesetofvaluescanbeorganizedinahierarchy:

PRODUCT TIME LOCATION

categoryweekmonthstate

pnamedatecity

year

quartercountry

DukeCS,Fall2018 CompSci516:DatabaseSystems 12

ROLAP:SnowflakeSchema

13

• Refinesstar-schema• Dimensionalhierarchyis

explicitlyrepresented

• (+)Dimensiontableseasiertomaintain– supposethe“category

descriptionisbeingchanged

• (-)Needadditionaljoins

• FactConstellations– Multiplefacttablessharesome

dimensionaltables– e.g.ProjectedandActual

Expensesmaysharemanydimensions

DukeCS,Fall2018 CompSci516:DatabaseSystems

OLAPand

DataCube

DukeCS,Fall2018 CompSci516:DatabaseSystems 14

Motivation:OLAPQueries• Dataanalystsareinterestedinexploringtrendsandanomalies– Possiblybyvisualization(Excel)- 2Dor3Dplots– “DimensionalityReduction”bysummarizingdataandcomputing

aggregates

• Findtotalunitsalesforeach1. Model2. Model,brokenintoyears3. Year,brokenintocolors4. Year5. Model,brokenintocolor,….

15

Sales(Model,Year,Color,Units)

DukeCS,Fall2018 CompSci516:DatabaseSystems

Roll-Ups• Analysisreportsstartatacoarselevel,

gotofinerlevels• Orderofattributematters• Notrelationaldata(emptycellsno

keys)

Model Year Color Model,Year,Color Model,Year Model

Chevy 1994 Black 50

Chevy 1994 White 40

90

Chevy 1995 Black 115

Chevy 1995 White 85

200

290

GROUPBY

Sales(Model,Year,Color,Units)

Roll-ups

Drill-downs

• Roll-upsareasymmetric

‘ALL’ ConstructEasiertovisualizeroll-upifallowALLtofillinthesuper-aggregates

SELECT Model, Year, Color, SUM(Units)

FROM SalesWHERE Model = ‘Chevy’

GROUP BY Model, Year, Color

UNION

SELECT Model, Year, ‘ALL’, SUM(Units)

FROM Sales

WHERE Model = ‘Chevy’GROUP BY Model, Year

UNION

UNION

SELECT ‘ALL’, ‘ALL’, ‘ALL’, SUM(Units)

FROM Sales

WHERE Model = ‘Chevy’;

Model Year Color Units

Chevy 1994 Black 50

Chevy 1994 White 40

Chevy 1994 ‘ALL’ 90

Chevy 1995 Black 85

Chevy 1995 White 115

Chevy 1995 ‘ALL’ 200

Chevy ‘ALL’ ‘ALL’ 290

Sales(Model,Year,Color,Units)

DataCube:Intuition

SELECT ‘ALL’, ‘ALL’, ‘ALL’, sum(units)

FROM Sales

UNIONSELECT ‘ALL’, ‘ALL’, Color, sum(units)

FROM Sales

GROUP BY Color

UNIONSELECT ‘ALL’, Year, ‘ALL’, sum(units)

FROM Sales

GROUP BY Year

UNION SELECT Model, Year, ‘ALL’, sum(units)

FROM Sales

GROUP BY Model, Year

UNION ….

18

Sales(Model,Year,Color,Units)

YearColor

Mod

el

TotalUnitsales

DukeCS,Fall2018 CompSci516:DatabaseSystems

• Howmanysub-queries?• Howmanysub-queriesfor

8attributes?

MorecomplextodothesewithGROUP-BY

DataCube• Computestheaggregateonallpossiblecombinationsof

groupbycolumns.

• IfthereareNattributes,thereare2N-1super-aggregates.

• IfthecardinalityoftheNattributesareC1,...,CN,thenthereareatotalof(C1+1)...(CN+1)valuesinthecube.

• ROLL-UPissimilarbutjustlooksatNaggregates

DataCube

Ack:fromslidesbyLaurelOrrandJeremyHyrkas,UW

DataCube Syntax

• SQLServer

SELECT Model, Year, Color, sum(units)

FROM Sales

GROUP BY Model, Year, Color

WITH CUBE

Sales(Model,Year,Color,Units)

ImplementingDataCube

DukeCS,Fall2018 CompSci516:DatabaseSystems 22

BasicIdeas

• Needtocomputeallgroup-by-s:– ABCD,ABC,ABD,BCD,AB,AC,AD,BC,BD,CD,A,B,C,D

• ComputeGROUP-BYsfrompreviouslycomputedGROUP-BYs– e.g.firstABCD– thenABCorACD– thenABorAC…

• WhichorderABCDissorted,mattersforsubsequentcomputations

– if(ABCD)isthesortedorder,ABCischeap,ACDorBCDisexpensive

Notations

• ABCD– group-byonattributesA,B,C,D– noguaranteeontheorderoftuples

• (ABCD)– sortedaccordingtoA->B->C->D

• ABCDand(ABCD)and(BCDA)– allcontainthesameresults– butindifferentsortedorder

Optimization1:SmallestParent

• ComputeGROUP-BYfromthesmallest(size)previouslycomputedGROUP-BYasaparent

– ABcanbecomputedfromABC,ABD,orABCD

– ABCorABDbetterthanABCD– EvenABCorABDmayhave

differentsizes,trytochoosethesmallerparent

25

ABCD

ABC ABD ACD BCD

AC AD BC BD CDAB

A B C D

all

DukeCS,Fall2018 CompSci516:DatabaseSystems

LATTICESTRUCTUREofdatacube

Optimization2:CacheResults

• CacheresultofoneGROUP-BYinmemorytoreducediskI/O– ComputeABfromABCwhile

ABCisstillinmemory

26

ABCD

ABC ABD ACD BCD

AC AD BC BD CDAB

A B C D

all

DukeCS,Fall2018 CompSci516:DatabaseSystems

Optimization3:AmortizeDiskScans

• AmortizediskreadsformultipleGROUP-BYs– SupposetheresultforABCD

isstoredondisk– ComputeallofABC,ABD,

ACD,BCDsimultaneouslyinonescanofABCD

27

ABCD

ABC ABD ACD BCD

AC AD BC BD CDAB

A B C D

all

DukeCS,Fall2018 CompSci516:DatabaseSystems

Optimization4:Sharedpartition

• forhash-basedalgorithms– Useshashtablesto

computesmallerGROUP-Bys

– IfthehashtablesforABandACfitinmemory,computebothinonescanofABC

– OtherwisepartitiononA,andcomputeHTsofABandACindifferentpartitions

• “pipe-hash”algorithmnotcoveredinclass

28

ABCD

ABC ABD ACD BCD

AC AD BC BD CDAB

A B C D

all

DukeCS,Fall2018 CompSci516:DatabaseSystems

(ABCD)

(ABC) ABD ACD BCD

AC AD BC BD CD(AB)

(A) B C D

all Level0

Level1

Level2

Level3

Level4

A B C sum

a1 b1 c1 5

a1 b1 c2 10

a1 b2 c3 8

a2 b2 c1 2

a2 b2 c3 11

NotSorted

A B C sum

a2 b2 c3 11

a1 b1 c2 10

a2 b2 c1 2

a1 b1 c1 5

a1 b2 c3 8

Sorted

A B sum

a1 b1 15

a1 b2 8

a2 b2 13

Optimization5:Shared-sort

• Fromonesortedorderof(ABCD)

– Compute(ABC)– Compute(AB)– Compute(A)– Compute“all”

PipeSort:Shared-sortoptimization

• BUT,mayhaveaconflictwith“smallest-parent”optimization– (ABD)->(AB)couldbeabetterchoice– Figureoutthebestparentchoicebyrunning

aweighted-matchingalgorithmlayerbylayer(detailsnotcoveredinclass) (ABCD)

(ABC)

(AB)

(A)

Aftercomputingthe plan,executeallpipelines

PipeSort Algorithminpicture

1.Firstpipelineisexecutedbyonescanofthedata

2.Sort(CBAD)->(BADC),computethesecondpipeline

3.…..

A,S costs

ACBD

ACB ABD ACD BDC

AC AD BC BD CD

A B C D

all

AB

DataMining

ReadingMaterialOptionalReading:

1.[RG]:Chapter26

2.“FastAlgorithmsforMiningAssociationRules”Agrawal andSrikant,VLDB1994

23,863citationsonGoogleScholarinNovember2018• 23,038inNovember2017• 20,610inNovember2016• 19,496inApril2016

OneofthemostcitedpapersinCS!

33

• Acknowledgement:Thefollowingslideshavebeenpreparedadaptingtheslidesprovidedbytheauthorsof[RG]andusingseveralpresentationsofthispaperavailableontheinternet(esp.byOfer PasternakandBrianChase)

DukeCS,Fall2018 CompSci516:DatabaseSystems

FourMainStepsinKDandDM(KDD)

• DataSelection– Identifytargetsubsetofdataandattributesofinterest

• DataCleaning– Removenoiseandoutliers,unifyunits,createnewfields,usedenormalization ifneeded

• DataMining– extractinterestingpatterns

• Evaluation– presentthepatternstotheendusersinasuitableform,e.g.throughvisualization

DukeCS,Fall2018 CompSci516:DatabaseSystems 34

RememberHW1!

SeveralDM/KD(Research)Problems

• Discoveryofcausalrules• Learningoflogicaldefinitions• Fittingoffunctionstodata• Clustering• Classification• Inferringfunctionaldependenciesfromdata• Finding“usefulness”or“interestingness”ofarule

– SeethecitationsintheAgarwal-Srikant paper– Somediscussedin[RG]Chapter27

DukeCS,Fall2018 CompSci516:DatabaseSystems 35

• Retailerscollectandstoremassiveamountsofsalesdata– transactiondateandlistofitems

• Associationrules:– e.g.98%customerswhopurchase“tires”and“autoaccessories”alsoget“automotiveservices”done

– Customerswhobuymustardandketchupalsobuyburgers

– Goal:findtheserulesfromjusttransactionaldata(transactionid+listofitems)

MiningAssociationRules

36DukeCS,Fall2018 CompSci516:DatabaseSystems

• Canbeusedfor– marketingprogramandstrategies– cross-marketing(masse-mail,webpages)– catalogdesign– add-onsales– storelayout– customersegmentation

Applications

37DukeCS,Fall2018 CompSci516:DatabaseSystems

Notations

• ItemsI={i1,i2,…,im}• D:asetoftransactions• EachtransactionT⊆ I– hasanidentifierTID

• AssociationRule– Xà Y(notFunctionalDependency!)– X,Y⊂ I– X∩Y=∅

38DukeCS,Fall2018 CompSci516:DatabaseSystems

ConfidenceandSupport

• AssociationruleXàY

• Confidence c=|Tr.withXandY|/|Tr.with|X|– c%oftransactionsinDthatcontainXalsocontainY

• Support s=|Tr.withXandY|/|allTr.|– s%oftransactionsinDcontainXandY.

39DukeCS,Fall2018 CompSci516:DatabaseSystems

SupportExample

TID Cereal Beer Bread Bananas Milk

1 X X X

2 X X X X

3 X X

4 X X

5 X X

6 X X

7 X X

8 X

• Support(Cereal)• 4/8=.5

• Support(CerealàMilk)• 3/8=.375

40DukeCS,Fall2018 CompSci516:DatabaseSystems

ConfidenceExample

TID Cereal Beer Bread Bananas Milk

1 X X X

2 X X X X

3 X X

4 X X

5 X X

6 X X

7 X X

8 X

• Confidence(CerealàMilk)• 3/4=.75

• Confidence(Bananasà Bread)• 1/3=.33333…

41DukeCS,Fall2018 CompSci516:DatabaseSystems

Xà YisnotaFunctionalDependencyForfunctionaldependencies• F.D.=twotupleswiththesamevalueofofXmusthavethe

samevalueofY– Xà Y =>XZà Y(concatenation)– Xà Y,Yà Z=>Xà Z(transitivity)

Forassociationrules• Xà AdoesnotmeanXYàA

– Maynothavetheminimumsupport– Assumeonetransaction{AX}

• Xà AandAà ZdonotmeanXà Z– Maynothave theminimumconfidence– Assumetwotransactions{XA},{AZ}

42DukeCS,Fall2018 CompSci516:DatabaseSystems

Checkyourself

ProblemDefinition

• Input– asetoftransactionsD

• Canbeinanyform– afile,relationaltable,etc.

– minsupport(minsup)– minconfidence(minconf)

• Goal:generateallassociationrulesthathave– support>=minsup and– confidence>=minconf

43DukeCS,Fall2018 CompSci516:DatabaseSystems

Decompositionintotwosubproblems

• 1.Apriori– forfinding“large”itemsets withsupport>=minsup– allotheritemsets are“small”

• 2.ThenuseanotheralgorithmtofindrulesXà Ysuchthat– Bothitemsets X∪ YandXarelarge– Xà Yhasconfidence>=minconf

• Paperfocusesonsubproblem 1– ifsupportislow,confidencemaynotsaymuch– subproblem 2infullversionofthepaper

DukeCS,Fall2018 CompSci516:DatabaseSystems 44

BasicIdeas- 1

• Q.Whichitemset canpossiblyhavelargersupport:ABCDorAB– i.e.whenoneisasubsetoftheother?

• Ans:AB– anysubsetofalargeitemset mustbelarge– SoifABissmall,noneedtoinvestigateABC,ABCDetc.

DukeCS,Fall2018 CompSci516:DatabaseSystems 45

BasicIdeas- 2

• Startwithindividual(singleton)items{A},{B},…

• Insubsequentpasses,extendthe“largeitemsets”ofthepreviouspassas“seed”

• Generatenewpotentiallylargeitemsets (candidateitemsets)

• Thencounttheiractualsupportfromthedata

• Attheendofthepass,determinewhichofthecandidateitemsets areactuallylarge– becomesseedforthenextpass

• Continueuntilnonewlargeitemsets arefound

DukeCS,Fall2018 CompSci516:DatabaseSystems 46

OptionalSlidesontheApriori Algorithm

DukeCS,Fall2018 CompSci516:DatabaseSystems 47

Notations

ACTUAL

POTENTIALUsedinbothApriori andAprioriTID

48

• Assumethedatabaseisoftheform<TID,i1,i2,…>whereitemsarestoredinlexicographicorder

• TID = identifierofthetransaction• Alsoworkswhenthedatabaseis“normalized”:eachdatabaserecordis<TID,item>pair

DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

AlgorithmApriori

49

!k

k

kk

t

kt

k-k

k-1

;L Answer

minsup}|c.countC { c L

c.count Cc

,t)(C C Dt

)(L C); k 2; L( k

itemsets} {large 1- L

=

³Î=

++Î

=++¹=

=

end

endend

;do candidates forall

subset begin do ons transactiforall

;genapriori-begin do For

1

1

fCountindividual itemoccurrences

Generatenewk-itemsets candidates

count=0

Findthesupportofallthecandidates

Ct = candidatescontainedint

incrementcount

Takeonlythosewithsupport>= minsup

DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

Apriori-Gen

5050

l Joinstep

l Prunestep

1k1k2k2k11

1k1k

1k1k1

k

q.itemp.item,q.itemp.item,...,q.itemp.itemqp,LL

itemqitempitempp.itemC

----

--

--

<== where from

.,.,.,select intoinsert

2

k

k-1

k

c from C) L(s

ets s of c(k-1)-subs C itemsets c

deletethen if

do foralldo forall

Ï

Î

p andqaretwo large

(k-1)-itemsets identicalinallk-2firstitems.

Joinbyaddingthelastitemofqtop

Checkallthesubsets,removeallcandidatewithsome“small” subset

• TakesasargumentLk-1 (thesetofalllargek-1)-itemsets• Returnsasupersetofthesetofalllargek-itemsets byaugmentingLk-1

Lk-1⨝ Lk-1

DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

Apriori-GenExample- 1

• {1,2,3}• {1,2,4}{1,3,4}

• {1,3,5}• {2,3,4}

• {1,2,3,4}

Step1:Join(k=4)

Assumenumbers1-5correspondtoindividualitems

L3 C4

51DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

Apriori-GenExample- 2

• {1,2,3,4}• {1,3,4,5}

Step1:Join(k=4)

Assumenumbers1-5correspondtoindividualitems

L3 C4• {1,2,3}• {1,2,4}{1,3,4}

• {1,3,5}• {2,3,4}

52DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

Apriori-GenExample- 3

• {1,2,3,4}• {1,3,4,5}

Step2:Prune(k=4)• Removeitemsets thatcan’thavethe

requiredsupportbecausethereisasubsetinitwhichdoesn’thavethelevelofsupporti.e.notinthepreviouspass(k-1)

L3 C4

No{1,4,5}existsinL3Rulesout{1,3,4,5}

• {1,2,3}• {1,2,4}{1,3,4}

• {1,3,5}• {2,3,4}

53DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

CorrectnessofApriori

54

1k1k2k2k11

1k1k

1k1k1

k

q.itemp.item,q.itemp.item,...,q.itemp.itemqp,LL

itemqitempitempp.itemC

----

--

--

<== where from

.,.,.,select intoinsert

2

k

k-1

k

c from C) L(s

ets s of c(k-1)-subs C itemsets c

deletethen if

do foralldo forall

Ï

Î

ShowthatCk⊇ Lk• Anysubsetoflargeitemset mustalsobe

large

• foreachpinLk,ithasasubsetqinLk-1• WeareextendingthosesubsetsqinJoin

withanothersubsetq’ofp,whichmustalsobelarge

• equivalenttoextendingLk-1 withallitemsandremovingthose

whose(k-1)subsetsarenotinLk-1• PruneisnotdeletinganythingfromLk

prune

join

Checkyourself

DukeCS,Fall2018 CompSci516:DatabaseSystems

OptionalSlide

Conclusions(of516,Fall2018)

DukeCS,Fall2018 CompSci516:DatabaseSystems 55

Take-Aways

• DBMSBasics

• DBMSInternals

• OverviewofResearchAreas

• Hands-onExperienceinDBsystems

DukeCS,Fall2018 CompSci516:DatabaseSystems 56

DBSystems• TraditionalDBMS– PostGres,SQL

• Large-scaleDataProcessingSystems– Spark/Scala,AWS

• NewDBMS/NOSQL–MongoDB

• Inaddition– XML,JSON,JDBC,Python/Java

DukeCS,Fall2018 CompSci516:DatabaseSystems 57

DBBasics

• SQL• RA/LogicalPlans• RC• Datalog–Whyweneededeachoftheselanguages

• NormalForms

DukeCS,Fall2018 CompSci516:DatabaseSystems 58

DBInternalsandAlgorithms

• Storage• Indexing• OperatorAlgorithms– ExternalSort– JoinAlgorithms

• Cost-basedQueryOptimization• Transactions– ConcurrencyControl– Recovery

DukeCS,Fall2018 CompSci516:DatabaseSystems 59

Large-scaleProcessingandNewApproaches

• ParallelDBMS• DistributedDBMS• MapReduce• NOSQL

DukeCS,Fall2018 CompSci516:DatabaseSystems 60

OtherTopics

• DataWarehouse/OLAP/DataCube• AssociationRuleMining

• HopesomeofyouwillfurtherexploreDatabaseSystems/DataManagement/DataAnalysis/BigData!

DukeCS,Fall2018 CompSci516:DatabaseSystems 61

Recommended