Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
Exadata Technical Deep Dive: Architecture and Internals
Kothanda (Kodi) Umamageswaran, Vice President, Exadata Development
Gurmeet Goindi, Exadata Product Management
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
The Exadata Database Machine Vision
Best Platform for the Oracle Database – On Premises and in the Cloud
1. State-of-the-art enterprise-grade hardware, refreshed yearly (processors, flash, disks, network)
2. Sized, tuned and optimized exclusively for Oracle Database workloads (DW, Analytics, OLTP, Mixed)
3. High-powered intelligent storage servers capable of offloading database workloads
4. "Smart" database protocols and optimizations from servers to network to storage
5. One vendor responsible for all hardware, software and customer support
Exadata Unique Intellectual Property
Proven at Thousands of Critical Deployments since 2008
Half OLTP - Half Analytics - Many Mixed
• Petabyte Warehouses
• Online Financial Trading
• Business Applications – SAP, Oracle, Siebel, PSFT, …
• Massive DB Consolidation
• Public SaaS Clouds – Oracle Fusion Apps, Salesforce, SAS, …
4 OF THE TOP 5 BANKS, TELCOS, RETAILERS RUN EXADATA
• On-Premises: Exadata Database Machine – Customer Data Center, Purchased, Customer Managed
• Cloud at Customer: Exadata Cloud Machine (Preview) – Customer Data Center, Subscription, Oracle Managed
• Public Cloud: Exadata Cloud Service – Oracle Cloud, Subscription, Oracle Managed
Exadata Database Machine X6-2
• Scale-Out Database Servers
  – 2 socket x86 processors
  – 44 CPU cores
  – 256 GB - 1.5 TB DRAM
• Fastest Internal Fabric
  – 40 Gb/s InfiniBand
  – Ethernet external connectivity
• Scale-Out Intelligent Storage
  – High-Capacity Storage Server: 12.8 TB PCI Flash, 96 TB disk, 20 CPU cores
  – Extreme Flash Storage Server: 25.6 TB PCI Flash, 20 CPU cores
Compute Software
  – Oracle Linux 6
  – Oracle Database Enterprise Edition
  – Oracle VM (optional)
  – Oracle Database options (optional)
Storage Server Software
  – Smart Scan (SQL Offload)
  – Smart Flash Cache
  – Hybrid Columnar Compression
  – I/O Resource Management
Exadata Database Machine X6-8
• Scale-Out Database Servers
  – 8-socket x86 processors
  – 144 cores
  – 2-6 TB DRAM
• Fastest Internal Fabric
  – 40 Gb/s InfiniBand
  – Ethernet external connectivity
• Scale-Out Intelligent Storage
  – High-Capacity Storage Server
  – Extreme Flash Storage Server
Storage Server Software
  – Smart Scan (SQL Offload)
  – Smart Flash Cache
  – Hybrid Columnar Compression
  – I/O Resource Management
Same Networking, Storage and Software as X6-2
Large SMP Processor Model
  – Large warehouses
  – Massive database consolidation
  – Big In-Memory databases
Elastic Configurations
Incrementally Scale Servers
Achieve any Level of Performance with Minimum Hardware
• Start Small: 2 Database Servers, 3 Storage Servers
• Full Rack: incrementally add DB or Storage Servers (Database Server, High Capacity Storage, Extreme Flash Storage)
• Multi-Rack: add racks to continue scaling
• Enable Database CPU cores as needed with Capacity on Demand
• Expand older Exadata machines with new X6-2 servers
Oracle Database Exadata Cloud Service
Best of On-Premises with Best of Cloud
• Full Oracle Database with all advanced options
  – 100% compatible with on-premises databases
• On fastest and most available database cloud platform
  – Scale-Out Compute, Scale-Out Intelligent Storage, InfiniBand, PCIe Flash
  – Complete isolation of tenants with no overprovisioning
• All Benefits of Public Cloud
  – Fast, elastic, web driven provisioning
  – Oracle experts deploy and manage infrastructure
  – Monthly or yearly subscription with online capacity bursting
Preview: Oracle Public Cloud Services @ Customer
• Same PaaS and IaaS hardware and software as Oracle Public Cloud
• Managed by Oracle and delivered as a service in your data center behind your firewall
• Same cost-effective subscription pricing model as Oracle Cloud
• Helps conform to business and government security requirements
• Connect via fast LAN to existing systems
Exadata X6 is Much Faster and Cheaper than All-Flash EMC
[Chart: Analytic Scans, GB/sec – 8 X-Brick EMC XtremIO: 24; 1 Rack HC Exadata: 301 (12X)]
[Chart: OLTP Write IOPS – 8 X-Brick EMC XtremIO: 2M; 1 Rack HC Exadata: 5.2M (2.5X)]
• One High Capacity Exadata beats the fastest EMC XtremIO all-flash array in every performance metric
  – 12X more throughput
  – 2.5X more IOPS
  – 2X faster latency
• EMC performance does not scale higher - Exadata scales by adding racks
• EMC 8 X-Brick XtremIO: $7.8M; Exadata X6-2 Full Rack: $1.1M
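The multipliers on this slide can be checked against the quoted chart values. The short sketch below only redoes that arithmetic on the slide's own figures; the dictionary keys and variable names are illustrative, and nothing here is an independent measurement.

```python
# Figures quoted on the slide; key names are illustrative, not from the deck.
emc = {"scan_gb_per_sec": 24.0, "write_iops_millions": 2.0, "price_musd": 7.8}
exadata = {"scan_gb_per_sec": 301.0, "write_iops_millions": 5.2, "price_musd": 1.1}


def ratio(metric: str) -> float:
    """Exadata value divided by the EMC value for one quoted metric."""
    return exadata[metric] / emc[metric]


scan_ratio = ratio("scan_gb_per_sec")       # the slide rounds this to 12X
iops_ratio = ratio("write_iops_millions")   # the slide rounds this to 2.5X
price_ratio = emc["price_musd"] / exadata["price_musd"]  # EMC price / Exadata price

print(f"scan throughput ratio: {scan_ratio:.1f}")
print(f"write IOPS ratio:      {iops_ratio:.1f}")
print(f"list price ratio:      {price_ratio:.1f}")
```

The quoted values give slightly higher ratios than the rounded headline multipliers, so the slide's 12X and 2.5X are conservative roundings of its own numbers.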
Preview: Exadata SL6
Linux on SPARC, Software in Silicon
Database Intelligence Extended into CPU Chip
SPARC M7 Software in Silicon
• Traditional DB algorithms too complex for chips
• Big Change: In-memory algorithms are much simpler
• 5 years ago Oracle initiated a revolutionary project
  – Build fastest ever microprocessor
    • Most processing cores (32)
    • Most concurrent threads (256)
    • Fastest memory bandwidth (160 GB/sec)
  – Add In-Memory DB operations directly on chip
In-Memory Algorithms Natively Implemented in Silicon
SPARC M7 Software in Silicon
• SQL in Silicon: DB Acceleration
• Capacity in Silicon: Decompression Engines
• Silicon Secured Memory: Fine-Grained Memory Protection
Database Software Already Available
SQL in Silicon: Database In-Memory Acceleration Engines
• SIMD vector instructions are fast, but were designed for graphics, not database
• New SPARC M7 chip has 32 optimized database acceleration engines (DAX) built on chip
• Independently process streams of columns
  – E.g. find all values that match 'California'
  – Up to 170 billion rows per second!
• Like adding 32 additional specialized cores to chip
  – Using less than 1% of chip space
[Diagram: SPARC M7 with cores and DB Accel engines around a shared cache]
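To make the "find all values that match 'California'" example concrete, here is a minimal software sketch of a column scan over dictionary-encoded values. Dictionary encoding and the function names `dictionary_encode` and `scan_equals` are assumptions made for illustration; the slide does not describe the DAX interface or the column format it consumes.

```python
def dictionary_encode(values):
    """Replace each value by a small integer code (assumed column format)."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def scan_equals(dictionary, codes, target):
    """Return the row positions whose decoded value equals target."""
    if target not in dictionary:
        return []  # value never occurs, no need to look at the codes
    target_code = dictionary.index(target)
    # The comparison runs on integer codes, never on the strings themselves.
    return [row for row, code in enumerate(codes) if code == target_code]


states = ["California", "Texas", "California", "Utah", "Texas", "California"]
dictionary, codes = dictionary_encode(states)
matches = scan_equals(dictionary, codes, "California")
print(matches)
```

The point of the sketch is that the predicate reduces to one integer comparison per row over a contiguous stream, which is the kind of simple, regular work the slide says can be moved into dedicated hardware.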
Capacity in Silicon: Decompression Engines
• Compression is key to putting more data in-memory
• Decompression is far more important for databases than compression
  – Data is loaded once, queried many times
• Bit pattern decompression in normal cores is slow
  – 64 CPU cores needed to decompress at full memory speed
• SPARC M7 adds 32 optimized decompress engines
  – Run bit-pattern decompress at memory speed
Doubles Memory Capacity
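As a rough illustration of why bit-level decompression costs CPU cycles, the sketch below packs small integer codes into fixed-width bit fields and unpacks them with shifts and masks. Fixed-width bit packing is an assumed stand-in for the slide's "bit pattern" compression, whose actual format is not given; `pack` and `unpack` are hypothetical names.

```python
def pack(codes, width):
    """Pack integer codes into one integer, using `width` bits per code."""
    word = 0
    for position, code in enumerate(codes):
        if code < 0 or code >= (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        word |= code << (position * width)
    return word


def unpack(word, width, count):
    """Recover `count` codes; every value costs one shift and one mask."""
    mask = (1 << width) - 1
    return [(word >> (position * width)) & mask for position in range(count)]


codes = [0, 1, 0, 2, 1, 0]       # e.g. dictionary codes for a column
packed = pack(codes, width=2)    # 12 bits in total for six values
restored = unpack(packed, width=2, count=len(codes))
print(restored == codes)
```

Every query that touches the column repeats the unpack step, which is consistent with the slide's argument that decompression, not compression, is the operation worth accelerating.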
Silicon Secured Memory: Fine Grained Memory Protection
• Database In-Memory places terabytes of data in memory
  – More vulnerable to corruption by bugs/attacks than storage
• SPARC M7 locks memory as it is allocated so only the owner can access it
  – Hidden "color" bits added to pointers (key), and content (lock)
  – Pointer color (key) must match content color or program is aborted
  – Hardware support eliminates performance impact
• Helps prevent access off end of structure, stale pointer access, malicious attacks, etc. plus improves developer productivity
[Diagram: Memory Pointers checked against Memory Content; a mismatch leads to STOP]
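The key-and-lock check described above can be modeled in a few lines. This is a software simulation only: the class `ColoredMemory`, the 15-color palette, the use of color 0 for unallocated memory, and the representation of a pointer as a `(start, color)` tuple are all assumptions for illustration, not the hardware design.

```python
class ColorMismatch(Exception):
    """Raised when a pointer's color (key) differs from the memory's color (lock)."""


class ColoredMemory:
    NUM_COLORS = 15  # assumption: colors 1..15, with 0 meaning "not allocated"

    def __init__(self, size):
        self.data = [0] * size
        self.colors = [0] * size   # one lock color per memory cell
        self._next_color = 1

    def allocate(self, start, length):
        """Color a region and return a pointer carrying the same color."""
        color = self._next_color
        self._next_color = self._next_color % self.NUM_COLORS + 1
        for address in range(start, start + length):
            self.colors[address] = color
        return (start, color)

    def free(self, pointer, length):
        """Reset the region's color so old pointers no longer match."""
        start, _ = pointer
        for address in range(start, start + length):
            self.colors[address] = 0

    def _check(self, pointer, offset):
        start, color = pointer
        address = start + offset
        if self.colors[address] != color:
            raise ColorMismatch(f"pointer color {color} != memory color {self.colors[address]}")
        return address

    def load(self, pointer, offset):
        return self.data[self._check(pointer, offset)]

    def store(self, pointer, offset, value):
        self.data[self._check(pointer, offset)] = value


def is_blocked(action):
    """Run action and report whether the color check stopped it."""
    try:
        action()
    except ColorMismatch:
        return True
    return False


memory = ColoredMemory(16)
a = memory.allocate(0, 4)
b = memory.allocate(4, 4)
memory.store(a, 0, 42)
overrun_blocked = is_blocked(lambda: memory.load(a, 4))  # reads past the end of a, into b
memory.free(a, 4)
stale_blocked = is_blocked(lambda: memory.load(a, 0))    # a is now a stale pointer
print(overrun_blocked, stale_blocked)
```

Both failure cases from the slide, access off the end of a structure and stale pointer access, show up here as a color mismatch. In the simulation the check costs a comparison per access; the slide's claim is that doing it in hardware removes that cost.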
Exadata SL6: Exadata with Ultra-fast SPARC Linux Servers
• Identical to Exadata, with x86 database servers replaced by SPARC T7-2 servers
  – Ultra-fast 32-core SPARC M7 processors
  – Two-socket T7-2 servers
• Same elastic configurations as Exadata X6-2
• Storage servers identical to Exadata X6-2
• Runs same Oracle Linux as Exadata X6-2
  – Oracle Linux (UEK2) – single domain configuration
• Runs Oracle Database 12.1.0.2
Preview: Exadata SL6
World's Fastest and Most Secure Linux Database Machine
• Massive Memory Bandwidth: 1.9x Intel x86
• Fastest Database Processor: 2.2x Intel x86
• Silicon Secured Memory: End to End Database Security
Exadata Smart System Software
Smart System Software Highlights
Smart Analytics
• Move queries to storage, not storage to queries
• Automatically offload and parallelize queries across all storage servers
• 100X faster analytics
Smart Storage
• Hybrid Columnar Compression reduces space usage by 10X
• Database-aware Flash Caching gives speed of flash with capacity of disk
Smart OLTP
• Special InfiniBand protocol enables highest speed, lowest latency OLTP
• Ultra-fast transactions using DB optimized flash logging algorithms
• Fault-tolerant In-Memory DB by mirroring memory across servers
Smart Consolidation
• Workload prioritization from CPU to network to storage ensures QoS
• 4X more databases in same hardware
Smart System Software Introduced in 2015
Smart Analytics
• 5X faster scans by converting data to columnar format in Flash Cache
• 3X faster JSON/XML by offloading to storage servers
Smart Consolidation
• Zero overhead VMs
• Snapshots for test/dev
• Set flash cache min size per DB to ensure QoS
• InfiniBand partitioning
• IPv6 for Ethernet
Smart OLTP
• 3X faster OLTP messaging using direct DB to InfiniBand access
• Instant detection of node failure
• Sub-second capping of I/O latency by rerouting I/Os to faster storage
Smart Licensing
• Capacity-on-Demand reduces license cost by disabling unneeded cores
• Trusted Partitions limit license scope of specialized options
Preview: New Smart System Software
Smart Analytics
• Database In-Memory columnar format in storage server
• Aggregation in storage
• Set membership using new type of storage index
Smart Consolidation
• Hierarchical snapshots
• 2X application connections*
• Automated VLAN creation*
• Add extra 10g Ethernet card
• 64 GB DIMMs for 2X memory
Smart OLTP
• Smart Fusion Block Transfer eliminates log writes when moving blocks between nodes*
• Automated rolling upgrade across full stack
• 2X faster disk recovery
Smart Availability
• Short Range Stretch (Extended) clusters
• 4X faster software updates*
• High redundancy Quorum disks on Quarter and Eighth racks*
• Storage Index preserved on rebalance*
* Already Released
Upcoming: In-Memory Format in Columnar Flash Cache
• In-Memory formats used in Smart Columnar Flash Cache
• Enables vector processing on storage server during smart scans
  – Multiple column values evaluated in single instruction
• Faster decompression speed than Hybrid Columnar Compression
• Enables dictionary lookup and avoids processing unnecessary rows
• Smart Scan results sent back to database in In-Memory Columnar format
  – Reduces database node CPU utilization
• In-memory performance seamlessly extended from DB node DRAM memory to 10x capacity flash in storage
  – Even bigger differentiation against all-flash arrays and other in-memory databases
[Diagram: In-Memory Columnar scans on the database node, In-Flash Columnar scans on storage]
Upcoming release of Exadata Software
Upcoming: Storage Index Set Membership
• Storage Index
  – Currently contains up to 8 columns of min/max summary
  – Created automatically and kept in memory
  – Used to skip performing I/Os
• What about queries with low cardinality columns?
    select name, address from travels
    where origin = 'Sierra Leone' and dest = 'CA'
• Traditional min/max not good enough
• Database gathers stats and finds that the column has fewer than 256 distinct values
• Database requests storage to compute a bloom filter
• Storage will compute distinct values and create a bloom filter
• Smart Scans check value 'CA' against the bloom filter and save performing I/O

ORIGIN        DEST  NAME   ADDRESS
Sierra Leone  AZ    Alice  …
Sierra Leone  UT    Bob    …
Sierra Leone  VT    John

First Scan: Create Bloom Filter from HASH(AZ), HASH(UT), HASH(VT); Bloom Filter kept in Storage Index
Future Scans: Lookup HASH(CA) in the Bloom Filter, SAVE I/O
Upcoming release of Exadata Software
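The example on this slide can be sketched in code to show why min/max is "not good enough" for dest = 'CA' while a set-membership summary is. The classes `BloomFilter` and `RegionSummary`, the SHA-256 based hashing, and the filter size are assumptions for illustration; the slide does not describe how Exadata builds or sizes its filter.

```python
import hashlib


class BloomFilter:
    """Small bloom filter; sizes and hash scheme are illustrative assumptions."""

    def __init__(self, num_bits=2048, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in one Python integer

    def _positions(self, value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for double hashing
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for position in self._positions(value):
            self.bits |= 1 << position

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> position) & 1 for position in self._positions(value))


class RegionSummary:
    """Per-region summary of the DEST column: min/max plus a bloom filter."""

    def __init__(self, dest_values):
        self.low = min(dest_values)
        self.high = max(dest_values)
        self.bloom = BloomFilter()
        for value in set(dest_values):
            self.bloom.add(value)

    def minmax_can_skip(self, target):
        return target < self.low or target > self.high

    def bloom_can_skip(self, target):
        return not self.bloom.might_contain(target)


region = RegionSummary(["AZ", "UT", "VT"])    # DEST values from the slide's table
print(region.minmax_can_skip("CA"))  # 'CA' sorts between 'AZ' and 'VT'
print(region.bloom_can_skip("CA"))
```

Because 'CA' falls inside the range AZ..VT, the min/max summary cannot rule the region out, whereas the filter built from the three distinct values can. A bloom filter never reports a stored value as absent, so skipping on a negative lookup is safe; a false positive only costs an I/O that could have been avoided.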
Upcoming: Join and Aggregation Smart Scan
• Extend In-Memory Aggregation technique into storage
• Find sales per country
    SELECT /*+ VECTOR_TRANSFORM */ country_id, sum(amount_sold) amount_sold
    FROM customers, sales
    WHERE customers.cust_id = sales.cust_id
    GROUP BY customers.country_id
    ORDER BY customers.country_id;
• Storage cells scanning sales fact table will return tuples {country_id, sum_amount_sold}
• Join and aggregation offloaded to the storage server
12.2 Database and 12.2 Exadata Storage Server Software
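One way to picture the offload is as a two-phase aggregation: each storage cell joins and sums its own slice of the sales table and returns {country_id, sum_amount_sold} tuples, and the database merges those partial results. The sketch below models that flow under stated assumptions: the customer-to-country mapping shipped to the cells, the function names `cell_scan` and `merge_partials`, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def cell_scan(sales_rows, country_of_customer):
    """Work done on one storage cell: join on cust_id, then sum per country."""
    partial = defaultdict(int)
    for cust_id, amount_sold in sales_rows:
        country_id = country_of_customer.get(cust_id)
        if country_id is not None:          # inner join: drop unmatched sales rows
            partial[country_id] += amount_sold
    return dict(partial)


def merge_partials(partials):
    """Work left for the database: add up partial sums and order by country_id."""
    totals = defaultdict(int)
    for partial in partials:
        for country_id, amount in partial.items():
            totals[country_id] += amount
    return sorted(totals.items())


# Hypothetical dimension data (cust_id -> country_id) sent to every cell.
country_of_customer = {1: 10, 2: 20, 3: 10}
# Hypothetical slices of the sales fact table held by two cells.
cell_a_rows = [(1, 100), (2, 50)]
cell_b_rows = [(3, 25), (1, 5), (4, 999)]   # cust_id 4 has no customer row

partials = [cell_scan(rows, country_of_customer) for rows in (cell_a_rows, cell_b_rows)]
result = merge_partials(partials)
print(result)
```

Each cell returns at most one tuple per country instead of one row per sale, which is where the reduction in data sent to the database comes from.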
Upcoming: Smart Write Bursts and Temp IO in Flash Cache
• Write throughput of four flash cards has become greater than the write throughput of 12 disks
• When database write throughput exceeds the throughput of disks, smart flash cache intelligently caches writes
• When queries write a lot of temp IO and it is bottlenecked on disk, smart flash cache intelligently caches temp IO
  – Writes to flash for temp spill reduce elapsed time
  – Reads from flash for temp reduce elapsed time further
• Smart flash cache prioritizes OLTP data and does not remove hot OLTP lines from the cache
• Smart flash wear management for large writes
Upcoming release of Exadata Software
Upcoming: Smart Analytics Software Features
• Compressed Index Fast Full Scan
• Smart Scan VIEWs with LOBs, XML and JSON – not just tables
• AWR Enhancements
  – Diff report for Exadata section
  – Flash Cache metrics
  – More granular histograms
• Up to 25% reduction in Storage Server CPU for SPARC SuperCluster during Smart Scans
  – Reduces endianness conversion overhead
12.2 Database and 12.2 Exadata Storage Server Software
Upcoming: Snapshots
• Hierarchical Snapshots
  – Create snapshots of databases on previously created snapshots
  – Use case example
    • Development releases nightly build of the database
    • Tester creates a snapshot for himself and finds a bug
    • Tester creates a snapshot of his snapshot
    • Tester provides the new copy back to development for analysis
  – Syntax and technology remain unchanged
  – Works with pluggable and non-pluggable databases
• Sparse backup of snapshots
  – RMAN backs up the modified blocks and not the unchanged blocks from parent
[Diagram: Nightly Master, Test Snapshot, Snapshot to Dev]
12.2 Database and 12.2 Exadata Storage Server Software
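The use case above maps naturally onto a chain of copy-on-write snapshots, where each snapshot stores only the blocks it has changed and falls back to its parent for everything else. The `Snapshot` class below is a minimal sketch of that idea, assuming a parent is no longer modified once a child exists; it is not the Exadata or RMAN implementation, and all names are hypothetical.

```python
class Snapshot:
    """Copy-on-write snapshot that stores only its own modified blocks."""

    def __init__(self, name, parent=None, blocks=None):
        self.name = name
        self.parent = parent                  # None for the base (master) copy
        self.own_blocks = dict(blocks or {})  # block number -> contents

    def snapshot(self, name):
        """Create a child snapshot; no blocks are copied at creation time."""
        return Snapshot(name, parent=self)

    def write(self, block_no, data):
        self.own_blocks[block_no] = data

    def read(self, block_no):
        """Look in this snapshot first, then walk up the parent chain."""
        node = self
        while node is not None:
            if block_no in node.own_blocks:
                return node.own_blocks[block_no]
            node = node.parent
        raise KeyError(block_no)

    def sparse_backup(self):
        """Blocks a sparse backup would need: only those changed in this snapshot."""
        return dict(self.own_blocks)


nightly = Snapshot("nightly_master", blocks={0: "a", 1: "b", 2: "c"})
test = nightly.snapshot("test_snapshot")
test.write(1, "state that shows the bug")
to_dev = test.snapshot("snapshot_to_dev")    # snapshot of a snapshot
to_dev.write(2, "developer notes")
print(to_dev.read(0), "|", to_dev.read(1), "|", to_dev.read(2))
print(to_dev.sparse_backup())
```

The developer's copy sees the tester's change through the parent chain without duplicating any block, and its sparse backup contains only the single block changed at that level.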
Upcoming: Extended Distance Clusters
• Two sites and a quorum site
• InfiniBand connected for high performance
  – 100m optical cables in 2016 (best for fire cells)
• Created using ASM Extended Diskgroups
  – Nested failure groups
• Compute nodes at each site read data local to that site
• Data is written to all sites
• Smart Scans scan across cells on both sites, increasing throughput
  – Row filtering, column projection, storage index, and flash cache provide extreme performance
• Data Guard continues to be the recommended DR solution
[Diagram: two sites connected by InfiniBand, plus a Quorum Failure Group]
12.2 Database and 12.2 Exadata Storage Server Software
Smart Fusion Block Transfer
• OLTP workloads can have hot blocks that are frequently updated (e.g. right-growing index)
  – Log file must be written before transferring a hot block between instances so the block can be recovered
  – Adds latency and reduces throughput
• On Exadata, Oracle does not wait for the log write
  – Exadata ensures the log write completes before changes to block on another instance commit, guaranteeing durability
  – Wait for log I/O during transfer of hot blocks is eliminated
  – Up to 40% throughput and 33% response time improvement in some heavily contended OLTP workloads
Prior Inter-Instance Block Transfer Protocol
1. Issue log write
2. Wait for log write completion
3. Transfer block
Exadata Avoids I/O Wait
Available with 12.1.0.2 BP12
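The difference between the two protocols comes down to where the log write sits on the timeline. The toy model below compares them with made-up latencies in microseconds; the function names, the parameters, and every number are hypothetical, and the model ignores queuing and contention.

```python
def prior_protocol(log_write_us, transfer_us):
    """Prior protocol: the block is sent only after the log write completes."""
    block_arrives = log_write_us + transfer_us
    return block_arrives


def smart_fusion(log_write_us, transfer_us, work_before_commit_us):
    """Log write and transfer start together; only the commit waits for the log."""
    block_arrives = transfer_us
    sender_log_durable = log_write_us
    # The receiving instance may work on the block at once, but its commit
    # must not complete before the sender's log write is durable.
    earliest_commit = max(block_arrives + work_before_commit_us, sender_log_durable)
    return block_arrives, earliest_commit


# Hypothetical latencies, chosen only to make the comparison visible.
LOG_WRITE_US = 500
TRANSFER_US = 100
WORK_US = 200

old_arrival = prior_protocol(LOG_WRITE_US, TRANSFER_US)
new_arrival, new_commit = smart_fusion(LOG_WRITE_US, TRANSFER_US, WORK_US)
print(old_arrival, new_arrival, new_commit)
```

In this model the receiving instance gets the hot block without waiting for the log I/O, and durability is preserved because its commit is still ordered after the sender's log write, which is the guarantee the slide describes.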
Upcoming: Super Fast Software Updates
• 4x speed up in Storage Server Software update
  – Parallel firmware upgrades across components such as hard disks, flash, ILOM/BIOS, InfiniBand card
  – Reduced reboots for software updates
  – Use kexec where possible
• Manage a cloud instead of managing a single rack
  – Use single patchmgr utility to upgrade hundreds of racks
• Enable patchmgr to run from a non-Exadata system and run as low privileged user
Upcoming release of Exadata Software
Upcoming: Extreme Manageability
• IPv6 + Virtual machine + VLAN deployments
• Get graphs from Exawatcher
• Make DNS, NTP, and other IP address changes online
• Seamless customer service with Automatic Service Requests sending diagnostic attachments
• Manage compute nodes using a RESTful service
  – ExaCli enabled for compute nodes in addition to storage cells
• Much faster rebalance with improved flash cache hit ratio during rebalance
• Secure Erase during hardware retirement
Upcoming release of Exadata Software
Exadata Advantages Increase Every Year
• Smart Scan
• InfiniBand Scale-Out
• Database Aware Flash Cache
• Storage Indexes
• Columnar Compression
• IO Priorities
• Data Mining Offload
• Offload Decrypt on Scans
• In-Memory Fault Tolerance
• Direct-to-wire Protocol
• JSON and XML offload
• Instant failure detection
• Network Resource Management
• Multitenant Aware Resource Mgmt
• Prioritized File Recovery
• Unified InfiniBand
• Scale-Out Servers
• Scale-Out Storage
• DB Processors in Storage
• PCIe NVMe Flash
• Tiered Disk/Flash
• Software-in-Silicon
• 3D V-NAND Flash
• In-Memory Columnar in Flash
• Smart Fusion Block Transfer
• Exadata Cloud Service
Transformational OLTP, Analytics, Consolidation
Cloud Without Compromise