Assignment 3 Kdtree


  • 8/12/2019 Assignment 3 Kdtree


CS106L Fall 2013

Handout #03 November 21, 2013

Assignment 2: KDTree

Due December 9, 11:59 PM

Over the past six weeks, we've explored a wide array of STL container classes. You've seen the linear vector and deque, along with the associative map and set. One property common to all these containers is that they are exact. An element is either in a set or it isn't. A value either appears at a particular position in a vector or it does not. For most applications, this is exactly what we want. However, in some cases we may be interested not in the question "is X in this container," but rather "what value in the container is X most similar to?" Queries of this sort often arise in data mining, machine learning, and computational geometry. In this assignment, you will implement a special data structure called a kd-tree (short for "k-dimensional tree") that efficiently supports this operation.

At a high level, a kd-tree is a generalization of a binary search tree that stores points in k-dimensional space. That is, you could use a kd-tree to store a collection of points in the Cartesian plane, in three-dimensional space, etc. You could also use a kd-tree to store biometric data, for example, by representing the data as an ordered tuple, perhaps (height, weight, blood pressure, cholesterol). However, a kd-tree cannot be used to store collections of other data types, such as strings. Also note that while it's possible to build a kd-tree to hold data of any dimension, all of the data stored in a kd-tree must have the same dimension. That is, you can't store points in two-dimensional space in the same kd-tree as points in four-dimensional space.

It's easiest to understand how a kd-tree works by seeing an example. Below is a kd-tree that stores points in three-dimensional space:

[Figure: a kd-tree of three-dimensional points. Rows of the drawing, top to bottom:
(3, 1, 4)
(2, 3, 7) (4, 3, 4)
(2, 1, 3) (2, 4, 5) (6, 1, 4)
(5, 2, 5) (1, 4, 4) (0, 5, 7)
(4, 0, 6) (7, 1, 6)]

Notice that in each level of the kd-tree, a certain component of each node has been bolded. If we zero-index the components (i.e. the first component is component zero, the second component is component one, etc.), then in level n of the tree, the (n % 3)rd component of each node is shown in bold. These values are bolded because each node acts like a binary search tree node that discriminates only along the bolded component. For example, the first component of every node in the left subtree is less than the first component of the root of the tree, while every node in the right subtree has a first component at least as large as the root node's. Similarly, consider the kd-tree's left subtree. The root of this tree has the value (2, 3,


[Figure: a binary search tree with root 2, whose children are -1 and 4; the left child -1 has children -2 and 0.]

Now, suppose that we traverse the BST looking for zero. We begin at the root and check whether the root node has the value we're looking for. Since it doesn't, we determine which of the two subtrees to descend into, then recursively look in that subtree for zero. Mathematically, this is equivalent to splitting the real number line into two regions: numbers less than two and numbers greater than two. This is shown here:

[Figure: the same binary search tree, with the number line split at the root into "values less than two" and "values greater than two".]

Notice that all of the nodes in the left subtree are in the left partition and all the nodes in the right subtree are in the right partition. Since 0 < 2, we know that if zero is contained in this tree at all, it must be in the left partition. This immediately rules out the possibility that zero is in the right subtree, and so we can recursively descend into the left subtree without worrying about missing the node for zero.

The above discussion highlights the key insight that makes binary search trees possible. Each node defines some partition of the real line into two segments, and each of the node's subtrees is fully contained within one of the segments. Searching a BST can thus be thought of as continuously splitting space in half, then continuing the search only in the half that contains the value in question.
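The halving view of BST search can be sketched in a few lines. This is a minimal illustration only; the names BSTNode and bstContains are made up for this example and do not come from the starter code.

```cpp
#include <cstddef>

// Hypothetical node type, for illustration only.
struct BSTNode {
    int value;
    BSTNode* left;   // subtree of values less than `value`
    BSTNode* right;  // subtree of values greater than `value`
};

// Each iteration discards the half of the number line that cannot
// contain `target`, then continues in the remaining half.
bool bstContains(const BSTNode* curr, int target) {
    while (curr != NULL) {
        if (target == curr->value) return true;
        curr = (target < curr->value) ? curr->left : curr->right;
    }
    return false;
}
```

Searching the example tree for zero visits 2, then -1, then 0, and never touches the right subtree.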

The main reason for mentioning this line of reasoning is that it is possible to scale this up to data of higher dimensions. Suppose, for example, that we have the following collection of points in the plane:

Suppose that we want to build a binary search tree out of these points. If we use the familiar definition of a BST, we would pick some node as the root, then build one subtree out of the remaining nodes that are "less than" the root node and one subtree out of the values that are "greater than" the root node. Unfortunately, there isn't a particularly good definition of what it means for a point in space to be less than another. But let's instead consider the view of a BST we discussed above. In a BST, each point naturally splits the entire real line into two regions. In two dimensions, we can split the plane into two regions around a point by drawing a line through that point. For example, if we draw the following line through the indicated point:

Then we've split the plane into two distinct regions, one above the line and one below this line. This observation gives us a way to build a binary search tree in multiple dimensions. First, pick an arbitrary point in space and draw a line (chosen however you'd like) through it. Next, separate the remaining points into points to one side of the line and points on the other. Finally, recursively construct binary search trees out of those points. This technique is known as binary space partitioning (since each step splits space into two regions), and trees generated this way are known as binary space partitioning trees or BSP trees.

But BSP trees are not restricted to just the two-dimensional plane; the same technique works in arbitrarily many dimensions. In three dimensions, we could partition space into two regions by drawing a plane through a point, then taking the regions above and below the plane as the two half-regions. When working with BSP trees, one often uses the term splitting hyperplane to refer to the object passing through a point that splits space in half. In two dimensions, a hyperplane is a line, while in three it's a plane. In a standard binary search tree, this "hyperplane" is just a point.

What does any of this discussion have to do with kd-trees? To answer that question, let's return to our original collection of points in two-dimensional space, as shown here:

Suppose that we want to build a kd-tree out of these data points. We begin by choosing some node (which we'll say is at (x_0, y_0) for notational simplicity) and splitting the data set into two groups, one of points whose x components are less than the splitting node's, and one of points whose x components are at least as large as the splitting node's. We can visualize the split like this:

Notice that this is essentially equivalent to running a splitting hyperplane through one of the points. In that sense, a kd-tree is a special case of a BSP tree with a special rule that determines which splitting hyperplanes to use. However, we've done so without needing to write any code that manipulates hyperplanes or half-spaces. All of the complex geometry is taken care of implicitly.

Let's continue building this kd-tree. We recursively build a kd-tree in the right half-space (the points to the right of the central node) by picking some point and splitting the data horizontally through it, as seen here:

[Figure labels for the first, vertical split: x < x_0 on one side and x ≥ x_0 on the other.]

If we continue this construction to completion, our resulting kd-tree will look like this:

Here, the gold node is the root, nodes one level down are red, nodes two levels deep are green, and nodes three levels deep are blue.

To give you a better sense for the geometric intuition behind this kd-tree, let's trace through what happens when we try looking up whether a given point is in the kd-tree. In particular, let's see what happens as we try to look up the node in the bottom-right corner of the kd-tree. We begin at the root of the kd-tree and consider whether our node's x coordinate is less than or greater than the root node's x coordinate. This is equivalent to splitting the plane vertically at the root node, then asking which half-space our node is in. Our node happens to be in the right half-space, and so we can ignore all of the nodes in the left half-space and recursively explore the right. This is shown graphically below, where the grayed-out region corresponds to parts of the plane we will never look in:

[Figure labels for the horizontal split in the right half-space: y ≥ y_1 and y < y_1.]

Now, we check whether our node is above or below the red node, which is the root of the tree in this half-space. Our node is below it, so we can discard the top half-space and look in the bottom. This is shown here:

Next, we check whether we're to the left or the right of the green node that's the root of this region of space. We're to the right, so we discard the sliver of a half-space to the left of that node and continue on:

At this point, we have reached the node we're looking for, and the search algorithm terminates.
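The construction and lookup traced above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the assignment's interface: Pt, KDNode, kdInsert, kdContains, and kdFree are hypothetical names, and points are inserted one at a time with the splitting axis cycling by level.

```cpp
#include <cstddef>

// Hypothetical stand-ins; the starter code's Point and your KDTree differ.
template <std::size_t N>
struct Pt {
    double coords[N];
};

template <std::size_t N>
struct KDNode {
    Pt<N> point;
    KDNode* left;
    KDNode* right;
};

// Inserts `pt` into the subtree rooted at `root` and returns the subtree's
// root. The node at depth `level` splits on coordinate level % N.
template <std::size_t N>
KDNode<N>* kdInsert(KDNode<N>* root, const Pt<N>& pt, std::size_t level = 0) {
    if (root == NULL) {
        KDNode<N>* node = new KDNode<N>;
        node->point = pt;
        node->left = node->right = NULL;
        return node;
    }
    std::size_t axis = level % N;
    if (pt.coords[axis] < root->point.coords[axis])
        root->left = kdInsert(root->left, pt, level + 1);
    else
        root->right = kdInsert(root->right, pt, level + 1);
    return root;
}

// Walks down the tree, discarding one half-space per level.
template <std::size_t N>
bool kdContains(const KDNode<N>* curr, const Pt<N>& pt) {
    for (std::size_t level = 0; curr != NULL; ++level) {
        bool same = true;
        for (std::size_t i = 0; i < N; ++i)
            if (curr->point.coords[i] != pt.coords[i]) same = false;
        if (same) return true;
        std::size_t axis = level % N;
        curr = (pt.coords[axis] < curr->point.coords[axis]) ? curr->left
                                                            : curr->right;
    }
    return false;
}

// Releases every node in the subtree.
template <std::size_t N>
void kdFree(KDNode<N>* root) {
    if (root == NULL) return;
    kdFree(root->left);
    kdFree(root->right);
    delete root;
}
```

Note that the comparison uses "less than" for the left side and "at least as large" for the right, matching the rule used earlier in the handout.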

Nearest-Neighbor Lookup in kd-Trees

Now that you have a better geometric intuition for the kd-tree, we can talk about the most interesting operation on the kd-tree: nearest-neighbor lookup. This query works as follows: given a kd-tree and a point in space (called the test point), which point in the kd-tree is closest to the test point? (The point in the data set closest to the test point is called its nearest neighbor.) Before we discuss the actual algorithm for doing nearest-neighbor lookup, we'll discuss the intuition behind the algorithm.

Suppose that we have a guess of what we think the nearest neighbor to the test point is. For example, suppose that the test point is indicated by the star and that we think the nearest neighbor is the point connected to the star by the dashed line:


Given our guess of what the nearest neighbor is, we can make a crucial observation. If there is a point in this data set that is closer to the test point than our current guess, it must lie in the circle centered at the test point that passes through the current guess. This circle is shown here:

Although in this example this region is a circle, in three dimensions it would be a sphere, and in general we call it the candidate hypersphere.

The reason that this observation is so important is that it lets us prune which parts of the tree might hold the true nearest neighbor. In particular, notice that this circle is entirely to the right of the splitting hyperplane running vertically through the root of the tree. Consequently, any point to the left of the root of the tree cannot possibly be in the candidate hypersphere, and consequently can't be any better than our current guess. In other words, once we have a guess about where the nearest neighbor is, we can start eliminating parts of the tree where the actual answer cannot be. This general technique of searching a large space and pruning options based on partial results is called branch-and-bound.

From the picture it's clear that the circle of possible nearest neighbors does not cross the middle splitting hyperplane, but how can we determine this mathematically? In general, given a circle and a line (or, more generally, a hypersphere and a hyperplane), it's a bit tricky to determine whether that circle intersects the line. Fortunately, though, the fact that we've chosen all of the splitting hyperplanes to be axis-aligned greatly simplifies this task. Below is an arbitrary line and two circles, one of which crosses the line and one of which does not:

Now, consider the distance from the centers of these circles to the line y = y_0. This is simply the absolute value of the difference between each circle's y coordinate and y_0, as seen here:

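[Figure: the line y = y_0 and two circles, one centered at (x_1, y_1) with radius r_1 and one centered at (x_2, y_2) with radius r_2; the distances |y_1 - y_0| and |y_2 - y_0| from each center to the line are marked.]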

Notice that the distance |y_1 - y_0| from the center of the blue circle to the line is greater than the radius of the circle, and so the circle does not cross the line. On the other hand, the distance from the center of the red circle to the line is less than the radius of the circle, and so some part of that circle does cross the line. This gives a general criterion for determining whether a candidate hypersphere crosses a particular splitting hyperplane. In particular, given a kd-tree node holding point (a_0, a_1, a_2, ..., a_k) and a hypersphere of radius r centered at (b_0, b_1, b_2, ..., b_k), if the node partitions points based on their ith component, then the hypersphere crosses the node's splitting plane only if |b_i - a_i| < r.

To recap:

• Given a guess about which node is the nearest neighbor, we can construct a candidate hypersphere centered at the test point and running through the guess point. The nearest neighbor to the test point must lie inside this hypersphere.

• If this hypersphere is fully to one side of a splitting hyperplane, then all points on the other side of the splitting hyperplane cannot be contained in the sphere and thus cannot be the nearest neighbor.

• To determine whether the candidate hypersphere crosses a splitting hyperplane that compares coordinate i, we check whether |b_i - a_i| < r.
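The crossing test in the last point comes down to a single comparison. The helper below is a hypothetical sketch (the name crossesSplittingPlane and the use of plain arrays for points are assumptions made for brevity):

```cpp
#include <cmath>
#include <cstddef>

// Returns true if a hypersphere of radius `r` centered at `center` crosses the
// axis-aligned splitting plane that passes through `nodePoint` and compares
// coordinate `axis`. Only that one coordinate matters.
bool crossesSplittingPlane(const double* center, const double* nodePoint,
                           std::size_t axis, double r) {
    return std::fabs(center[axis] - nodePoint[axis]) < r;
}
```

For a line y = 5 and circles of radius 2, a center at y = 9 is 4 units away and does not cross, while a center at y = 6 is 1 unit away and does.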

These observations, taken together, suggest the following algorithm for finding the nearest neighbor to a test point:

Let the test point be a = (a_0, a_1, ..., a_k).

Maintain a global best estimate of the nearest neighbor, called 'guess.'
Maintain a global value of the distance to that neighbor, called 'bestDist.'

Set 'guess' to NULL.
Set 'bestDist' to infinity.

Starting at the root, execute the following procedure:
    if curr == NULL
        return

    /* If the current location is better than the best known location,
     * update the best known location. */
    if distance(curr, a) < bestDist
        bestDist = distance(curr, a)
        guess = curr

    /* Recursively search the half of the tree that contains the test point. */
    if a_i < curr_i
        recursively search the left subtree on the next axis
    else
        recursively search the right subtree on the next axis

    /* If the candidate hypersphere crosses this splitting plane, look on the
     * other side of the plane by examining the other subtree. */
    if |curr_i - a_i| < bestDist
        recursively search the other subtree on the next axis

Intuitively, this procedure works by walking down to the leaf of the kd-tree as if we were searching the tree for the test point. As we start unwinding the recursion and walking back up the tree, we check whether each node is better than the best estimate we have so far. If so, we update our best estimate to be the current node. Finally, we check whether the candidate hypersphere based on our current guess could cross the splitting hyperplane of the current node. If it doesn't, then we can eliminate all points on the other side of the splitting hyperplane from consideration and walk back up to the next node in the tree. Otherwise, we must look in that side of the tree to see if there are any closer points.
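A direct C++ rendering of this procedure might look like the sketch below. It is one possible reading of the pseudocode, not the reference solution: the dimension is fixed at two to keep it short, and NNNode, NNState, euclideanDistance, nnSearch, and nearestNeighbor are all hypothetical names. The distance is measured from the current node to the test point.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

const std::size_t kDim = 2;  // assumption: two dimensions, for brevity

struct NNNode {
    double point[kDim];
    NNNode* left;
    NNNode* right;
};

// The running best estimate ('guess') and its distance ('bestDist').
struct NNState {
    const NNNode* guess;
    double bestDist;
};

double euclideanDistance(const double* a, const double* b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < kDim; ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(sum);
}

void nnSearch(const NNNode* curr, const double* test, std::size_t level,
              NNState& state) {
    if (curr == NULL) return;

    // Update the best estimate if this node is closer to the test point.
    double d = euclideanDistance(curr->point, test);
    if (d < state.bestDist) {
        state.bestDist = d;
        state.guess = curr;
    }

    // Search the half of the tree that contains the test point first.
    std::size_t axis = level % kDim;
    bool goLeft = test[axis] < curr->point[axis];
    const NNNode* nearSide = goLeft ? curr->left : curr->right;
    const NNNode* farSide = goLeft ? curr->right : curr->left;
    nnSearch(nearSide, test, level + 1, state);

    // Only look across the splitting plane if the candidate hypersphere
    // crosses it.
    if (std::fabs(curr->point[axis] - test[axis]) < state.bestDist)
        nnSearch(farSide, test, level + 1, state);
}

// Returns the node nearest to `test`, or NULL for an empty tree.
const NNNode* nearestNeighbor(const NNNode* root, const double* test) {
    NNState state = {NULL, std::numeric_limits<double>::infinity()};
    nnSearch(root, test, 0, state);
    return state.guess;
}
```

Passing the state by reference plays the role of the "global" variables in the pseudocode without actually using globals.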

This algorithm can be shown to run in O(lg n) time on a balanced kd-tree with n data points provided that those points are randomly distributed. In the worst case, though, the entire tree might have to be searched. However, in low-dimensional spaces, such as the Cartesian plane or three-dimensional space, this is rarely the case.


k-Nearest Neighbor Searches and Bounded Priority Queues

In this discussion, we've only considered the problem of finding the single nearest neighbor to a test point. A more interesting question is, given a test point and some number k, to find the k nearest neighbors of that point. This search is often referred to as a k-NN search. It turns out that the previous algorithm can easily be adapted to do a k-NN search instead of a 1-NN search. The algorithm is almost identical, except that instead of maintaining just the best point, we maintain a list of the k best points we've seen so far.

Before describing the algorithm, we'll introduce a special data structure called a bounded priority queue (or BPQ for short). A bounded priority queue is similar to a regular priority queue, except that there is a fixed upper bound on the number of elements that can be stored in the BPQ. Whenever a new element is added to the queue, if the queue is at capacity, the element with the highest priority value is ejected from the queue. For example, suppose that we have a BPQ with maximum size five that holds the following elements:

Value      A     B     C      D     E
Priority   0.1   0.2   1.33   3.2   4.6

Suppose that we want to insert the element F with priority 0.4 into this bounded priority queue. Because this BPQ has maximum size five, this will insert the element F, but then evict E, the element with the highest priority value, yielding the following BPQ:

Value      A     B     F     C      D
Priority   0.1   0.2   0.4   1.33   3.2
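The eviction rule just described can be sketched with a sorted container. The class below is a minimal, hypothetical BoundedPQ built on std::multimap; it is not the class provided with the assignment, and its member names (enqueue, full, maxPriority) are assumptions chosen for this example.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// A minimal bounded priority queue. std::multimap keeps entries sorted by
// priority, so the entry with the largest priority value is always last.
template <typename T>
class BoundedPQ {
public:
    explicit BoundedPQ(std::size_t maxSize) : maxSize_(maxSize) {}

    // Adds the element, then evicts the entry with the largest priority
    // value if the queue has grown past its bound.
    void enqueue(const T& value, double priority) {
        elems_.insert(std::make_pair(priority, value));
        if (elems_.size() > maxSize_) {
            typename std::multimap<double, T>::iterator last = elems_.end();
            --last;
            elems_.erase(last);
        }
    }

    std::size_t size() const { return elems_.size(); }
    bool full() const { return elems_.size() >= maxSize_; }

    // Largest priority value currently stored. Precondition: size() > 0.
    double maxPriority() const { return elems_.rbegin()->first; }

private:
    std::size_t maxSize_;
    std::multimap<double, T> elems_;
};
```

With a bound of five and the priorities from the table above, enqueuing F at 0.4 leaves the largest stored priority at 3.2.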

Now suppose that we wish to insert the element G with priority 4.0 into this BPQ. Because G's priority value is greater than that of the maximum-priority element in the BPQ, upon inserting G it will immediately be evicted. In other words, inserting an element into a full BPQ with priority greater than the maximum-priority element of the BPQ has no effect. Given access to a BPQ, we can perform a k-NN search in a kd-tree as follows:


Let the test point be P = (y_0, y_1, ..., y_k).

Maintain a BPQ of the candidate nearest neighbors, called 'bpq'
Set the maximum size of 'bpq' to k

Starting at the root, execute the following procedure:
    if curr == NULL
        return

    /* Add the current point to the BPQ. Note that this is a no-op if the
     * point is not as good as the points we've seen so far. */
    enqueue curr into bpq with priority distance(curr, P)

    /* Recursively search the half of the tree that contains the test point. */
    if y_i < curr_i
        recursively search the left subtree on the next axis
    else
        recursively search the right subtree on the next axis

    /* If the candidate hypersphere crosses this splitting plane, look on the
     * other side of the plane by examining the other subtree. */
    if:
        bpq isn't full
        -or-
        |curr_i - y_i| is less than the priority of the max-priority elem of bpq
    then
        recursively search the other subtree on the next axis

There are two minor changes to this algorithm that differentiate it from the initial 1-NN search algorithm. First, when determining whether to look on the opposite side of the splitting plane, we use as the radius of the candidate hypersphere the distance from the test point to the maximum-priority point in the BPQ. The rationale behind this is that when finding the k nearest neighbors, our candidate hypersphere for the k nearest points needs to encompass all k of those neighbors, not just the closest. The other main change is that when we consider whether to look on the opposite side of the splitting plane, our decision takes into account whether the BPQ contains at least k points. This is extremely important! If we prune out parts of the tree before we have made at least k guesses, we might accidentally throw out one of the closest points. Consider the following setup:

Suppose that we wish to perform a 2-NN lookup for the test point indicated by the star. We recursively check the left subtree of the splitting plane, and find the point indicated in blue as a candidate nearest neighbor. Since we haven't found two nearest neighbors yet, we still need to look on the other side of the splitting plane for more neighbors, even though the candidate hypersphere does not cross the splitting hyperplane.
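The k-NN procedure, including the "BPQ isn't full" condition, can be sketched as follows. This is an illustrative sketch rather than the expected solution: it fixes the dimension at two, uses a std::multimap keyed by distance in place of a dedicated BPQ class, and the names KNNNode, Candidates, pointDistance, and knnSearch are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>

const std::size_t kAxes = 2;  // assumption: two dimensions, for brevity

struct KNNNode {
    double point[kAxes];
    KNNNode* left;
    KNNNode* right;
};

// Stand-in for the BPQ: distance -> node, sorted by increasing distance.
typedef std::multimap<double, const KNNNode*> Candidates;

double pointDistance(const double* a, const double* b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < kAxes; ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(sum);
}

void knnSearch(const KNNNode* curr, const double* test, std::size_t level,
               std::size_t k, Candidates& bpq) {
    if (curr == NULL || k == 0) return;

    // Enqueue the current node, then evict the farthest if over capacity.
    bpq.insert(std::make_pair(pointDistance(curr->point, test), curr));
    if (bpq.size() > k) {
        Candidates::iterator last = bpq.end();
        --last;
        bpq.erase(last);
    }

    // Search the half of the tree that contains the test point first.
    std::size_t axis = level % kAxes;
    bool goLeft = test[axis] < curr->point[axis];
    const KNNNode* nearSide = goLeft ? curr->left : curr->right;
    const KNNNode* farSide = goLeft ? curr->right : curr->left;
    knnSearch(nearSide, test, level + 1, k, bpq);

    // Cross the plane if we still need candidates, or if the hypersphere
    // through the worst candidate crosses the splitting plane.
    if (bpq.size() < k ||
        std::fabs(curr->point[axis] - test[axis]) < bpq.rbegin()->first)
        knnSearch(farSide, test, level + 1, k, bpq);
}
```

After the call, bpq holds up to k nodes ordered from nearest to farthest.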


    The Assignment

Your assignment is to implement a class representing a kd-tree, which we'll call KDTree, that allows clients to build kd-trees, query kd-trees for membership, and execute k-NN lookups on them. In the course of doing so, you'll gain experience with class implementation, const-correctness, templates, copy functions, operator overloading, and exception-handling. Additionally, you'll get to experience firsthand the power of k-NN lookups by seeing applications that build off of your KDTree class.

The amount of code that you actually need to write is not too great (on the order of two hundred lines), though it will require you to have a solid understanding of the language features we've explored over the past weeks.

To make it easier to complete the assignment, I've broken the program down into a series of five smaller steps. I advise completing the assignment in this order, but you're free to implement KDTree as you see fit.

Step Zero: Set up the Project

The starter code is pretty self-explanatory. The most important code for our purposes is found in KDTree.h and Point.h.

Step One: Implement Basic Functionality

Now that you've gotten the project set up, it's time to start implementing KDTree. The KDTree implementation you'll be writing is actually a slight variant on the kd-tree structure described earlier in this handout that associates auxiliary data with each point. In a sense, your KDTree will act like a fancy map from points in space to values. For example, you could use a KDTree to map from longitude/latitude pairs to cities, from biometric data to disease prognosis, or from images to labels on those images.

Below is a partial specification of the KDTree class, highlighting the functions you'll need to write to get basic functionality working.

Basic (incomplete) KDTree interface

    template <size_t N, typename ElemType> class KDTree {
    public:
        KDTree();
        ~KDTree();

        size_t dimension() const;
        size_t size() const;
        bool empty() const;

        void insert(const Point<N>& pt, const ElemType& value);

        bool contains(const Point<N>& pt) const;
        ElemType& operator[] (const Point<N>& pt);
        ElemType& at(const Point<N>& pt);
        const ElemType& at(const Point<N>& pt) const;
    };


You may have noticed that KDTree has an unusual template signature:

    template <size_t N, typename ElemType> class KDTree

You did not misread that: the KDTree implementation is parameterized over a size_t as well as a type. We have not discussed integer template arguments before, but they behave just like regular type template arguments. If you want to create a KDTree that maps from points in three-dimensional space to strings, you could declare it as

    KDTree<3, string> myKDTree;
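To see an integer template argument at work in isolation, here is a small sketch of a fixed-size point class parameterized over a size_t. FixedPoint is a hypothetical class written for this example; it is not the Point class from the starter code, whose details may differ.

```cpp
#include <cstddef>

// The dimension N is part of the type, so the array size is known at
// compile time.
template <std::size_t N>
class FixedPoint {
public:
    // All coordinates start at zero.
    FixedPoint() {
        for (std::size_t i = 0; i < N; ++i) coords_[i] = 0.0;
    }

    std::size_t size() const { return N; }

    double& operator[](std::size_t i) { return coords_[i]; }
    const double& operator[](std::size_t i) const { return coords_[i]; }

private:
    double coords_[N];
};
```

Because N is part of the type, FixedPoint<2> and FixedPoint<3> are unrelated types, and the compiler rejects any attempt to assign one to the other.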

The keys in the KDTree are objects of type Point<N>, where N is the dimension of the KDTree. That is, a KDTree<3, string> uses Point<3>s as keys, a KDTree<2, cityT> would use Point<2>s as keys, etc. I've provided a fully-working implementation of Point in the starter code; it behaves like a fixed-size STL vector<double>. For example:

    Point<3> pt;
    pt[0] = 137.0;
    pt[1] = 42.0;
    pt[2] = 2.71


ElemType& operator[] (const Point<N>& pt);

    Returns a reference to the value associated with the point pt. If the point does not exist in the KDTree, it is added with the default value of ElemType as its value, and a reference to this new value is returned. This is the same behavior as the STL map's operator[].

    Note that this function does not have a const overload because the function may mutate the tree.

ElemType& at(const Point<N>& pt);
const ElemType& at(const Point<N>& pt) const;

    Returns a reference to the value associated with the point pt, if it exists. If the point is not in the tree, then this function throws an out_of_range exception.

    This function is const-overloaded, since it does not change the tree.

Notice that the last four functions (contains, operator[], and the two versions of at) all do some search of the KDTree looking for a particular value, differing only in their behavior when the point is not contained in the tree. contains returns false, operator[] adds a new element, and at throws an out_of_range exception. Rather than writing the code to traverse the tree four times and customizing the behavior when an element isn't found, I strongly suggest writing a helper function that searches the tree for a particular point, then returns a pointer to the node containing it. You can then implement these functions on top of this common subroutine. As an example, here's a simple implementation of contains that assumes the existence of a helper function findNode:

    template <size_t N, typename ElemType>
    bool KDTree<N, ElemType>::contains(const Point<N>& pt) const {
        return findNode(pt) != NULL;
    }
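In the same spirit, the sketch below shows how contains, at, and operator[] might all sit on top of one findNode helper. It is a self-contained illustration under stated assumptions, not the required design: MiniKDTree and SimplePoint are hypothetical stand-ins for KDTree and Point, only the const version of at is shown, and copying is simply disabled.

```cpp
#include <cstddef>
#include <stdexcept>

template <std::size_t N>
struct SimplePoint {
    double c[N];
};

template <std::size_t N, typename ElemType>
class MiniKDTree {
public:
    MiniKDTree() : root_(NULL) {}
    ~MiniKDTree() { clear(root_); }

    bool contains(const SimplePoint<N>& pt) const { return findNode(pt) != NULL; }
    ElemType& operator[](const SimplePoint<N>& pt);
    const ElemType& at(const SimplePoint<N>& pt) const;

private:
    struct Node {
        SimplePoint<N> key;
        ElemType value;
        Node* left;
        Node* right;
    };
    Node* root_;

    // Shared search routine: the node holding `pt`, or NULL if absent.
    Node* findNode(const SimplePoint<N>& pt) const;

    static void clear(Node* n) {
        if (n == NULL) return;
        clear(n->left);
        clear(n->right);
        delete n;
    }

    MiniKDTree(const MiniKDTree&);             // copying disabled in this sketch
    MiniKDTree& operator=(const MiniKDTree&);
};

// The return type needs `typename` because Node is a dependent nested type.
template <std::size_t N, typename ElemType>
typename MiniKDTree<N, ElemType>::Node*
MiniKDTree<N, ElemType>::findNode(const SimplePoint<N>& pt) const {
    Node* curr = root_;
    for (std::size_t level = 0; curr != NULL; ++level) {
        bool match = true;
        for (std::size_t i = 0; i < N; ++i)
            if (curr->key.c[i] != pt.c[i]) match = false;
        if (match) return curr;
        std::size_t axis = level % N;
        curr = (pt.c[axis] < curr->key.c[axis]) ? curr->left : curr->right;
    }
    return NULL;
}

// Missing point: throw.
template <std::size_t N, typename ElemType>
const ElemType& MiniKDTree<N, ElemType>::at(const SimplePoint<N>& pt) const {
    const Node* node = findNode(pt);
    if (node == NULL) throw std::out_of_range("point not in tree");
    return node->value;
}

// Missing point: insert it with a default-constructed value.
template <std::size_t N, typename ElemType>
ElemType& MiniKDTree<N, ElemType>::operator[](const SimplePoint<N>& pt) {
    if (Node* found = findNode(pt)) return found->value;
    Node** slot = &root_;
    for (std::size_t level = 0; *slot != NULL; ++level) {
        std::size_t axis = level % N;
        slot = (pt.c[axis] < (*slot)->key.c[axis]) ? &(*slot)->left
                                                   : &(*slot)->right;
    }
    *slot = new Node();
    (*slot)->key = pt;
    (*slot)->value = ElemType();
    return (*slot)->value;
}
```

The three public functions differ only in what they do when findNode returns NULL, which is exactly the point of factoring out the helper.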

To check whether you have your code working, you can run the first set of tests from the project test harness. If these report any errors, be sure to correct them before moving on. You may also want to add tests of your own.

Step Two: Implement Nearest-Neighbor Lookup

Now that you have the basic functionality ready, it's time to implement k-NN searches. Your next task is to implement the kNNValue function, which looks like this:

Extended (still incomplete) KDTree interface

    template <size_t N, typename ElemType> class KDTree {
    public:
        KDTree();
        ~KDTree();

        size_t dimension() const;
        size_t size() const;
        bool empty() const;

        void insert(const Point<N>& pt, const ElemType& value);

        bool contains(const Point<N>& pt) const;
        ElemType& operator[] (const Point<N>& pt);
        ElemType& at(const Point<N>& pt);
        const ElemType& at(const Point<N>& pt) const;

        ElemType kNN


functions, which will cause crashes. To prevent this, you will need to implement a copy constructor and assignment operator for the KDTree class. This results in the final interface of the KDTree class:

Complete KDTree interface

    template <size_t N, typename ElemType> class KDTree {
    public:
        KDTree();
        ~KDTree();

        KDTree(const KDTree& other);
        KDTree& operator= (const KDTree& other);

        size_t dimension() const;
        size_t size() const;
        bool empty() const;

        void insert(const Point<N>& pt, const ElemType& value);

        bool contains(const Point<N>& pt) const;
        ElemType& operator[] (const Point<N>& pt);
        ElemType& at(const Point<N>& pt);
        const ElemType& at(const Point<N>& pt) const;

        ElemType kNNValue(const Point<N>& pt) const;
    };

You are free to implement these functions as you see fit, but I strongly encourage you to read over Chapter 11 in the course reader before doing so. It is surprisingly easy to get these functions wrong, and you will want to ensure that you understand what to watch out for before you start coding them up.

The testing harness contains two tests that exercise the copy functions, one checking the basic functionality and one exclusively checking edge cases. Make sure that your implementation passes the tests before moving on.
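As a rough guide to what the copy functions involve, here is a sketch of a deep-copying constructor and assignment operator on a deliberately simple tree of ints. IntTree and its helpers copyTree and freeTree are hypothetical names for this example; your KDTree will store points and values instead, and the course reader's recommended pattern may differ in detail.

```cpp
#include <cstddef>

class IntTree {
public:
    IntTree() : root_(NULL) {}

    // Copy constructor: build an independent duplicate of every node.
    IntTree(const IntTree& other) : root_(copyTree(other.root_)) {}

    // Assignment: guard against self-assignment, copy the source first,
    // then release the old nodes.
    IntTree& operator=(const IntTree& other) {
        if (this != &other) {
            Node* fresh = copyTree(other.root_);
            freeTree(root_);
            root_ = fresh;
        }
        return *this;
    }

    ~IntTree() { freeTree(root_); }

    void insert(int value) {
        Node** slot = &root_;
        while (*slot != NULL)
            slot = (value < (*slot)->value) ? &(*slot)->left : &(*slot)->right;
        *slot = new Node(value);
    }

    bool contains(int value) const {
        const Node* curr = root_;
        while (curr != NULL && curr->value != value)
            curr = (value < curr->value) ? curr->left : curr->right;
        return curr != NULL;
    }

private:
    struct Node {
        int value;
        Node* left;
        Node* right;
        explicit Node(int v) : value(v), left(NULL), right(NULL) {}
    };
    Node* root_;

    // Recursively duplicates the subtree rooted at `n`.
    static Node* copyTree(const Node* n) {
        if (n == NULL) return NULL;
        Node* copy = new Node(n->value);
        copy->left = copyTree(n->left);
        copy->right = copyTree(n->right);
        return copy;
    }

    // Recursively deletes the subtree rooted at `n`.
    static void freeTree(Node* n) {
        if (n == NULL) return;
        freeTree(n->left);
        freeTree(n->right);
        delete n;
    }
};
```

Without these two functions the compiler-generated copies would share node pointers, and both destructors would try to delete the same nodes.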

Advice, Tips, and Tricks

Here are a few specific pointers that might make your life a lot easier as you go through this assignment:

• Don't hesitate to ask questions! This assignment uses many of the C++ techniques we've seen over the past few weeks. If you're having trouble getting your code to compile, or can't remember what keyword you're supposed to be using somewhere, email me and I can try to point you in the right direction.

• This assignment is not as hard as it may seem. This handout is fairly dense, but the actual amount of code you need to write is not that great. You are only responsible for implementing a few functions, some of which can be implemented in a single line of code. If you take the time to think through how all the functions are related to one another, you can save yourself much coding effort by implementing the functions in terms of each other.


• Watch out for typename weirdness when implementing functions. Your implementation of KDTree will require the use of a nested type to represent nodes in the tree. If you write any private helper functions that return objects of this type, you will need to use the typename keyword when implementing those functions. For example, suppose that you define a helper struct called Node and then define a function that returns a Node*, as shown here:

    private:
        struct Node { /* ... */ };
        Node* findNode(const Point<N>& pt);

  The implementation of this function would then have this signature:

    template <size_t N, typename ElemType>
    typename KDTree<N, ElemType>::Node* KDTree<N, ElemType>::findNode(const Point<N>& pt);

  That's a real mouthful, and unfortunately it's the only way to communicate to the compiler what you're trying to implement. Make sure you understand the use of typename, along with why the template arguments are duplicated in two places.

• Be careful about const-correctness. If you create any private member functions to assist in the implementations of the KDTree public interface, make sure those member functions are marked const where appropriate. In particular, contains, at, and kNNValue are const, so if they call any member functions, those functions must be marked const as well. You will get some fairly ferocious compiler errors if you try calling a non-const member function from a const member function, so be wary.

• Use fabs instead of abs. The <cmath> header file exports two similar-sounding functions to compute absolute value, abs and fabs. In this assignment, you should not use the abs function. The integer version of abs works on integral values, so if that is the version the compiler picks when you pass in a double, the argument is converted to an int and its fractional part is lost. fabs is designed to work on floats and doubles, and is a much more appropriate function.

• Remember the const_cast/static_cast trick. The KDTree contains two functions named at that differ only in their constness. Rather than writing two copies of the same code, you can use the const_cast/static_cast trick to implement the non-const version in terms of the const version. Look over the lecture code for the Vector class for more details.
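For reference, the const_cast/static_cast trick looks like this on a small stand-alone class. Grades is a hypothetical class invented for this example; the lecture's Vector code may phrase it differently.

```cpp
#include <cstddef>
#include <stdexcept>

class Grades {
public:
    Grades() {
        for (std::size_t i = 0; i < kCount; ++i) scores_[i] = 0;
    }

    // The const version holds the real logic, including the bounds check.
    const int& at(std::size_t i) const {
        if (i >= kCount) throw std::out_of_range("index out of range");
        return scores_[i];
    }

    // The non-const version adds const to *this so that the const overload is
    // called, then strips const from the returned reference. This is safe
    // because the object is known to be non-const here.
    int& at(std::size_t i) {
        return const_cast<int&>(static_cast<const Grades&>(*this).at(i));
    }

private:
    static const std::size_t kCount = 3;
    int scores_[kCount];
};
```

The static_cast is what prevents the non-const at from calling itself recursively.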

Extensions

If you're interested in sharpening your C++ skills, want to do more advanced operations on the kd-tree, or feel like spending a lazy Sunday coding away furiously, why not add some extensions to your KDTree? Below is a list of possible extension ideas, some of which are straightforward, while others will require significant time and effort. If you end up completing any of these, let me know and I'd be glad to look over what you've written!

• Build the kd-tree more intelligently. Traditionally, kd-trees are not built one element at a time, but rather from a complete data set all at once. To ensure that the tree is balanced, the elements are sorted by their first component, the median is used as the root of the tree, and the remaining elements are then recursively subdivided into children of the root node. Implement a new constructor for the KDTree class to build up the tree in this fashion.

• Add support for other distance metrics. When doing nearest-neighbor lookup, we use Euclidean distance as a measure of "closeness" between two points and try to find a point in the kd-tree with the least Euclidean distance to the test point. However, it's possible to use all sorts of other distance metrics, such as Manhattan distance or the maximum norm. Add support to KDTree to try out these new distance metrics. How does the behavior of the sample applications change?

• Choose axes more intelligently. The current kd-tree implementation cycles through which axis it splits on with each level of the tree. A more clever idea would be to split along the longest axis of the data set with the goal of spreading the points out more evenly. Update the KDTree class to use this functionality.

• Add support for range searches. One common operation on kd-trees is a range search, where the input is a rectangle in space and the output is the set of points in the kd-tree contained in that rectangle. This gives a much better algorithm for the CityFinder program than the one we wrote earlier in the quarter. Research how to implement this function, then add it to KDTree.

• Add support for element removal. The KDTree you've written can have new elements added, but cannot remove existing elements. Develop an algorithm to remove arbitrary points from a kd-tree, then update your KDTree interface to support this.

• Be creative! Think of any clever uses for a kd-tree? How about something you could do to make the kd-tree more efficient? If you have any ideas you'd like to try out, by all means go for it and I'd love to see what you come up with.

Deliverables

Submit KDTree.h and any other files you've added on paperless. If you've added any extensions or special features I should be aware of, shoot me an email; I'm always curious to see what you've done. I would also appreciate it if you offered some feedback on this assignment: was it interesting? Too easy? Too hard? Just right? Finally, pat yourself on the back: you've just completed the last assignment of CS106L and are now a veteran C++ programmer. Congratulations!
