
Searching Images by Color: Presented by Chris Becker, Shutterstock



Searching Images by Color
Chris Becker
Search Engineering @ Shutterstock


What is Shutterstock?

•  Shutterstock sells stock images, videos & music.

•  Crowdsourced from artists around the world

•  Shutterstock reviews and indexes them for search

•  Customers buy a subscription and download them


Why search by color?


Stock photography on the internet…



Color is one of several visual attributes that you can use to create an engaging image search experience


Shutterstock Labs: www.shutterstock.com/labs

•  Spectrum

•  Palette


Diving into Color Data


Color Spaces

•  RGB

•  HSL

•  LCH

•  Lab


Calculating Distances Between Colors

•  Euclidean distance works reasonably well in any color space

distRGB = sqrt((r1-r2)^2 + (g1-g2)^2 + (b1-b2)^2)

distHSL = sqrt((h1-h2)^2 + (s1-s2)^2 + (l1-l2)^2)

distLCH = sqrt((L1-L2)^2 + (C1-C2)^2 + (H1-H2)^2)

•  More sophisticated equations that better account for human perception can be found at http://en.wikipedia.org/wiki/Color_difference
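A minimal Python sketch of the Euclidean distance above, assuming each color is a 3-tuple of numbers in the same color space. The function name `color_distance` is illustrative and not from the talk.

```python
import math

def color_distance(c1, c2):
    """Euclidean distance between two colors given as 3-component tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

red = (255, 0, 0)
orange = (255, 153, 0)
blue = (0, 0, 255)

# orange should come out closer to red than blue does
print(color_distance(red, orange) < color_distance(red, blue))
```

In HSL and LCH the hue component is an angle, so a fuller version would probably measure the hue difference around the circle; this sketch treats all three components as plain numbers, as the slide formulas do.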


Images are just numbers

[
[[054,087,058], [054,116,206], [017,226,194], [234,203,215], [188,205,000], [229,156,182]],
[[214,238,109], [064,190,104], [191,024,161], [104,071,036], [222,081,005], [204,012,113]],
[[197,100,189], [159,204,024], [228,214,054], [250,098,125], [050,144,093], [021,122,101]],
[[255,146,010], [115,156,002], [174,023,137], [161,141,077], [154,189,005], [242,170,074]],
[[113,146,064], [196,057,200], [123,203,160], [066,090,234], [200,186,103], [099,074,037]],
[[194,022,018], [226,045,008], [123,023,087], [171,029,021], [040,001,143], [255,083,194]],
[[115,186,246], [025,064,109], [029,071,001], [140,031,002], [248,170,244], [134,112,252]],
[[116,179,059], [217,205,159], [157,060,251], [151,205,058], [036,214,075], [107,103,130]],
[[052,003,227], [184,037,078], [161,155,181], [051,070,186], [082,235,108], [129,233,211]],
[[047,212,209], [250,236,085], [038,128,148], [115,171,113], [186,092,227], [198,130,024]],
[[225,210,064], [123,049,199], [173,207,164], [161,069,220], [002,228,184], [170,248,075]],
[[234,157,201], [168,027,113], [117,080,236], [168,131,247], [028,177,060], [187,147,084]],
[[184,166,096], [107,117,037], [154,208,093], [237,090,188], [007,076,086], [224,239,210]],
[[105,230,058], [002,122,240], [036,151,107], [101,023,149], [048,010,225], [109,102,195]],
[[050,019,169], [219,235,027], [061,064,133], [218,221,113], [009,032,125], [109,151,137]],
[[010,037,189], [216,010,101], [000,037,084], [166,225,127], [203,067,214], [110,020,245]],
[[180,147,130], [045,251,177], [127,175,215], [237,161,084], [208,027,218], [244,194,034]],
[[089,235,226], [106,219,220], [010,040,006], [094,138,058], [148,081,166], [249,216,177]],
[[121,110,034], [007,232,255], [214,052,035], [086,100,020], [191,064,105], [129,254,207]],
]


Any operation you can do on a set of numbers, you can do on an image:

•  getting histograms

•  computing median values

•  standard deviations / variance

•  other statistics
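To make this concrete, here is a small sketch using only Python's standard library. The six pixels are the first row of the array shown earlier; the variable names are illustrative.

```python
import statistics
from collections import Counter

# (R, G, B) pixels; a real image would have many more
pixels = [(54, 87, 58), (54, 116, 206), (17, 226, 194),
          (234, 203, 215), (188, 205, 0), (229, 156, 182)]

# pull out a single channel as a plain list of numbers
red = [p[0] for p in pixels]

histogram = Counter(red)               # channel value -> number of pixels
median = statistics.median(red)        # middle value of the channel
stdev = statistics.pstdev(red)         # population standard deviation
variance = statistics.pvariance(red)   # population variance

print(histogram[54], median)
```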


Extracting Color Data


Tools & Libraries

•  ImageMagick

•  Python Imaging Library (PIL)

•  ImageJ


Code Example

#!/usr/bin/env perl
use strict;
use warnings;
use Image::Magick;

my $image = Image::Magick->new;
$image->Read('SamplePhoto.jpg');

# reduce the image to a palette of at most 64 colors
$image->Quantize(colorspace => 'RGB', colors => 64);

# Histogram() returns a flat list: R, G, B, opacity, count for each color
my @histogram = $image->Histogram();
my %colors;

while ( my ($R, $G, $B, $opacity, $count) = splice(@histogram, 0, 5) ) {

    # convert r,g,b to a hex color value
    # (dividing by 256 assumes 16-bit channel values)
    my $hex = sprintf("%02x%02x%02x",
        $R / 256,
        $G / 256,
        $B / 256
    );

    $colors{$hex} += $count;
}
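For readers without ImageMagick, the same idea can be sketched in plain Python. This is not the Quantize call from the Perl example: it assumes 8-bit RGB pixels and swaps in a simple uniform quantization, where each channel is rounded down to one of a fixed number of levels. `quantize` and `color_histogram` are hypothetical helper names.

```python
from collections import Counter

def quantize(value, levels=4):
    """Snap an 8-bit channel value to the lower edge of one of `levels` equal buckets."""
    step = 256 // levels
    return (value // step) * step

def color_histogram(pixels, levels=4):
    """Count pixels per quantized color, keyed by hex string."""
    counts = Counter()
    for r, g, b in pixels:
        key = "%02x%02x%02x" % (
            quantize(r, levels), quantize(g, levels), quantize(b, levels))
        counts[key] += 1
    return counts

# 4 levels per channel gives 4 * 4 * 4 = 64 possible colors
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 200, 30)]
hist = color_histogram(pixels)
print(hist)
```

The two reddish pixels land in the same bucket, which is the point of quantizing before counting.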


Indexing & Searching in Solr


Indexing color histograms

color_txt = "cfebc2 cfebc2 cfebc2 cfebc2 cfebc2 2e6b2e 2e6b2e 2e6b2e ff0000 …"

•  index colors just like you would index text

•  volume of color == frequency of the term
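One way to produce such a field from a color histogram, sketched in Python. Scaling to a fixed number of terms (`total_terms`) is an assumption made here to keep the field short; the talk does not say how pixel counts were mapped to repetitions.

```python
def histogram_to_terms(histogram, total_terms=100):
    """Repeat each hex color in proportion to its share of the pixels."""
    total_pixels = sum(histogram.values())
    terms = []
    # most common colors first, purely for readability of the field
    for hex_color, count in sorted(histogram.items(), key=lambda kv: -kv[1]):
        repeats = round(total_terms * count / total_pixels)
        terms.extend([hex_color] * repeats)
    return " ".join(terms)

color_txt = histogram_to_terms(
    {"cfebc2": 500, "2e6b2e": 300, "ff0000": 200}, total_terms=10)
print(color_txt)
```

Colors whose share rounds down to zero repetitions disappear from the field, which may or may not be what you want for very small accents.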


Solr Fields & Queries

•  Easy to query

•  Can use Solr's default ranking effectively:

/solr/select?q=ff0000 e2c2d2&qf=color&defType=edismax…

•  or access term frequencies directly to create specific sort functions:

sort=product(tf(color,"ff0000"),tf(color,"e2c2d2")) desc

<field name="color" type="text_ws" …>
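A sketch of building both request styles from Python with the standard library. The parameter values mirror the slide; the helper names are hypothetical, and nothing here sends a request.

```python
from urllib.parse import urlencode

def ranked_query(colors, field="color"):
    """Query string that lets Solr's default ranking score the color terms."""
    return urlencode({"q": " ".join(colors), "qf": field, "defType": "edismax"})

def tf_product_sort(colors, field="color"):
    """Sort clause multiplying the term frequencies of each requested color."""
    tfs = ",".join('tf(%s,"%s")' % (field, c) for c in colors)
    return "product(%s) desc" % tfs

print("/solr/select?" + ranked_query(["ff0000", "e2c2d2"]))
print(tf_product_sort(["ff0000", "e2c2d2"]))
```

`urlencode` encodes the space between the two colors as `+`, so the query string is safe to append to a URL as-is.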


Indexing color statistics

Represent aggregate statistics of each image

lightness:
    median: 2
    standard dev: 1
    largest bin: 0
    largest bin size: 50

saturation:
    median: 0
    standard dev: 0
    largest bin: 0
    largest bin size: 100

…
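A sketch of how such per-image statistics might be computed with Python's `colorsys` and `statistics` modules. The 0-100 scale, the number of bins, and the dictionary keys are assumptions for illustration; the talk does not specify them.

```python
import colorsys
import statistics
from collections import Counter

def channel_stats(values, bins=10, scale=100):
    """Median, standard deviation and largest histogram bin for values in [0, 1]."""
    scaled = [v * scale for v in values]
    # bin index for each value; a value of exactly 1.0 falls into the last bin
    binned = Counter(min(int(v * bins), bins - 1) for v in values)
    largest_bin, size = binned.most_common(1)[0]
    return {
        "median": round(statistics.median(scaled)),
        "stdev": round(statistics.pstdev(scaled)),
        "largest_bin": largest_bin,
        "largest_bin_size": round(100 * size / len(values)),  # percent of pixels
    }

pixels = [(0, 0, 0), (0, 0, 0), (0, 0, 0), (255, 255, 255)]
# colorsys works on floats in [0, 1] and returns (hue, lightness, saturation)
hls = [colorsys.rgb_to_hls(r / 255, g / 255, b / 255) for r, g, b in pixels]

lightness = channel_stats([l for _, l, _ in hls])
saturation = channel_stats([s for _, _, s in hls])
print(lightness, saturation)
```

Each resulting number could then be indexed as its own numeric field per image.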


Solr Fields & Queries

•  Sort by the distance between input param and median value

/solr/select?q=*&sort=abs(sub($query,hue_median)) asc

<field name="hue_median" type="int" …>
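The effect of that sort can be emulated in a few lines of Python. The documents and their `id` values are made up for illustration, and storing hue in degrees (0 to 359) is an assumption; the slide only says the field is an int.

```python
def hue_sort(docs, query_hue):
    """Order documents the way abs(sub($query,hue_median)) asc would."""
    return sorted(docs, key=lambda d: abs(query_hue - d["hue_median"]))

def circular_hue_distance(a, b, period=360):
    """Distance between two hues measured around the color wheel."""
    diff = abs(a - b) % period
    return min(diff, period - diff)

docs = [
    {"id": "a", "hue_median": 200},
    {"id": "b", "hue_median": 35},
    {"id": "c", "hue_median": 350},
]
print([d["id"] for d in hue_sort(docs, 30)])
print(circular_hue_distance(350, 10))
```

With a plain absolute difference, a hue of 350 ranks as far from 30 even though both are reddish. `circular_hue_distance` shows one possible refinement if hue is stored as an angle; it is not part of the query on the slide.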


Ranking & Relevance


How much of the image has the color?


Is this image relevant if I search for a given color?


Which image is more relevant if I search for a given color?



How do we account for these factors?


How much of the image contains the selected color?

•  Score each color by number/percentage of pixels

sort=tf(color,"ff9900") desc


Color Accuracy

•  As you reduce your color space, you also reduce precision

•  Reducing the color space too much increases recall and lowers precision.

•  Not reducing it enough lowers recall and raises precision.

•  Reducing your color space down to ~100 to ~300 colors works well
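As a rough worked example, assume the reduction is a uniform quantization of RGB with the same number of levels on each channel (the talk does not say which reduction was used). The palette size is then the cube of the levels per channel:

```python
def palette_size(levels_per_channel):
    """Number of distinct colors when each RGB channel keeps this many levels."""
    return levels_per_channel ** 3

for levels in (4, 5, 6, 7):
    print(levels, palette_size(levels))
```

Under that assumption, 5 or 6 levels per channel (125 or 216 colors) fall inside the ~100 to ~300 range, while 4 levels (64 colors) is coarser and 7 levels (343 colors) is finer.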


Weighting Multiple Colors Equally

•  If you search for 2 or more colors, the top result should have the most even distribution of those colors

•  simple option:

sort=product(tf(color,"ff9900"),tf(color,"2280e2")) desc

•  more complex: compute the stdev or variance of the matching color values in your Solr sort function, and sort the results with the lowest variance first.
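The two options can be compared on toy numbers in Python. The term frequencies below are invented for illustration, and the function names are hypothetical.

```python
import math
import statistics

def product_score(tfs):
    """Simple option: multiply the term frequencies of the requested colors."""
    return math.prod(tfs)

def evenness(tfs):
    """More complex option: variance of the frequencies; lower means more even."""
    return statistics.pvariance(tfs)

even = [50, 50]     # both requested colors cover the image equally
skewed = [90, 10]   # one color dominates

print(product_score(even), product_score(skewed))
print(evenness(even), evenness(skewed))
```

Both measures prefer the even image here. One difference worth keeping in mind: variance alone would also rank an image with tiny but equal amounts of both colors at the top, whereas the product still rewards larger amounts.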


Accounting for Similar & Different Colors

•  The score for a particular color should reflect all the colors in the image.

•  At indexing time, increase the score based on similar colors; decrease it based on differing colors.
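The talk gives no formula for this adjustment, so the following is purely an illustrative assumption: each color in the image contributes its pixel share, weighted linearly from +1 (identical to the target) down to -1 (as far away as RGB allows). `adjusted_score` and `hex_to_rgb` are hypothetical helpers.

```python
import math

# largest possible RGB distance (black to white), used to normalize
MAX_DIST = math.sqrt(3 * 255 ** 2)

def hex_to_rgb(h):
    """Turn 'ff9900' into (255, 153, 0)."""
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def adjusted_score(target, histogram):
    """Pixel-share-weighted similarity of every image color to the target color."""
    total = sum(histogram.values())
    t = hex_to_rgb(target)
    score = 0.0
    for hex_color, count in histogram.items():
        # assumed weighting: +1 for an identical color, -1 for the farthest one
        similarity = 1 - 2 * math.dist(t, hex_to_rgb(hex_color)) / MAX_DIST
        score += similarity * count / total
    return score

warm = {"ff0000": 60, "ff9900": 40}    # red plus a similar orange
clash = {"ff0000": 60, "0000ff": 40}   # red plus a very different blue
print(adjusted_score("ff0000", warm) > adjusted_score("ff0000", clash))
```

Both images are 60% red, but the one whose remaining pixels are close to red scores higher for a red query, which is the behavior the bullets describe.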


Conclusion


•  This talk provided a rough guide to building a basic search-by-color application

•  Lots of opportunity to do more sophisticated things in image search:

    •  matching colors in certain parts of an image

    •  identifying visual styles (blur vs sharp, high contrast, etc)

    •  patterns & textures

    •  analyzing content in images (object detection)


One more demo…