a DOC Conundrum
Star Ying, Data Scientist at Department of Commerce

PyData DC 2016: A DOC Conundrum

first, some background

grow the economy

⅓ of all federal public data

new data released constantly

$

a DOC conundrum

how to impart better understanding of our data

of any data

so really a data conundrum

a different perspective

def wdtd()……

# wdtd: what did this do?

is inherited

a quick quiz

american community survey

how many errata have been issued for 2016?

http://www.census.gov/programs-surveys/acs/technical-documentation/errata.html

viirs nighttime lights

which pixels are really blank?

http://ngdc.noaa.gov/eog/viirs/download_monthly.html
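
One way to read the "really blank" question, sketched under assumptions: suppose the product ships a per-pixel count of cloud-free observations next to the radiance grid. A zero radiance with a zero count then means "never observed", not "dark". The grids, the function names `mask_unobserved` and `mean_observed`, and the numbers below are all hypothetical, made up only to show the effect.

```python
import math

def mask_unobserved(radiance, cloud_free_count):
    """Replace radiance with NaN wherever a pixel was never observed.

    Both arguments are hypothetical nested lists of the same shape:
    a radiance grid and a grid of cloud-free observation counts.
    """
    return [
        [value if count > 0 else math.nan for value, count in zip(r_row, c_row)]
        for r_row, c_row in zip(radiance, cloud_free_count)
    ]

def mean_observed(grid):
    """Average only the pixels that carry a real measurement."""
    values = [v for row in grid for v in row if not math.isnan(v)]
    return sum(values) / len(values)

# Made-up 2x2 example: two pixels read 0.0, but only one was observed.
radiance = [[0.0, 1.5], [0.0, 3.0]]
counts = [[0, 4], [2, 7]]

masked = mask_unobserved(radiance, counts)
naive_mean = sum(v for row in radiance for v in row) / 4
masked_mean = mean_observed(masked)
```

In this toy grid the naive mean treats the unobserved pixel as dark and comes out lower than the mean over observed pixels only.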

survey of income and program participation

which weights do I use?

http://www.census.gov/programs-surveys/sipp/methodology/weighting.html
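
Why the choice of weights matters can be shown with a small sketch. It does not pick a weight for you (that is what the documentation above is for); it only shows how far a weighted estimate can sit from an unweighted one. The incomes, the weights, and the name `weighted_mean` are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of values where each record counts for `weight` people."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Made-up survey records: each respondent stands in for many people.
incomes = [20000.0, 50000.0, 120000.0]
person_weights = [3000.0, 1500.0, 500.0]

unweighted = sum(incomes) / len(incomes)
weighted = weighted_mean(incomes, person_weights)
```

Here the low-income record represents the most people, so the weighted mean lands well below the unweighted one. Using the wrong weight column shifts the answer in the same way.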

how do we convey the necessary information to use our product?

now, an anecdote

can we tie satellite images to economic activity?

accounted for cloud coverage, population, etc.

forgot the earth is a sphere
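
A plausible reading of this slip, offered as an assumption rather than as what actually happened: on a regular latitude/longitude grid, every pixel spans the same number of degrees but not the same ground area. On a sphere of radius $R$, a cell between latitudes $\varphi_1$ and $\varphi_2$ with longitude width $\Delta\lambda$ has area $R^2 \, \Delta\lambda \, (\sin\varphi_2 - \sin\varphi_1)$. The sketch below uses a spherical Earth with a mean radius of 6371 km; the function name `cell_area_km2` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation

def cell_area_km2(lat_south_deg, lat_north_deg, lon_width_deg):
    """Area of a latitude/longitude grid cell on a sphere, in km^2."""
    lat_south = math.radians(lat_south_deg)
    lat_north = math.radians(lat_north_deg)
    lon_width = math.radians(lon_width_deg)
    # R^2 * delta_lambda * (sin(phi_north) - sin(phi_south))
    return EARTH_RADIUS_KM ** 2 * lon_width * (
        math.sin(lat_north) - math.sin(lat_south)
    )

# The same 1 x 1 degree cell at the equator and at 60 degrees north.
equator_cell = cell_area_km2(0.0, 1.0, 1.0)
northern_cell = cell_area_km2(60.0, 61.0, 1.0)
ratio = northern_cell / equator_cell
```

The northern cell covers roughly half the ground area of the equatorial one, so summing light per pixel without an area correction compares unequal patches of ground.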

tendency to silo ourselves

real insights and outcomes can only be derived from a true synthesis of knowledge of the data and the mechanics of processing it

so what are we doing about it?

it is a communication problem

Data Usability

commerce.gov/datausability

handcrafted tutorials with working open code

I’d love to hear your ideas

no really