graylinkim
Open Legislation
Spring 2011
Open Data (Government)
Secondary Sources are nice
● OpenCongress
● GovTrack.US
● OpenStates
● FedSpending.org
● Many more
Primary Sources are better
● Data.gov
● USAspending.gov
● California
● Oregon
● Washington
● Many more
Sometimes though... Open Data is not Enough.
We need Platforms.
A Different Breed of Open
● Making data accessible:
  – Built-in search
  – Permanent URIs
  – Standardized Feeds
  – Real-time Alerts
● REST Architecture with Feed Publishing
  – RSS/Atom => Pubsubhubbub => Alerts
So back to Open Legislation
Browse, Search, and Share
http://open.nysenate.gov/legislation
It's not a Service; It's an Open Platform
1 Year Re-cap
● Open Sourced It (for real)
● Improved the API (xml/json)
● Decreased Load Times
● Restructured the Back-end
● Basic Documentation
● Wrapped into a build system
The next year
● In general...
  – Data Quality and Documentation
  – Usage Tracking and Statistics
  – User Interface Improvements
  – Further separation of the Platform and Service
● Right now
  – Data Quality, Data Quality, Data Quality
  – And a little bit of documentation
The Senate has Legislative Data Quality issues?
Well, not exactly
● Legislative Research Service has the data
  – Big, ancient mainframe to boot
● They FTP us updates every 5 minutes
  – In SOBI formats (what?)
  – With some XML mixed in
● We parse it back into XML/JSON/SQL structure
Reasons for Difficulty
● Poorly Documented SOBI behavior
● Formatted as a change log (sometimes)
● Finding sources of error can be hard
● LRS is not co-operative
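Why a change-log format makes errors hard to find: the current state of an object depends on every update that came before it. A minimal sketch, assuming a made-up record shape (`Change`, with `set` and `delete` actions). This is not the real SOBI layout, which the slides describe as poorly documented.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Change(NamedTuple):
    """One hypothetical change record; NOT the actual SOBI layout."""
    source: str                   # name of the update file it came from
    object_id: str                # which object the change applies to
    action: str                   # "set" or "delete" (assumed actions)
    field: Optional[str] = None
    value: Optional[str] = None


def apply_changes(store: Dict[str, dict], changes: Iterable[Change]) -> Dict[str, str]:
    """Replay changes in order, mutating `store`.

    Returns a map of object_id -> the last source file that touched it,
    which is the first place to look when an object turns out wrong.
    """
    last_source: Dict[str, str] = {}
    for change in changes:
        if change.action == "set":
            store.setdefault(change.object_id, {})[change.field] = change.value
        elif change.action == "delete":
            store.pop(change.object_id, None)
        else:
            raise ValueError(f"unknown action: {change.action!r}")
        last_source[change.object_id] = change.source
    return last_source
```

Replaying the same records in a different order can give a different result, so a bad value may come from any earlier file and not only the most recent one.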
Solutions
● Version Control
  – Write objects to JSON/XML files
  – With Git, commit each new version
    – Commit message points to the source SOBI
  – Use git to trace data errors back to SOBI files
● Unit Test known corner cases
● Periodically do a scrape check?
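The version control idea above could be sketched roughly as follows. The directory layout, file names, and commit message wording are assumptions for illustration, not the project's actual conventions. Objects are written as JSON with sorted keys so that diffs stay small, and each commit message names the source SOBI file.

```python
import json
import subprocess
from pathlib import Path


def write_object(repo_dir, kind: str, object_id: str, obj: dict) -> Path:
    """Write one parsed object to <repo>/<kind>/<id>.json (assumed layout)."""
    path = Path(repo_dir) / kind / f"{object_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and fixed indentation keep successive versions diff-friendly.
    path.write_text(json.dumps(obj, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path


def commit_commands(repo_dir, path, sobi_name: str) -> list:
    """Return the git commands that record this version of the file."""
    rel = Path(path).relative_to(repo_dir).as_posix()
    message = f"Update {rel} from {sobi_name}"  # points back to the source SOBI
    return [
        ["git", "add", "--", rel],
        ["git", "commit", "-m", message, "--", rel],
    ]


def commit_version(repo_dir, path, sobi_name: str) -> bool:
    """Stage and commit the file; return False if nothing changed."""
    add_cmd, commit_cmd = commit_commands(repo_dir, path, sobi_name)
    rel = add_cmd[-1]
    subprocess.run(add_cmd, cwd=repo_dir, check=True)
    # `git diff --cached --quiet` exits 0 when nothing is staged for this path.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet", "--", rel], cwd=repo_dir)
    if staged.returncode == 0:
        return False
    subprocess.run(commit_cmd, cwd=repo_dir, check=True)
    return True


def history_command(repo_dir, path) -> list:
    """Command listing every commit (and so every SOBI file) that touched a file."""
    rel = Path(path).relative_to(repo_dir).as_posix()
    return ["git", "log", "--format=%h %s", "--", rel]
```

When an object looks wrong, running the history command on its file lists the SOBI files that changed it, newest first.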
Progress
✔ Parsing has been overhauled
✔ Objects are written to file
✔ Bugs have been found and fixed
✔ Periodic Scrapes are approved
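One way a periodic scrape check might work, assuming it compares a parsed object against the same object scraped from another source. The field names in the usage note are invented for illustration.

```python
def diff_fields(parsed: dict, scraped: dict, fields) -> dict:
    """Return {field: (parsed_value, scraped_value)} for every mismatch."""
    mismatches = {}
    for field in fields:
        ours = parsed.get(field)
        theirs = scraped.get(field)
        # Ignore surrounding whitespace, which scraping tends to introduce.
        if isinstance(ours, str):
            ours = ours.strip()
        if isinstance(theirs, str):
            theirs = theirs.strip()
        if ours != theirs:
            mismatches[field] = (ours, theirs)
    return mismatches
```

For example, `diff_fields({"title": "A"}, {"title": "B"}, ["title"])` returns `{"title": ("A", "B")}`, and an empty result means the two copies agree on the checked fields.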
A short task list
✗ Integrate git into the parsing system.
✗ Document expected behavior
✗ Write a small test suite
✗ Try to avoid having to scrape.
HFOSS Symposium 2011
● Bryan Sivak – Civic Commons
● Mark Prutsalis – Sahana Foundation
● Many universities, Mozilla, Google
● David, Moorthy, Brian, and Myself!
● 1 Hour and a few 3' x 4' posters.