NENA Development Conference | October 2014 | Orlando, Florida
Civic Location Data eXchange Format (CLDXF)
Christian Jacqz
Director, MassGIS, Commonwealth of Massachusetts
Member NENA Core Services / Data Structures / CLDXF Work Group
Purposes of CLDXF
Support the exchange of address data by providing “definitive set of core civic location data elements”
Ensure portability of address data
Permit efficient design of software systems
Meet functional needs of call-routing and dispatch
Does not include all elements needed for local address data management
No address ID, no metadata, no data quality checks
Purposes of CLDXF
Map a profile between IETF PIDF-LO and NENA
PIDF - Presence Information Data Format “hello, it’s me and I’m waiting for an answer”
LO - Location Object “this is exactly where I am” coordinate location or civic address
CLDXF added two (minor) elements to PIDF-LO and dropped six elements
Purposes of CLDXF
Map elements to FGDC address standard
FGDC - Federal Geographic Data Committee
United States Thoroughfare, Landmark, & Postal Address Data Standard
Sponsored by NENA and URISA, managed by Census
Over 10 years in development
More complex than CLDXF
Provide illustrative examples of parsing
There are a lot of weird addresses out there!
Why address standard is so important
Data standardization per CLDXF will greatly facilitate “matching” between address records
How to ensure that two records that refer to the same address can be matched in a database, without human intervention?
Street name match is most important
Unit matching is most difficult
Matching between datasets goes beyond the explicit goals of the standard but is (in my view) a tremendously important benefit of implementing the standard
However, remember that the addressing authority has final say on the name, so additional content standards may be required (e.g. “Fourth” v. “4th”)
Why address standard is so important
A standard allows for automated matching between many different address lists & mapping sources
ALI database may not provide a complete list
[Figure: tabular and geographic sources linked through the address standard. Labels: Field data collection, Local updates, Commercial data provider, Voter List, Utilities, ALI DB, Tax List, US Census, Local Parcels]
How is CLDXF different from other standards?
No abbreviations
except State and Country
More levels of geography
Municipalities, communities, neighborhoods
Boundaries matter!
Complete parsing of street names
Fixes deficiencies in existing telco and USPS formats
How is CLDXF different?
Covers all possible numbering schemes
Number prefix, number, number suffix
Provides structure for subaddress information
Solves “kitchen sink” problem
Supports precision in address down to room & seat
How is CLDXF different?
XML standard
XML is an extensible markup language – documents must be “well-formed” with nested tags etc.
About data, not presentation
Additionally, XML schemas validate the structure of an XML document, and namespaces keep element names unique
<note>
  <to>Ed</to>
  <from>Martha</from>
  <heading>Reminder</heading>
  <body>Take some time off!</body>
</note>
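As a minimal sketch of what “well-formed” means in practice, the Python snippet below feeds the note example to the standard-library XML parser and then tries a copy with mismatched tags. The helper names is_well_formed and children are illustrations only, and checking a document against a schema is a separate step that this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

NOTE = (
    "<note><to>Ed</to><from>Martha</from>"
    "<heading>Reminder</heading><body>Take some time off!</body></note>"
)
BROKEN = "<note><to>Ed</from></note>"  # closing tag does not match the opening tag


def is_well_formed(text):
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def children(text):
    """Map each child tag of the root element to its text content."""
    return {child.tag: child.text for child in ET.fromstring(text)}


print(is_well_formed(NOTE), is_well_formed(BROKEN))
print(children(NOTE))
```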
How is CLDXF different?
Data elements vs. database fields
In XML you have required or optional elements; in a database, the field layout is fixed and records can have null values
In XML, nested hierarchy of tags is specified in a schema; in a database table, there is no hierarchy (although parent-child relationships are sometimes supported)
In XML, tags may be allowed to repeat within a “record”; in a database, one record has one value in one field
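The difference shows up as soon as an XML “record” has to be loaded into a table. The sketch below is one possible way to do it in Python, under stated assumptions: the fragment is a simplified, namespace-free stand-in for a real PIDF-LO document, FIELDS is a hypothetical fixed table layout, and joining repeated tags with a space is just one flattening choice.

```python
import xml.etree.ElementTree as ET

# Simplified fragment: real PIDF-LO documents use XML namespaces.
SAMPLE = """
<civicAddress>
  <country>US</country>
  <A1>OR</A1>
  <LMKP>Reed College</LMKP>
  <LMKP>Eliot Hall</LMKP>
</civicAddress>
"""

# Hypothetical fixed field layout of the target table.
FIELDS = ("country", "A1", "A3", "LMKP")


def to_flat_record(xml_text):
    """Flatten optional and repeating elements into one value per field."""
    root = ET.fromstring(xml_text)
    record = {}
    for name in FIELDS:
        values = [(el.text or "").strip() for el in root.findall(name)]
        # Missing optional element -> null; repeated element -> joined text.
        record[name] = " ".join(values) if values else None
    return record


print(to_flat_record(SAMPLE))
```

The optional element A3 is absent from the fragment and becomes a null, while the repeated LMKP tag has to be squeezed into a single field value.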
About each element
CLDXF <-> PIDF-LO correspondence
What is it? (and definition source)
Examples
Data type
Does it have a domain?
Mandatory/conditional/optional
How many of this element?
Notes
CLDXF element groups
Country, State, and Place Name
Street Name
Address Number
Landmark Name
Subaddress
Address Descriptor
Country, State, and Place Names
The easy ones – large geographies, well-defined legal status
Country Name / Country (Country) – mandatory two-letter ISO code
State Name / State (A1) – mandatory two-letter USPS code
Place Name / County (A2) – mandatory
The name of county or county-equivalent where the address is located.
Country, State, and Place Names
Where is a given street? What place names are needed to make street names unique?
Incorporated Municipality (A3) – mandatory (“unincorporated” as default)
The general-purpose local governmental unit where the address is located
Must have legally established boundaries.
Need domain of muni names.
Addresses and boundaries
“…where the address is located.”
All these structures are located in Cambridge but addressed in Belmont.
You can’t list an address for Grove Street in Cambridge – because this Grove Street is not in Cambridge and there very well might be another Grove Street that is
[Map labels: CAMBRIDGE, BELMONT, Grove Street]
Addresses and boundaries
“…where the address is located.”
In which municipality is this address located?
More Kinds of Place Names
Unincorporated Community (A4) – optional
Within an incorporated municipality, or in an unincorporated portion of a county
If not mapped, may be difficult to use
Distinguish from landmark – not single use or under single ownership and control.
Neighborhood Community (A5) – optional
Neighborhood, subdivision or small commercial area.
Postal Community Name and ZIP Code (PCN, PC) – optional but strongly recommended
National Domains for Place Names
Country, state, county, postal town and zip code have domains
Local or statewide domains for incorporated municipalities
Include type of place e.g. ‘Township of North Hampden’ v. ‘Borough of North Hampden’
Mapping of boundaries makes use of any place name much more useful
Element                     | Domain source
country (Country)           | ISO 3166-1
state (A1)                  | USPS Pub 28
county (A2)                 | https://www.census.gov/geo/reference/codes/cou.html
postal community name (PCN) | USPS City State File
postal zip code (PC)        | USPS City State File
How MA uses place names
All of MA is incorporated municipalities (A3)
Survey level boundaries, legally defined
“MSAG community” is the geography in MA that ensures the uniqueness of street names (A4)
A4 boundaries are mapped, and strictly nested within A3
Distinguish from PSAP boundaries – existing MSAG has a real problem with this
Zip codes are useful, but a nightmare to map
Parts of Street Name (PIDF-LO element)
Street Name Pre Modifier (PRM)
Street Name Pre Directional (PRD)
Street Name Pre Type (STP)
Street Name Pre Type Separator (STPS) (added to US profile of PIDF-LO to match FGDC)
Street Name (RD)
Street Name Post Type (STS)
Street Name Post Directional (POD)
Street Name Post Modifier (POM)
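One way to hold the eight parts in code is a small record type whose fields follow the element order listed on this slide. The sketch below is an illustration in Python, not part of CLDXF: the class name StreetName and the method full_name are hypothetical, and joining the non-empty parts with single spaces is an assumption about how a display name would be rebuilt.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class StreetName:
    # Field order follows the slide: PRM PRD STP STPS RD STS POD POM.
    PRM: Optional[str] = None   # Street Name Pre Modifier
    PRD: Optional[str] = None   # Street Name Pre Directional
    STP: Optional[str] = None   # Street Name Pre Type
    STPS: Optional[str] = None  # Street Name Pre Type Separator
    RD: Optional[str] = None    # Street Name
    STS: Optional[str] = None   # Street Name Post Type
    POD: Optional[str] = None   # Street Name Post Directional
    POM: Optional[str] = None   # Street Name Post Modifier

    def full_name(self):
        """Join the populated parts in element order (hypothetical helper)."""
        parts = [getattr(self, f.name) for f in fields(self)]
        return " ".join(p for p in parts if p)


print(StreetName(STP="AVENUE", STPS="OF THE", RD="AMERICAS").full_name())
print(StreetName(PRM="OLD", PRD="NORTH", RD="MAIN", STS="STREET").full_name())
```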
Familiar elements
No abbreviations. IMHO, this is a very good thing.
Example: “N JOHNSON TR”. Is it “NORTH JOHNSON TRAIL” or “NEIL JOHNSON TERRACE”?
Any list of abbreviations will need constant maintenance
Domains for Pre/Post Types at http://technet.nena.org/nrs/registry/_registries.xml
pre directional | pre type | street name | post type | post directional
                |          | MAIN        | STREET    |
                |          | BROADWAY    |           |
NORTH           |          | FAIRFAX     | DRIVE     |
                | ROUTE    | 7           |           |
                |          | SEVENTH     | STREET    | EAST
                | AVENUE   | C           |           | NORTH
A Few Twists on Familiar Elements
Two types
pre directional | pre type | street name   | post type | post directional
                |          | WARREN STREET | COURT     |
NORTH           |          | 10TH STREET   | EXTENSION |
Multiword types
pre directional | pre type           | street name | post type   | post directional
                |                    | STADIUM     | ACCESS ROAD |
                | INTERSTATE HIGHWAY | 95          |             |
Local Knowledge Required
pre directional | pre type | street name | post type | post directional
EAST BRIDGEWATER ROAD
EAST BRIDGEWATER ROAD
NORTH STAR ROAD
NORTH STAR ROAD
BYRON LANE ROAD
BYRON LANE ROAD
Not-so-familiar elements
Modifiers
Separated from name, not a type word or phrase
pre modifier | pre directional | pre type | street name      | post type   | post directional | post modifier
             |                 |          | ARNOLD           | AVENUE      |                  | BYPASS
             |                 |          | ARNOLD           | AVENUE      |                  | EXTENDED
             |                 |          | SOUTH SHORE MALL | ACCESS ROAD |                  | NUMBER 1
ALTERNATE    |                 | ROUTE    | 8                |             |                  |
THE          |                 |          | MANOR            | LANE        |                  |
             |                 |          | THE MANOR        | LANE        |                  |
Separated from name, before or after directional
pre modifier | pre directional | pre type | street name | post type | post directional | post modifier
             |                 |          | MARKET      | STREET    | NORTH            | EXTENSION
             |                 |          | MAIN        | STREET    |                  |
             | NORTH           |          | MAIN        | STREET    |                  |
OLD          | NORTH           |          | MAIN        | STREET    |                  |
Not-so-familiar elements (continued)
Street Name Pre Type Separator
Added to match FGDC Separator Element
Preposition or prepositional phrase that “separates” pre type from name
pre-type | pre-type separator | street name
ROAD     | TO THE             | RIVER
AVENUE   | OF THE             | AMERICAS
‘northbound’ and ‘southbound’ modifiers
pre-directional | pre-type           | street name | post-type | post-directional | post-modifier
                | INTERSTATE HIGHWAY | 95          |           |                  | southbound
NORTH           |                    | MARKET      | STREET    |                  |
                |                    | MARSHALL    | AVENUE    | SOUTH            |
Content standards to support matching
In CLDXF, local addressing authority has broad discretion about what goes into the name
Domains apply to types and directionals, not modifiers
IMHO, NENA should recommend best practices for street name content, such as:
No abbreviations (maybe except often mis-spelled honorifics “LIEUTENANT” = “LT”, “MONSIGNOR” = “MSGR”)
Use “official” name including special characters (“MARY’S WAY”, note that CLDXF supports these)
Have a rule for numbering – e.g. “First” through “Tenth”, then “11th” and up
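A numbering rule like the last one is easy to apply mechanically. The following Python sketch assumes the example rule from this slide (words for first through tenth, digits with an ordinal suffix from 11 up) and assumes upper-case output; the function names are hypothetical, and a real content standard would be whatever the addressing authority adopts.

```python
import re

ORDINAL_WORDS = {
    1: "FIRST", 2: "SECOND", 3: "THIRD", 4: "FOURTH", 5: "FIFTH",
    6: "SIXTH", 7: "SEVENTH", 8: "EIGHTH", 9: "NINTH", 10: "TENTH",
}
NUMERIC_ORDINAL = re.compile(r"(\d+)(?:ST|ND|RD|TH)")


def ordinal_suffix(n):
    """Return the English ordinal suffix for n (11, 12, 13 take TH)."""
    if 11 <= n % 100 <= 13:
        return "TH"
    return {1: "ST", 2: "ND", 3: "RD"}.get(n % 10, "TH")


def normalize_ordinals(street_name):
    """Spell out 1st-10th, keep digits from 11th up (assumed example rule)."""
    out = []
    for token in street_name.upper().split():
        m = NUMERIC_ORDINAL.fullmatch(token)
        if m:
            n = int(m.group(1))
            token = ORDINAL_WORDS.get(n) or f"{n}{ordinal_suffix(n)}"
        out.append(token)
    return " ".join(out)


print(normalize_ordinals("4th"), normalize_ordinals("Fourth"), normalize_ordinals("11th"))
```

With a rule like this applied on both sides, “Fourth” and “4th” reduce to the same street name value and can be matched without human intervention.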
Issues with domains
If you are trying to support legacy systems with linear geocoding
What do you do with feature names like “Apartments” “Commons” that don’t properly refer to a linear, drivable feature?
All kinds of things can appear on a street sign that are legitimate streets with no type or implicit type (the latter is “BROADWAY”)
“BLUE FIN” “SAIL-A-WAY” “ASSINNASHAMAYAK”
Parsing Street Names using CLDXF
Open source parser to implement CLDXF
Process street names as raw inputs
Identify possible element types for each word or phrase using lookup of abbreviations (~1000 records) to domains for directionals and types, also listing of base names which could be otherwise interpreted
Enforce ordering of elements
Score viable candidates
Annotate invalid records
Very simplified automated parsing example
“E ST” lookup:
INPUT         | ELEMENT_TYPE | STANDARD OUTPUT  | SCORE
E             | RD           | E                | 1
E             | POM          | eastbound        | 0
E             | PRD          | EAST             | 2
E             | POD          | EAST             | 2
E             | RD           | EAST             | 0
E BRIDGEWATER | RD           | EAST BRIDGEWATER | 2
ST            | RD           | SAINT            | 0
ST            | STS          | STREET           | 2
Element order: PRM PRD STP STPS RD STS POD POM
Rules:
STP or STS – must have type
RD – must have name
Cartesian product of element types: (RD, POM, PRD, POD, RD) X (RD, STS)
token 1 | token 2 | Filter on element order | Filter on required elements
RD      | RD      | E SAINT                 |
RD      | STS     | E STREET                | E STREET (3)
POM     | RD      |                         |
POM     | STS     |                         |
PRD     | RD      | EAST SAINT              |
PRD     | STS     | EAST STREET             |
POD     | RD      |                         |
POD     | STS     |                         |
RD      | RD      | EAST SAINT              |
RD      | STS     | EAST STREET             | EAST STREET (2)
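The example on this slide can be replayed in a few lines of Python. This is a sketch under stated assumptions, not the open source parser itself: LOOKUP holds only the single-word rows for “E” and “ST” from the slide (the phrase entry for E BRIDGEWATER is left out because this sketch splits on whitespace), unknown words are assumed to be street names with score 1, and a candidate's score is assumed to be the sum of its token scores.

```python
from itertools import product

ORDER = ["PRM", "PRD", "STP", "STPS", "RD", "STS", "POD", "POM"]
RANK = {element: i for i, element in enumerate(ORDER)}

# (element type, standard output, score) rows copied from the slide.
LOOKUP = {
    "E": [("RD", "E", 1), ("POM", "eastbound", 0), ("PRD", "EAST", 2),
          ("POD", "EAST", 2), ("RD", "EAST", 0)],
    "ST": [("RD", "SAINT", 0), ("STS", "STREET", 2)],
}


def parse(raw):
    """Return viable (standardized name, score) candidates, best first."""
    tokens = raw.upper().split()
    # Assumption: a word missing from the lookup is a street name, score 1.
    options = [LOOKUP.get(tok, [("RD", tok, 1)]) for tok in tokens]
    results = []
    for combo in product(*options):          # Cartesian product of types
        types = [etype for etype, _, _ in combo]
        ranks = [RANK[etype] for etype in types]
        if ranks != sorted(ranks):           # filter on element order
            continue
        if "RD" not in types:                # must have name
            continue
        if "STP" not in types and "STS" not in types:  # must have type
            continue
        name = " ".join(output for _, output, _ in combo)
        results.append((name, sum(score for _, _, score in combo)))
    return sorted(results, key=lambda r: -r[1])


print(parse("E ST"))
```

For “E ST” this leaves the same two survivors as the slide, E STREET with score 3 and EAST STREET with score 2.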
What was the point of that?!
CLDXF can be implemented in code to standardize street names and to deal with all aspects of parsing and matching except:
Alternate spellings of base name
MLK Blvd v. Martin Luther King Blvd
Msgr OBrien v. Monsignor Martin J. O’Brien
Msschsts Ave v. Massachusetts Ave
Concatenation of full street name and subaddress
Location, unit, building or other info
Ambiguous sequence of address number
47 | A J Handy Drive v. 47 A | J Handy Drive
Address Number
What is an address number?
Ideally, the number part indicates a location in sequence along a road, respecting parity
At a minimum, the full address number uniquely identifies one of the following with reference to a named street:
a site or a group of structures
a single structure
a part of a structure
or some other location like an undeveloped parcel
Address number prefix (HNP)
Address number (HNO) (integer)
Address number suffix (HNS)
Address Number – further thoughts
Unfortunately, address numbers are often used to encode other kinds of information:
Sector
Cross street or block
Building, Floor, Unit
Decoding the pattern may be useful
Splitting the full number into prefix, number and suffix should preserve the sequence information, if any
Zero should not be used to indicate no address number
Weirdo numbering
Sample number parsing – odd cases
Again, if possible, decode the assignment
Address Number prefix (HNP) | Address Number (HNO) | Address Number suffix (HNS) | Note
A                           | 21                   |                             | MA has hundreds of these
                            | 12                   | B                           | typical two family or infill
                            | 12                   | 1/2                         | also common for infill
                            | 12                   | .5                          | due to field type constraints
B                           | 4                    | -01                         | building, floor, unit
194-                        | 23                   | 1/2                         | Queens, NY (block number)
N6W2                        | 3001                 |                             | grid reference
Mileposts:
Milepost (MP)
Milepost 34.4
Km 2.7
Mile Marker 12
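For the simpler rows above, the split into prefix, number and suffix can be sketched with one regular expression. The Python below is an illustration only: split_address_number is a hypothetical name, the pattern assumes the prefix contains no digits, and it therefore does not reproduce the Queens block-number or grid-reference rows, which need the local decoding the slide recommends. An input with no digits returns None rather than a zero.

```python
import re

# Non-digit prefix, then the first run of digits, then whatever follows.
ADDRESS_NUMBER = re.compile(r"(\D*?)\s*(\d+)\s*(.*)")


def split_address_number(raw):
    """Return (HNP, HNO, HNS), or None when there is no number at all."""
    m = ADDRESS_NUMBER.fullmatch(raw.strip())
    if m is None:
        return None  # never substitute zero for a missing address number
    hnp, hno, hns = m.groups()
    return hnp.strip(), int(hno), hns.strip()


for raw in ["A 21", "12 B", "12 1/2", "12.5", "B 4-01", "no number"]:
    print(repr(raw), "->", split_address_number(raw))
```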
Parsing Quiz
123 North Street
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
Tunnel Massachusetts Bay Transit Authority Green Line Haymarket to North Station
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
289 ½ Broadway South
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
22 A West Virginia Avenue
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
Interstate Highway 495 northbound
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
Old State Route 1
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
A-17 Warren Street Court
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
Avenue C Loop
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
Summit County Road 99
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
72 Road to the River
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Parsing Quiz
14-16 Main Street (trick question)
Address Number Prefix | Address Number | Address Number Suffix | Street Name Pre Modifier | Street Name Pre Directional | Street Name Pre Type | Street Name Pre Type Separator | Street Name | Street Name Post Type | Street Name Post Directional | Street Name Post Modifier
Landmarks and landmark parts
Landmark: “Name by which a prominent feature is publicly known”
Landmark part added by CLDXF as extension of PIDF-LO. Usually involves a geographic hierarchy.
Landmark part is a repeating tag, so doesn’t neatly translate to fields
Order is not specified (e.g. smallest -> largest); parts concatenated with spaces
Landmark Name Part (LMKP)     | Landmark Name Part (LMKP) | Landmark (LMK)
Statue of Liberty             |                           | Statue of Liberty
Winona Park Elementary School |                           | Winona Park Elementary School
Yosemite National Park        | Camp Curry                | Yosemite National Park Camp Curry
University of South Florida   | Sun Dome Arena            | University of South Florida Sun Dome Arena
Reed College                  | Eliot Hall                | Reed College Eliot Hall
One way to manage landmark parts
If you are managing “sites” as a separate geographic layer, with sub-sites and named buildings mapped:
When is something “a prominent feature, publicly known”?
When is something a building and when is it a landmark?
Note: a landmark is a complete, valid address
Site (landmark part)   | sub-site (landmark part) | building (landmark part)
Beth Israel Hospital   | East Campus              | Lamont Building
St. John's College     |                          | Meecham Library
Wisteria Lake          | Boat Ramp                |
General Electric Plant |                          | Maintenance Shed
Subaddress elements (PIDF-LO)
In USPS or ALI database, typically unstructured info
CLDXF provides hierarchy of building, floor, unit
FGDC allows for flexibility in typing subaddress components; CLDXF suggests type word included
Building (BLD)
Floor (FLR)
Unit (UNIT)
Room (ROOM)
Seat (SEAT)
Additional Location Information (LOC)
Subaddress examples
Building | Addl. Location Information | Floor | Unit | Room | Seat | Notes
Building 4 Suite 10
Wing 7
Floor 6
Corridor Zero
Apartment 2D
Mezzanine Room 450F
Penthouse
Basement (FLOOR)
Basement (UNIT)
Terminal A Gate C27
4th floor Empire Room
Corridor F Floor 3 Cubicle 23
Subaddress issues
Not always clear what goes into “building” v. “landmark”
“generic” identifiers, like numbers or letters, go into building field, whereas names go into landmark field, but “publicly known?”
Many ways “building,” “floor” or “unit” can be represented or abbreviated in inputs
Identifiers can be encoded into unit field – this may be an area for content standards
“#5” “Apt. 5” “Unit 5” “No. 5” all refer to Unit 5
“7B” “A-5C” “B12” all contain reference to building or floor as well as unit
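A content standard for the unit field could start with something as small as the Python sketch below. It is an assumption-laden illustration, not a CLDXF rule: the list of designators, the choice of “Unit” as the single output type word, and the name normalize_unit are all hypothetical, and only identifiers that begin with a digit are rewritten. Encoded values such as “7B” or “A-5C” are passed through unchanged, since decoding building or floor from them needs local knowledge.

```python
import re

# Designators assumed to mean "unit"; the identifier must start with a digit.
UNIT_DESIGNATOR = re.compile(
    r"(?:#|apt\.?|apartment|unit|no\.?|number)\s*(\d[A-Za-z0-9-]*)",
    re.IGNORECASE,
)


def normalize_unit(raw):
    """Rewrite recognized designators to 'Unit <id>', else return input as-is."""
    text = raw.strip()
    m = UNIT_DESIGNATOR.fullmatch(text)
    return f"Unit {m.group(1).upper()}" if m else text


for raw in ["#5", "Apt. 5", "Unit 5", "No. 5", "7B", "A-5C"]:
    print(repr(raw), "->", normalize_unit(raw))
```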
One last element
Not part of the address, but an attribute
Domain is - http://www.iana.org/assignments/location-type-registry/locationtype-registry.xml
Address Feature Type / Place Type (PLC)
airport A place from which aircraft operate, such as an airport or heliport.
arena Enclosed area used for sports events.
bank Business establishment in which money is kept for saving or commercial purposes or is invested, supplied for loans, or exchanged.
bar A bar or saloon.