Monitoring Market Rents in Metro Vancouver Transit Locations
School of Community And Regional Planning
This project was completed as a requirement of the Affordable Housing Policy and Planning course (PLAN 530) at the
University of British Columbia’s School of Community and Regional Planning in partnership with Metro Vancouver.
The project was designed to provide an opportunity for students to collaborate with a community partner on a
research topic that is current, practical, and relevant to housing issues in British Columbia.
The research findings and recommendations in this report are those of the authors and they do not necessarily
reflect the views of The University of British Columbia or Metro Vancouver.
TABLE OF CONTENTS
Context
Purpose
Limitations
Census Survey (Statistics Canada)
Rental Market Survey (CMHC)
Process
Advantages & Disadvantages
Current Practices
Legality
Web Scraping Pilot Program
Collaborative Rental Housing Data Platform
Figure 1. Metro Vancouver 2040 Regional Growth Strategy3
Context
Strategy 4.1 of Metro Vancouver 2040: Shaping our Future, Metro Vancouver’s regional growth
strategy, illustrates the organization’s goal to provide diverse and affordable housing choices to its
residents.1 To track progress toward this goal, Metro Vancouver uses the combined cost burden of
housing and transportation as one of the key measures on its Metro 2040
Dashboard.2
However, the current methodology of determining housing cost from census data has several
limitations. First, Statistics Canada conducts the national census only every five years. Including
data processing time, the resulting lag of up to six years significantly reduces the accuracy of
current housing cost estimates and impedes the capture of up-to-date information on the rental
housing stock. Second, Statistics Canada rounds or excludes values for small geographic areas to
protect residents’ personal information. Third, the data are released only at the neighbourhood
level, so information on individual housing units is inaccessible. For instance, the average rental
value can only be released at a census tract level or a larger geographic area. This geographic
constraint limits the potential for microscale analyses that use each dwelling unit’s specific
location and attributes.
The limitations above hinder the ability of local government
planners and decision-makers to retrieve actual housing
costs in the market. Therefore, it is necessary to
develop a better procedure for collecting rental market
values at a more frequent and appropriate scale. The
resulting methodology will improve the understanding of
market rents in the rapidly growing Metro Vancouver
region, especially near current and future rapid-transit
locations. In addition, better data access is critical to
executing equitable transit-oriented development,
ensuring that such development delivers the desired
societal benefits and minimizes unfavourable effects. As a
result, government officials can deliver appropriate policy
responses based on the outcomes.3
1,3 http://www.metrovancouver.org/services/regional-planning/PlanningPublications/RGSAdoptedbyGVRDBoard.pdf 2 http://www.metrovancouver.org/metro2040
Figure 2. Research Process
Purpose
The census data’s limitations on capturing market rental costs affect Metro Vancouver’s ability to
assist municipalities in monitoring housing status and developing policies. Currently, there is no
standard reporting of average market rents near transit locations, and no clear data source to track
rental charges, particularly in the secondary rental stock. The purpose of this research is to explore,
develop, and propose methodological tools to compile and measure asking market rents in Metro
Vancouver. These alternative tools will accurately and comprehensively compile current asking prices
of primary and secondary rents in the region, in particular, near transit locations. Accordingly, local
governments and other stakeholders can use the recommended methodologies to better coordinate
future developments in the region. The information collected by the proposed tools will be essential
for planners and decision-makers to preserve and encourage affordable housing, employment,
childcare, and access to recreation and services near rapid-transit areas.
This report was prepared in three phases (Figure 2). During the data collection phase, the team
examined academic literature, conducted informal interviews (Figure 3), and undertook Q&A
correspondence via email.
The research team communicated with six key informants, conducted three informal interviews,
and examined three legal cases on web scraping. In addition, the team reviewed a personal blog
that provides a ‘how-to guide’ to web scraping and its applications. The literature review
found little academic research on the legality of web scraping. During the communications
with key informants, the team asked different questions depending on each informant’s expertise. The
interviewees were asked to describe in detail the methodology of their current practice, such as
surveys, analysis, and research. In addition, researchers who regularly use web scraping were
asked for their opinion on its legality.
In the data analysis phase, the team compiled all the information and summarized the findings. Then,
this report aimed to answer two key questions: (1) what are current and potential methodologies to
compile market rents near transit locations; and (2) what are the administrative and legal constraints
associated with these methodologies?
Figure 3. Organizations and Occupations of Key-informants
During the final phase, the team created recommendations on potential methodologies based on the
research findings. The team suggested a web scraping pilot program and a collaborative data
platform to monitor the cost of rental housing in the Metro Vancouver region. Lastly, the report was
revised based on the feedback provided by the course instructors.
Limitations
This research was conducted over four months, from January to April 2019. All researchers who use web
scraping were reluctant to reveal in-depth information about their research (e.g. full details of their
research methods, such as computer code). Although they agreed to share their publicly available
analyses and reports (e.g. blog posts), they wanted to remain discreet due to the ambiguous legality of
web scraping. Additionally, several respondents were concerned that web scraping may be
challenged in the courts because it may violate the terms and conditions of some websites, which
is a significant barrier to conducting collaborative research and publishing results. Lastly, some of
the information gathered in this research was technical and may be better understood by
professionals in the corresponding fields (e.g. software programming, law, finance).
Census Survey by Statistics Canada
Statistics Canada conducts a national census survey to collect information on all individuals and
households in Canada. The survey has extensive coverage across the country with a high response
rate; the response rate of the most recent census in 2016 was over 98%.4 It collects information on
various characteristics of each dwelling (e.g. owned or rented, monthly rent, number of rooms,
need for repairs, subsidized housing or not).
However, Statistics Canada conducts the survey only every five years. As a result, users may have to rely
on data that was collected more than five years ago. Each entry is aggregated into
groups and released at the census tract level. Although the census has excellent coverage,
its housing data contains limited information on rental type. For instance, all
dwellings are categorized as either subsidized or non-subsidized, and the census does not determine
whether a dwelling is a purpose-built rental or a secondary market rental. Moreover, the census
collects this data from only a 25% sample of all dwellings in Canada. Therefore, it is difficult to retrieve accurate
and current rental market information.
Rental Market Survey by Canada Mortgage and Housing Corporation
Canada Mortgage and Housing Corporation (CMHC) conducts an annual rental market survey,
predominantly by phone interviews with minimal site visits. Less than 5% of the sampled units are
visited, and the information gathered through site visits is not formally collected.5 The primary
purpose of a site visit is to verify the building’s status (e.g. demolished, condo or non-market, and vacancy). CMHC
often knocks on doors to ask tenants for the name and number of the person who collects the rent.
However, site visits make up a small percentage of the entire data collection since CMHC does not
conduct an in-depth survey. Overall, the survey collects information about rental price, availability, turnover rate, and
vacancy rate across Canada. CMHC conducts the survey every year, which is more frequent than the
census; however, it is still not frequent enough to accurately capture rental data, which often
fluctuates throughout the year. Also, the rental market survey only collects purpose-built rental
information and therefore excludes most of the rental units in the housing market. The survey only
targets privately initiated structures with at least three rental units, further excluding units sublet by
individuals or rented out by private investors. Moreover, the
collected data is only released at a city scale, not by neighbourhood or location.
4 https://www12.statcan.gc.ca/census-recensement/2016/ref/response-rates-eng.cfm 5 Suzanne Milburn: Manager of Programs West, Canada Mortgage and Housing Corporation
As described above, the census by Statistics Canada and the rental market survey by CMHC have
substantial limitations in collecting up-to-date information on market rental units. Therefore, an
alternative methodology is necessary to compile more accurate and consistent data on the rental
market in Metro Vancouver. In both academia and industry, web scraping is becoming a
more popular methodology for analyzing large amounts of real-time data with high accuracy.
Multiple academic studies defined the meaning of web scraping and examined its technology. Daniel
Glez-Peña, a software engineering researcher, defined web scraping as “the process of extracting
and combining contents of interest from the web in a systematic way. By using scraping software,
the user can create a large database from multiple listings on online web pages.”6 Vargiu and Urru
described web scraping as “the set of techniques used to automatically get some information from a
website instead of manually copying it. The goal of a web scraper is to look for certain kinds of
information, extract, and aggregate it into new web pages. In particular, scrapers are focused on
transforming unstructured data and saving them in structured databases.”7
Process
The overall process of web scraping begins with data collection. Using
programming languages such as Java or Python, the user can write
code to automatically extract data from a particular section
of each page or listing.8 For instance, the user can write code to extract
each rental listing’s address, size, and number of bedrooms and
compile them into a single spreadsheet. This spreadsheet can
update itself continuously if the user schedules the code to run at a set interval.
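The extraction step can be illustrated with a minimal Python sketch. It is not the method of any informant in this report: the HTML snippet, the class names (`listing`, `address`, `size`, `bedrooms`, `price`), and the helper names are all hypothetical, and the code reads an inline sample rather than a live website. Real listing pages use their own markup, and their terms of use should be reviewed before any automated collection.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical listing markup; real sites use their own structure and class names.
SAMPLE_PAGE = """
<div class="listing">
  <span class="address">123 Example St, Vancouver</span>
  <span class="size">650</span>
  <span class="bedrooms">1</span>
  <span class="price">1850</span>
</div>
<div class="listing">
  <span class="address">456 Sample Ave, Burnaby</span>
  <span class="size">900</span>
  <span class="bedrooms">2</span>
  <span class="price">2400</span>
</div>
"""

FIELDS = ("address", "size", "bedrooms", "price")


class ListingParser(HTMLParser):
    """Collects one dict per <div class="listing"> block."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._current_field = None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "listing":
            self.listings.append({})         # start a new listing record
        elif tag == "span" and css_class in FIELDS and self.listings:
            self._current_field = css_class  # the next text node belongs to this field

    def handle_data(self, data):
        if self._current_field is not None:
            self.listings[-1][self._current_field] = data.strip()
            self._current_field = None


def extract_listings(html):
    """Return a list of dicts, one per listing found in the HTML."""
    parser = ListingParser()
    parser.feed(html)
    return parser.listings


def to_csv(listings):
    """Serialize the listings as CSV text (the 'single spreadsheet')."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(listings)
    return buffer.getvalue()


rows = extract_listings(SAMPLE_PAGE)
print(to_csv(rows))
```

Running the same script on a schedule (for example, once a day) and appending the new rows would give the continuously updated spreadsheet described above.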
However, the same listing often gets posted more than once, possibly on multiple websites. Therefore,
a deduplication process that identifies identical listings and records them only once is critical.
Deduplication is conducted either manually or automatically by computer code. Then,
listings that do not meet a quality standard, such as those with missing values or outliers, are filtered out.
Finally, the resulting inventory is used for analysis, such as calculating the median rental price of an
area.
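The cleaning steps above (deduplicate, filter, then summarize) can be sketched as follows. The records, the matching rule (same normalized address, bedroom count, and rent), and the plausibility bounds are assumptions chosen for illustration; the report does not prescribe specific values.

```python
from statistics import median

# Hypothetical records, as they might look after scraping several websites.
listings = [
    {"address": "123 Example St", "bedrooms": 1, "rent": 1850, "source": "site_a"},
    {"address": "123 example st ", "bedrooms": 1, "rent": 1850, "source": "site_b"},  # duplicate
    {"address": "456 Sample Ave", "bedrooms": 1, "rent": None, "source": "site_a"},   # missing rent
    {"address": "789 Test Rd", "bedrooms": 1, "rent": 25, "source": "site_a"},        # implausible
    {"address": "321 Demo Blvd", "bedrooms": 1, "rent": 2100, "source": "site_b"},
    {"address": "654 Mock Way", "bedrooms": 1, "rent": 1950, "source": "site_a"},
]

# Assumed plausibility bounds in dollars per month; not taken from the report.
MIN_RENT, MAX_RENT = 300, 10_000


def dedup_key(listing):
    """Treat listings with the same normalized address, bedrooms, and rent as one unit."""
    return (listing["address"].strip().lower(), listing["bedrooms"], listing["rent"])


def deduplicate(records):
    """Keep the first record seen for each key and drop later copies."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def passes_quality_check(listing):
    """Reject listings with a missing rent or a rent outside the assumed bounds."""
    rent = listing["rent"]
    return rent is not None and MIN_RENT <= rent <= MAX_RENT


inventory = [item for item in deduplicate(listings) if passes_quality_check(item)]
median_rent = median(item["rent"] for item in inventory)
print(len(inventory), median_rent)
```

A stricter or looser matching rule changes how many listings survive deduplication, so the chosen key should be documented alongside any published median.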
6 Glez-Peña, D., Lourenço, A., López-Fernández, H., Reboiro-Jato, M., & Fdez-Riverola, F. (2013). Web scraping technologies in an API world. Briefings in bioinformatics, 15(5), 788-797 7 Vargiu, E., & Urru, M. (2013). Exploiting web scraping in a collaborative filtering-based approach to web advertising. Artif. Intell. Research, 2 (1), 44-54. 8 Data Scientist, Flipboard and Quantitative Rhetoric
Advantages and Disadvantages
Web scraping provides an opportunity to collect more up-to-date and comprehensive information on
market rents. By automatically compiling thousands of listings from multiple online
platforms, it saves the human labour of manually collecting listing data, which is efficient in
terms of both cost and time. Also, the gathered data can easily be managed within a single spreadsheet.
However, web scraping also has limitations. First, not all listings contain the same level of detail, and
some may omit information. Also, since websites are subject to
maintenance and modification, data gathering may be disrupted. Lastly, but most
importantly, the legality of web scraping is still in a grey zone in Canada, meaning that its legal
permissibility is not fully defined and may vary on a case-by-case basis.9
Current Practices
The following examples demonstrate current practices of web scraping in academia and
industry.10
PadMapper
PadMapper is a private rental housing platform
website that produces a national Monthly
Canadian Rent Report by web scraping its own
website’s listings data. Every month, it records
and publishes the median asking rents of all one-bedroom
apartments available in the 25 most
populous Canadian cities.11 Its methodology can be
summarized as follows: first, the ZumperPro
tool compiles active listings data based on a
combination of proprietary listings from
numerous brokers; then, it aggregates the data
with information from PadMapper’s
established connections with multiple renters
and landlords.12 Every listing posted
on PadMapper is then verified to create an accurate inventory, which is
used to calculate the median rent of each municipality. Lastly, PadMapper creates a public map
based on the summarized information (Figure 4).13 Since this is a part of PadMapper’s business model,
the company did not share more detailed information about its methodology and data sources.
9 PhD Candidate at the University of British Columbia 10 All key informants agreed upon sharing their name, organization, and the information used for this report. 11 https://blog.padmapper.com/category/rental-data/ 12 https://blog.padmapper.com/2017/09/11/our-methodology-empowering-the-renter-with-data/; Chrystal Chen: Marketing Manager at Zumper/PadMapper 13 https://blog.padmapper.com/canadian-rent-trends
Figure 4. PadMapper Map of May 2019 Canadian Rent Report13
Figure 5. Map of Asking Rents of One-bedroom Units in 800m Radius Around SkyTrain Stations (March to May 2018)
Quantitative Rhetoric
A private data analyst at Quantitative Rhetoric publishes a Monthly Rental Report that provides a
statistical overview of median rental prices in the City of Vancouver.14 It displays the housing price
index and the median rental price in Vancouver based on data web scraped from a major
secondary market rental site (the author did not reveal the exact name). To achieve this, the author
deduplicates the aggregated database of listings using URL, number of bedrooms,
number of bathrooms, and location as indicators. The author then inspects the output of the first few reports to
examine its validity and performs diagnostics, such as checking the rent distribution, to ensure the quality of the
data. Finally, the author applies hard cutoffs to exclude outliers and reports medians for a more robust
analysis.
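A rent distribution diagnostic of the kind mentioned above can be as simple as a binned count inspected by eye. The sketch below is only an illustration: the rents, the hard cutoffs, and the bin width are assumed values, not those used by Quantitative Rhetoric, whose code was not shared.

```python
from collections import Counter

# Hypothetical monthly asking rents in dollars, including two obvious outliers.
rents = [1650, 1700, 1725, 1800, 1850, 1850, 1900, 2050, 2100, 2400, 150, 18000]

LOW_CUTOFF, HIGH_CUTOFF = 500, 8000  # assumed hard cutoffs
BIN_WIDTH = 250                      # assumed bin width


def apply_cutoffs(values, low, high):
    """Keep only values inside the hard cutoffs (inclusive)."""
    return [v for v in values if low <= v <= high]


def rent_distribution(values, bin_width):
    """Count rents per bin; each bin is labelled by its lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


kept = apply_cutoffs(rents, LOW_CUTOFF, HIGH_CUTOFF)
distribution = rent_distribution(kept, BIN_WIDTH)

# Text histogram: one row per bin, one '#' per listing.
for lower_edge in sorted(distribution):
    upper_edge = lower_edge + BIN_WIDTH - 1
    print(f"{lower_edge:>5}-{upper_edge:<5} {'#' * distribution[lower_edge]}")
```

A distribution with isolated bins far from the bulk of the data, or with an unexpected second peak, is a signal to revisit the cutoffs or the deduplication rule before publishing a median.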
MountainMath
MountainMath’s private data analyst created a set of maps to illustrate rental listing information in
Metro Vancouver visually.15 The author first collected rental listings data posted from March
to May 2018 on a major online platform, alongside TransLink’s SkyTrain station location
data. Then, the author produced three maps to demonstrate the results: (1) median rental price of
one-bedroom units in Metro Vancouver (Figure 5), (2) median rental price of two-bedroom units (Figure 6),
and (3) number of one-bedroom rental listings (Figure 7) within 800 metres of SkyTrain stations.
14 http://quantitativerhetoric.com/category/vancouver-real-estate.html 15 https://doodles.mountainmath.ca/blog/2018/06/21/skytrain-rents/
Figure 6. Map of Asking Rents of Two-bedroom Units in 800m Radius Around SkyTrain Stations (March to May 2018)
Figure 7. Map of One-bedroom Monthly Listings in 800m Radius Around SkyTrain Stations (March to May 2018)
These maps by MountainMath show monthly asking rents near SkyTrain
stations in Metro Vancouver between March and May 2018. A future analysis of
Monitoring Market Rents in Metro Vancouver Transit Locations may include examining asking rents
near other types of rapid-transit stations (e.g. B-lines) and their change over time (e.g. before and after
installation of a new rapid transit line).
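The 800-metre station analysis described above can be approximated with a great-circle distance test. The following Python sketch is not MountainMath's code: the station names, coordinates, and rents are hypothetical placeholders, and the haversine formula with a spherical Earth is an assumed simplification (a production analysis would more likely use a GIS library and projected coordinates).

```python
from math import asin, cos, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation
RADIUS_M = 800              # catchment radius used in the maps above

# Hypothetical station locations (latitude, longitude); not real SkyTrain data.
stations = {
    "Station A": (49.2800, -123.1200),
    "Station B": (49.2300, -123.0000),
}

# Hypothetical one-bedroom listings with coordinates and asking rents.
listings = [
    {"lat": 49.2810, "lon": -123.1210, "rent": 2000},
    {"lat": 49.2795, "lon": -123.1150, "rent": 2200},
    {"lat": 49.2840, "lon": -123.1200, "rent": 1900},
    {"lat": 49.2305, "lon": -123.0010, "rent": 1600},
    {"lat": 49.2600, "lon": -123.0600, "rent": 1500},  # far from both stations
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def median_rent_near_stations(stations, listings, radius_m):
    """Median asking rent of listings within radius_m of each station (None if no listings)."""
    result = {}
    for name, (s_lat, s_lon) in stations.items():
        nearby = [
            item["rent"]
            for item in listings
            if haversine_m(s_lat, s_lon, item["lat"], item["lon"]) <= radius_m
        ]
        result[name] = median(nearby) if nearby else None
    return result


station_medians = median_rent_near_stations(stations, listings, RADIUS_M)
print(station_medians)
```

Repeating the calculation for listings collected in different periods would support the before-and-after comparisons suggested above.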
Statistics Canada
Statistics Canada, a federal agency, published the report Measuring Private
Short-term Accommodation in Canada in March 2019.16 This report
provides an overview of the data sources and methods used to examine
the private short-term accommodation market in Canada. It examines the
short-term rental market over time and its economic impact
across Canada through statistical analyses. These analyses were
based on short-term rental information including total revenue generated,
fees paid, reserved days, the percentage of listing types, the percentage
of unit types, and more.
The report’s methodology section states, “Statistics Canada acquired data from a third-party market
research firm (AirDNA LLC) that specializes in providing data analytics for private short-term
accommodation rental platforms. The acquired data included web scraped listing information, in addition to
derived or modelled revenue data for all properties within the geographic boundaries of Canada.”
Then, the data was edited for consistency by removing duplicate records, filling in missing
information, and correcting errors such as incorrect provincial classification. The number of listings collected was compared
to estimates published by researchers and academics who had scraped listing information
themselves. This procedure was followed by preliminary data estimates derived from Airbnb, the
market’s largest firm. The generated revenue was aggregated to
include the host and guest fees charged by the intermediary platform. Assumptions were made to
estimate the guest fees paid since Airbnb did not provide this information.
In summary, Statistics Canada did not web scrape the data itself but obtained data from a third-party
market research firm, AirDNA LLC, to measure private short-term accommodation in Canada. In
selecting AirDNA LLC to conduct this study, Statistics
Canada followed standard Government of Canada procurement policies.17 Although Statistics
Canada does not have additional information about other market research firms that use
web scraping, this report reinforces the possibility of legally conducting web scraping
analysis through a third party.18
16 https://www150.statcan.gc.ca/n1/en/pub/13-605-x/2019001/article/00001-eng.pdf?st=3pjPF54N 17 https://www.canada.ca/en/services/business/doing-business/how-to-sell/procurement-policies.html 18 https://www150.statcan.gc.ca/n1/pub/13-605-x/2019001/article/00001-eng.htm
Legality
Academic Research (Craigslist vs. 3Taps)
The availability of literature on the legality of web scraping is limited. However, there is an academic
paper that specifically explored the legal issue of web scraping regarding the collection of
information on rental housing markets in the United States.19
Its authors describe three major legal considerations in using web scraping (copyright,
trespassing, and archives) through a legal dispute between Craigslist and 3Taps.20 3Taps,
an online platform for the exchange of goods, services, and information, was accused of
web scraping Craigslist’s rental listing data in 2013 for competitive commercial purposes and
displaying it on its website. As a result, three significant points emerged. First, the
United States District Court for the Northern District of California decided that it is not a violation of copyright to scrape
publicly available data such as Craigslist listings. Also, research is a non-commercial fair use that
neither repackages nor relists the data. Second, 3Taps was sued only after Craigslist sent a
cease-and-desist letter and blocked its IP addresses. The judge ruled that 3Taps trespassed on
Craigslist’s servers specifically by ignoring the cease-and-desist letter and using a proxy to circumvent
the restrictions that forbade it from accessing the servers. Third, other organizations such as the
Internet Archive scrape and snapshot Craigslist’s web pages along with various other websites.21
Researchers can collect rental listings from these snapshots instead of from Craigslist directly,
though the snapshots may lack specific details. Ultimately, 3Taps lost the case and agreed to pay Craigslist
$1,000,000 over ten years and to permanently stop scraping content from the website.
Trader Corporation vs. CarGurus
In 2017, Trader Corporation sued CarGurus for scraping its listings and photos.22 The main questions
were:
1. Were the Trader vehicle photos protected by copyright and were the photos owned by Trader?
2. Did CarGurus infringe Trader's copyright?
3. If statutory damages are applicable, what is the appropriate amount?
The Ontario Superior Court of Justice found that Trader owned over 150,000 of the photos and confirmed that CarGurus infringed Trader’s copyright by using them for commercial purposes. As a result, CarGurus had to delete all of the web scraped photos and agree not to reproduce Trader’s photos in the future.
19 Boeing, G., & Waddell, P. (2017). New insights into rental housing markets across the United States: Web scraping and analyzing Craigslist rental listings. Journal of Planning Education and Research, 37(4), 457-476. 20 http://www.dmlp.org/sites/dmlp.org/files/2012-07-20-Craigslist%20Complaint.pdf 21 http://archive.org/ 22 https://ca.practicallaw.thomsonreuters.com/w-007-6508?transitionType=Default&contextData=(sc.Default)&firstPage=true&bhcp=1&ignorebhwarn=IgnoreWarns
Century 21 vs. Zoocasa
In 2011, Century 21 Canada sued Zoocasa for scraping their online real estate listings.23 The British
Columbia Supreme Court concluded that Zoocasa, a search engine and aggregator of real estate
listings, violated the terms of use of Century 21, which forbids the copying or reuse of its real estate
listings.
Zoocasa was accused of breaching Century 21’s terms of use, trespassing, and infringing
Century 21’s copyright by web scraping information and reproducing it on Zoocasa’s website. The
court stated that the terms
of use on a website are legally binding even if there is no opt-in provision for web surfers to indicate
that they agree to the conditions. Although Zoocasa only had to pay $1,000, this legal case
confirms that a website’s fine print can be enforced even if its users are not asked to accept the terms
by explicitly clicking on them.
Conclusion
According to the literature review, courts have penalized web scraping when it was used for
commercial purposes. However, arguments varied case by case, and the team found no litigation in which
a private company sued a government authority. The United States District Court for the Northern District of California
stated that rearranging and publishing publicly available data is not a violation of copyright,
especially when the data is used for non-commercial purposes. Also, Statistics Canada produced a
report on short-term accommodation through a private market research firm that web scraped
Airbnb. Nevertheless, web scraping should be approached cautiously, and researchers
should not violate a website’s terms of use in conducting such studies.
23 https://www.stikeman.com/en-ca/kh/canadian-technology-ip-law/that-a-wrap-bc-supreme-court-enforces-website-terms-of-use-and-validates-browse-wrap-agreements-in-century-21-v-zoocasa
Metro Vancouver is interested in identifying options to improve access to market rental housing
information, in particular near rapid transit. The current measure based on Statistics Canada’s census
data is updated infrequently and limits the examination of detailed information. As an alternative, web
scraping from online marketplace platforms is becoming more popular among private data analysts
and organizations to collect data and produce monthly reports on rental housing. For instance,
PadMapper compiles active housing listings data by web scraping and then produces monthly rent reports
by aggregating datasets for the 25 largest Canadian cities. Similarly, Statistics Canada produced a
report which measured private short-term accommodations in Canada by using third-party web
scraping services. However, the legitimacy of automated web scraping as a research
method has often been disputed, as it may violate websites’ terms of use. Therefore, to gather more
comprehensive and accurate market rental data, the team suggests two methodologies:
A) Web Scraping Pilot Program
B) Collaborative Rental Housing Data Platform
A) Web Scraping Pilot Program
If Metro Vancouver seeks a timely and effective method to scan the housing market at any given
time to compile rental data, then conventional web scraping is a viable option. It provides a
compiled list of rental listings in real time. The compiled database would list units with
features found on all listings, such as the number of bedrooms and bathrooms, property
size, price, location, type of housing, availability, and more. The database can then be used for
further analysis, which can guide municipalities in creating policy responses.
The method described above has multiple benefits. Web scraping is a digital and automated
process that is cost-effective, current, and easy to manage in comparison to telephone or paper
surveys. This method can reduce the administrative burden of data collection. Furthermore, better
access to real-time data also provides more in-depth insight into areas of inequality through detailed
investigation of housing and transportation costs. The Housing and Transportation Cost Burden Study
by Metro Vancouver is an excellent example: it demonstrates how the accuracy and
comprehensiveness of housing information can impact residents’ affordability and livability.24
However, this methodology may require more time than traditional surveys to manage data
accuracy, such as filtering out duplicate and false listings.
24 http://www.metrovancouver.org/services/regional-planning/PlanningPublications/HousingAndTransportCostBurdenReport2015.pdf
There are two other important considerations in this option:
1. Legality of Web Scraping
Online platform websites such as PadMapper and Craigslist have strict
restrictions on their uses. If Metro Vancouver uses web scraping in its future
analyses, it should carefully review each website’s terms of use (also called
terms of service or terms and conditions). Alternatively, Metro Vancouver may
request that a third-party firm conduct research similar to that done by
Statistics Canada on the topic of short-term accommodation. Although each
website’s terms of use differ, the following sections show, as an example, how
a private housing listings platform, PadMapper, states its acceptable
uses.
Acceptable uses
Several sites, including PadMapper, allow visitors to access, view, use, download, and print their
site content subject to the following conditions:25
1. You may use the Services, and download, access and print the Site Content, only in
reasonable limited quantities for your personal, non-commercial use;
2. You may not modify the Site Content;
3. Any displays or print outs of the Site Content must be marked “© PadMapper, Inc. 2012-2016.
All rights reserved,” and
4. You may not remove or alter any copyright, trademark or other proprietary rights notices
that have been placed in the Site Content.
Not acceptable uses
However, the website also states the following to restrict web scraping of its content:
“You shall not: Harvest or otherwise collect any data, information or Site Content from the Website, including by using manual or automated software, devices, or other processes to ‘crawl,’ ‘scrape’ or ‘spider’ any page of the Website or Services to copy, obtain, propagate, distribute or misappropriate any User Data or Site Content.”
Similarly, Craigslist states: “You agree not to copy/collect CL content via robots, spiders, scripts,
scrapers, crawlers, or any automated or manual equivalent (e.g. by hand)” unless it is licensed by
them in a written agreement.26 Although these terms and conditions may seem restrictive,
some of them may not be legally binding; a lawyer can provide more accurate
information on their enforceability.
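As a first-pass aid to the review described above, a short script can flag clauses in a terms-of-use document that mention automated collection, so that they can be passed to legal counsel. This is only a triage sketch and no substitute for legal advice: the keyword list, the sample text, and the function name are hypothetical, and a clause can restrict scraping without using any of these words.

```python
import re

# Assumed keywords that often signal restrictions on automated collection.
SCRAPING_TERMS = ("scrape", "crawl", "spider", "robot", "harvest", "automated")

# Hypothetical terms-of-use excerpt; not quoted from any real website.
SAMPLE_TERMS = (
    "You may view listings for personal use. "
    "You shall not crawl, scrape or spider any page of the website. "
    "Contact us for licensing."
)


def flag_clauses(terms_text):
    """Return (sentence, matched keywords) pairs for sentences that mention scraping."""
    # Split after a period or semicolon that is followed by whitespace.
    sentences = re.split(r"(?<=[.;])\s+", terms_text)
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        hits = [term for term in SCRAPING_TERMS if term in lowered]
        if hits:
            flagged.append((sentence.strip(), hits))
    return flagged


flagged = flag_clauses(SAMPLE_TERMS)
for sentence, hits in flagged:
    print(f"REVIEW: {sentence}  (matched: {', '.join(hits)})")
```

The output is a list of clauses to read, not a verdict on whether scraping is permitted.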
25 https://www.padmapper.com/tos 26 https://www.craigslist.org/about/terms.of.use.en-us
2. Multidisciplinary Team
Another challenge is the capital cost of establishing and maintaining the web scraping system. As
websites change over time and organizations revise their terms of use, different roles are required to
ensure that the collected data is accurate and that the volume of extracted data is consistent. Figure
8 shows a simplified process model of what it would look like if Metro Vancouver were to develop its own
web scraping system.
The first action would be to assemble a multidisciplinary team to mitigate some of the risks and
unknowns. Based on the research on existing practices, a team would require the following members at
a minimum:
• Planner (data to request)
• IT analyst (internal data storage)
• Data scientist (data manipulation)
• Legal counsel (laws & regulations)
Once the team is complete, the planner can lead the team by creating a list of features to be
extracted from rental listing websites, based on the team’s objectives. Then, programmers can design
and build a system to import information from multiple websites. Here, the team may include not only
the major platforms such as Craigslist and Kijiji, but also platforms in other
languages (e.g. Chinese, Punjabi, Farsi, Korean, Spanish, Japanese, and more) and different types of
platforms (e.g. Facebook, institutional websites, media platforms, blogs) depending on the availability
of time and resources. After a comprehensive database is created and deduplicated, planners and
data scientists can analyze the compiled information and visualize the results.
Figure 8. Simplified Process for a Potential Metro Vancouver Web Scraping Project
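The deduplication and analysis steps described above can be illustrated with a minimal sketch. It assumes listings have already been collected into a common record format; the field names, the matching rule (same normalized address, bedroom count, and rent), the area labels, and the sample records are all hypothetical and are not drawn from any real listing website.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Listing:
    source: str    # platform label, illustrative only
    address: str
    bedrooms: int
    rent: int      # asking rent in dollars per month
    area: str      # hypothetical area label, e.g. a station catchment


def dedup_key(listing: Listing) -> tuple:
    """Assumed matching rule: the same normalized address, bedroom count,
    and rent are treated as one unit advertised on several platforms."""
    normalized = " ".join(listing.address.lower().replace(".", "").split())
    return (normalized, listing.bedrooms, listing.rent)


def deduplicate(listings: list[Listing]) -> list[Listing]:
    """Keep the first occurrence of each key and drop later duplicates."""
    seen: dict[tuple, Listing] = {}
    for listing in listings:
        seen.setdefault(dedup_key(listing), listing)
    return list(seen.values())


def median_rent_by_area(listings: list[Listing]) -> dict[str, float]:
    """Group asking rents by area and return the median for each area."""
    by_area: dict[str, list[int]] = {}
    for listing in listings:
        by_area.setdefault(listing.area, []).append(listing.rent)
    return {area: median(rents) for area, rents in by_area.items()}


# Invented sample records for illustration only.
sample = [
    Listing("craigslist", "123 Main St.", 1, 1800, "Station A"),
    Listing("kijiji", "123  main st", 1, 1800, "Station A"),  # same unit
    Listing("craigslist", "456 Oak Ave", 1, 2000, "Station A"),
    Listing("kijiji", "789 Pine Rd", 2, 2600, "Station B"),
]
unique = deduplicate(sample)
medians = median_rent_by_area(unique)
```

In practice the matching rule would need more care (unit numbers, re-posted listings with changed rents, translated addresses), which is one reason the data scientist role is listed above.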
Figure 9. Simplified Example Framework of the Collaborative Platform
B) Collaborative Rental Housing Data Platform
The Collaborative Rental Housing Data Platform is a more advanced version of the first recommendation,
the Web Scraping Pilot Program. This platform goes beyond web scraping various listing websites:
it seeks to create user agreements and licensing strategies with websites or data companies. By
creating a partnership among multiple stakeholders, the proposed collaborative platform will engage
with other data providers to combine external data (e.g. transportation, occupation, health, and built
environment) with rental housing data. As a result, Metro Vancouver and municipal partners can
acquire a better understanding of the combined cost burden of housing and transportation.
For instance, trends and patterns across multiple datasets (e.g. rental market, redevelopment pattern,
land use changes, walkability, health levels, origin-destination travel patterns, job locations, and
more) can help planners and decision-makers examine the following questions:
• Are rents near transit-oriented locations increasing at a faster rate than in
other neighbourhoods that are away from rapid transit?
• How do rents vary along rapid transit corridors?
• What are the community impacts caused by the construction of rapid transit in
different neighbourhoods?
A collaborative platform would help Metro Vancouver to answer these questions. It would also
allow Metro Vancouver to share the resourcing burden of data collection and analysis
(cost, resources, time) with partners across multiple disciplines.
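As an illustration of the first question above, the following sketch compares the growth in median asking rent for listings near rapid transit with the growth elsewhere. The grouping labels, the two period labels ("t0" and "t1"), and all rent values are invented for the example; how "near transit" is defined (e.g. a distance buffer around stations) is an assumption that the platform partners would need to agree on.

```python
from statistics import median

# Invented asking rents (dollars per month), keyed by (group, period).
rents = {
    ("near_transit", "t0"): [1500, 1600, 1700],
    ("near_transit", "t1"): [1650, 1760, 1900],
    ("elsewhere", "t0"): [1300, 1400, 1500],
    ("elsewhere", "t1"): [1350, 1470, 1560],
}


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) * 100 / old


def median_rent_growth(group: str, start: str, end: str) -> float:
    """Growth in median asking rent for one group between two periods."""
    return pct_change(median(rents[(group, start)]),
                      median(rents[(group, end)]))


near = median_rent_growth("near_transit", "t0", "t1")
away = median_rent_growth("elsewhere", "t0", "t1")
faster_near_transit = near > away
```

Medians are used here rather than means so that a few unusually expensive listings do not dominate the comparison; this is a design choice for the sketch, not a method prescribed by the report.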
During this process, the collaborative platform (Figure 9) must address four key concerns:
• What matters the most? What are the main goals/objectives of these analyses?
• What are the minimum requirements to achieve these goals? How in-depth and
comprehensive should the platform be?
• How will the data be used? Who are the potential users?
• How will the platform monitor the system and its data?
These questions can be best answered by a group of professionals who are involved with the
collaboration. By coordinating discussions and providing a wide range of support, Metro Vancouver
could gather more comprehensive and in-depth information on the primary and secondary rental
market in the region. However, organizing such a collaborative platform could be challenging in
terms of organizational management, time, and financial support.
The cost of market rental housing acts as a key measure in fulfilling Metro Vancouver’s goal of providing
diverse and affordable housing choices in the region. As housing availability and cost data change
every day, a more accurate and more frequently updated data source than the traditional surveys is required.
Through literature review and informal interviews, this report compared the advantages and disadvantages of
the census survey, the CMHC survey, and web scraping. As a result, the research team recommends two options:
A) Web Scraping Pilot Program and B) Collaborative Rental Housing Data Platform. For a short-term and
quick analysis, we recommend option A. However, the multidisciplinary and extensive data platform of option
B will provide more significant benefits in the long run. The fundamental basis of both options is web
scraping. The current legality of web scraping varies depending on websites and cases; however, Statistics
Canada has recently published a report on short-term rental accommodations using third-party web
scraping services, which indicates increased acceptance of this methodology. Collecting and monitoring
asking market rents in Metro Vancouver will encourage more equitable, affordable, and sustainable growth
in the region, particularly in areas along rapid transit corridors.
https://upload.wikimedia.org/wikipedia/commons/5/5b/TransLink_SkyTrain_departs_Stadium-Chinatown_station_in_Vancouver%2C_British_Columbia%2C_Canada.png
https://youtu.be/OTyxsjBEscY