CIM Internet study - methodology Page 1
CIM Internet study
Edition October 2015
Methodology
CIM – Centre d’Information sur les Médias
Avenue Herrmann-Debrouxlaan 46 - 1160 Bruxelles
Tél. : 32 2 661 31 50 - Fax: 32 2 661 31 69
E-mail : [email protected]
URL : http://www.cim.be
Content
PREFACE
INTRODUCTION
WHAT IS CIM ?
STAKEHOLDERS
CIM INTERNET TECHNICAL COMMITTEE
THE RESEARCH INSTITUTE
SUBSCRIBERS
RULES OF THE STUDY
THE CIM INTERNET STUDY
1 THE WEB TRAFFIC CENSUS SCRIPTING
1.1 SCRIPTING OF HTML PAGES
1.2 SCRIPTING OF HTML5 OR AJAX PAGES
1.3 SCRIPTING OF HTML5, HYBRID AND NATIVE APPS
1.4 SCRIPTING OF STREAMING
2 THE INTERNET TRAFFIC CENSUS REPORTING
2.1 PUBLIC RESULTS
2.2 GEMIUSPRISM
2.3 GEMIUSOLA
2.4 OFFICIAL CIM REPORTS
2.5 METRICS
3 CIM INTERNET AUDIENCE STUDY
3.1 INTRODUCTION
3.2 METHODOLOGICAL ISSUES WITH PANEL RESEARCH
3.2.1 Representativity of the sample
3.2.2 The multi-cookie problem
3.2.3 The multi-browser problem
3.2.4 The multi-pc problem
3.2.5 The multi-user problem
3.3 RECRUITMENT ALGORITHM
3.4 ONLINE QUESTIONNAIRE
3.5 RESPONSE RATE
3.6 VALIDATION RULES
3.6.1 Validation rules for the CIM Internet PC Panel
3.6.2 Validation rules for the Smartphone Panel and the Tablet Panel
3.7 REAL USERS AND REAL USER ESTIMATES
3.7.1 Real Users methodology
3.7.2 Representativeness
3.7.3 Concept of the algorithm
3.7.4 Real Users algorithm with usage of BEAST model
3.7.5 Calculation of Real Users number in steps
3.7.6 Weighting the cookie panel: Theoretical example
3.7.7 Real Users (site-centric) for websites
3.7.8 Socio-demography of Panelists
3.7.9 Preparing the PC Cookie Panel data
3.7.10 Rules for Panel weighting
3.7.11 Results
3.8 UNIVERSE
3.8.1 PC Cookie Panel
3.8.2 Smartphone Panel
3.8.3 Tablet panel
3.9 DATA PROCESSING
3.9.1 Gross versus net panel
3.9.2 Website weighting (behavioral representativeness)
3.9.3 Socio-demographic weighting
3.10 FUSION OF THE CIM INTERNET PANELS
3.10.1 Data sources
3.10.2 Metrical Clustering
3.10.3 Behavioral Distance
3.10.4 Nearest Neighbor Merging
3.10.5 Weighting
4 PUBLICATION OF CIM INTERNET AUDIENCE RESULTS
4.1 PARTICIPATION IN THE STUDY
4.2 CONDITIONS FOR PUBLICATION IN PLANNING FILES
4.3 ACCESS TO THE RESULTS
4.4 MONTHLY EXCEL REPORTS
4.5 THE GEMIUSEXPLORER AUDIENCE REPORTING TOOL
4.6 MONTHLY MEDIA PLANNING TOOLS
5 CIM INTERNET SOFTWARE PANEL
6 CONTROLS OF THE CIM INTERNET STUDY
6.1 CHECKING TRAFFIC DATA
6.1 CHECKING AUDIENCE DATA
ANNEXES
ANNEX 1
6.2 CIM INTERNET PC COOKIE PANEL QUESTIONNAIRE
6.3 CIM INTERNET TABLET PANEL AND CIM INTERNET SMARTPHONE PANEL
ANNEX 2: SOCIO-DEMOGRAPHIC VARIABLES IN REPORTING
ANNEX 3: CALCULATION OF SOCIAL GROUPS
PREFACE
Always remember, it’s better to arrive late than to arrive ugly.
More than ever, in the era of big data, our industry needs to consolidate itself through clean environments
and currencies. A year ago we went back to basics and started again from a blank page, and we introduced a new
institute, Gemius, to the Belgian market. Gemius is present in some twenty markets around the globe,
including advanced countries such as Israel, Turkey and Denmark.
Our ambition, through this collaboration, is to rethink and redesign the Internet study within CIM. And it is
not one study that we decided to design but five different studies, intended to meet the various needs of
the market before any form of convergence.
The desktop Site Centric study was delivered to the market at the end of 2014, covering more than 600 sites,
sections or groups. The results are available via gemiusOLA and are provided free of charge to subscribers.
Alongside the traditional Site Centric indicators, we provide new data on time spent, and we are also
creating a measurement of mobile (tablets and smartphones) as well as a measurement of streaming.
Starting from the Site Centric study, we launched several data collections at the end of last year in order to
build the cookie panels for the desktop and mobile measurements. The time has now come to roll out the results
from the desktop, smartphone and tablet cookie panels. The market had been waiting for these results for a long
time; they are now available separately in the planning tools. The next phase will be the fusion of the
desktop and mobile (smartphone and tablet) cookie panel data, in order to obtain an accurate view of the
consumption patterns of the media families present in the measurement.
Tablets, smartphones, desktops, streaming... we know that our ecosystem is more complex to approach,
more varied and more difficult to measure. And that is without counting the environments that do not accept
the tags required for the Site Centric studies and the Cookie Panel. Yet these environments are sometimes
important for agencies, media and advertisers. This is why we are producing a
Software Panel for the desktop measurement. Once this panel is certified, we will fuse it with the Cookie Panel
so as to obtain one data set.
In the age of big data it is crucial to be able to ground our market on a certain number of currencies.
CIM data will remain fundamental in this respect, to align us and to consolidate the market. The Slow
Moving Data that come from empirical studies, notably through panels, do not compete with the Fast
Moving Data used and exploited in the Data Management Platforms of agencies, media or
advertisers. These two families of data feed each other and are enriched by their relative and
specific strengths. The work continues within the committee, and we foresee the next
developments in a software panel for mobile as well as the possibility of auditing, measuring and
optimizing advertising campaigns in real time on indicators such as coverage of the target group. This
last point is an example of how Slow Moving Data will enrich Data Management
Platforms and the targeting criteria marketed through Big Data.
Jean-Michel Depasse
President of the CIM Internet Technical Committee
INTRODUCTION
Since 2000, CIM has been responsible for traffic measurement on Belgian and Luxembourgian websites
and, since 2005, for Belgian Internet audience measurement. From 2000 to 2011 the CIM Internet
study was outsourced to Douwère/Metriware, for both the traffic part (Metriweb) and the audience
part (Metriprofil). From 2012 to May 2014, TNS Media was in charge of the CIM Internet study, and only
traffic data were published. Since June 2014, Gemius has been in charge of the new integrated traffic and
audience CIM Internet study.
As the Internet constantly evolves, the scope of the CIM Internet measurement has been enlarged
over time. The CIM Internet study currently collects census data on webpages (regular as well as html5
responsive webpages), audio and video streams, html5 and native applications on PCs, tablets and
smartphones. A separate panel is established for each device type to estimate the Internet audience.
Gemius is also developing a software panel to allow, for the first time, traffic and audience estimates
for non-subscribing websites.
The CIM Internet study currently consists of these elements:
- A census measurement for subscribed websites (on PC, tablet, smartphone and other devices)
- A census measurement for subscribed html5 & native apps (on tablets, smartphone and others)
- A PC cookie panel representing the PC audience
- A tablet cookie panel
- A smartphone cookie panel
Since the traffic coming from other devices (Connected TV, gaming devices, …) is still very low, it
is counted together with the biggest panel, the PC cookie panel.
Other elements are still under development (to be delivered during 2015):
- A census measurement for streaming from subscribed publishers (on PC, tablet, smartphone and
other devices)
- A total surfing universe, based on the fusion of PC audience, tablet panel and smartphone panel
- A software panel on Windows PCs measuring surfing on non-subscribed websites
Other elements may be added in the future once a technical solution becomes available:
- An audience measurement for apps
- A software panel version for other operating systems than Windows: Apple, …
- A software panel on tablets and smartphones measuring web pages and apps
The different elements in the CIM Internet study, split up into traffic and audience, are shown in the
image below:
WHAT IS CIM ?
Created in 1971 from the merger of OFADI (first authentication agency for the distribution of press
titles in Belgium) and CEBSP (first Belgian agency for audience measurement), the CIM is an association
that aims to provide reference figures for the Belgian advertising market. The data collected by CIM
are primarily intended for members who co-finance the studies. However, some results are also
available to the general public. These are published on the CIM website: www.cim.be
CIM members consist of advertisers, intermediaries (advertising and media agencies) and the media.
These members meet in the General Assembly, where the votes are distributed among the different
individual industry associations and members, so that all the interests in the advertising market are
fairly represented.
STAKEHOLDERS
CIM Internet Technical Committee
The Internet Technical Committee oversees the execution of the CIM Internet study. This committee
was established in 1999. At the time of publication, the Committee was composed as follows:
Chairman : Jean-Michel DEPASSE (Mindshare)
Members :
Katrien Berte (Mediahuis)
Céline Branders (Rossel)
Dominique Catry (De Persgroep Publishing)
Alain Collet (Omnicom Media Group)
Saskia Cuperus (Roularta)
Philippe Degueldre (Pebble Media)
Pierre Dubois (RTBF)
Marie-Christine Georges (Mediabrands)
Kim Gils (Medialaan)
Quentin Huyberechts (BNP Paribas Fortis)
Stéphanie Radochitzki (Space)
Noëlle Stevens (RTL Belgium)
Nicolas Vanderseypen (Isobar)
Alexis Wautot (IPM Group)
Staff members:
General Manager: Stef Peeters
Senior Project Manager: Paul Vanrespaille
Assistant Project Manager: Nicolas Schönau
The staff members take care of the relations with the subscribers and, together with the research
institute, they monitor the correct implementation of all technical requirements.
The research institute
Since 1 June 2014, Gemius has been the research institute responsible for the data collection, analysis and
publication of the CIM Internet study results. Gemius specializes in internet measurement and
provides the currency study in several European countries (more information at www.gemius.com).
Subscribers
The tactical internet studies that are discussed in this publication are co-financed by the subscribers.
Only these companies have access to the full results of this publication. The technical partners are
contractually prohibited from passing the information on to third parties. Moreover, the members of the CIM are
required to handle the data with care and may use it only in the context of their normal
commercial activities. Companies wishing to join the CIM can obtain all necessary information from
the CIM staff ([email protected]) or on the website www.cim.be.
An up-to-date list of subscribers can be found on the CIM website:
FR: http://www.cim.be/fr/internet/liste-des-souscripteurs
NL: http://www.cim.be/nl/internet/lijst-van-intekenaars
Rules of the study
The CIM Internet technical committee sets the rules for the CIM Internet study. These are approved
by the CIM general board.
The most recent version of these rules can always be found on the CIM website:
FR: http://www.cim.be/fr/internet/reglement-internet
NL: http://www.cim.be/nl/internet/reglement-internet
THE CIM INTERNET STUDY
1 The web traffic census scripting
The CIM Internet traffic measurement is based on a ‘site centric’ approach. This means that the
measurement requires the co-operation of the site and application owner to install measurement
codes during the start-up of the project.
All subscribing websites are required to implement a small JavaScript tag into the source code of their
website. Each time a webpage is requested by a browser, a call is sent to one of the measurement
servers of Gemius (the research institute). These servers retain the number of page requests and
identify the browser.
If a browser visits one of the participating sites for the first time, a third-party cookie is installed on the
hard disk of the device. A cookie is a small text file that contains a unique numerical code, the browser
ID, which the Gemius server installs in the browser of the visitor of the webpage. If the installation
succeeds, the cookie is sent to Gemius every time the surfer visits a scripted website. This cookie
makes it possible to link several page requests and visits coming from one unique browser in an anonymous way,
independently of the websites that are visited.
Some browsers do not accept cookies, because of default or user-defined settings. In that case, the
classic site centric measurement system can only measure a page request. As there is no way to identify
the browser, no information can be collected on Unique Browsers (UBs) and visits. This is a growing
problem in site centric measurement studies, so research institutes have had to look for additional
ways to identify browsers.
Gemius uses its proprietary BrowserID technology, on top of cookies, to identify more browsers. The
BrowserID technology combines local storage identifiers and third-party cookies to better identify
browsers used by internet users. In the future, BrowserID may incorporate more variables (such as first-party
cookie, UserAgentString, IP class) in order to assign traffic to individual browsers more efficiently.
The BrowserID technology has reduced the percentage of unidentified
page views from 16% to 4%.
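As a rough illustration of combining two identifier sources, the sketch below falls back from a third-party cookie ID to a local storage ID and computes the share of page views left unidentified. It is not Gemius's proprietary BrowserID logic: the function names, the field names (`cookieId`, `localStorageId`) and the order of preference are assumptions made for this example only.

```javascript
// Hypothetical sketch, NOT the proprietary BrowserID implementation.
// Assumption: prefer the third-party cookie ID, else use the local storage ID.
function resolveBrowserId(signals) {
  const { cookieId = null, localStorageId = null } = signals;
  return cookieId || localStorageId || null;
}

// Share of page views for which no identifier could be resolved.
function unidentifiedShare(pageViews) {
  if (pageViews.length === 0) return 0;
  const unidentified = pageViews.filter((pv) => resolveBrowserId(pv) === null);
  return unidentified.length / pageViews.length;
}

// Made-up sample data: four page views, one without any identifier.
const samplePageViews = [
  { cookieId: "c-1", localStorageId: "ls-1" },
  { cookieId: null, localStorageId: "ls-2" },
  { cookieId: null, localStorageId: null },
  { cookieId: "c-4", localStorageId: null },
];

console.log(unidentifiedShare(samplePageViews));
```

In this made-up sample the second page view would be unidentified with cookies alone; the local storage fallback is what keeps it attributable to a browser.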
The Internet user’s browser returns these identifiers when subsequent connections to the collection
server are made, which makes it possible to identify Page Views made by the same Internet user. Page
Views generated by a given BrowserID are grouped into a logical set, meant to reflect one
uninterrupted Visit to the given website. This serves as a base for determining the number of Visits.
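The grouping of Page Views into Visits can be sketched as follows. The example assumes the 30-minute inactivity threshold given in the definition of a Visit in the Metrics section of this document. The record fields (`browserId`, `site`, `timestamp` in milliseconds) and the function name are hypothetical, and the production rules applied by Gemius may differ.

```javascript
// Sketch only: counts visits per (browserId, site) pair using a 30-minute
// inactivity threshold. Field names are hypothetical.
const VISIT_GAP_MS = 30 * 60 * 1000;

function countVisits(pageViews) {
  // Sort a copy by timestamp so the input order does not matter.
  const sorted = [...pageViews].sort((a, b) => a.timestamp - b.timestamp);
  const lastSeen = new Map(); // "browserId|site" -> timestamp of last page view
  let visits = 0;
  for (const pv of sorted) {
    const key = `${pv.browserId}|${pv.site}`;
    const previous = lastSeen.get(key);
    // A new visit starts on the first page view of a pair, or after an
    // interruption of more than 30 minutes.
    if (previous === undefined || pv.timestamp - previous > VISIT_GAP_MS) {
      visits += 1;
    }
    lastSeen.set(key, pv.timestamp);
  }
  return visits;
}

const MINUTE_MS = 60 * 1000;
const exampleViews = [
  { browserId: "b1", site: "news", timestamp: 0 },
  { browserId: "b1", site: "news", timestamp: 10 * MINUTE_MS },
  { browserId: "b1", site: "news", timestamp: 45 * MINUTE_MS }, // 35-minute gap
  { browserId: "b2", site: "news", timestamp: 5 * MINUTE_MS },
  { browserId: "b1", site: "sport", timestamp: 12 * MINUTE_MS },
];

console.log(countVisits(exampleViews));
```

Page views without a resolvable identifier cannot be grouped this way, which is why unidentified traffic yields page requests but no Visits or Unique Browsers.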
1.1 Scripting of html pages
The Gemius measurement script is a JavaScript tracking code with a unique identifier per site or
section and several additional fields that can contain more page-specific information. The JavaScript
connects to the Gemius servers. The script is asynchronous, which
means that the response time of the Gemius servers does not affect the loading of the
webpage itself. The script supports both mobile and regular webpages (in HTTP and HTTPS), as well as
applications built in HTML5 code.
<script type="text/javascript">
<!--//--><![CDATA[//><!--
var pp_gemius_identifier = 'IDENTIFIER';
var pp_gemius_extraparameters = new Array('lan=EN', 'key=keyword', 'subs=subsection', 'free=free_field');
// lines below shouldn't be edited
(function(d,t) {try {var gt=d.createElement(t),s=d.getElementsByTagName(t)[0],l='http'+((location.protocol=='https:')?'s':'');
gt.setAttribute('async','async');gt.setAttribute('defer','defer');
gt.src=l+'://gabe.hit.gemius.pl/xgemius.js';
s.parentNode.insertBefore(gt,s);} catch (e) {}})(document,'script');
//--><!]]>
</script>
The script contains a unique ID code or identifier for each site and section. Four additional
parameters are available to describe the content of the webpage. Only the first parameter, the
language of the site, is mandatory.
These fields are used to describe the content of the page that is being measured.
Lan = Language. Possible values: FR, NL, EN, GE, LU, OTHER.
Key = Keyword: value to describe the content of the page. Specification of the information in
the section.
A keyword is used:
- To include additional info on site structure that can’t be derived from the section only.
E.g.: Section sports – Keyword: Tour de France
- To recreate commercial packages.
The number of different keywords is limited to 200 per site or section.
Subs = Subsection: free field for subscribers who wish to add more detailed info on their
website structure.
Free: a free field to be used for whatever information the subscriber wants to collect.
Language and keyword are reported to the market in all Gemius tools, whereas subsection and the
free parameter are only available for internal analysis by a subscriber in the online tool gemiusPrism.
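The parameter rules above can be illustrated with a small helper that builds the array assigned to `pp_gemius_extraparameters`. This is only a sketch: `buildExtraParameters` and `withinKeywordLimit` are hypothetical helpers and not part of the Gemius script, and the actual validation performed by CIM or Gemius may be stricter or different.

```javascript
// Sketch: builds the extra-parameters array in the 'name=value' form used by
// the measurement script. Hypothetical helper, not part of the Gemius script.
const ALLOWED_LANGUAGES = ["FR", "NL", "EN", "GE", "LU", "OTHER"];

function buildExtraParameters({ lan, key, subs, free }) {
  // The language parameter is the only mandatory one.
  if (!ALLOWED_LANGUAGES.includes(lan)) {
    throw new Error(`Unsupported or missing language: ${lan}`);
  }
  const params = [`lan=${lan}`];
  if (key !== undefined) params.push(`key=${key}`);
  if (subs !== undefined) params.push(`subs=${subs}`);
  if (free !== undefined) params.push(`free=${free}`);
  return params;
}

// Checks the limit of 200 different keywords per site or section.
function withinKeywordLimit(keywords, limit = 200) {
  return new Set(keywords).size <= limit;
}

const extraParameters = buildExtraParameters({
  lan: "NL",
  key: "Tour de France",
  subs: "sports",
});
console.log(extraParameters);
```

Building the array in one place makes it easier to keep the mandatory language value present on every page and to keep the set of keywords under control.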
Subscribers can consult more information on the measurement script and the correct tagging of a
website in the logged-in section of the CIM website:
FR: http://www.cim.be/fr/internet/documentation-technique-pour-les-souscripteurs
NL: http://www.cim.be/nl/internet/technische-documentatie-voor-intekenaars
For each site that enters the study, the CIM staff checks if the tagging is implemented correctly,
following the rules of the CIM Internet study.
1.2 Scripting of html5 or AJAX pages
An adapted script is available to measure websites using HTML5 or AJAX. Subscribers with such
websites can request the documentation from the Permanent Structure by e-mail.
1.3 Scripting of html5, hybrid and native apps
HTML5 apps are scripted with the normal page scripting technology. Two SDKs (“Software
Development Kit”) are available to tag native or hybrid apps: one for iOS native and hybrid apps, and
one for Android native and hybrid apps.
This documentation and an app-specific identifier are sent to every subscriber who declares a
native or hybrid app.
1.4 Scripting of streaming
The tagging of audio and video streams is done in the video player. Two separate sets of
documentation are available: one for Flash players with a JS controller and one for
JavaScript-controlled players.
They can be found on the CIM website:
FR: http://www.cim.be/fr/internet/documentation-technique-pour-les-souscripteurs
NL: http://www.cim.be/nl/internet/technische-documentatie-voor-intekenaars
The stream tagging also requires an identifier that is unique to each player. During the set-up phase the script
was synchronous, which carried an inherent risk of delaying the execution of the video or audio
on the web page. This synchronous script was implemented and tested by some subscribers.
At the request of CIM, Gemius developed an asynchronous script, which does not interfere with the
execution of the stream (video or audio). Test subscribers using the synchronous script were asked to
migrate to the asynchronous script within a year of the launch of the streaming measurement (to
be implemented by January 2016 at the latest).
2 The Internet traffic census reporting
The results of the CIM Internet site centric measurement are published in reports and software tools.
2.1 Public results
The CIM publishes limited traffic results on its public website.
- For Belgium, Page Views and visits (sessions) are available for Belgian and worldwide traffic
within the gemiusOLA platform.
- In 2015 limited results on applications and streaming will be added.
- For Luxembourg, there are daily reports on Page Views, Sessions and Unique Browsers.
- The technical data (information about the browser, device, OS, etc.) is aggregated and displayed
on country level on the technical ranking website. Belgian results are available via
http://www.rankingbe.com and Luxembourgian results are available via http://www.ranking.lu.
2.2 gemiusPrism
gemiusPrism is an online web analytics tool for subscribers. In gemiusPrism publishers only have access
to the results of their own site(s), not to the data of any other site. The tool reports near-live traffic
data for participating sites (with a maximum delay of 2 hours). This tool reports raw data on pages,
streaming and applications measurement for each site, app or stream. These data are detailed but
unfiltered: they allow publishers to improve their website and review their tagging.
gemiusPrism results are for internal usage only. Subscribers cannot publish any data nor communicate
results to third parties.
A site that uses sections will basically see results for the entire website. However, a standard report
“Content > gemiusTraffic structure” is available to analyse the traffic by section. It is equally possible
to make a selection based on one section, and look at any report available in the gemiusPrism tool.
Users can find a general manual on the usage of the gemiusPrism tool and a how-to-use guide, tailored
to the Belgian market, on the subscriber section of the CIM website:
FR: http://www.cim.be/fr/internet/documentation-technique-pour-les-souscripteurs
NL: http://www.cim.be/nl/internet/technische-documentatie-voor-intekenaars
2.3 gemiusOLA
gemiusOLA is an online analytical tool for subscribers only. It contains a limited, filtered set of data for
all active sites and sections on a daily, weekly and monthly level. The most recent data concern the
previous day. gemiusOLA offers time, page views, visits, unique browsers and derived metrics.
gemiusOLA also contains Real User estimates (RUEst). These RUEst are available on a daily level,
offering the most recent audience data available in the current month. They are limited to the 18+
universe. For foreign traffic, there are also RUEst calculated with the same methodology.
In 2016, Real Users (surfers) will also become available in OLA. When the audience data for a month is
made available, the Real User estimates will be replaced by the final Real Users metric.
The results are also grouped by media groups, agencies, thematic groups and co-branded websites.
Applications and streaming are also reported on a separate tab sheet.
Users can find a general manual on the usage of the gemiusOLA tool on the subscriber section of the
CIM website:
FR: http://www.cim.be/fr/internet/documentation-technique-pour-les-souscripteurs
NL: http://www.cim.be/nl/internet/technische-documentatie-voor-intekenaars
gemiusOLA offers the possibility to use an API to integrate data automatically into another platform.
For a technical description of the API, please contact CIM at [email protected]. Only CIM can give access
to the API.
In August 2015 the streaming results had not yet been officially published in gemiusOLA.
2.4 Official CIM reports
The official CIM reports are available on the CIM site for subscribers only.
Luxembourgian subscribers have access to traffic reports on a daily, weekly and monthly basis. They
contain detailed results down to the second level of subsection.
For the Belgian market only monthly audience reports are available. They are meant as an archival
back-up for the planning software. Traffic reports are planned to become available later in 2015.
2.5 Metrics
The CIM Internet traffic measurement offers a large range of results. All software tools have a manual
that extensively explains the available metrics. A list of definitions of metrics is also available in the
rules of the CIM Internet study.
Below we offer an overview of the most commonly used metrics and those that raise the most
questions.
Page view: a file or combination of files sent to a unique browser after a request of this
unique browser was received through the server of the site (formerly known as ‘Page
Request’).
Visit: a series of Page Views done by the same visitor within the same site, without
interruption of more than thirty (30) consecutive minutes. Also known as ‘session’.
UB (Unique Browser): a browser identified on the basis of a CIM Internet cookie or a unique
BrowserID (BID).
Real User estimates: a temporary estimation of the number of people that are represented
by the UBs. This estimation is based on the goodBID algorithm.
Real Users: an estimate of the number of people that are represented by the UBs. This
estimate is based on the goodBID algorithm and is made when the month has ended and all
cookies and browserIDs have been classified as goodBID or badBID.
Aggregation of dates: daily, weekly (Monday to Sunday) and monthly. Results are not aggregated
for other, arbitrarily selected time periods.
The CIM rules contain more information on validation rules. This document is available on the CIM
website:
FR: http://www.cim.be/fr/internet/reglement-internet
NL: http://www.cim.be/nl/internet/reglement-internet
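As an illustration of the Visit definition, the sketch below groups the page views of one browser on one site into visits. It is a minimal example and not the Gemius implementation: the function name and the sample timestamps are assumptions, only the 30-minute rule comes from the definition.

```python
from datetime import datetime, timedelta

# From the Visit definition: an interruption of more than 30 minutes ends a visit.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(page_view_times):
    """Count visits for one browser on one site (hypothetical helper)."""
    visits = 0
    previous = None
    for ts in sorted(page_view_times):
        # A new visit starts at the first page view, or after a gap
        # of more than 30 minutes since the previous page view.
        if previous is None or ts - previous > VISIT_TIMEOUT:
            visits += 1
        previous = ts
    return visits


example_times = [
    datetime(2015, 10, 1, 9, 0),
    datetime(2015, 10, 1, 9, 20),
    datetime(2015, 10, 1, 9, 50),   # exactly 30 minutes later: same visit
    datetime(2015, 10, 1, 10, 21),  # 31 minutes later: new visit
]
print(count_visits(example_times))  # 2
```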
3 CIM Internet Audience study
3.1 Introduction
The CIM traffic measurement counts all browsers and all the pages that are requested. Yet still a
dimension is missing: who is behind a browser? The CIM cookies and BrowserIDs cannot tell us
anything about gender, age or other socio-demographic characteristics of the surfer, although they are
essential in media planning.
The CIM Internet measurement is therefore fundamentally different from all other tactical studies of
the CIM: in those studies, the media reach in the universe is extrapolated from a perfectly known
sample. The CIM Internet traffic measurement is a site centric census
measurement. The number of page views, sessions and unique browsers is measured very reliably but
the socio-demographic reality behind it has to be added. The CIM Internet study has opted to set up
and fuse several ‘cookie’ panels which are representative samples of the sub-universes:
- A PC panel
- A tablet panel
- A smartphone panel
The respondents are invited through a pop-up that is linked with their cookie or browserID. They are
asked to fill out an online survey containing questions about their internet infrastructure, their surfing
behavior and their socio-demographic characteristics. For these panels both the browsing behavior
(from their cookie) and socio-demographic characteristics are known, and therefore the profile for
each site with sufficient visitors can be calculated. The questionnaires can be consulted in the annexes
of this document.
The advantage of this approach is two-fold:
- The relationship between socio-demographic data and browsing behavior is derived from the
cookie measurement and is therefore not dependent on the person’s recollection, the
correct completion or any other active intervention of the surfer.
- The socio-demographic data of person A, questioned because of a visit to site X, also apply to
site Y and Z if they were also visited by person A.
The latter is important for smaller sites: they also benefit from the data collected on larger sites. In this
way, it is also economically feasible for them to obtain reliable profile data.
Using a panel, instead of a survey that is limited in time, has the advantage that the monthly data will
always be based on the same sample on which the census data are collected. On the other hand, after
some time, panel members will have to be invited to check if their answers are still up-to-date.
This panel approach assumes that a solution is found for different issues. These will be discussed in the
next section.
3.2 Methodological issues with panel research
3.2.1 Representativity of the sample
Unlike any other tactical CIM study, a scientific prior sampling of participants is not possible with online
surveys that are triggered via a pop-up on participating websites. It is therefore unlikely that such an
online survey is perfectly representative. For the CIM internet study this is overcome by using a gross
panel that is split up into a net panel and a reserve panel. The net panel tries to represent the internet
population as closely as possible and in fact draws the sample a posteriori. The reserve panel contains
the rest of the surfers that have filled out the intake survey.
3.2.2 The multi-cookie problem
Surfers who regularly delete their cookies are registered with several consecutive cookies. When in
the PC Cookie Panel a sample of visitors is questioned, these surfers are recognized with only one
cookie. Because of this incomplete time series, their profile is only connected to a part of the actual
sites visited. This underestimation, and the fact that surfers who delete cookies might have a different
profile, may result in a distortion of the socio-demographic profiles.
Gemius bypasses this problem by only showing the invitation pop-up to cookies that are at least 7 days
old.
3.2.3 The multi-browser problem
Surfers can use more than one browser on a certain device. This is corrected by estimating the so-
called J coefficient (see below 3.7).
3.2.4 The multi-pc problem
Surfers can use the internet on more than one device, e.g. a desktop at work, and a laptop at home.
This is also corrected by estimating the so-called J coefficient (see below 3.7).
During 2015, a fusion of the home and work consumption for panel members declaring to surf both at
home and at work/school will solve this problem.
3.2.5 The multi-user problem
When several people use the same device, their internet traffic cannot always be attributed
correctly. Therefore, surfers are only admitted to any of the internet panels if they either:
- use their own login,
- use the same login as other people, but represent at least 50% of the total internet usage on
that device.
People who do not meet this condition are screened out.
3.3 Recruitment algorithm
Surfers are invited to join the CIM Internet PC cookie panel, the CIM Internet Tablet cookie panel or
the CIM Internet Smartphone cookie panel through pop-up invitations shown on the participating
websites.
These pop-ups follow a randomized pattern based on an algorithm that uses the following rules:
- Each day a subset of the Gemius cookie database is selected. This subset is maximum 45% of
the entire cookie database.
- From this subset the invitation is only shown for cookies that meet several conditions:
  - The cookie must be at least 7 days old. This prevents surfers who delete their cookies
    after each session from becoming part of the panel.
  - If the cookie was already invited to participate, then the cookie should no longer be
    invited in the same timeslot. The timeslots used are:
    1. 08.00 h - 11.59 h
    2. 12.00 h - 14.59 h
    3. 15.00 h - 18.59 h
    4. 19.00 h - 21.59 h
    5. 22.00 h - 07.59 h
  - In total, a given cookie cannot receive more than 5 invitations.
- All subscribing websites participate and they cannot impact the odds of being invited.
- The recruitment is continuous.
- Only sites for which the target audience are children under 12 years can be excluded
from the recruitment.
- The CIM Internet committee can ask to raise the odds for certain websites with a
specific profile. In 2014 this was done for websites in French and for websites that
attract younger surfers (12-24 years old). In 2015 this was done for websites that
attract younger French surfers (12-24 years old).
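The cookie-level conditions above can be sketched as a single eligibility check. This is an illustration under assumptions, not the Gemius algorithm: the function names and parameters are hypothetical, and the timeslot rule is read here as "a cookie is never invited twice in the same timeslot, whatever the day".

```python
from datetime import datetime, timedelta

MIN_COOKIE_AGE = timedelta(days=7)  # cookie must be at least 7 days old
MAX_INVITATIONS = 5                 # at most 5 invitations per cookie


def timeslot(moment):
    """Return the timeslot number (1-5) of a datetime, following the list above."""
    hour = moment.hour
    if 8 <= hour < 12:
        return 1
    if 12 <= hour < 15:
        return 2
    if 15 <= hour < 19:
        return 3
    if 19 <= hour < 22:
        return 4
    return 5  # 22.00 h - 07.59 h


def may_invite(cookie_created, past_invitations, now, in_daily_subset):
    """Decide whether the invitation pop-up may be shown (hypothetical helper)."""
    if not in_daily_subset:
        return False
    if now - cookie_created < MIN_COOKIE_AGE:
        return False
    if len(past_invitations) >= MAX_INVITATIONS:
        return False
    # Assumption: a timeslot already used for this cookie is blocked on any day.
    used_slots = {timeslot(t) for t in past_invitations}
    return timeslot(now) not in used_slots


now = datetime(2015, 10, 10, 9, 30)
created = datetime(2015, 10, 1)
print(may_invite(created, [datetime(2015, 10, 8, 13, 0)], now, True))  # True
print(may_invite(created, [datetime(2015, 10, 8, 10, 0)], now, True))  # False
```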
3.4 Online questionnaire
Two versions of the online questionnaire are available: one version for the CIM Internet PC cookie
Panel and a shortened version adapted for both the CIM Internet Tablet Panel and the CIM Internet
Smartphone Panel. The questions are presented sequentially on separate screens. The full surveys are
available in Annex 1.
3.5 Response rate
The average response rate in online research is significantly lower than in classical offline research.
Moreover, in studies such as CIM Internet there is no control over who participates: a random sample
of cookies is drawn, so there is considerable uncertainty about the representativeness of the actual
respondents.
This disadvantage is offset by the low cost, speed, ease of use, and direct error-free coding of results.
Internet panels try to maximize the benefits of online research and minimize the disadvantages.
Moreover the CIM Internet study makes the connection between the online survey and the cookie or
BrowserID, which makes it possible to link the surfing behavior to the survey.
For this study a random selection of surfers on Belgian websites is recruited on a continuous basis. As
the study measures surfing behavior on these sites, the sampling frame perfectly fits the universe. The
response to the study can be measured in different ways.
In the start-up phase the emission ratio was set higher to build up the panel. Once the minimum size
was reached, the emission ratios were lowered, so that the panel is kept stable by substituting panel
members that delete their cookie and thereby leave the panel.
Gross response rate = the number of people who start the questionnaire compared to the number of
invitations shown. From January to March 2015 the gross response rate was 2% for PC and 5,6% for
mobile devices.
- This percentage should not be compared with off-line research since the pop-up will be
shown up to 5 times (each on a different site and on another part of the day) to the same
person, on the same device and browser as well as on other devices and browsers.
- A second, more important complication is that an attempt to present an invitation does not
always mean that the surfer can actually see the invitation. The degree to which the lay-out
of the pop-up contrasts with the lay-out of the webpage will impact its visibility on the site.
Not everyone who starts the survey gets to the end. From June to December 2014 the drop-out rate
was 39% (the number of people who start but do not end the survey, divided by the number of people
that answered the first question). Since this was particularly the case for young respondents (who do
not always know e.g. the degree and profession of the MRI) the survey was shortened for 12-17 year
olds in July 2015.
Not everyone who completed the questionnaire can be used in the panel.
- Some participants are filtered out on validation: on average 17,5% of respondents who
finished the survey were rejected because of validation rules (see 3.6).
- Some participants were filtered out based on the "good cookie" condition: their cookie was
not active before, during and after the month of reporting.
Gross panel = all respondents meeting all of the previous criteria make up the gross panel. This is the
pool of useable panelists.
Net panel = a selection from the gross panel based on the socio-demographic objective derived from
the establishment study (see under 3.9).
Reserve = the part of the gross panel not used for the net panel for a given month. These people may
be used in a later month.
Two tables are shown below. The first shows the size of the PC cookie panel used after half a year and
after 1 year of recruiting, the second shows the recruitment of mobile panel members after 9 months.
PC                                  16/06/2014 – 31/12/2014     16/06/2014 – 31/07/2015
                                    Number       %              Number       %
Questionnaires started              125 626      100,00%        262 438      100,00%
Dropout rate                        49 597       39,00%         126 003      48,00%
Completed                           76 029                      136 435
Filtered out on validation*         13 329       17,50%         27 464       20,13%
Filtered out on condition "good"    18 695       24,60%         57 066       41,83%
Gross panel (validated/active)      44 005                      51 905
Net                                 24 005       54,60%         24 034       46,30%
Reserve                             20 000       45,40%         27 871       53,70%

Mobile (Smartphone + Tablet)        16/06/2014 - 31/07/2015
                                    Number       %
Questionnaires started              289 235
Dropout rate                        237 443      82,00%
Completed                           51 792
Filtered out on validation*         15 232       29,40%
Filtered out on condition "good"    20 646       39,90%
Gross panel (validated/active)      15 914
Phone                               9 830        61,80%
Tablet                              6 084        38,20%
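The arithmetic behind these tables can be retraced with a short sketch. The function and key names are hypothetical. The counts are those of the tables, and the percentages for the two filters are taken relative to the completed questionnaires, which is the base that reproduces the published figures.

```python
def funnel(started, dropped_out, rejected_validation, rejected_good_cookie):
    """Retrace the recruitment funnel from started questionnaires to gross panel."""
    completed = started - dropped_out
    gross_panel = completed - rejected_validation - rejected_good_cookie
    return {
        "completed": completed,
        "gross_panel": gross_panel,
        "dropout_rate": dropped_out / started,
        # Both filters are expressed as a share of completed questionnaires.
        "validation_rate": rejected_validation / completed,
        "good_cookie_rate": rejected_good_cookie / completed,
    }


pc_2014 = funnel(125_626, 49_597, 13_329, 18_695)
mobile = funnel(289_235, 237_443, 15_232, 20_646)
print(pc_2014["completed"], pc_2014["gross_panel"])  # 76029 44005
print(mobile["completed"], mobile["gross_panel"])    # 51792 15914
```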
3.6 Validation rules
In an online survey some inconsistencies can be made impossible by design. For example: if a person
lives on his own, the question “main responsible for the income (MRI)” is not asked but automatically
set to ‘Yes’. However, some inconsistencies remain possible and lead to rejection of a respondent: for
example, a 17-year-old with a doctorate or a 106-year-old participant is very improbable.
The responses of foreign residents and the interviews that were completed on PCs in cybercafés are
discarded. Foreign residents do not belong to the target group. Surveys from cybercafés can only be
associated with session cookies and hence do not provide useful profile data.
There are two sets of rules: one for the CIM Internet PC cookie Panel and one for both the CIM Internet
Smartphone Panel and the CIM Internet Tablet Panel.
3.6.1 Validation rules for the CIM Internet PC Panel
internet usage less often than once a month
lives outside Belgium (STOP INTERVIEW)
fills in the questionnaire on device other than PC or laptop
age below 12
age above 99
wrong postal code
is not the main user of the device and doesn't have their own profile
share profile on desktop computer
share profile on portable computer
fills in the questionnaire in a public place on desktop computer
age below 15 and not a student
age 15-17 and (not a student or not working part-time)
age below 20 and member of the general management
age below 30 and pensioner or retired
not working but declares filling in the questionnaire at work
age below 18 and is a main income bringer
age below 17 and has a secondary level of education (or higher)
age below 20 and has a higher (bachelor) education (or higher)
age below 22 and has a master education (or higher)
3.6.2 Validation rules for the Smartphone Panel and the Tablet Panel
lives outside Belgium
internet usage less often than once a month
device other than tablet or smartphone
uses this tablet to visit internet websites less than once a month
uses this smartphone to visit internet websites less than once a month
respondent is not the main user of this tablet
respondent is not the main user of this smartphone
age below 12
age above 99
wrong post-code
age below 15 and not a student
age 15-17 and (not a student or not working part-time)
age below 17 and has a secondary level of education (or higher)
age below 20 and has a higher (bachelor) education (or higher)
age below 22 and has a master education (or higher)
3.7 Real users and Real User estimates
The CIM Internet Traffic study not only measures browsers but also tries to identify the number of
people (internet surfers) behind these browsers. One surfer can use multiple devices, and therefore
multiple browsers, or use multiple browsers on one single device. Gemius refers to surfers as ‘Real
Users’ and defines this metric as the number of internet users who visited at least one of the
participating websites within the analyzed month. Because the final Real User data are only available
after the end of the month, Real User estimates are published on a daily basis in OLA; they are
calculated with a fairly comparable methodology. The methodology applies equally to
websites/sections and streaming.
3.7.1 Real Users methodology
The basic hypothesis for the calculation of real users is that the number of Real Users of a website is
not the same as the number of browsers measured on a given website. The measured number of
browsers differs from the number of Real Users for several reasons. When counting unique
browsers, cookie deletion and multi-device usage lead to an overestimation of the number of Real
Users, while device
sharing leads to an underestimation. Gemius uses the BEAST algorithm (Browser Estimation Algorithm Standard)
for the CIM Internet study in Belgium.
The method is based on the assumption that it is possible to define a subset of browsers (that show
cookie or browserID persistence) that are representative for all cookie files. The browsers from this
subset can be used for the calculation of the website’s reach. By knowing the average number of page
views per browserID in this representative group and knowing the total number of page views
measured on the website, one can calculate the number of Real Users on particular websites using the
following steps:
• Estimation of the number of browsers that would be registered for a studied website if there
was no cookie deletion.
• Calculation of the relative reach of such a website among all the measured websites.
• Calculation of the number of Real Users for a studied website based on the website’s reach
and the number of internet users in the country.
3.7.2 Representativeness
The traffic generated by this special group of browsers should have the same characteristics as the
traffic generated by all browsers. To achieve that, Gemius has conducted a set of analyses and defined
the general rules that must be fulfilled by a browser in order to be included in this group. These are
browsers with identifiers that exist throughout the entire studied month. This means that they
existed both before the start and after the end of that month (e.g. 2 weeks after the analyzed month).
3.7.3 Concept of the algorithm
The first step in the “Real User” algorithm is the evaluation of the number of real browsers that access
the Internet. The main Gemius algorithm is called “Estimated Browser IDs” (EBID) and defines the set
of browsers that have not been deleted in the analyzed month. A browser is a “good BrowserID” if it
was measured before, during and up to 14 days after the analyzed month.
The Gemius “Estimated Browser IDs” algorithm is based on the axiom that users who delete cookies or
browser IDs behave like users who do not. Once the set of “good BrowserIDs” (BIDgood) is defined, the
page views generated by that set (PVgood) are counted. The number of Estimated Browsers
(EB) is then derived from the number of all the PVs measured in the universe and the average number of
PVs per good BrowserID (PVgood / BIDgood):
$$EB = \frac{PV \cdot BID_{good}}{PV_{good}}$$
Using the EBID model we can establish the number of unique browsers for each single site in the study.
To obtain a Real User number we need an average number of users per unique browser (called J) for the
entire internet. This value is derived from the number of Estimated Browsers for the entire universe
and from the number of Real Users coming from the external structural study. By multiplying the
number of Estimated Browsers counted on a particular website by this J-factor, we get the number of
Real Users for a particular website:
$$RU_{Site} = EB_{Site} \cdot J_{Internet}$$
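The two formulas of this section can be chained in a few lines. The sketch below uses purely hypothetical numbers, and it assumes that J is obtained by dividing the Real Users of the structural study by the Estimated Browsers of the entire universe, which is how the text describes its derivation.

```python
def estimated_browsers(pv_total, bid_good, pv_good):
    """EB = PV * BIDgood / PVgood."""
    return pv_total * bid_good / pv_good


def j_factor(real_users_universe, eb_universe):
    """Average number of users per estimated browser (assumed: RU / EB)."""
    return real_users_universe / eb_universe


# Hypothetical universe: all measured websites together.
eb_universe = estimated_browsers(pv_total=1_000_000, bid_good=20_000, pv_good=400_000)
j_internet = j_factor(real_users_universe=25_000, eb_universe=eb_universe)

# Hypothetical single website.
eb_site = estimated_browsers(pv_total=100_000, bid_good=3_000, pv_good=60_000)
ru_site = eb_site * j_internet

print(eb_universe, j_internet, eb_site, ru_site)  # 50000.0 0.5 5000.0 2500.0
```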
3.7.4 Real Users algorithm with usage of BEAST model
Calculating the number of Real Users according to the above described method cannot be completed
immediately after the researched month is finished. The collection of traffic data from 2 additional
weeks is needed before it is possible to collect all cookies that also have traffic in the month after the
month that is being reported.
To shorten this waiting period, Gemius developed a probabilistic model called BEAST, which estimates,
just after the end of the given month, the number of browsers that the EC model would yield.
By applying a “probability function” to the analyzed data, every browser is classified in a group of “good
cookies” with a certain probability.
The BEAST model uses historical data (preferably from the past 3 months) to predict how probable it
is that the given browser is about to be a “good cookie” in the EC model. Historical data are used for
the creation of a mathematical model that applies an analytical weight to each browser. The model is
defined via a function based on internet activity per browser: it assigns to each browser a weight equal
to the share of good browsers among all browsers with the same internet activity characteristics. The
analysis is limited to browsers that were created before the analyzed month (so-called
‘not-bad-cookies’, as this is a prerequisite to become a good cookie). All remaining browser identifiers
(created during the analyzed month) do not fulfil the basic requirements of the good cookie definition,
so their analytical weights equal 0.
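The weighting idea can be illustrated with a deliberately simplified sketch. It is not the BEAST model itself: the activity buckets ("low", "high"), the function names and the historical records are assumptions made for the example. Only the principle comes from the text: the weight equals the historical share of good browsers with the same activity profile, and browsers created during the analyzed month get a weight of 0.

```python
from collections import defaultdict


def fit_good_share(history):
    """history: (activity_bucket, was_good) pairs observed in past months."""
    good = defaultdict(int)
    total = defaultdict(int)
    for bucket, was_good in history:
        total[bucket] += 1
        good[bucket] += int(was_good)
    return {bucket: good[bucket] / total[bucket] for bucket in total}


def analytical_weight(bucket, created_before_month, good_share):
    """Probability-like weight that a browser will turn out to be a good cookie."""
    if not created_before_month:
        return 0.0  # created during the analyzed month: can never be good
    return good_share.get(bucket, 0.0)


history = [
    ("low", False), ("low", False), ("low", True), ("low", False),
    ("high", True), ("high", True), ("high", False), ("high", True),
]
shares = fit_good_share(history)

# Browsers seen on a website in the analyzed month: (bucket, created before month?)
browsers = [("high", True), ("low", True), ("high", False)]
bid_good = sum(analytical_weight(b, before, shares) for b, before in browsers)
print(shares, bid_good)
```

Summing the weights gives the estimated number of good BrowserIDs without waiting for the 14 days after the month.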
When each browser has an analytical weight, the calculations are run as described below in the section
“Calculation of Real Users number in steps”. In this way the number of Real Users is established just
after the analyzed month, without waiting until 14 days have passed.
3.7.5 Calculation of Real Users number in steps
a) Calculate the number of page views generated within the analyzed month by all browsers
registered – here: PV
b) Next, estimate the number of good BrowserIDs that are assumed to have existed
throughout the entire researched month by summing the analytical weights of all the browsers
that visited the analyzed website. This number is denoted here as: BIDgood.
c) Calculate the number of page views generated by the browsers defined in point b) above – here:
PVgood.
d) Calculate the number of browsers that would be registered for the researched website if there was
no cookie / browser identification deletion, in accordance with the formula below:
$$EB_{Website} = \frac{PV \cdot BID_{good}}{PV_{good}}$$
e) In the same manner (but by replacing the researched website with the set of all websites
subscribed to the study), calculate the number of browsers that would be registered for all
websites taking part in the site-centric research if there was no cookie / browser identification
deletion:
$$EB_{Total} = \frac{PV \cdot BID_{good}}{PV_{good}}$$
f) Calculate the relative reach of the researched website in the given month according to
the following formula:
$$Reach_{Website} = \frac{EB_{Website}}{EB_{Total}}$$
g) If P signifies the population of Internet Users within the researched month on all measured
websites, the number of Real Users RU visiting the researched website in the given month will be
calculated according to the following formula:
$$RU_{Website} = Reach_{Website} \cdot P \cdot Reach_{Study}$$
The information about the population of Internet Users (𝑃) and the reach of the study (how many
internet users surf on Belgian sites measured by CIM) is gathered using external Structural CIM Studies.
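Steps d) to g) can be put together as follows. This is a sketch with hypothetical input values. It assumes that the final formula multiplies the relative reach by the population of Internet Users P and by the reach of the study, the two quantities named in the paragraph above.

```python
def real_users(site, total, population, reach_study):
    """site and total: dicts with the keys 'pv', 'bid_good' and 'pv_good'."""
    eb_website = site["pv"] * site["bid_good"] / site["pv_good"]    # step d
    eb_total = total["pv"] * total["bid_good"] / total["pv_good"]   # step e
    reach_website = eb_website / eb_total                           # step f
    return reach_website * population * reach_study                 # step g


# Hypothetical measurements for one website and for all websites in the study.
site = {"pv": 100_000, "bid_good": 3_000, "pv_good": 60_000}
total = {"pv": 1_000_000, "bid_good": 16_000, "pv_good": 400_000}

ru = real_users(site, total, population=1_000_000, reach_study=0.5)
print(ru)  # 62500.0
```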
3.7.6 Weighting the cookie panel: Theoretical example
Let’s imagine that there is a population of 260 Internet Users in an exemplary country.
The socio-demography of Internet Users is described by their gender (which is known from the offline
structural study). 110 of Internet Users are men and 150 are women.
Websites: there are 5 websites in the country with Gemius scripts (JIC-member sites).
Every Internet user visits at least one of those 5 scripted websites, which means that the total audience
of those 5 sites equals 260 Real Users.
3.7.7 Real Users (site-centric) for websites
The Real Users algorithm calculated that Website1 was visited by 110 Internet Users, Website2 by 40
Internet Users, Website3 by 90 Internet Users, Website4 by 80 Internet Users, and Website5 by 120
Internet Users. Those values are called “Real Users”.
Website Real Users
Website 1 110
Website 2 40
Website 3 90
Website 4 80
Website 5 120
[Figure: 260 internet users in the country (110 men, 150 women); 5 websites with scripts pasted]
3.7.8 Socio-demography of Panelists
The gender and place of living of the PC Cookie panelists are known.
Suppose that there is a PC Cookie Panel in the country with 10 members.
The socio-demographics of those panelists are shown in the table below.
Gender Place of living
Man Woman Region 1 Region 2 Region 3
Panelist 1
Panelist 2
Panelist 3
Panelist 4
Panelist 5
Panelist 6
Panelist 7
Panelist 8
Panelist 9
Panelist 10
3.7.9 Preparing the PC Cookie Panel data
It is also known which websites were visited by Cookie Panelists (see table below).
             Website 1  Website 2  Website 3  Website 4  Website 5
Panelist 1   X          X          X          -          -
Panelist 2   -          -          X          X          -
Panelist 3   X          -          -          -          X
Panelist 4   -          -          X          X          -
Panelist 5   -          -          -          X          -
Panelist 6   X          -          -          -          X
Panelist 7   -          -          X          -          -
Panelist 8   -          -          -          -          X
Panelist 9   X          -          -          -          X
Panelist 10  -          X          -          -          -
A Cookie Panel of 10 people must represent the whole population of 260 Internet Users. This means
that every panelist must have a weight applied that shows how many Internet Users he represents.
The Cookie Panel must be representative in terms of behavioral and socio-demographical
characteristics.
Behavioral representativeness of the Panel means that:
a) All panelists that visited Website1 (Panelists 1, 3, 6 and 9) must represent all Internet Users
on website1 (together 111 Internet Users).
b) All panelists that visited Website2 (Panelists 1 and 10) must represent all Internet Users
on Website2 (together 40 Internet Users).
c) All panelists that visited Website3 (Panelists 1, 2, 4 and 7) must represent all Internet Users
on Website3 (together 90 Internet Users).
d) All panelists that visited Website4 (Panelists 2, 4 and 5) must represent all Internet Users
on Website4 (together 80 Internet Users).
e) All panelists that visited website5 (Panelists 3, 6, 8 and 9) must represent all Internet Users
on Website5 (together 120 Internet Users).
It is also known that 110 of Internet Users in the country are men and 150 are women. To make
sure that the panel is representative for gender, it is necessary that:
a) All panelists that are male (Panelists 1, 2, 3 and 4) must represent 110 male Internet Users
in the Population.
b) All panelists that are female (Panelists 5, 6, 7, 8, 9 and 10) must represent 150 female
Internet Users in Population.
Every Cookie Panelist must have weights assigned so that all the above mentioned rules are met.
All those rules are presented in the table below.
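             Weight  Men  Women  Website1  Website2  Website3  Website4  Website5
Panelist 1   P1      X    -      X         X         X         -         -
Panelist 2   P2      X    -      -         -         X         X         -
Panelist 3   P3      X    -      X         -         -         -         X
Panelist 4   P4      X    -      -         -         X         X         -
Panelist 5   P5      -    X      -         -         -         X         -
Panelist 6   P6      -    X      X         -         -         -         X
Panelist 7   P7      -    X      -         -         X         -         -
Panelist 8   P8      -    X      -         -         -         -         X
Panelist 9   P9      -    X      X         -         -         -         X
Panelist 10  P10     -    X      -         X         -         -         -
SUM          260     110  150    110       40        90        80        120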
3.7.10 Rules for Panel weighting
The Internet Audience Measurement uses a RIM weighting process. After weighting, each panelist has
his weight assigned. For our example the weights are presented in the table below.
Panelist Weight
Panelist 1 40
Panelist 2 10
Panelist 3 30
Panelist 4 30
Panelist 5 40
Panelist 6 20
Panelist 7 10
Panelist 8 50
Panelist 9 20
Panelist 10 10
SUM 260
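RIM weighting is commonly implemented as iterative proportional fitting (raking): the weights are rescaled for one margin at a time, over and over, until all margins are approximated. The sketch below applies this to the example. It is an assumption-laden illustration, not the procedure used by Gemius: each website is treated as a two-cell dimension (visited / not visited), all names are hypothetical, and because several sets of weights satisfy the margins, the result need not coincide with the weights in the table above.

```python
PANEL = range(1, 11)
TOTAL = 260
MEN = {1, 2, 3, 4}
VISITORS = {
    "Website1": ({1, 3, 6, 9}, 110),
    "Website2": ({1, 10}, 40),
    "Website3": ({1, 2, 4, 7}, 90),
    "Website4": ({2, 4, 5}, 80),
    "Website5": ({3, 6, 8, 9}, 120),
}


def partition(members, target):
    """Split the panel into a cell and its complement, each with its own target."""
    rest = set(PANEL) - members
    return [(members, target), (rest, TOTAL - target)]


# Gender is raked last, so its margins hold on exit (up to rounding).
DIMENSIONS = [partition(m, t) for m, t in VISITORS.values()] + [partition(MEN, 110)]


def rim_weights(rounds=100):
    """Rake the panel weights towards all margins (simple fixed number of rounds)."""
    weights = {p: TOTAL / len(PANEL) for p in PANEL}
    for _ in range(rounds):
        for dimension in DIMENSIONS:
            for members, target in dimension:
                current = sum(weights[p] for p in members)
                factor = target / current
                for p in members:
                    weights[p] *= factor
    return weights


weights = rim_weights()
print(round(sum(weights[p] for p in MEN), 6))  # 110.0
print(round(sum(weights.values()), 6))         # 260.0
```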
Let’s combine the 2 tables together and add the information on the region of Panelists. This table
is presented below.
Weight Men Women Region 1 Region 2 Region 3 Website1 Website2 Website3 Website4 Website5
Panelist 1 40
Panelist 2 10
Panelist 3 30
Panelist 4 30
Panelist 5 40
Panelist 6 20
Panelist 7 10
Panelist 8 50
Panelist 9 20
Panelist 10 10
SUM 260 110 150 70 100 90 110 40 90 80 120
Notes:
- Website 2 was visited by 2 Panelists only. If the study rules define that socio-demographics are
shown only for websites with at least 3 Panelists, only results for websites 1, 3, 4 and 5 (which
were visited by 3 or more Panelists) will be shown.
- The value of “Real Users” for website 1 was 111, whereas after the weighting process
it equals 110. The (im)precision of the weighting process can lead to small differences between
those 2 values.
3.7.11 Results
How is the number of Internet Users calculated? Some examples are listed below.
1. RU for Website3.
Website3 was visited by Cookie Panelists 1, 2, 4 and 7. So the weights of those panelists must be
summed: 40 + 10 + 30 + 10 = 90 Real Users.
2. RU for Website3 for target group “man”
Website3 was visited by Cookie Panelists 1, 2, 4 and 7, but among them the male panelists are panelists
1, 2 and 4. So the weights of those three panelists must be summed: 40 + 10 + 30 = 80 Real Users
in target group “man”.
3. RU for Website 2.
Website2 was visited by only 2 Cookie Panelists: Panelist 1 and 10. This number of Panelists is too
small to use Panel data for RU calculation. No data about the number of Real Users for this website
will be shown.
4. RU for two websites together: website 1 and website 5.
Those websites were visited by the following Cookie Panelists: Panelist 1 (who visited only website1),
Panelists 3, 6 and 9 (those Panelists visited both sites) and Panelist 8 (who visited only website5).
To get RU for those sites, the sum of weights of Panelists 1, 3, 6, 8 and 9 must be taken. It equals: 40 +
30 + 20 + 50 + 20 = 160. So the total number of Real Users on those two websites is 160.
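The four examples above can be reproduced with a short sketch that sums panelist weights. The weights, genders and visits are those of the theoretical example; the function name and the way the 3-panelist threshold is applied (on all visiting panelists, before any target group filter) are assumptions.

```python
WEIGHTS = {1: 40, 2: 10, 3: 30, 4: 30, 5: 40, 6: 20, 7: 10, 8: 50, 9: 20, 10: 10}
MEN = {1, 2, 3, 4}
VISITS = {
    "Website1": {1, 3, 6, 9},
    "Website2": {1, 10},
    "Website3": {1, 2, 4, 7},
    "Website4": {2, 4, 5},
    "Website5": {3, 6, 8, 9},
}
MIN_PANELISTS = 3  # threshold used in the example notes


def real_users(sites, target=None):
    """Sum the weights of the panelists who visited at least one of the sites."""
    panelists = set().union(*(VISITS[s] for s in sites))
    if len(panelists) < MIN_PANELISTS:
        return None  # too few panelists: no result is shown
    if target is not None:
        panelists &= target
    return sum(WEIGHTS[p] for p in panelists)


print(real_users(["Website3"]))              # 90
print(real_users(["Website3"], target=MEN))  # 80
print(real_users(["Website2"]))              # None
print(real_users(["Website1", "Website5"]))  # 160
```

Taking the union of panelists before summing ensures that a panelist who visited both websites is counted only once.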
3.8 Universe
3.8.1 PC Cookie Panel
The PC cookie panel is weighted to a target audience which is derived from a structural study. The main
sources of this data are currently the CIM Press (2015 Q2), the CIM TV Other Screen Monitor (2014 Q4
and 2015 Q2) and the CIM HUB (2013).
The table below shows the socio-demographical targets for the PC population derived from the
internet population in the CIM Press study, combined with the penetration for PC in the CIM TV OSM.
Up to June 2015 the CIM HUB (2013) study was chosen to determine the socio-demographical
distribution for the PC population given the fact that the PC universe is more stable and the number
of participants in the HUB is higher than in the CIM TV OSM study.
PC Internet users (July 2015)
                              FR Profile %   NL Profile %   BE Profile %
Total 18+ PC Internet users   43,8%          56,2%          100,0%
Gender FR NL BE
Men 50,8% 52,0% 51,5%
Women 49,2% 48,0% 48,5%
Age FR NL BE
12-17 0,0% 0,0% 0,0%
18-24 14,2% 12,9% 13,5%
25-34 20,7% 19,0% 19,7%
35-44 21,0% 19,2% 20,0%
45-54 19,3% 21,8% 20,7%
55+ 24,9% 27,1% 26,1%
Professional activity FR NL BE
Active 57,8% 67,3% 63,1%
Not active 42,2% 32,7% 36,9%
Educational level FR NL BE
Primary + Secondary 63,2% 61,9% 62,4%
University + High school 36,8% 38,1% 37,6%
Nielsen FR NL BE
Nielsen I 1,0% 41,7% 23,9%
Nielsen II 2,8% 55,8% 32,6%
Nielsen III - NL 0,0% 2,2% 1,2%
Nielsen III - FR 25,1% 0,0% 11,0%
Nielsen IV 33,9% 0,2% 15,0%
Nielsen V 37,1% 0,1% 16,3%
From February 2015 onwards, profession (11 categories) and degree (6 categories) were added on a
national level to the weighting factors to guarantee reliable data on social groups.
Profession BE
Small commerce, freelance and industrial 5-, artisan and farmer 5,3%
Big commerce, freelance and industrial 6+, upper management and liberal profession 3,4%
Middle management 4,9%
Employee 33,1%
Skilled worker 12,6%
Unskilled worker 3,9%
Housewife 3,8%
Retirement 13,9%
Unemployed 6,2%
Student 8,5%
Other 4,4%
Educational level BE
Never or primary 4,10%
Lower secondary 14,19%
Higher secondary General/Technical/Artistic 33,07%
Higher secondary Vocational 11,08%
Bachelor 22,07%
Master 15,49%
To establish the number of surfers in a given month, the monthly PC reach (the percentage of PC
internet users that have visited at least one of the participating Belgian websites in the given month)
has to be estimated.
This PC Reach is influenced by a number of factors:
- Seasonality
- People with very low surfing behavior (< 1 / month)
- People who do not visit Belgian sites
- People who do visit Belgian sites, but not sites participating in the measurement (tagged sites).
In the following table we explain how this PC reach is calculated.
For the first months of 2015 it was decided to limit the results to the 18+ population. There were not
enough 12-17 year old panel members for a reliable result.
The total population 18+ is updated with the latest data originating from the CIM Press study 2015 Q2.
Internet usage is also updated from the same source. Next, consumption is reduced to PC internet
usage, based on the OSM study 2015 Q2. Seasonality influences are corrected by the fluctuation of the
site centric measurement (Estimated Cookies).
One more factor was estimated: people not surfing on Belgian sites in the last month. This information
comes from the HUB study.
2015 06 : PC/NOTEBOOK REACH CALCULATION
Correction % Population Source
Total Belgians 18+ - 100,00% 8 829 425 Press 2015 Q2
All surfers - 79,00% 6 963 862 Press 2015 Q2
Surfers on pc/notebook 96,60% 76,00% 6 727 078 OSM 2015 Q2 on Surfers Press 2015 Q2
Minus not on Belgian CIM sites last month -7,93% 70,00% 6 193 751 HUB 2013 declared
This leads to a number of PC Real Users of 6 727 078 and a PC Reach of 92,07%
(6 193 751 / 6 727 078). PC reach is then solely determined by 2 factors:
- People who did not visit Belgian sites
- People who did not surf last month
The J Coefficient should then be variable according to RU (6 615 314) and PC Reach (100% - 7,93% =
92,07%) on the one hand, and Estimated Browsers on the other hand.
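The arithmetic behind the reach table can be retraced with a few lines of Python. This is only an illustrative sketch: the population figures are copied from the 2015 06 table above, while the variable and function names are our own and not part of the CIM or Gemius tooling.

```python
# Figures copied from the "2015 06 : PC/NOTEBOOK REACH CALCULATION" table.
total_18_plus = 8_829_425        # Total Belgians 18+
all_surfers = 6_963_862          # All surfers
pc_surfers = 6_727_078           # Surfers on pc/notebook (PC Real Users universe)
active_on_cim_sites = 6_193_751  # After removing people not on Belgian CIM sites last month


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


internet_penetration = share(all_surfers, total_18_plus)
pc_share_of_surfers = share(pc_surfers, all_surfers)
pc_reach = share(active_on_cim_sites, pc_surfers)

print(f"Internet penetration 18+: {internet_penetration:.1f}%")
print(f"PC share of surfers:      {pc_share_of_surfers:.1f}%")
print(f"PC reach:                 {pc_reach:.2f}%")
```

The last ratio is the PC Reach quoted in the text; the same division applies to the Smartphone and Tablet tables.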
The universe data are updated 4 times a year:
- January: update of total population and internet penetration (Source: CIM Press Q1).
- June: update of device usage within the internet population (Source: CIM TV OSM Q2).
- September: update of total population and internet penetration (Source: CIM Press Q3).
- December: update of device usage within the internet population (Source: CIM TV OSM Q4).
In 2015 the total for PC was 6 615 314 from January through March, 6 600 925 in April and May and 6
727 078 from June onwards. PC Reach remained at 92,07% for the whole period.
In 2016 the CIM Press study will not organize a field study. The CIM Internet study will therefore
have to look for alternative sources for its structural data updates.
Because of a delivery delay, the schedule of the structural data updates has been slightly changed. See the table below for the most recent schedule:
1/15 2/15 3/15 4/15 5/15 6/15 7/15 8/15 9/15 10/15 11/15 12/15
Internet penetration CIM Press 2015 Q1 CIM Press 2015 Q2
Internet by device OSM 2014 Q4 OSM 2015 Q2 OSM 2015 Q4
Not on Belgian sites HUB 2013
3.8.2 Smartphone Panel
The smartphone panel is weighted to a target audience derived from the same structural studies.
The table below shows the socio-demographical targets derived from the CIM TV OSM 2014-2015
(average from 2014 Q4 and 2015 Q2). Since smartphone usage is changing very rapidly, the most
recent study is used, even though the number of respondents is lower. Nielsen and Home/work data
could not be derived from the OSM and were therefore derived from the CIM HUB 2013 study.
Smartphone users 18+
FR Profile % NL Profile % BE Profile %
Total Smartphone Internet users 36,9% 63,1% 100,0%
Gender OSM FR OSM NL OSM BE
Men 47,8% 51,0% 49,8%
Women 52,2% 49,0% 50,2%
Age OSM FR OSM NL OSM BE
12-17 0,0% 0,0% 0,0%
18-24 20,7% 17,4% 18,7%
25-34 28,6% 25,9% 27,0%
35-44 24,4% 23,3% 23,7%
45-54 14,7% 18,1% 16,7%
55+ 11,6% 15,3% 13,8%
Professional activity OSM FR OSM NL OSM BE
Active 62,7% 70,8% 67,5%
Not active 37,3% 29,2% 32,5%
Educational level OSM FR OSM NL OSM BE
Primary + Secondary 58,7% 59,4% 59,2%
University + High school 41,3% 40,6% 40,8%
Nielsen FR Profile % HUB NL Profile % HUB BE Profile % HUB
Nielsen I 0,7% 37,3% 21,7%
Nielsen II 2,5% 59,8% 35,5%
Nielsen III - NL 0,0% 2,8% 1,6%
Nielsen III - FR 26,3% 0,0% 11,2%
Nielsen IV 37,2% 0,1% 15,8%
Nielsen V 33,4% 0,0% 14,2%
From February 2015 onwards the degree (6 categories) was added on a national level.
Educational level OSM BE
Never or primary 2,3%
Lower secondary 10,0%
Higher secondary General/Technical/Artistic 36,7%
Higher secondary Vocational 10,2%
Bachelor 22,9%
Master 18,0%
To establish the number of surfers in a given month we also have to estimate the Smartphone reach.
This is the percentage of Smartphone internet users that are active in a given month. Active means
that they have visited at least one of the participating Belgian websites in the given month. This is
influenced by a number of factors:
- Seasonality
- People with very infrequent surfing behavior (less than once a month)
- People who do not visit Belgian sites
- People who do visit Belgian sites, but not participating sites (tagged sites)
In the following table we explain how this Smartphone reach is calculated.
For the first months of 2015 it was decided to limit the results to the 18+ population. There were not
enough 12-17 year old panel members for a reliable result.
The total population 18+ was updated with the latest data originating from the CIM Press study 2015
Q2. Internet usage was also updated from the same source. Next, consumption was reduced to
smartphone internet usage, based on the OSM study 2015 Q2. Seasonality influences are corrected by
the fluctuation of the site centric measurement (Estimated Cookies).
One more factor was estimated: people not surfing on Belgian sites in the last month. This information
comes from the HUB study.
2015 07 : Smartphone REACH CALCULATION
Correction % Population N+S Source
Total Belgians 18+ - 100,00% 8 829 425 Press 2015 Q1
All surfers - 78,90% 6 963 862 Press 2015 Q1
Surfers on smartphone - 44,40% 3 921 843 OSM 2015 Q2 on Surfers Press 2015 Q2
Minus not on Belgian CIM sites last month -7,93% 40,90% 3 610 841 Estimation
This leads to a number of Smartphone Real Users of 3 921 843 and a Smartphone Reach of 92,07% (3 610 841 / 3 921 843). Smartphone reach is then solely determined by 2 factors:
- People that did not visit Belgian sites
- People that did not surf last month
The J Coefficient should then be variable according to RU (3 921 843) and smartphone Reach (100% -
7,93% = 92,07%) on the one hand and Estimated Browsers (EBSmartphone) on the other hand.
The increase of smartphone penetration in 2015 has been quite spectacular: the total for smartphone
was 3 040 037 from January through March, 3 830 917 in April and May, and 3 921 843 from June
onwards. Smartphone reach remained at 92,07% for the whole period.
3.8.3 Tablet panel
The Tablet panel is also weighted to a target audience derived from the same structural studies.
The table below shows the socio-demographical targets derived from the CIM TV OSM 2014-2015
(average from 2014 Q4 and 2015 Q2). Since tablet usage is changing very rapidly, the most recent study
is used, even though the number of respondents is lower. Nielsen and Home/work data could not be
derived from the OSM and were therefore derived from the CIM HUB 2013 study.
Tablet users 18+
FR Profile % HUB NL Profile % HUB BE Profile % HUB
Total Tablet Internet users 18+ 37,7% 62,3% 100,0%
Gender OSM av. FR OSM av. NL OSM av. BE
Men 54,0% 55,3% 54,8%
Women 46,0% 44,7% 45,2%
Age OSM av. FR OSM av. NL OSM av. BE
12-17 0,0% 0,0% 0,0%
18-24 13,4% 11,0% 12,0%
25-34 23,2% 19,3% 20,9%
35-44 24,1% 25,9% 25,2%
45-54 20,3% 20,1% 20,2%
55+ 19,0% 23,6% 21,8%
Professional activity OSM av. FR OSM av. NL OSM av. BE
Active 62,4% 69,2% 66,5%
Not active 37,6% 30,8% 33,5%
Educational level OSM av. FR OSM av. NL OSM av. BE
Primary + Secondary 58,4% 59,8% 59,2%
University + High school 41,6% 40,2% 40,8%
Nielsen FR Profile % HUB NL Profile % HUB BE Profile % HUB
Nielsen I 1,4% 38,7% 25,0%
Nielsen II 2,9% 58,4% 38,0%
Nielsen III - NL 0,0% 2,5% 1,6%
Nielsen III - FR 23,0% 0,0% 8,4%
Nielsen IV 39,0% 0,4% 14,6%
Nielsen V 33,7% 0,0% 12,4%
The structural data for the tablet panel was calculated in exactly the same way as the smartphone panel (see above).
2015 07 : Tablet REACH CALCULATION
Correction % Population Source
Total Belgians 18+ - 100,00% 8 829 425 Press 2015 Q2
All surfers - 78,90% 6 963 862 Press 2015 Q2
Surfers on Tablet - 32,70% 2 885 179 OSM 2015 Q2 on Surfers Press 2015 Q2
Minus not on Belgian CIM sites last month -7,93% 30,09% 2 656 384 HUB 2013 declared
The number of Tablet Real users is 2 885 179 and Tablet Reach is 92,07% (2 656 384 / 2 885 179). Tablet reach is then solely determined by 2 factors:
- People that did not visit Belgian sites
- People that did not surf last month
The J Coefficient should then be variable according to RU (2 885 179) and Tablet Reach (100% - 7,93%
= 92,07%) on the one hand, and Estimated Browsers (EBTablet) on the other hand.
The rise of tablet penetration in 2015 has been remarkable: the total for tablets was 2 547 839 from January through March, 2 801 517 in April and May, and 2 885 179 from June onwards. Tablet reach remained at 92,07% for the whole period.
3.9 Data processing
Profiling CIM Internet audience data is no easy task. The (almost) perfect overview of traffic on Belgian
sites at the level of browsers, page requests and visits cannot easily be converted into a profile
on a human scale.
First of all, solutions must be found for the multi-cookie problem, the multi-PC problem and the
multi-user problem. It is also essential that the sample is a good reflection of the universe in terms of
surfing behavior and socio-demographic characteristics.
This chapter describes the answers this publication provides to these questions. Beforehand we
recall that the profile of a person completing the CIM PC Panel questionnaire on site X is also applied
to all other sites visited by this panelist. This makes it possible to determine not one but several
socio-demographic profiles for all Belgian sites with sufficient observations: a profile of the net reach
and a profile of the gross contacts.
3.9.1 Gross versus net panel
Since it is difficult to achieve a perfect socio-demographic distribution when recruiting an online
panel, CIM opted to recruit more panelists than will be reported. Therefore a distinction was made
between a gross panel and a net panel.
The gross panel contains all panelists that completed the questionnaire correctly and showed
BrowserID (cookie) activity during the month. This gross panel will not show a correct distribution on
socio-demographic characteristics. If the gross panel were to be used for the data production, this
would lead to high differences in weight factors, and therefore a low weighting efficiency.
From that gross panel a smaller net panel is derived. An algorithm is used that repeatedly calculates
the weight factors and deletes the 500 panelists with the lowest weight. This procedure is iterated a
number of times, whereby a trade-off is made between the efficiency and the size (statistical power)
of the sample.
In July 2015 from a gross panel of over 51.000 members, 27.000 were removed, leaving a net panel of
slightly more than 24.000 panelists and an efficiency score of 93,3% for the monthly weights.
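The gross-to-net reduction can be sketched as follows. This is a simplified illustration under several assumptions that do not come from the study itself: weights are computed by plain cell weighting on a single variable (the real procedure weights on many criteria), efficiency is taken to be the Kish formula (sum of weights squared divided by n times the sum of squared weights), and the stopping rule (`min_efficiency`, `min_size`) is hypothetical. Only the idea of repeatedly recalculating the weights and deleting the 500 lowest-weighted panelists is taken from the text.

```python
from collections import Counter


def cell_weights(cells, targets):
    """Weight each panelist so that the weighted cell shares match the targets."""
    n = len(cells)
    counts = Counter(cells)
    return [targets[cell] * n / counts[cell] for cell in cells]


def efficiency(weights):
    """Kish weighting efficiency (an assumption; the study does not give its formula)."""
    total = sum(weights)
    return total * total / (len(weights) * sum(w * w for w in weights))


def gross_to_net(cells, targets, batch=500, min_efficiency=0.93, min_size=1000):
    """Repeatedly drop the `batch` lowest-weighted panelists from the gross panel."""
    panel = list(cells)
    while True:
        weights = cell_weights(panel, targets)
        # Stop when the weights are efficient enough or the panel would get too small.
        if efficiency(weights) >= min_efficiency or len(panel) - batch < min_size:
            return panel, weights
        lowest = sorted(range(len(panel)), key=lambda i: weights[i])[:batch]
        drop = set(lowest)
        panel = [cell for i, cell in enumerate(panel) if i not in drop]


# Toy gross panel: Dutch speakers are over-represented compared with the target.
gross_panel = ["nl"] * 7000 + ["fr"] * 3000
language_targets = {"nl": 0.5655, "fr": 0.4345}
net_panel, net_weights = gross_to_net(gross_panel, language_targets)
print(len(net_panel), round(efficiency(net_weights), 3))
```

Over-represented panelists carry the lowest weights, so removing them flattens the weight distribution at the cost of sample size, which is the trade-off described above.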
The weighting is based on socio-demographical factors as well as on behavioral factors (visiting
subscribed websites). The latter also corrects for the bias towards more intense surfers that occurs
with panel-based research.
In February 2015 together with their introduction in the weighting of the net panel, profession (11
categories) and education (6 categories) were added to the gross-net algorithm.
From August 2015 onwards the PC, smartphone and tablet panels are fused. The PC panel is the base
panel, in which a mobile panel member can be a donor for more than one PC panelist, making a
reduction of the gross Tablet or Smartphone panel to a net Tablet or Smartphone panel unnecessary.
3.9.2 Website weighting (behavioral representativeness)
Each website with at least 33 panel members is weighted (calibrated) separately so that its results
match the census data.
3.9.3 Socio-demographic weighting
A socio-demographic profiling study is expected to be representative of its underlying universe.
Given the self-selection of the samples, a thorough check of their composition and representativeness
is recommended: a distortion of the net panel implies a distortion of the site profiles.
To improve the socio-demographic representativeness of the sample, a classic weighting of the surfers
is applied.
The following table shows the initial situation before weighting (“observed”), the objectives according
to the CIM PC Internet universe (”target”), and the final result after a RIM weighting procedure
(”result”) for the weighting criteria: language, gender, age, professional status, education and Nielsen
region.
Socio-demos PC net panel June 2015
Panelists (unweighted) % unweighted RU (weighted) % weighted
Total population 24 136 100% 6 193 621 100%
Language Dutch population 14 569 60,36% 3 502 741 56,55%
French population 9 567 39,64% 2 690 880 43,45%
Age
age 18 - 24 nl 1 879 7,79% 474 617 7,66%
age 18 - 24 fr 940 3,89% 402 956 6,51%
age 25 - 34 nl 2 894 11,99% 699 136 11,29%
age 25 - 34 fr 2 001 8,29% 590 190 9,53%
age 35 - 44 nl 3 093 12,81% 740 572 11,96%
age 35 - 44 fr 2 303 9,54% 627 105 10,13%
age 45 - 54 nl 3 230 13,38% 784 423 12,67%
age 45 - 54 fr 2 048 8,49% 524 661 8,47%
age 55 - 99 nl 3 473 14,39% 804 056 12,98%
age 55 - 99 fr 2 275 9,43% 545 905 8,81%
Gender
female nl 7 065 29,27% 1 679 339 27,11%
female fr 4 685 19,41% 1 330 327 21,48%
male nl 7 504 31,09% 1 823 341 29,44%
male fr 4 882 20,23% 1 360 614 21,97%
Active IP
active nl 9 639 39,94% 2 368 875 38,25%
active fr 5 886 24,39% 1 613 252 26,05%
non-active nl 4 930 20,43% 1 133 866 18,31%
non-active fr 3 681 15,25% 1 077 628 17,40%
Education IP
primary/secondary nl 8 949 37,08% 2 175 758 35,13%
primary/secondary fr 5 627 23,31% 1 657 723 26,77%
university/high-school nl 5 620 23,28% 1 326 983 21,42%
university/high-school fr 3 940 16,32% 1 033 157 16,68%
Nielsen
Nielsen I nl 5 985 24,80% 1 438 841 23,23%
Nielsen I+II fr 326 1,35% 89 683 1,45%
Nielsen II nl 8 194 33,95% 1 970 749 31,82%
Nielsen III nl 352 1,46% 84 171 1,36%
Nielsen III fr 2 461 10,20% 690 836 11,15%
Nielsen IV+V nl 38 0,16% 8 918 0,14%
Nielsen IV fr 3 274 13,56% 926 318 14,96%
Nielsen V fr 3 506 14,53% 984 105 15,89%
Socio-demos PC net panel June 2015 (continued)
Panelists (unweighted) % unweighted RU (weighted) % weighted
Total population 24 136 100% 6 193 621 100%
Education IP
Never or primary 1 126 4,67% 293 391 4,74%
Lower secondary 3 270 13,55% 866 487 13,99%
Higher secondary General/Technical/Artistic 7 933 32,87% 2 061 114 33,28%
Higher secondary Vocational 2 247 9,31% 612 425 9,89%
Bachelor 5 946 24,64% 1 449 370 23,40%
Master 3 614 14,97% 910 834 14,71%
Profession IP
Small commerce, freelance and industrial 5-, artisan and farmer 1 451 6,01% 358 177 5,78%
Big commerce, freelance and industrial 6+, upper management and liberal profession 1 186 4,91% 290 542 4,69%
Middle management 1 391 5,76% 340 029 5,49%
Employee 7 791 32,28% 1 920 890 31,01%
Skilled worker 2 746 11,38% 797 863 12,88%
Unskilled worker 960 3,98% 274 563 4,43%
Housewife 818 3,39% 212 812 3,44%
Retirement 3 478 14,41% 793 713 12,82%
Unemployed 1 539 6,38% 423 830 6,84%
Student 1 945 8,06% 560 833 9,06%
Other 831 3,44% 220 369 3,56%
In February 2015 profession (11 categories) and education (6 categories) were added to the weighting
of the PC cookie panel. Education (6 categories) was also added to the weighting of the Smartphone
and Tablet cookie panels.
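The RIM weighting procedure mentioned in this section is commonly implemented as raking (iterative proportional fitting): the weights are rescaled for one criterion at a time until all marginal targets are met simultaneously. The sketch below is a minimal, generic version under our own assumptions. The toy panel, the function names and the fixed number of iterations are illustrative, and only two criteria are used; the language and gender targets are taken from the PC universe table (July 2015).

```python
def rim_weights(panel, targets, iterations=50):
    """Rake weights so that each variable's weighted shares match its targets."""
    weights = [1.0] * len(panel)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted size of every category of this variable.
            current = {}
            for person, weight in zip(panel, weights):
                category = person[variable]
                current[category] = current.get(category, 0.0) + weight
            # Rescale every category to its target size.
            factors = {cat: shares[cat] * total / current[cat] for cat in shares}
            weights = [w * factors[p[variable]] for p, w in zip(panel, weights)]
    return weights


def weighted_share(panel, weights, variable, category):
    """Weighted share of panelists belonging to `category` of `variable`."""
    hits = sum(w for p, w in zip(panel, weights) if p[variable] == category)
    return hits / sum(weights)


# Toy panel of 100 respondents (hypothetical counts).
toy_panel = (
    [{"language": "nl", "gender": "male"}] * 30
    + [{"language": "nl", "gender": "female"}] * 30
    + [{"language": "fr", "gender": "male"}] * 25
    + [{"language": "fr", "gender": "female"}] * 15
)
pc_targets = {
    "language": {"nl": 0.562, "fr": 0.438},
    "gender": {"male": 0.515, "female": 0.485},
}
toy_weights = rim_weights(toy_panel, pc_targets)
print(round(weighted_share(toy_panel, toy_weights, "language", "nl"), 3))
print(round(weighted_share(toy_panel, toy_weights, "gender", "male"), 3))
```

In production the same loop would run over all weighting criteria listed above (language, gender, age, professional status, education, Nielsen region, profession and degree), with a convergence test instead of a fixed iteration count.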
3.10 Fusion of the CIM Internet Panels
Starting with the August 2015 data, the panels for the different devices (PC, Smartphone and Tablet) are
fused in order to calculate a deduplicated cross-device audience. To achieve this, Gemius applies its
“Behavioral Panel Synthesis” (BPS) methodology.
The BPS algorithm makes it possible to calculate the total number of Real Users for any combination of
platforms, websites and target groups. It uses panels recruited on each analyzed platform and combines
them into one cross-platform panel.
3.10.1 Data sources
The main data used to perform the Behavioral Panel Synthesis come from the different platform panels.
When joining a panel, all panelists answer a set of questions regarding their demographics, interests
and the devices they use to connect to the Internet. However, only the traffic generated from the
device on which the online questionnaire was filled in is measured. This is problematic since
most panelists use more than one device and therefore only part of their online activity is measured.
ID Platform Device usage Gender Age … Visited website 1 Visited website 2 Visited website 3 …
123 PC Only PC Male 11 … No Yes Yes …
456 PC PC & Tablet Male 45 … Yes No Yes …
789 PC PC & Mobile Female 28 … Yes Yes No …
456 Tablet PC & Tablet Male 45 … No No Yes …
101 Mobile PC & Mobile Female 28 … Yes No Yes …
… … … … … … … … …
Luckily, some panelists fill in the questionnaire on several devices. Those panelists constitute the
Calibration Panel: people whose online activity is measured on multiple devices.
The BPS algorithm uses this Calibration Panel to learn the behavioral patterns of Internet usage across
different devices. This makes it possible to combine several single-platform panels into one cross-platform panel.
The calibration panel is built in two ways:
- Panelists present in more than one panel (with the same e-mail address) are part of the calibration panel.
- Panelists from the PC panel who indicated that they also surf on tablet or smartphone were
invited by e-mail to fill out the survey on those devices. The same was done for similar panel
members from the Tablet and Smartphone panels.
Calibration panelists who give different answers in the different panels are removed from the
database. Example: a panelist who indicates in the PC panel that he speaks Dutch, and in the
Smartphone panel that he speaks French, is rejected.
The BPS algorithm consists of four main steps:
- Metrical Clustering
- Behavioral Distance
- Nearest Neighbor Merging
- Weighting
[Figure: diagram with the labels "socio-demographic", "behavioral", "devices" and "Cross-platform"]
3.10.2 Metrical Clustering
First, each panel is clustered into subsets of similar users based on the socio-demographic variables
gender, age and language and the devices they use to connect to the Internet. Panelists from different
platforms are only combined within the same cluster. This guarantees that two significantly different
panelists (from different clusters) will not be merged (e.g. a man with a woman, or a Dutch speaker
with a French speaker).
Panelists who use only one device are excluded from the fusion process.
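A minimal sketch of this clustering step is given below. The record layout (`gender`, `age`, `language`, `devices`) and the age bands are our own assumptions for illustration; the text only states that gender, age, language and device usage define the clusters and that single-device panelists are left out.

```python
from collections import defaultdict

# Age bands assumed to follow the weighting classes used elsewhere in the study.
AGE_BANDS = [(18, 24), (25, 34), (35, 44), (45, 54)]


def age_band(age):
    """Map an age in years to an age-band label."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "55+" if age >= 55 else "other"


def metrical_clusters(panelists):
    """Group multi-device panelists by gender, age band, language and device set."""
    clusters = defaultdict(list)
    for panelist in panelists:
        if len(panelist["devices"]) < 2:
            continue  # single-device panelists are excluded from the fusion
        key = (
            panelist["gender"],
            age_band(panelist["age"]),
            panelist["language"],
            frozenset(panelist["devices"]),
        )
        clusters[key].append(panelist["id"])
    return dict(clusters)


example_panelists = [
    {"id": "a", "gender": "male", "age": 22, "language": "nl", "devices": ["pc", "smartphone"]},
    {"id": "b", "gender": "male", "age": 19, "language": "nl", "devices": ["smartphone", "pc"]},
    {"id": "c", "gender": "female", "age": 22, "language": "nl", "devices": ["pc", "smartphone"]},
    {"id": "d", "gender": "male", "age": 40, "language": "fr", "devices": ["pc"]},
]
example_clusters = metrical_clusters(example_panelists)
for key, members in example_clusters.items():
    print(key, members)
```

Panelists "a" and "b" end up in the same cluster and may later be compared, whereas "c" (different gender) can never be merged with them and "d" (PC only) does not take part in the fusion.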
3.10.3 Behavioral Distance
There is a great variety of behavioral patterns within a single cluster. To avoid merging significantly
different panelists a probabilistic model is used whereby the distance between all the individuals is
calculated. That distance is based upon their socio-demos as well as on their surfing behavior. This
distance determines the probability of being the same person for two panelists on different devices.
This makes it possible to merge only those panelists whose behavior indicates that they are likely to
be the same person.
In order to obtain such a model a Calibration Panel is required. It provides information about the
Internet usage of panelists across different devices. Moreover, we already know the outcome of the
question ‘Is this the same person on different devices?’.
Within the calibration panel, for each panelist on each device we now know whether he visited a
particular website within the analyzed period (Visited site 1, Visited site 2, …).
Each possible combination of a calibration panelist from one platform with a panelist from another
platform within a given cluster (e.g. Dutch Male 18-24) is then considered. The list of visited websites
on device one is matched with the list of visited websites on device two, indicating whether he visited
a particular website on both devices.
In order to tell whether a pair of panelists could represent the same person we need examples of both
correct and false connections between panelists. The former reveal which variables (websites)
correlate with being the same person. The latter reveal which variables correlate with not being the
same person. The final dataset consists of all possible pairs of panelists from the Calibration Panel.
Every time two panelists are paired we also indicate whether it is a correct or a false
connection (with 1 and 0). This creates a model of good and bad fits based on logistic regression. The
model is “self-training”: from month to month the algorithm builds upon the acquired information.
As a result we get a set of coefficients that minimize the error of predicting the probability of being the
same person. This process is repeated for each pair of platforms resulting in separate models for
Desktop & Mobile and Desktop & Tablet.
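The following sketch illustrates the idea on a toy calibration panel. It is not the Gemius implementation: the site names, the panelist IDs, the plain gradient-descent fit and its settings (`epochs`, `rate`) are assumptions made for the example. What is taken from the text is the structure of the data: one row per pair of panelists, one 0/1 variable per website indicating a visit on both devices, and a 0/1 label indicating whether the pair is the same person.

```python
import math
from itertools import product


def pair_features(visits_a, visits_b, sites):
    """1.0 for every site visited on both devices, else 0.0."""
    return [1.0 if s in visits_a and s in visits_b else 0.0 for s in sites]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(coef, row):
    """Probability that a pair is the same person; coef[-1] is the intercept."""
    return sigmoid(sum(c * x for c, x in zip(coef, row)) + coef[-1])


def fit_logistic(rows, labels, epochs=2000, rate=0.5):
    """Fit a logistic regression by batch gradient descent."""
    coef = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(coef)
        for row, label in zip(rows, labels):
            error = predict(coef, row) - label
            for j, x in enumerate(row):
                grad[j] += error * x
            grad[-1] += error
        coef = [c - rate * g / len(rows) for c, g in zip(coef, grad)]
    return coef


# Toy calibration panel: the same IDs are measured on PC and on smartphone.
sites = ["news", "sport", "tv", "shop"]
pc_visits = {"A": {"news", "sport"}, "B": {"news", "tv"}, "C": {"shop"}}
phone_visits = {"A": {"news", "sport"}, "B": {"tv"}, "C": {"shop", "news"}}

# All possible pairs: correct connections get label 1, false ones label 0.
pairs = list(product(pc_visits, phone_visits))
rows = [pair_features(pc_visits[p], phone_visits[m], sites) for p, m in pairs]
labels = [1.0 if p == m else 0.0 for p, m in pairs]

coef = fit_logistic(rows, labels)
p_true = predict(coef, pair_features(pc_visits["A"], phone_visits["A"], sites))
p_false = predict(coef, pair_features(pc_visits["A"], phone_visits["C"], sites))
print(round(p_true, 2), round(p_false, 2))
```

In this toy data a shared visit to a widely visited site ("news") says little about identity, while shared visits to less common sites raise the estimated probability, which is the kind of pattern the coefficients are meant to capture.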
3.10.4 Nearest Neighbor Merging
Within each cluster we merge panelists from different platforms. The detailed process is the following:
1. For each Desktop panelist:
a. The probability of being the same person is calculated between a given Desktop
panelist and each Smartphone panelist (within the same metrical cluster).
b. The Smartphone panelist with the highest probability is merged with the given
Desktop panelist if the probability is higher than a given threshold and the given
Smartphone panelist was not already merged with too many Desktop panelists.
c. If there is no such Smartphone panelist, then the given Desktop panelist is not
merged with any Smartphone panelist.
2. Point 1. is repeated, but instead of Smartphone panelists we use Tablet panelists.
a. There is an additional constraint that has to be met by a Tablet panelist. The
probability of being the same person between a Tablet panelist and an already
merged Smartphone panelist has to be higher than a given threshold.
3. Each panelist receives the demographic profile based on the Desktop panelist.
This process results in a cross-platform panel in which for each panelist we measure his activity on all
his devices.
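The merging steps above can be sketched as a greedy loop. Several details are assumptions on our side: the threshold values, the maximum number of times a donor may be reused (`max_uses`), the panelist IDs and the probability tables are all hypothetical, and the sketch picks the most probable donor among those that satisfy every condition, which is one possible reading of step 1.b.

```python
def merge_nearest(receivers, donors, probability, threshold, max_uses, extra_check=None):
    """Attach to each receiver the eligible donor with the highest probability."""
    uses = {donor: 0 for donor in donors}
    merged = {}
    for receiver in receivers:
        eligible = [
            donor
            for donor in donors
            if uses[donor] < max_uses
            and probability(receiver, donor) > threshold
            and (extra_check is None or extra_check(receiver, donor))
        ]
        if not eligible:
            merged[receiver] = None  # step 1.c: no merge for this receiver
            continue
        best = max(eligible, key=lambda donor: probability(receiver, donor))
        merged[receiver] = best
        uses[best] += 1
    return merged


# Hypothetical same-person probabilities within one metrical cluster.
pc_phone = {
    ("pc1", "ph1"): 0.9, ("pc1", "ph2"): 0.2,
    ("pc2", "ph1"): 0.8, ("pc2", "ph2"): 0.6,
    ("pc3", "ph1"): 0.7, ("pc3", "ph2"): 0.1,
}
pc_tablet = {
    ("pc1", "tb1"): 0.6, ("pc1", "tb2"): 0.9,
    ("pc2", "tb1"): 0.4, ("pc2", "tb2"): 0.8,
    ("pc3", "tb1"): 0.3, ("pc3", "tb2"): 0.7,
}
tablet_phone = {
    ("tb1", "ph1"): 0.7, ("tb1", "ph2"): 0.5,
    ("tb2", "ph1"): 0.2, ("tb2", "ph2"): 0.4,
}
desktops = ["pc1", "pc2", "pc3"]

# Step 1: Desktop with Smartphone.
phones = merge_nearest(desktops, ["ph1", "ph2"], lambda d, s: pc_phone[(d, s)], 0.5, 2)


# Step 2: Desktop with Tablet, with the additional Tablet/Smartphone constraint.
def tablet_matches_phone(desktop, tablet):
    phone = phones[desktop]
    return phone is None or tablet_phone[(tablet, phone)] > 0.5


tablets = merge_nearest(
    desktops, ["tb1", "tb2"], lambda d, t: pc_tablet[(d, t)], 0.5, 2, tablet_matches_phone
)
print(phones)
print(tablets)
```

Step 3 then follows directly: every merged record keeps the demographic profile of its Desktop panelist, so the dictionary keys (`pc1`, `pc2`, `pc3`) identify the cross-platform panelists.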
3.10.5 Weighting
The last step of the BPS approach is the weighting process. There are three main groups of boundary
conditions to which the total panel is weighted:
1. The number of Real Users for each website on each platform.
2. The socio-demographic structure of each platform and the whole Internet population (based
on the structural study).
3. The population sizes of users who use a particular set of devices, e.g. Desktop only, Desktop
and Mobile, etc.
Finally, the total number of Real Users is equal to the sum of weights of all the panelists that visited a
particular website.
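As a small illustration of this last statement, the sketch below sums the weights of fused panelists who visited a site on any of the selected platforms. The record layout, site names and weights are hypothetical; the point is that a panelist who visited the site on two devices is counted only once, which is what makes the cross-device figure deduplicated.

```python
def real_users(panel, site, platforms=None):
    """Sum the weights of panelists who visited `site` on any selected platform."""
    total = 0.0
    for panelist in panel:
        selected = platforms if platforms is not None else list(panelist["visits"])
        if any(site in panelist["visits"].get(platform, set()) for platform in selected):
            total += panelist["weight"]
    return total


# Hypothetical fused panel: one record per person, visits split by device.
fused_panel = [
    {"weight": 250.0, "visits": {"pc": {"site1"}, "smartphone": {"site1", "site2"}}},
    {"weight": 300.0, "visits": {"pc": {"site2"}}},
    {"weight": 200.0, "visits": {"tablet": {"site1"}}},
]

print(real_users(fused_panel, "site1"))                        # all platforms
print(real_users(fused_panel, "site1", ["pc"]))                # PC only
print(real_users(fused_panel, "site1", ["pc", "smartphone"]))  # PC or smartphone
```

The first panelist visited site1 on both PC and smartphone but contributes a weight of 250 only once to the combined PC and smartphone total.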
The table below presents the distribution of each combination of devices in the structural data (based
on the CIM Other Screen Monitor 2015 Q2) and the presence of panel members in the different cookie
panels.
Devices Structural data Panel Aug 15
only tablet 1,5 % 7,1%
only mobile 0,8 % 4,8%
tablet and mobile 1,1 % 4,6%
only pc 32,3 % 30,1%
pc and tablet 9,9 % 8,0%
pc and mobile 25,5 % 22,7%
pc and mobile and tablet 28,9 % 22,8%
The quality of a fusion also depends on the number of times a panelist is used. For the Smartphone
and the Tablet panel, more than half of the panelists are only used once, and 4 out of 5 are not used
more than 3 times.
PC - Smartphone fusion Aug 2015
Number of times used Smartphone panelists %
1 4 247 56,8%
2 1 066 14,3%
3 704 9,4%
4 472 6,3%
5 803 10,7%
6 147 2,0%
7 6 0,1%
8 29 0,4%
PC - Tablet fusion Aug 2015
Number of times used Tablet panelists %
1 3 102 59,4%
2 717 13,7%
3 458 8,8%
4 290 5,6%
5 263 5,0%
6 98 1,9%
7 71 1,4%
8 61 1,2%
9 43 0,8%
10 or more 124 2,4%
In August 2015 the calibration panel was composed of 525 panelists out of which 314 were enlisted in
the PC and Smartphone panel, 197 in the PC and Tablet panel and 14 in the Smartphone and Tablet
panel.
The mobile only panel members (Tablet only, Smartphone only or Tablet and Smartphone) keep the
socio-demographical information from the mobile panel. Due to the shortening of the mobile
questionnaire, some profile data (children and age of children, main responsible for income, main
responsible for purchases) is missing for them.
For January to July 2015, the results for the different devices were presented next to one another, not
deduplicated.
By the end of 2015 the results from home/work on PC will also be fused using the same BPS
methodology.
4 Publication of CIM Internet audience results
4.1 Participation in the study
Each Belgian website included in the CIM Internet site centric measurement will automatically
participate in the panel recruitment, as long as the technical conditions for a correct display of the
invitations are met.
4.2 Conditions for publication in planning files
The audience results for participating sites will be published if the following conditions are met:
- The site is tagged correctly and the pop-up recruitment is shown.
- There is constant traffic (no more than 3 days without traffic due to a lack of scripting or site
downtime).
- Profile results are only available if the selection has at least 40 unweighted observations.
- The website is in Active Room on the last day of the month for which the data were calculated.
4.3 Access to the results
The results of the CIM Internet audience study are exclusively accessible to its subscribers. They can
be consulted in 3 different formats.
- Monthly Excel reports, available in the subscribers section of the CIM website.
- GemiusExplorer, software allowing the analysis of traffic and audience data for all websites.
(This tool will only be available when the Gemius best weights algorithm is applied to the Media
Planning tools, i.e. not before January 2016.)
- Media planning tools. Certified software that makes the results available in planning software.
The results are accessible via planning software certified by the CIM. A current list of certified
software suppliers is available on the CIM website:
FR: http://www.cim.be/fr/internet/fournisseurs-de-logiciel
NL: http://www.cim.be/nl/internet/softwareleveranciers
4.4 Monthly Excel reports
There are three types of reports available with socio-demographical profiles:
- an individual report for each publication unit (website, section, sales house, …),
- a global report with an overview of all publication units,
- a total report with results for the entire internet in Belgium.
Each report presents data on Belgian traffic with a socio-demographical split on gender, age, language,
PRI, PRP, Social group, degree, employment status, internet devices, and number of children in the
household.
In February 2015 “Social Group” was added. “Number of children in the household” was added in
March 2015.
4.5 The gemiusExplorer audience reporting tool
GemiusExplorer is a locally installed Windows application in which you can open .gem files with
monthly traffic and audience data. It allows the analysis of a complete set of indicators of the
respondents in a calendar month with a breakdown into daily, weekly and monthly data. The available
socio-demographic variables are listed in the annexes.
This tool is based on the Gemius Best Weights algorithm (BWA) and will only be made available for the
CIM Internet study once the BWA is applied to the Media Planning tools (i.e. not before January
2016).
4.6 Monthly media planning tools
The monthly media planning tools basically contain the same information as the gemiusExplorer file,
but also allow for the planning of an internet campaign.
It is therefore possible to select an audience across several websites over a given period and to predict the
reach and other metrics.
The available metrics are:
- Reach %
- Number of surfers
- Page views
- Visits
- Time: this is defined as the average time per person in a target group.
There are differences, however, which led the Technical Committee to make the
planning files the only currency. The table below contains a list of those differences:
Planning files | gemiusExplorer
Use of average day weights | Use of Gemius best weights algorithm
Reach is computed on the entire population (internet and non-internet) | Reach is computed on the internet population on a given device (including people who do not surf on Belgian websites or who did not surf in the given month, but excluding people who never use the internet on a given device). Reach-Internet is available as a separate metric.
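The effect of the two reach definitions can be illustrated with a short calculation. The site audience of 1 000 000 Real Users is a made-up figure, and the PC universe is used here only as an example of "the internet population on a given device"; both population sizes are taken from the PC reach calculation in section 3.8.

```python
# Population figures from the PC reach calculation (18+).
total_population = 8_829_425    # entire population, internet and non-internet
pc_internet_population = 6_727_078  # internet population on the PC device

# Hypothetical audience of one website (not a figure from the study).
site_real_users = 1_000_000

reach_total = 100.0 * site_real_users / total_population
reach_internet = 100.0 * site_real_users / pc_internet_population

print(f"Reach on the entire population:     {reach_total:.2f}%")
print(f"Reach on the PC internet population: {reach_internet:.2f}%")
```

The same number of Real Users therefore yields a lower reach percentage when the base is the entire population than when it is the internet population on a device, which is why the two sources cannot be compared directly.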
For a list of socio-demographic variables available in both gemiusExplorer and the planning files, please
see annex 2.
5 CIM Internet Software panel
The CIM Internet Software panel is still under development.
It is PC software for Microsoft Windows that registers all traffic on four main browsers (Google Chrome, Microsoft Internet Explorer, Firefox and Opera), independent of the login used on the device. It includes a virtual people meter, in which people have to register.
On the last screen of the CIM Internet PC cookie questionnaire, the respondent is asked if he is willing
to participate in other CIM studies. If he answers positively and is surfing on Google Chrome, Microsoft
Internet Explorer, Firefox or Opera from a PC with a Windows OS, he is invited to install the CIM
Internet Software.
The purpose of this panel is to be able to measure traffic on websites that are not tagged in the CIM
Internet Traffic study. This can be the case if the website is not interested in the measurement (e.g. a
PC banking website or a government site that does not show any third party banners), or if it
is interested in the measurement but, for technical reasons or for reasons of international company
policy, is not allowed to put tags on its website (e.g. a major foreign website like google.be that is
subject to international policy imposed by the worldwide headquarters).
6 Controls of the CIM Internet study
Each aspect of the CIM Internet study is tested thoroughly. Tests are done by the research institute
Gemius as well as by the CIM SPS and by the software house (GfK Probe).
6.1 Checking traffic data
The tagging of each new website/section is thoroughly tested before allowing a site into the study. On
an ongoing basis, the tagging of a random selection of websites is checked.
Amongst others, the following elements are tested:
- Is the scripting correctly implemented?
- Is the scripting available on the entire website?
- Is the correct identifier being used?
- Is the mandatory extraparameter ‘Language’ present?
- Are the CIM logo and disclaimer available on the website?
The tagging of native, hybrid and html5 applications, and of streaming players is also tested thoroughly.
Gemius certifies each streaming player individually and reports on the necessary changes in the
tagging.
Every day the CIM Internet staff checks the results in gemiusOLA, gemiusPrism and the offline reports
for the Luxembourgian market. Checks are done on the availability and stability of the results. If a
problem is discovered, the subscriber is contacted and asked to take the necessary measures.
6.2 Checking audience data
Before delivering the data to CIM, Gemius does a series of internal checks on the audience data:
- Checking the filtering conditions: is there a disproportionate filtering of page views for any of
the websites, based on the rules for allowed domains, autorefresh or use of iFrame?
- Month-by-month trend concerning validation rules: how many people are unavailable for
reporting based on conflicts in their answers to the intake survey?
- Check on % of not-good BID: is this percentage stable from month to month?
- Check on the socio-demographical structure of the panel versus the data known from the
structural studies.
- Check on weighting efficiency and weights distribution (average, min., max. weight)
- Check on the size of the panels: PC is kept stable by the gross-net algorithm, Smartphone and
Tablet are growing.
- Trend reports (comparison with past months) on the Internet node (e.g. EC, Population, J
coefficient)
- Trend report on all websites (i.e. change in PVs, RUs, panelists)
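The document does not state how weighting efficiency is computed. As an illustration only, the sketch below assumes Kish's formula, a common definition, and reports it together with the average, minimum and maximum weight mentioned in the list above. The function name and the sample weights are hypothetical.

```python
def weight_summary(weights):
    """Summarise panel weights: average, min, max and Kish weighting efficiency."""
    if not weights:
        raise ValueError("weights must not be empty")
    n = len(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    # Kish efficiency (assumed definition): effective sample size / actual sample size.
    efficiency = (total * total) / (n * sum_of_squares)
    return {
        "average": total / n,
        "min": min(weights),
        "max": max(weights),
        "efficiency": efficiency,
    }


# Hypothetical example: two panelists with weights 1 and 3.
summary = weight_summary([1, 3])
```

Equal weights give an efficiency of 1; the more the weights diverge, the lower the value.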
The permanent structural checks cover the following elements:
- All websites are screened for continuous traffic. Sites that show no traffic for at least three
days will not be published for that month.
- Comparison of the weighted variables in the PC, Smartphone and Tablet panel with the data
in the structural studies
- After adding the non-internet data, the total for the weighted variables in the PC,
Smartphone and Tablet panel is compared with the data for total population in the structural
studies
- Comparison of non-weighted variables in the PC, Smartphone and Tablet panel with the data
in the structural studies
- Stability of the results for reach for all websites
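The continuous-traffic rule in the list above can be sketched as follows. This is a minimal illustration, assuming that "three days" means three days in total within the month (the document does not say whether the days must be consecutive) and that daily page views are available as a simple list. The function and parameter names are hypothetical.

```python
def is_publishable(daily_page_views, max_days_without_traffic=3):
    """Return False when a site shows no traffic on three or more days of the month.

    Assumption: days without traffic are counted over the whole month,
    whether or not they are consecutive.
    """
    days_without_traffic = sum(1 for views in daily_page_views if views == 0)
    return days_without_traffic < max_days_without_traffic


# Hypothetical daily page-view counts for two sites.
site_a = [120, 0, 95, 0, 130]   # two days without traffic
site_b = [0, 0, 0, 40]          # three days without traffic
```

Under this reading, `is_publishable(site_a)` is true and `is_publishable(site_b)` is false.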
On a monthly basis the audience data are checked and reported to the CIM Internet technical
committee.
Reports are presented on the overall gross panel size, the size of the net panel, the recruitment rate,
the socio-demographic composition of the panel, the efficiency of the weighting, the stability of the
number of real users for all sites from month to month, …
The planning file results in GfK Probe are tested and compared with the Excel files before they are
published to the market.
Before publication on the CIM website, random checks are also done on the Excel reports.
ANNEXES
Annex 1
Below you can find a full version of the intake surveys.
6.2 CIM Internet PC Cookie Panel questionnaire
This is the general version that is used for adults (18+).
Age and place of living
Q0 year (1900 TO current year - 12) What is your year of birth? _ _ _ _
Q11A Do you live in Belgium? 1: Yes 2: No
Q11B numeric (min 1000 – max 9999) What is your zip code? _ _ _ _
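The range constraints on Q0 and Q11B can be expressed as simple validation checks. The sketch below is an illustration only; the function names are hypothetical and the survey software's actual validation is not described in this document.

```python
from datetime import date


def valid_birth_year(year, today=None):
    """Q0: year of birth must lie between 1900 and (current year - 12)."""
    current_year = (today or date.today()).year
    return 1900 <= year <= current_year - 12


def valid_zip_code(zip_code):
    """Q11B: zip code must be a number between 1000 and 9999."""
    return 1000 <= zip_code <= 9999
```

For a survey filled out in 2015, the latest accepted year of birth is 2003.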
General internet usage
Q1 single How often do you usually use internet?
1: 7 days a week 2: 5 or 6 days a week 3: 3 or 4 days a week 4: 1 or 2 days a week 5: Less than 1 day a week 6: Less than 1 day a month
Q1b How many hours do you use the Internet in a typical day?
1: up to one hour 2: 1-2 hours 3: 3-4 hours 4: 5-6 hours 5: 7 hours or more
Q1c GRID (TABLE) QUESTION. One answer for each location.
How often do you generally use the internet at each of the following places?
A. Home B. At work C. At school or at university D. Mobile (on the street, on the move or in public places) E. Elsewhere
1. 7 days a week 2. 4-6 days a week 3. <= 3 days a week 4. Never
Q2 multiple + open text Over the past month, have you at any time used the internet via one of the following devices (e.g. to surf, send or receive e-mails, use online banking or social networks,…)?
1. Desktop computer 2. Portable computer (laptop) 3. Tablet 4. Smartphone (mobile telephone with internet access) 5. Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...) 6. TV player 7. Another device: _________________
Q2b If Q2 = 1 Over the past month, have you at any time used the internet on a desktop computer at the following places (e.g. to surf, send or receive e-mails, use online banking or social networks,…)? Several answers possible
1. at home 2. at work 3. in school or at the university 4. with friends or acquaintances 5. somewhere else
Q2c If Q2 = 2 Over the past month, have you at any time used the internet on a portable computer at the following places (e.g. to surf, send or receive e-mails, use online banking or social networks,…)? Several answers possible
1. at home 2. at work 3. in school or at the university 4. with friends or acquaintances 5. somewhere else
Q3 single What device are you using right now to fill out this survey?
1. Desktop computer 2. Portable computer (laptop) 3. Tablet 4. Smartphone (mobile telephone with internet access) 5. Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...) 6. TV player
Q4 single Are you the only person who uses the internet from this device?
1: Yes 2: No
Q4b (If Q4=2) single (both) Together on 1 screen?
What is your share in the overall internet use from this device?
1: I am the main user of this device 2: I use this device to the same degree as other user(s) 3: I make less use of this device than other user(s) 4: I don't know
Q5 (If Q4 =2) Are different accounts (logins) being used on this device?
1: Yes 2: No 3: I don't know
If Q5=1
What account (login) do you use on this device?
1: I use the same account (login) as other users 2: I use my personal account (login) 3: I don't know
Q6 (If Q3=1 desktop computer) single Where is this desktop computer located?
If (Q6 = 3 OR Q6 = 4 OR Q6 = 5) STOP the interview (screen out)
1. at home 2. at work 3. in school or at the university 4. with friends or acquaintances 5. somewhere else
Q7 (If Q3=2 laptop)
single Where do you mainly use this portable computer?
1. at home 2. at work 3. in school or at the university 4. on different locations
Q8 (If Q3=3 Tablet)
single Where do you mainly use this tablet? 1. at home 2. at work 3. in school or at the university 4. on different locations
Basic socio-demographic variables
Q9 single What is your gender?
1. Male 2. Female
Q12 single What is the highest level of education that you have reached successfully, either in day school or evening school?
1: primary school or no degree 2: lower general secondary education (first three years achieved) 3: lower secondary technical, artistic or professional education (first three years achieved) 4: higher general secondary education (last three years achieved) 5: higher secondary technical, artistic education (last three years achieved) 6: higher secondary professional education (last three years achieved) 7: candidate, bachelor (academic or professional), graduate 8: university license, master, post graduate, extra-university higher education (long type) 9: university license with additional degree, master after master 10: doctorate with thesis
Q13 single Which category corresponds best to your current professional situation?
1: I am a pupil | student | in formation 2: I work full-time 3: I work part-time 4: I have temporarily suspended my professional occupation / I'm using up time credit 5: I have no employment at the moment (e.g. pensioned, jobless, …)
Q14 (If Q13=5) single Which statement suits you best? 1: I am a houseman | housewife 2: I am incapacitated 3: I am jobless 4: I am in early retirement 5: I am pensioned 6: Other
Q15 (If Q13= 2 or 3 or 4)
single What is your professional status (for your main occupation)?
1: independent 2: employed in the public sector 3: employed in the private sector
Q16 (If Q13= 2 or 3 or 4)
single Which of the following categories corresponds the best with your profession?
1: farmer 2: artisan 3: merchant, industrialist 4: worker 5: clerk 6: middle management (e.g. head of department or division, ...) 7: member of the general management, senior executive (e.g. director, manager, ...) 8: free profession 9: freelance, self-employed
Q17 (IF Q16 = 1 or 2 or 3 or 6 or 7 or 8 or 9)
single For how many employees are you responsible?
1: 0 2: from 1 to 5 3: from 6 to 10 4: 11 or more
Q18 (If Q16=4) single Are you a …? 1: skilled worker 2: unskilled worker
Q19 (If Q16=5) single Are you a …? 1: clerk with primarily office work 2: clerk with little or no office work
Q20 numeric 0-15 Can you indicate how many household members, EXCLUDING yourself, are living with you on a permanent basis or at least half of the time (e.g. during the week, one week out of two)
I__I__I members
Q21 (If Q20 >0 for each)
numeric 0-99 Can you indicate the age of each household member ?
I__I__I year
Q22 (If Q20 >0) single Are you usually the person in charge of the choice of brands for food, drinks and maintenance products in your household?
1: Yes 2: No
Q23 (If Q20 >0) single Who is the main responsible for the income in your household (the person with the highest income)?
1: Myself 2: Someone else
Q24 (If NOT Q23 PI is Myself - PI = MRI)
single What is the highest level of education that was reached successfully by the main responsible for the income, either in day school or evening school?
1: primary school or no degree 2: lower general secondary education (first three years achieved) 3: lower secondary technical, artistic or professional education (first three years achieved) 4: higher general secondary education (last three years achieved) 5: higher secondary technical, artistic education (last three years achieved) 6: higher secondary professional education (last three years achieved) 7: candidate, bachelor (academic or professional), graduate 8: university license, master, post graduate, extra-university higher education (long type) 9: university license with additional degree, master after master 10: doctorate with thesis 11: I don't know
Q25 (If NOT Q23 PI is Myself - PI = MRI)
single Which category corresponds best to the current professional situation of the main responsible for the income?
1: pupil | student | in formation 2: working full-time 3: working part-time 4: has temporarily suspended his/her professional occupation / is using up time credit 5: has no employment for the moment
Q26 (If Q25 = 5) single Which statement suits the main responsible for the income best?
1: houseman | housewife 2: incapacitated 3: jobless 4: in early retirement 5: pensioned 6: other
Q27 (if Q25 = 2 or 3 or 4)
single Which of the following categories corresponds the best with the profession of the main responsible for the income?
1: farmer 2: artisan 3: merchant, industrialist 4: worker 5: clerk 6: middle management (e.g. head of department or division, ...) 7: member of the general management, senior executive (e.g. director, manager, ...) 8: free profession 9: freelance, self-employed
Q28 (if Q10 date of birth leads to age >= 36 years)
single Are you a grandfather or grandmother ?
1: Yes 2: No
Q29 (if Q28=1) multiple Do you have grandchildren in the age ranges below?
1: 0 to 2 years 2: 3 to 5 years 3: 6 to 11 years 4: 12 to 14 years 5: 15 to 17 years 6: 18 to 24 years 7: 25 years and older
Q30 single What language do you usually speak at home?
1: Dutch 2: French 3: German 4: English 5: Arabic 6: Spanish 7: Italian 8: Polish 9: Turkish 98: Other: …………… (specify)
Q31 single Do you speak other languages at home? 1: Yes 2: No
Q32 (if Q31 = 1) multiple What other languages do you speak at home?
1: Dutch 2: French 3: German 4: English 5: Arabic 6: Spanish 7: Italian 8: Polish 9: Turkish 98: Other: …………… (specify)
Participation
Q90 single + open Do you wish to participate in the sweepstake? (You can win …..)
1. Yes, if I win, please inform me via the following e-mail address […………..] 2. No, I do not wish to participate
Q91 IN PROGRESS Software panel question(s)
Q92A (If Q90 = 1. Yes)
single Can CIM invite you in the future to participate in other media studies?
1: Yes 2: No
Q92B (If Q90 = 2. No)
single + open Can CIM invite you in the future to participate in other media studies?
1. Yes, please invite me via the following e-mail address […………..] 2. No, I do not wish to be invited
For 12- to 17-year-olds, a simplified version is used. You can find it below.
What is your year of birth?
12-14 years: You have to ask your parents for permission if you want to take part in the study. Do you give permission for your child to take part in the survey?
1: Yes 2: No
Do you live in Belgium? Yes/No
What province do you live in? Antwerp Brussels Hainaut Limburg Liège Luxembourg Namur East-Flanders Vlaams-Brabant Waals-Brabant West-Flanders I do not live in Belgium
What is your zip code? (Only for province Vlaams-Brabant)
Generally, where do you connect to the Internet using a desktop computer or laptop?
at home
at work
at school or at the university
at friends or acquaintances
other place
Over the past month, have you at any time used the internet via one of the following devices (e.g. to surf, send or receive e-mails, use online banking or social networks, …)?
Desktop computer
Portable computer (laptop)
Tablet
Smartphone (mobile telephone with internet access)
Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...)
TV player
Another device
Who uses internet on the device you are on now? 1: mainly me 2: mainly other children 3. mainly adults
What is your gender? Female/Male
Do you wish to participate in the sweepstake? (You can win nice prizes)
Yes, if I win, please inform me via the following e-mail address
No, I do not wish to participate
Can CIM invite you in the future to participate in other media studies?
Yes, please invite me via the following e-mail address: ______
No, I do not wish to be invited
6.3 CIM Internet Tablet Panel and CIM Internet Smartphone Panel
Multiplatform questions group
0 What is your year of birth?
1 You have to ask your parents for permission if you want to take part in the study.
2 Do you give permission for your child to take part in the survey?
0: Yes 1: No
3 How often do you usually use internet?
0: 7 days a week 1: 5 or 6 days a week 2: 3 or 4 days a week 3: 1 or 2 days a week 4: Less than 1 day a week 5: Less than 1 day a month
4 What device are you using right now to fill out this survey?
0: Tablet 1: Smartphone (mobile telephone with internet access) 2: Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...) 3: TV player 4: Other
5 Do you use this tablet to visit internet websites at least once a month?
0: Yes 1: No
6 How many people besides you use this tablet?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
7 Do you use this smartphone to visit internet websites at least once a month?
0: Yes 1: No
8 How many people besides you use this smartphone?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
9 What is your share in the overall internet use from this tablet?
0: I am the main user of this device 1: I use this device to the same degree as other user(s) 2: I make less use of this device than other user(s) 3: I don't know
10 How many tablets in total do you use at least once a month to connect to the internet? Please think about all tablets, not only those that you own.
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
11 What is your share in the overall internet use from this smartphone?
0: I am the main user of this smartphone 1: I use this smartphone to the same degree as other user(s) 2: I make less use of this smartphone than other user(s) 3: I don't know
12 How many smartphones in total do you use at least once a month to connect to the internet? Please think only about smartphones which you are the main user of.
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
13 Over the past month, have you at any time used the internet via one of the following devices (e.g. to surf, send or receive e-mails, use online banking or social networks,…)?
0: Desktop computer 1: Portable computer (laptop) 2: Smartphone (mobile telephone with internet access) 3: Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...) 4: TV player 5: Another device 6: I don't use other devices
14 Over the past month, have you at any time used the internet via one of the following devices (e.g. to surf, send or receive e-mails, use online banking or social networks,…)?
0: Desktop computer 1: Portable computer (laptop) 2: Tablet 3: Portable multimedia console (e.g. PlayStation Portable, Nintendo DS,...) 4: TV player 5: Another device 6: I don't use other devices
15 How many desktop computers in total do you use at least once a month to connect to the internet?
0: 1 1: 2 2: 3 3: 4 4: 5 and more
16 How many portable computers (laptops) in total do you use at least once a month to connect to the internet?
0: 1 1: 2 2: 3 3: 4 4: 5 and more
17 How many smartphones in total do you use at least once a month to connect to the internet?
0: 1 1: 2 2: 3 3: 4 4: 5 and more
18 How many people besides you use this smartphone?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
19 Please think at this moment only about this smartphone which you use most often. How many people besides you use this smartphone?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
20 How many tablets in total do you use at least once a month to connect to the internet?
0: 1 1: 2 2: 3 3: 4 4: 5 and more
21 How many people besides you use this tablet?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
22 Please think at this moment only about this tablet which you use most often. How many people besides you use this tablet?
0: 0 1: 1 2: 2 3: 3 4: 4 5: 5 and more
23 What is your share in the overall internet use from this smartphone?
0: I am the main user of this smartphone 1: I use this smartphone to the same degree as other user(s) 2: I make less use of this smartphone than other user(s) 3: I don't know
24 Please think at this moment only about this smartphone which you use most often. What is your share in the overall internet use from this smartphone?
0: I am the main user of this smartphone 1: I use this smartphone to the same degree as other user(s) 2: I make less use of this smartphone than other user(s) 3: I don't know
25 What is your share in the overall internet use from this tablet?
0: I am the main user of this tablet 1: I use this tablet to the same degree as other user(s) 2: I make less use of this tablet than other user(s) 3: I don't know
26 Please think at this moment only about this tablet, which you use most often. What is your share in the overall internet use from this tablet?
0: I am the main user of this tablet 1: I use this tablet to the same degree as other user(s) 2: I make less use of this tablet than other user(s) 3: I don't know
Demographics
27 What is your gender?
0: Female 1: Male
28 What is the highest level of education that you have reached successfully, either in day school or evening school?
0: Primary school or no degree 1: Lower general secondary education (first three years achieved) 2: Lower secondary technical, artistic or professional education (first three years achieved) 3: Higher general secondary education (last three years achieved) 4: Higher secondary technical, artistic education (last three years achieved) 5: Higher secondary professional education (last three years achieved) 6: Candidate, bachelor (academic or professional), graduate 7: University license, master, post graduate, extra-university higher education (long type) 8: University license with additional degree, master after master 9: Doctorate with thesis
29 Which category corresponds best to your current professional situation?
0: I am a pupil / student / in formation 1: I work full-time 2: I work part-time 3: I have temporarily suspended my professional occupation / I'm using up time credit 4: I have no employment at the moment
30 Do you live in Belgium?
0: Yes 1: No
31 What is your zip code?
Annex 2: socio-demographic variables in reporting
The socio-demographic variables can be divided into two types. Most criteria are the direct result of
the questions, such as age, gender or professional activity of the person interviewed. Others are
derived by processing the basic data, such as social groups and residences.
The following variables are available in gemiusExplorer and the planning files:
GENDER MEN
GENDER WOMEN
AGE IN YEARS
CIM LANGUAGE FRENCH
CIM LANGUAGE DUTCH
NIELSEN REGION NIELSEN I
NIELSEN REGION NIELSEN II
NIELSEN REGION NIELSEN III
NIELSEN REGION NIELSEN IV
NIELSEN REGION NIELSEN V
PROVINCES WALLOON BRABANT
PROVINCES BRUSSELS 19
PROVINCES ANTWERP
PROVINCES FLEMISH BRABANT
PROVINCES WEST FLANDERS
PROVINCES EAST FLANDERS
PROVINCES HAINAUT
PROVINCES LIEGE
PROVINCES LIMBURG
PROVINCES LUXEMBURG
PROVINCES NAMUR
EDUCATIONAL LEVEL INTERVIEWED PERSON NONE PRIMARY
EDUCATIONAL LEVEL INTERVIEWED PERSON SECONDARY LOW
EDUCATIONAL LEVEL INTERVIEWED PERSON SECONDARY HIGH GEN., TECHN. ART.
EDUCATIONAL LEVEL INTERVIEWED PERSON SECONDARY HIGH PROF
EDUCATIONAL LEVEL INTERVIEWED PERSON BACHELOR
EDUCATIONAL LEVEL INTERVIEWED PERSON MASTER
EMPLOYMENT STATUS RESPONDENT PUPIL, STUDENT, IN FORMATION
EMPLOYMENT STATUS RESPONDENT AT WORK FULL-TIME
EMPLOYMENT STATUS RESPONDENT AT WORK PART-TIME
EMPLOYMENT STATUS RESPONDENT TEMPORARILY SUSPENDED PROFESSIONAL OCCUPATION
EMPLOYMENT STATUS RESPONDENT NO EMPLOYMENT
PROFESSIONAL ACTIVITY INTERVIEWED PERSON SELF EMPLOYED
PROFESSIONAL ACTIVITY INTERVIEWED PERSON SALARY PUBLIC SECTOR
PROFESSIONAL ACTIVITY INTERVIEWED PERSON SALARY PRIVATE SECTOR
PROFESSIONAL ACTIVITY INTERVIEWED PERSON WITHOUT PROFESSIONAL ACTIVITY
PROFESSION INTERVIEWED PERSON SMALL COMMERCE, ARTISAN, INDUSTRIAL AND FREELANCE 5-, FARMER
PROFESSION INTERVIEWED PERSON UPPER MANAGEMENT, LIBERAL PROFESSIONS, BIG COMMERCE, INDUSTRIAL AND FREELANCE 6+
PROFESSION INTERVIEWED PERSON MIDDLE MANAGEMENT
PROFESSION INTERVIEWED PERSON EMPLOYEE
PROFESSION INTERVIEWED PERSON SKILLED WORKER
PROFESSION INTERVIEWED PERSON UNSKILLED WORKER
PROFESSION INTERVIEWED PERSON HOUSEWIFE
PROFESSION INTERVIEWED PERSON RETIRED
PROFESSION INTERVIEWED PERSON UNEMPLOYED
PROFESSION INTERVIEWED PERSON STUDENT
PROFESSION INTERVIEWED PERSON OTHER
MAIN RESPONSABLE FOR INCOME MRI YES
MAIN RESPONSABLE FOR INCOME MRI NO
SOCIAL GROUPS GROUP 1
SOCIAL GROUPS GROUP 2
SOCIAL GROUPS GROUP 3
SOCIAL GROUPS GROUP 4
SOCIAL GROUPS GROUP 5
SOCIAL GROUPS GROUP 6
SOCIAL GROUPS GROUP 7
SOCIAL GROUPS GROUP 8
SOCIAL GROUPS DON’T KNOW
NUMBER OF PEOPLE IN THE HOUSEHOLD 1 PERSON
NUMBER OF PEOPLE IN THE HOUSEHOLD 2 PERSONS
NUMBER OF PEOPLE IN THE HOUSEHOLD 3 PERSONS
NUMBER OF PEOPLE IN THE HOUSEHOLD 4 PERSONS
NUMBER OF PEOPLE IN THE HOUSEHOLD 5 PERSONS AND MORE
GRANDPARENT YES
GRANDPARENT NO
GRANDCHILDREN 0-2 YRS YES 0-2
GRANDCHILDREN 0-2 YRS NO 0-2
GRANDCHILDREN 3-5 YRS YES 3-5
GRANDCHILDREN 3-5 YRS NO 3-5
GRANDCHILDREN 6-11 YRS YES 6-11
GRANDCHILDREN 6-11 YRS NO 6-11
GRANDCHILDREN 12-14 YRS YES 12-14
GRANDCHILDREN 12-14 YRS NO 12-14
GRANDCHILDREN 15-17 YRS YES 15-17
GRANDCHILDREN 15-17 YRS NO 15-17
GRANDCHILDREN 18-24 YES 18-24
GRANDCHILDREN 18-24 NO 18-24
GRANDCHILDREN 25+ YES 25+
GRANDCHILDREN 25+ NO 25+
PERSON RESPONSABLE FOR PURCHASES PRP YES
PERSON RESPONSABLE FOR PURCHASES PRP NO
LANGUAGE MOST SPOKEN AT HOME FRENCH
LANGUAGE MOST SPOKEN AT HOME DUTCH
LANGUAGE MOST SPOKEN AT HOME GERMAN
LANGUAGE MOST SPOKEN AT HOME ENGLISH
LANGUAGE MOST SPOKEN AT HOME OTHER LANGUAGE
INTERNET USAGE DEVICES DESKTOP COMPUTER
INTERNET USAGE DEVICES PORTABLE COMPUTER
INTERNET USAGE DEVICES TABLET
INTERNET USAGE DEVICES SMARTPHONE
INTERNET USAGE DEVICES PORTABLE MULTIMEDIA CONSOLE
INTERNET USAGE DEVICES TV PLAYER
INTERNET USAGE DEVICES ANOTHER DEVICE
INTERVIEWED PERSON'S AGE 12-14
INTERVIEWED PERSON'S AGE 15-17
INTERVIEWED PERSON'S AGE 18-20
INTERVIEWED PERSON'S AGE 21-24
INTERVIEWED PERSON'S AGE 25-29
INTERVIEWED PERSON'S AGE 30-34
INTERVIEWED PERSON'S AGE 35-39
INTERVIEWED PERSON'S AGE 40-44
INTERVIEWED PERSON'S AGE 45-49
INTERVIEWED PERSON'S AGE 50-54
INTERVIEWED PERSON'S AGE 55-59
INTERVIEWED PERSON'S AGE 60-64
INTERVIEWED PERSON'S AGE 65-69
INTERVIEWED PERSON'S AGE 70-74
INTERVIEWED PERSON'S AGE 75 AND MORE
INTERVIEWED PERSON AGE 18+ 12-17
INTERVIEWED PERSON AGE 18+ 18 +
INTERVIEWED PERSON'S AGE (7 cat.) 12-14
INTERVIEWED PERSON'S AGE (7 cat.) 15-24
INTERVIEWED PERSON'S AGE (7 cat.) 25-34
INTERVIEWED PERSON'S AGE (7 cat.) 35-44
INTERVIEWED PERSON'S AGE (7 cat.) 45-54
INTERVIEWED PERSON'S AGE (7 cat.) 55-64
INTERVIEWED PERSON'S AGE (7 cat.) 65+
PRESENCE OF CHILDREN 1 CHILD
PRESENCE OF CHILDREN 2 CHILDREN
PRESENCE OF CHILDREN 3 CHILDREN AND MORE
PRESENCE OF CHILDREN NO CHILDREN -55
PRESENCE OF CHILDREN NO CHILDREN +55
PRESENCE CHILDREN -1 YEAR CHILD -1Y YES
PRESENCE CHILDREN -1 YEAR CHILD -1Y NO
PRESENCE CHILDREN 1 YEAR CHILD 1Y YES
PRESENCE CHILDREN 1 YEAR CHILD 1Y NO
PRESENCE CHILDREN 2 YEARS CHILD 2Y YES
PRESENCE CHILDREN 2 YEARS CHILD 2Y NO
PRESENCE CHILDREN 3-4 YEARS CHILD 3-4Y YES
PRESENCE CHILDREN 3-4 YEARS CHILD 3-4Y NO
PRESENCE CHILDREN 5-6 YEARS CHILD 5-6Y YES
PRESENCE CHILDREN 5-6 YEARS CHILD 5-6Y NO
PRESENCE CHILDREN 7-12 YEARS CHILD 7-12Y YES
PRESENCE CHILDREN 7-12 YEARS CHILD 7-12Y NO
PRESENCE CHILDREN 13-14 YEARS CHILD 13-14Y YES
PRESENCE CHILDREN 13-14 YEARS CHILD 13-14Y NO
Annex 3: Calculation of social groups
The social groups are based on the education and occupation of the PRI (Person mainly Responsible
for the Income).
For the CIM Internet study, we use a simplified version of the social groups as they are calculated in
the CIM Golden Standard. The simplified version leaves out the last occupation for people who are
retired or jobless and details on the PRI’s occupation if the respondent is not the PRI.
Education and occupation are each given a score; these scores are then multiplied. The result is split
into 8 groups, based on the distribution of the internet population in the structural study (CIM Press
Study 2015 Q1). In the Press Study the entire population (12+ internet and non-internet) is divided into
8 equal groups based on percentiles of the scores. This makes the social groups comparable to other
CIM studies. Since the internet population has a higher social level than the general population, the
highest social groups represent more than 1/8 of the panel.
In the tables below you can read the scores used for education and occupation.
Education Score
primary school or no degree 10
lower general secondary education (first three years achieved) 35
lower secondary technical, artistic or professional education (first three years achieved) 25
higher general secondary education (last three years achieved) 50
higher secondary technical, artistic education (last three years achieved) 45
higher secondary professional education (last three years achieved) 40
candidate, bachelor (academic or professional), graduate 75
university license, master, post graduate, extra-university higher education (long type) 85
university license with additional degree, master after master 90
doctorate with thesis 100
Occupation Score Comment
I am a pupil / student / in formation 10
I am a houseman / housewife 10
I am incapacitated 10
I am jobless 26 60 % of last profession (based on weighted average in press study 2012-2013)
I am in early retirement 39 75 % of last profession (based on weighted average in press study 2012-2013)
I am retired 34 60 % of last profession (based on weighted average in press study 2012-2013)
other 50
farmer 45
artisan 70
merchant, industrialist 90
worker 45 weighted average for skilled and unskilled (on press study 2012-2013)
clerk 62 weighted average for office worker or not (on press study 2012-2013)
middle management (e.g. head of department or division, ...) 72 weighted average for middle management (on press study 2012-2013)
middle management (e.g. head of department or division, ...) + resp. for <=5 people 70
middle management (e.g. head of department or division, ...) 75
member of the general management, senior executive (e.g. director, manager, ...) 94 average for middle management
member of the general management, senior executive (e.g. director, manager, ...) + resp. for <=5 people 80
member of the general management, senior executive (e.g. director, manager, ...) + resp. for 6-10 people 90
member of the general management, senior executive (e.g. director, manager, ...) + resp. for 11+ people 100
free profession 100
freelance, self-employed 80 metrical average
freelance, self-employed 70
freelance, self-employed 90
skilled worker 50
unskilled worker 25
clerk with primarily office work 65
clerk with little or no office work 60
The social groups are available in the audience reporting starting with the February 2015 data.
These are the boundaries used to compute social groups in February 2015 and the following months:
February 2015
Group From To %
Group 1 5525 10000 17,2%
Group 2 4500 5400 18,6%
Group 3 3000 4250 11,9%
Group 4 2275 2925 12,4%
Group 5 1750 2250 12,1%
Group 6 1170 1700 12,4%
Group 7 500 1125 9,1%
Group 8 100 450 6,2%
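The calculation described in this annex (education score multiplied by occupation score, then mapped to a group using the February 2015 boundaries) can be sketched as follows. This is an illustration, not the production implementation: the dictionaries hold only a subset of the scores from the tables above under abbreviated, hypothetical keys, and the lookup assumes that a group is assigned by its lower bound (the "From" column), so that any product falling between two listed ranges goes to the lower group.

```python
# Subset of the education scores from the table above (keys abbreviated).
EDUCATION_SCORES = {
    "primary school or no degree": 10,
    "higher general secondary education": 50,
    "candidate, bachelor, graduate": 75,
    "doctorate with thesis": 100,
}

# Subset of the occupation scores from the table above (keys abbreviated).
OCCUPATION_SCORES = {
    "pupil / student / in formation": 10,
    "worker": 45,
    "clerk": 62,
    "free profession": 100,
}

# (group, lower bound) pairs taken from the "From" column of the February 2015 table.
GROUP_LOWER_BOUNDS = [
    (1, 5525), (2, 4500), (3, 3000), (4, 2275),
    (5, 1750), (6, 1170), (7, 500), (8, 100),
]


def social_group(education, occupation):
    """Multiply the two scores and return the first group whose lower bound is reached."""
    score = EDUCATION_SCORES[education] * OCCUPATION_SCORES[occupation]
    for group, lower_bound in GROUP_LOWER_BOUNDS:
        if score >= lower_bound:
            return group
    raise ValueError(f"score {score} is below the lowest boundary")
```

For example, a PRI with a bachelor's degree (75) working as a clerk (62) scores 75 x 62 = 4650, which falls in the 4500-5400 range of Group 2.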