Page 149 - Invited Paper Session (IPS) - Volume 1
IPS122 Elise C. et al.
tested in comparison to the current one (based on a classic regression) and to
apply the selected methods on larger scales.
4.3. Mobile phone data
Sponsor: INSEE Regional Studies Directorate;
Team: SSP Lab (one permanent member and one intern);
Schedule: several experiments since 2016;
Deliverables: institutional and academic contributions, data-processing
techniques and experimental prototypes.
Mobile phone data have proven to be an exciting new data source for
official statistics. The SSP Lab intends to explore the institutional, legal,
technical and methodological challenges that come with integrating
mobile phone data into official statistics.
a. Data access at Orange Labs
Access to a pseudo-anonymised dataset collected by Orange for billing
purposes has been made possible through an agreement between Orange
Labs, Eurostat and INSEE. The dataset consists of Call Detail Records (CDR)
describing information on each phone call and text message (SMS) sent or
received by Orange users in the period from May to mid-October 2007.
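To make the shape of such data concrete, the sketch below models a single CDR as a small record type. It is only an illustration: the field names, the semicolon-separated layout and the presence of an antenna identifier are assumptions, not the actual schema of the Orange dataset.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDetailRecord:
    """One billing event. All field names are hypothetical."""
    user_id: str         # pseudonymised subscriber identifier
    timestamp: datetime  # when the call or SMS took place
    event_type: str      # "call" or "sms"
    direction: str       # "out" (sent) or "in" (received)
    antenna_id: str      # antenna that handled the event (assumed field)


def parse_cdr(line: str) -> CallDetailRecord:
    """Parse one line of an assumed semicolon-separated CDR file."""
    user_id, ts, event_type, direction, antenna_id = line.strip().split(";")
    return CallDetailRecord(
        user_id=user_id,
        timestamp=datetime.fromisoformat(ts),
        event_type=event_type,
        direction=direction,
        antenna_id=antenna_id,
    )


# Usage with a made-up record inside the May to mid-October 2007 window.
record = parse_cdr("u42;2007-05-03T22:15:00;sms;out;A17")
```

A record of this kind only locates a user when a call or SMS occurs, and only at the resolution of the antenna, which is one source of the location imprecision discussed later in this section.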
b. First experiments
A number of experiments have been performed. The first experiment aimed
to detect urban zones by applying supervised classifiers to mobile phone
data (Vanhoof, Combes, de Bellefon, 2017). The second experiment sought to
measure the residential population using night-time CDR and more advanced
processing of the data (de Bellefon, Givord, Sakarovitch, Vanhoof, 2018).
The last experiment, still in progress, analyses segregation by combining
mobile phone data and fiscal data (Galiana, Sakarovitch, Smoreda, 2018,
presented at this conference).
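The idea of using night-time CDR to approximate residence can be sketched with a simple heuristic: assign each user to the antenna where they are most often active at night, then count users per antenna. This is a minimal illustration, not the method of the cited papers; the night window (21:00 to 06:59) and the (user, timestamp, antenna) event layout are assumptions.

```python
from collections import Counter, defaultdict
from datetime import datetime

NIGHT_START, NIGHT_END = 21, 7  # assumed night window: 21:00 to 06:59


def is_night(ts: datetime) -> bool:
    """True when the timestamp falls inside the assumed night window."""
    return ts.hour >= NIGHT_START or ts.hour < NIGHT_END


def detect_home_antenna(events):
    """Map each user to the antenna with the most night-time events.

    `events` is an iterable of (user_id, timestamp, antenna_id) tuples.
    Users with no night-time activity are left out of the result.
    """
    night_counts = defaultdict(Counter)
    for user_id, ts, antenna_id in events:
        if is_night(ts):
            night_counts[user_id][antenna_id] += 1
    return {
        user: counts.most_common(1)[0][0]
        for user, counts in night_counts.items()
    }


def residents_per_antenna(homes):
    """Count detected residents per antenna."""
    return Counter(homes.values())


# Usage with made-up events.
events = [
    ("u1", datetime(2007, 5, 3, 22, 15), "A"),
    ("u1", datetime(2007, 5, 4, 6, 30), "A"),
    ("u1", datetime(2007, 5, 4, 14, 0), "B"),   # daytime, ignored
    ("u1", datetime(2007, 5, 4, 23, 0), "B"),
    ("u2", datetime(2007, 5, 3, 23, 45), "B"),
    ("u3", datetime(2007, 5, 4, 12, 0), "A"),   # daytime only, no home found
]
homes = detect_home_antenna(events)
residents = residents_per_antenna(homes)
```

Such counts cover only the operator's active users, so a real estimate would still need to be adjusted for market share and for users who are never active at night.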
c. Future prospects
These experiments showed that mobile phone data are extremely rich, but
may be unsuitable for some applications because of location imprecision or
representativeness bias. To exploit the full richness of these data,
the next experiments will focus on estimating the population present
at a given place and time (as opposed to the residential population).
Different time and geographical scales will be explored, potentially with
signalling data. A new agreement to continue the collaboration is in
progress.
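As a rough illustration of what counting the population present at a given place and time could look like, the sketch below counts distinct users observed per antenna and per time slot. It is a simplified assumption, not the planned methodology: the event layout, the slot length parameter `slot_hours` (which should divide 24) and the use of raw user counts are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def present_population(events, slot_hours=1):
    """Count distinct users seen per (antenna_id, slot_start).

    `events` is an iterable of (user_id, timestamp, antenna_id) tuples.
    Timestamps are floored to the start of their slot; `slot_hours`
    sets the time scale and is assumed to divide 24.
    """
    seen = defaultdict(set)
    for user_id, ts, antenna_id in events:
        slot_start = ts.replace(
            hour=ts.hour - ts.hour % slot_hours,
            minute=0, second=0, microsecond=0,
        )
        seen[(antenna_id, slot_start)].add(user_id)
    return {key: len(users) for key, users in seen.items()}


# Usage with made-up events.
events = [
    ("u1", datetime(2007, 5, 4, 14, 5), "A"),
    ("u1", datetime(2007, 5, 4, 14, 40), "A"),  # same user, counted once
    ("u2", datetime(2007, 5, 4, 14, 50), "A"),
    ("u2", datetime(2007, 5, 4, 15, 10), "B"),
]
hourly = present_population(events, slot_hours=1)
two_hourly = present_population(events, slot_hours=2)
```

Changing `slot_hours`, or grouping antennas into larger zones before counting, is one simple way to vary the time and geographical scales.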
138 | ISI WSC 2019