Page 224 - Invited Paper Session (IPS) - Volume 1
IPS142 Jorge L. V. V. et al.
distributions according to sex, age and place of residence in an external
source.
Current activity status
Method and sources: This is one of the variables that draws on the largest
number of sources: Social Security Registers, Unemployment Registers, Public
Aids Database, Mutualities Registers, Register of Retired Civil Servants,
Registers of Students, Tax Agency information, and the 2001 and 2011 Censuses.
General issues: Information on this variable will only be provided for people
aged 15 or more. With so many sources, it is normal to find conflicts among
them in some cases. To resolve these conflicts, priority rules
based on the recommendations of the United Nations and the European
Regulation for Censuses have been used. It is also normal that some people (for
example, women aged 55 or more) do not appear in any of the sources used;
they will be classified as “others” within the group Outside of the labour
force. The results obtained for this variable in the 2016-PFC are very similar to
those of the LFS, although the category of the unemployed still needs to be
refined so that it is aligned with the ILO recommendation.
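The priority-rule approach described above can be illustrated with a minimal sketch. The actual rules and their order are not given in this section, so the source names, the ordering in `SOURCE_PRIORITY`, the status labels and the function `resolve_activity_status` are all hypothetical and serve only to show the mechanism.

```python
# Hypothetical priority order (first entry wins). The real rules follow UN and
# European Regulation recommendations and are NOT reproduced here.
SOURCE_PRIORITY = [
    "social_security",
    "unemployment_register",
    "retired_civil_servants",
    "student_register",
]


def resolve_activity_status(age, source_values):
    """Return one activity status from possibly conflicting sources.

    source_values maps a (hypothetical) source name to the status it reports.
    """
    if age < 15:
        # The variable is only provided for people aged 15 or more.
        return None
    for source in SOURCE_PRIORITY:
        status = source_values.get(source)
        if status is not None:
            return status
    # Person found in no source: "others" within Outside of the labour force.
    return "outside_labour_force:others"


# A person reported both as a student and as employed: the higher-priority
# source decides.
conflict = {"student_register": "student", "social_security": "employed"}
print(resolve_activity_status(40, conflict))
```

Keeping the rules in a single ordered list makes it possible to review or revise the order without touching the resolution logic.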
4. A new quality framework
One of the main novelties of the 2021 Census will be a new
mechanism that enables users to evaluate the quality of each census
variable and that will also enable INE to make better decisions. The idea is to
create, for each variable (for example, legal marital status or educational level
attained), a companion variable that stores the method
or type of source used to provide the value for every person.
The procedure will consist of creating, for each Census variable, a
new derived variable with several categories that take into account various
factors, reflecting whether the source is direct or indirect, or whether the
information has been imputed.
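A minimal sketch of such a derived variable is shown below, assuming only the three categories named in the text (direct source, indirect source, imputed). The names `SourceQuality`, `CensusCell` and `quality_flags`, the `_quality` suffix and the example values are illustrative assumptions, not INE's actual design.

```python
from dataclasses import dataclass
from enum import Enum


class SourceQuality(Enum):
    """Hypothetical categories of the derived quality variable."""
    DIRECT = "direct"
    INDIRECT = "indirect"
    IMPUTED = "imputed"


@dataclass
class CensusCell:
    """A census value together with how it was obtained."""
    value: str
    quality: SourceQuality


def quality_flags(record):
    """Build the derived variables (one per census variable) for a person."""
    return {f"{name}_quality": cell.quality.value for name, cell in record.items()}


# Illustrative person record; the values are made up for the example.
person = {
    "legal_marital_status": CensusCell("married", SourceQuality.DIRECT),
    "educational_level": CensusCell("secondary", SourceQuality.IMPUTED),
}
print(quality_flags(person))
```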
If we focus on the way each cell estimate is obtained, we will be able to
quantify quality along two dimensions: quality for a specific variable and
quality for each person.
An analysis by columns (variables) across people allows us to determine, for
every variable involved, the percentage of records provided by each
source or method and the percentage of imputed records. This information
helps us to assess the quality of the sources.
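The column-wise analysis can be sketched as a simple tabulation of the quality flags for one variable. The function `method_shares`, the flag labels and the four sample records are hypothetical and only illustrate the computation.

```python
from collections import Counter


def method_shares(flags_by_person, variable):
    """Percentage of records per source/method for one census variable."""
    counts = Counter(flags[variable] for flags in flags_by_person)
    total = sum(counts.values())
    return {method: 100.0 * n / total for method, n in counts.items()}


# Made-up quality flags for four people (not real census data).
sample = [
    {"educational_level": "direct"},
    {"educational_level": "direct"},
    {"educational_level": "indirect"},
    {"educational_level": "imputed"},
]
shares = method_shares(sample, "educational_level")
print(shares)
```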
If we concentrate on rows (people), we can identify the records with the
poorest quality level: those that have missing values or imputed
information in several variables. It is very likely that we will identify profiles
of people whose information is missing and difficult to estimate from
administrative records, such as foreigners or people living in deprived areas.
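The row-wise analysis can be sketched in the same spirit: count, for each person, how many variables are missing or imputed, and flag those above a cutoff. The text says only "several variables", so the threshold of 2, the flag labels and the function names below are assumptions made for illustration.

```python
WEAK_METHODS = ("imputed", "missing")  # hypothetical flag labels


def weak_variable_count(flags):
    """Number of variables of one person that are missing or imputed."""
    return sum(1 for method in flags.values() if method in WEAK_METHODS)


def poorest_records(flags_by_person, threshold=2):
    """IDs of people with at least `threshold` weak variables (assumed cutoff)."""
    return [
        person_id
        for person_id, flags in flags_by_person.items()
        if weak_variable_count(flags) >= threshold
    ]


# Made-up quality flags keyed by an illustrative person identifier.
population = {
    "p1": {"marital_status": "direct", "education": "direct", "activity": "indirect"},
    "p2": {"marital_status": "imputed", "education": "missing", "activity": "direct"},
    "p3": {"marital_status": "direct", "education": "imputed", "activity": "direct"},
}
print(poorest_records(population))
```

The flagged identifiers could then be cross-tabulated with characteristics such as nationality or area of residence to look for the profiles mentioned above.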
213 | ISI WSC 2019