Page 13 - Contributed Paper Session (CPS) - Volume 6
CPS1465 Claude Macchi et al.
1. Introduction
Each company and establishment stored in the SBER has a Swiss Economic
Activity Classification (NOGA) code, which is based on the Classification of
Economic Activities in the European Community (NACE). The codification is
carried out in two main steps: first, a provisional code is assigned when the
business is integrated into the registers; second, a few months later, that
provisional code is validated.
During the first codification, the FSO uses the economic activity code
supplied by the data source that provides the SBER with information on the
new company. Depending on the data source, this coding may have been
done either by the source itself (mainly in the case of administrative data or
registers external to the FSO), or by third-party companies (in the case of
announcements from commercial registers). The first code is then validated by
the FSO as part of a specific survey of all new companies registered in the
SBER. Based on the descriptions of economic activities provided by the
companies themselves, coders identify the terms or concepts they deem relevant
and compare them with a list of keywords, which currently contains more than
11’000 items and concepts in four languages (German, French, Italian and
English), each linked to a NOGA position. This allows coders to
select a code and assign it to the observed company. After this phase, the
codes assigned may be updated or corrected at any time, on the basis of
inputs from surveys carried out in the context of statistical production,
administrative sources, external registers, the companies themselves and
information obtained on the Internet. The codes defining the economic
activity of companies are all assigned based on oral or written information
provided by the companies themselves. Codification therefore consists mainly
of reading, understanding and interpreting a text, and then identifying
terms or concepts that are compared with a list of keywords linked to the
classification codes.
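The manual lookup described above can be pictured with a minimal sketch. It is not the FSO's procedure or the NOGAuto system: the keyword rows, the code values, and the function names `normalise`, `build_index` and `propose_codes` are all hypothetical and serve only to show how terms found in a description could be compared with a multilingual keyword list linked to classification codes.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical excerpt of a keyword list: (keyword, language, code).
# The rows and code values are illustrative, not taken from the FSO list.
KEYWORDS = [
    ("bakery", "en", "107100"),
    ("boulangerie", "fr", "107100"),
    ("bäckerei", "de", "107100"),
    ("panetteria", "it", "107100"),
    ("hairdresser", "en", "960201"),
    ("coiffeur", "fr", "960201"),
]


def normalise(text: str) -> str:
    """Lower-case the text and strip accents, so 'Bäckerei' becomes 'backerei'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(keywords):
    """Map each normalised keyword to the set of codes it is linked to."""
    index = {}
    for keyword, _language, code in keywords:
        index.setdefault(normalise(keyword), set()).add(code)
    return index


def propose_codes(description: str, index) -> list:
    """Count keyword hits per code and return (code, hits), most hits first."""
    votes = Counter()
    for token in re.findall(r"\w+", normalise(description)):
        for code in index.get(token, ()):
            votes[code] += 1
    return votes.most_common()


if __name__ == "__main__":
    index = build_index(KEYWORDS)
    print(propose_codes("Boulangerie et pâtisserie artisanale", index))
```

Exact token matching is the simplest possible comparison. A human coder also interprets context, synonyms and multi-word concepts, which this sketch does not attempt.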
With the NOGAuto system, the FSO aims to build a machine learning
system that automates the assignment of economic activity codes
to SBER companies. This will make it possible to:
• reduce to a minimum the interpretation made by coders of texts
describing the economic activities of companies,
• harmonise and standardise the assignment of codes and
• minimise the time spent on the coding activity.
NOGAuto is not being built in one go but, like an onion, in successive layers. The
central nucleus of the onion, the first phase of construction, makes it possible
to validate the codes currently associated with the units already registered in
the SBER. The following layer will assist coders with code proposals for the
activities of companies to be codified. The interaction of coders, who will
accept or reject the codes proposed by the system, as well as the continuous
2 | ISI WSC 2019