Page 16 - Contributed Paper Session (CPS) - Volume 6
CPS1465 Claude Macchi et al.
be linked with the different activity codes to be assigned to companies. This
model will be continuously enriched with elements from the descriptions of
the activities of the new companies to be codified or additional information
from the SBER. Here again, feedback to the previous phase of the
process is essential, so that the model can be continuously corrected and
improved.
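As an illustration only, the idea of a model that links keywords to activity codes and is enriched with each newly coded description can be sketched as follows. This is a minimal sketch under stated assumptions, not the NOGAuto implementation: the class KeywordModel, its methods enrich and propose, the co-occurrence scoring and the example code strings are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case a description and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class KeywordModel:
    """Toy keyword-to-code model (hypothetical, for illustration only)."""

    def __init__(self):
        # keyword -> Counter of activity codes seen together with that keyword
        self.links = defaultdict(Counter)

    def enrich(self, description, code):
        """Add the keywords of one already coded description to the model."""
        for word in set(tokenize(description)):
            self.links[word][code] += 1

    def propose(self, description, top=3):
        """Rank codes by how often they co-occurred with the description's keywords."""
        scores = Counter()
        for word in set(tokenize(description)):
            scores.update(self.links.get(word, {}))
        return [code for code, _ in scores.most_common(top)]


# Example codes are placeholders, not asserted to be correct NOGA assignments.
model = KeywordModel()
model.enrich("Software development and IT consulting", "620100")
model.enrich("Restaurant and catering services", "561001")
model.enrich("Development of web software", "620100")

print(model.propose("Custom software development"))  # ['620100']
```

Each call to enrich plays the role of the continuous enrichment described above: a description that has received a code adds its keywords to the model, so later proposals can draw on it.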
The last step in this phase is the evaluation of the codes that the system
proposes to the coders, who either validate them and integrate them into the
SBER or reject them. These decisions generate feedback to earlier stages of the
process, until the defined success criteria are met.
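The evaluation step can be pictured with a short sketch. It assumes, purely for illustration, that each coder decision is recorded as accepted or rejected and that the success criterion is a minimum share of accepted proposals. The record layout Decision, the function names and the threshold of 0.9 are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """One coder decision on a proposed code (hypothetical record layout)."""
    company_id: str
    proposed_code: str
    accepted: bool


def acceptance_rate(decisions):
    """Share of proposed codes that the coders validated."""
    if not decisions:
        return 0.0
    return sum(d.accepted for d in decisions) / len(decisions)


def rejected_cases(decisions):
    """Cases to send back to the earlier stages as feedback."""
    return [d for d in decisions if not d.accepted]


def criteria_met(decisions, threshold=0.9):
    """True once the acceptance rate reaches the assumed threshold."""
    return acceptance_rate(decisions) >= threshold


# Placeholder identifiers and codes, for illustration only.
decisions = [
    Decision("A", "620100", True),
    Decision("B", "561001", False),
    Decision("C", "620100", True),
    Decision("D", "471100", True),
]

print(acceptance_rate(decisions))                        # 0.75
print([d.company_id for d in rejected_cases(decisions)])  # ['B']
print(criteria_met(decisions))                           # False
```

In this sketch the rejected cases are what would flow back to the earlier stages, and the loop would continue until criteria_met returns True.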
3. Result
The NOGAuto project was launched in early 2018 and is still under
construction. Having built and tested the "Preparation" phase, we are
currently building the "Modelling" part. Full process testing, based on an
existing dataset that has already been tested manually and including the
evaluation of the codes that the system proposes at aggregate NOGA 2-digit
level, is scheduled for spring 2019. Productive implementation of the system
is planned in stages. In a first period, until the quality level defined for the
most detailed NOGA code level is reached (expected by mid-2020), the
system will be used only as a support tool for coders.
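To make the notion of evaluation at an aggregate level concrete, the following minimal sketch compares proposed codes with manually assigned reference codes after truncating both to their leading digits. It assumes that codes are stored as digit strings whose first two digits denote the aggregate level. The function name agreement_at_level and the example codes are hypothetical.

```python
def agreement_at_level(proposed, reference, digits=2):
    """Share of cases where proposed and reference codes share the first `digits` digits."""
    if len(proposed) != len(reference):
        raise ValueError("proposed and reference must have the same length")
    if not proposed:
        return 0.0
    hits = sum(p[:digits] == r[:digits] for p, r in zip(proposed, reference))
    return hits / len(proposed)


# Placeholder codes, for illustration only.
proposed = ["620100", "620200", "561001", "471100"]
reference = ["620100", "620900", "551001", "471900"]

print(agreement_at_level(proposed, reference, digits=2))  # 0.75
print(agreement_at_level(proposed, reference, digits=6))  # 0.25
```

The same proposals score much lower at the most detailed level than at the 2-digit level, which is consistent with a staged approach in which the aggregate level is evaluated first.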
NOGAuto is not only a tool for codifying the economic activities of
businesses; it can also be adapted to the needs of other
classifications. The more structured, standardised and targeted the
information to be codified, the easier it is to propose an automatic
codification. Initial discussions on adapting and implementing the
system for the classifications of occupations and of diseases and
health problems have already begun.
A central issue that has accompanied this machine learning project from
the beginning is the question of acceptance. The word "automation" was
quickly linked to "work reduction" and "job loss", which caused considerable
opposition to the project, principally among the staff responsible for
coding. Especially at the beginning of the work, cooperation with the
people who were supposed to provide the initial input on the codification
processes was quite difficult. An effort of communication, explanation and
clarification was necessary to gain the trust and collaboration of the staff. We
plan to present the complete system, as well as the results of the tests
performed with data at NOGA 2-digit level, at the ISI 2019 conference.
4. Discussion and Conclusion
Thanks to machine learning, the NOGAuto system will take over the
reading and interpretation tasks of coders, propose new keywords and
5 | ISI WSC 2019