Page 262 - Special Topic Session (STS) - Volume 2
STS490 Riaan d.J.
Programming skills: Numerical programming skills in languages such
as SAS, R and Python.
Data management skills: Topics should include databases and data
warehousing, concentrating on data manipulation and merging skills in
languages such as SQL, SAS, R and Python.
Subject matter knowledge in selected fields of application.
Professional problem-solving skills.
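As a small illustration of the data manipulation and merging skills listed above, the following is a minimal sketch, not a prescribed exercise. It uses SQL through Python's standard sqlite3 module. The table names, column names and values are hypothetical and chosen only for illustration.

```python
import sqlite3


def merge_customers_orders():
    """Join two small illustrative tables and total the order amounts per customer."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # Hypothetical tables and data, invented for this sketch.
    cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ann"), (2, "Ben"), (3, "Cara")],
    )
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5)],
    )

    # LEFT JOIN keeps customers without orders; COALESCE turns their
    # missing (NULL) total into 0.0.
    cur.execute(
        """
        SELECT c.name, COALESCE(SUM(o.amount), 0.0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
        """
    )
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in merge_customers_orders():
        print(name, total)
```

The choice of a left join rather than an inner join is deliberate: it keeps customers with no orders in the merged result, which is the kind of decision students need to learn to make explicitly.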
Assuming sound undergraduate training in the mathematical and
computer sciences, one could include the following topics
in a graduate programme: generalised additive models; regularisation (lasso
and elastic nets); model selection; time series analysis; multivariate statistics;
cluster analysis; optimisation; neural networks and deep learning; support
vector and factorisation machines; event stream processing; text analytics;
database handling and extraction. All of these courses should have a practical
element, where the techniques are programmed in one of the above-
mentioned programming languages and applied to data and problems in a
relevant application. Depending on the application areas, suitable courses on
the important concepts in these fields should be included. For example, in
astrophysics, it might be necessary to include courses on signal
processing, pattern recognition and basic concepts in astrophysics.
Similarly, if the application area is finance, courses could include scorecard
model building, risk management and other important financial concepts (e.g.
value-at-risk). It is, of course, impractical, if not impossible, to cater for all
fields and possible topics. At my university we have spread the programme over
two years, where all the technical courses are covered in an honours degree
and half of the master’s degree. The remainder of the master’s programme
addresses the professional training aspects.
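To indicate what the practical element of such a course might look like, the following is a minimal sketch of a value-at-risk calculation in Python by historical simulation. The function name, the return series and the quantile convention (the smallest observed loss that is not exceeded in at least the stated fraction of periods) are assumptions made for illustration. Other conventions exist, and a production implementation would require considerably more care.

```python
import math


def historical_var(returns, confidence=0.95):
    """Historical-simulation value-at-risk, reported as a positive loss.

    Returns the smallest observed loss L such that at least a fraction
    `confidence` of the observed losses are less than or equal to L.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")

    # A loss is the negative of a return; sort from smallest to largest loss.
    losses = sorted(-r for r in returns)

    # Zero-based index of the empirical quantile under the convention above.
    k = math.ceil(confidence * len(losses)) - 1
    return losses[k]


if __name__ == "__main__":
    # Hypothetical periodic returns, invented for this sketch.
    sample_returns = [0.01, -0.02, 0.03, -0.05, 0.02,
                      -0.01, 0.04, 0.01, -0.03, 0.02]
    print(historical_var(sample_returns, confidence=0.9))
```

With the ten hypothetical returns shown, nine of the ten losses are no larger than 0.03, so the sketch reports 0.03 as the 90% value-at-risk.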
5. Adding professionalism to the training programme
Teaching students the problem-solving skills needed in industry is
a real challenge. The instructor should facilitate a change of mindset among
students so that they focus on solving the business
problem and not a statistical or mathematical sub-problem. More importantly,
these courses should be taught by people with the necessary experience in
solving problems in the particular application area (see e.g. Coetzer & de
Jongh, 2016). This suggests that data science programmes staffed only by
academics with no experience in solving industrial or business problems will
find it extremely difficult to equip data scientists with the requisite skills to
function effectively in industry.
In our master’s programme we follow an integrated, hands-on training
approach to solving problems in the area of application. This is done in the
form of on-site (at the client company) internships where a student is assigned
251 | ISI WSC 2019