Page 261 - Special Topic Session (STS) - Volume 2
STS490 Riaan d.J.
3. What can we learn from the established subfields of data science?
As stated before, statistics and operations research are two of the oldest
subfields in data science, and many practising statisticians and operations
researchers consider themselves data scientists. What can we learn from these
fields that could help us in training the data scientists of the future?
Universities all over the world have largely failed to deliver professionally
trained graduates in the fields of OR and statistics. Although well trained
academically, many newly appointed graduates find it difficult to immediately
add value at their place of employment. Typically they lack subject-matter
knowledge of the application field (e.g. finance or physics) and struggle with
real-world problem solving, such as the formulation of messy problems,
meaningful interaction with clients, relationships with team members and
business communication. Some students also lack the numerical and
data-handling programming skills that many curricula do not address
adequately.
Some of the lessons I have learnt from the many industry projects I have
been involved in include:
• Always focus on the business value throughout the course of the
  project.
• Involve all role-players and instil trust and confidence in your ability
  as a consultant.
• Manage the client’s expectations, communicate clearly and pay
  attention to fostering good interpersonal relationships.
• Test the client’s understanding of his/her own problem and educate the
  client when necessary.
• Be sure that the problem to be solved is well formulated, because you
  do not want to solve the wrong problem.
• Always be cognisant of the importance of simplicity; when your
  solution becomes very complicated, seek a simpler one if possible.
• Do not be fixated on new untested technologies.
• Solve first the critical aspects that will determine eventual success.
• Always revisit the scope and risks of the project and plan properly.
How, then, do we train students so that they become professional data
scientists? This is not easy and will be addressed in the last
section.
4. What should we teach aspiring data scientists?
From the above it should be clear that a training programme should
cover the following:
• The mathematical and computational sciences: Topics could include
  statistical and probability theory, artificial intelligence,
  machine learning, operations research, and computer science.
250 | ISI WSC 2019