Page 63 - Special Topic Session (STS) - Volume 3
STS515 Jim R. et al.
Most of these developments can be described as ‘engineering’ – a useful
product emerges from an analysis of an interesting challenge. The relationship
between statistics and data science is analogous to the relationship between
mathematics and engineering. Engineers don’t do ‘applied mathematics’; they
do ‘engineering’, and use mathematics where appropriate. Similarly, data
scientists don’t do ‘applied statistics’; they build things, and use statistics when
they (think they) need to.
It is worth reflecting on the extent to which analytic models, p-values and
effect sizes have contributed to the developments in computer science that
have radically reshaped the modern world. For the practical examples listed
above, the designers’ ambitions are for 100% success, not for theoretical
nicety, nor for performance that is ‘significantly better than chance’.
3. Designing the epistemological engine
We are living in interesting times; new phenomena are emerging
(associated with billions of people having internet access, much greater wealth
and better health, worldwide). New sorts of data are available; there are new
sorts of analytic tools; there are new creators of knowledge (notably
technology companies) and new distributors, consumers and users of
knowledge. The problems that beset the start of the twentieth century have
not gone away; modern societies now also face existential threats such as
global warming and nuclear war. There is a need for knowledge-generators to
engage with problems that can be characterised as ‘messy’, ‘complex’, or
‘wicked’. Such problems are ill-defined in terms of
specifying relevant variables or measuring progress; they often involve
interacting systems at different levels. For example, climate change is
influenced by the actions of individuals (e.g. car choice and use), local
structures (e.g. support for recycling), national structures (e.g. policies on
house insulation and domestic solar power), and international initiatives (e.g.
consensus on restricting carbon emissions). There is no ‘right’ level to work at;
there are multiple ways to measure system states and the results of different
initiatives.
Addressing ‘wicked problems’ is likely to involve working with multiple
sources of messy data, and using a variety of analytic tools (see Ridgway et al.,
2018). Inter-disciplinary action is almost certain to be essential to success.
However, scientists working with even relatively simple problems can make a
mess of things. There are serious challenges to current methods of knowledge
acquisition, illustrated by the very poor quality of much of the research funded
at great expense in universities (see Ioannidis, 2005; Open Science
Collaboration, 2015). These failures cannot be explained away as the result of poor
practice by a few individuals; they reflect systemic failure by some academic
communities. There is an urgent need to analyse and improve the whole
52 | ISI WSC 2019