Page 58 - Special Topic Session (STS) - Volume 3

STS515 Jim R. et al.



                                Looking back – looking forward; statistics and
                                           the data science tsunami
                                 Jim Ridgway, James Nicholson, Rosie Ridgway
                                     School of Education, University of Durham, UK

                  Abstract
                  The discipline of statistics arose from pressing needs to address a variety of
                  social and scientific problems. The founders of the Royal Statistical Society in
                  the UK and of the American Statistical Association were very diverse in their
                  backgrounds and interests, but shared a common purpose – namely, to
                  address difficult and interesting challenges. They also acted in similar ways, by
                  working across disciplines, and inventing mathematics and models suited to
                  new problems. Computer scientists have also addressed real-world problems,
                  have pioneered interesting and exciting approaches to handling new sorts of
                  data (e.g. from sensors and social media) and have developed new analytic
                  tools (notably, tools based on machine learning); their work is having dramatic
                  (and sometimes unexpected) impacts on society. Early encounters between
                  statisticians and computer scientists often resembled ‘turf wars’ – with claims
                  that statistics was fast becoming redundant, and that computer scientists’
                  ignorance of core statistical concepts such as sample bias would prove fatal
                  to their entire enterprise. The problems that beset the start of the twentieth
                  century have not gone away; modern societies face a wide range of existential
                  threats such as global warming and nuclear war. As before, collaboration
                  across disciplines and the creation of new modelling tools are needed to
                  address these problems. Here we begin by drawing lessons from the
                  development of computer science in its earliest days, focussing on Babbage’s
                  Analytical Engine. We then highlight key epistemological differences between
                  traditional statistics and traditional computer science, such as the role of
                  theory and the use of ‘black-box’ models. We argue the case for the
                  development of the Epistemological Engine – a tool for analysing and
                  improving the processes of knowledge creation and utilisation that will require
                  the skills of both statisticians and data scientists. We conclude by identifying
                  competences and dispositions relevant to students of statistics and data
                  science, drawing on both contemporary developments and the earliest days
                  of computing.

                  Keywords
                  Modelling; Turf wars; Epistemology; Black-box; Engineering





                                                                      47 | ISI WSC 2019