Page 64 - Special Topic Session (STS) - Volume 3

STS515 Jim R. et al.
                  system associated with the creation and use of knowledge – in short,
                  designing an Epistemological Engine (EE) has become a priority. The prime
                  candidates for creating and building the EE are statisticians and data scientists.
                     Early encounters between statisticians and data scientists were often
                  acrimonious: one side predicted that ‘statistics’ would be a casualty in ‘the
                  death of theory’, the other that data scientists’ ignorance of core statistical
                  concepts such as sample bias and overfitting would prove fatal to their entire
                  enterprise. The EE should instead be founded on techniques and skills used in
                  both data science and statistics. Data
                  scientists create open data repositories (e.g. https://registry.opendata.aws/),
                  and have adopted a culture of sharing code – especially workflows (e.g.
                  https://github.com/) to facilitate a comparison of different analytical
                  techniques and modelling assumptions. They use Common Task Frameworks
                  wherein success is judged in terms of actual performance in analysis, not
                  theoretical niceties. Statisticians bring sophistication about data acquisition
                  (including synthesising and triangulating data sources), preparation, and
                  exploration. They can contribute to analyses, data representation and
                  communication, and can comment on issues such as the likely generalisability
                  of findings. They bring considerable sophistication about modelling.
                  Identifying the style of modelling being used by different researchers
                  (explicitly or implicitly) should be automated in the EE. Ridgway (1998)
                  classifies styles of modelling, and describes analytic models (such as those
                  found in school physics), systems models (such as those found in school
                  biology) and macrosystemic models, which are systems models where the
                  system itself undergoes change. Macrosystemic models can be divided into
                  two groups: those where the changes in the system are relatively
                  predictable (e.g. ecological restoration; the life cycle of the butterfly) and
                  those where they are unpredictable (e.g. Brexit; climate change and global
                  political stability in the Trump era).
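                  As a minimal sketch of the Common Task Framework idea described above (a
                  shared dataset, a fixed held-out split, and a single agreed performance
                  measure), the following Python example scores two competing models on the
                  same task. The synthetic data, the two models, and all function names are
                  illustrative assumptions, not part of any EE design described here.

```python
import random
import statistics


def make_shared_dataset(n=200, seed=0):
    """Hypothetical shared dataset: y = 2x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 1)))
    return data


def fit_mean(train):
    """Baseline entry: always predict the mean of the training outcomes."""
    mean_y = statistics.fmean([y for _, y in train])
    return lambda x: mean_y


def fit_line(train):
    """Second entry: ordinary least-squares straight line."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, data):
    """The single agreed performance measure for this task."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in data])


def run_common_task(entries, data, n_test=50):
    """Fit every entry on the same training split, score on the same test split."""
    train, test = data[:-n_test], data[-n_test:]
    scores = {
        name: mean_squared_error(fit(train), test)
        for name, fit in entries.items()
    }
    # Lowest held-out error first: success is judged on performance alone.
    return sorted(scores.items(), key=lambda item: item[1])


leaderboard = run_common_task(
    {"mean_only": fit_mean, "least_squares_line": fit_line},
    make_shared_dataset(),
)
for name, mse in leaderboard:
    print(f"{name}: held-out MSE = {mse:.2f}")
```

                  Because every entry sees the same training and test split and is ranked by
                  the same measure, adding a further technique only requires supplying one
                  more fit function, which is what makes such comparisons cheap to repeat.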
                      The EE should comprise a large tool collection. Sample tools include:
                      •  Critical evaluation of specific studies, using criteria such as those
                         identified by Ioannidis (2005) and the Open Science Collaboration
                         (2015), e.g. flagging studies that report weak effects from small
                         samples, or that test multiple hypotheses until a ‘significant’ result
                         is found;
                      •  Identification of academic areas where there is insufficient sharing of
                         data, code and workflows;
                      •  Identification of academic areas that are paradigm-bound (i.e.
                         characterised by analyses of rather few classes of data, and by the use
                         of a small set of analytic tools);
                      •  Tools for automated testing of code and workflows;
                      •  Identification of results that are important for some theoretical claims,
                         where the evidence base is weak (e.g. where there has been little
                         replication across relevant populations);

                                                                      53 | ISI WSC 2019