Page 78 - Special Topic Session (STS) - Volume 3

STS515 Jeremiah D. D. et al.
                  debated in daily conversations, and in scientific discourses. Dhar (2013)
                  pointed out that, as a multi-disciplinary subject, Data Science (DS) is distinct
                  from statistics, referring to the growing need to process data that are
                  increasingly heterogeneous and unstructured. There is an emerging
                  consensus to see DS as an interdisciplinary field that incorporates
                  mathematics/statistics, computer science and information science, and
                  domain knowledge (WSU, 2016). Consequently, we consider it an interesting
                  question to ask: is there an ideal design of DS undergraduate curricula?

                  2.  Methodology
                  Profiling DS Programmes
                   In this paper, we examine a few typical DS curricula adopted by
                   institutions in the US, China, and New Zealand. The diversity of these DS
                   programmes is reflected in the different types of institutions (liberal-arts
                   colleges, business schools, and universities) and different degrees (BA, BSc,
                   and BEng). To profile the curriculum designs of these programmes, we use
                   a hybrid quantification approach to score their requirements on four
                   dimensions (mathematics, statistics, programming, and computing), and we
                   use visualization to indicate the differences. We concentrate on the
                   prerequisite and core levels and leave the electives out, in order to
                   understand the core structure of these programmes.
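
As a rough illustration of how requirements might be scored on the four dimensions, the sketch below sums credit points per dimension and normalises them to shares. This is only one possible scoring rule, assumed here for illustration; it is not necessarily the hybrid quantification approach used in this paper. The function name `profile`, the course list, the dimension assigned to each course, and the point values are all hypothetical.

```python
from collections import defaultdict

# The four dimensions named in the text.
DIMENSIONS = ("mathematics", "statistics", "programming", "computing")


def profile(courses):
    """Return each dimension's share of the total required points.

    `courses` is an iterable of (name, dimension, points) tuples.
    Assumption: a requirement counts fully toward a single dimension.
    """
    totals = defaultdict(float)
    for name, dimension, points in courses:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension for {name}: {dimension}")
        totals[dimension] += points
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {d: 0.0 for d in DIMENSIONS}
    return {d: totals[d] / grand_total for d in DIMENSIONS}


# Hypothetical programme, not taken from any institution in the study.
example = [
    ("Calculus", "mathematics", 6),
    ("Probability", "statistics", 3),
    ("Intro Programming", "programming", 3),
    ("Databases", "computing", 3),
]

shares = profile(example)
for d in DIMENSIONS:
    print(f"{d:12s} {shares[d]:.2f}")
```

Shares of this kind could then be plotted (for example, as a radar or bar chart) to compare programmes side by side.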
                  Using Columbia University as an example for illustration, we look at its
                  undergraduate DS programme design, as shown in Table 1²:

                  Table 1. Programme structure of the DS major at Columbia

                   Prerequisites: 15 points
                       •  Calculus I – III
                       •  Linear Algebra (Math or Applied Math)
                       •  STAT 1201 (Calculus-Based Introduction to Statistics)
                   Core: 8 courses (STAT and COMS)
                   STAT (12 points):
                   1) STAT 4203 (Probability Theory)
                   2) STAT 4204 (Statistical Inference)
                   3) STAT 4205 (Linear Regression Models)
                   4) STAT 4241 (Statistical Machine Learning) or COMS 4771 (Machine
                   Learning)
                   COMS (12 points):
                   1) Introduction to Computer Science: COMS 1004, COMS 1005, ENGI 1006,
                   or COMS 1007



                  ² Columbia University, URL https://mice.cs.columbia.edu/c/d.php?d=245, August 8, 2018.
                    Retrieved April 29, 2019.
                                                                      67 | ISI WSC 2019