Page 305 - Contributed Paper Session (CPS) - Volume 6

CPS1937 Xu Sun et al.



                                  Data mining of mobility table
                             Based on community discovery methods
                                      Xu Sun¹, Xiao-hui Li²
                  ¹ School of Statistics, Dongbei University of Finance and Economics, Dalian, China
              ² College of Public Administration and Humanities, Dalian Maritime University, Dalian, China

            Abstract
            Based on community discovery methods, a new approach to modeling social
            mobility data is presented. A community detection algorithm is used to
            identify communities of social classes within which classes share members
            at above-expected rates. This approach, when applied to mobility data, may
            be used to substantially improve the fit of models of social mobility. To
            illustrate, the community effect model of social mobility is analyzed using
            data from the General Social Survey.

            Keywords
            Intergenerational mobility tables; Log-linear model; Community detection;
            Eigenspectrum decomposition

            1. Introduction
               Intergenerational mobility is an important perspective in social mobility
            analysis, and researchers have a variety of log-linear models at their
            disposal with which to analyze the structures and patterns embedded within
            mobility tables (e.g., Hout, 1983). In empirical analysis, the structure in
            mobility tables is often so complicated that parsimonious models do not
            capture the observed patterns. In such circumstances, ever more complicated
            models may be fit to the data. For example, Moses & Holland (2010) compared
            12 statistical strategies proposed for selecting log-linear models,
            including significance tests based on four chi-squared statistics.
            Tibshirani (2011) proposed the Lasso (least absolute shrinkage and selection
            operator) method for estimation in generalized regression models. Yuan et
            al. (2011) proposed an automatic data mining method for contingency tables
            based on the multinomial processing tree model. Likewise, if a preferred
            model (e.g., quasi-symmetry) does not fit the data, one can apply
            correspondence analysis to the residuals to “see” the associations left over
            in the data (Falguerolles & Leeuw, 1989). Melamed (2015) drew on this idea
            of mining the residuals and used community detection methods to “see” the
            associations left over in the data. These methods provide a good-fitting
            log-linear model; however, the results are often complicated, and an
            understanding of the underlying mobility processes may be obscured by the
            complexity of the model.
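
               The idea of mining the residuals of a log-linear model can be
            illustrated with a minimal sketch. It is not the procedure of any of the
            cited papers: it assumes the simplest baseline (the independence model,
            whose fitted counts have a closed form), uses a small hypothetical 4 x 4
            table rather than survey data, and replaces a real community detection
            algorithm with connected components of the graph that links two classes
            when their combined off-diagonal Pearson residual is positive. All
            function and variable names are illustrative.

```python
import math

# Hypothetical origin-by-destination counts (rows: origin class,
# columns: destination class). Invented for illustration only.
HYPOTHETICAL_TABLE = [
    [50, 30, 5, 5],
    [30, 50, 5, 5],
    [5, 5, 50, 30],
    [5, 5, 30, 50],
]


def independence_expected(table):
    """Fitted counts under the independence log-linear model:
    expected[i][j] = row_total[i] * col_total[j] / grand_total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def pearson_residuals(table, expected):
    """Pearson residuals (observed - expected) / sqrt(expected).
    Assumes every expected count is strictly positive."""
    return [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


def residual_communities(residuals):
    """Crude stand-in for community detection: link classes i and j when
    their combined off-diagonal residual is positive (they exchange members
    at above-expected rates), then return the connected components."""
    n = len(residuals)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if residuals[i][j] + residuals[j][i] > 0:
                neighbours[i].add(j)
                neighbours[j].add(i)

    seen, communities = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for nb in neighbours[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        communities.append(sorted(component))
    return communities


expected = independence_expected(HYPOTHETICAL_TABLE)
residuals = pearson_residuals(HYPOTHETICAL_TABLE, expected)
communities = residual_communities(residuals)
print(communities)
```

               In this toy table, classes 0 and 1 exchange members more often than
            independence predicts, as do classes 2 and 3, so the sketch groups them
            into two communities. A modularity-based or spectral method would take the
            place of the connected-components step in a realistic analysis.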
                                                               ISI WSC 2019