Page 145 - Contributed Paper Session (CPS) - Volume 5
P. 145
CPS1159 Philip Hans Franses et al.
(min̂ ,| ; max̂ ,| )
as the explanatory variable, instead of ̂ , . These two new variables are
intervals, and often they are called symbolic variables. The MZ regression thus
also becomes a so-called Symbolic Regression, see Bertrand and Goupil
(1999), Billard and Diday (2000, 2003, 2007).
Table 2 presents an exemplary dataset for May in year t, so m = 17. Figure
1 visualizes the same data in a scatter diagram. Clearly, instead of points in the
simple regression case, the data can now be represented as rectangles.
How does Symbolic Regression work?
When we denote the dependent variable for short as y and the dependent
variable as x, we can compute for the Symbolic MZ Regression
(, )
̂
= ()
1
and
̂
̂
= ̅ − ̅
0
1
there by drawing on the familiar OLS formulae.
Under the assumption that the data are uniformly distributed in the
intervals, Billard and Diday (2000) derive the following results. At first, the
averages are
1
̅ = ∑(max + min )
2
and
1
̅ = ∑(max ̂ ,| + min ̂ ,| )
2
The covariance is computed as
(, )
1
= ∑(max + min ) (max ̂ ,| + min ̂ ,| )
4
1
− [∑(max + min )] [∑(max ̂ ,| + min̂ ,| )]
4 2
134 | I S I W S C 2 0 1 9