Page 317 - Invited Paper Session (IPS) - Volume 2
IPS254 Thaddeus Tarpey et al.
we “learn” particular linear transformations of the data as a pre-conditioning
before we run the clustering algorithm. This approach will lead to an iterative
algorithm. The underlying idea is that if two diagnosis categories exist for a
particular mental disorder, then typically there will be a strong overlap in their
feature space. To illustrate this point, we introduce another index for
cluster quality: the variation of information (VI) (Meilă, 2007), which
measures how well two clusterings of a data set coincide with each other;
a VI of zero means that the two clusterings are identical.
This measure is particularly useful in simulations where the true
cluster memberships of the data points are known, since VI can then be used
to determine how well a clustering result coincides with the true clustering.
The idea here is described by the following algorithm:
0. Form an initial clustering of the features using k-means clustering.
1. Compute the VI measuring agreement between the k-means clustering
and the clinician-based diagnoses.
2. Use Newton’s method to estimate an optimal direction to “stretch” (via
a linear transformation) the features.
3. Re-run the clustering algorithm on this pre-conditioned data.
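As an illustration only, the steps above can be sketched in Python. The sketch makes several assumptions that are not part of the original description: VI is computed in nats from two label vectors as H(A) + H(B) - 2 I(A, B), so that lower values mean closer agreement; the linear transformation is restricted to a diagonal rescaling of the features; and the Newton step is replaced by a plain derivative-free search over a small grid of stretch factors, since the Newton update itself is not specified here. The names variation_of_information, kmeans and stretch_search are hypothetical.

```python
import math
import random
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A) + H(B) - 2 I(A, B), in nats; 0 means identical."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label sequences must be non-empty, equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        # Equivalent form: VI = H(A|B) + H(B|A).
        vi -= (n_ab / n) * (math.log(n_ab / count_a[a])
                            + math.log(n_ab / count_b[b]))
    return vi


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members (if any).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def stretch_search(points, diagnoses, k,
                   factors=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Coordinate-wise grid search over per-feature stretch factors.

    Assumption: this stands in for the Newton step of the algorithm;
    it only explores diagonal linear transformations.
    """
    dim = len(points[0])

    def score(scale):
        # Step 3: re-cluster the pre-conditioned data; step 1: compute VI.
        scaled = [tuple(x * w for x, w in zip(p, scale)) for p in points]
        return variation_of_information(kmeans(scaled, k), diagnoses)

    scale = [1.0] * dim
    best = score(scale)  # steps 0 and 1 on the untransformed features
    for d in range(dim):
        for f in factors:
            trial = scale[:d] + [f] + scale[d + 1:]
            vi = score(trial)  # step 2: try a stretch of feature d
            if vi < best:
                best, scale = vi, trial
    return scale, best
```

Calling stretch_search(points, diagnoses, k), with feature vectors given as tuples and one diagnosis label per subject, returns the selected per-feature stretch factors together with the VI reached. By construction this VI is never larger than that of the initial k-means clustering.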
3. Results
The results will follow soon.
4. Discussion and Conclusion
We have proposed an unsupervised learning approach to the problem of
psychiatric nosology that implements a semi-supervised clustering algorithm.
The supervision comes from clinician-informed diagnosis decisions, which
lead to linear transformations that are then used to optimize clustering
criteria.
References
1. American Psychiatric Association. (2013). Diagnostic and Statistical
Manual of Mental Disorders. American Psychiatric Publishing, Arlington,
VA, fifth edition.
2. Bruni, C. and Koch, G. (1985). Identifiability of continuous mixtures of
unknown Gaussian distributions. Annals of Probability 13:1341–1357.
3. Clementz, B. A., Sweeney, J. A., Hamm, J. P., Ivleva, E. I., Ethridge, L. E.,
Pearlson, G. D., Keshavan, M. S., and Tamminga, C. A. (2016).
Identification of distinct psychosis biotypes using brain-based
biomarkers. American Journal of Psychiatry 173:373–383.
4. Diaconis, P. and Freedman, D. (1984). Asymptotics of graphical
projection pursuit. Annals of Statistics 12:793–815.
5. Grzadzinski, R., Martino, A. D., Brady, A., Mairena, M. A., O’Neale, M.,
Petkova, E., Lord, C., and Castellanos, F. X. (2011). Examining autistic
304 | ISI WSC 2019