Today I would like to relate Machine Learning to concepts like biological psychiatry, stratified medicine, and the “subjectivity gap between neuroscience research and the clinical reality for patients with mental disorders”. Behind all these concepts lies a redefinition of the diagnostic process, which should incorporate more and more objective data. Although in mental health this is still a trend, it is the basis of a paradigm change that has been taking hold in all branches of medicine for some time. This paradigm change also makes it possible to apply Machine Learning in health applications. Biological psychiatry, and especially stratified medicine, imply a change in the type of Machine Learning / Computational Intelligence methodology to be used in the medical domain.
The methodologies most commonly used in health applications are supervised ones, which require ground truth to exist. Some months ago we discussed the difficulties of generating ground truth when applying machine learning to diagnosis and prognosis tests to be used in Decision Support Systems in the health domain. Taking psychiatric diagnosis as an example, the test is based on the diagnosis issued by a physician using the criteria described in the DSM. You take this diagnosis as your ground truth, or gold standard, and train your system to reproduce it on unseen data. Supervised classification approaches have usually been used for this purpose. In the training phase of this kind of approach you show the classifier some input-output pairs. The input is formed by the different measurements you have taken on a patient, e.g. the power in the different EEG frequency bands. The output is the system response you expect for this input. You therefore use the diagnosis or prognosis outcome as the output, i.e. this is your ground truth. The problem with this approach is that you reproduce not only the diagnoses, but also any misdiagnoses among them.
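To make the last point concrete, here is a minimal sketch of a supervised classifier trained on input-output pairs. Everything in it is an assumption made for illustration: the two features (theta and alpha band power), the numbers, the labels "A" and "B", and the choice of a 1-nearest-neighbour rule, which is just one simple supervised method among many. The third training label is assumed to be a misdiagnosis.

```python
import math

# Hypothetical training set: each input is [theta power, alpha power] for one
# patient (made-up numbers); each output is the physician's diagnosis.
train_inputs = [[1.0, 4.0], [1.2, 3.8], [0.9, 4.1],
                [3.9, 1.1], [4.1, 0.9], [4.0, 1.0]]
# Assumption: the third patient really belongs to group "A" but was labelled "B".
physician_labels = ["A", "A", "B", "B", "B", "B"]
corrected_labels = ["A", "A", "A", "B", "B", "B"]


def predict_1nn(inputs, labels, x):
    """Return the label of the training input closest to x (1-nearest neighbour)."""
    nearest = min(range(len(inputs)), key=lambda i: math.dist(inputs[i], x))
    return labels[nearest]


# An unseen patient whose measurements lie right next to the mislabelled one.
unseen = [0.9, 4.2]
print("trained on physician labels:", predict_1nn(train_inputs, physician_labels, unseen))
print("trained on corrected labels:", predict_1nn(train_inputs, corrected_labels, unseen))
```

With the physician's labels the unseen patient inherits the mistaken "B"; with the corrected labels the same patient is assigned "A". The classifier has no way of telling the two situations apart, because the labels are all it knows about the diagnosis.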
A more exploratory application of Machine Learning / Computational Intelligence is based on unsupervised approaches. Unsupervised methodologies do not use any a priori information on the data. Concretely, you do not need to define any ground truth. You just train your algorithm so that it structures your data according to a particular criterion. For instance clustering, the best-known unsupervised learning approach, aims at splitting your data into sets that fulfill a homogeneity criterion. In my opinion this fulfills some of the requirements for implementing biological psychiatry, stratified medicine and similar trends. Unsupervised methodologies can be used for knowledge discovery. The clearest example is the case where you have data from patients with different pathologies and you use this data as the input of a clustering algorithm. You can end up with subjects diagnosed with different diseases in the same cluster, which might indicate the same type of disease from a data analysis point of view. Your algorithm would thus deliver a working hypothesis for further investigation: are the two different diagnoses really well justified?
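The clustering example can be sketched in a few lines. This is only an illustration under stated assumptions: the feature values and the diagnoses "X", "Y" and "Z" are made up, k-means stands in for whatever clustering algorithm you prefer, and taking the first k points as initial centroids is a simplification to keep the run deterministic. Note that the diagnoses are never shown to the algorithm; they are only used afterwards to inspect the clusters.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical measurements (two features per patient) and their diagnoses.
points = [[1.0, 4.0], [4.0, 1.0], [1.1, 3.9], [0.9, 4.2],
          [4.2, 0.8], [1.2, 4.1], [3.9, 1.2]]
diagnoses = ["X", "Z", "Y", "X", "Z", "Y", "Z"]

assignment, centroids = kmeans(points, k=2)

# Only now do we look at the diagnoses, to see which ones share a cluster.
diagnoses_per_cluster = {}
for cluster, diagnosis in zip(assignment, diagnoses):
    diagnoses_per_cluster.setdefault(cluster, set()).add(diagnosis)
print(diagnoses_per_cluster)
```

With these made-up numbers, patients diagnosed with X and with Y fall into the same cluster, while Z forms a cluster of its own. On real data such a result would not prove anything by itself, but it would be exactly the kind of working hypothesis described above.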
A further application of unsupervised methodologies is finding subtypes among subjects who have been diagnosed with the same disease. This has been exploited, for instance, in a paper dealing with EEG from psychotic patients. This is the kind of approach that can help stratified medicine establish personalized diagnoses and treatments. Interestingly, these are among the objectives of the research programs set up by funding agencies in both Europe and the USA for the coming years. I hope this will lead to Machine Learning and Computational Intelligence being further exploited as very interesting tools in the health domain.