Omar Maddouri, a doctoral candidate in the Department of Electrical and Computer Engineering at Texas A&M University, is working with Professor Dr. Byung-Jun Yoon and Robert M. Kennedy ’26 Professor Dr. Edward Dougherty to evaluate machine learning models using transfer learning principles. Dr. Francis “Frank” Alexander of Brookhaven National Laboratory and Dr. Xiaoning Qian of the Department of Electrical and Computer Engineering at Texas A&M University are also involved in the project.
In data-driven machine learning, models are built to make predictions and estimates from a given dataset. An important area of machine learning is classification, in which an algorithm evaluates a set of data and sorts it into classes or categories. When the available datasets are very small, it can be very difficult not only to build a classification model from the data but also to evaluate the model’s performance and verify its accuracy. This is where transfer learning comes in.
“In transfer learning, we try to transfer knowledge or bring data from another domain to see if we can improve the task we are doing in the domain of interest or the target domain,” explained Maddouri.
The target domain is where models are built and their performance is evaluated. The source domain is a distinct but related domain from which knowledge is transferred to facilitate analysis in the target domain.
Maddouri’s project uses a joint prior density to model the relationship between the source and target domains and proposes a Bayesian approach that applies transfer learning principles to provide a global error estimator for these models. An error estimator provides an estimate of how accurately a machine learning model classifies the available data.
This means that before any data is observed, the team builds a model from their initial beliefs about the model parameters in the target and source domains, then updates that model, with increasing accuracy, as more evidence or information about the datasets becomes available.
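To make the idea of a joint prior concrete, here is a minimal sketch in Python. It is a toy illustration, not the team’s estimator: it assumes a single unknown mean per domain, a joint Gaussian prior whose correlation `rho` links the two domains, and Gaussian noise with known standard deviation. The function names and all parameters are hypothetical.

```python
def inv2(m):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def posterior_target_mean(prior_mean, prior_sd, rho, noise_sd,
                          target_obs, source_obs):
    """Posterior mean and variance of the target-domain mean.

    Toy assumptions (not from the article): a joint Gaussian prior on
    (target mean, source mean) with correlation rho (|rho| < 1) and
    Gaussian observation noise with known standard deviation noise_sd.
    """
    var = prior_sd ** 2
    prior_cov = ((var, rho * var), (rho * var, var))
    prior_prec = inv2(prior_cov)

    # Each observation adds 1 / noise_sd**2 of precision to its own domain.
    noise_var = noise_sd ** 2
    post_prec = (
        (prior_prec[0][0] + len(target_obs) / noise_var, prior_prec[0][1]),
        (prior_prec[1][0], prior_prec[1][1] + len(source_obs) / noise_var),
    )
    post_cov = inv2(post_prec)

    # Information vector: prior precision times prior mean, plus data terms.
    info = tuple(
        prior_prec[i][0] * prior_mean[0] + prior_prec[i][1] * prior_mean[1]
        + sum(obs) / noise_var
        for i, obs in enumerate((target_obs, source_obs))
    )
    mean_t = post_cov[0][0] * info[0] + post_cov[0][1] * info[1]
    return mean_t, post_cov[0][0]


# Two target samples, twenty source samples (made-up numbers).
target = [1.2, 0.8]
source = [1.0] * 20

# rho = 0: the domains are unrelated, so source data cannot help.
m0, v0 = posterior_target_mean((0.0, 0.0), 1.0, 0.0, 1.0, target, source)
# rho = 0.9: the domains are strongly related, so source data is informative.
m1, v1 = posterior_target_mean((0.0, 0.0), 1.0, 0.9, 1.0, target, source)
print(f"unrelated domains: mean={m0:.3f}, variance={v0:.3f}")
print(f"related domains:   mean={m1:.3f}, variance={v1:.3f}")
```

In this toy setting, the posterior variance of the target mean shrinks when the prior correlation between the domains is high, which is the sense in which source data can compensate for a small target dataset.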
This transfer learning technique has been used to build models in previous work; however, it had never been used to propose new error estimators for evaluating the performance of these models. For efficiency, the estimator was implemented using advanced statistical methods that enable fast filtering of the source datasets, which reduces the computational cost of the transfer learning process by a factor of 10 to 20.
This technique can serve as a benchmark for future academic research to build upon. It can also help identify or categorize medical conditions that would otherwise be very difficult to classify. For example, Maddouri used this technique to classify patients with schizophrenia using transcriptomic data from brain tissue samples originally acquired through invasive brain biopsies. Due to the nature and location of the brain region that can be analyzed for this disorder, the data collected is very limited. However, using a rigorous feature selection procedure that includes differential gene expression analysis and statistical hypothesis tests, the research team identified the transcriptomic profiles of three genes from an additional brain region found to be highly relevant to the desired brain tissue, as reported by independent studies in the literature.
This knowledge allowed them to use the transfer learning technique to take advantage of samples taken from the second brain region (source domain) to facilitate analysis and greatly improve diagnostic accuracy in the brain region of origin (target domain). Data gathered from the source domain can be drawn on when information from the target domain is lacking, allowing the research team to improve the quality of their conclusions.
This research was funded by the Department of Energy and the National Science Foundation.