Machine learning to empower research
Artificial intelligence, or machine learning, can support complex analysis and advance quality research, but only when used carefully. John F. Wu shares advice on how machine learning can empower researchers.
Artificial intelligence (AI) – or machine learning – seems to be everywhere these days. If you’re a researcher, you’ve probably seen these terms pop up more and more in your field's academic literature. But how much of this is actually useful? Should you also be leveraging machine learning?
In this article, I'll describe a few cases where machine learning is useful for research – and where it isn't – drawing on examples from my own field of astronomy.
Machine learning delivers the most value for "data-driven" research problems: when you have so much data that you can't inspect it manually. In these scenarios, machine learning can lighten your workload and allow you to focus on your area of research. However, adopting machine learning is not without its pitfalls and hidden costs.
When applied carefully, through a sceptic's lens, machine learning can enable research programs that would otherwise be infeasible. Broadly speaking, machine learning can empower researchers in four ways.
Sometimes you want to know if your dataset can be used to determine something else. For example, you may have heard about how machine learning in medicine can help doctors screen for cancer. In my field of astronomy, it is fairly simple to take images of millions of galaxies, but we have traditionally needed to take and analyse specialised observations in order to understand the details of how galaxies evolve. By using machine learning, my collaborators and I found that we could actually study these galaxies using images alone.
It's easy to create new models of how things should behave, but the real test of any model is whether it has any predictive power. By identifying connections within your data, you can formulate a model – and machine learning can too. Scientists have used machine learning to summarise these connections into the language of mathematics and uncover a new formula that explains the distribution of matter on cosmic scales.
If machine learning can be used to find the typical trends, then perhaps it's not surprising that it is also great at detecting anomalies. Many research fields can benefit from a thorough investigation of rare phenomena, and machine learning can help you spot the "needle in the haystack". In astronomy, machine learning has been used to detect rare phenomena such as gravitational-wave events, supernovae, gravitationally lensed galaxies, incorrectly processed data, and much more. One analysis of outlier galaxies found many interesting phenomena (including many "galaxies" that weren't galaxies at all).
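To make the idea of outlier spotting concrete, here is a minimal sketch. It is not a machine learning model: it uses a robust z-score based on the median absolute deviation, a simple statistical stand-in for the approaches (such as isolation forests or autoencoders) that are more typical in practice. The function name `flag_outliers`, the `brightness` values and the threshold of 3.5 are all assumptions chosen for illustration.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Return indices of values whose robust z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation (MAD): a measure of spread that a few
    # extreme values barely affect, unlike the standard deviation.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    # The factor 0.6745 rescales the MAD so that the score is comparable to
    # an ordinary z-score when the data are normally distributed.
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - median) / mad) > threshold]

# Made-up measurements: nine similar values and one that clearly stands out.
brightness = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0, 9.7, 10.1, 9.9]
print(flag_outliers(brightness))
```

The same pattern (score every object by how unusual it is, then inspect the highest scores by hand) carries over when the score comes from a trained model instead.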
Let's be honest: some aspects of research are boring and time-consuming. In radio astronomy, vast computational resources and lots of time are required to remove artificial signals and corrupted data. Machine learning can perform these tasks at a fraction of the cost and time.
By speeding up the boring parts of research, machine learning can also enable new kinds of analyses that would otherwise not be possible. Many research projects try to answer the following question: given an observed outcome, what are the parameters of a model that would produce such an outcome? These so-called inverse problems can be tackled efficiently using machine learning. For more details, read up on simulation-based inference.
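As a rough illustration of an inverse problem, the sketch below uses rejection sampling (approximate Bayesian computation), one of the simplest forms of simulation-based inference. It involves no neural network; modern methods typically train one to avoid the wasteful trial and error shown here. The simulator, the uniform prior between 0 and 10, the tolerance and the observed value of 4.2 are all hypothetical choices made for the example.

```python
import random
import statistics

def simulate(mu, rng, n=25):
    """Hypothetical forward model: the mean of n noisy measurements around mu."""
    return statistics.fmean(rng.gauss(mu, 1.0) for _ in range(n))

def rejection_abc(observed, n_draws=10000, tolerance=0.1, seed=0):
    """Keep the prior draws whose simulated outcome lands close to the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(0.0, 10.0)  # draw a candidate parameter from the prior
        if abs(simulate(mu, rng) - observed) < tolerance:
            accepted.append(mu)      # this candidate reproduces the observation
    return accepted

# The accepted candidates approximate the range of parameters that are
# consistent with the observed outcome.
posterior = rejection_abc(observed=4.2)
print(len(posterior), statistics.fmean(posterior), statistics.stdev(posterior))
```

Most candidates are thrown away, which is why this brute-force version scales poorly once the simulator is expensive or has many parameters.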
Datasets are growing bigger and bigger, but there are many ways to combine features into condensed versions. Dimensionality reduction methods include classical approaches like Principal Component Analysis (PCA), t-distributed Stochastic Neighbour Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP), as well as machine learning techniques that use pre-trained neural networks or similar algorithms to transform the data into summarised versions.
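For a sense of what dimensionality reduction does, here is a small sketch of the first step of PCA written with only the Python standard library. It finds the single direction along which a toy dataset varies most, using power iteration on the covariance matrix, then summarises each three-feature row as one number. The toy data and the function name `first_principal_component` are assumptions for illustration; in practice you would probably reach for an established library implementation.

```python
import math
import random

def first_principal_component(rows, n_iter=200):
    """Return the unit vector along which the rows vary most, plus the feature means."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred features.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiplying a vector by the covariance
    # matrix and renormalising converges to the leading eigenvector.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, means

# Toy data: three measured features that all track one hidden quantity t.
rng = random.Random(1)
rows = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    rows.append([t + rng.gauss(0, 0.1),
                 2 * t + rng.gauss(0, 0.1),
                 -t + rng.gauss(0, 0.1)])

component, means = first_principal_component(rows)
# Project each row onto the component to get a one-number summary of it.
summary = [sum((r[j] - means[j]) * component[j] for j in range(3)) for r in rows]
print([round(c, 2) for c in component])
```

Because the three features were built from one hidden quantity, a single component captures almost all of the variation, which is the situation in which this kind of compression works best.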
It's also useful to understand which inputs (or features) are most important for making predictions. Different machine learning algorithms reveal the most important features in different ways; for instance, random forests can automatically rank features by importance. For neural network models, saliency mapping enables you to pinpoint which pixels in an image are most essential for making a prediction (eg, Gradient-weighted Class Activation Mapping, or Grad-CAM). These algorithms provide some level of interpretability that can benefit your research program.
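One way to see how feature ranking works is permutation importance, sketched below. This is a model-agnostic alternative to the built-in random forest ranking mentioned above, not the same calculation: it shuffles one feature at a time and measures how much the prediction error grows. The `model` function is a hypothetical stand-in for a trained model, and the data are synthetic.

```python
import random

def mean_squared_error(model, rows, targets):
    return sum((model(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, seed=0):
    """Increase in error when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + [c] + r[j + 1:] for r, c in zip(rows, column)]
        importances.append(mean_squared_error(model, shuffled, targets) - baseline)
    return importances

# Hypothetical stand-in for a trained model: it leans heavily on feature 0,
# uses feature 1 a little, and ignores feature 2 entirely.
def model(row):
    return 3.0 * row[0] + 0.5 * row[1]

rng = random.Random(42)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
targets = [model(r) + rng.gauss(0, 0.1) for r in rows]

scores = permutation_importance(model, rows, targets)
# Feature indices ordered from most to least important.
ranking = sorted(range(3), key=lambda j: scores[j], reverse=True)
print(ranking)
```

A feature the model never uses gets a score of zero, because shuffling it changes no predictions. With a real model the scores are noisier, so it is worth repeating the shuffle several times and averaging.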
Remember that not every problem can be – or should be – tackled using machine learning methods. Machine learning simply provides a different set of tools that you can add to your toolkit. Hopefully, by combining these novel tools with domain-specific expertise, you’ll be able to discern which tools are best for the problems you’re trying to solve. Machine learning may be particularly useful when you have lots of data and your research benefits from finding trends or outliers, speeding up slow tasks, or visualising data and ranking features by importance. In the coming years, clever applications of machine learning can potentially transform the way that research is done.
John F. Wu is an assistant astronomer at Space Telescope Science Institute and an associate research scientist at Johns Hopkins University.
If you found this interesting and want advice and insight from academics and university staff delivered direct to your inbox each week, sign up for the THE Campus newsletter.
Pitfalls and hidden costs:
Applying machine learning without thinking can result in some dangerous analyses.
Some machine learning algorithms have a steep learning curve.
Just because it can be done with machine learning doesn't mean that it should be.
Four ways machine learning can empower researchers:
1. Make predictions based on trends
2. Spot outliers
3. Save time
4. Visualise and prioritise complex data