Machine Learning methods to identify faint stellar streams in Milky Way-type galaxies
Student: Rosie Bartlett
Supervisors: Andreea Font (LJMU), Robert Lyon (LJMU), Robert Grand (LJMU)
Institution: Liverpool John Moores University
Tidal stellar streams are by-products of gravitational interactions between galaxies. They can be identified in galaxy images using deep-sky photometry; however, they are often very faint and therefore difficult to detect. Nevertheless, tidal streams provide crucial evidence for the process of galaxy formation, and they can even be used to constrain the nature of dark matter. Machine learning (ML) techniques can improve both the detection of these faint features in observational data and the characterisation of their morphology. Morphology is an important diagnostic of the nature of dark matter, because cold dark matter clumps disturb tidal streams more than warm dark matter clumps do. Large-scale surveys such as Euclid or LSST are expected to generate images of millions of galaxies containing streams. It is therefore important to develop fast and reliable techniques that can identify these faint features automatically.
In this project you will develop new ML methods by training them on mock data created from cosmological simulations, where tidal streams can be easily identified. This approach will also provide a physical context for the models. The training set will be the largest to date for the purpose of identifying simulated streams with ML, containing more than 100 Milky Way-type galaxies. These systems are drawn from two suites of state-of-the-art cosmological simulations (ARTEMIS and Auriga), which cover several plausible cosmological models (with cold, warm, or self-interacting dark matter) as well as many variations in the physical prescriptions for galaxy formation, thus providing the largest parameter space for training the ML methods.
From the shape, density, and extent of tidal streams, the ML techniques can infer the dark matter distribution in large galaxies like the Milky Way and help in the interpretation of observations. The techniques may also be trained to identify and characterise any gaps in the observed tidal streams, and to reconstruct the “stream mass functions” in these galaxies (a metric that could potentially help us distinguish between different dark matter models).
To achieve this, you will explore a variety of neural networks, such as Generative Adversarial Networks (GANs), Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), to effectively de-noise, upscale, segment and classify real observational data. This represents a unique opportunity to develop in-demand industry skills relevant to multiple domains such as medical imaging, remote sensing, and computer vision. It is therefore anticipated that your contributions will not only help improve our understanding of the universe and dark matter phenomena, but also help drive forward machine learning research.
Throughout the project you will have access to the Astrophysics Research Institute’s postgraduate training programme. You will be given training in data science provided by the Centre for Doctoral Training LIV.INNO. Specialised training in deep learning methods via the NVIDIA Deep Learning Institute (DLI) programme will also form part of your career development plan. This will help you make the most of your prioritised access to Prospero, the high-performance computing facility at Liverpool John Moores University. You will also be given the opportunity to carry out a six-month industry placement to broaden your research and career skills.