Opportunistic computing for real-world problems

Matthew Carter - A Case Study on Unleashing the Potential of Bayesian Inference through Spare Computers

Bayesian statistics provides a robust framework for statistical inference across various domains, including healthcare, manufacturing, and the physical sciences. Researchers from the University have previously utilised this framework to identify individuals at risk of a mental health crisis. Central to Bayesian statistics is the generation of samples from a posterior probability distribution that is analytically intractable. Traditional methods for generating these samples, such as Markov Chain Monte Carlo (MCMC), parallelise poorly, limiting the complexity of the models that can be analysed within reasonable time frames. For example, when modelling the evolution of COVID-19, practitioners were constrained to considering infection rates only at a national level, even though the ability to express them at the level of towns or postcodes was desired.
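To see why MCMC is hard to parallelise, consider a minimal random-walk Metropolis-Hastings sampler. This is an illustrative sketch (targeting an assumed toy posterior, not the researchers' model): each new sample is accepted or rejected relative to the previous one, so the chain is inherently sequential.

```python
import math
import random

def log_posterior(theta):
    # Toy unnormalised log-posterior for illustration: standard normal.
    return -0.5 * theta * theta

def metropolis_hastings(n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings.

    Each proposal depends on the *previous* state of the chain, so the
    loop below cannot be split across machines -- the source of MCMC's
    poor parallelisation.
    """
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if math.log(rng.random()) < log_posterior(proposal) - log_posterior(theta):
            theta = proposal
        samples.append(theta)
    return samples

samples = metropolis_hastings(5000)
```

The only way to use more hardware here is to run several independent chains, which shortens burn-in for none of them; it does not make any single chain converge faster.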

In contrast, Sequential Monte Carlo (SMC) samplers offer inherent parallelisability, allowing for efficient distribution across modern computational architectures. Researchers in the CDT in Distributed Algorithms, along with researchers from IBM Research and the STFC Hartree Centre, have developed a scalable, parallelised SMC sampler which can harness the power of supercomputers (such as the University system, Barkla) to provide inferences in shorter timescales. Despite this impressive performance, high-performance computing (HPC) infrastructures come with significant barriers, such as cost and expertise requirements, which hinder universal accessibility, particularly in low- to middle-income countries.
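The parallel structure that SMC exploits can be sketched as follows. This is a hedged illustration of one weight-and-resample step against an assumed toy target, not the team's sampler: the per-particle weight computations are independent of one another, so they can be farmed out across many machines.

```python
import math
import random

def log_target(theta):
    # Toy unnormalised log-posterior for illustration: standard normal.
    return -0.5 * theta * theta

def smc_weights(particles, log_weight_fn):
    """Compute normalised importance weights for a population of particles.

    Each particle's weight depends only on that particle, so this loop is
    embarrassingly parallel -- one particle (or block of particles) per
    worker.
    """
    log_w = [log_weight_fn(p) for p in particles]
    m = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]

def resample(particles, weights, rng):
    """Multinomial resampling: duplicate high-weight particles."""
    return rng.choices(particles, weights=weights, k=len(particles))

rng = random.Random(1)
particles = [rng.uniform(-5.0, 5.0) for _ in range(1000)]
weights = smc_weights(particles, log_target)
particles = resample(particles, weights, rng)
```

Only the normalisation and resampling require communication between workers, which is why SMC maps well onto distributed architectures.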

Opportunistic computing presents an alternative paradigm, leveraging idle resources like workstations, GPUs, and even mobile devices to tackle large-scale computational tasks. By tapping into these underutilised resources, opportunistic computing offers a more accessible, cost-effective, and potentially energy-efficient alternative to traditional HPC solutions. Opportunistic computing clusters have been used by LIGO in the search for gravitational waves and by UC Berkeley in the search for extraterrestrial intelligence. At their peak, these opportunistic frameworks have enough compute power to rival some of the biggest HPC infrastructures.

The University's High Throughput Computing service allows users to utilise spare PCs located in teaching and learning centres across the campus. At its peak, the HTCondor pool contains over 500 PCs on which users can run their computational workloads. CDT in Distributed Algorithms Ph.D. student Matthew Carter has developed an opportunistic SMC sampler that capitalises on this spare compute at the University. So far, the opportunistic SMC sampler has been applied to several example problems in epidemiology and proteomics, with plans to apply the framework to pertinent real-world problems soon.
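For readers unfamiliar with HTCondor, work is farmed out to the pool via a submit description that queues many independent jobs. The sketch below is an assumption for illustration only (the file, wrapper-script name, and job count are hypothetical, not the project's actual configuration):

```
# sampler.sub -- hypothetical HTCondor submit description
executable   = run_sampler.sh     # assumed wrapper: processes one block of particles
arguments    = $(Process)         # job index selects which block to process
output       = out/sampler.$(Process).out
error        = out/sampler.$(Process).err
log          = sampler.log
request_cpus = 1
queue 100                         # 100 independent jobs across spare PCs
```

Because the SMC weight computations are independent, each queued job can run on whichever teaching-centre PC happens to be idle.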

Matthew said, “Opportunistic computing has the potential to utilise spare compute resources across geographic areas, not just within a single institution. The City of Liverpool is interconnected via high-speed gigabit Ethernet, positioning us powerfully to leverage spare compute across the city to address previously inaccessible challenges. Having demonstrated the potential of the proposed opportunistic SMC sampler, our focus now lies in identifying relevant problem areas where the full capability of the SMC sampler can be harnessed.”

A conference paper on this work has been submitted to a leading conference and is currently in the review process. We plan to release the paper and an accompanying software package once this process is complete. Moving forward, we plan to apply the framework to a wide set of real-world problems, including monitoring the evolution of pandemics, utilising proteomics to understand Alzheimer's Disease, and aiding in the search for MH370. This work was co-funded by the EPSRC Centre for Doctoral Training in Distributed Algorithms at the University of Liverpool and IBM Research Europe.

Keywords: Bayesian Inference, Opportunistic Computing, Distributed Computing, Healthcare Analytics
