Soumya Banerjee
Senior Research Fellow and Affiliated Lecturer
University of Cambridge, Cambridge, United Kingdom
E-mail: [email protected], [email protected]
Office: FC01 (first floor) in the computer science department
I work on explainable AI (xAI) and unconventional approaches to AI, at the intersection of complex systems and xAI: I take inspiration from complex systems to suggest new approaches to AI, and use AI to analyze complex systems.
This project will use large language models (LLMs) and a simple robot to explore the idea of embodied intelligence. We will also explore the idea of a mirror test for robots.
Intrigued? Please come speak with me.
This project will extend the following paper:
Language Models Meet World Models: Embodied Experiences Enhance Language Models
https://arxiv.org/pdf/2305.10626.pdf
This project could also extend LLMs to play video games in 3D environments. For example, see the paper:
Scaling Instructable Agents Across Many Simulated Worlds
https://arxiv.org/pdf/2404.10179
HAZARD challenge: embodied decision making in dynamically changing environments
https://arxiv.org/pdf/2401.12975.pdf
https://github.com/UMass-Foundation-Model/HAZARD
Natural language societies of mind
https://arxiv.org/abs/2305.17066
- Contemporary approaches to explainable AI are model-centric. We will use data-centric approaches to explain the complex interplay between data and models. This will build on and extend published work [1]. This project is ideal for a student with an interest in machine learning and some coding experience.
There are many ways in which the work presented in [1] can be extended (either new methods or new application areas). Please get in touch to discuss.
Another project idea is to apply explainable AI approaches to genomic data. This will be a machine learning, computational biology and bioinformatics project.
The student will develop explainable AI approaches for interpreting clusters in single-cell gene expression data or other biological data. There is also an opportunity to look at other computational biology projects. No background in biology is necessary.
This work is part of the Accelerate Programme for Scientific Discovery which aims to democratize access to AI tools and apply AI to problems from diverse disciplines. The student will be part of a growing community of inter-disciplinary AI researchers at the University of Cambridge.
Project idea 1C: explainable AI and LLMs/generative AI applied to health data/electronic healthcare record data
This project will involve developing novel explainable AI algorithms, LLMs and generative AI and applying them to health data from a local hospital. The data will be on mental health.
The student will work closely with a clinician and psychiatrist on real data, and will learn how to work in an interdisciplinary manner. This work would have real-world impact and will help patients with mental illness.
This is work in collaboration with Dr. Anna Moore Winter.
Tailor machine learning model explanations to the audience (e.g. patients, clinicians, farmers). Generate natural language explanations from a machine learning model and tailor them to the unique background of the listener or audience.
Develop explainable machine learning models that explain themselves without leaking personal data. For example, class-contrastive reasoning techniques [1] can be used to generate explanations, but they can inadvertently leak personal data. This project will explore the tradeoff between explainability and privacy (and potentially bias), leading to models that balance explainability, privacy and bias.
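The class-contrastive idea of [1] can be sketched in a few lines: flip one feature at a time and report how the model's prediction changes. Everything below is illustrative (a toy logistic model with made-up feature names and weights), not the model from the paper:

```python
import numpy as np

# Toy logistic risk model; feature names and weights are illustrative only.
FEATURES = ["medication_A", "symptom_B", "age_over_65"]
WEIGHTS = np.array([-1.5, 2.0, 0.8])
BIAS = -0.5

def predict_risk(x):
    """Return P(high risk) for a binary feature vector x."""
    return 1.0 / (1.0 + np.exp(-(WEIGHTS @ x + BIAS)))

def class_contrastive_explanations(x):
    """For each feature, flip it and report the change in predicted risk.

    This is the class-contrastive question: 'what would the model have
    predicted had this feature been different?'
    """
    base = predict_risk(x)
    contrasts = {}
    for i, name in enumerate(FEATURES):
        x_flipped = x.copy()
        x_flipped[i] = 1 - x_flipped[i]
        contrasts[name] = predict_risk(x_flipped) - base
    return base, contrasts

patient = np.array([0, 1, 1])  # no medication A, has symptom B, over 65
risk, deltas = class_contrastive_explanations(patient)
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"flipping {name}: risk changes by {delta:+.3f}")
```

The privacy question then becomes concrete: each contrast reveals something about how the model treats an individual's features, so one can study how much such explanations leak.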
This project will involve modelling complex systems (like an epidemic spread) with generative agents. For example, generative agents can be coupled to agent based models. This can be used to simulate epidemics.
https://arxiv.org/abs/2307.04986
https://github.com/bear96/GABM-Epidemic
We will further develop this framework and apply it to other complex systems: supply chains, disinformation, vaccine hesitancy, conflict in societies, ecosystem modelling (see below), etc.
https://research.csiro.au/atlantis/home/model-components/
We can also apply this to models like the World3 model
https://en.wikipedia.org/wiki/World3
https://github.com/cvanwynsberghe/pyworld3
and models of ecosystems
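The coupling described above (generative agents driving an agent-based epidemic model) can be sketched minimally as follows. This is only an illustration of the pattern, not the GABM-Epidemic code: the `decide_to_go_out` function is a stub standing in for the per-agent LLM call, and all parameter values are made up:

```python
import random

def decide_to_go_out(agent, infected_fraction):
    """Stub for an LLM call: in a generative agent-based model, each agent
    would be prompted with its persona and the current news, and the LLM
    would decide. Here a simple cautiousness rule stands in for that call."""
    return random.random() > agent["cautiousness"] * infected_fraction * 5

def step(agents, beta=0.5, gamma=0.1):
    """One day: agents decide whether to go out, infection spreads among
    those who did, and infected agents recover with probability gamma."""
    frac = sum(a["state"] == "I" for a in agents) / len(agents)
    out = [a for a in agents if decide_to_go_out(a, frac)]
    n_infected_out = sum(a["state"] == "I" for a in out)
    for a in out:
        if a["state"] == "S" and random.random() < beta * n_infected_out / max(len(out), 1):
            a["state"] = "I"
    for a in agents:
        if a["state"] == "I" and random.random() < gamma:
            a["state"] = "R"

random.seed(0)
agents = [{"state": "S", "cautiousness": random.random()} for _ in range(200)]
for a in agents[:5]:
    a["state"] = "I"
for day in range(50):
    step(agents)
final = {s: sum(a["state"] == s for a in agents) for s in "SIR"}
print(final)
```

Replacing the stub with a real LLM call, conditioned on an agent's persona and memory, turns this into a generative agent-based model in the spirit of the papers above.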
- Build a computational model of analogy making and apply it to biomedical and genomic data.
For other project ideas related to explainable AI see the following page:
https://github.com/neelsoumya/special_topics_unconventional_AI/
Broadly this will use concepts like analogies and stories to create new explainable AI methods.
See for example
https://github.com/Tijl/ANASIME
https://github.com/crazydonkey200/SMEPy
https://github.com/fargonauts/copycat
This is a project on automatically discovering scientific laws (like Kepler's Law) and invariants (like Boyle's Law) from data.
This may involve building a Bayesian and/or probabilistic programming model of infection dynamics (like an SIR model) or of an intra-cellular regulatory network [5]. The model would then be applied to infection data from different sources.
Other models include physics simulators like pymunk:
http://www.pymunk.org/en/latest/examples.html#planet-py
You would run physics-based simulations (like pymunk) or other models (like the SIR model above) and develop a machine learning approach to automatically generate insights from them.
This would be an explainable AI model for a complex model of a physical system.
The project would involve building a model that would generate insights from these complex systems (an artificial model of human creativity).
There is also scope to use large-language models in this project.
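As a concrete starting point, the deterministic SIR model that a Bayesian or probabilistic programming treatment would build on can be simulated in a few lines. Parameter values here are illustrative:

```python
import numpy as np

def simulate_sir(beta, gamma, s0=0.99, i0=0.01, days=160, dt=0.1):
    """Forward-simulate the SIR equations
        dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I
    with simple Euler steps; returns one (S, I, R) row per day."""
    s, i, r = s0, i0, 0.0
    trajectory = []
    for _ in range(days):
        for _ in range(int(1 / dt)):
            new_s = s - beta * s * i * dt
            new_i = i + (beta * s * i - gamma * i) * dt
            new_r = r + gamma * i * dt
            s, i, r = new_s, new_i, new_r
        trajectory.append((s, i, r))
    return np.array(trajectory)

traj = simulate_sir(beta=0.3, gamma=0.1)  # basic reproduction number R0 = 3
peak_day = int(np.argmax(traj[:, 1]))
print(f"epidemic peaks on day {peak_day} with {traj[peak_day, 1]:.1%} infected")
```

A probabilistic programming version would place priors on beta and gamma and condition this simulator on observed case counts to infer them.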
This is a project on automatically discovering scientific laws (like Kepler's Law) and invariants (like Boyle's Law) from data. This will enable us to automatically discover conservation laws from data.
One can build a Bayesian and/or probabilistic programming model of a complex system, such as infection dynamics (an SIR model) or an intra-cellular regulatory network [5].
This would involve building a qualitative process model for a physical system.
This would be an explainable AI model for a complex model of a physical system.
The project would involve building a model that would generate insights from these complex systems (an artificial model of human creativity).
This is a project on automatically discovering scientific laws or mathematical equations from data.
This would involve extending the Ramanujan machine by applying it to other data or other dynamical systems or using another machine learning approach.
https://github.com/RamanujanMachine/RamanujanMachine
Other ideas include extending AI Feynman 2.0
https://github.com/SJ001/AI-Feynman
or BACON.3
https://github.com/jantzen/BACON
One can also apply symbolic regression approaches like PySR (Python) or gramEvol (R).
This can be applied to discover, for example, trigonometric identities.
Other approaches include Bayesian symbolic regression
https://arxiv.org/abs/1910.08892
https://github.com/ying531/MCMC-SymReg
or seq2seq approaches to symbolic regression
https://openreview.net/pdf?id=W7jCKuyPn1
https://github.com/SymposiumOrganization/NeuralSymbolicRegressionThatScales
This would be an artificial model of human creativity.
Other ideas include discovering ordinary differential equations from data
https://arxiv.org/abs/2211.02830#
These techniques can also be applied to healthcare data (for example, data from smartwatches). This would be an AI applied to healthcare project (jointly with Dr. Abhirup Ghosh).
An example dataset can be the following:
https://www.physionet.org/content/wearable-exam-stress/1.0.0/
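A minimal sketch of the library-based sparse regression idea behind many of these approaches (SINDy-style): build a library of candidate terms, fit by least squares, and threshold small coefficients. Here it rediscovers the double-angle identity cos(2t) = cos²(t) − sin²(t) from data; the library of terms is illustrative:

```python
import numpy as np

# Candidate feature library; the task is to express y = cos(2t) in terms of it.
t = np.linspace(0, 2 * np.pi, 400)
library = {
    "1": np.ones_like(t),
    "sin(t)": np.sin(t),
    "cos(t)": np.cos(t),
    "sin^2(t)": np.sin(t) ** 2,
    "cos^2(t)": np.cos(t) ** 2,
    "sin(t)cos(t)": np.sin(t) * np.cos(t),
}
names = list(library)
X = np.column_stack([library[n] for n in names])
y = np.cos(2 * t)

# Least-squares fit, then zero out small coefficients (SINDy-style sparsification).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
coef[np.abs(coef) < 1e-6] = 0.0
identity = " + ".join(f"{c:+.2f}*{n}" for c, n in zip(coef, names) if c != 0)
print(f"cos(2t) = {identity}")
```

The same recipe applied to numerically estimated derivatives (instead of a known target) is how ordinary differential equations are recovered from trajectory data.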
Dynamics of collective learning in artificial neural networks, Hopfield networks, self-organizing maps and neural gas networks.
We will apply ideas about the emergence of intelligence-like behaviour in these systems, as in the following paper:
Evolving reservoir computers reveals bidirectional coupling between predictive power and emergent dynamics
https://arxiv.org/pdf/2406.19201v1
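As a starting point for experiments with collective dynamics, a minimal Hopfield network with Hebbian learning fits in a few lines of numpy. Network size, pattern count and noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def train_hopfield(patterns):
    """Hebbian learning: W is the sum of outer products of the stored +/-1 patterns."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0)  # no self-connections
    return W

def recall(W, state, steps=10):
    """Synchronous sign updates until convergence (or a step limit)."""
    for _ in range(steps):
        new = np.sign(W @ state)
        new[new == 0] = 1
        if np.array_equal(new, state):
            break
        state = new
    return state

# Store two random +/-1 patterns, then recall the first from a corrupted copy.
patterns = rng.choice([-1, 1], size=(2, 100))
W = train_hopfield(patterns)
noisy = patterns[0].copy()
flipped = rng.choice(100, size=10, replace=False)
noisy[flipped] *= -1
recovered = recall(W, noisy)
overlap = (recovered == patterns[0]).mean()
print(f"overlap with stored pattern after recall: {overlap:.2f}")
```

Watching the overlap as a function of load (number of stored patterns) is one simple way to study emergent collective behaviour in these systems.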
This project will investigate collective artificial intelligence in behaviour (altruism, co-operation, competition) and in building structures (e.g. structures to capture prey). It will use the multi-agent platform MAgent:
https://github.com/geek-ai/MAgent
We can also use the EvoJax framework
https://github.com/google/evojax/blob/main/examples/notebooks/EncirclingAgents.ipynb
We can also extend MAgent using dream mechanisms in World Models
https://worldmodels.github.io/
https://smartlabai.medium.com/world-models-a-reinforcement-learning-story-cdcc86093c5
The generative agents project above can also be combined with neural cellular automata (see the projects below).
Project idea 7 (Explainable neural cellular automata applied to biology [computational biology project])
Extend the neural cellular automata framework
https://distill.pub/2020/growing-ca/#experiment-2
and make it more interpretable and explainable.
For example, you can apply it to data from the Game of Life and infer the rules.
https://github.com/lantunes/netomaton/blob/master/demos/game_of_life/README.md
https://github.com/tomgrek/gameoflife
We can also apply it to biological data from cell biology (this can be a computational biology project). We have real world data from cell biology.
https://greydanus.github.io/2022/05/24/studying-growth/
Another idea is to apply this to simulations from computational fluid dynamics using the software below:
https://github.com/md861/HypFEM
This would be jointly with Mayank Drolia.
This can also be applied to genomic data.
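Before making a neural cellular automaton interpretable, one needs ground-truth dynamics to train and test against. A minimal Game of Life data generator (a numpy sketch, not the netomaton API) looks like this:

```python
import numpy as np

def life_step(grid):
    """One step of Conway's Game of Life on a toroidal grid."""
    # Count the 8 neighbours of every cell by summing shifted copies of the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dr, axis=0), dc, axis=1)
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next step if it has 3 neighbours, or is alive with 2.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(np.uint8)

# A 'blinker' oscillates with period 2 -- a simple target for rule inference:
# generate (state, next_state) pairs and fit an interpretable model to them.
grid = np.zeros((8, 8), dtype=np.uint8)
grid[3, 2:5] = 1  # horizontal bar of three live cells
after_one = life_step(grid)
after_two = life_step(after_one)
print(np.array_equal(after_two, grid))  # the blinker returns after two steps
```

Pairs of consecutive grids from this generator are exactly the supervision a neural cellular automaton needs, and the known underlying rule gives a ground truth to check inferred rules against.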
Use neural cellular automata for self-organized control of complex systems
https://arxiv.org/abs/2106.15240
We can use this framework to, for example, model and control epidemics.
Also see projects above on generative agents.
https://arxiv.org/abs/2307.04986
https://github.com/bear96/GABM-Epidemic
Neural Cellular Automata Enable Self-Discovery of Physical Configuration in Modular Robots Driven by Collective Intelligence
https://www.nichele.eu/ALIFE-DistributedGhost/1-Nadizar.pdf
The Unity ML engine for controlling robots:
https://github.com/FrankVeenstra/EvolvingModularRobots_Unity
This project would involve injecting commonsense into large language models.
Large language models can fail in spectacular ways. Some of this can be attributed to a lack of commonsense:
http://web.archive.org/web/20230902080842/https://garymarcus.substack.com/p/doug-lenat-1950-2023
https://arxiv.org/pdf/2308.04445.pdf
This would involve using the open-source database of commonsense rules (OpenCyc)
https://github.com/asanchez75/opencyc
and incorporating small aspects of this in a simple large language model.
Build a large language model to solve the Abstraction and Reasoning Corpus Challenge
https://github.com/fchollet/ARC
Abstraction and Reasoning Corpus Challenge
https://blog.jovian.ai/finishing-2nd-in-kaggles-abstraction-and-reasoning-challenge-24e59c07b50a
https://github.com/alejandrodemiquel/ARC_Kaggle
It has been suggested that large language models cannot reason. This project will infuse some reasoning/priors into large language models and apply them to a large reasoning corpus (Abstraction and Reasoning Corpus).
We can also augment human performance with LLMs.
We can also apply large language models to reasoning in math problems.
This will be a collaboration with Mikel Bober-Irizar.
This project will use hydrologic and rainfall data from the British Antarctic Survey to build machine learning models that predict rainfall, climate change and their effect on diseases (vector-borne diseases like malaria).
This would be a collaboration with Dr. Andrew Orr at the British Antarctic Survey.
This project will use large language models (LLMs) for scientific and mathematical reasoning.
https://arxiv.org/pdf/2308.09583.pdf
https://arxiv.org/pdf/2307.10635.pdf
This is jointly with Dr. Abhirup Ghosh.
War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars
https://arxiv.org/pdf/2311.17227.pdf
https://github.com/agiresearch/WarAgent
This project will use mechanistic interpretability to explain LLMs.
Theory of Mind benchmark for large language models
https://arxiv.org/abs/2402.06044
https://github.com/seacowx/OpenToM
Creating multi-agent systems with LLMs.
In many companies, managers routinely decide what roles to hire for, and then how to split complex projects (like writing a large piece of software or preparing a research report) into smaller tasks to assign to employees with different specialties. Using multiple agents is analogous. Each agent implements its own workflow, has its own memory (itself a rapidly evolving area in agentic technology: how can an agent remember enough of its past interactions to perform better on upcoming ones?), and may ask other agents for help. Agents can also engage in planning and tool use. This produces a cacophony of LLM calls and messages passed between agents, which can result in very complex workflows.
https://github.com/OpenBMB/ChatDev
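The pattern above (agents with roles, per-agent memory, and message passing) can be sketched in a few lines. Everything here is illustrative: `call_llm` is a stub standing in for a real LLM API call, and the role names are made up:

```python
# Minimal sketch of a multi-agent pattern: agents with roles, per-agent
# memory, and message passing between agents.

def call_llm(role, memory, message):
    """Placeholder for a real LLM call; returns a canned role-conditioned reply."""
    return f"[{role}] considering '{message}' with {len(memory)} past messages"

class Agent:
    def __init__(self, name, role):
        self.name, self.role = name, role
        self.memory = []  # each agent keeps its own interaction history

    def receive(self, message):
        reply = call_llm(self.role, self.memory, message)
        self.memory.append((message, reply))
        return reply

# A manager-style workflow: split a task and route sub-tasks to specialists.
coder = Agent("alice", "software engineer")
tester = Agent("bob", "QA engineer")
draft = coder.receive("implement a CSV parser")
review = tester.receive(f"write tests for: {draft}")
print(review)
```

Frameworks like ChatDev elaborate exactly this loop: many role-conditioned agents exchanging messages, with real LLM calls where the stub sits here.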
Self-Replicating Prompts for Large Language Models: Towards Artificial Culture
https://direct.mit.edu/isal/proceedings/isal2024/36/110/123523
https://github.com/gstenzel/TowardsACULTURECode
Extend the following framework to reason about the physical properties of objects
Compositional Physical Reasoning of Objects and Events from Videos
https://arxiv.org/pdf/2408.02687
https://physicalconceptreasoner.github.io/
https://emprise.cs.cornell.edu/rcareworld/
Extend this project and create AI that will perform experiments and write papers
https://arxiv.org/pdf/2408.06292
Improvements to the following paper
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
https://arxiv.org/pdf/2407.20183
Also see the demo
https://mindsearch.netlify.app/demo
https://mindsearch.openxlab.org.cn/
Can Large Language Models Understand Symbolic Graphics Programs?
https://arxiv.org/pdf/2408.08313
Evaluating Math Reasoning in Visual Contexts
Extend the work below to solve math problems and physics problems
https://qwenlm.github.io/blog/qwen2-math/
Study emergence and phase transitions in LLMs. We will also investigate how hallucinations arise in LLMs. Collaboration with Prof. Georgi Georgiev.
This project will develop Swahili LLMs for scientific question answering. This is a project with Dr. Nirav Bhatt.
This project will explore what kind of morality LLMs have.
This will extend the work of the Moral Machine and this paper:
The moral machine experiment on large language models
https://royalsocietypublishing.org/doi/full/10.1098/rsos.231393
A project to explain large-language models (LLMs) using prompt engineering based on class-contrastive counterfactuals [1,2]. This is a joint project with Prof. Pietro Lio.
This project will look at building a part of an AI powered virtual cell (AI virtual cell foundation models).
How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities
https://arxiv.org/pdf/2409.11654
Students will be jointly supervised with Prof. Neil Lawrence or Prof. Pietro Lio.
Project ideas can be developed according to student interests.
Please contact Soumya Banerjee at [email protected] or [email protected] to have an informal chat. Please also send a copy of your CV.
You can learn more about my work here:
https://sites.google.com/site/neelsoumya
The main objective is to develop a suite of techniques inspired by classical AI to inform explainable AI. This project is part of a wider effort of unconventional approaches to AI.
[1] Banerjee S, Lio P, Jones PB, Cardinal RN (2021) A class-contrastive human-interpretable machine learning approach to predict mortality in severe mental illness. npj Schizophrenia 7: 1–13.
[2] Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1: 206–215.
[3] Banerjee S, Bishop TRP (2022) dsSynthetic: synthetic data generation for the DataSHIELD federated analysis system. BMC Research Notes 15: 230.
[4] Banerjee S, Sofack GN, Papakonstantinou T, Avraam D, Burton P, et al. (2022) dsSurvival: privacy preserving survival models for federated individual patient meta-analysis in DataSHIELD. BMC Research Notes 15: 197.
[5] Banerjee S, Perelson AS, Moses ME (2017) Modelling the effects of phylogeny and body size on within-host pathogen replication and immune response. Journal of the Royal Society Interface 14(136): 20170479.
[6] When data drift does not affect the performance of machine learning models. https://web.archive.org/web/20230606205118/https://www.nannyml.com/blog/when-data-drift-does-not-affect-performance-machine-learning-models, URL accessed June 2023.
[7] Banerjee S, Chapman SJ (2018) Influence of correlated antigen presentation on T cell negative selection in the thymus. Journal of the Royal Society Interface 15(148): 20180311.