
Imperial College London’s Marek Rei is Using Machine Learning and Language Models to Improve Healthcare


Marek Rei is an Associate Professor of Machine Learning at Imperial College London as well as a visiting researcher at the University of Cambridge in the UK.

Originally from Estonia and a graduate of Tallinn University of Technology, he went to the UK after finishing his undergraduate studies in 2008 and has carved out an academic career there focused on machine learning and natural language processing. With some experience in the private sector, notably at SwiftKey, where he created a neural network language model for text prediction, he’s also become an advisor on AI for a few companies, including Locai Labs and Esgrid Technologies, and offers consulting services through a firm called Perception Labs.

Currently, Marek works with his own research team to further develop language models by expanding their abilities to plan, reason, and make decisions. He is also applying these new models to medical and healthcare data, with the hope of designing tools that streamline the healthcare process so that physicians can make faster and better decisions.

How it all began

How long have you been studying language models?

I started moving in this direction back when I was studying at Tallinn University of Technology. I did my bachelor’s degree there and finished in 2008. After that, I went to the University of Cambridge to do a master’s in speech and text processing. The course was 50 percent speech analysis, 50 percent text analysis. I got more interested in the text side, and that is where my interest in language modeling comes from.

This was all before neural networks and machine learning were such hot topics. There was machine learning, but we used different types of models: probabilistic machine learning, statistics-based machine learning, and algorithms like support vector machines. But we were already doing interesting things with them.

My work was focused on finding ways of learning useful knowledge while requiring minimal supervision. How can we learn as much as possible from unsupervised data, and then add a little bit of labeled data to get the system to do specific useful tasks? This is also the foundation for language models these days. The idea is that we train them on huge amounts of unlabeled data and then add a little bit of labeled data or human-engineered prompting to teach them about the specific task.
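As a rough illustration of the pattern he describes (pretraining on plenty of unlabeled text, then adapting with a small labeled set), here is a minimal sketch in PyTorch using toy data. The model, data, and hyperparameters are illustrative assumptions, not code from his research.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden = 100, 32, 64

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.next_token = nn.Linear(hidden, vocab_size)   # pretraining head
        self.classify = nn.Linear(hidden, 2)               # downstream task head

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return h                                            # hidden state at each position

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 1) Pretraining: lots of unlabeled token sequences, learn to predict the next token.
unlabeled = torch.randint(0, vocab_size, (512, 20))        # stand-in for a text corpus
for _ in range(3):
    h = model(unlabeled[:, :-1])
    logits = model.next_token(h)
    loss = loss_fn(logits.reshape(-1, vocab_size), unlabeled[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Adaptation: fine-tune with only a handful of labeled sequences and a small task head.
labeled_x = torch.randint(0, vocab_size, (16, 20))
labeled_y = torch.randint(0, 2, (16,))                      # e.g. one binary label per sequence
for _ in range(10):
    h = model(labeled_x)
    logits = model.classify(h[:, -1])                       # classify from the last hidden state
    loss = loss_fn(logits, labeled_y)
    opt.zero_grad(); loss.backward(); opt.step()
```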

I got into neural network language models after I had finished my master’s and PhD, when I was working for SwiftKey, the company that developed predictive keyboards for phones. There I worked specifically on neural network language models. That was between 2012 and 2014.

When a person works on models, what does that mean? Do you build new models or train existing models to do new things?

My work is research. I develop new models, which means implementing and coding algorithms with new capabilities that other models do not yet have. Oftentimes it does mean starting from an existing model and building on it: coding new training strategies, extensions, and so on. Back when I started my research journey, none of the main neural network libraries existed yet. There was no model you could take off the shelf and start with. We were innovating everything from scratch ourselves. Nowadays, it is often better to start from existing approaches and build on those instead.

Bridging research and business

You have had an interesting career, where you are an academic but advise companies.

Yes, I am mostly in academia, doing research at the university, but I am able to collaborate with companies on the side as well, helping with their machine learning solutions. I have more than a decade of experience working with these models, so I can advise others on how to use them, what their capabilities and weaknesses are, and when and how to best apply them. Or I can help by implementing and training particular machine learning models that are then integrated into the company’s frameworks.

Is it impossible to stay up to date with all of the innovation that is going on in the field?

At this stage, it’s pretty much impossible, but it is still important to stay informed. I read papers and get notified of new papers related to my research. My students find new research publications and tell me about them. It’s a collaborative effort, but I am sure there is interesting work that goes unnoticed, because the number of papers about machine learning and language modeling has grown exponentially in the past few years.

What are you working on right now?

A few different things. On one side my work is very foundational, working on core models and improving them. In that area, I am looking at planning and reasoning with AI models. How can we get these models to reason better in order to solve more complex tasks? At their core, language models are very reactive: they just predict the next token based on the previous tokens.

There is already interesting work out there that asks the models to reason by themselves and then make decisions, and this does help. But we still find many shortcomings, and the models are not able to successfully solve many of the complex multi-step tasks they need to solve. So this is one area, for example. Several of my PhD students are working on creating more capable models around this topic.

On the application side, I like to take these foundational ideas and apply them to different areas. The three biggest areas I have been applying them to are 1) healthcare, building health-based models and working with electronic health records; 2) education and automated language assessment; and 3) sustainability, analyzing company reports to extract sustainability information.

Building predictive health models

Have you been working with the UK National Health Service to get access to healthcare data?

We are talking to them and making plans for larger collaborations, but at the moment we are mostly using public datasets. There are some datasets that have been anonymized, and we are using those to develop tools and to build proof-of-concept ideas. We are building a health world model, where the idea is to predict what is going to happen next. It’s very similar to language modeling, but we are not predicting language, we are predicting events.

The model encodes everything that has happened to a particular patient: any treatment they have received, any note that has been written about them, any test that has been done, any diagnosis that has been assigned. We then train this model to predict what will happen next. This allows us to use the model to simulate future events, calculate the probabilities of different possible outcomes, and try out different strategies within the simulation.
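The sketch below, in PyTorch with an invented event vocabulary and toy patient histories, illustrates the general idea: each item in a record is mapped to an event ID and a sequence model is trained to predict which event comes next, much as a language model predicts the next token. It is an illustration of the concept, not the team’s actual model or data.

```python
import torch
import torch.nn as nn

# Invented event vocabulary: each thing that can happen to a patient becomes an ID.
event_vocab = {"gp_visit": 0, "blood_test": 1, "diagnosis_t2_diabetes": 2,
               "prescribe_metformin": 3, "hba1c_improved": 4}
n_events = len(event_vocab)

class EventModel(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_events, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_events)

    def forward(self, events):                    # events: (batch, time)
        h, _ = self.rnn(self.embed(events))
        return self.out(h)                        # logits over the next event at each step

model = EventModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy "patient histories": train to predict event t+1 from events 1..t.
histories = torch.tensor([[0, 1, 2, 3, 4],
                          [0, 1, 2, 3, 4]])
for _ in range(50):
    logits = model(histories[:, :-1])
    loss = loss_fn(logits.reshape(-1, n_events), histories[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# For a new patient's history so far, read off a distribution over what happens next.
history = torch.tensor([[0, 1, 2]])               # visit -> test -> diagnosis
probs = torch.softmax(model(history)[0, -1], dim=-1)
print({name: round(probs[i].item(), 2) for name, i in event_vocab.items()})
```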

So it is an offshoot of personalized medicine in a way.

Yes, the idea is motivated by the fact that when you go to see a doctor, they basically have 10 minutes for you. During this time they can peek into your medical history, but they won’t have time to read it all. We can build models that go through all of your history in painful detail, every single reference, search for additional information in databases, compare your history to other similar patients, and then make suggestions based on that.

What would the physician get back: a report of suggested potential health issues, or questions to ask?

The models we are building are more general purpose at the moment. They are trained to predict anything that will happen to the patient. We can then specialize these models for particular applications. For example, the model could predict possible diagnoses for a patient, give probabilities for those, and point to specific evidence in the data for why that might be the case. Another possible application is to have the system recommend tests that a patient should be given. Another is that, when considering alternative treatments, the model could run simulations for the different options to see what the probability is that this patient will benefit from a particular treatment.
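The treatment-comparison idea can be sketched in the same spirit: append a candidate treatment to the patient’s history, roll a trained next-event model forward many times, and count how often a desired outcome appears. The next_event_probs function below is a hard-coded stand-in for such a trained model, with made-up event names and probabilities purely for illustration.

```python
import random

OUTCOME = "recovered"

def next_event_probs(history):
    """Stand-in for a trained model: P(next event | history). Probabilities are invented."""
    if history[-1] == "treatment_A":
        return {"recovered": 0.7, "side_effect": 0.2, "no_change": 0.1}
    if history[-1] == "treatment_B":
        return {"recovered": 0.5, "side_effect": 0.1, "no_change": 0.4}
    return {"no_change": 1.0}

def simulate(history, steps=3):
    """Sample one possible future by repeatedly drawing the next event."""
    events = list(history)
    for _ in range(steps):
        probs = next_event_probs(events)
        events.append(random.choices(list(probs), weights=list(probs.values()))[0])
    return events

def p_outcome(history, treatment, n=2000):
    """Monte Carlo estimate of how often the desired outcome appears after a treatment."""
    hits = sum(OUTCOME in simulate(history + [treatment]) for _ in range(n))
    return hits / n

history = ["gp_visit", "blood_test", "diagnosis"]
for treatment in ["treatment_A", "treatment_B"]:
    print(treatment, p_outcome(history, treatment))
```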

The preliminary models are trained and ready. We are now working to benchmark them on some complex tasks and to get additional data into them. Basically, we are scaling up at the moment, trying to build better models and evaluate them on more complex applications.

Balancing roots and research

There is always a question for Estonian scientists abroad about whether they would return to Estonia. Have people tried to entice you to move back?

I am very glad that they want me back and have had invitations to return full-time as well. At the moment, I quite like what I have at Imperial. It’s a bit difficult to replicate that in Estonia. But I do feel a strong connection to Estonia and I am looking for ways to establish deeper connections there work-wise as well.

Estonia became famous for its Tiger Leap program and has now launched an AI Leap initiative to familiarize Estonian students with AI. What do you think about this?

I think Tiger Leap was very influential, both in advancing the country’s IT sector and in shaping how Estonia is seen abroad; I think people know Estonia through IT. Working in AI, I think it is a very interesting and promising area, and I support putting focus on it as well. With AI Leap, I have gotten the impression that a large part of the focus is on using AI. In contrast, I would want a good proportion of AI Leap to also focus on researching and developing new AI solutions. I don’t think we can be leaders if we are just using what others have already put out.

People do have an uneasy relationship with AI. Some are enthusiasts, some are intimidated by it. How do you feel about that?

There are both extremes. There are people who are uneasy about anything related to AI, and there are people who are a bit too excited and try to apply it to everything, thinking of it as a magic solution. That is part of the problem: some people produce low-quality AI-generated output, and then others get discouraged and reject everything related to AI. In my opinion, AI is just a tool, and you need to use it for the right task. It is not the best tool for every single application, but it can be very helpful for certain tasks. If you know how and when to use it, it can be very useful.

This article was written by Justin Petrone and funded by the European Regional Development Fund through the Estonian Research Council.


