How to include AI methods in your next proposal

For all of us who write research funding proposals, the massive wave of powerful AI methods poses a serious challenge: how can we appropriately include such methods in our next research project? Is it mandatory, or can we happily focus on the field, lab or numerical methods that we have always worked with? Proposal evaluators, on the other hand, often have the feeling that many proposals mention unspecific machine learning (ML) methods just because they must be there. What is the way forward here?

To answer some of these questions, I spoke with Jan Dirk Wegner, who holds the “Data Science for Sciences” chair at the DM3L at the University of Zurich and is head of the EcoVision Lab. In addition, Jan is a member of the evaluation panel for the Starting Grants of the Swiss National Science Foundation, where I am also actively involved.

Of course, what we discuss below reflects only our view, but it might give you some hints on how to shape your next proposal.

Let’s start with the central question: Do we geoscientists all have to include AI methods in our proposals?

Jan: I do think that machine learning and deep learning methods are too often used when and where they are actually not necessary. So, working without machine learning is completely fine if your approach works well without it. But if you are submitting a career grant proposal, ask yourself: can your work make an impact without applying machine learning methods? Will you miss the “train” if you keep away? If you have not worked with machine learning before, but it would generally be useful in your field and you see it becoming more important there in the future, I would recommend at least mentioning it under related work and possibly including a small task where you explore its opportunities together with machine learning experts (do not forget to add a support letter from them!).

As a non-specialist, what should I pay attention to if I talk about or propose to use AI methods?

Jan: Make sure to stay away from assembling only buzzwords. Do not simply follow the hype, but think carefully about whether and, if so, where modern deep learning methods can have a substantial impact in your field and proposal. Be specific and avoid a collection of unspecific methods and terminology: what are the promising methods for your problem? Who develops them? Are there code packages available? What is needed to apply them?

Make sure you know your data and the data requirements of the method you propose using! This will be critically reviewed by specialists. Supervised deep learning methods usually need large amounts of diverse training data that come with ground truth of good quality. If you have not worked with deep learning methods before but are eager to explore them in your project, team up with someone who knows more about them as a project partner. But take care: make sure you really interact with that person, be specific about how that partner will contribute, and make clear whether the partner will just act as a “helper” or whether methodological developments are possible.

And again: be specific about which deep learning method you want to apply, for what reason and with what input data!

In fact, should I talk about AI or machine learning?

Jan: You can talk about machine learning, deep learning or AI; all are generally fine today. I myself prefer using the umbrella term machine learning, then specify in the next sentence that I will focus on modern deep learning methods, and after that describe exactly which deep learning method I want to develop or explore. Talking about AI often sounds a bit too buzzy for my taste. If you want it in your title or some section headings, make sure to then write very specifically in the following text that you are talking about deep learning methods and, even more specifically, which ones, why, and based on what (very large amount of) data.

In conclusion

“Learning from data” is at the heart of the geosciences, whether to develop new theories, new process parameterisations, new observational tools or new management techniques. And we are actually the people who produce the data that are fed into machine learning tools, via our field and lab work, monitoring activities and traditional modelling work. Selling the value of these data in our research proposals is becoming ever more important and requires putting them into context in the modern machine learning world. This can feel scary to many of us, but, hey, it has never been easier to find someone who can explain the basics to you or recommend a code toolbox, a class to follow or a colleague to work with.