Automated hypothesis generation: an AI role in science

When I was getting my PhD in Ann Arbor during the 1980s, just staying up to date with the literature relevant to my own thesis project was a constant challenge. There was a paper magazine back then called Current Contents (CC). CC contained just that: the tables of contents for all of the relevant journals (in Life Sciences). It was a critical resource because, even then, there was no other way to keep tabs on the collective scientific output.

Keeping tabs was not just for general knowledge about the field, or even about properly giving credit to others. Rather, it was critical to hypothesis creation. Asking the right question (at the right time) is what determines scientific success in many cases. But you can’t ask the right question without knowing whether it has already been asked. And really, you can’t ask the right question without a full understanding of the current state of scientific knowledge.

At the time, it was the habit in many high-impact papers to make the last figure a cartoon schematic that represented the authors’ view of where the field stood at the moment of the paper’s acceptance into the journal. In my field of molecular neuroscience, this was often a series of shapes and arrows representing key biomolecules and pathways. It was often amusing to go from one paper to the very next one from a particular group and see that some of the arrows had mysteriously reversed direction since the previous cartoon. This was presumably because the paper’s results, along with other results, had changed the authors’ thinking.

In any case, that cartoon figure was always a clue to the next hypothesis a particular research group would test. So in a sense, you could predict the trajectory of scientific inquiry from that cartoon figure at the end of a paper.

That was the 1980s. Our scientific knowledge base has expanded exponentially since then. One of the current versions of Current Contents is called Faculty of 1000 (F-1000). It’s online, of course. The idea is that leaders in the field curate the papers that you should read based on your profile. It’s a great idea, I guess, although with science being as competitive as it is, I doubt that the elect would hand over a paper’s brilliant and undiscovered insight to the unwashed if it really might supercharge some scientific inquiry. However, as a scientist, you have many other choices. Google Scholar comes to mind: it’s comprehensive, and I’m pretty sure it uses AI extensively to tailor its results. So it is machine-driven rather than human-driven (as in the case of F-1000).

However, the cartoon figure at the end of papers has become largely obsolete (although it does still make appearances). That’s because pretty much all of science, and certainly the life sciences, has become incredibly complex. In my field, you can’t make a cartoon big enough to represent all the relevant biomolecules and pathways, and the arrows have become hopelessly intertwined because of the multiplicity of feedback loops and cross-talk links.

So not only is it difficult for the clever reader to glean the next hypothesis (even when there is a cartoon); it’s impossible for the author to do the same.

This has pushed much of science away from Popper’s paradigm of testing falsifiable hypotheses and toward exploratory research. In such science, I might read the data stream from some set of sensors, correlate that data with some external variable (like seasonality) and publish a correlation that is intriguing. Correlation, of course, is not causation. We all know that.
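To make the correlation-versus-causation point concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the two sensors, their coefficients, and the noise levels are assumptions, not real data. Both synthetic sensors are driven by the same seasonal cycle, so they correlate strongly with each other even though neither influences the other. Once the seasonal component is subtracted, the correlation should largely vanish.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical setup: two years of daily readings with a yearly cycle.
days = range(730)
season = [math.sin(2 * math.pi * d / 365) for d in days]

# Two made-up sensors. Each depends on the season plus its own
# independent noise; neither depends on the other.
sensor_a = [10 + 3 * s + rng.gauss(0, 1) for s in season]
sensor_b = [50 + 5 * s + rng.gauss(0, 1) for s in season]

# The "intriguing" exploratory finding: the sensors track each other.
r_raw = pearson(sensor_a, sensor_b)

# Subtract the shared seasonal driver (known here only because we
# built the data ourselves) and correlate what is left.
resid_a = [a - 3 * s for a, s in zip(sensor_a, season)]
resid_b = [b - 5 * s for b, s in zip(sensor_b, season)]
r_resid = pearson(resid_a, resid_b)

print(f"raw correlation:            {r_raw:.2f}")
print(f"after removing seasonality: {r_resid:.2f}")
```

In real exploratory work the shared driver is usually not known in advance, which is exactly why a correlation alone cannot stand in for a tested hypothesis.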

And yet, science has the tools to do excellent hypothesis-based research. In neuroscience, optogenetic methods allow us to turn neural circuits on and off to understand their effects upon behavior. In molecular biology, CRISPR does the same for genetic circuits and networks.

The problem is not executing the research. It’s asking the right question. In biology, generating a hypothesis that is both parsimonious and consistent with all of the current knowledge in a scientific discipline is challenging for human scientific superstars and downright impossible for your typical graduate student coming up with a thesis project. I believe that the same is true for any area of science where the volume of knowledge and relevant data has expanded exponentially.

But all is not lost. I think this is a perfect domain for AI as it exists today. Keeping tabs on many disparate but relevant data points and then coming up with a next move? That’s how AIs beat humans at chess right now. So… AI working with human scientists might be a very fruitful collaboration going forward. And it may yet save hypothesis-based research.

Hubble Telescope

I know that we are waiting on the James Webb Space Telescope, but disturbingly, the Hubble Space Telescope is sitting in safe mode after a gyro failure this past weekend (hat tip to NASA Watch). This is the telescope that has been the workhorse of NASA’s astronomy program.

In a sense this is the future. If we continue to send very complicated gadgets to environments at the technological edge, such as space and the deep ocean, and particularly if they soon carry AI on board, they are going to have to be much more resilient. There are implications for big science and DOD.

Drone attacks

The Atlantic has an excellent piece about the use of drones by non-state actors for bad purposes, here. My own view is that edge computing and AI will render these technologies vastly more destructive in the future. Not in terms of mass destruction, but in terms of targeted destruction. The key is how to defend against AI-enabled swarms. Could the new 5G networks somehow be deployed in an emergency to do just that?

Spoofing AI

This is an interesting new scientific meme. It made it into Science on the basis of a presentation at the International Conference on Machine Learning, here. The idea is that hackers can easily defeat AIs (think “social engineering” used on a machine).

Meanwhile there is the contrasting meme of us getting spoofed by AI, in the FT, here. In this case AIs are able to make videos of people doing things that they did not do.

All of this gets to the cybersecurity aspects of AI that potentially put society at risk.