
Harnessing the Power of GPT-3 in Scientific Research

Source: CogWorld Think Tank on VentureBeat

Since its launch in 2020, Generative Pre-trained Transformer 3 (GPT-3) has been the talk of the town. The powerful large language model (LLM) trained on 45 TB of text data has been used to develop new tools across the spectrum — from getting code suggestions and building websites to performing meaning-driven searches. The best part? You just have to enter commands in plain language.

GPT-3’s emergence has also heralded a new era in scientific research. Since the LLM can process vast amounts of information quickly and accurately, it has opened up a wide range of possibilities for researchers: generating hypotheses, extracting information from large datasets, detecting patterns, simplifying literature searches, aiding the learning process and much more.

In this article, we’ll take a look at how it’s reshaping scientific research.

The Numbers

Over the past few years, the use of AI in research has grown at a stunning pace. A CSIRO report suggests that nearly 98% of scientific fields have implemented AI in some capacity. Want to know who the top adopters are? In the top five, you have mathematics, decision sciences, engineering, neuroscience and healthcare. Moreover, around 5.7% of all peer-reviewed research papers published worldwide focused on AI.

As for GPT-3, there are more than 300 applications worldwide using the model. They use it for search, conversation, text completion and more. The maker of GPT-3, OpenAI, claims that the model generates a whopping 4.5 billion+ words every day.

How GPT-3 is Being Used in Research

Is this the future of scientific research? It's a bit too early to say. But one thing is for sure: The new range of AI-based applications is helping many researchers connect the dots faster. And GPT-3 has a massive hand in that. Labs and companies worldwide are using GPT-3's open API to build systems that not only automate mundane tasks but also provide intelligent solutions to complex problems. Let's look at a few of them.

In life sciences, you have GPT-3 being used to gather insights on patient behavior for more effective and safer treatments. For instance, InVibe, a voice research company, employs GPT-3 to understand patients’ speech and behavior. Pharmaceutical companies then use these insights to make informed decisions about drug development.

LLMs like GPT-3 have been used in genetic programming too. A recently published paper, “Evolution Through Large Models,” shows how LLMs can serve as mutation operators in genetic programming, proposing meaningful code changes in place of purely random edits.
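
To make the idea concrete, here is a minimal sketch of an evolutionary loop that uses an LLM as its mutation operator. It assumes the openai Python package and a configured API key; the model name, prompt format, seed program and fitness function are illustrative placeholders, not the actual setup from the paper.

```python
import random
import openai  # assumes openai.api_key has been configured

def llm_mutate(program: str) -> str:
    """Ask the model to propose a small variation of a candidate program."""
    prompt = (
        "Here is a Python function:\n\n"
        f"{program}\n\n"
        "Rewrite it with one small improvement or variation:\n"
    )
    response = openai.Completion.create(
        model="text-davinci-002",  # assumed model name, for illustration only
        prompt=prompt,
        max_tokens=256,
        temperature=0.8,
    )
    return response["choices"][0]["text"]

def evaluate_fitness(program: str) -> float:
    """Placeholder fitness: in practice, run the candidate against test cases."""
    return random.random()

# Simple keep-the-best loop: mutate each parent, score everything, keep the top 3.
population = ["def step(x):\n    return x + 1\n"]
for generation in range(5):
    offspring = [llm_mutate(parent) for parent in population]
    population = sorted(population + offspring, key=evaluate_fitness, reverse=True)[:3]
```

The design choice worth noting is that the LLM replaces only the mutation step; selection and evaluation still work exactly as in classical genetic programming.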

Solving mathematical problems is still a work in progress. A team of researchers at MIT found that you can get GPT-3 to solve mathematical problems with few-shot learning and chain-of-thought prompting. The study also revealed that to solve university-level math problems consistently, you need models pre-trained on text and fine-tuned on code. OpenAI's Codex had a better success rate in this regard.
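
Here is a minimal sketch of what few-shot, chain-of-thought prompting can look like in practice. The worked example and model name are assumptions for illustration, not the exact setup the MIT team used.

```python
import openai  # assumes openai.api_key has been configured

# One worked example plus "Let's think step by step" nudges the model to show
# its reasoning before giving a final answer (chain-of-thought prompting).
FEW_SHOT_PROMPT = """Q: A train travels 60 km in 1.5 hours. What is its average speed?
A: Let's think step by step. Speed = distance / time = 60 / 1.5 = 40 km/h.
The answer is 40 km/h.

Q: If f(x) = 3x^2 + 2, what is f(4)?
A: Let's think step by step."""

response = openai.Completion.create(
    model="text-davinci-002",  # assumed model name, for illustration only
    prompt=FEW_SHOT_PROMPT,
    max_tokens=200,
    temperature=0,  # deterministic output for a single worked answer
)
print(response["choices"][0]["text"].strip())
```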

Now, if you want to learn complex equations and data tables found in research papers, SciSpace Copilot can help. It’s an AI research assistant that helps you read and understand papers better. It provides explanations for math and text blocks as you read. Plus, you can ask follow-up questions to get a more detailed explanation instantly.

Another application tapping into GPT-3 to simplify research workflows is Elicit. The nonprofit research lab Ought developed it to help researchers find relevant papers without perfect keyword matches and get summarized takeaways from them.

Another tool in this space is System, an open data resource that you can use to understand the relationship between any two things in the world. It gathers this information from peer-reviewed papers, datasets and models.

Most researchers have to write a lot every day. Emails, proposals, presentations, reports, you name it. GPT-3-based content generators like Jasper and text editors like Lex can help take the load off their shoulders. From basic prompts in natural language, these tools can generate text, autocomplete your writing and help you articulate your thoughts faster. More often than not, the output will be accurate and grammatically sound.

What about coding? Well, there are GPT-3-based tools that generate code. Epsilon Code, for instance, is an AI-driven assistant that processes your plain-text descriptions to generate Python code. But Codex-powered applications such as GitHub Copilot are better suited to this purpose.
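
As a rough illustration of the idea behind such tools, here is a minimal sketch that asks a code-capable model to turn a plain-text description into Python. The model name, prompt format and task description are assumptions for illustration; this is not how Epsilon Code or Copilot are actually implemented.

```python
import openai  # assumes openai.api_key has been configured

description = "Read a CSV file of sales records and print total revenue per region."

response = openai.Completion.create(
    model="code-davinci-002",  # assumed Codex model name, for illustration only
    prompt=f"# Python 3\n# {description}\n",
    max_tokens=300,
    temperature=0,
)

# The model continues the comment with code; always review it before running.
print(response["choices"][0]["text"])
```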

At the end of the day, GPT-3 and other language models are excellent tools that can be used in a variety of ways to improve scientific research.

Parting Thoughts On GPT-3 and LLMs

As you can see, the potential of GPT-3 and other LLMs for the scientific research community is tremendous. But you cannot discount the concerns associated with these tools: a potential increase in plagiarism and other ethical issues, replication of human biases, propagation of misinformation, and omission of critical data, among other things. The research community and other key stakeholders must collaborate to ensure AI-driven research systems are built and used responsibly.

Ultimately, GPT-3 is a helpful tool. But you can't expect it to be correct all the time. It's still in the early stages of its evolution. Transformer models, which form the foundation of LLMs, were introduced only in 2017. The good news is that the early signs are positive. Development is happening quickly, and we can expect LLMs to improve and become more accurate.

For now, you might still receive incorrect predictions or recommendations. This is normal and something to bear in mind when using GPT-3. To be safe, always double-check anything produced by GPT-3 before relying on it.


Dr. Ekta Dang is CEO and Founder of U First Capital and drove venture capital investments at Intel prior to that. She has an excellent track record of achieving at least 1 Exit every year: Pensando (Acquired by AMD in 2022), Nextdoor (IPO 2021), Palantir (IPO 2020), Orb Intelligence (Acquired by Dun and Bradstreet in 2020), Pinterest (IPO 2019), Docusign (IPO 2018), etc. She has invested in category-leading companies such as SpaceX, Uniphore, Pensando, and DevRev. Dr. Dang is a member of Congressman Ro Khanna’s Leadership Circle and advises him on technology policy. She is also an Advisor to the Chancellor of the University of California San Diego and has been on the Science Translation and Innovative Research (STAIR) Grant Committee of the University of California Davis. She is also an Advisor to Israel’s Ben-Gurion University of the Negev. She is a contributing writer for VentureBeat and Cognitive World. She has been a successful venture capital investor, corporate executive, speaker, and writer in Silicon Valley for two decades. While at Intel, she demonstrated solid experience both in venture capital and on the operating side. She brings a stellar network from the venture capital and corporate world to the startup community. She has been a mentor at Alchemist Accelerator, Stanford, UC Berkeley, Google Launchpad, etc. She has been a member of the US-level Technology Policy Advisory Committee. She is an invited Speaker at several top venture capital/startup conferences, like TiECon, Silicon Valley Open Doors, etc. She has also co-chaired TiECon’s entrepreneurship track. She has a Ph.D. in physics (electronics) and has attended UC Berkeley Haas School's Venture Capital Executive Program. She has published several research papers in IEEE and other reputed international journals.