AI models have proven they can do a lot, but what tasks do you actually want them to perform? Preferably mundane tasks, and there are plenty of them in research and academia. Reliant wants to specialize in the kind of time-consuming data extraction tasks that are currently the domain of exhausted graduate students and interns.
“The best thing AI can do is improve the human experience — reduce menial labor and let people do the things that matter to them,” said CEO Karl Moritz. In the world of research, where he and co-founders Marc Bellemare and Richard Schlegel have worked for years, literature reviews are one of the most common examples of this “menial labor.”
Every paper cites previous or related research, but finding these sources in a sea of science can be a challenge, and some papers, such as systematic reviews, cite or use thousands of data sources.
Regarding one study, Moritz recalled, “The authors had to sift through 3,500 scientific papers, many of which ultimately turned out to be irrelevant. It took a huge amount of time to extract even a fraction of useful information. This really felt like a task that should be automated by AI.”
They knew that modern language models could do it: in one experiment, they used ChatGPT for the task and found that it could extract data with an 11% error rate. Like many things that LLMs can do, this is impressive, but not quite what people actually need.
“That's not enough,” Moritz says. “It's really important not to make mistakes when doing these knowledge-based tasks, even if they are simple.”
Reliant's flagship product, Tabular, is based in part on an LLM (Llama 3.1) enhanced with other proprietary techniques, and is far more accurate: on the thousands of studies mentioned above, Reliant says Tabular performed the same extraction without error.
So you can throw in 1,000 documents and ask for this or that data, and Reliant will comb through them and find it, whether or not the information is cleanly labeled and structured (it usually isn't). It then presents all that data, along with any analysis you want to run, in a UI that lets you drill down into each individual case.
“Users need to be able to interact with all the data at once, so we're building features that allow users to edit existing data or navigate from the data to the literature. We see our role as helping users find where to focus their attention,” Moritz said.
This customized, effective application of AI is less flashy than AI companions, but it's far more feasible and has the potential to accelerate science across many advanced technical domains. Investors have taken notice, funding the company in an $11.3 million seed round led by Tola Capital and Inovia Capital, with participation from angel investor Mike Volpi.
Like any application of AI, Reliant's technology is compute-intensive, so the company buys its own hardware rather than renting it à la carte from large providers. Sourcing hardware in-house brings both risks and rewards: the company has to recoup the cost of these expensive machines, but owning dedicated compute gives it the opportunity to attack its problem domain in ways on-demand rentals wouldn't allow.
“One thing we've found is that it's very hard to give a good answer when you have a limited time to give it,” Moritz explains. Take, for example, a scientist asking the system to perform a new extraction or analysis task across 100 papers: unless the system has predicted what the user will ask and worked out the answer, or something close to it, in advance, it can respond quickly or well, but not both at the same time.
“A lot of people have the same questions, so we can find the answer before they even ask the question,” says Bellemare, the startup's chief scientific officer. “We can extract 100 pages of text into something else, which may not be exactly what you want, but it's manageable for us.”
Think of it this way: if you were trying to extract meaning from 1,000 novels, would you wait until someone asked for a character's name and then grab it? Or would you know you'll probably need that data and just do the work up front (along with places, dates, relationships, etc.)? Definitely the latter — if you have the computing power to spare.
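The precompute-versus-on-demand trade-off described above can be sketched in a few lines of Python. This is a toy illustration, not Reliant's actual pipeline: the `extract_fields` function here is a trivial stand-in for what would in practice be an expensive LLM-based extraction step.

```python
def extract_fields(document: str) -> dict:
    """Stand-in for an expensive extraction step (an LLM call in practice)."""
    words = document.split()
    return {"n_words": len(words), "first_word": words[0] if words else None}

class EagerIndex:
    """Runs extraction once at ingest, so every later query is a cheap lookup."""

    def __init__(self, documents: list[str]):
        # Pay the extraction cost up front, while compute is idle.
        self._cache = [extract_fields(doc) for doc in documents]

    def query(self, field: str) -> list:
        # No extraction at query time: just read the precomputed records.
        return [record[field] for record in self._cache]

docs = ["alpha beta gamma", "delta epsilon"]
index = EagerIndex(docs)
print(index.query("n_words"))     # [3, 2]
print(index.query("first_word"))  # ['alpha', 'delta']
```

The design choice is the same one Moritz and Bellemare describe: extraction happens once at ingest, so user queries never wait on the slow step, at the cost of doing work for questions nobody may ask.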
This pre-extraction also gives the model time to resolve the inevitable ambiguities and assumptions found across scientific disciplines. A metric that “indicates” one thing in pharmaceuticals might not indicate the same thing in pathology or clinical trials. Not only that, but language models tend to give different outputs depending on how a question is phrased. So Reliant's job was to turn ambiguity into certainty. “This is something that can only be done if you're willing to invest in specific science and disciplines,” Moritz noted.
As a company, Reliant's initial focus is on establishing that the technology is profitable before attempting anything more ambitious. “To make interesting progress, you need a big vision, but you also need to start with something concrete,” Moritz says. “From a startup survival standpoint, we're focusing on for-profit customers because they give us the capital to pay for the GPUs. We're not going to sell this to customers at a loss.”
One might expect the company to feel pressure from model makers like OpenAI and Anthropic, and implementation-focused companies like Cohere and Scale, which are pouring money into handling more structured tasks like database management and coding. But Bellemare was optimistic: “We're building this on a groundswell of activity. Any improvement in the tech stack is great for us. The LLM is one of maybe eight large-scale machine learning models we use. The others are completely proprietary and built from the ground up around the uniqueness of our data.”
The AI-driven transformation of the biotech and research industry is certainly just beginning and is likely to remain incomplete for the next few years, but Reliant appears to have found a solid foundation from which to start.
“If all we wanted was a 95% solution, where every once in a while we have to apologize profusely to one of our customers, that would be fine,” Moritz says. “But we work in areas where precision and repeatability really matter, where mistakes really matter, and frankly, that's enough for us. We're happy to leave the rest to other companies.”