AI Needs the ‘Applied Sciences’ Treatment

As industries rapidly advance in AI and machine learning, a key to unlocking the power of these approaches for companies is an enabling environment. Domain experts need to be able to use artificial intelligence on data relevant to their work, but they should not have to know computer science or data science techniques to solve their problems. An environment that enables the domain expert to easily and intuitively label data and train models will allow AI to become truly ‘applied.’ The above image shows a series of fault planes predicted by our approach in the SubsurfaceAI Seismic application, created with ‘applied machine learning’ in mind.

The Rise of ‘Applied Machine Learning’ and Geoscience

A generally accepted definition of ‘applied sciences’ is the use of the scientific method and associated knowledge to solve practical problems. The next phase for artificial intelligence, now gaining attention, is ‘applied AI.’ An analogous definition would be ‘the use of AI methods and associated domain expert knowledge to solve practical problems,’ with the underlying methods transparent to the expert who owns the challenge and uses the tools.

In the last few years, the field of AI/machine learning has experienced rapid advances in capabilities and applications. A search on the words ‘applied machine learning’ provides a host of engaging articles. These articles also indicate AI capabilities have been driven largely by data scientists and experts in coding/model building using general approaches. 

Many of these capabilities have yet to be easily accessible to domain experts in a way that enables them to rapidly adapt them to solve their specific problems – in other words, to be truly ‘applied.’ 

Machine learning models themselves are increasingly commoditized, freely available, and easy to build proof-of-concept work around. In business terms, this means that internally developed machine learning models will soon no longer differentiate a company’s offering.

Machine learning models have become more like plug-and-play building blocks that can be fitted into a solution. This makes it very easy to rapidly test and prototype solutions, but it does not solve the critical problems of actually putting AI solutions into the hands of domain experts in a way that lets them adapt those solutions as working tools.

The challenge is how industries move to ‘applied machine learning.’ A key question for this transition: how do you set up a solution that gives domain experts access to machine learning tools in an environment where they can easily and intuitively train models?

That is a much trickier proposition than prototyping a solution, and it’s why we’re seeing recent high valuations for companies such as Scale and Labelbox, which are focused on providing a way to operationalize AI for business. 

It’s All About the User and Labeled Data 

The machine learning models themselves are important, and it is necessary to test different networks, different models, and various ways of layering models to arrive at reasonable results. However, many of these models are essentially commodities. So, although you need to fiddle with them, a domain expert can take them off the shelf and connect different ones in various ways to test different solutions relatively quickly and easily. 

Increasingly, labeled data is the key. Things get more challenging when it comes to how the domain expert will interact with the machine learning models and with the data they are interested in manipulating or analyzing. A lot of effort from many companies has gone into labeling, analyzing, and interpreting everyday types of data. In the B2C world, this is dominated by pictures of people, roads, cars, or usage patterns of consumers of various services, such as social media platforms and video streaming services. 

These efforts have resulted in a large amount of labeled data on objects that commonly appear in our world. But those efforts have left behind many of the sciences, where datasets are typically much smaller, fewer people work on them, and fewer still can label the data correctly.

For example, let’s say an exploration team wants geologic core (rock) data labeled such that the stratigraphy is highlighted, as well as the general makeup of each of the stratigraphic layers (e.g., sand, shale and carbonates). They can’t just let someone with no geoscience background do the labeling. The result would be a bunch of meaningless training data. That’s the situation many science-based companies are in. They have good data, maybe not on a large scale, but enough to use AI and ML to good effect. However, they lack the labeling technology and labeled data to use it effectively.

So, a really important thing to build simply and intuitively is the user interface to the data and AI models. The domain expert must be able to easily and intuitively interact with the data and rapidly build out training data on which to run AI models.

The Ultimate Prize for the Geoscientist 

The ultimate prize in the subsurface world of energy is for the geoscientist to be training the machine learning model while labeling the data. This is a revolutionized workflow – one that completely removes any role for an intermediary such as a data scientist and one that enables the domain expert to utilize a model that will interpret the way they do.
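
To make the idea of training a model while labeling concrete, here is a minimal sketch of one common pattern, uncertainty-based active learning. It is an illustration only and not how SubsurfaceAI Seismic is implemented. A simple nearest-centroid classifier stands in for a deep learning model, the data is synthetic, and every name in it (`label_while_training`, `ask_expert`, the single normalized feature, the ‘sand’ and ‘shale’ rule) is a hypothetical chosen for the example.

```python
import random


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train(labeled):
    """Fit a nearest-centroid model: one centroid per label."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the sample."""
    return min(model, key=lambda label: distance(model[label], features))


def uncertainty(model, features):
    """Score a sample; a small gap between the two nearest centroids scores high.

    Requires a model with at least two labels.
    """
    d = sorted(distance(c, features) for c in model.values())
    return -(d[1] - d[0])


def label_while_training(pool, seed, ask_expert, rounds):
    """Alternate between asking the expert for one label and retraining."""
    labeled = list(seed)      # must already cover at least two labels
    unlabeled = list(pool)
    model = train(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Show the expert the sample the current model is least sure about.
        sample = max(unlabeled, key=lambda f: uncertainty(model, f))
        unlabeled.remove(sample)
        labeled.append((sample, ask_expert(sample)))
        model = train(labeled)  # retrain immediately on the new label
    return model, labeled


def expert(features):
    """Stand-in for the geoscientist's judgment (hypothetical rule)."""
    return "sand" if features[0] < 0.5 else "shale"


# Synthetic pool: one made-up, normalized feature per sample.
rng = random.Random(0)
pool = [[rng.random()] for _ in range(50)]
seed = [([0.1], "sand"), ([0.9], "shale")]
model, labeled = label_while_training(pool, seed, expert, rounds=10)
```

In this sketch the expert supplies only ten labels, and each one goes to the sample the current model finds most ambiguous. In a real application the `expert` function would be replaced by the labeling interface, and the classifier by whatever model the application uses.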

In the energy industry subsurface world, one could envision analogs to ImageNet, for example a ‘Seismic ImageNet,’ a ‘WellLogNet,’ and a ‘CoreCTScanNet,’ as open source datasets. Enough open source data is rapidly becoming available to develop such high-quality models.

Automated, iterative image labeling integrated with models makes this possible. As a result, companies with massive amounts of subsurface data exclusive to them will find their advantage in big data approaches eroding.

This prize is available, albeit at an early stage, for seismic interpretation in our recently developed custom deep learning application, SubsurfaceAI Seismic. If you would like to see how it meets the ‘applied machine learning’ test, please get in touch.

About the Author

Mason Dykstra is the Enthought vice president of Energy Solutions. As an intuitive thought leader, he helps oil and gas companies connect the dots between science, engineering, technology, and business needs. Mason leads the Enthought team of energy experts and scientists in tackling big problems that contribute to the bottom line. Connect with Mason on LinkedIn at linkedin.com/in/mason-dykstra-a304b25/ to join his online conversations.
