Q: How do we manage wells at the edge of a lease or frontier, where there’s not much data?
A: Small data doesn’t get the same hype as big data, but small or incomplete data matters to oil and gas companies. AI offers a way around that deficit. Enter transfer learning. According to Arsenault from Towards Data Science, “Transfer learning is an up-and-coming technique that allows us to transfer the knowledge learned in one dataset and apply it to another dataset. Transfer learning has largely taken a back-seat in the machine learning community up until recently with the rise of deep neural networks. Deep neural networks are extremely flexible compared to most other machine learning techniques. They can be trained, chopped up, modified, retrained, and generally just abused in all sorts of ways.”
I put the question to our VP of Product, Charles Connell, who works with our development team to blend the right kinds and amounts of AI to interpret your pad’s data with a high level of accuracy, even when the data is sparse or small.
Connell explains, “The risk with a lot of machine learning models is that you can interpolate between data but you can’t extrapolate beyond the bounds of what the data is showing you. What we’ll often do at the start of any project is some data exploration to look at the different variables we care about and what the range of data is that actually exists.
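As a rough illustration of the data exploration Connell describes, the sketch below checks whether a new pad falls outside the range of the wells a model was trained on. The well records, feature names, and helper functions are hypothetical and are not taken from Petro.ai.

```python
def feature_ranges(wells, features):
    """Return {feature: (min, max)} observed across the training wells."""
    return {f: (min(w[f] for w in wells), max(w[f] for w in wells)) for f in features}


def extrapolated_features(target, ranges):
    """List the features where the target falls outside the observed range."""
    return [f for f, (lo, hi) in ranges.items() if not lo <= target[f] <= hi]


# Hypothetical offset wells: all of them sit east of the new pad.
training_wells = [
    {"lon": -102.10, "lat": 31.80, "lateral_ft": 7500},
    {"lon": -102.05, "lat": 31.85, "lateral_ft": 9800},
    {"lon": -101.98, "lat": 31.78, "lateral_ft": 10200},
]
new_pad = {"lon": -102.25, "lat": 31.82, "lateral_ft": 9000}

ranges = feature_ranges(training_wells, ["lon", "lat", "lateral_ft"])
flagged = extrapolated_features(new_pad, ranges)
print(flagged)  # → ['lon']
```

Any feature in the flagged list is one where the model would have to extrapolate rather than interpolate.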
“Let’s say your pad is located right on the boundary of the basin. There are no wells to the west of you. All the wells are to the east. We capture some spatial trends through lat and long, based off the location. But, if something is changing drastically, it’s hard for the model to pick it up just in lat and long.
“One of the first things we’ll try to do is supplement the data that we have with data that represents what could be happening spatially. One really good method, if you have them, is to use geologic maps of the area that extend across the basin. If you know how the geology is changing, that is a more explicit way of capturing what’s happening than lat and long alone.
“Petro.ai has the ability to load in these geologic maps and use those as features in our productivity models.
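To make the idea of a map as a model feature concrete, here is a minimal sketch that looks up a gridded map value at each well location. It is not Petro.ai's loader or API; the grid layout, the `sample_grid` helper, and the thickness values are all assumptions made for illustration.

```python
def sample_grid(grid, x0, y0, cell, x, y):
    """Return the value of the grid cell containing point (x, y).

    grid[row][col] holds map values; row 0 starts at y0, column 0 at x0,
    and each cell is `cell` units wide. Returns None outside the map.
    """
    col = int((x - x0) // cell)
    row = int((y - y0) // cell)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 2 x 3 thickness map (ft), 1,000 ft cells, origin at (0, 0).
thickness = [
    [40, 45, 52],
    [38, 44, 50],
]
wells = {"A": (500, 500), "B": (2500, 1500), "C": (3500, 500)}

# One map-derived feature value per well.
features = {name: sample_grid(thickness, 0, 0, 1000, x, y) for name, (x, y) in wells.items()}
print(features)  # → {'A': 40, 'B': 50, 'C': None}
```

Well C lies beyond the mapped area and comes back as None, which is a useful signal that the map does not cover that location.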
“Now let’s say you don’t even have that, but you have some intuition about what’s happening based on individual logs that you’ve seen across the region. What we can try to do is create our own map that just has some arbitrary changes in it. We can create a map with a feature that ranges between 0 and 1 and changes from east to west. That gives the model an arbitrary feature that represents what we think might be happening.
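A 0-to-1 trend feature of this kind can be sketched in a few lines. The version below assumes projected x/y coordinates (x increasing to the east) and a simple linear ramp; the function name and the sample wells are hypothetical.

```python
import math


def gradient_feature(points, angle_deg):
    """Linear 0-to-1 ramp that increases along angle_deg.

    Angles are measured counterclockwise from east, so 0 increases toward
    the east, 90 toward the north, and 180 toward the west.
    """
    ux, uy = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    proj = [x * ux + y * uy for x, y in points]  # position along the direction
    lo, hi = min(proj), max(proj)
    if hi == lo:  # all wells share one position along this direction
        return [0.5 for _ in proj]
    return [(p - lo) / (hi - lo) for p in proj]


# Hypothetical well locations (x = easting, y = northing, arbitrary units).
wells = [(0.0, 0.0), (5.0, 2.0), (10.0, 1.0)]
east_to_west = gradient_feature(wells, 180.0)
print([round(v, 3) for v in east_to_west])  # → [1.0, 0.5, 0.0]
```

The westernmost well gets a value of 1 and the easternmost gets 0; changing the angle rotates the trend.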
“We can have it change gradually or steeply, east to west or north to south. We can run the analysis at different angles and different gradients to see if the model picks it up as a significant feature. We look at the variable importance of everything we put in. If this map comes back with the same importance as a random variable, then we know that it’s not very predictive. If it comes back as a highly important feature, then we know that we’ve validated that your intuition is correct and that there is something changing along that direction. We can use that to help the model.
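The screening loop can be sketched as follows. To stay dependency-free, this example uses Pearson correlation as a crude stand-in for the model-based variable importance Connell refers to, sweeps direction only (not steepness), and runs on synthetic wells built so that production improves toward the west. All names and numbers are hypothetical.

```python
import math
import random


def gradient_feature(points, angle_deg):
    """0-to-1 ramp increasing along angle_deg (0 = east, 90 = north, 180 = west)."""
    ux, uy = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    proj = [x * ux + y * uy for x, y in points]
    lo, hi = min(proj), max(proj)
    return [(p - lo) / (hi - lo) for p in proj]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic wells (hypothetical): productivity improves toward the west.
points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(60)]
production = [100 + 5 * (10 - x) + rng.gauss(0, 2) for x, _ in points]

# Baseline: a feature that is pure noise by construction.
noise = [rng.random() for _ in points]
baseline = abs(pearson(noise, production))

# Score the ramp at eight directions, 45 degrees apart.
scores = {a: pearson(gradient_feature(points, a), production) for a in range(0, 360, 45)}
best_angle = max(scores, key=scores.get)

print(f"random baseline |r| = {baseline:.2f}")
for angle, r in scores.items():
    print(f"{angle:3d} deg: r = {r:+.2f}")
print("best direction:", best_angle)
```

A direction whose score is no better than the noise baseline is not adding information. With a real productivity model, the same comparison would be made on the fitted model's variable importances rather than on raw correlations.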
“Some of the internal discussions that we’ve had indicate that if you just want the model to pick up that trend from the lat and long, that’s asking the model to infer a lot. By adding this additional feature, we’re helping the model understand what might be happening. We’re giving it a good starting point so the model can determine whether it’s significant or not.
“These are data-driven models, so where you don’t have actual data, you can try to supplement it with other things that you think might be happening. A client might have intuition about what’s happening. If we take this approach where we have an arbitrary variable that has a gradient, and it comes back as not improving the model’s accuracy, we’ve at least shown that the trend is not as significant as they thought it was. Or, if it does improve the model’s accuracy, we’ve validated that this is a significant trend, and the model now captures it.”