A Deterministic Data Agent Framework for Generative AI in Oil and Gas Operation

Abstract

This paper presents a deterministic agent, based on Generative AI and large language models (LLMs), that decomposes tasks into controlled, deterministic subtasks and provides the LLM with only minimal contextual information. Raw data is stored in a structured graph, ensuring privacy and reducing token usage. Combined with preloaded analytics libraries, code generation, verification, and native visualizations, the framework delivers reliable, reproducible, and scalable insights, enabling trusted, cost-efficient deployment of agentic AI in petroleum operations.
The Data Agent framework has been deployed at a midstream gas processing facility, where operators interact with the system through natural language to obtain current operational insights, asset health metrics, and actionable recommendations. Users can generate, prepare, and send reports directly from the application using natural language instructions, which eliminates manual steps and improves workflow efficiency. Observations show that the deterministic, agentic workflow produces consistent, reproducible, and hallucination-free outputs, while advanced analytics and native visualizations enhance decision-making. The preloaded general-purpose and domain-specific libraries allow the system to handle a wide range of tasks reliably, and context-limited code generation ensures data privacy, data residency, and cost-efficient token usage. Overall, the deployment demonstrates that such an agentic AI framework can accelerate operational decision-making, break data silos, and provide engineers and operators with trusted insights across upstream and downstream workflows.
The proposed framework employs a deterministic, stepwise orchestration process that ensures Generative AI and LLMs are applied in a controlled and reliable manner. When a user initiates a task, the Data Agent decomposes it into smaller, well-defined subtasks that follow predefined paths, limiting open-ended reasoning and minimizing hallucinations.
At each stage, the LLM is provided only the context necessary to generate executable analysis code, never raw data. The agent is equipped with a preloaded set of general-purpose analytics libraries for tasks such as forecasting, resampling, and interpolation, as well as domain-specific toolkits for industry-relevant calculations. Generated code is executed against real datasets and validated for accuracy before results are delivered. Context narrowing ensures that only relevant assets or datasets are considered, maintaining data privacy, data residency, and reproducibility. This structured approach allows the agent to produce reliable, reproducible insights while supporting advanced analytics and visualization across both upstream and downstream workflows.
The objective of this paper is to present a Deterministic Data Agent Framework that leverages Generative AI, LLMs, and agent-based workflows to improve data discoverability and accessibility across upstream and downstream operations. The framework enables organizations to break silos between diverse datasets, transform raw information into actionable insights, and accelerate digital transformation by creating domain-specific AI agents that enhance efficiency, reliability, and decision-making without requiring additional coding.
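The workflow described above (context narrowing, schema-only code generation, verification, and execution against the real data) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the asset names, the `GRAPH` dictionary standing in for the structured graph, the `fake_llm` function standing in for the model call, and the helper whitelist are all hypothetical and chosen only for illustration. The restricted `exec` call is likewise a simplification and should not be read as a production-grade sandbox.

```python
import ast

# Hypothetical stand-in for the structured graph: asset -> column -> raw values.
GRAPH = {
    "compressor_A": {"discharge_pressure_bar": [41.0, 42.5, 43.0, 44.5]},
    "compressor_B": {"discharge_pressure_bar": [38.0, 38.5, 39.0, 39.5]},
}


def mean(values):
    """Example of a preloaded analytics helper exposed to generated code."""
    return sum(values) / len(values)


# Whitelist of callables that generated code may use (assumed, not from the paper).
ALLOWED_HELPERS = {"mean": mean, "max": max, "min": min, "len": len}
ALLOWED_NAMES = set(ALLOWED_HELPERS) | {"data", "result"}


def narrow_context(asset):
    """Context narrowing: expose only the schema of one asset, never its values."""
    return {"asset": asset, "columns": sorted(GRAPH[asset])}


def fake_llm(context):
    """Stand-in for the LLM call: writes code from column names alone."""
    column = context["columns"][0]
    return f"result = mean(data[{column!r}])"


def verify(code):
    """Static verification: reject imports, attribute access, and unknown names."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.Import, ast.ImportFrom, ast.Attribute)):
            raise ValueError("disallowed construct in generated code")
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            raise ValueError(f"disallowed name: {node.id}")


def run_task(asset):
    """Fixed path: narrow context -> generate code -> verify -> execute locally."""
    context = narrow_context(asset)
    code = fake_llm(context)
    verify(code)
    # Raw data is attached only here, after the model has already returned.
    scope = {"__builtins__": {}, **ALLOWED_HELPERS, "data": GRAPH[asset]}
    exec(code, scope)
    return context, code, scope["result"]


context, code, result = run_task("compressor_A")
print(context)  # schema only: asset name and column names
print(code)     # the generated analysis code
print(result)   # value computed locally from the raw data
```

In this sketch the model-facing context contains column names but no measurements, and the raw values are attached only at execution time, which mirrors the separation between code generation and data access described in the abstract.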
