Acessibilidade / Reportar erro

Sources of uncertainty and error in the simulation of flow in porous media

Abstract

We are concerned here with the analysis and partition of uncertainty into component pieces, for a model prediction problem for flow in porous media.

uncertainty; solution error; porous media


Sources of uncertainty and error in the simulation of flow in porous media

James GlimmI; Shuling HouII; Yoon-Ha LeeI; David H. SharpIII; Kenny YeI

IDepartment of Applied Mathematics and Statistics SUNY at Stony Brook, Stony Brook NY 11794-3600 E-mail: glimm@ams.sunysb.edu

IITheoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545

IIIApplied Physics Division and Theoretical Division Los Alamos National Laboratory, Los Alamos, NM 87545 E-mail: yunha@ams.sunysb.edu

ABSTRACT

We are concerned here with the analysis and partition of uncertainty into component pieces, for a model prediction problem for flow in porous media.

Mathematical subject classification: 60H30, 62P30, 35R60.

Key words: uncertainty, solution error, porous media.

1 Introduction

In previous papers we have developed a Bayesian approach to uncertainty quantification [8, 9, 10] and the analysis of errors [5, 4, 6, 3, 2, 7] in numerical simulations. Prediction depends on both inverse and forward solutions, the former to fix the model and its parameters and the latter to solve the model and make predictions. In the Bayesian framework, the indeterminacy potentially inherent in the inverse solution is resolved by the use of a probability framework. In this framework, the probabilities of observation and solution errors are the leading contributions to the likelihood of an observation, which is a key factor in the definition of the posterior. The posterior is a probability distribution for models and parameters as constrained by observations. Both the inverse and the forward steps depend on the forward simulation, errors in which introduce error and uncertainty into the analysis.

The purpose of this paper is to analyze and partition prediction uncertainty into component pieces associated with substeps of the prediction process. As with earlier papers, we illustrate our ideas with a simplified study of prediction for an idealized petroleum reservoir. We suppose that we have observations of oil production from an early time period (the ''past'') and we use this information to constrain the prediction of the production for a later period (the ''future''). The past production is known but the geology which gives rise to it is not. The errors or uncertainty in predictions of future production depend on geological, physical, and numerical parameters, e.g. the geology correlation length, the oil to water viscosity ratio, and the coarseness of the mesh used in simulating either the forward or inverse problems.

Our goal is model transparency. We want an uncertainty model, that is simple to describe in an understandable and plausible manner the separate contributions to uncertainty, but accurate enough that it does not add to or increase the uncertainty being explained. Not only is such a model of benefit to the user. More importantly, such a model has a higher chance of being transferable from one problem to another. Because error models require extensive computation to validate and calibrate, the ability to transfer them (from a simple to a more realistic context) is important for the practical application of uncertainty analysis in general.

Our predictions yield a probability distribution for future oil production, based on observations of past oil production. The point of the uncertainty analysis is to assess the spread (uncertainty) contained in this distribution. For example, a probability concentrated at a single value for future production would indicate zero uncertainty. To assess the overall prediction uncertainty, we consider the standard deviation of the ensemble of prediction errors; the ensemble corresponds to different choices for the reservoir which generates the observations of the past production. Each prediction error has been non-dimensionalized by dividing it by its prediction and and this normalization has been performed before the standard deviation is computed. The standard deviation s is estimated by a Least Squares approximation of the slope of a normal plot of prediction errors, as distributed over the ensemble. The conventional 5%, 95% confidence intervals are then given as the mean prediction ±1.96 s.

The main result of this paper is to estimate the separate contributions to s. The observations are insufficient to specify the geology. As a result there is an inherent uncertainty in the problem specification. This contribution to s is labeled sgeology. In addition there is uncertainty associated with the use of approximate numerics to solve the inverse and forward problems, which we denote by sinverse and sforward. Finally, since we consider simplified (approximate) statistical methodologies, we introduce sstatistics to quantify the effects of these approximations. Use of finite sized ensembles and observational error introduce other uncertainties, not, considered here.

If the various sources of uncertainty were independent, then the s would combine according to the rule

While this formula is useful qualitatively, we observe deviations from it in the range of 30% - 50%, indicating significant correlation among the different contributions to the uncertainty. We have been able to observe directly only some of the individual si. For the others, some version of (1) is used to define the missing si, tacitly assuming independence.

A second result is to determine the dependence of the various si on the parameters defining the problem. These parameters represent geological, numerical, and physical information (explanatory variables) which may be assumed to be known. Even in our simple study, the explanatory variables take on 60 distinct values in total. Thus it is important to compress and synthesize this information, so that useful and comprehensible trends in the dependence of the various s on the explanatory variables can be understood.

Finally, we observe a general ordering in the magnitudes of the various s:

which must be a consequence of the specific features of porous media flow adopted in this work.

2 The problem formulation

The present paper and earlier ones in this series [5, 4, 6, 3] serve to establish a proto-type model for solution errors for flow in petroleum reservoirs. In order to focus on the uncertainty quantification issues, we have examined somewhat idealized reservoir descriptions. The idealized Darcy and Buckley-Leverett equations

are solved for a total seepage velocity v and oil saturation s. Here K = K(x,z) is the random total permeability, L the relative transmissivity and f the fractional flux. The permeability K is sampled from a lognormal distribution. The flow is nondimensionalized to lie in the unit square 0 < x, z < 1. See also [4, 6] for a more detailed specification of the simulations.

Our error model is based on arrival time differences for solution flux values (the observed oil to water production data, or oil cut), and thus measures errors in the propagation wave speeds, or saturation level sets, as the relative oil flux is a (nonlinear) function of the saturation. The arrival time errors are non-dimensionalized through their expression as a fraction of the actual arrival time intervals. These response variables parameterize the observations made in the past. We parameterize future observations with the total future oil production. We consider the dependence of the response variables upon the scaleup mesh spacing, the geology heterogeneity length scale and the hydrocarbon vs. injected fluid viscosity ratio. These are the explanatory variables. The scaled up mesh spacing [12, 13, 1, 11, 14] uses 5, 10 and 20 cells between wells, with a fine grid geology model defined by 100 cells between wells. The horizontal correlation length of the permeability field varies by a factor of five, from the fraction l = 0.2 of the inter well spacing to the entire inter well spacing l = 1.0, with the values l = 0.2, 0.4,0.6, 0.8, 1.0. The vertical correlation length is fixed at 0.02 as a fraction of the vertical reservoir height. The viscosity ratio n, needed in the definitions of L and f, varies from 5 to 40 with values n = 5, 10, 20, 40. Thus values typical of both water flooding and oil-gas systems are included.

The difference between the fine grid and the coarse grid solution is the solution error, which we study here. We only examine the oil cut, so the solution and the error are time series. We are not using actual production data in this study. We therefore model the real problem by selecting a particular geology Ki0 as the ''correct'' one, and consider the fine grid solution oil cut Si0 as a stand in for the observed oil production .

We assume we have production data up to the present time T0, defined in terms of an oil cut level Si0(T0), here selected as 0.8, 0.6, or 0.4. For each choice of horizontal correlation length l 50 geologies, sampled from a given log normal distribution, define the prior distribution P(K). According to Bayes theorem, the posterior P(K|)

P(K|) is defined in terms of a likelihood, P(|K), for the observation to occur assuming the geology K is correct. Let sf and sc denote the fine and coarse (upscaled) solutions, respectively. As explained above, we choose sf,i0 as a stand in for an actual observation . In evaluating the likelihood, we take = sf,i0 and compare to sc,j, j ¹ i0. The likelihood is thus the probability of an error in sc,j sufficient to produce the discrepancy

Since we assume Kj is correct in evaluating the likelihood, the likelihood specifies the probability of a solution error of a given size, to which we apply our Gaussian error model. See [4] for more detail.

We specify the arrival time error model with five degrees of freedom. They are the breakthrough time and incremental elapsed time at oil cut levels of 0.8, 0.6, 0.4 and 0.2. To avoid an undue influence from numerical diffusion in the coarse grid simulations, we define the breakthrough time in terms of an oil cut value of 0.95. That is D(Sl) = t(Sl)-t(Sl-1), where

and Sl = 1 - 0.2·l, 0 < l < N, and D(S0) = t(S0). See the Fig. 1. Thus, the errors to be modeled are e(Sl) = Df(Sl)-Dc(Sl), the time difference between a given change in the oil cut level for the fine and the coarse grid solutions. The mean and covariance matrix of the error are estimated by

This elapsed time error model is used only to model the past, t < T0, while future oil production

is used to describe the future. The final time is set at 1.4 (PVI) in (6) and si might denote the fine or coarse grid solution. We correct the course grid solution for the mean solution error in all of its uses.


3 Findings

The standard deviation for the relative error in prediction provides a convenient method for quantitative assessment of the accuracy of a prediction. The standard deviation is computed from the the Bayesian posterior and the error model for forward simulations. They are computed for each choice of exact geology i0 and then averaged (RMS) over i0 to yield a final value.

This method is not applicable for fine grid solutions. For the fine grids, we specify an arrival time window (size 0.03) for past observations of the oil cut. Simulations passing through this window define the posterior.

3.1 Approximate independence of distinct error sources

We make precise the definitions of the different si which will enter into our analysis. sgeology represents the uncertainty inherent in the prediction problem. sgeology originates in the incomplete set of observations (the oil cut) used to characterize the reservoir. It is defined as the standard deviation associated with the following prediction problem: the posterior is determined by fine grid solutions, using the windowing method, while the future is also simulated using the fine grid. sforward is defined as the standard deviation of the prediction error resulting from use of the upscaled solution operator for forward predictions, starting from a knowledge of the exact geology. sinverse, the uncertainty in the selection of the posterior distribution, i.e. the inverse problem, is not directly observable, so we define it in terms of an RMS difference

where sBayes ff is defined prediction s for prediction based on the Bayesian posterior (e.g. course grid solutions for the inverse problem) and the fine grid solutions for the forward simulation. The statistical uncertainty is also not directly observable, and is defined in terms of an RMS difference

where sBayes ff stat is the standard deviation associated with the Bayesian prediction using a fine grid solution for the forward step, and an approximate statistical model for the coarse grid error covariance used for the inverse problem in the definition of the Bayesian posterior. We consider two approximate statistical models: a diagonal covariance and a simple parametric model in which the diagonal covariance entries satisfy a linear dependence on explanatory variables. See [7] for details. For comparison with (1), we define sBayes stat as the standard deviation associated with the Bayesian prediction using a coarse grid for both the inverse and forward steps and an approximate statistical model for the solution error covariance.

Eq. (1) asserts that

Eliminating the trivial combinations from this formula which collapse by definition, we are asserting that

There are actually three versions of this formula, with the full statistics covariance (no approximations), the diagonal covariance and the parametric model diagonal covariance. The formula asserts independence of prediction errors associated with the inverse and forward (past and future) aspects of prediction.

To assess the accuracy of (10), we perform an RMS average over all explanatory parameter values of stotal as defined by the formula and of sBayes stat as defined directly. We find that (10) explains about 2/3 of the value of stotal compared to sBayes stat in the sense of an RMS average of the dependence of both sides of the equation on the explanatory parameters.

3.2 Dependence of error on problem parameters

Here we develop a parametric model for the dependence of the various si on the explanatory variables. We propose a relation of the form

Here the a's are coefficients to be determined and the x's represent values for the explanatory variables. Thus xmesh = Dx/D where D is the interwell spacing and Dx is the scaled up mesh spacing. Also xp is the natural logarithm of the ratio of the displaced to displacing viscosities, and xgeology is the horizontal correlation length, expressed as a fraction of D. These definitions are modified from related definitions in [7]. With these definitions, our main result for this section is a table of the a's for each si, together with a percentage of the RMS parametric variability captured by this formula. The a's are obtained from a nonlinear regression, to minimize errors in a fit to the exponential of (11). See Table 1.

We now discuss these results qualitatively. sgeology increases strongly with correlation length and viscosity ratio. This dependence is shown graphically in Fig. 2. Note that although the viscosity ratio has nothing to do with the geology, it affects the flow, and the degree to which the flow is predictable or unpredictable. High viscosity ratios make the flow less predictable. sforward has a strong dependence on the degree of scale up (mesh spacing), moderate dependence on the correlation length and very little dependence on the viscosity ratio. See Fig. 3. sBayes ff (fine grid forward simulation) is remarkably independent of the degree of scale up in the inverse problem, shows only a slight increase with viscosity ratio, but is strongly increasing with correlation length. sinverse, with its indirect definition, is somewhat less well predicted by the parametric model, but shows significant decrease with increasing viscosity ratio, and a possible increase with correlation length. The statistical standard deviation sstat for both diagonal covariance and for the parametric model for the diagonal covariance is small, so that the parametric dependence and the relative error in these quantities is not of much significance. We note also a small correction to Tables 3 and 4 of [7]: the full error model confidence intervals should be incremented by about one percentage point. This being done, the three different statistical models (exact and two approximations) are nearly identical, again showing that a sstat or confidence interval inferred from this data will be very small.



3.3 Ordering of error sources by magnitude

The main results of this section are summarized in Table 2, which justifies the ordering given in (2). Each entry is defined as an RMS average over the 60 explanatory parameter values, or over those values for fixed Dx.

4 Acknowledgments

James Glimm is supported by the MICS program of the U.S. Department of Energy DE-FG02-90ER25084, DE-AC02-98CH10886, grant DAAL-03-91-0027 and the National Science Foundation, grant DMS-01022480. Shuling Hou and David H. Sharp are supported by the U.S. Department of Energy. Yoon-ha Lee is supported by the Department of Energy DE-FG02-90ER25084.

Received: 14/IV/03. Accepted: 13/VIII/03.

#577/03.

  • [1] M.A. Christie, T.C. Wallstrom, L.J. D.S. Hou, D.H. Sharp and Q. Zou, Effective medium boundary conditions in upscaling, in Proceedings of the 7th European Conference on the Mathematics of Oil Recovery, Baveno, Italy, Sept. 5-8 (2000).
  • [2] B. DeVolder, J. Glimm, J.W. Grove, Y. Kang, Y. Lee, K. Pao, D.H. Sharp and K. Ye, Uncertainty quantification for multiscale simulations, Journal of Fluids Engineering, 124 (2002), pp. 29-41.
  • [3] J. Glimm, Y. ha Lee, and K. Ye, A simple model for scale up error, Contemporary Mathematics, 295 (2002), pp. 241-251.
  • [4] J. Glimm, S. Hou, H. Kim, Y. Lee, D. Sharp, K. Ye and Q. Zou, Risk management for petroleum reservoir production: A simulation-based study of prediction, J. Comp. Geosciences, 5 (2001), pp. 173-197.
  • [5] J. Glimm, S. Hou, H. Kim, D. Sharp and K. Ye, A probability model for errors in the numerical solutions of a partial differential equation, CFD Journal, 9 (2000).
  • [6] J. Glimm, S. Hou, Y. Lee, D. Sharp and K. Ye, Prediction of oil production with confidence intervals, SPE 66350, Society of Petroelum Engineers, (2001). SPE Reservoir Simulation Symposium held in Houston, Texas, 11-14 Feb.
  • [7] J. Glimm, S. Hou, Y. Lee, D. Sharp and K. Ye, Solution error models for uncertainty quantification, Contemporary Mathematics, (2003). Submitted. SUNYSB preprint 02-16. LANL preprint LA-UR: 02-5987.
  • [8] J. Glimm and D.H. Sharp, Stochastic partial differential equations: Selected applications in continuum physics, in Stochastic Partial Differential Equations: Six Perspectives, R.A. Carmona and B.L. Rozovskii, eds., Mathematical Surveys and Monographs, American Mathematical Society, Providence, (1997).
  • [9] J. Glimm and D.H. Sharp, Stochastic methods for the prediction of complex multiscale phenomena, Quarterly J. Appl. Math., 56 (1998), pp. 741-765.
  • [10] J. Glimm and D.H. Sharp, Prediction and the quantification of uncertainty, Physica D, 133 (1999), pp. 152-170.
  • [11] T. Wallstrom, M.A. Christie, L.J. Durlofsky and D.H. Sharp, Effective flux boundaryconditions for up scaling porous media equations, Transport in Porous Media, 46 (2000), pp. 139-153.
  • [12] T. Wallstrom, S. Hou, M.A. Christie, L.J. Durlofsky and D.H. Sharp, Accurate scale up of two phase flow using renormalization and nonuniform coarsening, Computational Geoscience, 3 (1999), pp. 69-87.
  • [13] T. Wallstrom, S. Hou, M.A. Christie, L.J. Durlofsky and D.H. Sharp, Application of a new two-phase upscaling technique to realistic reservoir cross sections, Proceedings of the SPE 15th Symposium on Reservoir Simulation, (1999), pp. 451-462. SPE 51939.
  • [14] T. Wallstrom, S. Hou, M.A. Christie, L.J. Durlofsky, D.H. Sharp and Q. Zou, Application of effective flux boundary conditions to two phase upscaling in porous media, Transport in Porous Media, 46 (2000), pp. 155-178.

Publication Dates

  • Publication in this collection
    11 July 2005
  • Date of issue
    Dec 2004

History

  • Received
    14 Apr 2003
  • Accepted
    13 Aug 2003
Sociedade Brasileira de Matemática Aplicada e Computacional Sociedade Brasileira de Matemática Aplicada e Computacional - SBMAC, Rua Maestro João Seppe, nº. 900 , 16º. andar - Sala 163, 13561-120 São Carlos - SP Brasil, Tel./Fax: 55 16 3412-9752 - São Carlos - SP - Brazil
E-mail: sbmac@sbmac.org.br