S. A. McKenna, Sandia National Laboratories, Geohydrology Department, P. 0. Box 5800, Mail Stop 1324, Albuquerque, NM 811851324. (e-mail:samcken@nwer.sandia.gov)
E. P. Poeter, Department of Geology and Geological Engineering, Colorado School of Mines, Golden, CO 80401. epoeter@mines.edu
The hydrofacies treatment of hydrogeologic heterogeneity has several advantages over Gaussian and binary treatments. First of all, recognition that hydrogeologic properties are related to geologic properties allows use of often large, albeit empirical, amounts of geologic information. Second, in most cases, it is not yet practical to directly determine hydrologic properties from field geophysical techniques, but determination of geologic heterogeneity from geophysical surveys in groundwater studies is becoming commonplace [Olsen et al., 1993; McKenna and Poeter, 1995]. Recognizing the influence of geologic heterogeneity on hydrogeologic heterogeneity facilitates conceptualization of deterministic heterogeneities (e.g., channels of well-connected, coarse-grained, high-hydraulic conductivity pathways in alluvial deposits). This type of heterogeneity is difficult to model through techniques based on a Gaussian distribution of log hydraulic conductivities.
Desbarats and Srivastava [1991] indicate that the perceived expense and effort required to obtain data that adequately represent spatial structure of subsurface heterogeneity may preclude field application of stochastic methods, and Gelhar [1986] calls for better methods to determine the spatial correlation structure of hydraulic properties. The concept of data fusion has been used informally in both the groundwater and oil industries for subsurface characterization. Previous work describing methods of combining multiple types of information for subsurface characterization in groundwater studies include Brannan and Haselow [1993], Hoeksema and Kitanidis [19841, Hyndman et al. [1994], and James and Freeze [1993]. The approach for an objective combination of geophysical, geological, and hydrological data through geostatistical simulation and inverse parameter estimation in subsurface hydrological applications was outlined by Fogg [1989]. The field application presented here follows that approach through demonstrating application of such a method, based on the hydrofacies conceptualization, to a fluvial sandstone.
Geologic, geophysical, and hydrologic data are used to identify hydrofacies and define their spatial distribution. Multiple-indicator, stochastic simulation conditioned on hard and soft information produces realizations of hydrofacies geometries, each of which serves as a zonation pattern for inverse flow modeling. Results of inverse flow modeling are used to discard implausible realizations and estimate values of hydraulic conductivity within each hydrofacies. Fusion of disparate types of data can have a greater impact on uncertainty reduction than refinement of any single type of data.
There are alternative approaches for using type A soft data in estimation of the covariance. In this case we estimate the true hard data covariance (C_{I}) as a linear combination of the sample hard data autocovariance, the soft-hard crosscovariance, and the soft data autocovariance. The hard-soft cross-covariance is scaled by the quantity p_{1} - p_{2}, and the soft data autocovariance is scaled by (p_{1} - p_{2})^{2}. Covariance estimation through this technique follows the theory put forth by Alabert [1987] and employed by James and Freeze [1993]. The use of soft data to estimate the covariance allows the maximum number of pairs available for estimation to be N_{h}^{2} + N_{h}N_{s} + N_{s}^{2}, where N_{h} and N_{s} refer are the number of hard and soft data respectively, rather than just N_{h}^{2} if only hard data are available.
Type C data are incorporated into covariance estimates by weighting each sample location by its probability of occurrence for the given cutoff in a manner similar to that presented by Kulkarni [1984] for fuzzy bits, using the standard variogram (or covariance) equation but weighting the calculation for each pair of locations. Each indicator is coded as 1.0, and a corresponding vector of probabilities is defined from the prior cumulative density function values. The coefficient weight k(h), is given by
For type C data the prior cdf is reproduced exactly as the posterior cdf independent of the proximity of any conditioning data. The posterior cdf is used in a Monte Carlo process to determine the simulated value at that location. The simulated data are then used as conditioning data for the simulation of other points.
Given the extensive spatial coverage of the seismic tomography on site, it is desirable to define hydrologically significant units in terms of seismic velocity. Seven seismic threshold levels, separating eight sonic facies, are defined by comparing the sonic logs to the recovered core and also examining the polymodality of the sonic log measurements. Examination of the other available logs in the context of these eight sonic facies reveals that three of the sonic facies contain bimodal distributions of one of the other geophysical logs. Dividing these three sonic facies into two separate facies each results in 11 electrofacies. An electrofacies is defined as a composite Geophysical log response which defines a portion of the subsurface [Moline et al., 1992]. The electrofacies approach to defining units allows the use of the Geophysical logs to define facies at locations where core has not been collected. Relative to examination by a geologist, electrofacies classification can provide a more objective discrimination of units in locations where core samples exist, especially when the geophysical log responses are indicative of rock properties important to fluid flow (e.g., clay content or porosity).
Multivariate discriminant analysis is employed to examine the similarities and differences between the 11 electrofacies. Discriminant analysis defines the centroid of each electrofacies in four-dimensional space where each type of geophysical log is a dimension. For each data point the probability of membership within each electrofacies is calculated as a distance between the data point and the centroid of each facies. The electrofacies groupings are then examined in terms of similarity. Similarity is defined as a small distance between two electrofacies centroids. Electrofacies can be combined if they are statistically and geologically similar and/or if they are statistically indistinguishable in terms of small-scale air permeability measurements.
A portable air permeameter based on the design described by Davis et al. [1994] was constructed for this project. The air permeameter was used to obtain 237 small-scale permeability measurements on core samples in the laboratory. Hypothesis testing on the means and variances of permeability was used to determined whether or not electrofacies could be combined into hydrofacies. Combining electrofacies in this manner adds geologic and hydrologic information and reduces the number of facies to seven.
The hydrogeologic character of these seven hydrofacies is described in Table 1. The hydrofacies can be distinguished by using thresholds of seismic velocity (Table 2). Figure 2 shows the distribution of seismic velocity within each facies. The probabilities of class membership as determined by discriminant analysis for each datum are updated by summing the probabilities of membership for electrofacies which were combined into hydrofacies. If the final probability of class membership is greater than or equal to 95% at any location, that uncertainty is deemed negligible, and the datum from the location is determined to be a hard datum. The remaining locations are used as type C data with the probabilities of class membership used to build the prior cdf at each location. Of a total of 1355 locations, 983 were determined to be hard data, and the remaining 372 locations are classified as type C data.
Hydrofacies | Description | Estimated Mean K, m/d |
---|---|---|
conglomeratic member of the Lyons Formation | 3.1 X 10^{-4 a} | |
well-cemented, poorly sorted, sandstone with gravel conglomerate lenses | 3.1 X 10^{-4 b} | |
sandstone member of the Lyons Formation | 9.0 X 10^{-4 c} | |
well-cemented, poorly sorted, sandstone | 1.2 x 10^{-3 d} | |
moderately consolidated, porous material with low to moderate clay content | 1.2 x 10^{-2 e} | |
combination of well-sorted, silty sandstone; moderately to poorly consolidated, poorly sorted sandstone; fine gravel conglomerate lenses; and siltstone lenses | 1.2 x 10^{-3 f} | |
poorly consolidated, moderately porous, clay poor material, weathered or fractured | 1.2 x 10^{-1 e} | |
a well-fractured area of any previous hydrofacies | 2.6^{g} |
On the basis of one packer test, the hydraulic conductivity of facies 8 appears to be at least an order of magnitude higher than the other facies. This estimate of hydraulic conductivity is reasonable given that facies 8 is fractured material. Facies 8 was not identified by the geophysical log responses incorporated in the discriminant analysis owing to their limited volume of investigation, but is identified by full waveform acoustic logs and by the cross-borehole tomography. Limited lateral extent of this fracture zone in the strike distance is delineated by the low-velocity zone (blue) at the 1793-m elevation between borings T-1 and T-2 (Plate 1). The lowest values of hydraulic conductivity (highest-velocity zones shown as red/orange) occur within strongly cemented, coarse-grained channel conglomerates, such as the Lyons conglomerate (hydrofacies 1), which is continuous across the interwell domain in the strike direction above the fractured zone and discontinuous below the fractured zone. Plate 1 reveals that lower-velocity hydrofacies are continuous across the upper portion of the interwell domain. A mosaic of four tomograms completed in adjacent wells situated approximately along dip direction (Plate 2) reveals locations of coarse-grained conglomerates in the east, as defined by high velocity (red) zones. The Lyons/Fountain contact occurs in the vicinity of well T-10. The high velocity (red) conglomerate units (hydrofacies 1) do not form a continuous zone of low K. Locations of high-velocity zones are interpreted as the expression of a channel belt meandering in and out of the plane of the tomogram cross section.
Seismic velocities derived from the tomography are used as imprecise estimates of hydrofacies (type A soft data) based on their calibration at the wells with collocated hard data (velocities measured with sonic logs) (Figure 3). High-quality soft data are obtained at the previously defined thresholds (Table 2), with calculated values for p_{1} ranging from 0.83 to 0.96 and for p_{2} from 0.02 to 0.07. A calibration index can be defined as the quantity [p_{1} - p_{2}]. If this index equals 1.0, there is a perfect calibration between the hard and soft data and the soft information can be considered hard data. If the calibration index equals 0.0, the soft data provide no information on the hard data. The quality of soft data at this site is good as indicated by the [p_{1} - p_{2}] indices of 0.81, 0.80, 0.80, 0.84, 0.89, 0.94, and 0.92 for thresholds 1.5-7.5, respectively. The calibration indices were calculated at the only places possible (the wells). Owing to the geometry of the inversion process, the quality of the indices will be nonstationary. Incorporation of nonstationary indices into the geostatistical simulation routine is beyond the scope of this study. The values of the indices at the wells were used because those are the only data and any increase in the uncertainty of these indices would be arbitrary.
All measures of spatial correlation were calculated as covariances. These covariances were then transformed to variograms prior to modeling using a nonergodic transform [Isaaks and Srivastava, 1988]. All variograms were modeled with two nested structures to permit flexibility in fitting the experimental data. In general, the hard data variograms exhibit trends and were fitted with a spherical model for the first structure and a Gaussian model with a long range for the second structure (see hard data vertical variograms in Figure 4). The Gaussian model with a long range permits the model variogram to follow the trend. The hard/soft data variograms were fit with two spherical models. The overall effect of adding the soft information to the variogram calculations is to better define the structure in the horizontal directions (compare the hard data and hard/soft data N-S and E-W variograms at threshold 4.5 in Figure 4) and to remove trends seen in the hard data variograms. To honor the zonal anisotropy with a constant sill in all directions, it was sometimes necessary to have the model variogram deviate from the experimental variogram. In these cases, search radii within the simulation program were set to be no greater than the distance at which the experimental and modeled variograms begin to deviate. For example, the hard/ soft data vertical variogram model at the 4.5 threshold begins to deviate from the experimental data at approximately 8 m (Figure 4). In this case the search radius was set to 7.5 m in the simulation program.
Most of the horizontal semivariograms are modeled with isotropic ranges. A principal direction may exist, but it cannot be identified owing to an inadequate amount of data. Hard data experimental semivariograms calculated at different dip angles along the dip direction do not yield different ranges; thus a dip of zero degrees is used. There may be insufficient data for defining a dip angle, or the dip angle may be masked by irregular diagenesis of the original sedimentary facies. In contrast, hard/soft data experimental variograms calculated in the dip direction exhibit a maximum spatial correlation between 30° and 36° of dip. The additional soft information is adequate to define the angle of geologic dip.
Threshold | |||||||
---|---|---|---|---|---|---|---|
The seven facies determined with the hard data are simulated using ordinary kriging for the entire simulation. At least three points must be within the search neighborhood to solve the kriging system (if only one or two points are found, the prior estimate of the global cdf is reproduced as the posterior local cdf). The maximum number of points retained from the search neighborhood is 24 (three per octant). Search radii are chosen to both retain the directions of anisotropy revealed in the variogram modeling and be large enough to provide an adequate search neighborhood without using excessive amounts of computer memory.
Most of the simulation parameters for hard/soft data simulations have the same values as the hard data simulations with the exception that the global estimate of the prior cdf is used only if no conditioning points are found in the search neighborhood. This produces simulations with a less random appearance when using the hard/soft conditioning data. Eight hydrofacies as identified by the hard and soft data are simulated.
In lieu of physically examining the reality which the simulations attempt to describe, other measures of simulation quality are used. For example, the closeness of fit between hydrologic observations and flow modeling results when the simulations are used for zonations in the flow model is discussed later. Another approach to evaluating quality of the simulations is the similarity of the prior and posterior global cdf's and spatial covariances.
In the simplest sense, geostatistical simulation attempts to reproduce the global prior cdf and spatial covariance that are used as input to the simulator. However, these input values are only estimates of the true values derived from a limited sample data set. Inference of the global statistics from a limited sample set raises the question of how well the global cdf and covariances should be reproduced [Deutsch and Journel, 1992]. Certainly, it is not desirable to exactly reproduce the input parameters in every realization because they are derived from a limited sample. Ideally, the estimates of the parameters derived from the resulting simulations will show "ergodic fluctuations" around the input parameter values due to the stochastic nature of the simulation process.
Each of the 200 resultant simulations are resampled by obtaining data from 50 randomly placed "wells" and recalculating the cdf and semivariograms. Inclusion of the soft data resulted in improved reproduction of the cdf as compared to using the hard data. In all cases (directions and thresholds), the soft data realizations yield better reproductions of the input semivariogram model than do the hard data realizations. Improvements resulting from soft data incorporation are attributed to the soft data's being less biased than the limited hard data.
The full-scale geostatistical realizations of 355,752 elements are too large to solve for flow in many realizations with the available facilities in a reasonable time frame; thus the geostatistical realizations are upscaled to cell dimensions of 6.1 x 6.1 x 0.6 m (x, y, z). Incorporation of 16 of the fine-scale cells into one coarse cell was accomplished by assigning to the larger cell the hydrofacies which was most common within its boundaries. Upscaling, reduces the total number of cells by more than 90%.
Average absolute head deviation obtained from forward flow modeling is large, even after incorporation of soft information into the zonation process (Figures 5a and 5b); however, head deviations are minimized through calibration. The estimates of the hydraulic conductivities made with available field data and seismic velocity information (Table 1) do not provide the optimum fit to the measured heads. Trial and error calibration of a few realizations are used to obtain updated hydraulic conductivity estimates for inverse modeling. The field estimates and initial estimates of hydraulic conductivities for both hard and hard/soft data realizations are given in Table 4. Hydraulic conductivities of hydrofacies 6 and 8 are thought to be known with greater confidence (Table 1) relative to other hydrofacies; therefore they are not estimated.
Data Set | ||||||||
---|---|---|---|---|---|---|---|---|
Field | -3.5 | -3.5 | -3.0 | -1.9 | -2.9 | -0.9 | 0.4 | |
Hard | -1.8 | -1.9 | -1.1 | -1.9 | -2.9 | -0.5 | . . . | |
Hard/soft | -2.8 | -2.9 | -2.1 | -5.9 | -2.9 | -3.1 | 0.4 |
Rules are developed for acceptable realizations: (1) hydrofacies I (well-cemented conglomerates) must have lower hydraulic conductivity than hydrofacies 3 (less consolidated sandstones); (2) hydrofacies 8 (fracture zones) must have the highest estimated hydraulic conductivity of any hydrofacies; and (3) after applying criteria 1 and 2, the half of the remaining solutions that has the smallest head residuals is retained.
After applying these elimination criteria, the hard data ensemble is reduced from 100 to 21 plausible realizations, and the hard/soft data ensemble from 100 to 19. After parameter estimation, residual head distributions are much smaller (Figures 5c and 5d) than those produced by forward modeling (Figures 5a and 5b). The average absolute head deviations are similar for both the hard (mean 0.73 m, standard deviation 0.05 m) and hard/soft (mean 0.71 m, standard deviation 0.05 m) simulations. Because of the similar head residuals after parameter estimation, it can be argued that the hard data alone produced the same quality realizations. However, estimates of hydraulic conductivity calculated from hard data realizations are different than estimates obtained from hard/soft data realizations and have greater standard deviations (with the exception of hydrofacies 4 where the standard deviations are roughly equal), suggesting less certainty in the result when only hard data are used (Figure 6). The decrease in variance of estimated hydraulic conductivity associated with the hard/soft data simulations is attributed to reduction in (geologic uncertainty resulting from use of soft data. The median value of hydraulic conductivity for every hydrofacies is higher for the hard versus the hard/soft data realizations. One factor that contributes to this difference is the absence of the highest hydraulic conductivity hydrofacies (hydrofacies 8) in the hard data realizations.
There is no overlap in distributions of hydraulic conductivity estimates in hydrofacies 4 and 5 from the hard and hard/soft realization ensembles (Figure 6), indicating that one of the approaches (hard data versus hard/soft) is inaccurate for these two hydrofacies. Hydraulic conductivity field data are insufficient to determine which assessment is inaccurate. One explanation for the lower hydraulic conductivities of facies 4 and 5 in the hard/soft data simulations is that they are more continuous across the site than was predicted by the hard data simulations. A model with well-connected bodies of facies 4 and 5 requires lower hydraulic conductivities in order to match the observed heads than if the zonation of these facies was very disconnected.
An even distribution of hydraulic conductivity tests between facies cannot be obtained during the first phase of a field investigation because the first phase data are needed to define the facies. Once the facies are delineated, subsequent field investigations can be designed to include tests in all of the facies. Eighteen field scale hydraulic conductivity tests were completed at this site. One of these is in facies 4, and none are available for facies 5. If this study were extended, further testing would be conducted in each facies in order to develop stricter rules for selection of plausible realizations. If such information were available, it might be determined that all of the realizations in one of the ensembles are unacceptable, as was determined by Poeter and McKenna [1995] for a synthetic data set.
To obtain unique estimates of hydraulic conductivities, head observations and either flow observations or prior estimates of parameters are required. In this study, 28 head observations are available, and hydraulic conductivity is known for facies 6 and 8. No significant correlation of hydraulic conductivity existed between any of the facies, indicating that seismic velocities are valuable for delineating hydrofacies at this site.
The combination of geophysical and geostatistical techniques allows for the definition of facies with distinct hydrological properties. This definition of distinct parameters is corroborated by the inverse parameter estimation because the hydraulic conductivity of each facies is not correlated with the hydraulic conductivity of other facies. Although the combination of geophysical techniques determines hydrologically distinct facies, the geophysical techniques currently do not provide values of hydraulic conductivity within the facies. Inverse modeling uses the observations of the hydraulic head across the site to estimate values of hydraulic conductivity for each facies given the plausible zonation patterns based on the geophysical, geological, and hydrological information. This data fusion constrains the solution (i.e., reduces uncertainty associated with the response function) to a greater degree than can be achieved through refined analysis and interpretation of any single data source.
Anderson, M. P., Hydrogeologic facies models to delineate large-scale spatial trends in glacial and glaciofluvial sediments, Geol. Soc. Am. Bull., 101, 501-511, 1989.
Brannan, J. R., and J. S. Haselow, Compound random field models of multiple scale hydraulic conductivity, Water Resour. Res., 29(2), 365-372, 1993.
Davis, J. M., R. C. Lohmann, F. M. Phillips, J. L. Wilson, and D. W. Love, Architecture of the Sierra Ladrones Formation, central New Mexico: Depositional controls on the permeability correlation structure, Geol. Soc. Am. Bull., 105, 998-1007, 1993.
Davis, J. M., J. L. Wilson, and F. M. Phillips, A portable airminipermeameter for rapid in-situ field measurements, Ground Water, 32(2), 258-266, 1994.
Desbarats, A. J., and R. M. Srivastava, Geostatistical characterization of groundwater flow parameters in a simulated aquifer, Water Resour. Res., 27(5), 687-698, 1991.
Deutsch, C. V., and A. G. Journel, GSLIB: Geostatistical Software Library and User's Guide, 340 pp., Oxford Univ. Press, New York, 1992.
Fogg, G. E., Emergence of geologic and stochastic approaches for characterization of heterogeneous aquifers, paper presented at the Conference on New Field Techniques for Quantifying the Physical and Chemical Properties of Heterogeneous Aquifers, Natl. Water Well Assoc., Dublin, Ohio, 1989.
Gelhar, L. W., Stochastic subsurface hydrology from theory to applications, Water Resour. Res., 22(9), 135S-145S, 1986.
Hill, M. C., A computer program (MODFLOWP) for estimating parameters of a transient, three-dimensional, groundwater flow model using nonlinear regression, U.S. Geol. Surv. Open File Rep., 91-484, 358 pp., 1992.
Hoeksema, R. J., and P. K. Kitanidis, An application of the geostatistical approach to the inverse problem in two-dimensional groundwater modeling, Water Resour. Res., i(7), 1003-1020, 1984.
Howard, J. D., Patterns of sediment dispersal in the Fountain Formation of Colorado, Mt. Geol., 3(4), 147-153, 1966.
Hubert, J. F., Petrology of the Fountain and Lyons Formations, Front Range, Colorado, Q. Colo. Sch. Mines, 55(1), 1-238, 1960.
Hyndman, D. W., J. M. Harris, and S. M. Gorelick, Coupled seismic and tracer test inversion for aquifer property characterization, Water Resour. Res., 30(7), 1965-1977, 1994.
Isaaks, E. H., and R. M. Srivastava, Spatial continuity measures for probabilistic and deterministic geostatistics, Math. Geol., 20(4), 313-341, 1988.
James, B. R., and R. A. Freeze, The worth of data in predicting aquitard continuity in hydrogeological design, Water Resour. Res., 29(7), 2049-2065, 1993.
Journel, A. G., Constrained interpolation and qualitative information, Math. Geol., 18(3), 269-286, 1986.
Journel, A. G., Fundamentals of geostatistics: In five lessons, Short Course Geol., vol. 8, 40 pp., AGU, Washington, D. C., 1989.
Kiusalaas, N. J., Estimation of groundwater recharge using neutron probe moisture readings near Golden, CO, M.E. thesis ER-4191, 186 pp., Colo. Sch. of Mines, Golden, 1992.
Kulkarni, R., Bayesian kriging in geotechnical problems, in Geostat-Tahoe, part 2, pp. 261-270, D. Reidel, Norwell, Mass., 1984.
McKenna, S. A., Utilization of soft data for uncertainty reduction in groundwater flow and transport modeling, Ph.D. dissertation T-4291, 325 pp., Colo. Sch. of Mines, Golden, 1994.
McKenna, S. A., and E. P. Poeter, Subsurface characterization through cross-borehole seismic tomography, Ground Water, in press, 1995.
Moline, G. R., J. M. Bahr, P. A. Drzewiecki, and L. D. Shepherd, Identification and characterization of pressure seals through the use of wireline logs: A multivariate statistical approach, Log Anal., 33(4), 362-372, 1992.
Olsen, H., C. Plough, U. Nielsen, and K. Sorensen, Reservoir characterization applying high resolution seismic profiling, Rabis Creek, Denmark, Ground Water, 31(1), 84-90, 1993.
Phillips, F. M., J. L. Wilson, and J. M. Davis, Statistical analysis of hydraulic conductivity distributions: A quantitative geological approach, paper presented at the Conference on New Field Techniques for Quantifying the Physical and Chemical Properties of Heterogeneous Aquifers, Natl. Water Well Assoc., Dublin, Ohio, 1989.
Poeter, E. P., and S. A. McKenna, Reducing uncertainty associated with groundwater flow and transport predictions, Ground Water, 33(6), 899-905, 1995.
Rubin, Y., and G. Dagan, Stochastic analysis of boundaries effects on head spatial variability in heterogeneous aquifers, 1, Constant head boundary Water Resour. Res., 24(10), 1689-1697, 1988.
Schneider, W. A., Jr., A three dimensional physical modeling study applying tomographic inversion and seismic migration to the tunnel detection problem, Ph.D. dissertation T-3866, 281 pp., Colo. Sch. of Mines, Golden, 1990.
Weimer, R. J., and C. B. Land, Lyons Formation (Permian), Jefferson County, Colorado: A fluvial deposit, Mt. Geol., 9(2-3), 289-297, 1972.
Zhu, H., and A. G. Journel, Formatting and integrating soft data: Stochastic imaging via the Markov-Bayes algorithm, in Geostatistics Troia '92, vol. 1, Quantitative Geology and Geostatistics, edited by A. Soares, pp. 1-12, Kluwer Acad., Norwell, Mass., 1993.