Metrology

Optimizing hybrid metrology: rigorous implementation of Bayesian and combined regression

Mark-Alexander Henn, Richard M. Silver, John S. Villarrubia, Hui Zhou, Bryan M. Barnes, Bin Ming, András E. Vladár

National Institute of Standards and Technology, Engineering Physics Division, 100 Bureau Drive MS 8212, Gaithersburg, Maryland 20899-8212, United States

Nien Fan Zhang

National Institute of Standards and Technology, Statistical Engineering Division, 100 Bureau Drive MS 8212, Gaithersburg, Maryland 20899-8212, United States

J. Micro/Nanolith. MEMS MOEMS. 14(4), 044001 (Nov 12, 2015). doi:10.1117/1.JMM.14.4.044001
History: Received May 20, 2015; Accepted October 8, 2015

Open Access

Abstract. Hybrid metrology, e.g., the combination of several measurement techniques to determine critical dimensions, is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of three-dimensional (3-D) structures but also a more realistic estimation of the corresponding uncertainties. Recent developments at the National Institute of Standards and Technology feature the combination of optical critical dimension measurements and scanning electron microscope results. The hybrid methodology offers the potential to make measurements of essential 3-D attributes that may not be feasible otherwise. However, combining techniques gives rise to essential challenges in error analysis and comparing results from different instrument models, especially the effect of systematic and highly correlated errors in the measurement on the $\chi^2$ function that is minimized. Both hypothetical examples and measurement data are used to illustrate solutions to these challenges.


Hybrid metrology, e.g., the combination of distinct measurement techniques to determine critical dimensions (CDs), is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of three-dimensional (3-D) structures but also a more realistic estimation of the corresponding uncertainties. Ideally it helps to reduce the overall uncertainties by combining the individual strengths of each of the measurement techniques, making subnanometer uncertainties a realistic goal as CDs approach 10 nm.

Recent developments (Refs. 1–3) in hybrid metrology at the National Institute of Standards and Technology (NIST) feature the combination of optical critical dimension (OCD) and scanning electron microscope (SEM) measurements. The challenges and possible solutions have been outlined by some of these authors in a previous proceedings paper (Ref. 3). Various methods have been presented to combine measurement results from different tool platforms, revealing two related but distinct challenges. First, there must be an overlapping parameter set for combined regression (Ref. 3), such that the individual parametric geometries share at least one parameter (e.g., height). Second, each individual method must a priori yield parametric values that, together with their uncertainties, are statistically consistent, usually quantified by a Z-test; see Ref. 2 for more details.

This paper can be seen as a continuation of that work and Ref. 4, with an emphasis on the proper treatment of the measurement errors, including highly correlated and systematic errors and their influence on hybrid metrology. Tool-induced errors for OCD and scale errors for SEM are investigated, and it is shown how the parametric uncertainties can be decreased if those issues are addressed accurately in the hybridization.

Since the term hybrid metrology has gained increased significance in the dimensional metrology community outside NIST (Refs. 5–7), we will start this work with a short overview of two of the most common techniques in Sec. 2, namely the Bayesian approach and combined regression. The measured targets and the generalized parameter sets that describe them are discussed in Sec. 3 before we give a detailed description of the performed error analysis for both OCD and SEM in Sec. 4 and their impact on the hybridization in Sec. 5. We will close with the conclusions in Sec. 6.

Hybrid metrology has gained significant recognition in recent times as an approach to considerably reduce parametric uncertainties by combining different measurements of the same measurand. We want to use this section to identify the main differences and similarities of two of the most common hybrid approaches, namely the use of a priori information in a Bayesian sense and combined regression. We start with the Bayesian approach and continue with combined regression. In order to keep the notation simple, throughout this section we will assume that only two measurement techniques are combined, with each individual method yielding a statistically consistent set of parameters. Note that even if some of the notations are different, the presented approaches are equivalent to those given in Ref. 2. Additional information for those who are not familiar with all of the terminology of Bayesian data analysis can be found in Refs. 8–10.

Bayesian Approach

The Bayesian approach treats information provided by each of the measurement tools quite differently. The first tool provides $m$ values of measurement data that can be described as a vector in an $m$-dimensional real vector space

$$y = (y_1, \ldots, y_m)^T \in \mathbb{R}^m \tag{1}$$
that contains only indirect information about the quantity of interest. We therefore need to analyze the data in terms of an inverse problem (Ref. 11). A common approach to solve an inverse problem is to set it up as a regression problem. Initially, we need to provide a model function

$$f: \mathbb{R}^n \to \mathbb{R}^m, \quad f(p) = [f_1(p), \ldots, f_m(p)]^T \tag{2}$$
that maps the parameters of interest (e.g., the height, the width, etc.), which are identified with a vector $p = (p_1, \ldots, p_n)^T$ in an $n$-dimensional real vector space, to the simulated quantities. These simulated data are generated in the same $m$-dimensional space defined by the measurement. The regression problem then amounts to minimizing the difference between the modeled and the measured data. We therefore solve for the parameter vector $\hat{p}$ that minimizes the so-called $\chi^2$ function, which measures the goodness of fit of the simulation to the measurement data and is given by the weighted norm

$$\chi^2(p) = \|f(p) - y\|^2 = [f(p) - y]^T V^{-1} [f(p) - y]. \tag{3}$$

In this formulation, we implicitly assume that the measurement data $y$ are a noisy realization of the model, and that we have an additive error model where

$$y = f(p) + \varepsilon. \tag{4}$$

Note that the errors that are added to each of the model values can also be arranged as an $m$-dimensional vector. We assume the errors to be normally distributed with zero mean and $m \times m$ covariance matrix $\Sigma_\varepsilon$. The matrix $V \in \mathbb{R}^{m \times m}$ in Eq. (3) is usually chosen to be an estimate of $\Sigma_\varepsilon$. We will discuss the importance of a good choice of $V$ in more detail later. If we ignore normalization constants, we see that the function in Eq. (3) is proportional to the negative log-likelihood function for the chosen error model (Ref. 10). We furthermore assume that we have information about the parameter vector of interest $p$, such as an estimate $\mu_p$ and an uncertainty or, more specifically, a covariance matrix $\Sigma_p$. This can be based on expert knowledge or an already completed analysis of measurement data from a second measurement tool. If we assume the parameters of interest to be normally distributed, we have all the information we need to express this prior information in terms of a probability density function (PDF). This PDF is usually called the prior distribution $\pi_{\mathrm{pri}}$. In the case of a normally distributed random vector $p$, it is given by

$$\pi_{\mathrm{pri}}: \mathbb{R}^n \to \mathbb{R}, \quad \pi_{\mathrm{pri}}(p) = [(2\pi)^n \det(\Sigma_p)]^{-1/2} \exp\left[-\tfrac{1}{2}(p - \mu_p)^T \Sigma_p^{-1} (p - \mu_p)\right]. \tag{5}$$
By subtracting the prior distribution or, more precisely, its logarithm, from the function in Eq. (3), the negative log-likelihood, we get a function that is proportional to the negative logarithm of the posterior probability distribution. If we again ignore normalization constants we get the function that serves as the modified $\chi^2$ function

$$\tilde{\chi}^2(p) = [f(p) - y]^T V^{-1} [f(p) - y] + (p - \mu_p)^T \Sigma_p^{-1} (p - \mu_p). \tag{6}$$

Note that the second term that reflects the prior information acts as a penalty or regularization term, penalizing possible solutions to the inverse problem for measurement tool 1 that are not consistent with the prior information. The function in Eq. (6) is finally minimized to find the parameter vector estimate $\hat{p}$.
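As a minimal sketch of Eqs. (3) and (6), the following Python code evaluates both functions and minimizes the modified function for a toy problem. The linear model f(p) = A p, the matrix A, the noise level, and the prior values are all hypothetical choices made for illustration and are not taken from the paper; a linear model is assumed only because it makes the minimizer available in closed form.

```python
import numpy as np

def chi2(p, f, y, V_inv):
    """Eq. (3): weighted squared residual between model and data."""
    r = f(p) - y
    return float(r @ V_inv @ r)

def chi2_bayes(p, f, y, V_inv, mu_p, Sigma_p_inv):
    """Eq. (6): Eq. (3) plus the Gaussian prior penalty term."""
    d = p - mu_p
    return chi2(p, f, y, V_inv) + float(d @ Sigma_p_inv @ d)

def minimize_linear(A, y, V_inv, mu_p, Sigma_p_inv):
    """Closed-form minimizer of Eq. (6), valid only for f(p) = A p."""
    H = A.T @ V_inv @ A + Sigma_p_inv
    b = A.T @ V_inv @ y + Sigma_p_inv @ mu_p
    return np.linalg.solve(H, b)

# Hypothetical toy data (not the OCD or SEM data of the paper).
rng = np.random.default_rng(0)
m = 20
A = rng.normal(size=(m, 2))
p_true = np.array([36.0, 17.1])
V = 0.25 * np.eye(m)
y = A @ p_true + rng.multivariate_normal(np.zeros(m), V)
V_inv = np.linalg.inv(V)

# Hypothetical prior, e.g., from a second tool.
mu_p = np.array([35.5, 17.0])
Sigma_p_inv = np.linalg.inv(np.diag([1.0, 0.25]))

def f(p):
    return A @ p

p_hat = minimize_linear(A, y, V_inv, mu_p, Sigma_p_inv)
print(p_hat, chi2_bayes(p_hat, f, y, V_inv, mu_p, Sigma_p_inv))
```

For a nonlinear model function, the same `chi2_bayes` would instead be passed to an iterative optimizer.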

Combined Regression

In combined regression we start with two distinct sets of measurement data, $y_A$ and $y_B$, that come from two different measurement techniques. Their model functions, $f_A(p_A)$ and $f_B(p_B)$, depend on parameter vectors $p_A$ and $p_B$, respectively. The models must have at least one common model parameter in order to perform combined regression. Determining what the common parameters of the two models are can be a challenging task; see Ref. 3 for further details. In combined regression we define the combined $\chi^2$ function to be the sum of the individual $\chi^2$ functions for each of the tools. Note that this is only possible if we assume that the two measurements are independent of each other:

$$\chi^2(p_{AB}) = \chi_A^2(p_A) + \chi_B^2(p_B). \tag{7}$$

Here $p_{AB}$ is the vector that consists of the union of the elements in $p_A$ and $p_B$. The solution $\hat{p}_{AB}$ to the inverse problem in combined regression is then found by minimizing the above combined $\chi^2$ function.
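The sum in Eq. (7) can be sketched in Python for two toy linear models that share one parameter. Everything here is an illustrative assumption: the parameter ordering (height, width, dtop), the index lists, the design matrices, and the noise levels are hypothetical and do not reproduce the OCD and SEM models of the paper.

```python
import numpy as np

N_PARAMS = 3      # p_AB = (height, width, dtop), hypothetical ordering
IDX_A = [0, 1]    # tool A depends on (height, width)
IDX_B = [1, 2]    # tool B depends on (width, dtop)

def embed(J, idx, n=N_PARAMS):
    """Place a tool's Jacobian columns into the columns of p_AB."""
    out = np.zeros((J.shape[0], n))
    out[:, idx] = J
    return out

def combined_regression_linear(A_A, y_A, V_A, A_B, y_B, V_B):
    """Minimize chi2_A + chi2_B, Eq. (7), for linear models f = A p."""
    J_A, J_B = embed(A_A, IDX_A), embed(A_B, IDX_B)
    W_A, W_B = np.linalg.inv(V_A), np.linalg.inv(V_B)
    H = J_A.T @ W_A @ J_A + J_B.T @ W_B @ J_B
    b = J_A.T @ W_A @ y_A + J_B.T @ W_B @ y_B
    return np.linalg.solve(H, b), np.linalg.inv(H)

# Hypothetical toy data for the two tools.
rng = np.random.default_rng(1)
p_true = np.array([36.0, 17.1, 2.9])
A_A, A_B = rng.normal(size=(15, 2)), rng.normal(size=(10, 2))
V_A, V_B = 0.04 * np.eye(15), 0.01 * np.eye(10)
y_A = A_A @ p_true[IDX_A] + rng.normal(scale=0.2, size=15)
y_B = A_B @ p_true[IDX_B] + rng.normal(scale=0.1, size=10)

p_hat, cov = combined_regression_linear(A_A, y_A, V_A, A_B, y_B, V_B)
print(p_hat)
print(np.sqrt(np.diag(cov)))  # combined parametric uncertainties
```

In this toy setting the combined variances for the parameters of tool A are never larger than those from tool A alone, because tool B only adds information.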

The investigated targets and the geometrical parameterizations used have already been described in detail in Ref. 3, so we will only give a brief overview. We investigate finite 30-line arrays of Si on Si with a thin native conformal oxide (see Fig. 1). The nominal widths are 14, 16, and 18 nm. A schematic representation of the geometrical parameterizations can be found in Fig. 2. For the OCD analysis, the geometry is fully characterized by the height, width, Δtop, and Δbot of a single line. The physics by which this geometry interacts with incident light to produce a signal was approximated by the rigorous coupled-wave analysis (RCWA) model that is based on a semianalytical treatment of Maxwell’s equations (Refs. 12 and 13). The OCD data used in this study have been generated using this RCWA model for 30 lines and a measurement setup similar to the actual experiment at NIST (Refs. 14 and 15). The nominal values are height = 36.0 nm, width = 17.1 nm, Δtop = 2.9 nm, and Δbot = 0.5 nm. The noise that has been added to this simulated data has been generated using the same correlated errors as described in Sec. 4.1.

Fig. 1 Scanning electron microscope (SEM) image showing the 30-line arrays. Horizontal field of view is 3.63 μm.

Fig. 2 Initial parameters for (a) optical critical dimension (OCD) and (b) SEM modeling. The parameters that are in bold font are part of the reduced subset of these parameters that has been used in this work.

The initial geometrical parameters for the SEM fit of each line included the line position, its width at half height, and its left and right sidewall edge slopes (represented by the edge widths). Although the left and right sidewall slopes of an individual line often differed, this was a manifestation of roughness. On average, left and right edge widths were indistinguishable. Consequently, for hybridization with OCD, which averages over a comparatively large area, it was sufficient to treat sidewall width as a single parameter. Similarly, line position was not important for OCD. Hybridization was limited, therefore, to two parameters: the width at the middle and Δtop. The geometry is related to the measured signal by JMONSEL, a single-scattering Monte Carlo simulator with a choice among a number of physical models of the scattering phenomena that affect electron transport. We used tabulated Mott cross sections to model scattering of electrons from nuclei and a dielectric function theory approach to secondary electron generation. Electron–phonon interactions and electron refraction (or total internal reflection) at the vacuum/Si interface were included. Details of JMONSEL’s treatment of these models are available elsewhere (Ref. 16).

Equipped with the proper set of overlapping parameters, we will now focus on a more realistic modeling of the present measurement data. We will investigate the influence of tool-induced measurement errors for OCD and the influence of scaling errors for SEM.

Optical Critical Dimension

The specific way in which the OCD measurements are performed, by taking images on the xy plane going through focus, requires increased attention to possible systematic and correlated errors. For example, one can imagine that being slightly off axis in the illumination will affect the symmetry of the entire set of collected images. Determining such correlations from measurement data alone is often not possible, and other methods need to be developed in order to give a quantitative description of those effects. We use the Monte Carlo method (Ref. 17), which is based on the following reasoning. If we denote by $\nu$ the vector of $k$ fixed parameters of the measurement setup, we can make use of the following more general model function:

$$f: \mathbb{R}^{n+k} \to \mathbb{R}^m, \quad f(p, \nu) = [f_1(p, \nu), \ldots, f_m(p, \nu)]^T. \tag{8}$$

The effect that a slight deviation of the parameters $\nu$ from the nominal values $\nu_0$ has upon the simulated image can then be estimated from the following five steps:

  1. Assume a distribution for $\nu$, based on tool specifications, expert knowledge, etc.
  2. Draw a sample $\{\nu_i\}_{i=1,\ldots,N}$ from the distribution.
  3. Calculate $\{f(p, \nu_i) = [f_1(p, \nu_i), \ldots, f_m(p, \nu_i)]^T\}_{i=1,\ldots,N}$.
  4. Define $r_i = f(p, \nu_0) - f(p, \nu_i)$.
  5. Calculate the sample covariance matrix via
    $$\tilde{V} = \frac{1}{N-1} \sum_{i=1}^{N} (r_i - \bar{r})(r_i - \bar{r})^T. \tag{9}$$
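The five steps above can be sketched in Python as follows. The function `toy_model` is a hypothetical stand-in for the forward model (it is not an RCWA simulation), and the names `draw_ina` and `mc_covariance` are illustrative; only the INA distribution (mean 0.13, standard deviation 0.01) is taken from the text.

```python
import numpy as np

def mc_covariance(f, p, nu0, draw_nu, n_samples, rng):
    """Steps 2 to 5: sample nu, form residuals r_i, return Eq. (9)."""
    f0 = f(p, nu0)
    residuals = np.stack(
        [f0 - f(p, draw_nu(rng)) for _ in range(n_samples)]
    )                                    # shape (N, m)
    centered = residuals - residuals.mean(axis=0)
    return centered.T @ centered / (n_samples - 1)

# Hypothetical forward model: a profile whose width scales with nu.
x = np.linspace(-1.0, 1.0, 40)

def toy_model(p, nu):
    width = p[0]
    return np.exp(-0.5 * (x / (width * nu / 0.13)) ** 2)

# Step 1: assumed distribution of the setup parameter (INA).
def draw_ina(rng):
    return rng.normal(0.13, 0.01)

V_tilde = mc_covariance(
    toy_model, np.array([0.3]), 0.13, draw_ina, 500,
    np.random.default_rng(2),
)
print(V_tilde.shape)  # (m, m), here (40, 40)
```

The off-diagonal entries of `V_tilde` are generally nonzero because a single perturbation of the setup parameter shifts many data points at once.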

In this paper, the $\nu$ vector included components for the collection numerical aperture (CNA), illumination numerical aperture (INA), focus heights, and phase. This Monte Carlo procedure, therefore, estimates the effect of errors propagated from variations in those instrument parameters; for example, we choose the INA to be normally distributed with a mean of 0.13 and a standard deviation of 0.01. Figures 3 and 4 show a graphical representation of the first 752 entries of the sample covariance matrix that correspond to the first four focus heights in X polarization for phase, focus, illumination, and CNA variations, along with the respective concatenated images.

Fig. 3 Estimated covariance matrices for variations in the (a) phase and (b) focus. Random phase perturbations normally distributed with a mean of 0 rad and a standard deviation of π/5 rad, and focus offset normally distributed with a mean of 0 nm and a standard deviation of 2 nm. The variance induced by phase variations is 3 orders of magnitude greater than that from focus variation. Four concatenated OCD intensity profiles are duplicated below the matrices in Figs. 3 and 4 to illustrate the positional dependence of the covariance. In general, changes correlate to the position of the finite target.

Fig. 4 Estimated covariance matrices for variations in the (a) illumination numerical aperture (INA) and (b) collection numerical aperture (CNA). INA normally distributed with a mean of 0.13 and a standard deviation of 0.01, CNA normally distributed with a mean of 0.95 and a standard deviation of 0.1. The variances induced by INA and CNA variations are one order of magnitude less than those from phase errors.

One can clearly see both the positive and negative correlation between errors as colored areas off the diagonal. In order to have a reliable model for the measurement errors, it is therefore very important to account for this effect in the covariance matrix that is being used in the $\chi^2$ function [see Eq. (3)]. This has an influence not only on the best fit values, but also on the estimation of the parametric uncertainties or, more precisely, the covariance matrix $\Sigma$, diagonal elements of which are the uncertainties in the estimated parameters, given by (Ref. 2)

$$\Sigma = (J^T V^{-1} J)^{-1}. \tag{10}$$

Here $J$ denotes the Jacobian matrix, i.e., the matrix of all first-order partial derivatives of the vector-valued model function $f$, at the best fit vector $\hat{p}$. A comparison of the parametric uncertainties based on using a diagonal $V$ and the full $V$ in Eq. (10) can be found in Table 1.
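A minimal sketch of Eq. (10) in Python compares a diagonal and a full V. The Jacobian (a straight-line model with an offset and a slope) and the equicorrelated covariance with correlation 0.9 are hypothetical choices for illustration, and the uncertainties are reported as the square roots of the diagonal of the covariance matrix.

```python
import numpy as np

def parametric_covariance(J, V):
    """Eq. (10): Sigma = (J^T V^{-1} J)^{-1}."""
    return np.linalg.inv(J.T @ np.linalg.solve(V, J))

def parametric_uncertainties(J, V):
    """Standard uncertainties: square roots of the diagonal of Sigma."""
    return np.sqrt(np.diag(parametric_covariance(J, V)))

# Hypothetical Jacobian: offset and slope of a straight line.
J = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# Hypothetical full V: unit variances, pairwise correlation 0.9.
rho = 0.9
V_full = (1.0 - rho) * np.eye(4) + rho * np.ones((4, 4))
V_diag = np.diag(np.diag(V_full))  # keep only the variances

u_diag = parametric_uncertainties(J, V_diag)
u_full = parametric_uncertainties(J, V_full)
print("diagonal V:", u_diag)
print("full V:    ", u_full)
```

In this toy case the offset uncertainty grows and the slope uncertainty shrinks once the correlations are included, which shows that both directions of change can occur within a single fit.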

Table 1 Estimates of the parametric uncertainties for simulated OCD data using diagonal V, i.e., only accounting for uncorrelated random noise, and full V, i.e., taking correlations into account.

Note that there is a notable difference between the estimated parametric uncertainties, with the most significant change in the uncertainty of the width. It is also very important to note that a given parametric uncertainty might increase or decrease if correlations in the measurement data are taken into account. Thus, a general statement about the effect correlated errors have on the estimated parametric uncertainties cannot be given, and it must be investigated separately for each new problem.

Scaling Errors in Scanning Electron Microscope

The biggest contribution to the SEM’s measurement error in this experiment is due to pixel calibration. Errors in this calibration directly influence the obtained values for the CDs. The usual approach is to simply add the uncertainty due to the calibration in quadrature with the estimated parametric uncertainty after the reconstruction. In combined regression, however, the estimation of the parameters of interest, i.e., the vector $p$, and of their uncertainties is based on the combination of the OCD and the SEM data, while the error induced by pixel calibration only affects the SEM data; hence, it is not possible to simply include the uncertainty due to the calibration afterward. We will therefore model the effect a variation in the scale has on the measurement in a simple way, multiplying the model parameter vector $p = (\mathrm{width}, \Delta_{\mathrm{top}})^T$ by a scaling parameter $\kappa$ such that the modified model is given by

$$\tilde{f}: \mathbb{R}^3 \to \mathbb{R}^m, \quad \tilde{f}(\kappa, p) = f(\kappa \cdot p) \tag{11}$$

with $f$ being the SEM’s model function. In this description, $\kappa = 1$ corresponds to no scale error, $\kappa = 1.01$ to a 1% scale error, etc. It is clear that this approach leads to a high parametric correlation in the parameters of the model. We will explain this effect using a simple model with an added scaling factor in the following:

Let

$$f: \mathbb{R} \to \mathbb{R}^m, \quad f(x) = [f_1(x), \ldots, f_m(x)]^T, \quad \text{and} \quad J = (J_{i,1})_{i=1,\ldots,m}, \quad J_{i,1} = Df_i \tag{12}$$

be a model function depending on only one parameter $x$ and denote by $Df_i$ the derivative of $f_i$ with respect to this parameter. If we assume the measurement errors to be independent and identically distributed (i.i.d.) with unit variance [hence $V = (\delta_{i,j})_{i,j=1,\ldots,m}$] we have for the estimated covariance matrix

$$\Sigma = (J^T V^{-1} J)^{-1}, \quad \text{with} \quad J^T V^{-1} J = \sum_{i=1}^{m} [Df_i]^2, \tag{13}$$

which is well defined if $\det(J^T V^{-1} J) = \sum_{i=1}^{m} [Df_i]^2 \neq 0$.

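Now assume that we add a scale parameter to the above model by defining a slightly modified function

$$F: \mathbb{R}^2 \to \mathbb{R}^m, \quad F(\kappa, x) = f(\kappa \cdot x), \quad \text{and} \quad J = (J_{i,j})_{i=1,\ldots,m;\, j=1,2}, \quad J_{i,1} = x \cdot Df_i, \quad J_{i,2} = \kappa \cdot Df_i. \tag{14}$$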

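The estimated covariance matrix for this two-parameter model is then given by

$$\Sigma = (J^T V^{-1} J)^{-1}, \quad \text{with} \quad J^T V^{-1} J = \begin{bmatrix} \sum_{i=1}^{m} x^2 [Df_i]^2 & \sum_{i=1}^{m} \kappa x [Df_i]^2 \\ \sum_{i=1}^{m} \kappa x [Df_i]^2 & \sum_{i=1}^{m} \kappa^2 [Df_i]^2 \end{bmatrix}. \tag{15}$$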

However,

$$\det(J^T V^{-1} J) = \sum_{i=1}^{m} x^2 [Df_i]^2 \cdot \sum_{i=1}^{m} \kappa^2 [Df_i]^2 - \left\{ \sum_{i=1}^{m} \kappa x [Df_i]^2 \right\}^2 = 0. \tag{16}$$
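The cancellation in Eq. (16) becomes explicit when the common sum is factored out. The shorthand $S$ is introduced here only for illustration and does not appear in the original derivation:

$$S = \sum_{i=1}^{m} [Df_i]^2, \qquad \det(J^T V^{-1} J) = (x^2 S)(\kappa^2 S) - (\kappa x S)^2 = \kappa^2 x^2 S^2 - \kappa^2 x^2 S^2 = 0.$$

Equivalently, the two columns of $J$ in Eq. (14), $x \cdot Df_i$ and $\kappa \cdot Df_i$, are proportional to each other, so $J$ has rank one and $J^T V^{-1} J$ cannot be inverted.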

Since the above term is always equal to zero, the estimated covariance matrix is not defined and we cannot assign a parametric uncertainty. However, we have prior information about the scale: the actual error lies between 1% and 2%. We can therefore use the Bayesian approach as described in Refs. 2 and 9 under the premise that the prior information can be expressed in terms of normal distributions. The prior information on the parameter $\kappa$ is treated as an additional data point the model function has to account for, such that we have a function that still depends on $\kappa$ and $x$ but now maps into an $(m+1)$-dimensional space, with the $(m+1)$th value simply being $\kappa$. This also adds additional terms to the Jacobian, $J_{m+1,1} = (\partial/\partial\kappa)\kappa = 1$ and $J_{m+1,2} = (\partial/\partial x)\kappa = 0$, such that

$$\tilde{F}: \mathbb{R}^2 \to \mathbb{R}^{m+1}, \quad \tilde{F}(\kappa, x) = [F(\kappa, x), \kappa]^T, \quad \text{and} \quad J_{m+1,1} = 1, \quad J_{m+1,2} = 0. \tag{17}$$

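Since we know that $\kappa \sim N(1, \sigma_\kappa^2)$, we add an additional entry to $V$, namely $V_{m+1,m+1} = \sigma_\kappa^2$ with $\sigma_\kappa = 0.01$ to $0.02$, and obtain the estimated covariance matrix, again using Eqs. (10) and (13),

$$\Sigma = (J^T V^{-1} J)^{-1}, \quad \text{with} \quad J^T V^{-1} J = \begin{bmatrix} \sum_{i=1}^{m} x^2 [Df_i]^2 + \dfrac{1}{\sigma_\kappa^2} & \sum_{i=1}^{m} \kappa x [Df_i]^2 \\ \sum_{i=1}^{m} \kappa x [Df_i]^2 & \sum_{i=1}^{m} \kappa^2 [Df_i]^2 \end{bmatrix} \tag{18}$$

and

$$\det(J^T V^{-1} J) = \left\{ \sum_{i=1}^{m} x^2 [Df_i]^2 + \frac{1}{\sigma_\kappa^2} \right\} \cdot \sum_{i=1}^{m} \kappa^2 [Df_i]^2 - \left\{ \sum_{i=1}^{m} \kappa x [Df_i]^2 \right\}^2. \tag{19}$$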

Equation (19) implies that as long as the prior knowledge about $\kappa$ is not too vague, i.e., $\sigma_\kappa$ is not too large, the above term is in general not equal to zero and the estimated covariance matrix is well defined. A graphical representation of the above-described phenomenon for the measured SEM data is shown in Fig. 5. Note that the $\chi^2$ surface with the added prior information about $\kappa$ has a distinct minimum at $\hat{p} = (1, 17.01\,\mathrm{nm}, 2.86\,\mathrm{nm})$, while it is hard to determine where the minimum is for the $\chi^2$ surface without prior information. In fact, there is not a single distinct minimum but an infinite set of possible minima. Obviously, $\hat{p} = (1, 17.01\,\mathrm{nm}, 2.86\,\mathrm{nm})$ is also the minimum for the $\chi^2$ without prior information, but so is any vector $\hat{p} = [\kappa, (17.01/\kappa)\,\mathrm{nm}, (2.86/\kappa)\,\mathrm{nm}]$ with $\kappa \neq 0$. Defining a parametric uncertainty is therefore not possible. In contrast, the error estimation for the model with prior information for the scale $\kappa$ yields a 2% parametric uncertainty if we assume an error of 2% in the scale as expected; here, this strict linearity only holds since the random errors in the SEM data are much smaller than those attributed to the scale error.
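The degeneracy of the scaled model and its removal by the prior on the scale can be checked numerically with a short Python sketch of Eqs. (14) to (19). The derivative values in `Df` are hypothetical and chosen large so that the random error contribution is small; only the values κ = 1, x = 17.01, and a 2% scale uncertainty echo the text.

```python
import numpy as np

def jacobian_scaled(Df, kappa, x):
    """Eq. (14): columns are dF/dkappa = x*Df and dF/dx = kappa*Df."""
    return np.column_stack([x * Df, kappa * Df])

def information_matrix(J, variances):
    """J^T V^{-1} J for a diagonal V given by its variances."""
    return J.T @ (J / variances[:, None])

# Hypothetical derivative values Df_i at m = 4 data points.
Df = np.array([50.0, -100.0, 200.0, 150.0])
kappa, x, sigma_kappa = 1.0, 17.01, 0.02

# Without prior: unit-variance errors, Eq. (15).
J = jacobian_scaled(Df, kappa, x)
H_free = information_matrix(J, np.ones(len(Df)))

# With prior: extra row (1, 0) with variance sigma_kappa^2, Eqs. (17), (18).
J_prior = np.vstack([J, [1.0, 0.0]])
variances = np.append(np.ones(len(Df)), sigma_kappa ** 2)
H_prior = information_matrix(J_prior, variances)
cov = np.linalg.inv(H_prior)

print(np.linalg.matrix_rank(H_free))  # rank-deficient without the prior
print(np.sqrt(cov[0, 0]))             # uncertainty of kappa
print(np.sqrt(cov[1, 1]) / x)         # relative uncertainty of x
```

With these numbers the relative uncertainty of x comes out close to the assumed scale uncertainty, in line with the near-linear behavior described above for small random errors.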

Fig. 5 $\chi^2$ surface for SEM data (a) without and (b) with Bayesian input for the scale $\kappa$.

We now combine the results that we found in the previous section with the hybridization of OCD and SEM data by combined regression. As pointed out in Sec. 2, this is done by minimizing the sum of the respective $\chi^2$ functions. Note that we use prior information about the scale $\kappa$, so that the $\chi^2$ function for the SEM data is modified as shown in Eq. (6). The individual $\chi^2$ surfaces as functions of the width and Δtop are shown in Figs. 6 and 7. For these plots, the height and Δbot have been fixed for OCD. The plots also show the individual minima and the assigned parametric uncertainties. The $\chi^2$ surface from the combined regression is shown in Fig. 8, and the results from the combined regression are presented in Table 2. The combined minimum is close to the SEM’s minimum and the parametric uncertainties for combined regression are lower than the individual ones, even for the parameters that are only present in the OCD model. This is due to the strong parametric correlations in the models.

Fig. 6 $\chi^2$ surfaces for the SEM data.

Fig. 7 $\chi^2$ surfaces for the simulated OCD data.

Fig. 8 Combined $\chi^2$ surface for the hybridization.

Table 2 Parameter estimates and parametric uncertainties obtained from combined regression.

Following the approach outlined in Ref. 3, we studied the challenges in hybrid metrology due to measurement errors. Those included highly correlated tool-induced errors for the OCD data and systematic errors due to scaling errors in SEM. We have demonstrated how slight variations in the measurement setup for OCD, e.g., in the focus heights, the phase, INA, and CNA, lead to highly correlated errors in the measurement data that manifest themselves as nonzero off-diagonal elements in the sample covariance matrix V. Including those off-diagonal elements in the estimation of a parametric uncertainty can lead to either an increased or a decreased parametric uncertainty compared to the case where only the diagonal of the V matrix is used, depending on the individual nature of that particular full V matrix. Furthermore, we demonstrated the influence of scaling errors on the analysis of SEM data. Attempting to account for such scale errors by including the scale as a fully free parameter would lead to unreasonable results due to strong, or even perfect, correlations. This problem has been solved using prior information about the scale in a Bayesian approach. Finally, we demonstrated how the more sophisticated error analyses could be used in the hybridization of OCD and SEM data. With the proper treatment of those errors, we could achieve a subnanometer parametric uncertainty. It is important to note that the presented framework can be extended to include additional measurement techniques, such as atomic force microscopy or CD small angle X-ray scattering. In addition, it may also be applied across a homogeneous multiple-tool platform. However, for every added measurement technique, it is crucial to perform a careful error analysis in order to use its full capabilities.

1. R. M. Silver et al., “Improving optical measurement accuracy using multi-technique nested uncertainties,” Proc. SPIE 7272, 727202 (2009).
2. N. F. Zhang et al., “Improving optical measurement uncertainty with combined multitool metrology using a Bayesian approach,” Appl. Opt. 51, 6196–6206 (2012).
3. R. M. Silver et al., “Optimizing hybrid metrology through a consistent multi-tool parameter set and uncertainty model,” Proc. SPIE 9050, 905004 (2014).
4. M.-A. Henn et al., “Optimizing hybrid metrology: rigorous implementation of Bayesian and combined regression,” Proc. SPIE 9424, 94241J (2015).
5. A. Vaid et al., “A holistic metrology approach: hybrid metrology utilizing scatterometry, CD-AFM, and CD-SEM,” Proc. SPIE 7971, 797103 (2011).
6. A. Vaid et al., “Implementation of hybrid metrology at HVM fab for 20 nm and beyond,” Proc. SPIE 8681, 868103 (2013).
7. J. Foucher et al., “Hybrid CD metrology concept compatible with high-volume manufacturing,” Proc. SPIE 7971, 79710S (2011).
8. D. S. Sivia, Data Analysis: A Bayesian Tutorial, Oxford University Press, Oxford (1996).
9. A. Gelman et al., Bayesian Data Analysis, Chapman & Hall/CRC, London (2014).
10. J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, Springer Science & Business Media, Berlin (2006).
11. A. Tarantola and B. Valette, “Generalized nonlinear inverse problems solved using the least squares criterion,” Rev. Geophys. 20, 219–232 (1982).
12. M. Moharam et al., “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings,” J. Opt. Soc. Am. A 12, 1068–1076 (1995).
13. M. Moharam et al., “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: enhanced transmittance matrix approach,” J. Opt. Soc. Am. A 12, 1077–1086 (1995).
14. R. M. Silver et al., “Scatterfield microscopy for extending the limits of image-based optical metrology,” Appl. Opt. 46, 4248–4257 (2007).
15. B. M. Barnes et al., “Zero-order and super-resolved imaging of arrayed nanoscale lines using scatterfield microscopy,” AIP Conf. Proc. 931, 397–401 (2007).
16. J. S. Villarrubia et al., “Scanning electron microscope measurement of width and shape of 10 nm patterned lines using a JMONSEL-modeled library,” Ultramicroscopy 154, 15–28 (2015).
17. D. P. Kroese, T. Taimre, and Z. I. Botev, Handbook of Monte Carlo Methods, John Wiley & Sons, Hoboken, NJ (2013).

Mark-Alexander Henn is a guest researcher at the National Institute of Standards and Technology. He received his diploma in mathematics from the Technical University of Berlin, Germany, in 2008, and his PhD degree in theoretical physics from the Technical University of Berlin in 2013. Prior to that he worked for the Physikalisch-Technische Bundesanstalt in Berlin. His current research interests include Bayesian data analysis, electromagnetic modeling, and Fourier optics.

Richard Silver received his BA in physics from the University of California at Berkeley and his PhD in physics from the University of Texas at Austin. He is the Surface and Nanostructure Metrology Group Leader, a fellow of SPIE, co-chair of the European Optical Modeling Conference and on the Program Committee of the Metrology, Inspection, and Process Control Conference. He has over 100 archived publications and was a recipient of the 2013 R&D 100 Award for Quantitative Hybrid Metrology and the 2013 Intel Outstanding Researcher Award.

John S. Villarrubia received his PhD in physics from Cornell University in 1987. He was a visiting scientist for two years at IBM, where he did scanning tunneling microscopy of silicon surfaces. At the National Institute of Standards and Technology since 1989, he has been working on aspects of nanometer-scale dimensional metrology, particularly modeling the instrument function for Atomic Force and Scanning Electron Microscopes and using these models of the instrument function to correct images for measurement artifacts.

Nien Fan Zhang is presently a mathematical statistician in the Statistical Engineering Division at the US National Institute of Standards and Technology (NIST). He received his MS and PhD degrees in statistics in 1983 and 1985, respectively, from Virginia Polytechnic Institute and State University.

Hui Zhou is a research scientist and a consultant at Dakota Consulting, Inc. His main research area is computational electrodynamics, and his interest spreads to all areas between physics and computing. When he is not at work, you may find him reading a random non-fiction book, playing a game of go, or dreaming of computation and physics in the future.

Bryan M. Barnes is a physicist at the National Institute of Standards and Technology. He received his BA degree in mathematics and physics from Vanderbilt University in 1995, and his MS and PhD degrees in physics from the University of Wisconsin-Madison in 1997 and 2004, respectively. He is the author of more than 40 proceedings and journal papers and holds one patent. His current research interests include optical defect inspection, hybrid metrology, and critical dimension metrology.

Andras E. Vladar, PhD, is the leader of the Three-Dimensional Nanometer Metrology Project at the National Institute of Standards and Technology (NIST), USA. He is an expert in scanning electron microscopy and dimensional metrology, one of the best-known research scientists and a technical leader in this field. His research interest is in SEM-based sub-10 nm three-dimensional measurements for semiconductor and nanotechnology applications.

© The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Citation

Mark-Alexander Henn ; Richard M. Silver ; John S. Villarrubia ; Nien Fan Zhang ; Hui Zhou, et al.
"Optimizing hybrid metrology: rigorous implementation of Bayesian and combined regression", J. Micro/Nanolith. MEMS MOEMS. 14(4), 044001 (Nov 12, 2015). ; http://dx.doi.org/10.1117/1.JMM.14.4.044001


Figures

Graphic Jump Location
Fig. 1
F1 :

Scanning electron microscope (SEM) image showing the 30 line arrays. Horizontal field of view is 3.63μm.

Graphic Jump Location
Fig. 2
F2 :

Initial parameters for (a) optical critical dimension (OCD) and (b) SEM modeling. The parameters that are in bold font are part of the reduced subset of these parameters that has been used in this work.

Graphic Jump Location
Fig. 3
F3 :

Estimated covariance matrices for variations in the (a) phase and (b) focus. Random phase perturbations normally distributed with a mean of 0 rad and a standard deviation of π/5rad, and focus offset normally distributed with a mean of 0 nm and a standard deviation of 2 nm. The variance induced by phase variations is 3 orders of magnitude greater than those from focus variation. Four concatenated OCD intensity profiles are duplicated below the matrices in Figs. 3 and 4 to illustrate the positional dependence of the covariance. In general changes correlate to the position of the finite target.

Fig. 4

Estimated covariance matrices for variations in the (a) illumination numerical aperture (INA) and (b) collection numerical aperture (CNA). The INA is normally distributed with a mean of 0.13 and a standard deviation of 0.01; the CNA is normally distributed with a mean of 0.95 and a standard deviation of 0.1. The variances induced by INA and CNA variations are one order of magnitude smaller than those induced by phase errors.

Fig. 5

χ² surface for SEM data (a) without and (b) with Bayesian input for the scale κ.

Fig. 6

χ² surfaces for the SEM data.

Fig. 7

χ² surfaces for the simulated OCD data.

Fig. 8

Combined χ² surface for the hybridization.

Tables

Table 1 Estimates of the parametric uncertainties for simulated OCD data using diagonal V, i.e., only accounting for uncorrelated random noise, and full V, i.e., taking correlations into account.
Table 2 Parameter estimates and parametric uncertainties obtained from combined regression.

