Open Access
11 January 2021 Reconstruction of knee anatomy from single-plane fluoroscopic x-ray based on a nonlinear statistical shape model
Jing Wu, Mohamed R. Mahfouz
Author Affiliations +
Abstract

Purpose: Reconstruction of patient anatomy is critical to patient-specific instrument (PSI) design in total joint replacement (TJR). Conventionally, computed tomography (CT) and magnetic resonance imaging (MRI) are used to obtain the patient anatomy as they are accurate imaging modalities. However, computing anatomical landmarks from the patient anatomy for PSIs requires either high-resolution CT, increasing time of scan and radiation exposure to the patient, or longer and more expensive MRI scans. As an alternative, reconstruction from single-plane fluoroscopic x-ray provides a cost-efficient tool to obtain patient anatomical structures while allowing capture of the patient’s joint dynamics, important clinical information for TJR.

Approach: We present a three-dimensional (3D) reconstruction scheme that automatically and accurately reconstructs the 3D knee anatomy from single-plane fluoroscopic x-ray based on a nonlinear statistical shape model called kernel principal component analysis. To increase robustness, we designed a hybrid energy function that integrated feature and intensity information as a similarity measure for the 3D reconstruction.

Results: We evaluated the proposed method on five subjects during deep knee bending: the root-mean-square accuracy is 1.19  ±  0.36  mm for reconstructed femur and 1.15  ±  0.17  mm for reconstructed tibia.

Conclusions: The proposed method demonstrates reliable 3D bone model reconstruction accuracy with successful elimination of prior 3D imaging and reduction of manual labor and radiation dose on patient as well as characterizing joints in motion. This method is promising for applications in medical interventions such as patient-specific arthroplasty design, surgical planning, surgical navigation, and understanding anatomical and dynamic characteristics of joints.

1.

Introduction

Reconstruction of patient anatomy from medical images is important for designing patient-specific instruments (PSIs) for total joint replacement (TJR).14 Current 3D reconstruction methods include reconstruction from radiological images using computed tomography (CT), magnetic resonance imaging (MRI),5,6 x-ray,79 and fluoroscopic x-ray sequences.10,11 Three-dimensional reconstruction from high-resolution CT is expensive and requires prior 3D imaging, exposing patients to large amounts of radiation (0.01 rem for lower extremities) by revolving the x-ray source around the patient with a fine “fan” of x-ray beams through the body from all angles into their associated detectors.12 Three-dimensional reconstruction from MRI requires expensive and time-consuming 3D imaging, adding discomfort to the patient in the form of noise and constraint to the patient’s motion. In addition, reconstruction from CT or MRI requires significant manual intervention for 3D segmentation. Although reconstruction from x-ray images is more cost-efficient than CT or MRI, the x-ray image is static, and therefore, gives limited dynamic information of the joint for clinical assessment.

In contrast, three-dimensional reconstruction from fluoroscopic x-ray sequences provides a clear advantage in lower cost and neglectable radiation (0.003 rem for 33  s) and involves less manual intervention compared with reconstruction from CT and MRI.12 In addition, fluoroscopic x-ray images are sequences of x-ray images that can be captured during dynamic, weight-bearing activities, whereas x-ray imaging is static. Single-plane fluoroscopic x-ray is a useful tool for analyzing joint anatomy and joint kinematics in vivo,13,14 because it allows sufficiently unconstrained motion of the patients, such as deep knee bending (DKB), which can reveal soft tissue information of the joint in multiple positions. Biplanar x-ray using two orthogonal units, although generally more accurate, often unacceptably constrains the motion of the patient.15 In order to capture kinematic information, we reconstruct the 3D anatomy of the knee from single-plane fluoroscopic x-ray to allow patients to move without impairment.

Three-dimensional knee reconstruction from kinematic sequences was first proposed by Baker et al.10 However, this requires volumetric image reconstruction by scanning 360 deg around the patient. Baka et al.7 then reconstructed a distal femur from stereo x-ray images. The method was not optimized to sequence reconstruction as bone shape is expected to stay constant throughout the sequence. An improvement was made by the same author11 to estimate the shape based on all frames of the fluoroscopic x-ray sequence and estimate pose per frame using biplane fluoroscopic x-ray. We propose to reconstruct patient knee anatomy during kinematic motion from single-plane fluoroscopic x-ray sequences employing a two-stage optimization scheme. In the first stage, the pose is estimated with the mean model; in the second stage, optimization alternates between shape and pose. This two-stage optimization is challenging due to the limited information provided from single-plane images and large motion between frames. To address this, we employ a statistical shape model (SSM) to have good shape representation ability and a robust hybrid similarity measure.

Methods for bone shape reconstruction have been proposed in the literature with statistical shape modeling. The SSM was first introduced by Cootes et al. in 1995 who used “shape priors” to restrict the bone reconstruction to plausible shapes.16 The linear SSM is “learned” by a principal component analysis (PCA) of the training shapes of Gaussian-distributed shapes. Although this model has been successfully applied to the 3-D reconstruction of various anatomical structures in medical imaging,7,11,17 there are cases when the set of training shapes exhibits highly nonlinear shape deformations, such as large shape variations between bones. More recently, other shape reconstruction methods include Gaussian process morphable models (GPMMs)18 and an articulated SSM.19 Lüthi et al.18 developed GPMMs to represent the shape variations with a Gaussian process using the leading components of its Karhunen–Loeve expansion. The method reaches its limitations when very fine deformations need to be modeled over a large domain, as is sometimes required in image registration. Balestra et al.19 reconstructed 3-D patient-specific hip joints from x-ray images. In this work, an articulated statistical shape model was utilized to address the problems associated with hip joint narrowing. An average reconstruction error of 1.9 mm was achieved for the hip and 1.1 mm was achieved for the proximal femur.

In this paper, we build on our previously developed method for creation of statistical atlases and establishing correspondence between anatomical models.2023 In order to accommodate for the nonlinear variations, we employ kernel PCA (KPCA)24,25 to map the original data from the input space to a high-dimensional feature space via a nonlinear map and then apply a linear PCA in the feature space. (The justification analysis of the need for a nonlinear SSM is detailed in the Appendix.)

The main contributions of this paper are threefold. (1) We propose a 3-D reconstruction scheme from single-plane fluoroscopic x-ray sequences during kinematic activity. The 3-D reconstruction is based on kernel PCA as a nonlinear SSM to represent the bone shape variation. (2) We propose a hybrid energy function as the similarity measure to integrate feature and intensity information in the fluoroscopic x-ray images. (3) We perform a multi-body reconstruction of knee bones (e.g., distal femur and proximal tibia) from single-plane fluoroscopic x-ray during kinematic activity.

From a clinical point of view, the proposed method offers valuable information needed for accurate assessment of the knee joint in preoperative planning. First, the reconstruction of the patient anatomy under load-bearing conditions provides more accurate representation of the joint alignment compared to the nonload-bearing supine position in CT and MRI. Second, collecting patient kinematic information under load-bearing conditions offers critical information about behavior of soft tissue during unconstrained patient motion. This overcomes the current limitation of kinematic data collection in knee replacement candidates, which is currently obtained using surgical navigation systems intraoperatively and using passive motion of the patient joint under anesthesia. In summary, the reconstructed models, joint alignment, and kinematic information that are obtained can lead to more accurate preoperative planning and execution using either patient-specific instrumentation or computer/robotic guided surgery.

2.

Approach

The goal of this paper is to reconstruct the 3-D knee anatomy from a single-plane fluoroscopic x-ray sequence. The outline of the general framework is shown in Fig. 1. First, digital video fluoroscopic x-ray [two-dimensional (2-D)] images are acquired, allowing capture of dynamic data during kinematic activities. Each frame is calibrated using a known rectangular grid of metal beads.13 The pose of the 3-D model is initialized with a template matching method26 and the shape is initialized as a mean model. The 3-D model is reconstructed from a nonlinear SSM using KPCA. This requires a training process and preimage approximation as described in Sec. 2.1. The reconstruction employs a two-stage optimization scheme for the pose and shape of the -3-D model as discussed in Sec. 2.2. A hybrid energy function as discussed in Sec. 2.3 is used as a similarity measure. The output is the patient-specific knee 3-D model with poses corresponding to each frame in the fluoroscopic x-ray sequence.

Fig. 1

Framework of the proposed method.

JMI_8_1_016001_f001.png

2.1.

Nonlinear Statistical Shape Model

The creation of statistical atlases and establishing correspondence between anatomical models previously developed by our group is a verified method utilized in multiple medical imaging applications.2023 The addition described in this paper is the use of kernel PCA to capture nonlinear variations in anatomical structures (Sec. 2.1.1) and then performing a preimage approximation (Sec. 2.1.2) to get the reconstructed 3-D model from shape parameters. A set of 300 mixed gender knee bones of Caucasian and black American individuals were used to build this SSM.

2.1.1.

SSM training by kernel principal component analysis

The SSM represents the dominant shape variation of the training set. The training data XR3 are 3-D surface mesh models defined as X={xi|i{1,,N}}, where N is the number of training data. Each training shape xiX has a set of vertices, where xi={xik|k{1,,M}}, M denotes the number of vertices in the 3-D surface mesh model, and xik=(x,y,z) denotes the coordinates of the k’th vertex of xi. Corresponding points are found between each training shape before building the SSM with a statistical shape atlas.2023

The input data are mapped onto a high-dimensional feature space H via the nonlinear map Φ:R3H. This map does not need to be explicitly known. Alternatively, one can introduce the kernel matrix K, which is defined to be a function as follows:

Eq. (1)

K(xi,xj)=Φ(xi),Φ(xj),
such that for all data points xi
K=(k(x1,x1)k(x1,xN)k(xN,x1)k(xN,xN))
is symmetric and positive semidefinite. For a Gaussian kernel
K(xi,xj)=exixj22σ2.

Center Φ(xi) by Φ(xi)Φ(xi)1Ni=1NΦ(xi). The corresponding kernel matrix is modified as follows:

Eq. (2)

K¯ij=K1NjKij1NiKij+1N2ijKij.

Conduct PCA by solving the eigenvalue problem

Eq. (3)

CV=λV,
where C=1Ni=1NΦ(xi)Φ(xi)T, V are the eigenvectors, and λ are the eigenvalues. Since the eigenvectors V lie in the span of Φ(xi),,Φ(xN), Eq. (3) is equivalent to

Eq. (4)

Φ(xi)·C¯V=λ[Φ(xi)·V]for all  i=1,,N.

Since K(xi,xj)=Φ(xi),Φ(xj) is symmetric and it has a set of eigenvectors that spans the whole space, this is equivalent to solving the dual eigenvalue problem

Eq. (5)

Kα=Nλα,
where α are the eigenvectors and λ is the eigenvalue for both problems. Then standard PCA is conducted to get the first m eigenvectors to constitute the eigenmatrix A.

Normalize the eigenvectors α1,,αN by requiring that the corresponding vectors in H be normalized, i.e., Vk,Vk=1 for all k=1,,N. This is equivalent to

Eq. (6)

1=i,j=1NαikαjkΦ(xi)·Φ(xj)=λk(αk·αk).

Project the input data XR3 onto the feature space spanned by the first m eigenvectors of C. The projected input data are given by

Eq. (7)

PmΦ(X)=VT,Φ(X)=i=1mαiΦ(Xi)Φ(X)=i=1mαiK(Xi,X).

2.1.2.

Preimage approximation

We need to reconstruct the preimages X^ from the shape parameter θ={θ1,,θm} in this section. In standard PCA, the preimage X^ can simply be approximated by the linear relationship between the shape parameters and the preimage.16 However, this simple linear relationship does not exist for KPCA because the nonlinear map Φ is not necessarily known.27 For kernel PCA, we reconstruct the preimage X^ of the corresponding test point XR3 based on the relationship between input-space and feature-space distances.25

For any two points xi and xj in the input space, the Euclidean distance is given by d(xi,xj). Accordingly, their feature-space distance d˜[Φ(xi),Φ(xj)] is obtained using the mapped images. The relationship between d(xi,xj) and d˜[Φ(xi),Φ(xj)] can be derived as follows:

Eq. (8)

d˜2[Φ(xi),Φ(xj)]=1α1Kα12Kα1,
where α1=α(θb). For the Gaussian kernel of the form k(Φi,Φj)=exixj22σ2, which is invertible, the relationship between d(xi,xj) and d˜[Φ(xi),Φ(xj)] can be described by

Eq. (9)

dij2=2σ2log[12(Kii+Kjjd˜ij2)].

The distance to neighboring points has an exponential impact on the estimation of the current point according to the iterative scheme defined by Schölkopf et al.28 The contribution of neighboring points drops rapidly with increasing distance from the preimage, and only n neighbors {x1,,xn}RM are obtained, where nN.

An M×n matrix is constructed, X=[x1,,xn]. Assuming that the training patterns span an m-dimensional space, we can obtain the singular value decomposition of the centered M×n matrix XH as

Eq. (10)

XH=UΛV=UZ,
where the left singular vector U=[e1,em] is an M×m matrix with orthonormal columns ei and Λ is the right singular vector, Λ are singular values, and Z is an m×n matrix with columns zi containing the projection of xi onto ejs. The squared distance of xi to the origin is equal to zi2 such that d02=[z12,,zn2].

According to the assumption that the preimage x^ is in the span of n neighbors, its location can be estimated by its n nearest neighbors. Hence the least squares solution z^ is yielded by

Eq. (11)

z^=12Λ1V(d2d02).

The least squares solution z^ is expressed by the basis ej that defined a different coordinate system from the input space. Transforming it back to the original coordinate system x^

Eq. (12)

x^=Uz^+x¯.

This preimage approximation algorithm is summarized in Fig. 2.

Fig. 2

Preimage approximation algorithm.

JMI_8_1_016001_f002.png

2.2.

Optimization

To cope with the fitting of multiple parameters, we divided the optimization into two stages. The first stage provides an initialization for the reconstruction by fixing the shape parameters and optimizing pose parameters only. The initial fixed shape is the mean shape, which is rigidly registered to the 2-D image to yield a close-to-optimal pose. The second stage is shape estimation with the estimated pose from the first stage. These two stages run in turn, resulting in the optimal shape and pose estimation. The optimization scheme is shown in Fig. 3.

Fig. 3

Two-stage optimization. Top: stage 1: the estimated shape is rigidly aligned to each frame of fluoroscopic x-ray image. Mean shape is used as initial shape parameter. Center: stage 2: shape is optimized with the fixed pose from stage 1. Stage 1 (pose) and stage 2 (shape) alternate optimization until convergence occurs. Bottom: The hybrid energy function is minimized to get the final estimation of the 3D model’s shape and pose.

JMI_8_1_016001_f003.png

The shape and pose of the 3-D model are optimized in each frame Ii of the fluoroscopic x-ray sequence. Let SR3 be the smooth surface of the object of interest and denote X=[x,y,z], XS to be the spatial coordinates that are measured with respect to the referential frame of the imaging camera as shown in Fig. 4. Let X0R3 and S0R3 be the initial coordinates and surface in the camera reference frame, respectively. We assume a camera realization π:R3Ω such that X=[xz,yz] and ΩR2 denotes the domain of the image Ii. Similarly, we can form the edge image of the object of interest in the image Ii as Ci=Ii, where CiR2.

Fig. 4

Camera coordinate system definition. In the sagittal view, x axis is around the AP axis, y axis is around the around the proximal-distal axis, and z axis is around the medial-lateral axis.

JMI_8_1_016001_f004.png

We can locate the surface of the object of interest S in the camera reference frame via a transformation T such that S=T(S0) and the corresponding pointwise expression is

Eq. (13)

X=T(X0)=RX0+t,
where R is a rotation matrix. R=RγRβRα where
Rγ=[cosγsinγ0sinγcosγ0001],Rβ=[cosβ0sinβ010sinβ0cosβ],  Rα=[1000cosαsinα0sinαcosα]
and t is a translation vector, t=[txtytz].

X0=x^ is reconstructed from the shape parameters θ={θ1,,θm} via X0=f(θ) using kernel PCA, which is described in Sec. 2.1.2.

Two sets of parameters are determined to align the 3-D model to the 2-D x-ray image: one is the shape parameter set θ={θ1,,θm}; the other set contains the pose parameters, p={α,β,γ,tx,ty,tz}. The optimal shape and pose parameters are determined by minimizing the hybrid energy function defined in Sec. 2.3 with a global optimization algorithm called pattern search (PS). PS attempts to minimize an objective function by comparing its value in a finite set of trial points at each iteration. As a direct search method, PS can be applied to functions that are not continuous or differentiable. Convergence for stationary points can be proved from arbitrary starting points.29

2.3.

Hybrid Energy Function

An energy function E, as a similarity measure, quantifies how closely the 3-D model with the current pose and shape fits the corresponding object in the fluoroscopic x-ray image. A hybrid energy function that integrates both feature and intensity information is proposed, which is less time consuming than the intensity-based method and less prone to segmentation error than the feature-based method. A collision detection score is used to avoid a collision between neighboring bones for multi-body registration. Specifically, the hybrid energy function is defined as a linear combination of an edge score Ee, a region score Er, a homogeneity score Eh, and a collision detection score Ec

Eq. (14)

E=[c1u(x)Ee+c2u(x)Erc3Ehc4Ec],
where ci are the weighting parameters that set the contribution of each term to the overall energy function and u(x) is a mask. The value of u(x)=0 for pixels whose projection falls in the area of neighboring bones; u(x)=1 otherwise. The weighting scores are selected with the edge matching value weighted more heavily than the intensity matching values. By weighting the edge score more heavily than the intensity value, the edge value dominates when the 3-D models are close to the true contours. The weight of the homogeneity score is a penalty that matches the value of intensity and edge score, whereas the collision detection weight is a large constant to avoid such behavior.

Two input data are involved in the hybrid energy function: one is a 2-D x-ray image defined as I1:Ω1R2; the other is the projected 3-D model image defined as I2=π(x^), I2:Ω2R2. The generation of edge images is through a bilateral filter followed by a Canny edge detector. The corresponding edge images can be defined by C1=I1 and C2=I2, where C1 is the edge image of the 2-D fluoroscopic x-ray image, C2 is the projected 3-D model edge image, and C1, C2R2.

The intensity information is integrated into the hybrid energy function by matching the x-ray image against the projected 3-D model image. Unlike intensity-based methods, which require a time-consuming digitally reconstructed radiograph (DRR) as the simulated x-ray projection image, the projected model image is generated by rendering the 3-D surface model on the 2-D image plane by software developed by the authors using a 3-D graphics library.13

The region score Er is thus defined as a modified local cross correlation30 as follows:

Eq. (15)

Er(I1,I2)=1NjΩ1in(j)Gp(j)[I1(i)I¯1(j)]2[I2(i)I¯2(j)]2in(j)[I1(i)I¯1(j)]2in(j)[I2(i)I¯2(j)]2,
where I1 is the 2-D fluoroscopic x-ray image, I2 is the projected 3-D model image, I¯1 and I¯2 are the corresponding mean values, i represents a pixel in the neighborhood n(j) around pixel j in I1, and Gp is a Gaussian window function centered around each pixel in the region image.

The feature information is integrated into the hybrid energy function by a direct image-to-image similarity measure in our previous work on rigid 2D–3D registration of knee implants13 as the edge score Ee defined as follows:

Eq. (16)

Ee(C1,C2)=iΩ1,jΩ2C1(i)C2(j)jΩ2C2(j).

The edge score is maximized when the projected 3-D model edge image coincides with the corresponding object edges in the 2-D x-ray image.

The homogeneity score Eh is defined as

Eq. (17)

Eh(I1,I2)=iCinjn(i)|I1(j)I¯in(i)|nin+iCoutjn(i)|I1(j)I¯out(i)|nout,
where the first term designates the variance of the set of gray levels located on the internal of the projected 3-D model image, and the second term designates the variance located on the external of the projected 3-D model. The parameter j represents a pixel in the neighborhood n(i) around pixel i in the 2-D x-ray image, I¯in(i) is the mean intensity in the neighborhood of n(i) inside the 3-D model, nin is the number of pixels inside the 3-D model, I¯out(i) is the mean intensity in the neighborhood of n(i) outside the 3-D model, and nout is the number of pixels outside the 3-D model. Eh is minimal when the external contour of the projected 3-D model delineates two homogeneous regions separated by an edge.

Neighboring bones may overlap each other during registration. Therefore, we include a collision detection score Ec as a penalty to prevent collision of neighboring bones in the same frame

Eq. (18)

Ec(s1,s2)=CH1(s1,s2),
where C is a large constant, s1 and s2 are the surface mesh of the two involved bones, and H1 is the Heaviside function used to penalize any overlapping between two surface models
H1(s1,s2)={1,if collision occurs0,else.

Thus, if no collision is present, H1(s1,s2) will be equal to zero.

It is time-consuming to use a standard collision detection library. An effective solution is achieved using part of the meshes so that only adjacent surface patches of neighboring meshes are involved by the prior knowledge that collision between femur and tibia occurs only in the femur condyle and tibial plateau. Therefore, computational complexity is significantly reduced by searching only in the surface meshes that potentially show an intersection with the surface of a neighboring mesh. We detect a collision by comparing the distance of each tibial plateau vertex to tibial bottom plane h1 with the distance of the corresponding femur condyle vertex to tibial bottom plane h2 as shown in Fig. 5. Therefore, this algorithm is independent on the flexion of the femur or tibia. The proposed collision detection algorithm is summarized in Fig. 6. Note that Ec does not generally prohibit collisions, but makes points that lie inside other meshes less attractive. A plot of each term and the overall energy function with respect to deviation from the correct pose is shown in Fig. 7.

Fig. 5

Collision detection by comparing the distances to tibia bottom plane.

JMI_8_1_016001_f005.png

Fig. 6

Collision detection algorithm.

JMI_8_1_016001_f006.png

Fig. 7

Energy function versus x axis and y axis translation from the true pose: (a) edge score, Ee; (b) region score, Er; (c) homogeneity score, Eh; and (d) overall hybrid energy function.

JMI_8_1_016001_f007.png

Figure 7(a) illustrates that there are many local minima for the edge score. This is due to the multiple matched poses of the 3-D model and the edge image. The region score in Fig. 7(b) is smoother; however, it has a relatively shallow global minimum. The homogeneity score in Fig. 7(c) has a sharp global maximum, which corresponds to the match between the 3-D model projection and the 2-D x-ray image. Figure 7(d) depicts the overall hybrid energy function. The inclusion of an edge score, region score, homogeneity score, and collision detection score results in a hybrid energy function that is relatively smooth and has an obvious global minimum.

3.

Experiments

The training data came from the same database as our previous studies.21 Computed tomography (CT) scans of the knees were performed in the range from 120-mm proximal to the joint to 120-mm distal to the joint. The bones were scanned at 1- to 2-mm intervals and the volumetric data were interpolated at each 0.5 mm in the transverse plane. Automatic segmentation of the CT image was performed according to our previous study.31 The bone atlas was created using a method previously developed by our research group.2023 Segmented three-dimensional surface mesh models were added to the atlas by adaptation of a template mesh to accurately match an input training model. This adaptation generates accurate correspondence of surface vertices.

The fluoroscopic x-ray system is shown in Fig. 8. Single-plane fluoroscopic x-ray sequences of DKB were acquired for five healthy subjects using a high-frequency pulsated video fluoroscopy unit with an image resolution of 640×480  pixels at 60 Hz. Because the largest knee motion occurs in the flexion and antero-posterior translation, the fluoroscopic x-ray imaging was performed in the sagittal plane as shown in Fig. 8. A perspective projection model is used for the fluoroscope by modeling the fluoroscope machine as an x-ray point source and a planar image receptor where the image is formed. The calibration procedure to remove the “pincushion” or “barrel” effect visible in the raw fluoroscopic image is described in our previous work.13

Fig. 8

Monoplane fluoroscopic x-ray system for DKB.

JMI_8_1_016001_f008.png

All subjects underwent a CT scan of the knee and triangulated CT segmentations (extracted with Avizo 8.1.0, FEI Visualization Sciences Group, Hillsboro, Oregon), segmentation was performed by thresholding the volume using a bone threshold. Slices were then examined to correct the segmentation by matching the bone contour to the boundary of the cortical bone. Output models were used as ground truth shapes. The ground truth pose was derived by 2-D–3-D registration of the ground truth shape to the fluoroscopic x-ray images with manual fitting as the initial pose using the software developed in our previous work.13 The training data in the atlas did not include any of the testing bones related to the five subjects of the fluoroscopic x-ray images.

Two evaluation measures were used for reconstruction and rigid registration, respectively. The reconstruction accuracy was evaluated by the root mean square (rms) error between the reconstructed shape and the ground truth one. Let A be the set of estimated shape and estimated pose points and B be the set of ground truth shape and reference pose points. Supposing they are represented as point sets A={a0,a1,,  an1} and B={b0,b1,,bn1}, we define rms distance as follows:

Eq. (19)

RMS(A,B)=i=0n1(aibi)2n.

To evaluate the best achievable accuracy of the proposed KPCA-based SSM, we calculated the rms error for 3-D–3-D reconstruction of the SSM to the ground truth shape for the femur. The best achievable accuracy for the femur would be 0.6 to 0.7 mm.

The registration accuracy was evaluated by the motion difference with respect to the reference standard in the camera reference frame as defined in Fig. 4. Absolute mean and standard deviation of the motion differences (rotation and translation) between the tracked model and the reference standard were reported. The translation parameters relate the translation of the registered model relative to its starting position in mm, and the rotation parameters relate the angles along the defined coordinate system in degrees from the starting pose.

4.

Results

4.1.

Experiment 1

To demonstrate the ability of the proposed KPCA-based SSM reconstruction of femur and tibia in single-plane fluoroscopic x-ray frames, non-rigid 2-D–3-D registration was conducted on a sequence of closely spaced 2-D fluoroscopic x-ray images (around 600 fluoroscopic x-ray images) for five subjects during DKB. Both shape and pose parameters were initialized before reconstruction to prevent falling into local minima in optimization.26,32,33 The initial shape of the 3-D model was the mean model in the nonlinear SSM. The initial pose of the 3-D model was determined by a template-matching method.26

The reconstruction accuracy, listed in Table 1, was calculated by comparing the reconstructed model to the ground truth model segmented from CT.

Table 1

Reconstruction accuracy using single-place fluoroscopy of femur and tibia for five subjects

SubjectFemur rms error (mm)Tibia rms error (mm)
11.141.33
21.811.25
31.121.01
40.920.92
50.961.23
Average1.19±0.361.15±0.17

The rigid registration accuracy of the reconstructed models to the fluoroscopic x-ray images is summarized in Table 2, where Tx, Ty, and Tz are the mean absolute errors of translations in (x,y,z) axes and Rx, Ry, and Rz are the mean absolute errors of rotations in (x,y,z) axes as defined in Fig. 4.

Table 2

Pose estimation accuracy of femur and tibia.

Translation/rotationFemur mean absolute errorLimits of agreementFemur mean absolute errorLimits of agreement
Tx (mm)0.78±0.620.45±1.901.10±0.710.07±2.37
Ty (mm)0.74±0.720.27±1.961.25±1.050.40±1.59
Tz (mm)1.41±0.470.07±3.841.30±1.610.68±3.72
Rx (deg)0.97±0.680.59±2.170.97±0.820.31±2.39
Ry (deg)1.05±1.060.68±2.740.90±0.810.05±2.25
Rz (deg)0.81±0.640.56±1.720.84±0.880.12±2.35

The surface distance maps for the reconstructed femur and tibia of subject 4 compared with the ground truth shape are shown in Figs. 9 and 10, respectively.

Fig. 9

Femoral surface distance map between 3-D reconstruction and the CT model.

JMI_8_1_016001_f009.png

Fig. 10

Tibial surface distance map between 3-D reconstruction and the CT model.

JMI_8_1_016001_f010.png

The overlays of the projected 3-D model edges on the fluoroscopic x-ray images of the same subject are shown in Fig. 11 and its motion curve is shown in Fig. 12. The solid line represents the SSM fitting results, and the dots are ground truth pose in eight selected frames. The rotation error is very small during the whole sequence of DKB with the exception of early stage of DKB. The wide spread of the z axis rotation (around the medial–lateral axis) during DKB leads to large variation in that axis for both femur and tibia. In contrast, the rotation around the x axis [anterior–posterior (AP) axis] and y axis (proximal–distal axis) is relatively steady.

Fig. 11

3-D model edge overlay on the x-ray images in key poses for femur and tibia during DKB (subject 4).

JMI_8_1_016001_f011.png

Fig. 12

Femur and tibia kinematics of subject 4. (a) Femur and tibia rotation; (b) femur and tibia translation. Solid lines represent KPCA-based SSM fitting results and solid circles represent ground truth pose from manual fitting.

JMI_8_1_016001_f012.png

4.2.

Experiment 2

To compare single-plane reconstruction with biplane reconstruction, an additional AP view image was added to the fluoroscopic x-ray sequence. The AP view image was DRR using CT of the corresponding patient with known poses using the framework developed in our previous work.13 We tested all five patients, reducing the femoral average reconstruction error from 1.19 to 1.04 mm and the tibial average reconstruction error from 1.15 to 1.03 mm, as shown in Table 3.

Table 3

Reconstruction accuracy using biplane reconstruction of femur and tibia for five subjects

SubjectFemur rms error (mm)Tibia rms error (mm)
11.091.22
21.581.24
30.960.89
40.720.78
50.871.04
Average1.04±0.331.03±0.19

The surface distance maps for the reconstructed femur and tibia of subject 4 compared with the ground truth shape are shown in Figs. 13 and 14, respectively.

Fig. 13

Femoral surface distance map between 3-D reconstruction and the CT model with an additional simulated AP view image (subject 4).

JMI_8_1_016001_f013.png

Fig. 14

Tibial surface distance map between 3-D reconstruction and the CT model with an additional simulated AP view image (subject 4).

JMI_8_1_016001_f014.png

4.3.

Experiment 3

We performed leave-one-out experiments on the proposed method to compare reconstruction results between KPCA and linear PCA with different numbers of principal components (PC) σ. The reconstruction accuracy is calculated by rms error for the reconstructed surface shape compared with the ground truth. As shown in Fig. 15, the rms error decreases more quickly with the increase of σ at small numbers of principal components (pc=1 to 10), except bounding cases with large or small σ values. The rms error of σ=200 fluctuates greatly due to the decreased numerical stability for large σ value. As the number of principal components exceeds 10, the rms error stays relatively constant with increasing number of principal components. For a constant number of principal components between 5 and 10, the reconstruction error of KPCA (σ=30, 50, and 70) is smaller than that of PCA and the rms error of PCA approaches the rms error of KPCA with more than 10 principal components.

Fig. 15

Reconstruction error of KPCA and PCA models. The KPCA curve is shown for seven different values of kernel parameter σ.

JMI_8_1_016001_f015.png

The compactness of the model is examined by mapping the cumulative shape variance versus the number of principal components used in PCA and KPCA, as shown in Fig. 16. The cumulative variance of KPCA with different kernel parameters (σ=1, 10, 30, 50, 70, 100, and 200) is compared with that of PCA. As expected, the compactness of KPCA increases as the kernel parameter σ increases, and it approaches a straight line as σ approaches 1. Given a fixed number of principal components, a higher value of σ corresponds to a higher cumulative variance. This is because the Gaussian distribution is wider for a higher value of σ, resulting in more information remaining in the kernel mapping. The trade-off, however, is that a higher value of σ decreases the difference between different data points, decreasing the numerical stability of the KPCA reconstruction.

Fig. 16

Compactness of the statistical models. Percentage of total shape variance versus the number of principal components for SSMs using PCA and KPCA.

JMI_8_1_016001_f016.png

5.

Discussion

We introduced an automatic method for reconstructing both femur and tibia in single-plane fluoroscopic x-ray sequences without the acquisition of prior 3-D imaging. A nonlinear SSM, KPCA, was used for 3-D reconstruction. To validate the proposed method, we report reconstruction results for five subjects conducting DKB sequences. To compare with biplane reconstruction, we added a simulated AP view image for the reconstruction, which significantly reduced the reconstruction error.

The proposed nonlinear SSM-based tracking and reconstruction method has an accuracy of 1.19±0.36  mm for femur and 1.15±0.17  mm for tibia as reported in Table 1. The surface distance map in Fig. 9 illustrates that large errors in femoral 3-D reconstruction occur in the areas that are not visible in 2-D images, such as intercondyle fossa, medial condyle, and the popliteal surface. Error in these areas is not clinically important to overall sizing and PSIs.9 Similarly, large errors in tibial 3-D reconstruction occur in the medial epicondyle and tibial plateau (illustrated in Fig. 10). To prove this assertion, experiment 2 was conducted by adding an AP view image to the single-plane sequence with DRR image to mimic biplanar fluoroscopic x-ray, reducing the average reconstruction error for the femur from 1.19 to 1.04 mm and for the tibia from 1.15 to 1.03 mm.

In rigid registration, the largest errors in femur poses occur in the y axis rotation (around the proximal–distal axis), as reported in Table 2. Because the bone surface has a relatively short distance to this axis, rotations around this axis cause only small changes in the contour shapes. In addition, the shape variation from the SSM may also compensate for this change in y axis rotation. The largest translation error occurs in the out-of-plane translation, because translations in this axis only cause small changes in the contour scale instead of shape changes.

Our reconstruction accuracy is comparable to single-plane or biplane reconstruction results in the literature, despite the fact that our study used the lower resolution fluoroscopic x-ray images. Of the few single-plane reconstruction studies in the literature, Baka et al.11 reported a biplane reconstruction rms error of 1.48 mm on 10 in vivo reconstructions of the distal femur from fluoroscopic x-ray sequences. Whitmarsh et al.34 reported an accuracy of 1.22 mm mean error of 30 femoral reconstructions from single-plane simulated dual-energy x-ray absorptiometry (DXA) images. Zheng35 reported an average error of 1.6 mm on a pelvic reconstruction from a single standard AP x-ray radiograph. More studies are conducted on biplane reconstruction in the literature. Laporte et al.36 reported a biplane reconstruction mean error of 1 mm on eight distal femurs of dry cadavers. An average mean reconstruction error of 1.2 mm was reported by Zheng et al.37 using two views for cadaveric femurs.

Although in this work we present a method to reconstruct the 3-D knee anatomy from single-plane fluoroscopic x-ray sequences, the method can be applied to a multi-view reconstruction as well. A multi-view reconstruction usually achieves higher reconstruction accuracy; however, multi-view fluoroscopic x-ray machines are more expensive and less used in clinics than single-view devices. Additionally, multi-view fluoroscopic x-ray machines may constrain the patient’s motion.

Reconstruction was implemented in C++ on a 3.07 GHz desktop computer with 12.0 GB of RAM. It took about 20 min to reconstruct one bone in the unoptimized implementation. Since the most time-consuming part is the shape optimization on all the frames, computation time may be greatly decreased by conducting this in parallel.

One limitation in this study is the use of a healthy population for model reconstruction. As a result, emphasis is put on the reconstruction accuracy rather than pathology analysis. Osteoporosis might result in a more varied bone shape and impact the reconstruction accuracy, thus making the reconstruction more challenging. In addition, DKB and many other activities may not be feasible in pathological patients. In the future work, we will perform reconstructions on both healthy and osteoporotic patients and on a significantly larger patient population. Collision detection on osteoporotic patients will be conducted in the future study as well. This paper is also more focused on algorithm development rather than clinical study, and we will perform clinical study and kinematics analysis with the fluoroscopic x-ray images in the future.

6.

Conclusion

In conclusion, we present an automatic 3-D reconstruction scheme for reconstruction of knee anatomy from single-plane fluoroscopic x-ray sequences based on a nonlinear SSM. Both distal femur and proximal tibia were reconstructed from the single-plane fluoroscopic x-ray sequence during kinematic activity (e.g., DKB). The proposed method demonstrates great utility in successful elimination of prior 3-D imaging and reduction of manual labor and radiation dose on patient. It also provides more accurate information of the patients in motion than static imaging methods such as static x-ray imaging. The validation results demonstrate the reliability of the proposed method for 3-D bone model reconstruction during DKB with the average reconstruction accuracy of 1.19±0.36  mm for femur and 1.15±0.17  mm for tibia. The proposed method is promising for applications in medical interventions such as patient-specific arthroplasty design, surgical planning, surgical navigation, understanding anatomical and dynamic characteristics of joints, and characterizing joints in motion.

7.

Appendix: Justification for Nonlinear SSM

The linear SSMs using PCA are limited in their applicability to more complicated shape deformations, such as the knee. If the training data form distinct clusters in shape space or if the shapes are not Gaussian-distributed, the linear shape prior tends to mix classes and blur details of the shape information; consequently, the resulting shape prior no longer effectively models the shapes in the training data. We thus analyze the distribution of the training data with linear PCA. The training data are projected onto the first and second principal components of PCA. The distribution of the second principal component projection versus the first principal component projection in Fig. 17 clearly shows distinct clusters in left and right quadrants. Note that if the training set were Gaussian-distributed, then all projections should be Gaussian-distributed as well. Therefore, the training dataset is not represented by a Gaussian distribution.

Fig. 17

First and second principal components in the shape. Each point represents the value of the first and second principal component of PCA in the training dataset.

JMI_8_1_016001_f017.png

To validate this assumption, multiple parametric probability distributions to the first and second principal component projections of the training data were performed. Results show a large gap in the probability density function (PDF) to a Gaussian distribution (Fig. 18). The assumption of Gaussian distribution has some limitation to represent the distribution of the training dataset. Therefore, kernel PCA (KPCA) is used as a nonlinear shape prior to model the knee bone shapes.

Fig. 18

Distribution of the training dataset. Fit of various parametric probability distributions to (a) the first principal component and (b) second principal component of a training dataset for a PCA (linear) SSM.

JMI_8_1_016001_f018.png

Disclosures

The authors have no relevant financial interests in the manuscript and no other potential conflicts of interest to disclose.

Acknowledgments

The authors would like to thank Dr. Joseph Michael Johnson, Dr. Emam E. Abdel Fatah, Dr. H. El Dakhakhni, and Jayne Dadmun for their help in preparing this paper.

References

1. 

A. Sharma, R. D. Komistek and M. R. Mahfouz, “In vivo kinematics evaluation in flexion of patients implanted with primary TKA,” Tech. Knee Sur., 10 (2), 66 –72 (2011). https://doi.org/10.1097/BTK.0b013e31821cabb2 Google Scholar

2. 

D. A. Dennis et al., “In vivo determination of normal and anterior cruciate ligament-deficient knee kinematics,” J. Biomech., 38 (2), 241 –253 (2005). https://doi.org/10.1016/j.jbiomech.2004.02.042 JBMCB5 0021-9290 Google Scholar

3. 

R. E. Ellis et al., “A surgical planning and guidance system for high tibial osteotomy,” Comput. Aided Surg., 4 (5), 264 –274 (1999). https://doi.org/10.3109/10929089909148179 Google Scholar

4. 

D. K. Bae and S. J. Song, “Computer assisted navigation in knee arthroplasty,” Clin. Orthopedic Surg., 3 (4), 259 –267 (2011). https://doi.org/10.4055/cios.2011.3.4.259 Google Scholar

5. 

K. Rathnayaka et al., “Quantification of the accuracy of MRI generated 3D models of long bones compared to CT generated 3D models,” Med. Eng. Phys., 34 (3), 357 –363 (2012). https://doi.org/10.1016/j.medengphy.2011.07.027 MEPHEO 1350-4533 Google Scholar

6. 

A. Neubert et al., “Comparison of 3D bone models of the knee joint derived from CT and 3T MR imaging,” Eur. J. Radiol., 93 178 –184 (2017). https://doi.org/10.1016/j.ejrad.2017.05.042 EJRADR 0720-048X Google Scholar

7. 

N. Baka et al., “2D–3D shape reconstruction of the distal femur from stereo x-ray imaging using statistical shape models,” Med. Image Anal., 15 (6), 840 –850 (2011). https://doi.org/10.1016/j.media.2011.04.001 Google Scholar

8. 

G. Zheng et al., “A 2D/3D correspondence building method for reconstruction of a patient-specific 3D bone surface model using point distribution models and calibrated x-ray images,” Med. Image Anal., 13 (6), 883 –899 (2009). https://doi.org/10.1016/j.media.2008.12.003 Google Scholar

9. 

P. Cerveri et al., “2D/3D reconstruction of the distal femur using statistical shape models addressing personalized surgical instruments in knee arthroplasty: a feasibility analysis,” Int. J. Med. Rob., 13 (4), 1 –13 (2017). https://doi.org/10.1002/rcs.1823 Google Scholar

10. 

C. Baker et al., “CT from an unmodified standard fluoroscopy machine using a non-reproducible path,” Lect. Notes Comput. Sci., 3117 11 –23 (2004). https://doi.org/10.1007/978-3-540-27816-0_2 LNCSD9 0302-9743 Google Scholar

11. 

N. Baka et al., “Statistical shape model-based femur kinematics from biplane fluoroscopy,” IEEE Trans. Med. Imaging, 31 (8), 1573 –1583 (2012). https://doi.org/10.1109/TMI.2012.2195783 ITMID4 0278-0062 Google Scholar

12. 

, “Radiation dose calculator,” https://www.ans.org/pi/resources/dosechart/ Google Scholar

13. 

M. R. Mahfouz et al., “A robust method for registration of three-dimensional knee implant models to two-dimensional fluoroscopy images,” IEEE Trans. Med. Imaging, 22 1561 –1574 (2003). https://doi.org/10.1109/TMI.2003.820027 ITMID4 0278-0062 Google Scholar

14. 

W. A. Hoff et al., “Three-dimensional determination of femoral-tibial contact positions under in vivo conditions using fluoroscopy,” Clin. Biomech., 13 (7), 455 –472 (1998). https://doi.org/10.1016/S0268-0033(98)00009-6 CLBIEW 0268-0033 Google Scholar

15. 

P. Markelj et al., “A review of 3D/2D registration methods for image-guided interventions,” Med. Image Anal., 16 (3), 642 –661 (2012). https://doi.org/10.1016/j.media.2010.03.005 Google Scholar

16. 

T. F. Cootes et al., “Active shape models—their training and application,” Comput. Vision Image Understanding, 61 (1), 38 –59 (1995). https://doi.org/10.1006/cviu.1995.1004 Google Scholar

17. 

J. Yao and R. Taylor, “Deformable 2D-3D medical image registration using a statistical model: accuracy factor assessment,” Am. J. Sci. Eng., 1 (2), 1 –13 (2012). Google Scholar

18. 

M. Lüthi et al., “Gaussian process morphable models,” IEEE Trans. Pattern Anal. Mach. Intell., 40 (8), 1860 –1873 (2018). https://doi.org/10.1109/TPAMI.2017.2739743 ITPIDJ 0162-8828 Google Scholar

19. 

S. Balestra et al., “Articulated statistical shape model-based 2D-3D reconstruction of a hip joint,” Lect. Notes Comput. Sci., 8498 128 –137 (2014). https://doi.org/10.1007/978-3-319-07521-1_14 LNCSD9 0302-9743 Google Scholar

20. 

M. R. Mahfouz et al., “Patella sex determination by 3D statistical shape models and nonlinear classifiers,” Forensic Sci. Int., 173 161 –170 (2007). https://doi.org/10.1016/j.forsciint.2007.02.024 FSINDR 0379-0738 Google Scholar

21. 

M. R. Mahfouz et al., “Automatic methods for characterization of sexual dimorphism of adult femora: distal femur,” Comput. Methods Biomech. Biomed. Eng., 10 (6), 447 –456 (2007). https://doi.org/10.1080/10255840701552093 Google Scholar

22. 

E. E. A. Fatah et al., “A three-dimensional analysis of bilateral directional asymmetry in the human clavicle,” Am. J. Phys. Anthropol., 149 547 –559 (2012). https://doi.org/10.1002/ajpa.22156 Google Scholar

23. 

E. E. A. Fatah et al., “Improving sex estimation from crania using a novel three-dimensional quantitative method,” J. Forensic Sci., 59 (3), 590 –600 (2014). https://doi.org/10.1111/1556-4029.12379 Google Scholar

24. 

S. Mika et al., “Kernel PCA and de-noising in feature spaces,” in Adv. Neural Inf. Process. Syst., 536 –542 (1999). Google Scholar

25. 

J. T. Y. Kwok and I. W. H. Tsang, “The pre-image problem in kernel methods,” IEEE Trans. Neural Networks, 15 (6), 1517 –1525 (2004). https://doi.org/10.1109/TNN.2004.837781 ITNNEP 1045-9227 Google Scholar

26. 

J. Wu, E. E. A. Fatah and M. R. Mahfouz, “Fully automatic initialization of two-dimensional–three-dimensional medical image registration using hybrid classifier,” J. Med. Imaging, 2 (2), 024007 (2015). https://doi.org/10.1117/1.JMI.2.2.024007 JMEIET 0920-5497 Google Scholar

27. 

G. H. Bakir, J. Weston and B. Schölkopf, “Learning to find pre-images,” in Proc. Adv. Neural Inf. Process. Syst., 449 –456 (2003). Google Scholar

28. 

B. Schölkopf and A. Smola, Learning with Kernels, MIT Press, Cambridge, MA (2002). Google Scholar

29. 

W. Yu, “The convergence property of the simplex evolutionary techniques,” Sci. Sin., 1 66 –67 (1979). Google Scholar

30. 

T. Netsch et al., “Towards real-time multi-modality 3-D medical image registration,” in Proc. 8th IEEE Int. Conf. Comput. Vision, 718 –725 (2001). https://doi.org/10.1109/ICCV.2001.937595 Google Scholar

31. 

J. M. Johnson, “Analysis, segmentation and prediction of knee cartilage using statistical shape models,” Department of Mechanical, Aerospace, and Biomedical Engineering, University of Tennessee, (2013). Google Scholar

32. 

J. Wu and M. R. Mahfouz, “Robust x-ray image segmentation by spectral clustering and active shape model,” J. Med. Imaging, 3 (3), 034005 (2016). https://doi.org/10.1117/1.JMI.3.3.034005 JMEIET 0920-5497 Google Scholar

33. 

J. Wu, “2D-3D registration of knee joint from single plane z-ray fluoroscopy using nonlinear shape priors,” Department of Mechanical, Aerospace, and Biomedical Engineering, University of Tennessee, (2016). Google Scholar

34. 

T. Whitmarsh et al., “3D bone mineral density distribution and shape reconstruction of the proximal femur from a single simulated DXA image: an in vitro study,” Proc. SPIE, 7623 76234U (2010). https://doi.org/10.1117/12.844110 PSISDG 0277-786X Google Scholar

35. 

G. Zheng, “Statistical shape model-based reconstruction of a scaled, patient-specific surface model of the pelvis from a single standard AP x-ray radiograph,” Med. Phys., 37 (4), 1424 –1439 (2010). https://doi.org/10.1118/1.3327453 MPHYA6 0094-2405 Google Scholar

36. 

S. Laporte et al., “A biplanar reconstruction method based on 2D and 3D contours: application to the distal femur,” Comput. Methods Biomech. Biomed. Eng., 6 (1), 1 –6 (2003). https://doi.org/10.1080/1025584031000065956 Google Scholar

37. 

G. Zheng et al., “A 2D/3D correspondence building method for reconstruction of a patient-specific 3D bone surface model using point distribution models and calibrated x-ray images,” Med. Image Anal., 13 (6), 883 –899 (2009). https://doi.org/10.1016/j.media.2008.12.003 Google Scholar

Biography

Jing Wu received her bachelor’s and master’s degrees in material science and engineering from Shanghai Jiao Tong University in 2005 and 2008, respectively, and her PhD in biomedical engineering from the University of Tennessee, Knoxville, Tennessee, in 2016. She is a research engineer at TechMah Medical LLC. She has published 1 monograph, 15 journal publications, and 12 conference articles and abstracts. Her research interests include machine learning, deep learning, optimization, computer vision, signal processing, and medical imaging.

Mohamed R. Mahfouz is a professor of biomedical engineering at the University of Tennessee, Knoxville, Tennessee. His research interest is in medical imaging and wireless sensors in the biomedical field. He has more than 140 journal articles published in distinguished international journals, more than 350 abstracts and short papers accepted for publication at conferences, and has given more than 40 guest lectureships worldwide.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Jing Wu and Mohamed R. Mahfouz "Reconstruction of knee anatomy from single-plane fluoroscopic x-ray based on a nonlinear statistical shape model," Journal of Medical Imaging 8(1), 016001 (11 January 2021). https://doi.org/10.1117/1.JMI.8.1.016001
Received: 25 January 2020; Accepted: 23 October 2020; Published: 11 January 2021
Lens.org Logo
CITATIONS
Cited by 8 scholarly publications.
Advertisement
Advertisement
KEYWORDS
3D modeling

X-rays

X-ray imaging

3D image processing

Principal component analysis

Bone

X-ray computed tomography

Back to Top