Virtual Mechanical Testing from Low-Dose CT Scans Predicts Tibial Fracture Time to Union and Outperforms Subjective Outcomes Scoring


Similarly, pilot data for the Warwick Hip Trauma Evaluation (WHiTE) trial indicated that nearly 1,000 patients would be needed to detect a clinically significant difference in EuroQol 5-Dimension (EQ-5D) scores between implant groups [5]. PROMs data can also lead to non-inferiority conclusions, such as in a recent trial of immediate versus delayed weight-bearing after tibial nailing [6].

In contrast to PROMs, radiographic assessments offer the promise of objectivity, but historically have been limited by concerns with reliability [7]. In response to this need, the Radiographic Union Scale for Tibial Fractures (RUST) score was developed as a structured semi-quantitative method for assessing callus. RUST scoring has demonstrable reliability [8,9], has been adapted for use in distal femur fractures [10], has been used to diagnose nonunion fractures (RUST > 10 for union) [11-13], and has been adopted in the design of large randomized controlled trials [14]. Despite this wide use, no data currently exist to assess whether RUST scores are a reliable measure of structural bone healing in clinical, not preclinical, application.

Accordingly, the objective of this study was to assess bone healing in a sequential cohort of tibial fracture patients using a comprehensive suite of radiographic and patient-reported outcome measures and to critically evaluate these instruments with reference to a new objective biomechanical gold standard: the torsional rigidity of the fractured limb relative to the intact bone, derived using patient-specific image-based finite element models from low-dose CT scans.

... followed up at 6, 12, 18, and 24 weeks or until clinical union, which was defined as radiographic union with pain-free ambulation. All cases were also reviewed at least one year after surgery.

Outcome measures included EQ-5D and NRS pain scores, RUST scores, and quantitative CT-derived morphometric and structural measures of callus. To minimize bias, RUST scoring was completed by the senior clinician using blinded radiographs that were randomly shuffled such that films from each follow-up were not presented in chronological order or grouped by patient. RUST scores were repeated on the blinded shuffled films by the same rater after a period of four months to assess intra-rater agreement. None of the CT-derived measures were made available for comparison to the RUST scores or clinical findings until after all evaluations were completed.
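The blinding and shuffling step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure the authors used: the function name `blind_and_shuffle`, the `IMG001`-style codes, and the patient identifiers are all hypothetical.

```python
import random

def blind_and_shuffle(films, seed=None):
    """Shuffle films and replace each with an anonymous code.

    `films` is a list of (patient_id, week) tuples. Returns the list of
    anonymous codes in presentation order, plus a key that maps each code
    back to its film. The key is withheld from the rater until scoring ends.
    """
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    order = list(films)
    rng.shuffle(order)                 # breaks chronological and per-patient grouping
    key = {}
    blinded = []
    for i, film in enumerate(order, start=1):
        code = f"IMG{i:03d}"           # hypothetical anonymous label
        key[code] = film
        blinded.append(code)
    return blinded, key

# Hypothetical cohort: three patients, four follow-ups each.
films = [(p, w) for p in ("P01", "P02", "P03") for w in (6, 12, 18, 24)]
blinded, key = blind_and_shuffle(films, seed=42)
```

Repeating the call with a different seed would give the second, independently shuffled presentation needed for an intra-rater comparison.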

The study design was reviewed and approved by the local Institutional Review Board, and all patients provided written informed consent.

Funding to support study management was provided by a grant from OrthoXel, DAC.

Complete descriptive statistics for all EQ-5D component scores and the NRS pain scores are shown in Table 2, and selected statistically or clinically meaningful temporal trends are illustrated in Figure 3. Patients tended to report significantly improved mobility, capacity for self-care, engagement in usual activities, and general health over time. Post-hoc testing showed that this trend was generally only statistically significant between the first and the last time points. One exception was the usual activities score, where patients indicated the greatest difficulty at the six-week follow-up, which was significantly different from the scores reported at all later time points. Pain scores were very low for most patients and were steady over time, both in the EQ-5D pain component score and on the NRS pain scale.

RUST scores at the four follow-ups are also shown in Figure 3, together with time to union.

The intra-class correlation coefficient, ICC(3,1), for test-retest reliability of RUST scoring with consistency effects (two-way random single measures procedure in SPSS) was 0.727 (95% CI 0.597-0.820). RUST scores significantly increased over time (p = 0.001) and, after six weeks, became non-normally distributed with a notable ceiling effect, with at least 75% of patients achieving RUST ≥ 10 from 18 weeks onward.
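For readers who want to see how a single-measures consistency ICC is computed, the sketch below applies the standard two-way ANOVA formula, ICC = (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error), where rows are films and the k columns are rating sessions. It is an illustration of that formula, not a reproduction of the SPSS procedure used in the study, and the ratings shown are hypothetical rather than study data.

```python
def icc_consistency_single(scores):
    """Single-measures consistency ICC from a two-way ANOVA decomposition.

    `scores` is a list of rows, one per rated film, each holding k ratings
    (for test-retest, k = 2: first and repeat scoring sessions).
    """
    n = len(scores)                    # number of films (targets)
    k = len(scores[0])                 # number of rating sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical RUST scores (range 4-12) for six films, scored twice.
ratings = [[4, 5], [7, 7], [9, 8], [10, 11], [12, 12], [6, 8]]
icc = icc_consistency_single(ratings)
```

Because this is a consistency measure, a constant offset between the two sessions does not lower the coefficient; only disagreement in the ordering and spacing of films does.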

Complete morphometric data for each patient-specific model are provided in Table 3, with selected values illustrated in Figure 4. For each patient-specific model, the fractured limb VTR was normalized by that patient's own reconstructed intact tibia to produce a dimensionless indicator of healing relative to the pre-injury state: normalized VTR. Across all patients, median normalized VTR was 99% (IQR 86%-113%) at 12 weeks.
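The normalization and the median/IQR summary can be illustrated with a short Python sketch. The function names and the input values are hypothetical, and the quartile method (`inclusive`) is an assumption; the source does not state which quartile convention was used.

```python
from statistics import quantiles

def normalized_vtr(fractured_vtr, intact_vtr):
    """Fractured-limb VTR as a percentage of the same patient's intact VTR."""
    if intact_vtr <= 0:
        raise ValueError("intact VTR must be positive")
    return 100.0 * fractured_vtr / intact_vtr

def summarize(values):
    """Median and quartiles of a list of normalized VTR percentages."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": med, "q1": q1, "q3": q3}

# Hypothetical (fractured, intact) VTR pairs in any consistent unit;
# the unit cancels in the ratio, so the result is dimensionless.
pairs = [(4.1, 5.0), (5.3, 5.1), (6.0, 5.2), (4.6, 4.9), (5.8, 5.0)]
percentages = [normalized_vtr(f, i) for f, i in pairs]
summary = summarize(percentages)
```

Values above 100% arise whenever the callus makes the virtual fractured limb stiffer in torsion than the reconstructed intact tibia, which is consistent with the upper quartile reported in the text exceeding 100%.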

Relationships between PROMs, RUST scores, and CT-derived properties at 12 weeks

This study also suggests some potential difficulties with the widely used RUST score.

Mobility and Usual Activities components were recorded on an interval scale (1-5) with 1 representing "no difficulty". The EQ-5D Health VAS score was recorded on an interval scale (0-100) with 100 being "best imaginable health state". The NRS Pain Score was also recorded on an interval scale (0-10) with 10 being "worst pain imaginable". RUST scores and time to union were assessed by the senior clinician.