Impact of surgeons’ experience on accuracy of radiographic segmental kyphosis assessment in thoracolumbar fractures: a prospective observational study

Background: The thoracolumbar region is where most fractures of the spine are located. Segmental kyphosis is an important factor for treatment decisions. There are various methods for measuring segmental kyphosis in thoracolumbar fractures. Our objective was to evaluate whether the experience of the surgeon has any influence on kyphosis measurement by analyzing three different categories of orthopedic surgeons, and to evaluate possible clinical impacts.
Material and methods: Six physicians separated into three categories according to level of experience evaluated 30 lateral view radiographs of the thoracic spine of patients with single-level fracture taken during their outpatient follow-up visits. Segmental kyphosis was measured on each image by five distinct methods. The x-rays were evaluated twice, in a random order, with an eight-week interval between evaluations. The reproducibility of the measurements was analyzed by the intraclass correlation coefficient (ICC) and its respective 95% confidence interval.
Results: The ICC was calculated to evaluate the inter- and intra-examiner reliability for each method. The methods that disregard the fractured vertebra (1 and 4) achieved the highest intra- and inter-observer reliability among the participants. The measurements from methods 3 and 5 were poorly reproducible between examiners. The difference between the averages of the measurements of the five methods studied was greater than 5 degrees in methods 1 and 2, suggesting risk for patient safety.
Conclusion: Methods that exclude the fractured vertebra were more reproducible for the evaluation of segmental kyphosis in thoracolumbar fractures. The evaluation of the spine fracture must be coupled with other radiographic criteria, more complex imaging exams and the patient’s clinical state to assist the surgeon in deciding between conservative or surgical treatment. The authors suggest that the measurements should be performed by methods that exclude the fractured vertebra and conducted by experienced doctors.


Background
The thoracolumbar transition is a region of biomechanical stress and is the location of most fractures of the spine [1]. The predominance of fractures at this location is explained by the contrast between the rigid thoracic kyphosis and the flexible lumbar lordosis. These fractures follow a bimodal age distribution, occurring predominantly in young males after high-energy trauma and in the elderly after minor trauma. The main causes of these fractures are vehicle accidents, falls and sports trauma. A high incidence due to gunshot wounds has also been observed [2,3].
Several radiographic parameters are used to guide treatment and to determine the prognosis of these lesions, such as sagittal alignment of the affected segment, percentage of spinal canal compromise, translation of the vertebral body and scoliosis of the region involved. There are various methods for measuring segmental kyphosis after thoracolumbar fractures, such as Cobb angle measurement, Gardner's method, wedging of the fractured vertebra and others, but none of these methods has been evaluated for reproducibility between groups of surgeons with different levels of experience. These measurements are extremely important in the initial evaluation of patients, and the therapeutic decision depends directly on reliable and reproducible measures.
Studies report that a segmental kyphosis greater than 30 degrees is most likely a consequence of posterior ligamentous complex disruption, which indicates surgical treatment of the fracture [4][5][6]. However, the angle value may differ depending on the technique used or the experience of the surgeon. The present study aimed to evaluate and compare five measurement methods of segmental kyphosis obtained by different experience categories of orthopedic surgeons to determine if the level of experience has an impact on these measures and if the fracture morphology affects the measurement methods.

Methods
This study was previously submitted and approved by the ethical committee of this institution.
A sample size calculation was performed using the intraclass correlation coefficient and SPSS® version 17.0. Thirty lateral view radiographs with single fractures between T11 and L2 were evaluated.
All 30 patients included in the study were seen during outpatient follow-up visits at this institution. The professionals who participated in this study considered the printed images of good quality, and there was no need to carry out additional tests, as the images were used as part of the initial assessment of the trauma patient.
The images were evaluated by professionals with different levels of experience in Orthopedics and Spine Surgery, with two members in each of the following categories: orthopedics residents with up to three years of experience, named OR; fellows specializing in spinal surgery with up to five years of experience, named SF; and spinal surgeons with at least ten years of experience, named SS.
For measuring kyphosis, five methods described in the literature were used [7][8][9][10][11][12]. The first method (Figure 1a) uses the superior endplate of the vertebra above and the inferior endplate of the vertebra below the fractured one, yielding the Cobb angle. The second method (Figure 1b) uses the superior endplate of the vertebra above and the inferior endplate of the fractured vertebra, yielding the Gardner segment deformity; the third method (Figure 1c)
Examiners received a brief introductory training prior to the first rating session. Each examiner received printed radiographs numbered from one to thirty and a sheet containing a draft of the five techniques to evaluate segmental kyphosis (Figure 1), in addition to pencils, an eraser and a standard goniometer.
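Whatever technique is used, the goniometer reading is the angle between two reference lines drawn on the radiograph. The following is only an illustrative sketch of that geometry, not the procedure used in the study (which was manual, on printed films). The function names and the landmark coordinates are hypothetical, and the sign convention (kyphosis versus lordosis) is deliberately left out.

```python
import math

def line_angle_deg(p1, p2):
    """Orientation of the line through p1 and p2, in degrees."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def angle_between_lines(line_a, line_b):
    """Unsigned angle between two undirected lines, in the range 0 to 90 degrees."""
    diff = abs(line_angle_deg(*line_a) - line_angle_deg(*line_b)) % 180.0
    return min(diff, 180.0 - diff)

# Hypothetical endplate landmarks as (x, y) pixel coordinates.
# Method 1 would take the superior endplate of the vertebra above and the
# inferior endplate of the vertebra below the fracture.
superior_endplate_above = ((100.0, 200.0), (180.0, 200.0))
inferior_endplate_below = (
    (100.0, 320.0),
    (180.0, 320.0 + 80.0 * math.tan(math.radians(20.0))),  # tilted by 20 degrees
)

cobb = angle_between_lines(superior_endplate_above, inferior_endplate_below)
print(round(cobb, 1))  # 20.0
```

The same function applies to any pair of lines, so the methods differ only in which endplates or vertebral body tangents are chosen as landmarks.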
After the first measurement, the x-rays were randomly rearranged in a different order, and the participants were asked to repeat the measurements after eight weeks.
Patients with pathological fractures and those who had more than one level spine fracture were excluded.

Results
The intraclass correlation coefficient (ICC) and its respective 95% confidence interval were calculated to evaluate the inter- and intra-examiner reliability for each method.
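The text does not state which form of the ICC was computed in SPSS. As a hedged illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-rating form, often written ICC(2,1); the function name and the angle values are hypothetical, and the confidence interval is not computed.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    ratings: one row per radiograph, one column per examiner
    (or per rating session, for intra-examiner reliability).
    """
    n = len(ratings)       # number of radiographs
    k = len(ratings[0])    # number of examiners or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical kyphosis angles (degrees): five radiographs, two examiners
angles = [
    [12.0, 14.0],
    [25.0, 23.0],
    [18.0, 19.0],
    [31.0, 28.0],
    [9.0, 11.0],
]
print(round(icc_2_1(angles), 3))
```

Values close to 1 indicate that most of the variance comes from real differences between radiographs rather than from disagreement between examiners or sessions.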

Intraclass correlation
OR 1 had the greatest correlations with methods 1 and 2 and the smallest angular differences between the first and second measurements with methods 1 and 4. OR 2 obtained the greatest correlations and the smallest angular differences with methods 1 and 4. The third method yielded the lowest correlation and the greatest angular difference between the two measurements (Table 1). SF 1 yielded the highest correlations and the smallest angular differences between the first and second measurements with methods 1 and 2. SF 2 showed higher concordance in methods 1 and 2 as well, but the smallest angular differences between the two measurements were obtained with methods 1 and 4. Method 3 also had the lowest intraclass reproducibility (Table 2).
The SS showed more uniform results, and the greatest correlations were observed for methods 1 and 4. The smallest angular differences between the first and second measurements were also obtained with these methods. Once again, method 3 had low reproducibility and the largest angular difference (Table 3).

Interclass correlation
Methods 1 and 4 showed greater reliability among SF and SS, being superior to others in these two groups of examiners.
As for the OR group, method 2 demonstrated the greatest interclass reliability, followed by method 1. Method 3 had the lowest correlation and was inferior to the other methods in all categories of evaluation in this study (Table 4).
The correlation value obtained between the variability of the averages of the methods by the experts (standard deviation between the methods) and the percentage of height loss was r = 0.216 (p = 0.271). Therefore, there is no statistically significant relationship between the loss of vertebral body height and the discrepancy between the averages of the measurements of each method (Figure 2).
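As a rough illustration of this analysis, the sketch below computes, for each radiograph, the standard deviation across the five method averages and then the Pearson correlation of that spread with the percentage of height loss. All values are hypothetical, the use of the sample standard deviation is an assumption, and no p-value is computed.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: per radiograph, the average angle (degrees) obtained
# with each of the five methods, and the percentage of height loss.
method_averages = [
    [18.0, 24.0, 15.0, 17.0, 21.0],
    [10.0, 12.0, 9.0, 10.5, 14.0],
    [26.0, 31.0, 20.0, 25.0, 28.0],
    [14.0, 15.0, 13.0, 14.5, 16.0],
]
height_loss_pct = [35.0, 20.0, 30.0, 40.0]

# Spread between methods for each radiograph (sample standard deviation)
spread = [statistics.stdev(row) for row in method_averages]
print(round(pearson_r(spread, height_loss_pct), 3))
```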
The difference between the averages of the measurements of the five methods studied was calculated (Table 5). Between some methods, this difference was greater than 5 degrees, suggesting a risk to patient safety because surgery could be indicated if the kyphosis angle were considered alone. However, other criteria are also critical for surgical indication.
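The pairwise comparison of method averages can be sketched as follows. The mean angles are hypothetical and do not reproduce Table 5; only the 5-degree threshold comes from the text.

```python
from itertools import combinations

THRESHOLD_DEG = 5.0  # difference considered clinically relevant in the text

def large_differences(method_means, threshold=THRESHOLD_DEG):
    """Return (method_a, method_b, difference) for pairs exceeding the threshold."""
    flagged = []
    for (a, mean_a), (b, mean_b) in combinations(sorted(method_means.items()), 2):
        diff = abs(mean_a - mean_b)
        if diff > threshold:
            flagged.append((a, b, diff))
    return flagged

# Hypothetical mean kyphosis angle (degrees) per method
hypothetical_means = {1: 18.0, 2: 24.5, 3: 15.0, 4: 17.0, 5: 21.0}

for a, b, diff in large_differences(hypothetical_means):
    print(f"methods {a} and {b}: {diff:.1f} degrees")
```

With a decision threshold such as 30 degrees of segmental kyphosis, a difference of this size between two methods can be enough to move the same fracture from one side of the threshold to the other.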

Discussion
In this study, methods 1 and 4 were more reproducible among most of the participant surgeons. We believe that comparing measurement methods between examiners with different levels of experience is a key distinguishing feature of this study because, in most university services, the first evaluation is performed by professionals who are not specialists in spine surgery.
Kuklo et al. [8] compared different methods of measurement of segmental kyphosis in thoracolumbar fractures but did not carry out comparative analysis between the measurements performed by examiners with different levels of experience. They compared the measurements performed by two orthopedists and a neurosurgeon, noting that the Cobb angle method was more reliable within and between groups of examiners.
Plain radiography is the first imaging modality used in trauma and should provide important information that, when combined with more complex tests and with clinical examination of the patient, indicates the most appropriate therapeutic protocol. Post-traumatic kyphosis is an important indicator of prognosis and treatment of thoracolumbar fractures because an increase in this angle is directly related to the instability of the fracture [13][14][15]. In addition, studies report a possible association between the kyphotic deformity and residual back pain, making this a crucial element in the indication of surgical treatment for these fractures [16][17][18].
We observed that the methods that disregard the fractured vertebra (methods 1 and 4) were more uniform and consequently had greater agreement within and between groups of professionals. This is because some fractures involve one of the vertebral endplates or cause extensive destruction of the vertebral body, making it difficult to determine the correct reference lines and leading to results with high angular variability. The methods that take the fractured vertebra into consideration are prone to mistakes and variability, but in most thoracolumbar fractures, the superior endplate is more affected than the inferior endplate. Therefore, the second method, which uses the inferior endplate of the fractured vertebra, was found to have good reliability. The measurements from methods 3 and 5 were poorly reproducible between examiners.
Another factor that can lead to errors during measurement is the presence of osteophytosis in the posterior region of the vertebral endplate. The measurement may change due to the presence of a bony prominence that often distorts the flat surface of the endplate, causing significant errors [10]. This was one of the factors causing the most disagreement among the residents. Therefore, it is recommended to ignore this posterior bone elevation often found in x-rays (Figure 3).
An example from this study shows an L1 fracture in a patient injured in an automobile accident. The segmental kyphosis was measured by the five methods described previously. Figure 4 shows the results of measurements performed by a spine surgeon. In this case, only method 2 (Figure 4b) showed segmental kyphosis greater than thirty degrees, and all other methods showed lower values of kyphosis (Figure 4a, c, d and e). Since the patient had no other signs of instability, and the most reliable methods (1 and 4) showed segmental kyphosis lower than thirty degrees, conservative treatment was prescribed. The patient was treated with a Jewett vest for 12 weeks, with fracture union and no complications.
Currently, the TLICS (Thoracolumbar Injury Classification and Severity Score) proposed by Vaccaro et al. [19] provides a new perspective in the evaluation of fractures and helps with the therapeutic decision. It is based on the morphology of the fracture, ligament and disk complex injury and the neurological status of the patient. Radiographic analysis provides important data that suggest the severity of the injury and indirectly indicate its prognosis, but it must always be complemented by other imaging tests, a detailed history that includes the trauma mechanism, a neurological examination and the patient's comorbidities to ultimately determine the best form of treatment.
Therefore, we believe that the evaluation of segmental kyphosis from lateral x-rays of the spine must be coupled with other radiographic criteria, more complex imaging exams and the patient's clinical state to assist the surgeon in deciding between conservative and surgical treatment.

Conclusion
Methods 1 and 4 were more easily reproducible for the evaluation of segmental kyphosis in thoracolumbar fractures among the examiners who participated in the study. No relationship between the loss of anterior vertebral body height and the discrepancy between the averages of the measurements of each method was observed.
The improvement of measurement methods makes it possible to obtain more reliable measures, regardless of the surgeon's experience, facilitating communication and homogenizing decisions. The authors suggest that the measurements should be performed by methods that exclude the fractured vertebra and conducted by experienced doctors.