Three-dimensional models increase the interobserver agreement for the treatment of proximal humerus fractures
Patient Safety in Surgery volume 14, Article number: 33 (2020)
The agreement for the treatment of proximal humerus fractures is low. Interpretation of exams used for diagnosis can be directly associated with this limitation. This study proposes to compare the agreement between experts and residents in orthopedics for treatment indication of proximal humerus fractures, utilizing 3D-models, holography (augmented reality), x-rays, and tomography as diagnostic methods.
Twenty orthopedists (ten experts in shoulder and elbow surgery and ten experts in traumatology) and thirty resident physicians in orthopedics evaluated nine fractures of the proximal humerus, randomly distributed as x-rays, tomography, 3D-models and holography, using the Neer and AO/OTA classifications. Afterwards, we evaluated the interobserver agreement on treatment options (conservative, osteosynthesis and arthroplasty) and whether the evaluators' experience influenced the results.
The interobserver agreement analysis showed the following kappa-values: κ = 0.362 and κ = 0.306 for experts and residents (3D-models); κ = 0.240 and κ = 0.221 (X-ray); κ = 0.233 and κ = 0.123 (Tomography) and κ = 0.321 and κ = 0.160 (Holography), for experts and residents respectively. Moreover, residents and specialists were discordant in the treatment indication using Tomography as a diagnostic method (p = 0.003). The same was not seen for the other diagnostic methods (p > 0.05).
Three-dimensional models showed, overall, the highest interobserver agreement (experts versus residents in orthopedics) for the choice of treatment of proximal humerus fractures compared to X-ray, Tomography, and Holography. Agreement in the choice of treatment among experts who used Tomography and Holography as diagnostic methods was two times higher than among residents.
Registered in Brazil Platform under no. CAAE 12273519.7.0000.5505.
Proximal humerus fractures are common in orthopedic practice and are likely to become more prevalent with increased life expectancy and the association with osteoporosis. Although these fractures are routine in orthopedic practice, the understanding of their different patterns, the number of associated injuries, their classification, and the proposed treatment remains uncertain. The diversity of treatments has been discussed as a relevant subject in studies involving traumatology and shoulder surgery [1,2,3].
The interpretation of fractures in the proximal humerus, as of many other fractures, depends on complementary diagnostic tests (usually x-ray and/or tomography) and on the correlation with pre-existing classifications. The most widespread and well-known classifications are that of Charles Neer, published in 1970 [4, 5], and that of the AO/OTA group (Arbeitsgemeinschaft für Osteosynthesefragen). However, several studies demonstrate low agreement for intra- and interobserver reproducibility and for the correlation between the diagnosis, classification, and therapeutic proposal for these lesions [2, 3, 7, 8]. These limitations encourage new studies to improve the known classifications or to develop alternative diagnostic methods. Recently, Russo et al., Mothes et al., and You et al. used 3D-models to improve the surgical planning of proximal humerus fractures and reported good results. Awan et al., using three-dimensional models to present acetabular fractures to resident physicians, reported an improvement in the understanding of the particularities of this fracture. In a recent publication from our group, we suggested a relevant improvement in the diagnostic agreement among specialists and residents in orthopedics utilizing 3D-models for proximal humerus fractures compared to x-rays and tomographies. In addition, we presented the use of augmented reality (holography) as a diagnostic method that could reliably reproduce the characteristics of these fractures.
In this work, we evaluate the interobserver agreement on the choice of the best treatment strategy for proximal humerus fractures across four diagnostic methods (x-rays, tomographies, 3D-models, and holography).
This was an observational, cross-sectional study in which proximal humerus fractures were presented as digital x-rays, tomography, 3D-models, and augmented reality to two groups of physicians (Groups 1 and 2). The images were presented in random order, and each group was shown all four exam types. The groups were unable to discriminate among the exams during the evaluations.
Sample size determination
A sample size of 9 cases was determined by statistical analysis, to obtain a 95% confidence interval, with an amplitude of 0.40 for a kappa concordance coefficient estimated at 0.70. A standard deviation of 0.30 was assumed for calculations [14,15,16].
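As a rough check of the figures above, the sketch below reproduces the arithmetic under a normal approximation. It assumes that the stated amplitude of 0.40 is the total width of the 95% confidence interval (a half-width of 0.20) and that the standard deviation of 0.30 applies per case. The function name cases_for_ci_width is hypothetical, and this is not necessarily the exact procedure of the cited references.

```python
import math

def cases_for_ci_width(sd, width, z=1.96):
    """Smallest n for which a normal-approximation confidence interval
    of half-width z * sd / sqrt(n) fits inside the requested total width."""
    half_width = width / 2
    return math.ceil((z * sd / half_width) ** 2)

# Values taken from the text: sd = 0.30, total CI width = 0.40, 95% level.
n_cases = cases_for_ci_width(sd=0.30, width=0.40)
print(n_cases)  # prints 9
```

Under these assumptions the unrounded value is about 8.64, which rounds up to the nine cases used in the study. Note that the estimated kappa of 0.70 does not enter this simplified calculation.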
The groups were identified at the time of evaluation as follows:
Group 1: Twenty experts in shoulder or traumatology from the Brazilian Society of Shoulder and Elbow Surgery (SBCOC) and Brazilian Society of Orthopedic Trauma (SBTO), respectively;
Group 2: Thirty resident physicians in orthopedics and traumatology from the Department of Orthopedics and Traumatology, UNIFESP / EPM, attending the first, second, or third year of the course.
Likewise, the observers were not identified and were not exposed during the study period.
The x-ray and tomography images of proximal humerus fractures originated from the database of the Hospital Samaritano de São Paulo, Americas Medical Service. They were used for the 3D-model and holography reconstruction through specific software used by BioArchitects Company, which was donated for the study. We used the Objet350 Connex 3 printer, with a speed of 12 mm/hour and 16 μm layers, compatible with Windows 7 and 8. The pieces were printed in resin (photopolymer), with high resolution and in real size, within an average of two hours and thirty minutes per model. The three-dimensional printed models faithfully reproduced the original characteristics of the fractures, such as the number of fragments and the displacement between them, bone loss, and humeral head involvement.
No patient identification information was used, in order to guarantee confidentiality, so we requested an exemption from the informed consent form.
To evaluate the proximal humerus fractures through the holograms, HoloLens glasses were provided, with the hologram properly positioned on the lens according to the user's viewing angle (Fig. 1).
Biomodels are replicas of patients’ anatomical parts, a three-dimensional model identical to the original. The 3D-models reconstruction, also known as prototyping, is the end product of this process (Fig. 2). Each of the evaluated proximal humerus fractures went through this process, originating the models used for the assessment.
The researchers selected 9 fractures based on the quality of the radiographic images and whether they presented the complete tomographic sequences. Adults (bone growth plate closed) of both sexes were included, without restrictions on laterality. Images with suspected pathological (neoplastic) fractures, infectious diseases, previous fractures in the proximal humerus, congenital deformities, or morphological alterations were not included.
We decided to use only Groups A, B, and C as adopted by the AO/OTA, corresponding to 2, 3, and 4 parts, respectively, as published in the Journal of Orthopedic Trauma in 2018. We made this decision because there was no objective correspondence between the AO/OTA subtypes (A1.1, A1.2, A2.1, A2.2, etc.) and the Neer classification.
Therefore, we obtained the following distribution:
Three fractures in 02 parts or 11A;
Three fractures in 03 parts or 11B or 11C;
Three fractures in 04 parts or 11C.
During the analysis of the images and the completion of the questionnaires, the two groups received both classifications in a table, which could be consulted throughout the evaluation, helping the observers choose the answers they judged compatible with the exams presented (Figs. 3 a, b, c and 4).
No clinical or epidemiological information (sex, age, dominance between limbs, associated diseases, fracture period, or mechanism of trauma) was presented to the evaluators. Thus, the treatment indications were based exclusively on the interpretation of the four diagnostic methods.
The treatment options were presented in a general questionnaire, without specifying non-surgical methods (slings or plaster immobilizations), implants (locked plates, nails, wires, or screws), or prosthesis models (total, partial or reverse). Therefore, the evaluators could choose only one of the treatment options: conservative (or non-surgical), osteosynthesis, or arthroplasty (Fig. 5).
The global Kappa coefficients were determined to assess the agreement in the choice of treatment among specialists and residents in orthopedics using different diagnostic methods. The statistical analysis was performed in a general and stratified way, using only the cases where classifications were in agreement among the observers.
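The text does not state which multi-rater kappa variant was computed. One common choice when many observers rate the same cases is Fleiss' kappa, sketched below as an assumption rather than as the authors' procedure. The function name fleiss_kappa and the vote counts are made up for illustration and are not study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned case i to category j (same rater total per case)."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number of ratings")
    n_categories = len(counts[0])
    total = n_cases * n_raters
    # Share of all ratings that fell in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Proportion of agreeing rater pairs within each case.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_case) / n_cases      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    # Undefined (division by zero) if every rating is in one category.
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical votes of 20 raters on 3 cases; columns are
# conservative, osteosynthesis, arthroplasty (NOT study data).
votes = [
    [14, 5, 1],
    [3, 12, 5],
    [1, 6, 13],
]
kappa = fleiss_kappa(votes)
print(round(kappa, 3))
```

A value of 1 means every rater chose the same treatment for every case, while values near zero mean the raters agreed no more often than chance would predict.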
Initially, the association between treatment, type of evaluator and method of diagnosis was evaluated descriptively via Fisher's exact test, assuming independence among the responses of the same observer.
The associations between treatments and evaluators (experts or residents) were verified using the Chi-Square test or, alternatively, in case of small samples (footnote 1), Fisher's exact test. To verify differences in treatment indication, the adjusted standardized residual was used to identify local differences: cells with absolute values above 1.96 indicate evidence of local associations between the categories.
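The adjusted standardized residual described above can be sketched as follows. The function name adjusted_residuals and the counts in the example table are hypothetical and are not study data; the formula is the usual one for contingency tables, (observed - expected) divided by the square root of expected times (1 - row share) times (1 - column share).

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residual for each cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            std_err = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((observed - expected) / std_err)
        residuals.append(out_row)
    return residuals

# Hypothetical counts (NOT study data): rows are experts and residents,
# columns are conservative, osteosynthesis, arthroplasty.
table = [
    [40, 80, 60],
    [70, 150, 50],
]
for group, row in zip(("experts", "residents"), adjusted_residuals(table)):
    flagged = [abs(r) > 1.96 for r in row]  # True marks a local association
    print(group, [round(r, 2) for r in row], flagged)
```

Cells flagged as True are those whose observed count departs from the count expected under independence by more than 1.96 standard errors, which is the criterion quoted in the text.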
For all statistical tests, a significance level of 5% was used.
Statistical analyses were performed using the statistical software SPSS 20.0 and STATA 12.
Twenty experts in orthopedic trauma/shoulder surgery and thirty residents in orthopedics evaluated nine cases, and the results were tabulated.
Table 1 and Fig. 6 show the overall Kappa coefficients by diagnostic method among experts and among residents in orthopedics. For each diagnostic method, agreement in the choice of treatment was assessed, dichotomizing each response against the others. The closer the Kappa value is to 1, the higher the agreement. Values close to zero point to an absence of agreement. Landis and Koch provided the following rules: A. from 0.81 to 1.00, almost perfect agreement; B. from 0.61 to 0.80, substantial agreement; C. from 0.41 to 0.60, moderate agreement; D. from 0.21 to 0.40, fair (weak) agreement; E. from 0.00 to 0.20, slight agreement; and F. below 0, poor agreement.
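The interpretation bands above can be expressed as a small lookup, applied here to the kappa values reported in the abstract. The function name landis_koch is hypothetical. The labels are the original English terms of Landis and Koch, where "fair" and "slight" correspond to the bands from 0.21 to 0.40 and from 0.00 to 0.20.

```python
def landis_koch(kappa):
    """Return the Landis and Koch agreement label for a kappa value."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    return "almost perfect"

# Kappa values reported in the abstract: (experts, residents) per method.
reported = {
    "3D-models": (0.362, 0.306),
    "X-ray": (0.240, 0.221),
    "Tomography": (0.233, 0.123),
    "Holography": (0.321, 0.160),
}
for method, (experts, residents) in reported.items():
    print(method, landis_koch(experts), landis_koch(residents))
```

With these values, every expert coefficient falls in the band from 0.21 to 0.40, while the resident coefficients for tomography and holography fall in the band from 0.00 to 0.20.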
The Kappa coefficients were low, ranging from 0.123 to 0.362. The agreement in the indication of treatment was highest with 3D-models, both for experts (κ = 0.362, p < 0.001) and for residents (κ = 0.306, p < 0.001). For X-ray, the agreement was slightly lower but similar between experts and residents. With holography, the agreement among experts was two times higher than among residents and very similar to that obtained with 3D-models. In general, the experts showed higher agreement than the residents in the treatment strategy across the diagnostic methods.
In addition to the agreement analysis between experts and residents, we also performed a comparison of both groups in the choice of treatment using the different diagnostic methods (Table 2). It was possible to verify the differences in the treatment indication (%) between specialists and residents.
As shown in Table 2, there were differences in the treatment indication using tomography (p = 0.003) between residents and experts. It was observed that using tomographic images, experts indicated shoulder arthroplasty more frequently, while residents chose osteosynthesis as the treatment of choice.
The present work evaluated the correlation between different diagnostic methods, the indication of treatment, and the experience of the evaluator. The 3D-models as a diagnostic method showed the highest interobserver agreement for the treatment of proximal humerus fractures (overall Kappa coefficient), both among experts and among residents. In our previous work, higher concordance in the classification of proximal humerus fractures was also found using 3D-models, when compared to x-ray, tomography or holography. Although 3D-models were the diagnostic method with the highest agreement in the choice of treatment among all observers, experts showed overall higher agreement than residents. Therefore, experience appears to be a significant factor for agreement in the choice of treatment across the four diagnostic methods used. However, the results presented here should not be confused with the best treatment for each of the fractures analyzed.
The manipulation of three-dimensional models seems to facilitate the diagnosis and its reproducibility using the AO/OTA and Neer (1970) classifications. In the case of surgical treatments, preoperative planning can be carried out with 3D-models, allowing the surgeon to rehearse strategies and maneuvers to reposition the displaced bone fragments and to choose the implants for each patient before surgery [11, 18, 19]. This choice involves the size and model of plates, nails, screws, and other implants.
Although the prevalence of proximal humerus fractures is relevant and growing worldwide, we still have problems with the diagnosis and with defining the best treatment [2, 3, 7, 8, 13, 21,22,23,24,25,26,27,28]. Based on the findings presented here, we believe that surgeons, while still in training, are influenced by tactile rather than exclusively visual aspects when interpreting shoulder fractures. The manipulation of 3D-models stimulates areas of reasoning and interpretation that may not be required by merely visual exams such as x-rays, tomographies, and holographies. Similar to the manipulation of 3D-models, the palpation of bone fragments is part of the surgical procedure for interpreting the exact fracture pattern. In this respect, 3D-models can probably reproduce this type of stimulus better, which would justify the higher agreement in the choice of treatment seen here.
The classifications proposed by Charles Neer in 1970 and by the AO/OTA group (Arbeitsgemeinschaft für Osteosynthesefragen), the most widespread and used worldwide, have not achieved relevant reproducibility for the diagnosis of proximal humerus fractures [21,22,23]. It seems logical, therefore, that the choice of the best treatment carries uncertainties [2, 3, 13]. Handoll et al. state that there is not enough evidence that surgical treatments are superior to conservative ones. Only a few surgical indications are well established, such as open fractures associated with vascular or neurological injuries that need immediate repair. Because of that, surgeons opt for fixation methods or implants based on their experience and training. Slings, plates, nails, and prostheses are currently the therapeutic arsenal with which the surgeon matches the patient's fracture to his or her own skills to decide on the best treatment. Thus, research on new classifications or diagnostic methods has been presented [6, 13, 18, 20,21,22,23,24, 29, 30], and studies with 3D-models are promising [11,12,13, 19, 31, 32].
Augmented reality, or holography, is another diagnostic method that supports activities depending on detailed images, such as the anatomical display of surgical access routes. The hologram is built from the tomographic images of the fracture and is viewed by the evaluators through special glasses. The images are detailed and visually appealing, the method produces no residues, and its innovative and futuristic character encourages enthusiasts to develop new sustainable diagnostic methods.
This work assessed the correlation between different diagnostic methods, the indication of treatment, and the experience of the evaluator (experts and residents in orthopedics). We chose not to include clinical or epidemiological information on the nine cases studied (sex, age, dominance between limbs, associated diseases, fracture time or trauma mechanisms). In addition, we decided not to detail the options within surgical and non-surgical treatments. The non-surgical or conservative alternative was presented broadly, without specifying the use of slings or plaster immobilizations. Similarly, the osteosynthesis and arthroplasty alternatives did not specify the types of surgical implants (locked plates, nails, wires, screws, or models and types of shoulder prostheses, Fig. 6 A,B,C,D). Therefore, the evaluators were presented with standardized treatment options: non-surgical, osteosynthesis, or shoulder arthroplasty.
Even with significant agreement on treatment among experts for all the diagnostic methods proposed (as shown in Table 1), we believe that the absence of patients' clinical variables may have limited the use of the evaluators' experience and affected the indications of treatment. However, including these parameters would result in a large number of inconclusive variables for the statistical analysis, given the small sample size used here. Thus, the treatment indications for proximal humerus fractures described here may not necessarily be reproducible in the presence of a real patient. Nevertheless, the study gives an indication of the diagnostic method and the type of treatment that are more effective and consensual among the observers.
Three-dimensional models showed, overall, the highest interobserver agreement on the type of treatment of proximal humerus fractures (experts versus residents in orthopedics) compared to x-rays, tomography, and holography. Moreover, the experts showed two times higher agreement than residents in the choice of treatment when using tomography and holography.
Availability of data and materials
All works cited in this study are listed in the references section.
Footnote 1: more than 20% of the cells in the contingency table with fewer than 5 cases.
AO: Arbeitsgemeinschaft für Osteosynthesefragen group
UNIFESP: Universidade Federal de São Paulo
Howard L, Berdusco R, Momoli F, Pollock J, Liew A, Papp S, et al. Open reduction internal fixation vs non-operative management in proximal humerus fractures: a prospective, randomized controlled trial protocol. BMC Musculoskelet Disord. 2018;19:1–10.
Handoll HHG, Brorson S. Interventions for treating proximal humeral fractures in adults. Cochrane Database Syst Rev. 2015. https://doi.org/10.1002/14651858.CD000434.pub4.
Cocco LF, Ejnisman B, Belangero PS, Cohen M, dos Reis FB. Quality of life after antegrade intramedullary nail fixation of humeral fractures: A survey in a selected cohort of Brazilian patients. Patient Saf Surg. 2018;12:1–8.
Carofino BC, Leopold SS. Classifications in brief: the neer classification for proximal humerus fractures. Clin Orthop Relat Res. 2013;471:39–43.
Neer CS. Displaced proximal humeral fractures. I. Classification and evaluation. J. Bone Jt Surg (Am.). 1970;52:1077–89.
Meinberg EG, Agel J, Roberts CS, Karam MD, Kellam JF. Fracture and dislocation classification Compendium-2018. J Orthop Trauma. 2018;32:S1–S10. https://doi.org/10.1097/BOT.0000000000001063.
Court-Brown CM, Cattermole H, McQueen MM. Impacted valgus fractures (B1.1) of the proximal humerus. J. Bone Jt Surg (Brit.). 2002;84:504–8.
Bahrs C, Kühle L, Blumenstock G, Stöckle U, Rolauffs B, Freude T. Which parameters affect medium- to long-term results after angular stable plate fixation for proximal humeral fractures? J Shoulder Elb Surg. 2015;24:727–32.
Russo R, Guastafierro A, Rotonda G Della, Viglione S, Ciccarelli M, Mortellaro M, et al. A new classification of impacted proximal humerus fractures based on the morpho-volumetric evaluation of humeral head bone loss with a 3D model. J Shoulder Elb Surg. 2020;1–12. https://doi.org/10.1016/j.jse.2020.02.022.
Mothes FC, Britto A, Matsumoto F, Tonding M, Ruaro R. Application of three-dimensional prototyping in planning the treatment of proximal humerus bone deformities. Rev Bras Ortop. 2018;53:595–601. https://doi.org/10.1016/j.rboe.2018.07.016.
You W, Liu LJ, Chen HX, Xiong JY, Wang DM, Huang JH, et al. Application of 3D printing technology on the treatment of complex proximal humeral fractures (Neer3-part and 4-part) in old people. Orthop Traumatol Surg Res. 2016;102:897–903.
Awan OA, Sheth M, Sullivan I, Hussain J, Jonnalagadda P, Ling S, et al. Efficacy of 3D Printed Models on Resident Learning and Understanding of Common Acetabular Fracturers. Acad Radiol. 2019;26:130–5.
Cocco LF, Yazzigi JA, Kawakami EFKI, Alvachian HJF, Dos Reis FB, Luzo MVM. Inter-observer reliability of alternative diagnostic methods for proximal humerus fractures: A comparison between attending surgeons and orthopedic residents in training. Patient Saf Surg. 2019;13:1–13.
Fleiss JL, Cohen J, Everitt BS. Large sample standard errors of kappa and weighted kappa. Psychol Bull. 1969;72:323–7.
Flack VF, Afifi AA, Lachenbruch PA, Schouten HJA. Sample size determinations for the two rater kappa statistic. Psychometrika. 1988;53:321–5.
Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159.
Jo MJ, Gardner MJ. Proximal humerus fractures. Curr Rev Musculoskelet Med. 2012;5:192–8.
Wang JQ, Jiang BJ, Guo WJ, Zhao YM. Indirect 3D printing technology for the fabrication of customised β-TCP/chitosan scaffold with the shape of rabbit radial head - An in vitro study. J Orthop Surg Res. 2019;14:1–9.
Launonen AP, Lepola V, Saranko A, Flinkkilä T, Laitinen M, Mattila VM. Epidemiology of proximal humerus fractures. Arch Osteoporos. 2015;10:1–5.
Murray IR, Amin AK, White TO, Robinson CM. Proximal humeral fractures: current concepts in classification, treatment and outcomes. J Bone Jt Surg (Brit). 2011;1:1–11.
Olerud P, Ahrengart L, Ponzer S, Saving J, Tidermark J. Internal fixation versus nonoperative treatment of displaced 3-part proximal humeral fractures in elderly patients: a randomized controlled trial. J Shoulder Elb Surg. 2011;20:747–55.
Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J Bone Jt Surg (Am.). 1993;75:1745–50.
Bell JE, Leung BC, Spratt KF, Koval KJ, Weinstein JD, Goodman DC, et al. Trends and variation in incidence, surgical treatment, and repeat surgery of proximal humeral fractures in the elderly. J Bone Jt Surg (Am.). 2011;93:121–31.
Fjalestad T, Hole M, Hovden IAH, Blücher J, Strømsøe K. Surgical treatment with an angular stable plate for complex displaced proximal humeral fractures in elderly patients: a randomized controlled trial. J Orthop Trauma. 2012;26:98–106.
Hatzidakis AM, Shevlin MJ, Fenton DL, Curran-Everett D, Nowinski RJ, Fehringer EV. Angular-stable locked intramedullary nailing of two-part surgical neck fractures of the proximal part of the humerus: A multicenter retrospective observational study. J Bone Jt Surg (Am.). 2011;93:2172–9.
Zhu Y, Lu Y, Shen J, Zhang J, Jiang C. Locking intramedullary nails and locking plates in the treatment of two-part proximal humeral surgical neck fractures: A prospective randomized trial with a minimum of three years of follow-up. J Bone Jt Surg (Am.). 2011;93:159–68.
Klein M, Juschka M, Hinkenjann B, Scherger B, Ostermann PAW. Treatment of comminuted fractures of the proximal humerus in elderly patients with the delta III reverse shoulder prosthesis. J Orthop Trauma. 2008;22:698–704.
Hertel R, Hempfing A, Stiehler M, Leunig M. Predictors of humeral head ischemia after intracapsular fracture of the proximal humerus. J Shoulder Elb Surg. 2004;13:427–33.
Marongiu G, Leinardi L, Congia S, Frigau L, Mola F, Capone A. Reliability and reproducibility of the new AO/OTA 2018 classification system for proximal humeral fractures: a comparison of three different classification systems. J Orthop Traumatol. 2020;21(1):4. https://doi.org/10.1186/s10195-020-0543-1.
Chen Y, Jia X, Qiang M, Zhang K, Chen S. Computer-assisted virtual surgical technology versus three-dimensional printing technology in preoperative planning for displaced three and four-part fractures of the proximal end of the humerus. J Bone Jt Surg (Am). 2018;100:1960–8.
Edelson G, Kelly I, Vigder F, Reis ND. A three-dimensional classification for fractures of the proximal humerus. J Bone Jt Surg (Brit). 2004;86:413–25.
We would like to thank ICEP (Instituto de Ensino e Pesquisa) of Hospital Samaritano - São Paulo and the BioArchitects Company for donating the 3D-models and the augmented reality material. We also thank all the physicians for donating their time to evaluate the images, and the Department of Orthopedics and Image Diagnostic of UNIFESP.
The Ethics Committee approved the project and the study was registered in the Brazil Platform under no. CAAE 12273519.7.0000.5505.
Ethics approval and consent to participate
No patient identification information was used, in order to guarantee confidentiality, so we requested an exemption from the informed consent form.
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Cocco, L.F., Aihara, A.Y., Franciozi, C. et al. Three-dimensional models increase the interobserver agreement for the treatment of proximal humerus fractures. Patient Saf Surg 14, 33 (2020). https://doi.org/10.1186/s13037-020-00258-2