PubMed. Clinical orthopaedics and related research. 2026-05-22
Development and Internal Evaluation of EAST-BMS (East Asian Survival Tool for Bone Metastasis Surgery): A Multinational Machine-learning Survival Prediction Model for Patients Undergoing Surgery for Nonspinal Bone Metastasis in East Asia.
Yang Eunkyu E, Lin Hao-Chen HC, Shimomura Seiji S, Lee Jaemin J et al.
Patients undergoing surgery for bone metastases typically have advanced disease, and postoperative survival varies substantially. Accurate survival estimation is important for surgical decision-making and patient counseling. Several prognostic models have been externally validated in East Asian populations, but these tools were originally developed in Western cohorts and do not incorporate region-specific epidemiology or treatment patterns.
(1) To develop, internally evaluate, and select a machine learning-based survival prediction model for patients undergoing surgery for nonspinal bone metastases using a multinational East Asian cohort. (2) To compare the performance of the selected model with that of an established Western prognostic tool developed by the Skeletal Oncology Research Group (SORG). (3) To identify which clinical features carried the greatest importance in the new model that we developed.
All patients who underwent surgery for nonspinal bone metastases at three tertiary referral centers in the Republic of Korea, Taiwan, and Japan between January 2009 and December 2022 were included. In total, 1045 patients met the inclusion criteria. The median (range) age at surgery was 64 years (19 to 96), 46% (478 of 1045) of patients were female, and the femur was the most common metastatic site (66% [690]). Data for 3-month, 6-month, 1-year, 3-year, and 5-year overall survival were available for 82% (854), 68% (709), 51% (529), 23% (243), and 15% (160) of patients, respectively. The corresponding survival proportions were 84%, 71%, 56%, 36%, and 31%. Data on routinely available clinical, functional, and laboratory variables were collected, and candidate predictors were predefined based on clinical relevance and data availability across institutions. Missing data were < 4% for all variables in each institution and were handled by multivariate imputation by chained equations. We trained four models using different machine-learning algorithms, and the performance of each model was evaluated using leave-one-site-out validation, in which models were trained on data from two institutions and tested on the remaining institution to ensure separation between training and testing data sets. Model performance was assessed using the Concordance Index (C-index; the ability of the model to correctly rank patients according to their expected survival), Brier score (overall prediction error), time-dependent area under the curve (tdAUC; how well the model distinguishes patients with different survival outcomes at specific time points), calibration slope and intercept (agreement between predicted and observed survival), and decision curve analysis (the potential clinical benefit of using the model to guide treatment decisions). The best-performing model was designated as the East Asian Survival Tool for Bone Metastasis Surgery (EAST-BMS) and was compared with the SORG model. 
To allow a fair comparison, the performance of the SORG model was evaluated on the same held-out test data sets in each iteration of the leave-one-site-out validation, applying the same performance metrics used to select the final model.
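The leave-one-site-out scheme and the C-index described above can be illustrated with a minimal sketch. It is not the authors' code: the record layout, the `site` key, the site labels, and the helper names `leave_one_site_out` and `harrell_c_index` are hypothetical, and the C-index shown is the basic Harrell formulation, which skips tied survival times for simplicity.

```python
from itertools import combinations


def leave_one_site_out(records, site_key="site"):
    """Yield (held_out_site, train, test), holding each site out once.

    `records` is a list of dicts; `site_key` names the institution field
    (both are illustrative assumptions, not the study's data format).
    """
    sites = sorted({r[site_key] for r in records})
    for held_out in sites:
        train = [r for r in records if r[site_key] != held_out]
        test = [r for r in records if r[site_key] == held_out]
        yield held_out, train, test


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model ranks correctly.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event. A higher risk score should correspond to shorter
    survival. Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified version
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data with three hypothetical site labels.
toy_records = [
    {"site": "A", "time": 2, "event": 1},
    {"site": "A", "time": 9, "event": 0},
    {"site": "B", "time": 4, "event": 1},
    {"site": "B", "time": 7, "event": 1},
    {"site": "C", "time": 5, "event": 0},
    {"site": "C", "time": 3, "event": 1},
]
folds = list(leave_one_site_out(toy_records))

# A risk score that perfectly orders four patients by survival time.
c_perfect = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1])
```

In each fold, a model would be fitted on `train` and scored on `test`, so that no patient from the held-out institution influences training.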
Gradient boosting survival analysis demonstrated the most favorable overall performance and was selected as the EAST-BMS. The number of outcome events used for model evaluation was 170 at 3 months and 447 at 12 months. The EAST-BMS achieved tdAUC values of 0.81 (95% confidence interval [CI] 0.78 to 0.85) at 3 months and 0.78 (95% CI 0.70 to 0.84) at 12 months, compared with 0.81 (95% CI 0.74 to 0.86) and 0.76 (95% CI 0.67 to 0.83), respectively, for the SORG model, indicating comparable ability to distinguish patients with different survival outcomes. Brier scores were 0.12 (95% CI 0.09 to 0.15) and 0.23 (95% CI 0.17 to 0.28) for EAST-BMS versus 0.14 (95% CI 0.12 to 0.16) and 0.25 (95% CI 0.15 to 0.34) for SORG, indicating lower prediction error in EAST-BMS. Calibration intercepts were -0.08 (95% CI -0.25 to 0.09) versus -1.06 (95% CI -1.26 to -0.86) at 3 months and -0.35 (95% CI -0.49 to -0.22) versus -1.23 (95% CI -1.37 to -1.08) at 12 months, indicating better agreement between predicted and observed survival in EAST-BMS. Decision curve analysis showed wider threshold probability ranges with positive net clinical benefit for EAST-BMS (0.04 to 0.96 versus 0.05 to 0.66 at 3 months; 0.17 to 0.77 versus 0.08 to 0.67 at 12 months), which means that using the EAST-BMS to guide treatment decisions may provide greater clinical benefit than the SORG model. Albumin, Karnofsky performance status, percentage of lymphocytes, and C-reactive protein level were among the most influential predictors.
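For readers less familiar with the Brier score and decision curve analysis, the following sketch shows how both quantities are commonly computed. It assumes a binary outcome at a fixed time point that is known for every patient (no censoring), which is simpler than the time-dependent, censoring-aware versions a survival study would need. The function names and toy values are hypothetical and are not taken from the study.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def net_benefit(y_true, p_pred, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Standard decision-curve formula: TP/n - FP/n * pt / (1 - pt),
    where pt is the threshold probability (must be below 1).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes (1 = event by the time point) and predicted probabilities.
y = [1, 0, 1, 0]
p = [0.8, 0.2, 0.6, 0.4]

bs = brier_score(y, p)
nb_model = net_benefit(y, p, 0.5)
nb_all = net_benefit_treat_all(y, 0.5)
```

A decision curve repeats the net-benefit calculation across a range of thresholds and compares the model against the act-on-all and act-on-none (net benefit 0) strategies. The threshold ranges reported above presumably describe where a model's curve stays above those references.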
To our knowledge, the EAST-BMS is the first multinational machine-learning survival model for patients from East Asia undergoing surgery for nonspinal bone metastases. It demonstrated favorable predictive accuracy and clinical utility and may support personalized prognostic assessment and surgical decision-making. It is freely available as a web-based tool at https://bms.east-mskonco.org.
Level III, therapeutic study.