Strong Optimal Classification Trees

| Ask | Become a Library member

Strong Optimal Classification Trees

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_crossref_primary_10_1287_opre_2021_0034

Strong Optimal Classification Trees

About this item

Full title

Strong Optimal Classification Trees

Author / Creator

Aghaei, Sina , Gómez, Andrés and Vayanos, Phebe

Publisher

INFORMS

Journal title

Operations research, 2024-07

Language

English

Formats

Articles

Publication information

Publisher

INFORMS

Subjects

Subjects and topics

More information

Scope and Contents

Contents

Decision trees are among the most interpretable and popular machine learning models that are used routinely in applications ranging from revenue management to medicine. Traditional heuristic methods, although fast, lack modeling flexibility for incorporating constraints such as fairness and do not guarantee optimality. Recent efforts aim to overcome these limitations using mixed-integer optimization (MIO) for better modeling flexibility and optimality, but speed remains an issue. In “Strong Optimal Classification Trees,” Aghaei, Gómez, and Vayanos use integer optimization and polyhedral theory to create an MIO-based formulation with a strong LO relaxation resulting in a 29% speed-up in training time compared with state-of-the-art MIO-based formulations, as well as up to an 8% improvement in out-of-sample accuracy.
Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning
optimal
binary classification trees with univariate splits. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer optimization (MIO) technology. Yet, existing MIO-based approaches from the literature do not leverage the power of MIO to its full extent: they rely on
weak
formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. Moreover, we show that our formulation has a
stronger
linear optimization relaxation than existing methods in the case of binary data. We exploit the decomposable structure of our formulation and max-flow/min-cut duality to derive a Benders’ decomposition method to speed-up computation. We propose a tailored procedure for solving each decomposed subproblem that provably generates
facets
of the feasible set of the MIO as constraints to add to the main problem. We conduct extensive computational experiments on standard benchmark data sets on which we show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques and improve out-of-sample performance by up to 8%.
Funding:
P. Vayanos and S. Aghaei gratefully acknowledge support from the Hilton C. Foundation, the Homeless Policy Research Institute, the Home for Good foundation under the “C.E.S. Triage Tool Research & Refinement” grant. P. Vayanos is funded in part by the National Science Foundation, under CAREER [Grant 2046230]. She is grateful for this support. A. Gómez is funded in part by the National Science Foundation under [Grants 1930582 and 2006762]. We would like to express our sincere thanks to the anonymous reviewers for their valuable and constructive feedback, which greatly improved the quality of this paper.
Supplemental Material:
The online appendix and code files are available at
https://doi.org/10.1287/opre.2021.0034
....

Alternative Titles

Full title

Strong Optimal Classification Trees

Authors, Artists and Contributors

Author / Creator

Aghaei, Sina
Gómez, Andrés
Vayanos, Phebe

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_crossref_primary_10_1287_opre_2021_0034

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_crossref_primary_10_1287_opre_2021_0034

Other Identifiers

ISSN

0030-364X

E-ISSN

1526-5463

DOI

10.1287/opre.2021.0034

How to access this item

Full text available

View in old catalogue

Strong Optimal Classification Trees

Strong Optimal Classification Trees

Strong Optimal Classification Trees

About this item

Publication information

Subjects

More information

Scope and Contents

Alternative Titles

Authors, Artists and Contributors

Identifiers

Primary Identifiers

Other Identifiers

How to access this item

Connecting people and collections

Indigenous engagement

Learning

Stories