Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable?


[Submitted on 24 Jan 2024]

By Ričards Marcinkevičs and 3 other authors


Abstract: Recently, interpretable machine learning has re-explored concept bottleneck models (CBM), comprising step-by-step prediction of high-level concepts from the raw features and of the target variable from the predicted concepts. A compelling advantage of this model class is the user's ability to intervene on the predicted concept values, affecting the model's downstream output. In this work, we introduce a method to perform such concept-based interventions on already-trained neural networks, which are not interpretable by design, given an annotated validation set. Furthermore, we formalise the model's intervenability as a measure of the effectiveness of concept-based interventions and leverage this definition to fine-tune black-box models. Empirically, we explore the intervenability of black-box classifiers on synthetic tabular and natural image benchmarks. We demonstrate that fine-tuning improves intervention effectiveness and often yields better-calibrated predictions. To showcase the practical utility of the proposed techniques, we apply them to deep chest X-ray classifiers and show that fine-tuned black boxes can be as intervenable and more performant than CBMs.
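The two-stage prediction and the concept-based intervention described above can be sketched in a toy form. The following is a minimal illustration, not the paper's method: the linear weights, the sigmoid concept/label predictors, and the `intervene` helper are all hypothetical stand-ins for trained networks, and the intervention simply overwrites a subset of predicted concepts with annotated ground-truth values before the second stage re-predicts the target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: 4 raw features, 3 binary concepts, 1 target.
W_g = rng.normal(size=(3, 4))      # stand-in concept predictor (x -> concepts)
w_f = np.array([1.0, -2.0, 0.5])   # stand-in label predictor (concepts -> y)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_concepts(x):
    """First stage: predict high-level concepts from raw features."""
    return sigmoid(W_g @ x)

def predict_target(c):
    """Second stage: predict the target from (possibly edited) concepts."""
    return sigmoid(w_f @ c)

def intervene(c_hat, true_concepts, idx):
    """Replace predicted concepts at positions `idx` with ground-truth values."""
    c = c_hat.copy()
    c[idx] = true_concepts[idx]
    return c

x = rng.normal(size=4)
true_c = np.array([1.0, 0.0, 1.0])   # annotated concept labels

c_hat = predict_concepts(x)
y_before = predict_target(c_hat)
y_after = predict_target(intervene(c_hat, true_c, [1]))  # correct concept 1
print(y_before, y_after)
```

The point of the sketch is only the interface: because the target is computed from the concept vector, editing that vector propagates to the downstream prediction, which is the behaviour the paper's intervenability measure quantifies.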

Submission history

From: Ricards Marcinkevics [view email]
Wed, 24 Jan 2024 16:02:14 UTC (27,080 KB)


