ashhass/Concept-BottleNeck-Model-Papers

Concept-BottleNeck-Model-Papers

This repository collects important papers on concept bottleneck models, organized by year of publication.

2023

Interactive Concept Bottleneck Models

A closer look at the intervention procedure of concept bottleneck models

Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Label Free Concept Bottleneck Models

Concept-based Explainable Artificial Intelligence: A Survey

2022

Post-hoc Concept Bottleneck Models

Addressing Leakage in Concept Bottleneck Models

From “Where” to “What”: Towards Human-Understandable Explanations through Concept Relevance Propagation

2021

Do Concept Bottleneck Models Learn as Intended?

Description: Investigates which regions of the input space CBMs use to make predictions. Claims that pretrained concepts do not correspond to anything semantically meaningful in the input space, suggesting that CBMs might be using confounding information to predict concept labels.

Promises and Pitfalls of Black-Box Concept Learning Models

Description: Empirically shows that current methods for addressing information leakage in concept bottleneck models, such as concept whitening models and sequential CBMs, are largely ineffective.

Editing a Classifier by Rewriting Its Prediction Rules

Is Disentanglement All You Need? Comparing Concept-based & Disentanglement Approaches

2020

Concept Bottleneck Models

Description: Introduces a framework that first predicts intermediate concepts c from the input x and then predicts the target label y from those concepts. Three training schemes:

  1. Independent: Trains the concept predictor (x to c) and the label predictor (c to y) separately, with the label predictor trained on ground-truth concepts.
  2. Sequential: Trains the concept predictor first, then trains the label predictor on its predicted concepts.
  3. Joint: Trains both predictors simultaneously with a joint loss over concept c and target label y.
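The three schemes differ mainly in what the label predictor sees during training and in whether the two losses are combined. Below is a minimal sketch of that difference, assuming binary concepts and a binary label scored with cross-entropy. The function names and the weighting parameter `lam` are illustrative assumptions, not code from the paper.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability/target pair."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def concept_loss(c_probs, c_true):
    """Sum of per-concept losses for one example."""
    return sum(bce(p, t) for p, t in zip(c_probs, c_true))


def joint_loss(y_prob, y_true, c_probs, c_true, lam=1.0):
    """Joint scheme: task loss plus lam times the concept loss (lam is hypothetical)."""
    return bce(y_prob, y_true) + lam * concept_loss(c_probs, c_true)


def label_predictor_inputs(scheme, c_true, c_pred):
    """Which concepts the label predictor is trained on under each scheme."""
    if scheme == "independent":
        return c_true   # ground-truth concepts
    if scheme == "sequential":
        return c_pred   # outputs of the already-trained concept predictor
    raise ValueError("joint training optimises both predictors with joint_loss")
```

With `lam=0` the joint loss reduces to the task loss alone, so nothing forces the bottleneck to align with the annotated concepts. Larger values of `lam` weight concept accuracy more heavily.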

Limitations:

  1. Does not investigate whether concept bottleneck models learn spurious input features to make concept predictions.

  2. The joint framework (the preferred framework in the paper) might learn features directly from the input to predict target labels, giving less weight to the pre-specified concepts and more to uninterpretable attributes.

Now You See Me (CME): Concept-based Model Extraction

Description: Introduces a model extraction framework that is used to analyse concept information in DNN models. Specifically,

  1. Discovers concepts learnt by a DNN model: Does so by approximating two functions, a function that predicts intermediate concept labels and a function that predicts the target labels from concept predictions.
  2. Analyses how DNNs use concept information when predicting labels: utilizes latent space analysis methods to inspect which concepts are learned and how these concepts are represented across different DNN layers.
  3. Identifies the most important concept information: Does so by picking the 32 highest coefficients from a logistic regression model trained to predict target labels from ground-truth concept labels.
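The coefficient-based selection in step 3 can be sketched as below. This assumes the logistic regression coefficients are already available as a mapping from concept name to coefficient. Ranking by absolute value is an assumption made here, and the helper name `top_k_concepts` and the concept names in the usage note are hypothetical.

```python
def top_k_concepts(coefficients, k=32):
    """Return the k concept names with the largest-magnitude coefficients.

    `coefficients` maps concept name -> logistic regression coefficient.
    Ranking by absolute value is an assumption of this sketch.
    """
    ranked = sorted(coefficients.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```

For example, `top_k_concepts({"wing_color": 0.2, "beak_shape": -1.5, "tail_length": 0.9}, k=2)` keeps `beak_shape` and `tail_length`.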

On Completeness-aware Concept-Based Explanations in Deep Neural Networks

Description: Explores the idea of complete concepts in deep neural networks. Specifically,

  1. Defines completeness of concepts in deep neural networks
  2. Introduces a completeness score to evaluate how sufficient concepts are for model predictions
  3. Introduces a method to discover complete, interpretable concepts
  4. ConceptSHAP: studies how important individual concepts are to the overall completeness score
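ConceptSHAP attributes the completeness score to individual concepts using Shapley values. The sketch below computes exact Shapley values for a small concept set, given any function that scores a subset of concepts. The scoring function `toy_completeness` and its weights are hypothetical stand-ins for illustration, not the paper's completeness score.

```python
from itertools import combinations
from math import factorial


def shapley_values(concepts, score):
    """Exact Shapley value of each concept under the set function `score`."""
    n = len(concepts)
    values = {}
    for c in concepts:
        others = [o for o in concepts if o != c]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal gain from adding concept c to coalition s.
                total += weight * (score(s | {c}) - score(s))
        values[c] = total
    return values


# Hypothetical additive score: each concept contributes a fixed amount.
TOY_WEIGHTS = {"stripes": 0.5, "wings": 0.3, "beak": 0.1}


def toy_completeness(subset):
    return sum(TOY_WEIGHTS[c] for c in subset)
```

Exact enumeration grows exponentially with the number of concepts, so it is only practical for small sets. For an additive score such as the toy one, each concept's Shapley value equals its own contribution.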

Debiasing Concept-based Explanations with Causal Analysis

Description: Introduces a causal prior graph that attempts to model unobserved confounding information the model might be using to make its predictions. Uses a two-stage regression technique to remove the detected confounding information.
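To illustrate the general idea of two-stage regression, here is a generic two-stage least squares sketch on toy one-dimensional data. It is not the paper's exact procedure. The variable names, the instrument `z`, and the data are all assumptions chosen so that the confounder `u` biases a naive regression.

```python
def ols(x, y):
    """Slope and intercept of a one-variable least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted x."""
    a, b = ols(z, x)
    x_hat = [a * v + b for v in z]  # part of x explained by the instrument
    return ols(x_hat, y)


# Toy data: u is an unobserved confounder, z is uncorrelated with u.
z = [-1.0, 1.0, -1.0, 1.0]
u = [-1.0, -1.0, 1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # true effect of x is 2

naive_slope, _ = ols(x, y)                       # 3.5, biased by u
iv_slope, _ = two_stage_least_squares(z, x, y)   # 2.0, bias removed
```

The naive fit absorbs the confounder's effect into the slope. The second stage only uses the variation in `x` explained by `z`, so the confounded part drops out.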

Wilds: A Benchmark of in-the-Wild Distribution Shifts
