Lecture 4

Prof. Henning Müller and Mara Graziani (PhD student)

University of Applied Sciences of Western Switzerland

Concept-based Interpretability

This class focuses on concept-based approaches to interpreting complex models such as deep neural networks. We will start from the notion of concept learning in cognitive psychology, define what a concept is, and discuss how concepts can be learned by humans and by machines.

We will look at how to build concept-based interpretability into the design of deep learning models, using a technique known as concept whitening. We will then analyze post-hoc explanations in terms of binary concepts and see how to extend them to concepts that have a measurable expression, e.g. area, shape, texture.

The assignments for this class focus on the post-hoc explanation methods.

Class Outline
  1. Insights from Cognitive Psychology on Concept learning
  2. Identifying object categories through binary attributes
  3. Concept whitening
  4. Testing with Concept Activation Vectors (TCAV)
  5. Identifying object categories with measurable (continuous) attributes
  6. Regression Concept Vectors (RCVs)
  7. Discovering concepts
  8. Concept-based guidance of network training


A1. Take an application of your choice. It may be a project you are already working on, something you have always wanted to try, or something you expect to be relatively simple to complete the assignment with (especially if you do not have much time to spend on this).

If your task is a classification problem, think of a list of concepts that may help you describe the categories in your data. What attributes can you find? What are these concepts like: are they binary, discrete, categorical, or continuous? Make a table to help you keep track of the concepts.

If you are solving a different task, for example regression or segmentation, applying concept learning is less straightforward, and it may not be as easy to learn concepts from samples. What kind of features in your model would you want to test?

A2. Open the Colab notebook L4A2 – TCAV. Run the cells and complete the exercises at the end of the notebook page. Did you manage to change the model and input images? How do the scores change across models?
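To make clearer what the TCAV score in the notebook measures, here is a minimal sketch. It is not the notebook's code. It assumes you already have a concept activation vector (CAV) for one layer and, for each input image, the gradient of the class logit with respect to that layer's activations. The function name `tcav_score` and the toy numbers are hypothetical. The score is the fraction of inputs whose directional derivative along the CAV is positive.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tcav_score(gradients, cav):
    """Fraction of inputs whose class logit increases along the CAV direction.

    gradients: one gradient vector per input image (class logit w.r.t. the
               layer activations), assumed to be precomputed.
    cav:       concept activation vector for the same layer.
    """
    sensitivities = [dot(g, cav) for g in gradients]
    positive = sum(1 for s in sensitivities if s > 0)
    return positive / len(sensitivities)


# Toy values for illustration only (not real network gradients).
gradients = [
    [0.5, -0.1, 0.2],
    [0.3, 0.4, -0.2],
    [-0.6, 0.1, 0.0],
    [0.1, 0.1, 0.1],
]
cav = [1.0, 0.0, 0.5]

score = tcav_score(gradients, cav)
print(score)  # 0.75: three of the four inputs have a positive sensitivity
```

In the TCAV method, a score like this is usually compared against scores obtained with CAVs trained on random image sets before it is interpreted.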

(Optional) If you are interested in concept-based explanations, you can try them on the models and tasks you have worked on in assignment 1. You can use a trained model, if you have it, and the concepts you added to the table.


A3. Now move to the notebook L4A3 – Regression Concept Vectors.

Extend the set of inputs used for the RCV analysis on natural images (for the lion class). How does the determination coefficient change if you use 4, 30 or 100 images?
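The determination coefficient R² measures how much of the variance of the concept measure is explained by the linear regression in activation space. A minimal sketch of that computation follows. It assumes the regression has already produced a predicted concept value for each image. The function name `r_squared` and the numbers are hypothetical, not taken from the notebook.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Toy concept measures and regression outputs, for illustration only.
concept_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(concept_values, predicted_values)
print(round(r2, 4))  # 0.98
```

When the number of images is much smaller than the number of activation features, a linear regression can fit the training images almost perfectly, so it may be worth computing R² on held-out images as well.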

(Optional) Implement conceptual sensitivities for the “color”-RCVs.
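As a starting point, a conceptual sensitivity can be read as the directional derivative of the network output along the unit-norm RCV direction. The sketch below approximates it with a central finite difference on a toy function. In the notebook you would more likely obtain the gradient from the framework's automatic differentiation. All names here (`conceptual_sensitivity`, `toy_head`) are hypothetical, and `toy_head` only stands in for the part of the network after the chosen layer.

```python
import math


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def conceptual_sensitivity(head, activation, rcv, eps=1e-4):
    """Central finite-difference derivative of `head` along the unit RCV."""
    direction = unit(rcv)
    plus = [a + eps * d for a, d in zip(activation, direction)]
    minus = [a - eps * d for a, d in zip(activation, direction)]
    return (head(plus) - head(minus)) / (2 * eps)


def toy_head(activation):
    # Hypothetical stand-in for the layers after the one being analyzed.
    return 2.0 * activation[0] - 1.0 * activation[1]


rcv = [3.0, 4.0]          # toy regression concept vector
activation = [0.5, 0.2]   # toy layer activation for one image

sensitivity = conceptual_sensitivity(toy_head, activation, rcv)
print(round(sensitivity, 6))  # 0.4, i.e. the gradient [2, -1] dotted with [0.6, 0.8]
```

A positive value means that moving the activation in the direction of increasing concept measure raises the output, and a negative value means it lowers it.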
