
Prof. Henning Müller and Mara Graziani (PhD student)
University of Applied Sciences of Western Switzerland
Concept-based Interpretability
This class focuses on concept-based approaches to interpreting complex models such as deep learning models. We will start from the notion of concept learning in cognitive psychology, defining what a concept is and how it can be learned by humans and by machines.
We will look at introducing concept-based interpretability in the design of deep learning models, a technique known as concept whitening. We will then analyze post-hoc explanations in terms of binary concepts and how to extend this to concepts that have a measurable expression, e.g. area, shape, texture.
The assignments for this class focus on the post-hoc explanation methods.
Class Outline
- Insights from Cognitive Psychology on Concept learning
- Identifying object categories through binary attributes
- Concept-whitening
- Testing with Concept Activation Vectors (TCAV)
- Identifying object categories with measurable (continuous) attributes
- Regression Concept Vectors (RCVs)
- Discovering concepts
- Concept-based guidance of network training
Material
“Concept-whitening for interpretable image recognition”, Zhi Chen et al., 2020
“Testing with Concept Activation Vectors (TCAV)”, Been Kim et al., 2018
“Regression Concept Vectors”, Mara Graziani et al., 2020
Assignments
A1.
Take an application of your choice. It may be a project you are already working on, something you have always wanted to try, or something simple enough to let you complete the assignment quickly (especially if you have little time to spend on this).
If your task is a classification problem, think of a list of concepts that may help you describe the categories in your data. What attributes can you find? What are these concepts like: are they binary, discrete, categorical, or continuous? Make a table to help you keep track of the concepts.
If you are solving a different task, for example regression or segmentation, applying concept learning is less straightforward and it may not be as easy to learn concepts from samples. Which kinds of features in your model would you want to test?
A2.
Open the Colab notebook L4A2 – TCAV. Run the cells and complete the exercises at the end of the notebook. Did you manage to change the model and the input images? How do the scores change across models?
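As a warm-up for A2, the sketch below illustrates the two steps behind a TCAV score on synthetic data: learning a concept activation vector (CAV) as the normal of a linear classifier that separates concept activations from random activations, then counting how often the class logit increases along that direction. This is not the notebook's code. The activations are random numbers, the toy head `v . tanh(W h)` is a hypothetical stand-in for the upper layers of a real network, and a least-squares classifier is swapped in for the linear classifier (e.g. logistic regression) that would normally be trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # size of the (hypothetical) layer activation vector

# Synthetic activations: concept examples are shifted along a hidden direction.
true_dir = np.zeros(d)
true_dir[0] = 1.0
concept_acts = rng.normal(size=(50, d)) + 2.0 * true_dir
random_acts = rng.normal(size=(50, d))


def fit_cav(pos, neg):
    """Fit a least-squares linear classifier and return its unit normal (the CAV)."""
    X = np.vstack([pos, neg])
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    cav = w[:-1]  # drop the bias term
    return cav / np.linalg.norm(cav)


# Toy differentiable head mapping activations to a class logit (assumption).
W = rng.normal(size=(4, d))
v = rng.normal(size=4)


def logit_grad(h):
    """Gradient of logit(h) = v . tanh(W h) with respect to h."""
    return W.T @ (v * (1.0 - np.tanh(W @ h) ** 2))


def tcav_score(class_acts, cav):
    """Fraction of class inputs whose logit increases along the CAV."""
    derivs = np.array([logit_grad(h) @ cav for h in class_acts])
    return float(np.mean(derivs > 0))


cav = fit_cav(concept_acts, random_acts)
class_acts = rng.normal(size=(100, d))  # stand-in activations of class images
score = tcav_score(class_acts, cav)
print(f"cosine(CAV, hidden direction) = {cav @ true_dir:.2f}")
print(f"TCAV score = {score:.2f}")
```

In the notebook, the activations and gradients come from a real network layer, so the score depends on the model and layer chosen, which is what the last question of A2 asks you to compare.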
A3.
Now move to the notebook for L4A3 – Regression Concept Vectors.
Extend the set of inputs used for the RCV analysis on natural images (for the lion class). How does the coefficient of determination (R²) change if you use 4, 30 or 100 images?
(Optional) Implement conceptual sensitivities for the “color”-RCVs.
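To build intuition for the sample-size question in A3, the sketch below fits a regression concept vector by ordinary least squares on synthetic activations and reports the coefficient of determination on held-out data for 4, 30 and 100 training samples. Everything here is an assumption made for illustration: the activations are random numbers, the concept measure is a noisy linear function of them, and the helper names (`make_data`, `fit_rcv`, `r2`) are hypothetical and do not come from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                             # size of the (hypothetical) activation vector
true_w = np.linspace(-1.0, 1.0, d)  # hidden direction encoding the concept


def make_data(n):
    """Synthetic activations and a noisy continuous concept measure."""
    acts = rng.normal(size=(n, d))
    measure = acts @ true_w + 0.5 * rng.normal(size=n)
    return acts, measure


def fit_rcv(acts, measure):
    """Least-squares regression of the concept measure on the activations."""
    X = np.hstack([acts, np.ones((len(acts), 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(X, measure, rcond=None)
    return coef[:-1], coef[-1]  # (RCV, intercept)


def r2(acts, measure, rcv, bias):
    """Coefficient of determination of the fitted regression."""
    pred = acts @ rcv + bias
    ss_res = np.sum((measure - pred) ** 2)
    ss_tot = np.sum((measure - measure.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


test_acts, test_measure = make_data(500)  # held-out evaluation set
scores = {}
for n in (4, 30, 100):
    acts, measure = make_data(n)
    rcv, bias = fit_rcv(acts, measure)
    scores[n] = r2(test_acts, test_measure, rcv, bias)
    print(f"n = {n:3d}  held-out R^2 = {scores[n]:.3f}")
```

With only 4 samples there are fewer observations than regression coefficients, so the fit can match the training points exactly while generalizing poorly; evaluating on held-out data makes that visible.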