Prof. Henning Müller and Mara Graziani (PhD student)
University of Applied Sciences of Western Switzerland
This class focuses on concept-based approaches to interpreting complex models such as deep neural networks. We will start from the notion of concept learning in cognitive psychology, define what a concept is, and discuss how concepts can be learned by humans and by machines.
We will look at building concept-based interpretability into the design of deep learning models, a technique known as concept whitening. We will then analyze post-hoc explanations in terms of binary concepts and show how to extend them to concepts that have a measurable expression, e.g. area, shape, or texture.
The assignments for this class focus on the post-hoc explanation methods.
- Insights from Cognitive Psychology on Concept learning
- Identifying object categories through binary attributes
- Testing with Concept Activation Vectors (TCAV)
- Identifying object categories with measurable (continuous) attributes
- Regression Concept Vectors (RCVs)
- Discovering concepts
- Concept-based guidance of network training
“Concept whitening for interpretable image recognition”, Zhi Chen et al., 2020
If your task is a classification problem, think of a list of concepts that may help you describe the categories in your data. What attributes can you find? What are these concepts like: binary, discrete, categorical, or continuous? Make a table to help you keep track of the concepts.
If you are solving a different task, for example regression or segmentation, applying concept learning is less straightforward, and it may not be as easy to learn concepts from samples. Try to think of which features in your model you would want to test.
Now move to the notebook for L4A3 – Regression Concept Vectors.
Extend the set of inputs used for the RCV analysis on natural images (for the lion class). How does the coefficient of determination (R²) change if you use 4, 30 or 100 images?
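As a rough illustration of what this exercise measures, the sketch below fits a regression concept vector by ordinary least squares and reports R² on the fitting set and on held-out data for 4, 30 and 100 samples. It is not the notebook's code: the activations are synthetic Gaussian vectors standing in for a layer's features, and the names `fit_rcv`, `r_squared`, `make_data` and the 16-dimensional feature size are assumptions made for this example.

```python
import numpy as np

def fit_rcv(activations, concept_values):
    """Least-squares fit of concept values from activations.

    Returns (weights, bias); the weight vector plays the role of the RCV.
    """
    # Append a column of ones so the last coefficient acts as the intercept.
    design = np.hstack([activations, np.ones((activations.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, concept_values, rcond=None)
    return coef[:-1], coef[-1]

def r_squared(activations, concept_values, weights, bias):
    """Coefficient of determination of the linear fit on the given data."""
    predictions = activations @ weights + bias
    ss_res = np.sum((concept_values - predictions) ** 2)
    ss_tot = np.sum((concept_values - concept_values.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for layer activations (assumption: 16 features).
rng = np.random.default_rng(0)
n_features = 16
true_direction = rng.normal(size=n_features)

def make_data(n_samples):
    """Draw fake activations and a noisy, linearly related concept measure."""
    acts = rng.normal(size=(n_samples, n_features))
    concept = acts @ true_direction + rng.normal(scale=0.5, size=n_samples)
    return acts, concept

held_out_acts, held_out_concept = make_data(500)

results = {}
for n_samples in (4, 30, 100):
    acts, concept = make_data(n_samples)
    weights, bias = fit_rcv(acts, concept)
    fit_r2 = r_squared(acts, concept, weights, bias)
    held_out_r2 = r_squared(held_out_acts, held_out_concept, weights, bias)
    results[n_samples] = (fit_r2, held_out_r2)
    print(f"n={n_samples:3d}  R2 (fit)={fit_r2:.3f}  R2 (held-out)={held_out_r2:.3f}")
```

When there are fewer samples than features, the least-squares fit interpolates its own data, so the R² computed on the fitting set can look perfect on its own. Comparing it with R² on held-out samples is one way to judge whether the learned direction generalizes.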
(Optional) Implement conceptual sensitivities for the “color”-RCVs.
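One common way to define a conceptual sensitivity is as the directional derivative of a class score along the unit-norm concept vector, evaluated at an input's activations. The sketch below shows that computation under simplifying assumptions: a toy sigmoid head (`class_score`) stands in for the layers above the chosen activation, its gradient is written by hand, and `concept_vector` is a random placeholder for a fitted “color” RCV. In the notebook the gradient would instead come from the deep learning framework's automatic differentiation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def class_score(activation, head_weights, head_bias):
    """Toy stand-in for the part of the network above the chosen layer."""
    return sigmoid(activation @ head_weights + head_bias)

def class_score_gradient(activation, head_weights, head_bias):
    """Analytic gradient of the toy class score w.r.t. the activation."""
    score = class_score(activation, head_weights, head_bias)
    return score * (1.0 - score) * head_weights

def conceptual_sensitivity(activation, concept_vector, head_weights, head_bias):
    """Directional derivative of the class score along the unit concept vector."""
    unit_vector = concept_vector / np.linalg.norm(concept_vector)
    gradient = class_score_gradient(activation, head_weights, head_bias)
    return float(gradient @ unit_vector)

# Hypothetical values for illustration only.
rng = np.random.default_rng(1)
head_weights = rng.normal(size=8)
head_bias = 0.1
concept_vector = rng.normal(size=8)    # placeholder for a fitted RCV
activations = rng.normal(size=(5, 8))  # placeholder for five images' activations

sensitivities = [
    conceptual_sensitivity(a, concept_vector, head_weights, head_bias)
    for a in activations
]
for index, value in enumerate(sensitivities):
    print(f"image {index}: sensitivity = {value:+.4f}")
```

A positive value means that moving the activations in the direction of increasing concept measure raises the class score for that image, and a negative value means it lowers it.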