publications
Please visit my Google Scholar profile for a more up-to-date list.
2024
- Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models. In European Conference on Computer Vision (ECCV), 2024.
Concept Bottleneck Models (CBMs) are designed to ground image classification on human-understandable concepts, to make model decisions interpretable. Crucially, the CBM design also allows for interventions, giving users the ability to modify internal concept choices to intuitively influence the decision behavior of the model. However, existing approaches often require numerous human interventions per image to achieve strong performances, posing practical challenges in scenarios where obtaining human inputs is expensive. This is primarily due to the independent treatment of concepts during intervention, where a change of one concept does not influence the use of other ones in the model’s final decision. To address this issue, we propose a simple concept correction technique to automatically realign concept assignments post-intervention by exploiting statistical relationships between them. In doing so, our approach improves intervention efficacy and raises both classification and concept prediction accuracy across various architectures and real-world datasets. In addition, it easily integrates into existing concept-based architectures without requiring changes to the models themselves. We anticipate that our method will reduce the cost of human-model collaboration, and enhance the feasibility of CBMs in resource-constrained environments.
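The realignment step can be pictured with a small numerical sketch. The snippet below is only an illustration under simple assumptions (a fixed, covariance-based update estimated from toy concept annotations); the paper's actual correction technique for exploiting statistical relationships between concepts may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy concept annotations (1000 samples x 5 binary concepts); concept 1 is
# deliberately made identical to concept 0, so intervening on one should
# inform the other.
C_train = (rng.random((1000, 5)) > 0.5).astype(float)
C_train[:, 1] = C_train[:, 0]

cov = np.cov(C_train, rowvar=False) + 1e-6 * np.eye(5)

def realign(c_pred, intervened, c_true):
    """Apply the user's corrections, then shift the remaining concept
    predictions using the covariance between concepts (a simplified stand-in
    for the realignment described in the abstract)."""
    c = c_pred.copy()
    c[intervened] = c_true[intervened]
    free = [i for i in range(len(c)) if i not in intervened]
    delta = c_true[intervened] - c_pred[intervened]
    shift = cov[np.ix_(free, intervened)] @ np.linalg.solve(
        cov[np.ix_(intervened, intervened)], delta)
    c[free] = np.clip(c[free] + shift, 0.0, 1.0)
    return c

c_pred = np.array([0.2, 0.3, 0.5, 0.4, 0.6])   # model's concept predictions
c_true = np.array([1.0, 1.0, 0.0, 1.0, 0.0])   # ground-truth concepts
print(realign(c_pred, intervened=[0], c_true=c_true))
```

After the single intervention on concept 0, the correlated concept 1 is pulled toward its correct value without any extra human input, which is the efficiency gain the abstract describes.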
2023
- CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. In International Conference on Computer Vision (ICCV), 2023. Oral (Top 2%).
Multimodal contrastive pretraining has been utilized to train multimodal representation models, like CLIP, on vast amounts of paired image-text data. However, previous studies have highlighted the susceptibility of such models to backdoor attacks. Specifically, when training on backdoored examples, CLIP learns spurious correlations between the embedded backdoor trigger and the target label, aligning their representations in the joint embedding space. By injecting only a few poisoned examples, e.g., 75 examples in the 3M pretraining data, the model's behavior can be significantly manipulated, making it hard to detect or unlearn such correlations. To address this issue, we propose CleanCLIP, a finetuning framework that weakens the learned spurious associations introduced by backdoor attacks by re-aligning the representations for individual modalities independently. CleanCLIP can be employed for both unsupervised finetuning on paired image-text data and for supervised finetuning on labeled image data. We demonstrate that unsupervised finetuning with a combination of multimodal contrastive and unimodal self-supervised objectives for individual modalities can significantly reduce the impact of the backdoor attack. Additionally, supervised finetuning on task-specific labeled data of the individual modality, such as image data, removes the backdoor trigger from the CLIP vision encoder. Empirically, we show that CleanCLIP maintains model performance on benign examples while mitigating the impact of a range of backdoor attacks on multimodal contrastive learning.
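As a rough illustration of the unsupervised finetuning objective described above, the sketch below combines a CLIP-style image-text contrastive loss with within-modality self-supervised losses on augmented views. The encoder interfaces, the weighting `lam`, and the toy linear encoders are assumptions made for this sketch, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.07):
    """Symmetric InfoNCE loss between two batches of matched embeddings."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / tau
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def cleanclip_style_loss(img_enc, txt_enc, imgs, imgs_aug, txts, txts_aug, lam=1.0):
    """Cross-modal contrastive term plus per-modality self-supervised terms
    on augmented views (the weighting `lam` is a hypothetical choice)."""
    zi, zt = img_enc(imgs), txt_enc(txts)
    zi2, zt2 = img_enc(imgs_aug), txt_enc(txts_aug)
    l_multi = info_nce(zi, zt)                      # CLIP-style image-text loss
    l_uni = info_nce(zi, zi2) + info_nce(zt, zt2)   # within-modality losses
    return l_multi + lam * l_uni

# Toy demo with random features standing in for image/text inputs.
img_enc, txt_enc = torch.nn.Linear(512, 128), torch.nn.Linear(256, 128)
x = torch.randn(8, 512); xa = x + 0.01 * torch.randn_like(x)
t = torch.randn(8, 256); ta = t + 0.01 * torch.randn_like(t)
print(cleanclip_style_loss(img_enc, txt_enc, x, xa, t, ta))
```

The intuition is that the within-modality terms force each encoder to organize its own representation space independently, which weakens the spurious image-text alignment introduced by the trigger.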
- Toward a normative theory of (self-)management by goal-setting. In Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci), 2023. Oral.
People are often confronted with problems whose complexity exceeds their cognitive capacities. To deal with this complexity, individuals and managers can break complex problems down into a series of subgoals. Which subgoals are most effective depends on people’s cognitive constraints and the cognitive mechanisms of goal pursuit. This creates an untapped opportunity to derive practical recommendations for which subgoals managers and individuals should set from cognitive models of bounded rationality. To seize this opportunity, we apply the principle of resource-rationality to formulate a mathematically precise normative theory of (self-)management by goal-setting. We leverage this theory to computationally derive optimal subgoals from a resource-rational model of human goal pursuit. Finally, we show that the resulting subgoals improve the problem-solving performance of bounded agents and human participants. This constitutes a first step towards grounding prescriptive theories of management and practical recommendations for goal-setting in computational models of the relevant psychological processes and cognitive limitations.
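To make the idea of deriving subgoals for a cognitively bounded agent concrete, here is a deliberately simplified toy (the cost model, the planning-horizon penalty, and all numbers are assumptions for illustration, not the paper's resource-rational model): a manager enumerates candidate subgoal pairs and keeps the pair that minimizes the cost incurred by an agent with limited planning capacity.

```python
import itertools

N, h = 12, 4   # final goal state on a line of states, and the agent's planning horizon

def agent_cost(start, target, horizon):
    """Cost for a bounded agent to move from `start` to `target`: distances
    beyond the planning horizon incur an extra wandering penalty (a toy
    stand-in for the cognitive constraints discussed in the abstract)."""
    dist = target - start
    cost = dist * 1.0
    if dist > horizon:
        cost += (dist - horizon) * 0.5   # penalty for planning beyond capacity
    return cost

def total_cost(subgoals):
    """Total cost when the agent pursues each subgoal in turn, then the goal."""
    path = [0, *subgoals, N]
    return sum(agent_cost(a, b, h) for a, b in zip(path, path[1:]))

# The "manager" searches over pairs of intermediate subgoals.
best = min(itertools.combinations(range(1, N), 2), key=total_cost)
print("no subgoals:", total_cost(()), "| best subgoals:", best,
      "cost:", total_cost(best))
```

In this toy, well-placed subgoals keep every planning problem within the agent's horizon, so the subgoal-assisted agent outperforms the unassisted one, mirroring the qualitative result reported above.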
- Using Computational Models to Understand the Role and Nature of Valuation Bias in Mixed Gambles. Nishad Singhi, Sumeet Agarwal, and Sumitava Mukherjee. In Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci), 2023. Full Paper.
It is a well-known observation that people tend to dislike risky situations that could potentially lead to a loss, a phenomenon that is called loss aversion. This is often explained using valuation bias, i.e., the subjective value of losses is larger than the subjective value of gains of equal magnitude. However, recent studies using the drift-diffusion model have shown that a pre-valuation bias towards rejection is also a primary determinant of loss-averse behavior. It has large contributions to model fits, predicts a key relationship between rejection rates and response times, and explains most of the individual heterogeneity in participants' rejection rates. We analyzed data from three previously published experiments using the drift-diffusion model and found that these findings generalize to them. However, we found that valuation bias plays the most important role in predicting how likely a person is to accept a given gamble. Our findings also showed that a person's loss aversion parameter, which captures their propensity to avoid losses, is closely related to valuation bias. Together, these results highlight the importance of valuation bias in understanding people's choice patterns. Finally, using the leaky, competing accumulator model, we show strong mimicking between valuation bias and an attentional bias wherein people pay more attention to losses than to gains. This finding suggests that behaviors that seem to arise due to valuation bias may instead arise from such an attentional bias.
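The distinction between a valuation bias and a pre-valuation bias can be illustrated with a minimal drift-diffusion simulation. In this hedged sketch, `lam` weights losses more heavily than gains (valuation bias) and `bias` is a constant drift toward rejection (pre-valuation bias); the parameter values are illustrative and not fitted to the data analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_gamble(gain, loss, lam=1.5, bias=-0.3, w=0.05,
                    threshold=1.0, dt=0.005, noise=1.0, max_t=5.0):
    """One accept/reject decision for a mixed gamble under a drift-diffusion
    process. lam > 1 encodes a valuation bias (losses loom larger than gains);
    bias < 0 encodes a pre-valuation bias toward rejection."""
    drift = w * (gain - lam * loss) + bias
    x, t = 0.0, 0.0
    while abs(x) < threshold and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ("accept" if x >= threshold else "reject"), t

choices = [simulate_gamble(gain=20, loss=10)[0] for _ in range(200)]
print("acceptance rate:", choices.count("accept") / len(choices))
```

Raising `lam` or making `bias` more negative both lower the acceptance rate, which is exactly why careful model comparison is needed to tell the two mechanisms apart.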
- Motivated With Joy or Anxiety: Does Approach-Avoidance Goal Framing Elicit Differential Reward-Network Activation In The Brain? Nishad Singhi, Michiko Sakaki, Kou Murayama, Madoka Matsumoto, Keise Izuma, Yukihito Yomogida, Ayaka Sugiura, Ryuta Aoki, and Kenji Matsumoto. In Psychologie und Gehirn, 2023.
There have been a considerable number of behavioral studies showing that approach goals (i.e., achieving success) and avoidance goals (i.e., avoiding failure) lead to different motivational states. Approach goals are associated with positive emotional outcomes, whereas avoidance goals tend to elicit negative emotional states such as anxiety. In this study, we investigated the neural correlates of goal-directed behavior under these goals using fMRI with a game-like, intrinsically motivating task. We especially focused on the key regions implicated in previous work, i.e., the striatum, midbrain, lateral prefrontal cortex, and ventromedial prefrontal cortex. Our findings indicate that even though approach and avoidance goals produce different motivational states, the striatum and other key areas are insensitive to the type of goal. For example, the striatum is activated after a successful outcome in both the approach and avoidance goal conditions. These findings suggest that the striatum may encode general motivation or effort mobilization rather than positive motivational states such as intrinsic motivation. Furthermore, we found that the hippocampus was more activated after successful feedback in the approach condition and after failure feedback in the avoidance condition, which suggests that it encodes salient events.