Artificial Intelligence (AI) and Cognitive Science (CogSci) have interacted fruitfully for decades. On the one hand, AI takes inspiration from the cognitive architecture of human intelligence; on the other, modern AI provides powerful tools for investigating these cognitive mechanisms. This workshop will bring together world-leading researchers at the crossroads of AI and CogSci and will also feature talks by local research groups at Chalmers.
Overview
- Date: Starts 28 November 2024, 08:30. Ends 28 November 2024, 18:00
- Seats available: 30
- Location: Chalmerska huset
- Language: English
- Last sign up date: 19 November 2024
Preliminary schedule:
8.30-9.30 Working Breakfast
9.30-10.30 Kenny Smith (U. Edinburgh): The evolution of linguistic regularities and exceptions
Abstract: Languages persist through a cycle of learning and use - we learn the language of our community through immersion in that language, then in using that language to meet our communicative goals we generate more linguistic data which others learn from. In previous work we have used computational and experimental methods to show how this cycle of learning and use can explain some of the fundamental structural features shared by all languages - for example, the fact that all languages exploit regular rules for generating meaningful expressions allows languages to be both relatively learnable but also exceptionally powerful tools for communication. In this talk I’ll briefly review this older work on the evolution of regularity, then apply the same approach to understanding exceptions to those regular rules. Within individual languages, exceptions and irregularities tend not to be distributed randomly - idiosyncratic exceptions tend to occur for high-frequency items, with low-frequency items following the general regular rule. And languages spoken in small, isolated communities tend to have more irregularities, exceptions, and complexity in general than languages (like English) spoken in large heterogeneous communities. I’ll describe a recent series of experiments, using artificial language learning and iterated learning methods, showing how this distribution of irregularity within and across languages can be explained as a consequence of the same processes of learning and use that account for linguistic regularity.
10.30-11.00 Coffee
11.00-12.00 Tessa Verhoef (U. Leiden): The emergence of language universals in neural agents and vision-and-language models
Abstract: Human cognition constrains how we communicate. Our cognitive biases and preferences interact with the processes that drive language emergence and change in non-trivial ways. A powerful method to discern the roles of cognitive biases and processes like language learning and use in shaping linguistic structure is to build agent-based models. Recent advances in computational linguistics and deep learning sparked a renewed interest in such simulations, creating the opportunity to model increasingly realistic phenomena. These models simulate emergent communication, referring to the spontaneous development of a communication system through repeated interactions between individual neural network agents. However, a crucial challenge in this line of work is that such artificial learners still often behave differently from human learners. Directly inspired by human artificial language learning studies, we proposed a novel framework for simulating language learning and change, which allows agents to first learn an artificial language and then use it to communicate, with the aim of studying the emergence of specific linguistic properties. I will present two studies using this framework to simulate the emergence of a well-known language phenomenon: the word-order/case-marking trade-off. I will also share some very recent findings where we test for the presence of a well-known human cross-modal mapping preference (the bouba-kiki effect) in vision-and-language models. Cross-modal associations play an essential role in human language understanding, learning, and evolution, but our findings reveal that current multimodal language models do not align well with such human preferences.
12.00-13.00 Lunch
13.00-13.45 Stefano Sarao Mannelli: Analytical Connectionism: reinterpreting connectionist models in light of modern ML theory
Abstract: Neural network models have been used since the 1990s to study cognition across various contexts. Unlike normative modeling, which often focuses on final performance while neglecting the learning process, the connectionist approach has been instrumental in studying development and neurodegenerative conditions, such as semantic dementia. However, this approach has traditionally relied on simulations, limiting our mathematical understanding of the underlying mechanisms. In this talk, I will show how recent advances in machine learning theory can help revisit connectionist models, offering a more analytical perspective. I will specifically highlight the role of curriculum learning and its impact on accelerating learning during development.
13.45-14.30 Andrea Silvi, Moa Johansson and Jonathan Thomas: Abstractions for Efficient Communication
14.30-15.00 Fika
15.00-16.00 Shane Steinert-Threlkeld (U. Washington): Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence
Abstract: A central focus of the cognitive sciences has been which features of the human conceptual system are built-in and which are (and, more generally, can be) learned from experience. In this talk, I will introduce a simple method for training language models called filtered corpus training and argue that it can help shed light on these debates when it comes to what forms of inductive bias are necessary for language learning. The method trains language models (LMs) on corpora with certain linguistic constructions entirely filtered out from the training data, and then measures the ability of LMs to perform linguistic generalization on the basis of indirect evidence. We apply the method to both LSTM and Transformer LMs (of roughly comparable size), developing filtered corpora that target a wide range of linguistic phenomena. Our results show that while Transformers are better qua LMs (as measured by perplexity), both models perform equally and surprisingly well on linguistic generalization measures, suggesting that they are capable of generalizing from indirect evidence. A deeper dive on one phenomenon, negative polarity items, also shows that LMs learn to base their judgments in this domain on the semantic concept of monotonicity, in a way not dissimilar to how these items are known to be processed in human language.
16.00-17.00 Emil Carlsson and Terry Regier: Iterative learning + RL
17.00-17.30 Concluding Panel Discussion.