Langy langy

3/9/2023

Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. We present a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sounds. We integrate Bayesian inference with program synthesis and representations inspired by linguistic theory and cognitive models of learning and discovery. Across 70 datasets from 58 diverse languages, our system synthesizes human-interpretable models for core aspects of each language's morpho-phonology, sometimes approaching models posited by human linguists. Joint inference across all 70 datasets automatically synthesizes a meta-model encoding interpretable cross-language typological tendencies. Finally, the same algorithm captures few-shot learning dynamics, acquiring new morphophonological rules from just one or a few examples. These results suggest routes to more powerful machine-enabled discovery of interpretable models in linguistics and other scientific domains.

A key aspect of human intelligence is our ability to build theories about the world. This faculty is most clearly manifested in the historical development of science [1] but also occurs in miniature in everyday cognition [2] and during childhood development [3]. The similarities between the process of developing scientific theories and the way that children construct an understanding of the world around them have led to the child-as-scientist metaphor in developmental psychology, which views conceptual changes during development as a form of scientific theory discovery [4, 5]. Thus, a key goal for both artificial intelligence and computational cognitive science is to develop methods to understand, and perhaps even automate, the process of theory discovery [6, 7, 8, 9, 10, 11, 12, 13].

In this paper, we study the problem of AI-driven theory discovery, using human language as a testbed. We primarily focus on the linguist's construction of language-specific theories, and the linguist's synthesis of abstract cross-language meta-theories, but we also propose connections to child language acquisition. The cognitive sciences of language have long drawn an explicit analogy between the working scientist constructing grammars of particular languages and the child learning their languages [14, 15]. Language-specific grammar must be formulated within a common theoretical framework, sometimes called universal grammar. For the linguist, this is the target of empirical inquiry; for the child, this includes those linguistic resources that they bring to the table for language acquisition. Natural language is an ideal domain to study theory discovery for several reasons.
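To make the abstract's pairing of Bayesian inference with program synthesis a little more concrete, here is a toy Python sketch. It is not the system described in the text. The four-theory hypothesis space, the ASCII transcriptions, and the constants `NOISE` and `RULE_COST` are all assumptions made up for illustration. Each candidate theory is a tiny "program" (an underlying suffix plus an optional devoicing rule), scored by an unnormalized log prior that favors shorter grammars plus a log likelihood that rewards reproducing the observed forms.

```python
import math
from dataclasses import dataclass
from itertools import product

VOICELESS = set("ptkfs")  # toy inventory of voiceless consonants (assumption)
NOISE = 0.01              # assumed probability that an observed form is an exception
RULE_COST = 3.0           # assumed description length of the devoicing rule, in nats


@dataclass(frozen=True)
class Theory:
    """Hypothetical candidate grammar for a toy plural paradigm."""
    suffix: str    # underlying form of the plural suffix
    devoice: bool  # whether the rule z -> s / voiceless _ is in the grammar

    def derive(self, stem: str) -> str:
        """Concatenate stem and suffix, then apply the rule if present."""
        form = stem + self.suffix
        if not self.devoice:
            return form
        out = []
        for ch in form:
            # Rewrite z as s immediately after a voiceless consonant.
            out.append("s" if ch == "z" and out and out[-1] in VOICELESS else ch)
        return "".join(out)

    def log_prior(self) -> float:
        # Unnormalized: shorter grammars are more probable a priori.
        return -(len(self.suffix) + (RULE_COST if self.devoice else 0.0))

    def log_likelihood(self, data) -> float:
        # Each (stem, surface) pair is either reproduced or treated as noise.
        return sum(
            math.log(1 - NOISE) if self.derive(stem) == surface else math.log(NOISE)
            for stem, surface in data
        )


def best_theory(data):
    """Return the candidate with the highest unnormalized log posterior."""
    candidates = [Theory(s, d) for s, d in product(["s", "z"], [False, True])]
    return max(candidates, key=lambda t: t.log_prior() + t.log_likelihood(data))


plurals = [("kat", "kats"), ("dog", "dogz"), ("map", "maps"), ("bed", "bedz")]
print(best_theory(plurals))
print(best_theory([("dog", "dogz")]))
```

With all four pairs, the only theory that reproduces every form is the suffix `z` plus the devoicing rule, so it wins despite its higher description cost. With the single pair `("dog", "dogz")`, the rule explains nothing extra and the simpler rule-free theory is preferred. That is the trade-off between fit and simplicity that a Bayesian score is meant to capture.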
