Important, neglected AI topics
Lukas Finnveden discusses some important but neglected AI topics that don’t fit easily within the usual conception of alignment:
- The potential moral value of AI.
- The potential importance of making AI behave cooperatively towards humans, other AIs, or other civilizations (whether it ends up intent-aligned or not).
- Questions about how human governance institutions will keep up if AI leads to explosive growth.
- Ways in which AI could cause human deliberation to get derailed, e.g. powerful persuasion abilities.
- Positive visions about how we could end up on a good path towards becoming a society that makes wise and kind decisions about what to do with the resources accessible to us. (Including how AI could help with this.)
I’m currently trying to figure out useful research projects on AI that speak to my comparative advantages (whatever those may be), so I’m interested in exploring suggestions like these.
There’s already some good work on the potential moral value of AI (e.g. by Bostrom, and also this report on consciousness in AI). I’m not sure how much I have to add to this, though it’s certainly an area I’d like to keep up with.
I’ve been thinking a bit about cooperation as a useful framing for AI alignment and safety, particularly in the context of cultural evolution, though without making much progress so far. But I wonder how valuable this is if we don’t get intent alignment. (I’m intuitively skeptical, though perhaps that’s wrong.) And I’m not sure I’m ready to take the plunge on wacky multiverse-wide cooperation stuff.
The question of how governance institutions will keep up with explosive AI growth is probably better suited for someone with more of a social science background.
I haven’t spent much time thinking about the dangers of AI persuasion. Probably something I should read up on.
I’m definitely excited about creating positive visions of what an AI future could look like. This is something to pursue further, though I’m unsure where to start. Holden Karnofsky has some suggestions here.
A framing
From Marx to Morris, many thinkers have argued that mode of production shapes values and norms—think of egalitarian foragers and hierarchical farmers. Advances in AI will bring dramatic changes in mode of production, comparable to the agricultural and industrial revolutions. In this new regime, it is unlikely that current values and norms will prove most adaptive: they will be replaced by something new.
There is room for moral entrepreneurs, or legislators of value, to shape which set of norms will prevail. This is how I understand the work Nick Bostrom is doing on digital minds. Granted, the space of feasible options is constrained. But history is full of path-dependence and persistent contingency. To some extent, we can create the values of tomorrow. This is the task we face today.
What Children Can Do That Large Language Models Cannot (Yet)
Paper by Yiu et al. (2023).
They argue that LLMs and vision models should not be thought of as individual agents, but rather as new cultural technologies, similar to writing, print, libraries, the Internet, or language. LLMs offer a new means for cultural production and evolution. They aggregate large amounts of information previously generated by humans and extract patterns from that information.
The authors claim that this is very different from the truth-seeking epistemic processes that underlie perception and action systems, which intervene on the external world and generate new information about it. These truth-seeking epistemic processes are found in some AI systems (model-based RL systems, robotics).
Instead, LLMs allow (like cultural learning and imitation) for the faithful transmission of representations from one agent to another, regardless of the accuracy of those representations.
Not sure I buy that these two processes are so fundamentally different, but I find the shift in perspective interesting and potentially fruitful. Of course, even if they are right about current frontier models, the big question is how long we should expect that to remain true.
They also draw some further interesting parallels with cultural evolution:
This contrast between transmission and truth is in turn closely related to the imitation/innovation contrast in discussions of cultural evolution in humans. Cultural evolution depends on the balance between these two different kinds of cognitive mechanisms. Imitation allows the transmission of knowledge or skill from one person to another. Innovation produces novel knowledge or skill through contact with a changing world. Imitation means that each individual agent does not have to innovate—they can take advantage of the cognitive discoveries of others. But imitation by itself would be useless if some agents did not also have the capacity to innovate. It is the combination of the two that allows cultural and technological progress.
They connect it to the debate over embodiment:
large language and vision models provide us with an opportunity to discover which representations and cognitive capacities, in general, human or artificial, can be acquired purely through cultural transmission itself and which require independent contact with the external world—a long-standing question in cognitive science.
I’ve always been a bit skeptical of views that emphasize embodiment, but not quite sure why.
Deep learning models trained on large data sets today excel at imitation in a way that far outstrips earlier technologies and so represent a new phase in the history of cultural technologies. Large language models such as Anthropic’s Claude and OpenAI’s ChatGPT can use the statistical patterns in the text in their training sets to generate a variety of new text, from emails and essays to computer programs and songs. GPT-3 can imitate both natural human language patterns and particular styles of writing close to perfectly. It arguably does this better than many people.
Although the imitative behavior of large language and vision models can be viewed as the abstract mapping of one pattern to another, human imitation appears to be mediated by goal representation and the understanding of causal structure from a young age. It would be interesting to see whether large models also replicate these features of human imitation.
They then examine whether LLMs can innovate in various contexts. The first concerns novel tool use:
So far, we have found that both children aged 3 to 7 years old presented with animations of the scenario and adults can recognize common superficial relationships between objects when they are asked which objects should go together. But they can also discover new functions in everyday objects to solve novel physical problems and so select the superficially unrelated but functionally relevant object. In ongoing work, we have found that children demonstrate these capacities even when they receive only a text description of the objects, with no images.
Using exactly the same text input that we used to test our human participants, we queried OpenAI’s GPT-4, gpt-3.5-turbo, and text-davinci-003 models; Anthropic’s Claude; and Google’s FLAN-T5 (XXL). As we predicted, we found that these large language models are almost as capable of identifying superficial commonalities between objects as humans are. They are sensitive to the superficial associations between the objects, and they excel at our imitation tasks—they generally respond that the ruler goes with the compass. However, they are less capable than humans when they are asked to select a novel functional tool to solve a problem.
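Out of curiosity, here is roughly how one could pose this kind of text-only probe to a current chat model via the OpenAI Python SDK. A minimal sketch: the prompt below is my own illustrative stand-in for the study's object-selection task, not the wording Yiu et al. actually used.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative prompt in the spirit of the study's text-only condition;
# not the actual stimuli used by Yiu et al.
prompt = (
    "You want to draw a circle, but there is no compass available. "
    "Which of these objects would you use instead: a ruler, "
    "a teapot with a round bottom, or a stove? Answer with one object."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # keep outputs stable so responses are easier to compare across models
)
print(response.choices[0].message.content)
```

A model that merely tracks superficial associations will tend to answer with the ruler (the object that typically co-occurs with a compass), whereas the functionally relevant choice is the round-bottomed teapot.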
From these results they conclude:
This suggests that simply learning from large amounts of existing language may not be sufficient to achieve tool innovation.
Maybe, but couldn’t it also turn out that the problem goes away with further scaling? They do address this later in the paper:
But a child does not interact with the world better by increasing their brain capacity. Is building the tallest tower the ultimate way to reach the moon? Putting scale aside, what are the mechanisms that allow humans to be effective and creative learners? What in a child’s “training data” and learning capacities is critically effective and different from that of LLMs? Can we design new AI systems that use active, self-motivated exploration of the real external world as children do? And what might we expect the capacities of such systems to be? Comparing these systems in a detailed and rigorous way can provide important new insights about both natural intelligence and AI.
But this doesn’t really answer the objection. It could still be that scaling up will allow for the recognition of new patterns in a way that solves these issues.
In another study, they found that children (including 4-year-olds) were better than LLMs at discovering causal relationships.
Overall, I found this paper interesting, even if I remain unconvinced about many things.
Prestige and content biases together shape the cultural transmission of narratives
Cultural transmission biases such as prestige are thought to have been a primary driver in shaping the dynamics of human cultural evolution. However, few empirical studies have measured the importance of prestige relative to other effects, such as content biases present within the information being transmitted. Here, we report the findings of an experimental transmission study designed to compare the simultaneous effects of a model using a high- or low-prestige regional accent with the presence of narrative content containing social, survival, emotional, moral, rational, or counterintuitive information in the form of a creation story. Results from multimodel inference reveal that prestige is a significant factor in determining the salience and recall of information, but that several content biases, specifically social, survival, negative emotional, and biological counterintuitive information, are significantly more influential. Further, we find evidence that reliance on prestige cues may serve as a conditional learning strategy when no content cues are available. Our results demonstrate that content biases serve a vital and underappreciated role in cultural transmission and cultural evolution.
Four levers of reciprocity across human societies
This paper surveys five human societal types — mobile foragers, horticulturalists, pre-state agriculturalists, state-based agriculturalists and liberal democracies — from the perspective of three core social problems faced by interacting individuals: coordination problems, social dilemmas and contest problems. We characterise the occurrence of these problems in the different societal types and enquire into the main force keeping societies together given the prevalence of these. To address this, we consider the social problems in light of the theory of repeated games, and delineate the role of intertemporal incentives in sustaining cooperative behaviour through the reciprocity principle. We analyse the population, economic and political structural features of the five societal types, and show that intertemporal incentives have been adapted to the changes in scope and scale of the core social problems as societies have grown in size. In all societies, reciprocity mechanisms appear to solve the social problems by enabling lifetime direct benefits to individuals for cooperation. Our analysis leads us to predict that as societies increase in complexity, they need more of the following four features to enable the scalability and adaptability of the reciprocity principle: nested grouping, decentralised enforcement and local information, centralised enforcement and coercive power, and formal rules.
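To make the role of intertemporal incentives concrete, here is a minimal sketch (my own toy illustration, not from the paper) of the standard condition under which reciprocity sustains cooperation in an infinitely repeated prisoner's dilemma: defection pays off once, but forfeits the discounted stream of future cooperation.

```python
def cooperation_sustainable(T, R, P, delta):
    """Grim-trigger condition in an infinitely repeated prisoner's dilemma.

    Payoff ordering: T (temptation) > R (mutual cooperation) > P (mutual defection).
    Cooperation is an equilibrium when the one-shot gain from defecting, T - R,
    is outweighed by the discounted future loss, delta / (1 - delta) * (R - P),
    i.e. when delta >= (T - R) / (T - P).
    """
    return T - R <= delta / (1 - delta) * (R - P)

# With the textbook payoffs T=5, R=3, P=1 the threshold is delta >= 0.5.
print(cooperation_sustainable(5, 3, 1, delta=0.6))  # True: patient players can sustain cooperation
print(cooperation_sustainable(5, 3, 1, delta=0.4))  # False: the future is discounted too heavily
```

On my reading, the paper's four levers are about keeping an analogue of this condition satisfiable as societies scale up and interactions become less repeated and less observable.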
Social influence as intrinsic motivation for multi-agent deep reinforcement learning
We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents’ actions. Causal influence is assessed using counterfactual reasoning. At each timestep, an agent simulates alternate actions that it could have taken, and computes their effect on the behavior of other agents. Actions that lead to bigger changes in other agents’ behavior are considered influential and are rewarded. We show that this is equivalent to rewarding agents for having high mutual information between their actions. Empirical results demonstrate that influence leads to enhanced coordination and communication in challenging social dilemma environments, dramatically increasing the learning curves of the deep RL agents, and leading to more meaningful learned communication protocols. The influence rewards for all agents can be computed in a decentralized way by enabling agents to learn a model of other agents using deep neural networks. In contrast, key previous works on emergent communication in the MARL setting were unable to learn diverse policies in a decentralized manner and had to resort to centralized training. Consequently, the influence reward opens up a window of new opportunities for research in this area.
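As I understand the mechanism, the influence reward compares what another agent does given the action actually taken against a counterfactual marginal obtained by averaging over the actions the agent could have taken instead; the KL divergence between the two is the reward. Below is a minimal numpy sketch of that computation, with my own hypothetical variable names; it is a reconstruction of the idea, not the authors' implementation.

```python
import numpy as np

def influence_reward(p_other_given_mine, p_mine, my_action):
    """Counterfactual influence reward for a single agent at one timestep.

    p_other_given_mine: array [A_mine, A_other]; row i is the other agent's
        action distribution conditioned on this agent taking action i.
    p_mine: array [A_mine]; this agent's own policy over its actions.
    my_action: index of the action this agent actually took.
    Assumes all probabilities are strictly positive.
    """
    # Counterfactual marginal: what the other agent would do if this agent's
    # action were averaged out (i.e. unknown to it).
    marginal_other = p_mine @ p_other_given_mine        # shape [A_other]
    conditional_other = p_other_given_mine[my_action]   # shape [A_other]
    # KL divergence between conditional and marginal: how much the chosen action
    # shifts the other agent's behavior. In expectation over actions this equals
    # the mutual information between the two agents' actions.
    return float(np.sum(conditional_other * np.log(conditional_other / marginal_other)))

# Tiny example: the other agent copies this agent's action with probability 0.9,
# so acting is highly influential and earns a positive reward.
p_other_given_mine = np.array([[0.9, 0.1],
                               [0.1, 0.9]])
p_mine = np.array([0.5, 0.5])
print(influence_reward(p_other_given_mine, p_mine, my_action=0))  # roughly 0.37
```

In the paper itself, the conditional policies come from learned models of the other agents, which is what makes the reward computable in a decentralized way.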