If modernity meant recognizing that we are masters of our own fate, the next transition will mean handing over that fate to our successors. It will require a similar reexamination of fundamental values and assumptions.
The Enlightenment view of history
Ancient writers, both classical and biblical, assumed that the essential patterns of life remained identical and therefore that history provided lasting models for instruction and imitation. Hence the search for historical prototypes of current customs and institutions. Legendary founders of cities, ancestors of existing professions, prehistorical legislators, and establishers of rituals were believed to grant them legitimacy. This belief in tradition persisted among Christians, even though the coming of Christ divided their time into two distinct periods. The basic relation between past and present remained constant, except for the unique event of the Incarnation that had set a new beginning and a new end to history.
The scientiﬁc revolution of the seventeenth century undermined this stable concept of time. The abrupt change it caused in the modern worldview suggested that time was pregnant with novelty and directed toward the future rather than repeating the past. The new orientation was supported by a philosophy that viewed the person as the source of meaning and value and hence capable of changing the course of history. The modern conception of history resulted in two quite different attitudes toward the past. Some, beginning with Descartes and all those primarily interested in the scientiﬁc achievements of their age, felt that the study of the past could contribute little to the scientiﬁc enterprise. For others, however, a more accurate knowledge of the past formed an integral part of that comprehensive renewal of knowledge introduced by the scientiﬁc revolution. Thus, David Hume regarded the study of history as essential to the study of human nature, the basis of all scientiﬁc knowledge. Some historians, such as Montesquieu, Voltaire, and Gibbon, were convinced that a solid acquaintance with the past was to vindicate the changes of the present.
Louis Dupré (2004), The Enlightenment and the Intellectual Foundations of Modern Culture, pp. 187–188
Moral facts as facts about cooperation
I find the perspective of Sterelny and Fraser (2017) rather appealing:
while there is no full vindication of morality, no seamless reduction of normative facts to natural facts, nevertheless one important strand in the evolutionary history of moral thinking does support reductive naturalism—moral facts are facts about cooperation, and the conditions and practices that support or undermine it.
For moral thinking has evolved in part in response to these facts and to track these facts. So one function of moral thinking is to track a class of facts about human social environments, just as folk psychological thinking has in part evolved to track cognitive facts about human decision-making.
The idea that connects moral thinking to the expansion of cooperation in the human lineage has two complementary aspects. First, it is important to an individual to be chosen as a partner by others; access to the proﬁts of cooperation often depends on partner choice. Choice, in turn, is often dependent on being of good repute, and (often) the most reliable way of having a good reputation is to deserve it. It is worth being good to seem good. Recognizing and internalizing moral norms is typically individually beneﬁcial through its payoff in reputation. Second, human social life long ago crossed a complexity threshold, and once it did so, problems of coordination, division of labour, access to property and products and rights and responsibilities in family organization could no longer be solved on the ﬂy, or settled on a case-by-case basis by individual interactions. Default patterns of interaction became wired in as social expectations and then norms, as individuals came to take decisions and make plans on the assumption that those defaults would be respected, treating them as stable backgrounds; naturally resenting unpleasant surprises when faced by deviations from these expectations. The positive beneﬁts of successful coordination with others, and the costs of violating others’ expectations, gave individuals an incentive to internalize and conform to these defaults.
These gradually emerging regularities of social interaction and cooperation were not arbitrary: they reﬂected (no doubt imperfectly) the circumstances in which human societies worked well, and how individuals acted effectively in these societies to mutual beneﬁt. Given the beneﬁts of cooperation in human social worlds, we have been selected to recognize and respond to these facts. So this adaptationist perspective on moral cognition suggests that normative thought and normative institutions are a response to selection in the hominin lineage for capacities that make stable, long-term, and spatially extended forms of cooperation and collaboration possible.
No doubt there are trade-offs between the size of the cooperation proﬁt and its distribution. But despite these complications, a natural notion of moral truth emerges from the idea that normative thought has evolved to mediate stable cooperation. The ideal norms are robust decision heuristics, in that they satisﬁce over a wide range of agent choice points, typically providing the agent with a decent outcome, in part by giving others incentives to continue to treat the agent as a social partner in good standing. The moral truths specify maxims that are members of near-optimal normative packages—sets of norms that if adopted, would help generate high levels of appropriately distributed, and hence stable, cooperation proﬁts.
So no adaptationist, truth-tracking conception of the evolution of moral thinking will deliver a full, clean vindication of diverse moral opinion. Indeed, we expect the moral case to be intermediate in a variety of respects: First, our moral practices are a mosaic; some elements may turn out to be vindicated, others revised, others discarded. Second, as we have noted, moral judgements function to signal, to bond, and to shape, not just to track; vindication is only in question with respect to tracking. Third, as we shall now explain, tracking is only partially successful; moreover, its success may well have varied across time and circumstance.
On the Historical Role of Philosophy
If R.G. Collingwood is right, the greatest philosophers of tomorrow will be the ones grappling with AI:
In part, the problems of philosophy are unchanging; in part they vary from age to age, according to the special characteristics of human life and thought at the time; and in the best philosophers of every age these two parts are so interwoven that the permanent problems appear sub specie saeculi, and the special problems of the age sub specie aeternitatis. Whenever human thought has been dominated by some special interest, the most fruitful philosophy of the age has reﬂected that domination; not passively, by mere submission to its inﬂuence, but actively, by making a special attempt to understand it and placing it in the focus of philosophical inquiry.
Dupré on the Renaissance and the Enlightenment
It soon appeared that no direct causal succession links the humanism of the ﬁfteenth century with the Enlightenment. When Max Weber described modernity as the loss of an unquestioned legitimacy of a divinely instituted order, his deﬁnition applies to the Enlightenment and the subsequent centuries, not to the previous period. We ought to avoid the mistake made by Jacob Burckhardt in The Civilisation of the Renaissance in Italy, and often repeated in the twentieth century, of interpreting the Renaissance as the ﬁrst stage of the Enlightenment. It is true, though, that the early period introduced one fundamental characteristic of modern culture, namely, the creative role of the person. Yet that idea did not imply that the mind alone is the source of meaning and value, as Enlightenment thought began to assume.
Louis Dupré (2004), The Enlightenment and the Intellectual Foundations of Modern Culture
AI beyond alignment
Even if we solve intent alignment and build AI systems that are trying to do what their deployers want them to do, plenty of issues remain to be addressed if we are to successfully navigate the transition to a world with advanced AI, as an increasing number of people are pointing out.
Part of the problem is that “AI alignment shouldn’t be conflated with AI moral achievement,” as Matthew Barnett explains:
if we succeed at figuring out how to make AIs pursue our intended goals, these AIs will likely be used to maximize the economic consumption of existing humans at the time of alignment. And most economic consumption is aimed at satisfying selfish desires, rather than what we’d normally consider our altruistic moral ideals.
Solving AI alignment does not once and for all solve the problem of how multiple autonomous individuals with partially conflicting interests should cooperate and coexist—a problem that will likely always be with us. But the transition to a world with advanced AI does represent a crucial time where many fundamental parameters of this implicit agreement may need to be renegotiated. And doing so requires a wide range of work, going far beyond issues of technical alignment.
Holden Karnofsky explicitly highlights that transformative AI issues are not just misalignment, listing the following further problems:
- Power imbalances. As AI speeds up science and technology, it could cause some country/countries/coalitions to become enormously powerful - so it matters a lot which one(s) lead the way on transformative AI. (I fear that this concern is generally overrated compared to misaligned AI, but it is still very important.) There could also be dangers in overly widespread (as opposed to concentrated) AI deployment.
- Early applications of AI. It might be that what early AIs are used for durably affects how things go in the long run - for example, whether early AI systems are used for education and truth-seeking, rather than manipulative persuasion and/or entrenching what we already believe. We might be able to affect which uses are predominant early on.
- New life forms. Advanced AI could lead to new forms of intelligent life, such as AI systems themselves and/or digital people. Many of the frameworks we’re used to, for ethics and the law, could end up needing quite a bit of rethinking for new kinds of entities (for example, should we allow people to make as many copies as they want of entities that will predictably vote in certain ways?) Early decisions about these kinds of questions could have long-lasting effects.
- Persistent policies and norms. Perhaps we ought to be identifying particularly important policies, norms, etc. that seem likely to be durable even through rapid technological advancement, and try to improve these as much as possible before transformative AI is developed. (These could include things like a better social safety net suited to high, sustained unemployment rates; better regulations aimed at avoiding bias; etc.)
- Speed of development. Maybe human society just isn’t likely to adapt well to rapid, radical advances in science and technology, and finding a way to limit the pace of advances would be good.
Of course, AI governance is already a thriving research field, and many of these issues fall within its scope. Yet others are more fundamental, and may require us to reconceive basic notions of political philosophy.
GPI’s new research agenda on risks and opportunities from artificial intelligence covers some relevant topics, e.g. political philosophy:
Some of the risks posed by AI are political in nature, including the risks posed by AI-enabled dictatorships. Other risks will inevitably involve a political dimension, for example with regulation and international agreements playing an important role in enabling or mitigating risks. For this reason, it’s likely that political philosophy will be able to provide insight. Questions we’re interested in include: Should AI development be left in the hands of private companies? How if at all should our political and economic institutions change if we one day share the world with digital moral patients or agents? Will AI exacerbate and entrench inequalities of wealth and power? Will AI cause mass unemployment? Will AI increase the risk of war between great powers? In each of these cases, how severe is the threat, what can be done to mitigate it, and what are the relevant trade-offs?
GPI is interested in work that clarifies the nature of lock-in and the relationship between lock-in and the achievement of a desirable future. We’re also interested in work that explores whether AI is likely to bring about various types of lock-in (Karnofsky, 2021; Finnveden et al., 2022). One important-seeming type is value lock-in (MacAskill, 2022, Chapter 4): the values instantiated by advanced AI could persist for a very long time. That suggests that it is especially important to get these values right. Unfortunately, there are also many ways in which we might get these values wrong. We might endow powerful AI with the wrong theory of normative ethics, or the wrong theory of welfare, or the wrong axiology, or the wrong population ethics, or the wrong decision theory, or the wrong theory of infinite ethics. Each of these mistakes could make the future significantly worse than it otherwise would be. With what values - if any - should we endow AI?
Navigating rapid change:
As noted above, AI might lead to rapid societal and technological change. What can we do ahead of time to mitigate the risks and realise the opportunities? One idea is ensuring that powerful actors agree ahead of time to coordinate in certain ways. For example, actors might agree to share the benefits of AI and to refrain from taking actions that might be irreversible, like settling space and developing dangerous technologies. What sort of agreements would be best? Could humanity bring about and enforce agreements of this kind?
Various issues at the intersection of value theory and the philosophy of mind might be relevant to determining whether AI counts as a moral patient and how we ought to treat AI systems if so. This might include work exploring the nature of consciousness and sentience, work exploring which mental properties are relevant to moral status, and work exploring the nature of wellbeing.
Digital minds might raise unique challenges for political philosophy. For example, digital minds might be able to duplicate themselves with relative ease, which might raise challenges for integrating them into democratic systems. How if at all should our political systems change in that case?
Many of Samuel Hammond’s recent blog posts are also in this vein, and Lukas Finnveden has a recent series of posts on non-alignment project ideas for making transformative AI go well, covering governance during explosive technological growth, epistemics, sentience and rights of digital minds, backup plans for misaligned AI, and cooperative AI.
This is the general sort of area I hope to be working in. Now I just need to figure out some specific projects to dig into.