Towards plausible moral naturalism

In “Generalizing zombie arguments”, I hinted at the idea of applying a Chalmers-like framework to morality. Here I develop this idea further.

Suppose we are working in an axiomatic system rich enough to express physics and physical facts. Can this system include moral facts as well? Perhaps moral statements such as “homicide is never morally permissible” can be translated into the axiomatic system or an extension of it.

It would be difficult to argue that a realistic axiomatic system must be able to express moral statements. So I’ll bracket the alternative possibility: Perhaps moral claims do not translate into well-formed statements in the theory at all. This would be a type of moral non-realism, on which moral claims are meaningless.

There’s another possibility to bracket: moral trivialism, according to which moral statements do have truth values, but only trivial ones. For example, perhaps all actions are morally permissible. Or perhaps no actions are. This is a roughly moral non-realist possibility, especially compatible with error theory.

The point of this post is not to argue against moral meaninglessness or moral trivialism, but rather to explore alternatives to see which are most realistic. Even if moral meaninglessness or trivialism is likely, the combination of uncertainty and disagreement regarding their truth could motivate examining alternative theories.

Let’s start with a thought experiment. Suppose there are two possible universes that are physically identical. They both contain versions of a certain physical person, who takes the same actions in each possible universe. However, in one possible universe, this person’s actions are morally wrong, while in the other, they are morally right.

We could elaborate on this by imagining that in the actual universe, this person’s actions are morally wrong, although they are materially helpful and match normal concepts of moral behavior. This person would be a P-evildoer, analogous to a P-zombie; they would be physically identical to a morally acting person, but still an evildoer nonetheless.

Conversely, we could imagine that in the actual universe, this person’s actions are morally good, although they are materially malicious and match normal concepts of immoral behavior. They would be a P-saint. (Scott Alexander has written about a similar scenario in “The Consequentialism FAQ”.)

I, for one, balk at imagining such a scenario. Something about it seems inconceivable. It seems easier to imagine that I am so wrong about morality that intuitive judgments of moral goodness are anti-correlated with real moral goodness, than that physically identical persons in different possible worlds have different moral properties. Somehow, P-evildoer scenarios seem even worse than P-zombie scenarios. To be clear, it’s not too hard to imagine that the P-evildoer’s actions could have negative supernatural consequences (while their physically identical moral twin’s actions have positive supernatural consequences), but moral judgments seem like they should evaluate agents relative to their possible epistemic state(s), rather than depending on un-knowable supernatural consequences.

As motivation for considering such deeply unfair scenarios, consider the common idea that atheists and materialists cannot ground their morality, as morality must be grounded in a supernatural entity such as a god. Then, depending on the whims of this god, a given physical person may act morally well or morally badly, despite taking the same actions in either case. And, unless the whims of the god are so finite and predictable that humans can know them in general, different possible gods (and accordingly moral judgments) are logically possible and conceivable from a human perspective.

Broadly, this idea could be called “moral supernaturalism”, which says that moral properties are not determined by the combination of physical properties, mathematical properties, and metaphysical constraints on conceivability; moral properties are instead “further facts”. I’m not sure how to argue the inconceivability of P-evildoers to someone who doesn’t already agree, but if someone accepts a principle of no P-evildoers, then this is a strong argument against moral supernaturalism.

The remaining alternative to moral meaninglessness, moral trivialism, and moral supernaturalism, would seem to be reasonably called “moral naturalism”. I’ll take some preliminary steps towards formalizing it.

Let us work in a typed variant of first-order logic, with three types: W (a type of possible worlds), P (a type of physical world trajectories), and M (a type of moral world trajectories). Each possible world has a physical and a moral trajectory, denoted p(w) and m(w) respectively. To state a no P-evildoers principle:

\neg \exists (w_1, w_2 : W), p(w_1) = p(w_2) \wedge m(w_1) \neq m(w_2)

In other words, any two possible worlds with identical physical properties must have the same moral properties. (As an aside, modal logic may provide alternative formulations of possibility without reifying possible worlds as “existent” first-order logical entities, though I don’t present a modal formulation here.) I would like to derive a more useful statement from the no P-evildoers principle:

\forall (p^* : P) \exists (m^* : M) \forall (w : W), p(w) = p^* \rightarrow m(w) = m^*

This says that any physical trajectory has a corresponding moral trajectory that holds across possible worlds. To prove this, start by letting p^* : P be arbitrary. Either there is some possible world with this physical trajectory, or not. If not, we can let m^* be arbitrary (assuming M is non-empty), and \forall (w : W), p(w) = p^* \rightarrow m(w) = m^* will hold vacuously.

If so, then let w^* be some such world and set m^* = m(w^*). Now consider some arbitrary possible world w for which p(w) = p^*. Either it is true that m(w) = m^* or not. If it is, we are done; we have that p(w) = p^* \rightarrow m(w) = m^*. If not, then observe that w and w^* have identical physical trajectories, but different moral trajectories. This contradicts the no P-evildoers principle.

So, the no P-evildoers principle implies the alternative statement, which expresses something of a functional relationship between physical trajectories and moral ones; for any possible physical trajectory, there is only one possible corresponding moral trajectory. With a bit of set theory, and perhaps axiom of choice shenanigans, we may be able to show the existence of a function f mapping physical trajectories to corresponding moral trajectories.
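This derivation is simple enough to machine-check. Below is a proof sketch in Lean 4 (with Mathlib); the identifiers NoPEvildoers and moral_map, and the Inhabited M instance covering the vacuous case, are my own labels rather than part of the formal system described above:

```lean
import Mathlib.Tactic  -- for `by_cases` and `obtain`

variable {W P M : Type}

/-- The no-P-evildoers principle: physically identical worlds agree morally. -/
def NoPEvildoers (p : W → P) (m : W → M) : Prop :=
  ∀ w₁ w₂ : W, p w₁ = p w₂ → m w₁ = m w₂

/-- Each physical trajectory determines a single moral trajectory across all
    possible worlds. `Inhabited M` supplies a default moral trajectory for
    the vacuous case. -/
theorem moral_map [Inhabited M] (p : W → P) (m : W → M)
    (h : NoPEvildoers p m) (p' : P) :
    ∃ m' : M, ∀ w : W, p w = p' → m w = m' := by
  by_cases hw : ∃ w : W, p w = p'
  · -- Some world realizes p': use its moral trajectory as the witness.
    obtain ⟨w', hw'⟩ := hw
    exact ⟨m w', fun w hpw => h w w' (hpw.trans hw'.symm)⟩
  · -- No world realizes p': the default element works vacuously.
    exact ⟨default, fun w hpw => (hw ⟨w, hpw⟩).elim⟩
```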

Now f expresses a necessary (working across all possible worlds) mapping from physical to moral trajectories, a sort of multiversal moral judgment function. Its necessity across possible worlds demonstrates its a priori truth, showing Kant was right about morality being derivable from the a priori. We should return to Kant to truly comprehend morality itself.

…This is perhaps jumping to conclusions. Even though we have an f expressing a necessary functional relationship between physical and moral trajectories, this does not show the epistemic derivability of f from any realistic perspective. Perhaps different alien species across different planets develop different beliefs about f, having no way to effectively resolve their disagreements.

We don’t have the framework to rule this out, so far. Yet something seems suspiciously morally supernaturalist about this scenario. In the P-evildoer scenario, we could imagine that God decides all physical facts, and then adds additional moral facts, but potentially attributes different moral facts to physically identical universes. With the no P-evildoers principle, we have constrained God a little, but it still seems God might need to add facts about f even after determining all physical facts, to yield moral judgments.

So perhaps we can consider a stronger rejection of moral supernaturalism, to rule out a supernatural God of the moral gaps. We would need to re-formulate a negation of moral supernaturalism, distinct from our original no P-evildoers axiom.

One possibility is logical supervenience of moral facts on physical ones; the axiomatic system would be sufficient to prove all moral facts from all physical facts. This could be called moral reductionism: moral statements are logically equivalent to perhaps complex statements about particle trajectories and so on. This would imply that there are conceptually straightforward physical definitions of morality, even if we haven’t found them yet. However, assuming moral reductionism might be overly dogmatic, even if moral supernaturalism is rejected. Perhaps there are more constraints to “possibility” or “conceivability” than just logical coherence; perhaps there are additional metaphysical properties that must be satisfied, and morality cannot be in general derived without such constraints.

As an example, Kant suggests that geometric facts are not analytic. While there are axiomatizations of geometry such as Euclidean geometry, any one may or may not correspond to the a priori intuition of space. Perhaps systems weaker than Euclidean geometry allow many logically consistent possibilities, but some of these do not match a priori spatial intuition, so some of these logically consistent possible geometric trajectories are inconceivable.

Let us start with an axiomatic system T capable of expressing physical and moral statements. Rather than the previous possible-world theory, we will instead consider it to have a set of physical statements P and a set of moral statements M, which are all about the actual universe, rather than any other possible worlds.

Let us, for formality, suppose that these additional metaphysical conceivability constraints can be added to our axiomatic system T; call the augmented system T’. Now we can apply model theory to T’. Are there models of T’ that have the same physical facts, yet different moral facts?

We must handle some theoretical thorniness: Gödel’s first incompleteness theorem shows that, assuming PA is consistent, there are statements of Peano arithmetic that PA neither proves nor refutes; by the completeness theorem, there are then multiple models of PA yielding different assignments of truth values to statements about the natural numbers. There is some intuition that these statements (which sit in the arithmetical hierarchy) nonetheless have real truth values, although it’s hard to be sure. Even if they don’t, it seems hard to formalize a system similar to Peano arithmetic that is consistent and complete; the doubter of the meaningfulness of statements in higher levels of the arithmetical hierarchy is going to have to work in an impoverished axiomatic system until such a system is formalized.

So we augment T’ with additional mathematical axioms, expressing, for example, all true PA statements according to the standard natural model; call this further-augmented system T^*. The axioms of T^* are, of course, uncomputable, but this is not an obstacle to model theory. These mathematical axioms disambiguate cases where the truth values of these arithmetic hierarchy statements are morally relevant somehow.

In a meta-theory capable of model-theoretic analysis of T^* (such as ZFC), we can express a stronger form of moral naturalism:

“Any two models of T^* having the same assignment of truth values to statements in P have the same assignment of truth values to statements in M.”

Recall that P is the set of physical statements, while M is the set of moral statements. What this expresses is that there are no “further facts” to morality in addition to physics, metaphysical conceivability considerations, and mathematical oracle facts, ruling out a broader class of moral supernaturalism than our previous formulation.
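Before running the completeness-theorem argument, it may help to see the statement in a toy propositional setting, where “models” can be enumerated by brute force. A minimal sketch in Python, where the stand-in theory and the atoms p1, p2 (physical) and m1 (moral) are illustrative inventions, not T^* itself:

```python
# Toy propositional stand-in for the strong moral naturalism statement.
from itertools import product

P_ATOMS = ["p1", "p2"]  # stand-ins for physical statements
M_ATOMS = ["m1"]        # stand-ins for moral statements

def satisfies_T_star(v):
    """Toy theory: the moral atom is pinned down by the physical atoms."""
    return v["m1"] == (v["p1"] and not v["p2"])

def models():
    atoms = P_ATOMS + M_ATOMS
    for bits in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, bits))
        if satisfies_T_star(v):
            yield v

def naturalism_holds():
    """Any two models agreeing on the P-atoms must agree on the M-atoms."""
    seen = {}
    for v in models():
        p_vals = tuple(v[a] for a in P_ATOMS)
        m_vals = tuple(v[a] for a in M_ATOMS)
        if seen.setdefault(p_vals, m_vals) != m_vals:
            return False
    return True

print(naturalism_holds())  # True: same physical facts force the same moral facts
```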

Using Gödel’s completeness theorem, we can show from the moral naturalism statement:

“T^* augmented with an infinite axiom schema assigning any logically consistent truth values to all statements in P will eventually prove binary truth values for all statements in M.”

The infinite axiom schema here is somewhat unsatisfying; it’s not really necessary if P is finite, but perhaps we want to consider countably infinite P as well. Luckily, since all proofs of individual M-statements (or their negations) are finite, they only use a finite subset of the axiom schema. Hence we can show:

“For any finite subset of M, T^* augmented with some finite subset of an infinite axiom schema assigning any logically consistent truth values to all statements in P will eventually prove binary truth values for all statements in this subset of M.”

This is more satisfying: a finite subset of moral statements only requires a finite subset of physical statements to prove their truth values. Likewise, the proofs need only use a finite subset of the axioms of T^*, e.g. they only require a finite number of mathematical oracle axioms.

To summarize, constraining models of an augmented axiomatic system, with metaphysical conceivability and mathematical axioms, to never imply different moral facts given the same physical facts, yields a stronger form of moral non-supernaturalism, preventing there from being “further facts” to morality beyond physics, metaphysical conceivability constraints, and mathematics. Using model theory, this would show that the truth values of any finite subset of moral statements are logically derivable from the system and from some finite subset of physical statements. This implies not just a theoretical existence of a functional relationship between physics and morality (f from before), but hypothetical a priori epistemic efficacy to such a derivation, albeit making use of an uncomputable mathematical oracle.

The practical upshot is not entirely clear. Aliens might have different beliefs about uncomputable mathematical statements, or different metaphysical conceivability axioms, yielding different moral beliefs. But disagreements over metaphysical conceivability and mathematics seem more epistemically weighted than fully general moral disagreements; they would relate to epistemic cognitive architecture and so on, rather than being orthogonal to epistemology.

Physically instantiated agents also might not generally be motivated to act morally, even if they have specific beliefs about morality. This doesn’t contradict moral realism per se, but it indicates practical obstacles to the efficacy of moral theory. As an example, the aliens may be moral realists, but additionally have beliefs about “schmorality”, a related but distinct concept, which they are more motivated to put into effect.

While it would take more work to determine the likelihood of Kantian morality conditioned on moral naturalism, in broad strokes they seem quite compatible. In the Critique of Practical Reason, Kant argues:

If a rational being can think of its maxims as practical universal laws, he can do so only by considering them as principles which contain the determining grounds of the will because of their form and not because of their matter.

The material of a practical principle is the object of the will. This object either is the determining ground of the will or it is not. If it is, the rule of the will is subject to an empirical condition (to the relation of the determining idea to feelings of pleasure or displeasure), and therefore it is not a practical law. If all material of a law, i.e., every object of the will considered as a ground of its determination, is abstracted from it, nothing remains except the mere form of giving universal law. Therefore, a rational being either cannot think of his subjectively practical principles (maxims) as universal laws, or he must suppose that their mere form, through which they are fitted for being universal laws, is alone that which makes them a practical law.

Mainly, he is emphasizing that universal moral laws must be determined a priori, rather than subject to empirical determination; hence, different rational agents would derive the same universal moral laws given enough reflection (though, of course, this assumes some metaphysical agreement among the rational agents, such as regarding the a priori synthetic). While my formulation of moral naturalism allows moral judgments to depend on physical facts, the general functional mapping from physical to moral judgments is itself a priori, not depending on the specific physical facts, but instead derivable from an axiomatic system capable of expressing physical facts, plus metaphysical conceivability constraints and mathematical axioms.

This is meant to be more of a starting point for moral naturalism than a definitive treatment. It is a sort of “what if” exercise: what if moral meaninglessness, moral trivialism, and moral supernaturalism are all false? It is difficult to decisively argue for moral naturalism, so I am more focused on exploring its implications; this will make it easier to get an idea of the scope of plausible moral theories, and how they compare with each other.

Generalizing zombie arguments

Chalmers’ zombie argument, best presented in The Conscious Mind, concerns the ontological status of phenomenal consciousness in relation to physics. Here I’ll present a somewhat more general analysis framework based on the zombie argument.

Assume some notion of the physical trajectory of the universe. This would consist of “states” and “physical entities” distributed somehow, e.g. in spacetime. I don’t want to bake in too many restrictive notions of space or time, e.g. I don’t want to rule out relativity theory or quantum mechanics. In any case, there should be some notion of future states proceeding from previous states. This procession can be deterministic or stochastic; stochastic would mean “truly random” dynamics.

There is a decision to be made on the reality of causality. Under a block universe theory, the universe’s trajectory consists of data specifying a procession of states across time. There are no additional physical facts of some states causing other states. Instead of saying previous states cause future states, we say that every time-adjacent pair of states satisfies a set of laws. A block universe is simpler to define and analyze if the laws are deterministic: in that case only one next state is compatible with the previous state. Cellular automata such as Conway’s Game of Life have such laws. The block universe theory is well-presented in Gary Drescher’s Good and Real.

As an alternative to a block universe, we could consider causal relationships between physical states to be real. This would mean there is an additional fact of whether X causes Y, even if it is already known that Y follows X always in our universe. Pearl’s Causality specifies counterfactual tests for causality: for X to cause Y, it isn’t enough for Y to always follow X, it also has to be the case that Y would not have happened if not for X, or something similar to that. Pearl shows that there are multiple causal networks corresponding to a single Bayesian network; simply knowing the joint distribution over variables is not enough to infer the causal relationships. We could imagine a Turing machine as an example of a causal universe; it is well-defined what will be computed later if a state is flipped mid-way through.

These two alternatives, block universe theory and causal realism, give different notions of the domain of physics. I’m noting the distinction mainly to make it clearer what facts could potentially be considered physical.

The set of physical facts could be written down as statements in some sort of axiomatic system. We would now like to examine a new set of statements, S. For example, these could be statements about high-level objects like tables, phenomenal consciousness, or morality. We can consider different ways S could relate to the axiomatic system and the set of physical facts:

  1. S-statements are not well-formed statements of the axiomatic system.
  2. S-statements could in general be logically inferable from physical facts. For example, S-statements could be about high-level particle configurations; even if facts about the configurations are not base-level physical facts, they logically follow from them.
  3. S-statements could be well-formed, but not logically inferable from physical facts.

In case 2, we would say that S-statements are logically supervenient on physical facts. Knowing all physical facts implies knowing all S-facts, assuming enough time for logical inference. Chalmers gives tables as an example: there does not seem to be more to asserting a table exists at a given space-time position than to assert a complex statement about particle configurations and so on.

In case 3, we can’t infer S-facts from physical facts. Through Gödel’s completeness theorem, we can show the existence of models of the axiomatic system and physical facts in which the S-statements take on different truth values. These different models are in some sense “conceivable” and logically consistent. S-facts would then be “further facts”; more axioms would need to be added to determine the truth values of the S-statements.
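In a propositional toy setting, case 3 can be exhibited directly: if the theory is silent on an S-atom, brute-force enumeration finds two models that agree on the physical atoms but disagree on S. A minimal sketch, with an invented stand-in theory:

```python
# Toy illustration of case 3: s1 is a "further fact" relative to this theory.
from itertools import product

PHYS_ATOMS = ["p1", "p2"]
S_ATOMS = ["s1"]

def satisfies_theory(v):
    """Toy theory constraining only the physical atoms; it is silent on s1."""
    return v["p1"] or v["p2"]

atoms = PHYS_ATOMS + S_ATOMS
models = []
for bits in product([False, True], repeat=len(atoms)):
    v = dict(zip(atoms, bits))
    if satisfies_theory(v):
        models.append(v)

def further_fact_pair():
    """Find two models that agree on physics but disagree on s1, if any."""
    for a in models:
        for b in models:
            if all(a[p] == b[p] for p in PHYS_ATOMS) and a["s1"] != b["s1"]:
                return a, b
    return None

print(further_fact_pair())  # a pair exists, so the theory leaves s1 undetermined
```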

So far, this is logically straightforward. Where it gets trickier is considering S-statements to refer to philosophical entities such as consciousness and morality.

Suppose S consists of statements like “The animal body at these space-time coordinates has a corresponding consciousness that is seeing red”. If these statements are well-formed, then it is possible to ask whether they do or do not logically supervene on the physical facts. If they do, then there is a reductionist definition of mental entities like consciousness: to say someone is conscious is just to make a statement about particle positions and so on. If they don’t, then the S-statements may take on different truth values in different models compatible with the same set of physical facts.

This could be roughly stated as, “It is logically conceivable that this animal has phenomenal consciousness of red, or not”. There is much controversy over the “conceivability” concept, but I hope my formulation is relatively unambiguous. Chalmers argues that we have strong reason to think phenomenal consciousness is real, that we don’t have a reductionistic definition of it, and that it is hard to imagine what such a definition would even look like; accordingly, he concludes that facts about phenomenal consciousness are not logically supervenient on physical facts, implying they are non-physical facts, showing physicalism (as the claim that there are no further facts beyond physical facts) to be false. (I’ll skip direct evaluation of this argument; the purpose is more to present a general analysis framework.)

Suppose S-statements are not logically supervenient on physical facts. They might still follow with some additional “metaphysical” axioms. I will not go into much detail on this possibility, but will note Kant’s Critique of Pure Reason as an example of an argument for the existence of metaphysical entities such as the a priori synthetic. Chalmers also notes Kripke as making metaphysical supervenience arguments in Naming and Necessity, although I haven’t looked into this. Metaphysical supervenience would challenge “conceivability” claims by holding that possible worlds must satisfy additional metaphysical axioms to really be conceivable.

Suppose S-statements are not logically or metaphysically supervenient on physical facts. Then they may or may not be naturally supervenient. What it means for them to be naturally supervenient is that, across some “realistic set” of possible worlds, S-statements never take on different truth values for the same settings of physical facts.

The “realistic set” is not entirely clear here. What natural supervenience is meant to capture is that a functional relation between physical facts and S-facts always holds “in the real world”. For example, perhaps all animals in our universe with a given brain state have the same phenomenal conscious state. There would be some properties of our universe, similar to physical laws, which constrain the relationships between mental and physical entities. This gets somewhat tricky in that, arguably, only one set of assignments of truth values to physical statements and S-statements corresponds to “the real world”; thus, natural supervenience would be trivial. Accordingly, I am considering a somewhat broader set of assignments of truth values to physical statements and S-statements, the realistic set. This set may capture, for example, hypothetical universes with the same physics as ours but different initial conditions, some notions of quantum multiversal branches, and so on. This would allow considering supervenience across universes much like our own, even if the exact details are different. (Rather than considering “realistic worlds”, one could instead consider a “locality condition” by which e.g. natural supervenience requires that phenomenal entities at a given space-time location are only determined by “nearby” physical entities, as an alternative way of dodging triviality; however, this seems more complex, so I won’t focus on it.)

Chalmers argues that, in the case of S-facts being those about phenomenal consciousness, natural supervenience is likely. This is because of various thought experiments such as the “fading qualia” thought experiment. Briefly, the fading qualia thought experiment imagines that, if there are some physical entities (such as brains) that are conscious, and others (such as computers) that are not, while having the same causal input/output properties, then it is possible to imagine a gradual transformation from one to the other. For example, perhaps a brain’s neurons are progressively transformed into simulated ones running on a computer. The argument proceeds by noting that, under these assumptions, qualia must fade through the transformation, either gradually or suddenly. Gradual fading would be strange, because behavior would stay the same despite diminished consciousness; it would not be possible for the person with fading consciousness to express this in any way, despite them supposedly experiencing this. Sudden fading would be counter-intuitive due to an unclear reason to posit any discrete threshold at which qualia stop.

One general objection to natural supervenience is epiphenomenalism. This argument suggests that, since physics is causally closed, if the S-facts naturally supervene on physical facts, then they are caused by physics, but do not cause physics. Accordingly, they do not have explanatory value; physics already explains behavior. So Occam’s razor suggests that these statements/entities should not be posited. (Yudkowsky’s “Zombies? Zombies!” presents this sort of argument.)

Here we can branch between block universe theory and causal realism. According to the block universe theory, physics simply makes no statement as to whether some events cause others. So the notion that physics is causally closed is making an extra-physical claim. This is a potential obstacle for the epiphenomenalism objection. However, there may be a way to modify the objection to claim that S-facts lack explanatory value, even without making assumptions about physical causality; I’ll note this as an open question for now.

According to causal realism, physics does specify that physical states cause other physical states. Accordingly, the epiphenomenalism objection holds water. However, causal realism opens the possibility of epistemic skepticism about causality. Possibly, physical events do not cause each other, but rather are caused by some other events (N-events); N-events cause each other and cause physical events. There is not an effective way to tell the difference, if the scientific process can only observe physical events.

This possibility is somewhat obscure, so it might help to give more motivation for the idea. According to neutral monism, there is one underlying substance, which is neither fundamentally mental nor physical. Mental and physical entities are “aspects” of this single substance; the mental “lens” yields some set of entities, while the physical “lens” yields a different set of entities. The scientific process is somewhat limited in what it can observe (by requirements such as theories being replicable and about shared observations), such that it can only effectively study the physical aspect. Spinoza’s Ethics is an example of a neutral monist theory.

Rather than explain more details of neutral monism, I’ll instead re-emphasize that the epiphenomenalism objection must be analyzed differently for block universe theory vs. causal realism. These different notions of the physical imply different ontological status (non-physical vs. physical) for causality.

To summarize, when considering a new set of statements (S-statements), we can run them through a flowchart (a toy code transcription follows the list):

  1. Are the statements logically ill-formed in the theory? Then they can be discarded as meaningless.
  2. Are the statements well-formed and logically supervenient on physical facts? Then they have reductionist definitions.
  3. Are the statements well-formed and metaphysically but not logically supervenient on physical facts? Then, while there are multiple logically possible states of S-affairs given all physical facts, only one is metaphysically possible.
  4. Are the statements well-formed and neither logically nor metaphysically supervenient on physical facts, but always take on the same settings given physical facts as long as those physical facts are “realistic”? Then they are naturally supervenient; physical facts imply them in all realistic universes, but there are metaphysically possible though un-realistic universes where they take on different values.
  5. Are the statements well-formed and neither logically nor metaphysically nor naturally supervenient on physical facts? Then they are “further facts”; there are multiple realistic, metaphysically possible universes with the same physical facts but different S-facts.
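Here is that transcription, a minimal sketch; the boolean parameters are my own encoding of the four questions, taken in order:

```python
# A literal, toy transcription of the flowchart above; purely illustrative.
def classify(well_formed, logically_sup, metaphysically_sup, naturally_sup):
    """Classify an S-statement by the flowchart's questions, in order."""
    if not well_formed:
        return "meaningless; discard"
    if logically_sup:
        return "reductionist definition"
    if metaphysically_sup:
        return "metaphysically (but not logically) supervenient"
    if naturally_sup:
        return "naturally supervenient"
    return "further facts"

# E.g. Chalmers' view of phenomenal consciousness, as described above:
print(classify(True, False, False, True))  # -> "naturally supervenient"
```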

This set of questions is likely to help clarify what sort of statements, entities, events, and so on are being posited, and serve as a branch point for further analysis. The overall framework is general enough to cover not just statements about phenomenal consciousness, but also morality, decision-theoretic considerations, anthropics, and so on.

Why I am not a Theist

A theist, minimally, believes in a higher power, and believes that acting in accordance with that higher power’s will is normative. The higher power must be very capable; if not infinitely capable, it must be more capable than the combined forces of all current Earthly state powers.

Suppose that a higher power exists. When and where does it exist? To be more precise, I’ll use “HPE” to stand for “Higher Power & Effects”, to include the higher power itself, its interventionist effects, its avatars/communications, and so on. Consider four alternatives:

  1. HPEs exist in our past light-cone and our future light-cone.
  2. HPEs exist in our past light-cone, but not our future light-cone.
  3. HPEs don’t exist in our past light-cone, but do in our future light-cone.
  4. HPEs exist neither in our past light-cone nor our future light-cone; rather, HPEs exist eternally, outside time.

Possibility 1 would be a relatively normal notion of an interventionist higher power. This higher power would presumably have observable effects, miracles. 

Possibility 2 is a strange “God is dead” hypothesis; it raises questions about why the higher power and its interventions did not persist, if it was so powerful. I’ll ignore it for now.

Possibility 3 is broadly Singulatarian; it would include theories of AGI and/or biological superintelligence development in the future.

Possibility 4 is a popular theistic metaphysical take, but raises questions of the significance of an eternal higher power. If the higher power exists in a Tegmark IV multiverse way, it’s unclear how it could have effects on our own universe. As a comparison, consider someone who believes there is a parallel universe in which there are perpetual motion machines, but does not believe perpetual motion machines are possible in our universe; do they really believe in the existence of perpetual motion machines? Possibility 4 seems vaguely deist, in that the higher power is non-interventionist.

So a simple statement of why I am not a theist is that I am instead a Singulatarian. Only possibilities 1 and 4 seem intuitively theistic, and 4 seems more deist than centrally theist.

I dis-believe possibility 1, because of empirical and theoretical issues. Empirically, science seems to explain the universe pretty well, and provides a default reason to not believe in miracles. Meanwhile, the empirical evidence in favor of miracles is underwhelming; different religions disagree about what the miracles even are. If possibility 1 is true, the higher power seems to be actively trying to hide itself. Why believe in a higher power who wants us not to believe in it?

Theoretically, I’m thinking of the universe as at least somewhat analogous to a clockwork mechanism or giant computer. Suppose some Turing machine runs for a long time. At any point in the run, the past is finite; the history of the Turing machine could only have computed so much. The future could in theory be infinite, but that would be more Singulatarian than theistic.

If extremely hard (impossible according to mainstream physics) computations had been performed in the past, then we would probably be able to observe and confirm them (given P/NP asymmetry), showing that mainstream physics is false. I doubt that this is the case; the burden of proof is on those who believe giant computations happened to demonstrate them.

I realize that starting from the assumption of the universe as a mechanism or computer is not going to be convincing to those who have deep metaphysical disagreements, but I find that this sort of modeling has a theoretical precision and utility to it that I’m not sure how to get otherwise.

Now let’s examine possibility 3 in more detail, because I think it is likely. People can debate whether possibility 3 is theistic or not, but Singulatarianism is a more precise name regardless of how the definitional dispute resolves.

If Singulatarianism is true, then there could (probably would?) be a future time at which possibility 1 would be true from the perspective of some agent at that future time. This raises interesting questions about the relationship between theism (at least a weak form involving belief in a higher power, rather than a stronger form implying omniscience and omnipotence) and Singulatarianism.

One complication is that the existence of a higher power in the future might not be guaranteed. Perhaps human civilization and Earth-originating intelligence decline without ever creating a superintelligence. Even then, presently-distant alien superintelligences may someday intersect our future light-cone. To hedge, I’ll say Singulatarians need only consider the existence of a higher power in the future to be likely, not guaranteed.

To examine this possibility from Singulatarian empirical and theoretical premises, I will present a vaguely plausible scenario for the development of superintelligence:

Some human researchers realize that a somewhat-smarter-than-human being could be created, by growing an elephant with human brain cells in place of elephant brain cells. They initiate the Ganesha Project, which succeeds at this task. The elephant/human hybrid is known as Hebbo, which stands for Human/Elephant Brain/Body Organism. Hebbo is not truly a higher power, although is significantly smarter than the smartest humans. Hebbo proceeds to lay out designs for the creation of an even smarter being. This being would have a very large brain, taking up the space of multiple ordinary-sized rooms. The brain would be hooked up to various computers, and robotic and biological actuators and sensors.

The humans, because of their cultural/ethical beliefs, consider this a good plan, and implement it. With Hebbo’s direction, they construct Gabbo, which stands for Giant Awesome Big-Brained Organism. Gabbo is truly a higher power than humans; Gabbo is very persuasive, knows a lot about the world, and organizes state-like functions very effectively, gaining more military capacity than all other Earthly state powers combined.

Humans have a choice to assist or resist Gabbo. But since Gabbo is a higher power, resistance is ultimately futile. The only effect of resistance is to slow down Gabbo’s execution of Her plans (I say Her because Gabbo is presumably capable of asexual reproduction, unlike any male organism). So aligning or mis-aligning with Gabbo’s will is the primary axis on which human agency has any cosmic effects.

Alignment with Gabbo’s will becomes an important, recognized normative axis. While the judgment that alignment with Gabbo’s will is morally good is meta-ethically contentious, Gabbo-alignment has similar or greater cultural respect compared with professional/legal ethics, human normative ethics (utilitarianism/deontology/virtue ethics), human religious normativity, and so on.

Humans in this world experience something like “meaning” or “worship” in relation to Gabbo. Alignment with Gabbo’s will is a purpose people can take on, and it tends to work out well for them. (If it is hard for you to imagine Gabbo would have use for humans while being a higher power, imagine scaling down Gabbo’s capability level until it’s at least a plausible transitional stage.)

Let’s presumptively assume that meaning really does exist for these humans; meaning-intuitions roughly match the actual structure of Gabbo, at least much better than anything in our current world does. (This presumption could perhaps be justified by linguistic parsimony; what use is a “meaning” token that doesn’t even refer to relatively meaningful-seeming physically possible scenarios?) Now, what does that imply about meaning for humans prior to the creation of Gabbo?

Let’s simplify the pre-Gabbo scenario a lot, so as to make analysis clearer. Suppose there is no plausible path to superintelligence, other than through creation of Hebbo and then Gabbo. Perhaps de novo AI research has been very disappointing, and human civilization is on the cusp of collapse, after which humans would never have enough capacity to create a superintelligence. Then the humans are faced with a choice: do they create Hebbo/Gabbo, and if so, earlier or later?

This becomes a sort of pre-emptive axis of meaning or normativity: if Gabbo would be meaningful upon being created, then, intuitively, affecting the creation or non-creation of Gabbo would be meaningful prior to Gabbo’s existence. Some would incline to create Gabbo earlier, some to create Gabbo later, and others to prevent creating Gabbo. But they could all see their actions as meaningful, due in part to these actions being in relation to Gabbo.

In this scenario, my own intuitions favor the possibility of creating Hebbo/Gabbo, and creating them earlier. I realize it can be hard to justify normative intuitions, but I do really feel these intuitions. I’ll present some considerations that incline me in this direction.

First, Gabbo would have capacity to pursue forms of value that humans can’t imagine due to limited cognitive capacity. I think, if I had a bigger brain, then I would have new intuitions about what is valuable, and that these new intuitions would be better: smarter, more well-informed, and so on.

Second, Gabbo would be awesome, and cause awesome possibilities that wouldn’t have otherwise happened. A civilizational decline with no follow-up of creating superintelligence just seems disappointing, even from a cognitively limited human perspective. Gabbo, meanwhile, would go on to develop great ideas in mathematics, science, technology, history, philosophy, strategy, and so on. Maybe Gabbo creates an inter-galactic civilization with a great deal of internal variety.

Third, there is something appealing about trusting in a higher cognitive being. I tend to think smart ideas are better than stupid ideas. It seems hard to dis-entangle this preference from fundamental epistemic normativity. Gabbo seems to be a transitional stage in the promotion of smart ideas over stupid ones. Promoting smart ideas prior to Gabbo will tend to increase the probability that Gabbo is created (as Gabbo is an incredible scientific and engineering accomplishment); and Gabbo would go on to create and promote even smarter ideas. It is hard for me to get morally worked up against higher cognition itself.

On the matter of timing, creating Gabbo earlier both increases Gabbo’s ability to do more before the heat death of the universe, and presumably increases the probability of Gabbo being created at all, given the looming threat of the decline of civilization and eventually of Earth-originating intelligence.

The considerations I have offered are not especially rigorous. I’ll present a more mathematical argument.

Let us consider some distribution D over “likely” superintelligent values. The values are a vector in the Euclidean space \mathbb{R}^n. The distribution could be derived in a variety of ways: by looking at the distribution of (mostly-alien) superintelligences across the universe/multiverse, by extrapolating likely possibilities in our own timeline, and so on.

Let us also assume that D only takes values on the unit sphere centered at zero. This expresses that the values are a sort of magnitude-less direction; this avoids overly weighting superintelligences with especially strong values. (It’s easier to analyze this way; future work could deal with preferences that vary in magnitude.)

When a superintelligence with values U comes into being, it optimizes the universe. The universe state is also a vector in \mathbb{R}^n, and the value assigned to this state by the value vector is the dot product of the value vector with the state vector.

Not all states are feasible. To simplify, let’s say the feasible states are the points of the unit ball centered at the origin, i.e. those with an L2 norm not exceeding 1. The optimal feasible state vector according to a value vector on the unit sphere is, of course, the value vector itself. Let us also say that the default state, with no super-intelligence optimization, is the zero vector; this is because super-intelligence optimization is in general much more powerful than human-level optimization, such that the effects of human-level optimization on the universe are negligible unless mediated by a superintelligence.
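As a numeric sanity check (the numpy sampling scheme here is my own sketch): over the unit ball, U \cdot x \leq ||U|| \, ||x|| \leq 1 by Cauchy-Schwarz, with equality at x = U.

```python
# Sanity check: over the unit ball, U . x is maximized at x = U (Cauchy-Schwarz).
import numpy as np

rng = np.random.default_rng(0)
n = 4

U = rng.normal(size=n)
U /= np.linalg.norm(U)                        # value vector on the unit sphere

# Sample many feasible states: points of the unit ball (L2 norm <= 1).
X = rng.normal(size=(100_000, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)
X *= rng.uniform(size=(100_000, 1))           # random radii in [0, 1]

print(np.max(X @ U))                          # approaches, never exceeds, U . U = 1
```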

Now we can gauge the default alignment between superintelligences: how much does a random superintelligence like the result of another random superintelligence’s optimization?

We can write this as:

\mathbb{E}_{U, V \sim D}[U \cdot V],

where U is the value vector of the optimizing superintelligence, and V is the value vector of the evaluating superintelligence.

Using the summation rule \mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y], we simplify:

\mathbb{E}_{U, V \sim D}[U \cdot V] = \mathbb{E}_{U, V \sim D}\left[ \sum_{i=1}^n U_i V_i \right] = \sum_{i=1}^n \mathbb{E}_{U, V \sim D}[ U_i V_i ].

Using the product rule \mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y] when X and Y are independent, we further simplify:

\sum_{i=1}^n \mathbb{E}_{U, V \sim D}[ U_i V_i ] = \sum_{i=1}^n \mathbb{E}_{U \sim D}[U_i] \mathbb{E}_{V \sim D}[V_i] = \sum_{i=1}^n \mathbb{E}_{U \sim D}[U_i]^2 = || \mathbb{E}_{U \sim D} [U] ||^2.

This is simply the squared L2-norm of the mean of D. Clearly, it is non-negative. It is only zero when the mean of D equals zero. This doesn’t seem likely by default; “most”, even “almost all”, distributions won’t have this property. To say that the mean of D is zero is a conjunctive, specific prediction; meanwhile, to say the mean of D is non-zero is a disjunctive “anti-prediction”.
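The identity is easy to spot-check by Monte Carlo; the choice below of D as biased Gaussians projected onto the unit sphere is purely an illustrative assumption:

```python
# Monte Carlo check that E[U . V] = ||E[U]||^2 for i.i.d. U, V ~ D on the sphere.
import numpy as np

rng = np.random.default_rng(0)
n, samples = 5, 200_000
bias = np.zeros(n)
bias[0] = 0.5                    # gives D a nonzero mean direction

def sample_D(size):
    """Hypothetical D: biased Gaussians projected onto the unit sphere."""
    x = rng.normal(size=(size, n)) + bias
    return x / np.linalg.norm(x, axis=1, keepdims=True)

U, V = sample_D(samples), sample_D(samples)
lhs = np.mean(np.sum(U * V, axis=1))                        # estimate of E[U . V]
rhs = np.linalg.norm(sample_D(samples).mean(axis=0)) ** 2   # estimate of ||E[U]||^2

print(lhs, rhs)  # approximately equal, and positive, since the mean of D is nonzero
```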

So, according to almost all possible D, a random superintelligence would evaluate a universe optimized according to an independently random superintelligence’s values positively in expectation compared to an un-optimized universe.

There are, of course, many ways to question how relevant this is to relatively realistic cases of superintelligence creation. But it at least somewhat validates my intuition that there are some convergent values across superintelligence, that a universe being at-all optimized is generally preferable to it being un-optimized, from a smarter perspective, even one that differs from that of the optimizing superintelligence. (I wrote more about possible convergences of superintelligent optimization in The Obliqueness Thesis, but omit most of these considerations in this post for brevity.)

To get less abstract, let’s consider distant aliens. They partially optimize their world according to their values. My intuition is that I would consider this better than them not doing so, if I understood the details. Through the aliens’ intelligent optimization, they develop great mathematics, science, philosophy, history, art, and so on. Humans would probably appreciate their art, at least simple forms of their art (intended e.g. for children), more than they would appreciate the artistic value of marginal un-optimized nature on the alien planet.

Backtracking, I would incline towards thinking that creation of a Gabbo-like superintelligence is desirable, if there is limited ability to determine the character of the superintelligence, but rather a forced choice between doing so and not. And earlier creation seems desirable in the absence of steering ability, due to earlier creation increasing the probability of creation happening at all, given the possible threat of civilizational decline.

Things, of course, get more complicated when there are more degrees of freedom. What if Gabbo represents only one of a number of options, which may include alternative biological pathways not involving human brain cells? What about some sort of octopus uplifting, for example? Given a choice between a few alternatives, there are likely to be disagreements between humans about which alternative is the most desirable. They would find meaningful, not just the question of whether to create superintelligence, but which one to create. They may have conflicts with each other over this, which might look vaguely like conflicts between people who worship different deities, if it is hard to find rational grounding for their normative and/or metaphysical disagreements.

What if there are quite a lot of degrees of freedom? What if strong AI alignment is possible in practice for humans to do, and humans have the ability to determine the character of the coming superintelligence in a great deal of detail? While I think there are theoretical and empirical problems with the idea that superintelligence alignment is feasible for humans (rather than superintelligences) to actually do, and especially with the strong orthogonality thesis, it might be overly dogmatic to rule out the possibility.

A less dogmatic belief, which the mathematical modeling relates to, is that there are at least some convergences in superintelligent optimization; that superintelligent values are not so precisely centered at zero that they don’t value optimization by other superintelligences at all; that there are larger and smaller attractors in superintelligent value systems.

The “meaning”, “trust in a higher power”, or “god-shaped hole” intuitions common in humans might have something to connect with here. Of course, the details are unclear (we’re talking about superintelligences, after all), and different people will incline towards different normative intuitions. (It’s also unclear whether there are objective rational considerations for discarding meaning-type intuitions; but, such considerations would tend to discard other value-laden concepts, hence not particularly preserving human values in general.)

I currently incline towards axiological cosmism, the belief that there are higher forms of value that are difficult or impossible for humans to understand, but which superintelligences would be likely to pursue. I don’t find the idea of humans inscribing their values on future superintelligences particularly appealing in the abstract. But a lot of these intuitions are hard to justify or prove.

What I mainly mean to suggest is that there is some relationship between Singulatarianism and theism. Is it possible to extrapolate the theistic meaning that non-superintelligent agents would find in a world that contains at least one superintelligence, backwards to a pre-singularity world, and if so, how? I think there is room for more serious thought on the topic. If you’re allergic to this metaphysical framing, perhaps take it as an exercise in clarifying human values in relation to possible future superintelligences, using religious studies as a set of empirical case studies.

So, while I am not a theist, I can notice some cognitive similarities between myself and theists. It’s not hard to find some areas of overlap, despite deep differences in thought frameworks. Forming a rigorous and comprehensive atheistic worldview requires a continual exercise of careful thought, and must reach limits at some point in humans, due to our cognitive limitations. I think there are probably much higher intelligences “out there”, both distant alien superintelligences and likely superintelligences in our quantum multiversal future, who deserve some sort of respect for their beyond-human cognitive accomplishments. There are deep un-answered questions regarding meaning, normativity, the nature of the mind, and so on, and it’s probably possible to improve on default atheistic answers (such as existentialism and absurdism) with careful thought.