“Credibility” for being unbelievable

The word “credible” is perversely ambiguous.  On the face of it, it means: being trustworthy, being believable (in a Bayesian sense), being likely to make true statements and pay one’s debts.  But there’s another way the word is used, which is to indicate authority and prestige: control over which propositions are considered “truthy” (and/or agreement with controlling processes), rather than prediction of which statements are actually true.

Control over narratives, however, is anticorrelated with, and opposed to, actual believability.  If you can control the narrative to say that some proposition is either X or ~X at will, arbitrarily, then you’re using a symmetric process for “convincing” others: it’s just as easy to use it to convince of falsehood as of truth.  This is as opposed to asymmetric processes which are easier to use to convince of truth than of falsehood, e.g. public experiments, logical debate.
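The symmetric/asymmetric distinction can be made concrete with Bayes' rule. The following is a minimal sketch with made-up numbers; the function name `posterior` and the probabilities are illustrative assumptions, not measurements of any real process.

```python
def posterior(prior, p_claim_if_true, p_claim_if_false):
    """P(X | the source asserts X), by Bayes' rule."""
    joint_true = prior * p_claim_if_true
    joint_false = (1 - prior) * p_claim_if_false
    return joint_true / (joint_true + joint_false)

# Symmetric process: the source asserts X equally readily whether or not X holds.
symmetric = posterior(0.5, 0.9, 0.9)

# Asymmetric process (hypothetical numbers for something like a public experiment):
# the assertion is much more likely to be produced when X is true.
asymmetric = posterior(0.5, 0.9, 0.1)

print(symmetric)   # 0.5: the assertion leaves the prior unchanged
print(asymmetric)  # roughly 0.9: the assertion is real evidence for X
```

Under these assumptions, an assertion produced by a symmetric process has a likelihood ratio of 1, so a Bayesian listener should not update on it at all.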

(The word “authority” is interesting here: “authority”, “authoritarian”, and “author” come from the same root, indicating a relation between the “authoring” of arbitrary narratives, “authoritarian” use of force by some parties to control others, and “authority” assigned to statements and producers of statements.)

While oracular reality-trackers discern facts, authority creates facts, primarily social facts; if these are the “facts” used to determine credibility, then authority and those close to it can “win” credibility, while having no corresponding ability to discern truth.

Being in a position to control narratives means having power: having maneuvered into a position to exert arbitrary influence on others.  Since power is rivalrous (it can’t be the case that everyone has lots of arbitrary influence on everyone else), acquiring power requires winning zero-sum games.  Winning zero-sum games requires allocating attention to the game itself; unless the game is set up so as to correlate with truth (e.g. a formal debate judged according to pro-epistemology standards such as logical rigor and consistency with evidence), it will be won by actors who are barely paying attention to the truth, who are bullshitting (not simply lying!).

Beyond this, zero-sum game play is opposed to revelation of information; such revelation is interpreted as aggression, as it breaks the “nothing changes” power-maintaining equilibrium.

The “calling a deer a horse” story is illustrative, demonstrating more severity than simply not paying attention to the truth.  When Zhao Gao points to a deer and says it is a horse, he effectively controls the narrative: those who want to live will “agree” with him that it’s a horse.  He isn’t believable, but he’s authoritative; he’s “credible”, as are those who submit to the threat and “agree” (ironically) with him.  (Ironic agreement is a state of doublethink, of internally disbelieving while outwardly agreeing; such ironic states of mind are suited to environments of reversed credibility.)

This story is more severe than simple bullshit, in that it involves selectively promoting false statements.  Paying enough attention to the truth to invert it and thus gain an advantage over truth-based actors is, of course, compatible with zero-sum play.

If a government known to promote lots of false stories promotes a false story as part of mobilization of military/police threat (say, the story that Saddam Hussein purchased yellow cake), is this story “credible” or “non-credible”?  It will be printed in prestigious newspapers, and will become a default assumption in many discussions, but people tracking history will have a sense of the government’s track record and know that the claim is made by the sort of actor who gets there by bullshitting.

Fiat currency is an interestingly explicit case.  The US adopted a metallic standard in 1785; government-issued money notes (US dollars) were exchangeable for a particular amount of a precious metal, initially silver and then gold.  To value US dollars is to bet that the government will be willing to exchange them for silver/gold; the money is valuable insofar as this promise is credible.

However, around WWI (1914-1918), many governments (including the US) suspended convertibility.  If the value of the money were simply based on the belief that it could be exchanged for precious metal, then the value would have plummeted accordingly.  But by then the money unit was well-integrated into the economy: it was used to set prices, pay wages, pay taxes, denominate bank savings and loans, and so on.  Changing protocols everywhere to adopt a new currency would be slow and difficult, and (given taxation) would run into conflict with the government.  While the value of money did fall substantially (e.g. prices doubled in the US), this was not the totalizing devaluation that would be naively expected from a collapse of convertibility.

During the Great Depression, through Executive Order 6102 of 1933, the US government confiscated the vast majority of gold, “exchanging” it for a fixed amount of US dollars.  By the time the government is confiscating almost all gold, it’s obvious that US dollars are not valued primarily due to the expectation that they could be exchanged for gold.

So, though the “credibility” (market value) of the US dollar originally came from the belief that it could be exchanged for gold, its credibility over time shifted to be backed primarily by the authority of the US government, which is opposed to the expectation that it will pay debts.  Even if US dollars can’t be exchanged for precious metal, they are (since 1884) legal tender, valid for paying public debts (e.g. taxes) and private debts.  Since US dollars are valid for private debts (according to US courts), it’s impractical for private debts between Americans to not be reliant on the “credibility” of the US dollar.

US dollars are, at this point, a stage 3-4 simulacrum with respect to the original claim of value.  This paves the way for further manipulation of currency through Federal Reserve policy implementing Keynesian macroeconomics, a form of military mobilization (the relation between macroeconomics and mobilization is de-obfuscated by Modern Monetary Theory).  Direct manipulation of the currency is, of course, a form of authority, opposed to believability, in that it undermines use of the currency to denominate unironic debts.

Back to the more general problem.  If you asked an average college-educated American whether institutions such as the CDC or the WHO are credible, they would probably say “yes”.  However, these institutions repeatedly made hard-to-believe claims during COVID, such as the claim that masks were unhelpful, or the claim that the virus was not airborne.  Prestigious news outlets such as the New York Times did not call out these claims as false early on, which is correlated with such outlets’ “credibility”; they’re “credible” due to repeating claims made by authoritative narrative-controllers (thus, being part of the narrative-control apparatus), not due to tracking reality.

As Nick Land asks: “Assuming the WHO, CDC, and FDA wanted to kill you, how would their behavior differ?”  It wouldn’t be a coincidence for authoritative institutions to be trying to kill those they exert authority over: power is the ability to threaten others, and threats can control narratives.

I’ve seen a lot of discussions where people with some shared explicit agenda (e.g. Effective Altruists) talk about the need to “gain credibility”, and assume that the way to do so is to be closer to power; their central example of a “credible” person would be a high-level corporate/government strategic consultant or a journalist of a prestigious publication.  Such talk doesn’t distinguish between credibility-as-believability and credibility-as-authority: is being a strategic consultant helpful for convincing others because it is correlated with saying true propositions, or is it helpful because the authority of the institution (or upstream institutions) intimidates people into accepting claims made by its members despite their unbelievability?

In conclusion:

  • “Credibility” conflates believability (Bayesian evidence) with authority (ability to control narratives arbitrarily).
  • Authority is derived from zero-sum game play, which is opposed to revelation of new information, and which threatens those over whom authority is exerted.
  • Thus, these different properties being conflated are opposed.

On commitments to anti-normativity

Normativity: morality, ethics, doing the right thing, treating others as one would want to be treated, respecting moral symmetries, telling the truth, keeping commitments, following rules that are there to restrict harmful behavior, behaving in a way that contributes to the benefit of one’s society.

The idea of commitment to normativity is familiar.  Someone can be committed to behaving ethically, to the point that they forgo some narrowly self-interested benefit to avoid behaving unethically.

What about commitment to anti-normativity?  This is commitment to doing the wrong thing, treating others as one wouldn’t want to be treated, disregarding moral symmetries, lying, breaking commitments, preventing rules from being followed, and parasitizing one’s society.

It is, naively, unsurprising that some people behave non-normatively, because non-normative behavior can bring a selfish benefit.  It is rather more surprising that commitment to anti-normativity may be a thing; such a commitment would cause one to continue behaving anti-normatively, even when normative behavior would be selfishly optimal.

Let’s look at some examples of anti-normativity:

  • The phrase “snitches get stitches”, and the idea that whistleblower protections might be necessary, point at how common criminal conspiracies are; such conspiracies punish members not for breaking the law, but for causing the law to be enforceable.  Turning in other members of a conspiracy one is part of is, in a sense, aggressing upon them: it’s causing them to face negative consequences they expected not to face.  Members of a conspiracy commit to hiding themselves and each other from the law.
  • Privacy-related social norms are optimized for obscuring behavior that could be punished if widely known.  A common justification for such norms is that behavior that would be punished if known about is common, hence actual punishment is unfair scapegoating based on unpredictable factors; under privacy norms, revelation is more rare.  Such norms are sometimes enshrined into law, e.g. the Right to be Forgotten, by which some people can force records of their own behavior to be deleted.  (Note, privacy norms are an example of a paradoxical norm that is opposed to enforcement of norms-in-general).
  • Traumatized people are forcefully made part of a conspiracy, and learn to side with the transgressor who is aggressing upon them.  Such learning generalizes to siding with transgressors in general, as described in The Body Keeps the Score; while watching a play about dating violence, the traumatized children yell things like “kill the bitch”, siding with the transgressor in the scene.  This is despite this transgressor not actually being powerful; in the outer setting in which the play is being put on, such behavior is frowned upon, so the traumatized kids are going against powerful social structures.  (It is easy for traumatized people to conflate transgressiveness with power, but these frequently come apart)
  • It’s very common to want to exclude people who are too “moralistic” or “judgy” from social groups.  If this were just a matter of disagreeing with these people about morality, then moral argumentation would be the most natural response; what is being opposed is, rather, individuals making moral judgments in a way that implies that some normal behaviors are unacceptable.  Being committed to behaving normally, then, means being committed not to follow moral laws that would compel behaving abnormally.  (Relatedly, “vice signalling”, e.g. smoking, can make others less afraid of moral judgment, as the vice signaller has morally lowered themselves, having less optionality to claim the moral high ground.  Many Christian teachings, e.g. “judge not lest you be judged”, “Recognize always that evil is your own doing, and to impute it to yourself.”, recommend the social strategy of not claiming moral high ground.)
  • Some social groups separate themselves from the “commoners”, making it clear that they’re a different class, not subject to the rules that constrain the commoners, e.g. militaries, intelligence agency members, high-level corporate executives, some professional classes, some spiritual practitioners, aristocracies throughout history.  The Inner Ring describes a general dynamic of this form.  They may transgression-bond with each other to show that they are not subject to the normal rules.  Nazi legal theorist Carl Schmitt writes that “Sovereign is he who decides on the exception”, i.e. the truly autonomous leader can allow rules to be broken at will; David Graeber describes royal and ritualistic power as involving socially tolerated value inversion in the last chapter of On Kings.

Why would dynamics like these result in commitments to anti-normativity? In some cases, like criminal conspiracy, the answer is obvious: exiting the conspiracy is, by default, dangerous. In general, being part of a conspiracy for enough time will cause conspiratorial behavior to seem “normal”, such that going back to non-conspiratorial behavior requires resetting one’s sense of normal behavior, as in cult deconversion.

Anti-normativity is closely related to motive ambiguity; if there is ambiguity between the motives of normativity and of local expediency (or other local social motives), then behaving anti-normatively signals that local expediency is what is being optimized for, and shows that one is giving up the option of blaming others for behaving non-normatively.

A bubble of anti-normativity is one where members are constantly signalling that they are behaving non-normatively and are encouraging others to behave non-normatively as well.  Such a bubble (essentially, a conspiracy) can maintain itself as long as it can continue meeting its constraints, e.g. intaking enough resources and not being successfully opposed.

How is anti-normativity related to oppression?  In a society that runs on normativity, there can be something approaching equality of opportunity; people can gain for themselves by following the rules and providing value to others.  In a society that runs on anti-normativity, such strategies will fail.  Instead of following the rules being the way to get ahead, accommodating anti-normativity while still conforming to local cultural expectations is necessary to get ahead.  Kelsey Piper recently described dynamics in bureaucracies by which lower-class people get treated worse than upper-middle-class people, despite appealing to the same rules.  Simply depending on the bureaucracies to follow the rules fails, since they don’t follow the rules; instead, it’s necessary to have more subtle social skills, such as knowing when to appeal, talking to people in a polite yet demanding way, seeming like the kind of person who society generally treats well, seeming to be expensive to mess with, and so on.

Our society has a term for people who follow rules consistently (Asperger’s syndrome); it is considered a mental disorder, one that sharply reduces people’s social skills.  While Asperger’s is adaptive in lawful societies, it is maladaptive in anti-lawful societies, such as Nazi Germany, where the term was coined.  Hans Asperger was a Nazi who euthanized some of his patients; he identified the flaw of Asperger’s patients as failure to be absorbed into the national super-organism, a flaw also attributed to Jews, who have a highly lawful religion and are disproportionately likely to be diagnosed with Asperger’s.

If bureaucracies followed rules consistently, then Asperger’s would not be a social disadvantage; it would imply a high ability to navigate society.  In a society where corporations and other bureaucracies are anti-normative moral mazes, Asperger’s is a disadvantage, because appealing to rules alone is not an effective way to cause bureaucracies to provide service.

(A common intuition is that bureaucracies are bad because they follow the rules consistently, lacking subtle human factors.  As a counter to this intuition, consider the case of MMORPGs; the game mechanics function as a rule-following bureaucracy, e.g. the mechanics of stores and banking in the game.  Such games are fun because of the consistency of the software rules; inconsistency in game mechanics decreases predictability of effects of action, thereby decreasing effective planning horizons and increasing perception of unfairness.)

One can appeal to institutions on the basis of rules, or one can appeal on the basis of privilege, being the sort of person who should be rewarded for no reason.  Social classes are a matter of privilege, of people being treated one way or another because of who they are, what category they fit in, based on largely aesthetic properties.

If treatment by institutions is a matter of illegible cultural factors, then a large part of what is important is to be “normal”: being near the center of some Gaussian-ish distribution over people, such as a social class.  When everyone is transgressing, non-transgression isn’t a defense, while not standing out from the crowd (hiding as a statistic) is, since it prevents being singled out for scapegoating. The behavior is much more Fristonian (avoiding surprise) than decision-theoretic (trying to accomplish something that isn’t already the case).

Culture is correlated with race, both because people of different ancestry have different histories, and because people treat each other differently depending on appearance.  If society’s institutions are disproportionately occupied by people of some cultural group, then their sense of “normal” will accord with what is normal for that cultural group, not what is normal for other cultural groups.

So, anti-normativity is racially/culturally biased by default, in a way that normativity isn’t, or at least is much less so.  While explicit rules can be followed by people of a variety of different cultures, implicit social expectations are naturally particular to a narrow set of cultures.  Anti-normativity will tend to force behavior to follow a Gaussian-like distribution, where more central behavior is, by default, more rewarded than extremal behavior (with the exception of savvy extremal behavior optimized for taking advantage of the anti-normative dynamic).

Therefore, explicit anti-racism is much more necessary for mitigating oppression if anti-normativity is dominant than if normativity is dominant; having institutions staffed by people of a variety of different cultures broadens the set of what is considered normal by people in the institution, causing it to be more natural for the institution to service people of a variety of races/cultures.  This is, obviously, nowhere near a good solution, since institutions are still not following the rules, and not all cultures can be represented in a given institution; it is, rather, a harm-reduction measure given an already-bad situation.

Many-worlds versus discrete knowledge

[epistemic status: I’m a mathematical and philosophical expert but not a QM expert; conclusions are very much tentative]

There is tension between the following two claims:

  • The fundamental nature of reality consists of the wave function whose evolution follows the Schrödinger equation.
  • Some discrete facts are known.

(What is discrete knowledge? It is knowledge that some nontrivial proposition X is definitely true. The sort of knowledge a Bayesian may update on, and the sort of knowledge that logic applies to.)

The issue here is that facts are facts about something. If quantum mechanics has any epistemic basis, then at least some things are known, e.g. the words in a book on quantum mechanics, or the outcomes of QM experiments. The question is what this knowledge is about.

If the fundamental nature of reality is the wave function, then these facts must be facts about the wave function. But, this runs into problems.

Suppose the fact in question is “A photon passed through the measurement apparatus”. How does this translate to a fact about the wave function?

The wave function consists of a mapping from the configuration space (some subset of R^n) to complex numbers. Some configurations (R^n points) have a photon at a given location and some don’t. So the fact of a photon passing through the apparatus or not is a fact about configurations (or configuration-histories), not about wave functions over configurations.

Yes, some wave functions assign more amplitude to configurations in which the photon passes through the apparatus than others. Still, this does not allow discrete knowledge of the wave function to follow from discrete knowledge of measurements.
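The distinction between facts about configurations and facts about wave functions can be illustrated with a toy sketch. Everything here is a simplifying assumption: the configuration space is reduced to two hypothetical labels rather than a subset of R^n, and the amplitudes are arbitrary.

```python
import math

# Hypothetical toy configuration space: two labels instead of points in R^n.
CONFIGS = ("photon_passed", "photon_absent")

# A wave function assigns a complex amplitude to every configuration.
psi = {"photon_passed": complex(0.6, 0.0), "photon_absent": complex(0.0, 0.8)}

def photon_passed(config):
    """A discrete fact: a true/false predicate on a single configuration."""
    return config == "photon_passed"

def weight(wave_function, predicate):
    """Squared-amplitude weight of the configurations satisfying the predicate."""
    return sum(abs(a) ** 2 for c, a in wave_function.items() if predicate(c))

print([photon_passed(c) for c in CONFIGS])              # [True, False]
print(math.isclose(weight(psi, photon_passed), 0.36))   # True
```

The predicate returns a definite truth value only when applied to a single configuration. Applied to the wave function, the most it yields is a weight between 0 and 1, which is not a discrete fact of the kind described above.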

The Bohm interpretation, on the other hand, has an answer to this question. When we know a fact, we know a fact about the true configuration-history, which is an element of the theory.

In a sense, the Bohm interpretation states that indexical information about which world we are in is part of fundamental reality, unlike the many-worlds interpretation which states that fundamental reality contains no indexical information. (I have discussed the trouble of indexicals with respect to physicalism previously.)

Including such indexical information as “part of reality” means that discrete knowledge is possible, as the discrete knowledge is knowledge of this indexical information.

For this reason, I significantly prefer the Bohm interpretation over the many-worlds interpretation, while acknowledging that there is a great deal of uncertainty here and that there may be a much better interpretation possible. Though my reservations about the many-worlds interpretation had led me to be ambivalent about the comparison between the many-worlds interpretation and the Copenhagen interpretation, I am not similarly ambivalent about Bohm versus many-worlds; I significantly prefer the Bohm interpretation to both the many-worlds and the Copenhagen interpretations.

Modeling naturalized decision problems in linear logic

The following is a model of a simple decision problem (namely, the 5 and 10 problem) in linear logic. Basic familiarity with linear logic is assumed (enough to know what it means to say linear logic is a resource logic), although knowing all the operators isn’t necessary.

The 5 and 10 problem is, simply, a choice between taking a 5 dollar bill and a 10 dollar bill, with the 10 dollar bill valued more highly.

While the problem itself is trivial, the main theoretical issue is in modeling counterfactuals. If you took the 10 dollar bill, what would have happened if you had taken the 5 dollar bill? If your source code is fixed, then there isn’t a logically coherent possible world where you took the 5 dollar bill.

I became interested in using linear logic to model decision problems due to noticing a structural similarity between linear logic and the real world, namely irreversibility. A vending machine may, in linear logic, be represented as a proposition “$1 → CandyBar”, encoding the fact that $1 may be exchanged for a candy bar, being consumed in the process. Since the $1 is consumed, the operation is irreversible. Additionally, there may be multiple options offered, e.g. “$1 → Gumball”, such that only one option may be taken. (Note that I am using “→” as notation for linear implication.)

This is a good fit for real-world decision problems, where e.g. taking the $10 bill precludes also taking the $5 bill. Modeling decision problems using linear logic may, then, yield insights regarding the sense in which counterfactuals do or don’t exist.
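The resource reading of linear implication can be sketched in Python. This is a toy model under stated assumptions, not an implementation of linear logic: resources are held in a multiset (`Counter`), and the hypothetical helper `apply_rule` treats an implication as a pair of consumed and produced resources.

```python
from collections import Counter

def apply_rule(resources, consumes, produces):
    """Apply a linear implication: consume the inputs, then add the outputs.

    Raises ValueError if the required resources are not all available.
    """
    have = Counter(resources)
    need = Counter(consumes)
    if any(have[r] < n for r, n in need.items()):
        raise ValueError(f"missing resources for {consumes} -> {produces}")
    return (have - need) + Counter(produces)

# Hypothetical encodings of "$1 → CandyBar" and "$1 → Gumball".
CANDY = (["$1"], ["CandyBar"])
GUMBALL = (["$1"], ["Gumball"])

wallet = Counter(["$1"])
after_candy = apply_rule(wallet, *CANDY)
print(dict(after_candy))  # {'CandyBar': 1}

try:
    apply_rule(after_candy, *GUMBALL)  # the $1 is gone, so this fails
except ValueError as err:
    print("cannot also buy the gumball:", err)
```

Once the single dollar has been spent on the candy bar, the gumball rule has nothing left to consume, which is the irreversibility the vending machine example is meant to capture.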

First try: just the decision problem

As a first try, let’s simply try to translate the logic of the 5 and 10 situation into linear logic. We assume logical atoms named “Start”, “End”, “$5”, and “$10”. Respectively, these represent: the state of being at the start of the problem, the state of being at the end of the problem, having $5, and having $10.

To represent that we have the option of taking either bill, we assume the following implications:

TakeFive : Start → End ⊗ $5

TakeTen : Start → End ⊗ $10

The “⊗” operator can be read as “and” in the sense of “I have a book and some cheese on the table”; it combines multiple resources into a single linear proposition.

So, the above implications state that it is possible, starting from the start state, to end up in the end state, yielding $5 if you took the five dollar bill, and $10 if you took the 10 dollar bill.

The agent’s goal is to prove “Start → End ⊗ $X”, for X as high as possible. Clearly, “TakeTen” is a solution for X = 10. Assuming the logic is consistent, no better proof is possible. By the Curry-Howard isomorphism, the proof represents a computational strategy for acting in the world, namely, taking the $10 bill.

Second try: source code determining action

The above analysis is utterly trivial. What makes the 5 and 10 problem nontrivial is naturalizing it, to the point where the agent is a causal entity similar to the environment. One way to model the agent being a causal entity is to assume that it has source code.

Let “M” be a Turing machine specification. Let “Ret(M, x)” represent the proposition that M returns x. Note that, if M never halts, then Ret(M, x) is not true for any x.

How do we model the fact that the agent’s action is produced by a computer program? What we would like to be able to assume is that the agent’s action is equal to the output of some machine M. To do this, we need to augment the TakeFive/TakeTen actions to yield additional data:

TakeFive : Start → End ⊗ $5 ⊗ ITookFive

TakeTen : Start → End ⊗ $10 ⊗ ITookTen

The ITookFive / ITookTen propositions are a kind of token assuring that the agent (“I”) took five or ten. (Both of these are interpreted as classical propositions, so they may be duplicated or deleted freely).

How do we relate these propositions to the source code, M? We will say that M must agree with whatever action the agent took:

MachineFive : ITookFive → Ret(M, “Five”)

MachineTen : ITookTen → Ret(M, “Ten”)

These operations yield, from the fact that “I” have taken five or ten, that the source code “M” eventually returns a string identical with this action. Thus, these encode the assumption that “my source code is M”, in the sense that my action always agrees with M’s.

Operationally speaking, after the agent has taken 5 or 10, the agent can be assured of the mathematical fact that M returns the same action. (This is relevant in more complex decision problems, such as twin prisoner’s dilemma, where the agent’s utility depends on mathematical facts about what values different machines return.)

Importantly, the agent can’t use MachineFive/MachineTen to know what action M takes before actually taking the action. Otherwise, the agent could take the opposite of the action they know they will take, causing a logical inconsistency. The above construction would not work if the machine were only run for a finite number of steps before being forced to return an answer; that would lead to the agent being able to know what action it will take, by running M for that finite number of steps.

This model naturally handles cases where M never halts; if the agent never executes either TakeFive or TakeTen, then it can never execute either MachineFive or MachineTen, and so cannot be assured of Ret(M, x) for any x; indeed, if the agent never takes any action, then Ret(M, x) isn’t true for any x, as that would imply that the agent eventually takes action x.
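The ordering constraint in the second-try setup can be sketched as a small state machine. This is a toy model under stated assumptions: the class `Agent` and its method names are hypothetical, linear resources are kept in one set and classical facts in another, and M itself is never run.

```python
class Agent:
    """Toy state for the second-try setup (atom names mirror the text)."""

    def __init__(self):
        self.linear = {"Start"}   # linear resources: consumed when used
        self.classical = set()    # classical facts: may be reused freely

    def take(self, action):
        """TakeFive / TakeTen: Start → End ⊗ $N ⊗ ITook<action>."""
        if "Start" not in self.linear:
            raise RuntimeError("Start has already been consumed")
        dollars = {"Five": "$5", "Ten": "$10"}[action]
        self.linear.remove("Start")
        self.linear |= {"End", dollars}
        self.classical.add(f"ITook{action}")

    def machine(self, action):
        """MachineFive / MachineTen: ITook<action> → Ret(M, action)."""
        if f"ITook{action}" not in self.classical:
            raise RuntimeError(f"no ITook{action} token yet")
        self.classical.add(f'Ret(M, "{action}")')

agent = Agent()
try:
    agent.machine("Ten")        # before acting, nothing about M is available
except RuntimeError as err:
    print(err)                  # no ITookTen token yet
agent.take("Ten")
agent.machine("Ten")
print(sorted(agent.classical))  # ['ITookTen', 'Ret(M, "Ten")']
```

In this sketch the fact Ret(M, x) only becomes available after the corresponding action has been taken, and an agent that never acts never obtains it for any x.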

Interpreting the counterfactuals

At this point, it’s worth discussing the sense in which counterfactuals do or do not exist. Let’s first discuss the simpler case, where there is no assumption about source code.

First, from the perspective of the logic itself, only one of TakeFive or TakeTen may be evaluated. There cannot be both a fact of the matter about what happens if the agent takes five, and a fact of the matter about what happens if the agent takes ten. This is because even defining both facts at once requires re-using the Start proposition.

So, from the perspective of the logic, there aren’t counterfactuals; only one operation is actually run, and what “would have happened” if the other operation were run is undefinable.

On the other hand, there is an important sense in which the proof system contains counterfactuals. In constructing a linear logic proof, different choices may be made. Given “Start” as an assumption, I may prove “End ⊗ $5” by executing TakeFive, or “End ⊗ $10” by executing TakeTen, but not both.

Proof systems are, in general, systems of rules for constructing proofs, which leave quite a lot of freedom in which proofs are constructed. By the Curry-Howard isomorphism, the freedom in how the proofs are constructed corresponds to freedom in how the agent behaves in the real world; using TakeFive in a proof has the effect, if executed, of actually (irreversibly) taking the $5 bill.

So, we can say, by reasoning about the proof system, that if TakeFive is run, then $5 will be yielded, and if TakeTen is run, then $10 will be yielded, and only one of these may be run.

The logic itself says there can’t be a fact of the matter about both what happens if 5 is taken and if 10 is taken. On the other hand, the proof system says that both proofs that get $5 by taking 5, and proofs that get $10 by taking 10, are possible.

How to interpret this difference? One way is by asserting that the logic is about the territory, while the proof system is about the map; so, counterfactuals are represented in the map, even though the map itself asserts that there is only a singular territory.

And, importantly, the map doesn’t represent the entire territory; it’s a proof system for reasoning about the territory, not the territory itself. The map may, thus, be “looser” than the territory, allowing more possibilities than could possibly be actually realized.

What prevents the map from drawing out logical implications to the point where it becomes clear that only one action may possibly be taken? Given the second-try setup, the agent simply cannot use the fact of their source code being M, until actually taking the action; thus, no amount of drawing implications can conclude anything about the relationship between M and the agent’s action. In addition to this, reasoning about M itself becomes harder the longer M runs, i.e. the longer the agent is waiting to make the decision; so, simply reasoning about the map, without taking actions, need not conclude anything about which action will be taken, leaving both possibilities live until one is selected.


This approach aligns significantly with the less-formal descriptions given of subjective implication decision theory and counterfactual nonrealism. Counterfactuals aren’t real in the sense of being definable after the relevant action has been taken; rather, an agent in a state of uncertainty about which action it will take may consider multiple possibilities as freely selectable, even if it is assured that its selection will be equal to the output of some computer program.

The linear logic formalization increases my confidence in this approach, by providing a very precise notion of the sense in which the counterfactuals do and don’t exist, which would be hard to make precise without similar formalism.

I am, at this point, less worried about the problems with counterfactual nonrealism (such as global accounting) than I was when I wrote the post, and more worried about the problems of policy-dependent source code (which requires the environment to be an ensemble of deterministic universes, rather than a single one), such that I have updated towards counterfactual nonrealism as a result of this analysis, although I am still not confident.

Overall, I find linear logic quite promising for modeling embedded decision problems from the perspective of an embedded agent, as it builds critical facts such as non-reversibility into the logic itself.

Appendix: spurious counterfactuals

The following describes the problem of spurious counterfactuals in relation to the model.

Assume the second-try setup. Suppose the agent becomes assured that Ret(M, “Five”); that is, that M returns the action “Five”. From this, it is provable that the agent may, given Start, attain the linear logic proposition 0, by taking action “Ten” and then running MachineTen to get Ret(M, “Ten”), which yields inconsistency with Ret(M, “Five”). From 0, anything follows, e.g. $1000000, by the principle of explosion.

If the agent is maximizing guaranteed utility, then they will take the $10 bill, to be assured of the highest utility possible. So, it cannot be the case that the agent can be correctly assured that they will take action five, as that would lead to them taking a different action.

If, on the other hand, the agent would have provably taken the $5 bill upon receiving the assurance (say, because they notice that taking the $10 bill could result in the worst possible utility), then there is a potential issue with this assurance being a self-fulfilling prophecy. But, if the agent is constructing proofs (plans for action) so as to maximize guaranteed utility, this will not occur.

This solution is essentially the same as the one given in the paper on UDT with a known search order.

Topological metaphysics: relating point-set topology and locale theory

The following is an informal exposition of some mathematical concepts from Topology via Logic, with special attention to philosophical implications. Those seeking more technical detail should simply read the book.

There are, roughly, two ways of doing topology:

  • Point-set topology: Start with a set of points. Consider a topology as a set of subsets of these points which are “open”, where open sets must satisfy some laws.
  • Locale theory: Start with a set of opens (similar to propositions), which are closed under some logical operators (especially and and or), and satisfy logical relations.

What laws are satisfied?

  • For point-set topology: The empty set and the full set must both be open; finite intersections and infinite unions of opens must be open.
  • For locale theory: “True” and “false” must be opens; the opens must be closed under finite “and” and infinite “or”; and some logical equivalences must be satisfied, such that “and” and “or” work as expected.
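
As a minimal sketch of the point-set laws, the following checks whether a candidate family of subsets of a finite set is a topology. The function name `is_topology` and the example families are illustrative assumptions. On a finite set, pairwise closure is enough, since every union or intersection is then a finite one.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check the point-set laws for a candidate family of open sets on a finite set."""
    points = frozenset(points)
    opens = {frozenset(u) for u in opens}
    # Every open must be a set of points.
    if any(not u <= points for u in opens):
        return False
    # The empty set and the full set must both be open.
    if frozenset() not in opens or points not in opens:
        return False
    # On a finite set, closure under pairwise unions and intersections
    # implies closure under finite intersections and arbitrary unions.
    for u, v in combinations(opens, 2):
        if (u | v) not in opens or (u & v) not in opens:
            return False
    return True

points = {1, 2, 3}
good = [set(), {1}, {1, 2}, {1, 2, 3}]
bad = [set(), {1}, {2}, {1, 2, 3}]  # {1} | {2} = {1, 2} is missing

print(is_topology(points, good))  # True
print(is_topology(points, bad))   # False
```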

Roughly, open sets and opens both correspond to verifiable propositions. If X and Y are both verifiable, then both “X or Y” and “X and Y” are verifiable; and, indeed, even countably infinite disjunctions of verifiable statements are verifiable, by exhibiting the particular statement in the disjunction that is verified as true.
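
The asymmetry behind "verifiable" can be sketched as a search procedure: a countable disjunction is verified by finding one disjunct that checks out, but it is never refuted by the search. The function `verify_disjunction`, the `budget` parameter, and the example statements are assumptions for illustration. Each disjunct here is assumed to be a check that terminates; disjuncts that are themselves only semi-decidable would need to be interleaved.

```python
from itertools import count, islice

def verify_disjunction(disjuncts, budget=None):
    """Search a countable family of checks for one that succeeds.

    Returns the index of a verified disjunct, or None if the budget runs out.
    With budget=None the search never returns when no disjunct is true.
    """
    candidates = enumerate(disjuncts)
    if budget is not None:
        candidates = islice(candidates, budget)
    for index, check in candidates:
        if check():
            return index
    return None

# "Some natural number n satisfies n * n == 144", as the disjunction P_0 or P_1 or ...
squares_to_144 = ((lambda n=n: n * n == 144) for n in count())
print(verify_disjunction(squares_to_144))  # 12

# A false disjunction is not refuted; the search only exhausts its budget.
squares_to_2 = ((lambda n=n: n * n == 2) for n in count())
print(verify_disjunction(squares_to_2, budget=1000))  # None
```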

What’s the philosophical interpretation of the difference between point-set topology and locale theory, then?

  • Point-set topology corresponds to the theory of possible worlds. There is a “real state of affairs”, which can be partially known about. Open sets are “events” that are potentially observable (verifiable). Ontology comes before epistemology. Possible worlds are associated with classical logic and classical probability/utility theory.
  • Locale theory corresponds to the theory of situation semantics. There are facts that are true in a particular situation, which have logical relations with each other. The first three lines of Wittgenstein’s Tractatus Logico-Philosophicus are: “The world is everything that is the case. / The world is the totality of facts, not of things. / The world is determined by the facts, and by these being all the facts.” Epistemology comes before ontology. Situation semantics is associated with intuitionistic logic and Jeffrey-Bolker utility theory (recently discussed by Abram Demski).

Thus, they correspond to fairly different metaphysics. Can these different metaphysics be converted to each other?

  • Converting from point-set topology to locale theory is easy. The opens are, simply, the open sets; their logical relations (and/or) are determined by set operations (intersection/union). They automatically satisfy the required laws.
  • To convert from locale theory to point-set topology, construct possible worlds as sets of opens (which must be logically coherent, e.g. the set of opens can’t include “A and B” without including “A”), which are interpreted as the set of opens that are true of that possible world. The open sets of the topology correspond with the opens, as sets of possible worlds which contain the open.
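
For a finite locale, the second conversion can be sketched by brute force: enumerate every logically coherent set of opens and call each one a possible world. This is only an illustration under stated assumptions. The name `points_of` is hypothetical, the input is assumed to be a finite family already closed under "and" and "or" that includes "true" and "false", and opens are encoded as frozensets purely so that `|` and `&` can play the role of "or" and "and".

```python
from itertools import chain, combinations

def points_of(opens):
    """Enumerate the possible worlds of a finite locale.

    A world is a set F of opens such that: "true" is in F, "false" is not,
    "a and b" is in F exactly when a and b both are,
    and "a or b" in F implies a in F or b in F.
    """
    opens = [frozenset(u) for u in opens]
    top = frozenset().union(*opens)
    bottom = frozenset()
    subsets = chain.from_iterable(
        combinations(opens, k) for k in range(len(opens) + 1)
    )
    worlds = []
    for candidate in subsets:
        f = set(candidate)
        if top not in f or bottom in f:
            continue
        meet_ok = all(
            ((a & b) in f) == (a in f and b in f) for a in opens for b in opens
        )
        join_ok = all(
            (a | b) not in f or a in f or b in f for a in opens for b in opens
        )
        if meet_ok and join_ok:
            worlds.append(frozenset(f))
    return worlds

discrete = [set(), {0}, {1}, {0, 1}]  # every distinction is observable
indiscrete = [set(), {0, 1}]          # nothing is observable except "true"

print(len(points_of(discrete)))    # 2
print(len(points_of(indiscrete)))  # 1
```

The second example starts from two underlying points but recovers only one world, since no open distinguishes them.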

From assumptions about possible worlds and possible observations of it, it is possible to derive a logic of observations; from assumptions about the logical relations of different propositions, it is possible to consider a set of possible worlds and interpretations of the propositions as world-properties.

Metaphysically, we can consider point-set topology as ontology-first, and locale theory as epistemology-first. Point-set topology starts with possible worlds, corresponding to Kantian noumena; locale theory starts with verifiable propositions, corresponding to Kantian phenomena.

While the interpretation of a given point-set topology as a locale is trivial, the interpretation of a locale theory as a point-set topology is less so. What this construction yields is a way of getting from observations to possible worlds. From the set of things that can be known (and knowable logical relations between these knowables), it is possible to conjecture a consistent set of possible worlds and ways those knowables relate to the possible worlds.

Of course, the true possible worlds may be finer-grained than this consistent set; however, they cannot be coarser-grained, or else the same possible world would result in different observations. No finer potentially-observable (verifiable or falsifiable) distinctions may be made between possible worlds than the ones yielded by this transformation; making finer distinctions risks positing unreferenceable entities in a self-defeating manner.

How much extra ontological reach does this transformation yield? If the locale has a countable basis, then the point-set topology may have an uncountable point-set (specifically, of the same cardinality as the reals). The continuous can, then, be constructed from the discrete, as the underlying continuous state of affairs that could generate any given possibly-infinite set of discrete observations.

In particular, the reals may be constructed from a locale based on open intervals whose beginning/end are rational numbers. That is: a real r may be represented as a set of (a, b) pairs where a and b are rational, and a < r < b. The locale whose basis is rational-delimited open intervals (whose elements are countable unions of such open intervals, and which specifies logical relationships between them, e.g. conjunction) yields the point-set topology of the reals. (Note that, although including all countable unions of basis elements would make the locale uncountable, it is possible to weaken the notion of locale to only require unions of recursively enumerable sets, which preserves countability)
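
As a rough sketch of a real being given only by which rational intervals contain it, the following represents the square root of 2 as a membership test on rational pairs (a, b) and narrows an interval around it using rational arithmetic alone. The names `sqrt2_in` and `refine` are assumptions made for illustration.

```python
from fractions import Fraction

def sqrt2_in(a, b):
    """Is sqrt(2) in the open interval (a, b)? Uses only rational comparisons."""
    below = a < 0 or a * a < 2   # a < sqrt(2)
    above = b > 0 and b * b > 2  # sqrt(2) < b
    return below and above

def refine(contains, a, b, steps):
    """Shrink (a, b) around the real described by `contains`.

    Assumes contains(a, b) holds initially. The two overlapping sub-intervals
    (a, b - third) and (a + third, b) cover (a, b), so the real lies in at
    least one of them, and the width shrinks by a third at each step.
    """
    for _ in range(steps):
        third = (b - a) / 3
        if contains(a, b - third):
            b = b - third
        else:
            a = a + third
    return a, b

lo, hi = refine(sqrt2_in, Fraction(1), Fraction(2), 40)
print(float(lo), float(hi))  # both approximately 1.41421
```

Overlapping sub-intervals are used rather than bisection because the intervals are open: a real sitting exactly at a rational midpoint would belong to neither half.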

If metaphysics may be defined as the general framework bridging between ontology and epistemology, then the conversions discussed provide a metaphysics: a way of relating that-which-could-be to that-which-can-be-known.

I think this relationship is quite interesting and clarifying. I find it useful in my own present philosophical project, in terms of relating subject-centered epistemology to possible centered worlds. Ontology can reach further than epistemology, and topology provides mathematical frameworks for modeling this.

That this construction yields continuous from discrete is an added bonus, which should be quite helpful in clarifying the relation between the mental and physical. Mental phenomena must be at least partially discrete for logical epistemology to be applicable; meanwhile, physical theories including Newtonian mechanics and standard quantum theory posit that physical reality is continuous, consisting of particle positions or a wave function. Thus, relating discrete epistemology to continuous ontology is directly relevant to philosophy of science and theory of mind.

Two Alternatives to Logical Counterfactuals

The following is a critique of the idea of logical counterfactuals. The idea of logical counterfactuals has appeared in previous agent foundations research (especially at MIRI): here, here. “Impossible possible worlds” have been considered elsewhere in the literature; see the SEP article for a summary.

I will start by motivating the problem, which also gives an account for what a logical counterfactual is meant to be.

Suppose you learn about physics and find that you are a robot. You learn that your source code is “A”. You also believe that you have free will; in particular, you may decide to take either action X or action Y. In fact, you take action X. Later, you simulate “A” and find, unsurprisingly, that when you give it the observations you saw up to deciding to take action X or Y, it outputs action X. However, you, at the time, had the sense that you could have taken action Y instead. You want to be consistent with your past self, so you want to, at this later time, believe that you could have taken action Y at the time. If you could have taken Y, then you do take Y in some possible world (which still satisfies the same laws of physics). In this possible world, it is the case that “A” returns Y upon being given those same observations. But, the output of “A” when given those observations is a fixed computation, so you now need to reason about a possible world that is logically incoherent, given your knowledge that “A” in fact returns X. This possible world is, then, a logical counterfactual: a “possible world” that is logically incoherent.

To summarize: a logical counterfactual is a notion of “what would have happened” had you taken a different action after seeing your source code, and in that “what would have happened”, the source code must output a different action than what you actually took; hence, this “what would have happened” world is logically incoherent.

It is easy to see that this idea of logical counterfactuals is unsatisfactory. For one, no good account of them has yet been given. For two, there is a sense in which no account could be given; reasoning about logically incoherent worlds can only be so extensive before running into logical contradiction.

To extensively refute the idea, it is necessary to provide an alternative account of the motivating problem(s) which dispenses with the idea. Even if logical counterfactuals are unsatisfactory, the motivating problem(s) remain.

I now present two alternative accounts: counterfactual nonrealism, and policy-dependent source code.

Counterfactual nonrealism

According to counterfactual nonrealism, there is no fact of the matter about what “would have happened” had a different action been taken. There is, simply, the sequence of actions you take, and the sequence of observations you get. At the time of taking an action, you are uncertain about what that action is; hence, from your perspective, there are multiple possibilities.

Given this uncertainty, you may consider material conditionals: if I take action X, will consequence Q necessarily follow? An action may be selected on the basis of these conditionals, such as by determining which action results in the highest guaranteed expected utility if that action is taken.
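
One way to picture this selection rule is the sketch below, which assumes the agent's uncertainty is given as an explicit list of worlds it still considers possible. The field names (`action`, `rain`, `utility`) and the numbers are made up for the example.

```python
def guaranteed_utility(action, worlds):
    """Lowest utility among the still-possible worlds in which `action` is taken."""
    return min(w["utility"] for w in worlds if w["action"] == action)

def choose(actions, worlds):
    """Pick the action whose guaranteed utility is highest."""
    return max(actions, key=lambda a: guaranteed_utility(a, worlds))

worlds = [
    {"action": "X", "rain": False, "utility": 5},
    {"action": "X", "rain": True, "utility": 4},
    {"action": "Y", "rain": False, "utility": 10},
    {"action": "Y", "rain": True, "utility": 0},
]

print(guaranteed_utility("X", worlds))  # 4
print(guaranteed_utility("Y", worlds))  # 0
print(choose(["X", "Y"], worlds))       # X
```

Here "if I take action X, utility is at least 4" holds in every world the agent considers possible, which is the sense of "necessarily follow" being sketched.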

This is basically the approach taken in my post on subjective implication decision theory. It is also the approach taken by proof-based UDT.

The material conditionals are ephemeral, in that at a later time, the agent will know that they could only have taken a certain action (assuming they knew their source code before taking the action), due to having had longer to think by then; hence, all the original material conditionals will be vacuously true. The apparent nondeterminism is, then, only due to the epistemic limitation of the agent at the time of making the decision, a limitation not faced by a later version of the agent (or an outside agent) with more computation power.

This leads to a sort of relativism: what is undetermined from one perspective may be determined from another. This makes global accounting difficult: it’s hard for one agent to evaluate whether another agent’s action is any good, because the two agents have different epistemic states, resulting in different judgments on material conditionals.

A problem that comes up is that of “spurious counterfactuals” (analyzed in the linked paper on proof-based UDT). An agent may become sure of its own action before that action is taken. Upon being sure of that action, the agent will know the material implication that, if they take a different action, something terrible will happen (this material implication is vacuously true). Hence the agent may take the action they were sure they would take, making the original certainty self-fulfilling. (There are technical details with how the agent becomes certain having to do with Löb’s theorem).
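
The vacuous truth at the heart of this problem can be shown in a few lines. This sketch assumes the agent's epistemic state is a list of worlds it considers possible, and that it has already become certain of taking action X; all names and numbers are illustrative.

```python
def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

def necessarily(worlds, antecedent, consequent):
    """Does 'antecedent -> consequent' hold in every world considered possible?"""
    return all(implies(antecedent(w), consequent(w)) for w in worlds)

# The agent is certain it takes X: no remaining world has action Y.
worlds = [{"action": "X", "utility": 5}]

def took_y(w):
    return w["action"] == "Y"

# "If I take Y, something terrible happens" is vacuously true...
print(necessarily(worlds, took_y, lambda w: w["utility"] == -1000))    # True
# ...and so is "if I take Y, something wonderful happens".
print(necessarily(worlds, took_y, lambda w: w["utility"] == 1000000))  # True
```

Which of these vacuous conditionals the agent acts on depends on how it searches for them, which is why the order of proof search matters for whether the certainty becomes self-fulfilling.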

The most natural decision theory resulting from this framework is timeless decision theory (rather than updateless decision theory). This is because the agent updates on what they know about the world so far, and considers the material implications of themselves taking a certain action; these implications include logical implications if the agent knows their source code. Note that timeless decision theory is dynamically inconsistent in the counterfactual mugging problem.

Policy-dependent source code

A second approach is to assert that one’s source code depends on one’s entire policy, rather than only one’s actions up to seeing one’s source code.

Formally, a policy is a function mapping an observation history to an action. It is distinct from source code, in that the source code specifies the implementation of the policy in some programming language, rather than itself being a policy function.

Logically, it is impossible for the same source code to generate two different policies. There is a fact of the matter about what action the source code outputs given an observation history (assuming the program halts). Hence there is no way for two different policies to be compatible with the same source code.
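
The distinction between policy and source code can be illustrated with two different source texts that implement the same policy. This is a toy sketch: `SOURCE_1`, `SOURCE_2`, `load`, and `policy_table` are hypothetical names, and the policy is only tabulated on short observation histories.

```python
from itertools import product

# Two different programs (source code) ...
SOURCE_1 = "def act(history):\n    return 'X' if len(history) % 2 == 0 else 'Y'\n"
SOURCE_2 = "def act(history):\n    return ('X', 'Y')[len(history) % 2]\n"

def load(source):
    """Turn source text into a callable mapping an observation history to an action."""
    namespace = {}
    exec(source, namespace)
    return namespace["act"]

def policy_table(source, observations=("a", "b"), max_len=3):
    """Tabulate the policy implemented by `source` on all short histories."""
    act = load(source)
    histories = [
        h for n in range(max_len + 1) for h in product(observations, repeat=n)
    ]
    return {h: act(h) for h in histories}

# ... that implement one and the same policy on these histories.
print(SOURCE_1 != SOURCE_2)                              # True
print(policy_table(SOURCE_1) == policy_table(SOURCE_2))  # True
```

The map from source code to policy is many-to-one in this sketch, while each source text determines a single table, matching the claim that one source code cannot generate two policies.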

Let’s return to the robot thought experiment and re-analyze it in light of this. After the robot has seen that their source code is “A” and taken action X, the robot considers what would have happened if they had taken action Y instead. However, if they had taken action Y instead, then their policy would, trivially, have to be different from their actual policy, which takes action X. Hence, their source code would be different. Hence, they would not have seen that their source code is “A”.

Instead, if the agent were to take action Y upon seeing that their source code is “A”, their source code must be something else, perhaps “B”. Hence, which action the agent would have taken depends directly on their policy’s behavior upon seeing that the source code is “B”, and indirectly on the entire policy (as source code depends on policy).
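
A toy model may make the dependency concrete. It assumes a world in which each complete policy is implemented by exactly one source code, with only two policies under consideration; `POLICY_A`, `POLICY_B`, `source_code_of`, and `run` are hypothetical names for this sketch.

```python
# A policy says what to do upon seeing each possible source code.
POLICY_A = {"A": "X", "B": "X"}
POLICY_B = {"A": "Y", "B": "X"}  # differs from POLICY_A only upon seeing "A"

def source_code_of(policy):
    """Assumed world model: which source code implements which whole policy."""
    table = {
        frozenset(POLICY_A.items()): "A",
        frozenset(POLICY_B.items()): "B",
    }
    return table[frozenset(policy.items())]

def run(policy):
    """The agent sees its own source code, then acts according to its policy."""
    seen = source_code_of(policy)
    return seen, policy[seen]

print(run(POLICY_A))  # ('A', 'X')
print(run(POLICY_B))  # ('B', 'X')
```

No policy in this model produces the pair ('A', 'Y'): the policy that would take Y upon seeing "A" never sees "A", so what it does is settled by its entry for "B".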

We see, then, that the original thought experiment encodes a reasoning error. The later agent wants to ask what would have happened if they had taken a different action after knowing their source code; however, the agent neglects that such a policy change would have resulted in seeing different source code! Hence, there is no need to posit a logically incoherent possible world.

The reasoning error came about due to using a conventional, linear notion of interactive causality. Intuitively, what you see up to time t depends only on your actions before time t. However, policy-dependent source code breaks this condition. What source code you see that you have depends on your entire policy, not just what actions you took up to seeing your source code. Hence, reasoning under policy-dependent source code requires abandoning linear interactive causality.

The most natural decision theory resulting from this approach is updateless decision theory, rather than timeless decision theory, as it is the entire policy that the counterfactual is on.


Before very recently, my philosophical approach had been counterfactual nonrealism. However, I am now more compelled by policy-dependent source code, after having analyzed it. I believe this approach fixes the main problem of counterfactual nonrealism, namely relativism making global accounting difficult. It also fixes the inherent dynamic inconsistency problems that TDT has relative to UDT (which are related to the relativism).

I believe the re-analysis I have provided of the thought experiment motivating logical counterfactuals is sufficient to refute the original interpretation, and thus to de-motivate logical counterfactuals.

The main problem with policy-dependent source code is that, since it violates linear interactive causality, analysis is correspondingly more difficult. Hence, there is further work to be done in considering simplified environment classes where possible simplifying assumptions (including linear interactive causality) can be made. It is critical, though, that the linear interactive causality assumption not be used in analyzing cases of an agent learning their source code, as this results in logical incoherence.

What is metaphysical free will?

This is an attempt to explain metaphysical free will. This serves to explain metaphysics in general.

First: on the distinction between subject-properties and object-properties. The subject-object relation holds between some subject and some object. For example, a person might be a subject looking at a table, which is an object. Objects are, roughly, entities that could potentially be beheld by some subject.

Metaphysical free will is a property of subjects rather than objects. This will make more sense if I first contrast it with object-properties.

Objects can be defined by some properties: location, color, temperature, and so on. These properties yield testable predictions. Objects that are hot will be painful to touch, for example.

Object properties are best-defined when they are closely connected with testable predictions. The logical positivist program, though ultimately unsuccessful, is quite effective when applied to defining object properties. Similarly, the falsificationist program is successful in clarifying the meaning of a variety of scientific hypotheses in terms of predictions.

Intuitively, free will has to do with the ability of someone to choose from one of multiple options. This implies a kind of unpredictability, at least from the perspective of the one making the choice.

Hence, there is a tension in considering free will as an object-property, in that object properties are about predictable relations, whereas free will is about choice. (Probabilistic randomness would not much help either, as e.g. taking an action with 50% probability does not match the intuitive notion of choice)

The most promising attempts to define free will as an object-property are within the physicalist school that includes Gary Drescher and Daniel Dennett. These define choice in terms of optimization: selection of the best action from a list of options, based upon anticipated consequences. This remains an object-property, because it yields a testable prediction: that the chosen action will be the one that is predicted to lead to the best consequences (and if the agent is well-informed, one that actually will). Drescher calls this “mechanical choice”.

I will now contrast object-properties (including mechanical choice) with subject-properties.

The distinction between subjects and objects is, to a significant extent, grammatical. Subjects do things, objects have things done to them. “I repaired the table with some glue.”

It is easy to detect notions of choice in ordinary language. “I could have gone to the store but I chose not to”; “you don’t have to do all that work”; “this software has so many options and capabilities”.

Functional definitions of objects are often defined in terms of the capabilities the subject has in using the object. For example, an axe can (roughly) be defined as an object that can be swung to hit another object and create a rift.

The desiderata of products, including software, are about usability. The desire is for an object that can be used in a number of ways.

Moral language, too, refers to capabilities. What one should do depends on what one can do; see Ought implies Can.

We could say, then, that this sort of subjunctive language is tied with orienting towards reality in a certain way. The orientation is, specifically, about noticing the capabilities that one’s self (and perhaps others) have, and communicating about these capabilities. I find that replacing the word “metaphysics” with the word “orientation” is often illuminating.

When this orientation is coupled with language, the language describes itself as between observation and action. That is: we talk as if we may take action on the basis of our speech. Thus, our language refers to, among other things, our capabilities, which are decision-relevant. This is in contrast to thinking of language as a side effect, or as an action in itself.

This could be studied in AI terms. An AI may be programmed to assume it has control of “its action”, and may have a model of what the consequences of various actions are, which correspond to its capabilities. From the AI’s perspective, it has a choice among multiple actions, hence in a sense “believing in metaphysical free will”. To program an AI to take effective actions, it isn’t sufficient for it to develop a model of what is; it must also develop a model of what could be made to happen. (The AI may, like a human, generate verbal reports of its capabilities, and select actions on the basis of these verbal reports)
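
A very small sketch of such an AI follows, assuming a one-dimensional world in which the state is a (position, goal) pair. The functions `predict`, `score`, and `choose_action` are illustrative names, not a reference to any particular system.

```python
def predict(state, action):
    """Model of what could be made to happen: the predicted state after `action`."""
    position, goal = state
    step = {"left": -1, "stay": 0, "right": 1}[action]
    return (position + step, goal)

def score(state):
    """How good a state is: closer to the goal is better."""
    position, goal = state
    return -abs(goal - position)

def choose_action(state, actions=("left", "stay", "right")):
    """Treat every action as available and pick the one with the best predicted result."""
    return max(actions, key=lambda a: score(predict(state, a)))

print(choose_action((0, 3)))  # right
print(choose_action((3, 3)))  # stay
```

The state alone is the model of what is; `predict` ranges over all three actions, including the two that will not be taken, and that is the sense in which the program represents its capabilities.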

Even relatively objective ways of orienting towards reality notice capabilities. I’ve already noted the phenomenon of functional definitions. If you look around, you will see many objects, and you will also likely notice affordances: ways these objects may be used. It may seem that these affordances inhere in the objects, although it would be more precise to say that affordances exist in the subject-object relationship rather than the object itself, as they depend on the subject.

Metaphysics isn’t directly an object of scientific study, but can be seen in the scientific process itself, in the way that one must comport one’s self towards reality to do science. This comportment includes tool usage, logic, testing, observation, recording, abstraction, theorizing, and so on. The language scientists use in the course of their scientific study, and their communication about the results, reveals this metaphysics.

(Yes, recordings of scientific practice may be subject to scientific study, but interpreting the raw data of the recordings as e.g. “testing” requires a theory bridging between the objective recorded data and whatever “testing” is, where “testing” is naively a type of intentional action)

Upon noticing choice in one’s metaphysics, one may choose to philosophize on it, to see if it holds up to consistency checks. If the metaphysics leads to inconsistencies, then it should be modified or discarded.

The most obvious possible source of inconsistency is in the relation between the metaphysical “I” and the physical body. If the “I” is identical with one’s own physical body, then metaphysical properties of the self, such as freedom of choice, must be physical properties, leading to the usual problems.

If, on the other hand, the “I” is not identical with one’s physical body, then it must be explained why the actions and observations of the “I” so much align with the actions of the body; the mind-body relation must be clarified.

Another issue is akrasia; sometimes it seems that the mind decides to take an action but the body does not move accordingly. Thus, free will may be quite partial, even if it exists.

I’ve written before about reconciliation between metaphysical free will and the predictions of physics. I believe this account is better than the others I have seen, although nowhere near complete.

It is worth contrasting the position of believing in metaphysical free will with its converse. For example, in the Bhagavad Gita, Krishna states that the wise do not identify with the doer:

All actions are performed by the gunas of prakriti. Deluded by identification with the ego, a person thinks, “I am the doer.” But the illumined man or woman understands the domain of the gunas and is not attached. Such people know that the gunas interact with each other; they do not claim to be the doer.

Bhagavad Gita, Easwaran translation, ch. 3, 27-28

In this case the textual “I” is dissociated from the “doer” which takes action. Instead, the “I” is more like a placeholder in a narrative created by natural mental processes (gunas), not an agent in itself. (The interpretation here is not entirely clear, as Krishna also gives commands to Arjuna)

This specific discussion of metaphysical free will generalizes to metaphysics in general. Metaphysics deals with the basic entities/concepts associated with reality, subjects, and objects. It is contrasted with physics, which deals with objects, generalizing from observable properties of them (and the space they exist in and so on) to lawful theories.

To summarize metaphysical free will:

  • We talk in ways that imply that we and others have capabilities and make choices.
  • This way of talking is possible and sufficiently-motivated because of the way we comport ourselves towards reality, noticing our capabilities.
  • Effective AIs should similarly be expected to model their own capabilities as distinct from the present state of the world.
  • It is difficult to coherently identify these capabilities we talk as if we have, with physical properties of our bodies.
  • Therefore, it may be a reasonable (at least provisional) assumption that the capabilities we have are not physical properties of our bodies, and are metaphysical.
  • The implications of this assumption can be philosophically investigated, to build out a more coherent account, or to find difficulties in doing so.
  • There are ways of critiquing metaphysical free will. The assumption may lead to contradictions with observations, well-supported scientific theories, and so on.

The absurdity of un-referenceable entities

Whereof one cannot speak, thereof one must be silent.

Ludwig Wittgenstein, Tractatus Logico-Philosophicus

Some criticism of my post on physicalism is that it discusses reference, not the world. To quote one comment: “I consider references to be about agents, not about the world.” To quote another: “Remember, you have only established that indexicality is needed for reference, ie. semantic, not that it applies to entities in themselves” and also “you need to show that standpoints are ontologically fundamental, not just epistemically or semantically.” A post containing answers says: “However, everyone already kind of knows the we can’t definitely show the existence of any objective reality behind our observations and that we can only posit it.” (Note, I don’t mean to pick on these commentators, they’re expressing a very common idea)

These criticisms could be rephrased in this way:

“You have shown limits on what can be referenced. However, that in no way shows limits on the world itself. After all, there may be parts of the world that cannot be referenced.”

This sounds compelling at first: wouldn’t it be strange to think that properties of the world can be deduced from properties of human reference?

But, a slight amount of further reflection betrays the absurdity involved in asserting the possible existence of un-referenceable entities. “Un-referenceable entities” is, after all, a reference.

A statement such as “there exist things that cannot be referenced” is comically absurd, in that it refers to things in the course of denying their referenceability.

We may say, then, that it is not the case that there exist things that cannot be referenced. The assumption that this is the case leads to contradiction.

I believe this sort of absurdity is quite related to Kantian philosophy. Kant distinguished phenomena (appearances) from noumena (things-in-themselves), and asserted that through observation and understanding we can only understand phenomena, not noumena. Quoting Kant:

Appearances, to the extent that as objects they are thought in accordance with the unity of the categories, are called phaenomena. If, however, I suppose there to be things that are merely objects of the understanding and that, nevertheless, can be given to an intuition, although not to sensible intuition, then such things would be called noumena.

Critique of Pure Reason, Chapter III

Kant at least grants that noumena are given to some “intuition”, though not a sensible intuition. This is rather less ridiculous than asserting un-referenceability.

It is ironic that the noumena-like entity being hypothesized in the present case (the physical world) would, by Kant’s criterion, be considered a scientific entity, a phenomenon.

Part of the absurdity in saying that the physical world may be un-referenceable is that it is at odds with the claim that physics is known through observation and experimentation. After all, un-referenceable observations and experimental results are of no use in science; they couldn’t make their way into theories. So the shadow of the world that can be known (and known about) by science is limited to the referenceable. The un-referenceable may, at best, be inferred (although, of course, this statement is absurd in referring to the un-referenceable).

It’s easy to make fun of this idea of un-referenceable entities (infinitely more ghostly than ghosts), but it’s worth examining what is compelling about this (absurd) position, to see what, if anything, can be salvaged.

From a modern perspective, we can see things that a pre-modern perspective cannot conceptualize. For example, we know about gravitational lensing, quantum entanglement, Cesium, and so on. It seems that, from our perspective, these things-in-themselves did not appear in the pre-modern phenomenal world. While they had influence, they did not appear in a way clear enough for a concept to be developed.

We may believe it is, then, normative for the pre-moderns to accept, in humility, that there are things-in-themselves they lack the capacity to conceptualize. And we may, likewise, admit this of the modern perspective, in light of the likelihood of future scientific advances.

However, conceptualizability is not the same as referenceability. Things can be pointed to that don’t yet have clear concepts associated with them, such as the elusive phenomena seen in dreams.

In this case, pre-moderns may point to modern phenomena as “those things that will be phenomena in 500 years”. We can talk about those things our best theories don’t conceptualize that will be conceptualized later. And this is a kind of reference; it travels through space-time to access phenomena not immediately present.

This reference is vague, in that it doesn’t clearly define what things are modern phenomena, and also doesn’t allow one to know ahead of time what these phenomena are. But it’s finitely vague, in contrast to the infinite vagueness of “un-referenceable entities”. It’s at least possible to imagine accessing them, by e.g. becoming immortal and living until modern times.

A case that our current condition (e.g. modernity) cannot know about something can be translated into a reference: a reference to that which we cannot know on account of our conditions but could know under other imaginable conditions. Which is, indeed, unsurprising, given that any account of something outside our understanding existing, must refer to that thing outside our understanding.

My critique of an un-referenceable physical world is quite similar to Nietzsche’s critique of Kant’s unknowable noumena. Nietzsche wrote:

The “thing-in-itself” nonsensical. If I remove all the relationships, all the “properties,” all the “activities” of a thing, the thing does not remain over; because thingness has only been invented by us owing to the requirements of logic, thus with the aim of defining, communication (to bind together the multiplicity of relationships, properties, activities).

Will to Power, sec. 558

I continue to be struck by the irony of the transition from physical phenomena to physical noumena. Kant’s positing of a realm of noumena was, perhaps, motivated by a kind of humility, a kind of respect for morality, an appeasement of theological elements in society, while still making a place for thinking-for-one’s-self, science, and so on, in a separate magisterium that can’t collide with the noumenal realm.

Any idea, whether it’s God, Physics, or Objectivity, can disconnect from the human cognitive faculty that relates ideas to the world of experience, and remain as a mere signifier, which persists as a form of unfalsifiable control. When Physics and Objectivity take on theological significance (as they do in modern times), a move analogous to Kant’s will place them in an un-falsifiable noumenal realm, with the phenomenal realm being the subjective and/or intersubjective. This is extremely ironic.

Puzzles for physicalists

The following is a list of puzzles that are hard to answer within a broadly-physicalist, objective paradigm. I believe critical agentialism can answer these better than competing frameworks; indeed, I developed it through contemplation on these puzzles, among others. This post will focus on the questions, though, rather than the answers. (Some of the answers can be found in the linked post.)

In a sense what I have done is located “anomalies” relative to standard accounts, and concentrated more attention on these anomalies, attempting to produce a theory that explains them, without ruling out its ability to explain those things the standard account already explains well.


Indexicality

(This section would be philosophical plagiarism if I didn’t cite On the Origin of Objects.)

Indexicals are phrases whose interpretation depends on the speaker’s standpoint, such as “my phone” or “the dog over there”. It is often normal to treat indexicals as a kind of shorthand: “my phone” is shorthand for “the phone belonging to Jessica Taylor”, and “the dog over there” is shorthand for “the dog existing at coordinates 37.856570, -122.284176”. This expansion allows indexicals to be accounted for within an objective, standpoint-independent frame.

However, even these expanded references aren’t universally unique. In a very large universe, there may be a twin Earth which also has a dog at coordinates 37.856570, -122.284176. As computer scientists will find obvious, specifying spatial coordinates requires a number of bits logarithmic in the amount of space addressed. These globally unique identifiers get more and more unwieldy the more space is addressed.
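
To make the logarithmic growth concrete, here is a minimal sketch. The function name `bits_to_address` and the region sizes are illustrative assumptions, not part of the original argument; the code only counts how many bits are needed to give every cell of a region its own address.

```python
def bits_to_address(num_cells: int) -> int:
    """Bits needed to give each of `num_cells` locations a distinct address."""
    if num_cells < 1:
        raise ValueError("need at least one cell")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, with no float error.
    return (num_cells - 1).bit_length()

# Hypothetical region sizes, measured in addressable cells.
for cells in (10**6, 10**12, 10**24, 10**80):
    print(f"{cells:.0e} cells -> {bits_to_address(cells)} bits")
```

Doubling the region adds only one bit, so the identifiers grow slowly; but they never stop growing, and a reference that is unique over a larger region always costs more than one that is unique only locally.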

Since we don’t expand out references enough to be sure they’re globally unique, our use of them couldn’t depend on such global uniqueness. An accounting of how we refer to things, therefore, cannot posit any causally-effective standpoint-independent frame that assigns semantics.

Indeed, the trouble of globally unique references can also be seen by studying physics itself. Physical causality is spatially local; a particle affects nearby particles, and there’s a speed-of-light limitation. For spatial references to be effective (e.g. to connect to observation and action), they have to themselves “move through” local space-and-time.

This is a bit like the problem of having a computer refer to itself. A computer may address computers by IP address. The loopback IP address always refers to this computer. These references can be resolved even without an Internet connection. It would be totally unnecessary and unwieldy for a computer to refer to itself (e.g. for the purpose of accessing files) through a globally-unique IP address, resolved through Internet routing.
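
As a small illustration of self-reference that is resolved locally, the sketch below uses Python's standard `ipaddress` module to check whether an address is a loopback address. The helper name `refers_to_self` and the specific example addresses are my own choices for illustration; the standard IPv4 and IPv6 loopback addresses are used, alongside one ordinary routable address for contrast.

```python
import ipaddress

def refers_to_self(address: str) -> bool:
    """True if `address` is a loopback address, i.e. one the local host
    resolves to itself without any routing."""
    return ipaddress.ip_address(address).is_loopback

print(refers_to_self("127.0.0.1"))  # standard IPv4 loopback
print(refers_to_self("::1"))        # standard IPv6 loopback
print(refers_to_self("8.8.8.8"))    # a globally routable address
```

The same loopback string names a different machine on every machine that uses it, which is what makes it deictic rather than globally unique.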

Studying enough examples like these (real and hypothetical) leads to the conclusion that indexicality (and more specifically, deixis) is fundamental, and that even spatial references that appear to be globally unique are resolved deictically.

How does this relate to physics? It means references to “the objective world” or “the physical world” must also be resolved indexically, from some standpoint. Paying attention to how these references are resolved is critical.

The experimental results you see are the ones in front of you. You can’t see experimental results that don’t, through spatio-temporal information flows, make it to you. Thus, references to the physical which go through discussing “the thing causing experimental predictions” or “the things experiments failed to falsify” are resolved in a standpoint-dependent way.

It could be argued that physical law is standpoint-independent, because it is, symmetrically, true at each point in space-time. However, this excludes virtual standpoints (e.g. existing in a computer simulation), and additionally, this only means the laws are standpoint-independent, not the contents of the world, the things described by the laws.

Pre-reduction references

(For previous work, see “Reductive Reference”.)

Indexicality by itself undermines view-from-nowhere mythology, but perhaps not physicalism itself. What presents a greater challenge for physicalism is the problem of pre-reduced references (which are themselves deictic).

Let’s go back to the twin Earth thought experiment. Suppose we are in pre-chemistry times. We still know about water. We know water through our interactions with it. Later, chemistry will find that water has a particular chemical formula.

In pre-chemistry times, it cannot be known whether the formula is H2O, XYZ, etc, and these formulae are barely symbolically meaningful. If we discover that water is H2O, we will, after-the-fact, define “water” to mean H2O; if we discover that water is XYZ, we will, after-the-fact, define “water” to mean XYZ.

Looking back, it’s clear that “water” has to be H2O, but this couldn’t have been clear at the time. Pre-chemistry, “water” doesn’t yet have a physical definition; a physical definition is assigned later, which rationalizes previous use of the word “water” into a physicalist paradigm.

A philosophical account of reductionism needs to be able to discuss how this happens. To do this, it needs to be able to discuss the ontological status of entities such as “water” (pre-chemistry) that do not yet have a physical definition. In this intermediate state, the philosophy is talking about two entities, pre-reduced entities and physics, and considering various bridgings between them. So the intermediate state needs to contain entities that are not yet conceptualized physically.

A possible physicalist objection is that, while it may be a provisional truth that water is definitionally the common drinkable liquid found in rivers and so on, it is ultimately true that water is H2O, and so physicalism is ultimately true. (This is very similar to the two truths doctrine in Buddhism.)

Now, expanding out this account needs to provide an account of the relation between provisional and ultimate truth. Even if such an account could be provided, it would appear that, in our current state, we must accept it as provisionally true that some mental entities (e.g. imagination) do not have physical definitions, since a good-enough account has not yet been provided. And we must have a philosophy that can grapple with this provisional state of affairs, and judge possible bridgings as fitting/unfitting.

Moreover, there has never been a time without provisional definition. So this idea of ultimate truth functions as a sort of utopia, which is either never achieved, or is only achieved after very great advances in philosophy, science, and so on. The journey is, then, more important than the destination, and to even approach the destination, we need an ontology that can describe and usably function within the journeying process; this ontology will contain provisional definitions.

The broader point here is that, even if we have the idea of “ultimate truth”, that idea isn’t meaningful (in terms of observations, actions, imaginations, etc) to a provisional perspective, unless somehow the provisional perspective can conceptualize the relation between itself and the ultimate truth. And, if the ultimate truth contains all provisional truths (as is true if forgetting is not epistemically normative), the ultimate truth needs to conceptualize this as well.

Epistemic status of physics

Consider the question: “Why should I believe in physics?”. The conventional answer is: “Because it predicts experimental results.” Someone who can observe these experimental results can, thus, have epistemic justification for belief in physics.

This justificatory chain implies that there are cognitive actors (such as persons or social processes) that can do experiments and see observations. These actors are therefore, in a sense, agents.

A physicalist philosophical paradigm should be able to account for epistemic justifications of physics, else it fails to self-ratify. So the paradigm needs to account for observers (and perhaps specifically active observers), who are the ones having epistemic justification for belief in physics.

Believing in observers leads to the typical mind-body problems. Disbelieving in observers fails to self-ratify. (Whenever a physicalist says “an observation is X physical entity”, it can be asked why X counts as an observation of the sort that is epistemically compelling; the answer to this question must bridge the mental and the physical, e.g. by saying the brain is where epistemic cognition happens. And saying “you know your observations are the things processed in this brain region because of physics” is circular.)

What mind-body problems? There are plenty.


Anthropics

The anthropic principle states, roughly, that epistemic agents must believe that the universe contains epistemic agents. Else, they would believe themselves not to exist.

The language of physics, on its own, doesn’t have the machinery to say what an observer is. Hence, anthropics is a philosophical problem.

The standard way of thinking about anthropics (e.g. SSA/SIA) is to consider the universe from a view-from-nowhere, and then assume that “my” body is in some way sampled “randomly” from this viewed-from-nowhere universe, such that I proceed to get observations (e.g. visual) from this body.

This is already pretty wonky. Indexicality makes the view-from-nowhere problematic. And the idea that “I” am “randomly” placed into a body is a rather strange metaphysics (when and where does this event happen?).

But perhaps the most critical issue is that the physicalist anthropic paradigm assumes it’s possible to take a physical description of the universe (e.g. as an equation) and locate observers in it.

There are multiple ways of considering doing so, and perhaps the best is functionalism, which will be discussed later. However, I’ll note that a subjectivist paradigm can easily find at least one observer: I’m right here right now.

This requires some explaining. Say you’re lost in an amusement park. There are about two ways of thinking about this:

  1. You don’t know where you are, but you know where the entrance is.
  2. You don’t know where the entrance is, but you know where you are.

Relatively speaking, 1 is an “objective” (relatively standpoint-independent) answer, and 2 is a “subjective” (relatively standpoint-dependent) answer.

2 has the intuitive advantage that you can point to yourself, but not to the entrance. This is because pointing is deictic.

Even while being lost, you can still find your way around locally. You might know where the Ferris wheel is, or the food stand, or your backpack. And so you can make a local map, which has not been placed relative to the entrance. This map is usable despite its disconnection from a global reference frame.

Anthropics seems to be saying something similar to (1). The idea is that I, initially, don’t know “where I am” in the universe. But, the deictic critique applies to anthropics as it applies to the amusement park case. I know where I am, I’m right here. I know where the Earth is, it’s under me. And so on.

This way of locating (at least one) observer works independent of ability to pick out observers given a physical description of the universe. Rather than finding myself relative to physics, I find physics relative to me.

Of course, the subjectivist framework has its own problems, such as difficulty finding other observers. So there is a puzzle here.

Tool use and functionalism

Functionalism is perhaps the current best answer as to how to locate observers in physics. Before discussing functionalism, though, I’ll discuss tools.

What’s a hammer? It’s a thing you can swing to apply lots of force to something at once. Hammers can be made of many physical materials, such as stone, iron, or wood. It’s about the function, not the substance.

The definition I gave refers to a “you” who can swing the hammer. Who is the “you”? Well, that’s standpoint-dependent. Someone without arms can’t use a conventional hammer to apply lots of force. The definition relativizes to the potential user. (Yes, a person without arms may say conventional hammers are hammers due to social convention, but this social convention is there because conventional hammers work for most people, so it still relativizes to a population.)

Let’s talk about functionalism now. Functionalism is based on the idea of multiple realizability: that a mind can be implemented on many different substrates. A mind is defined by its functions rather than its substrate. This idea is very familiar to computer programmers, who can hide implementation details behind an interface, and don’t need to care about hardware architecture for the most part.

This brings us back to tools. The definition I gave of “hammer” is an interface: it says how it can be used (and what effects it should create upon being used).
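
The programming analogy can be sketched directly. Everything named below (`Hammer`, `StoneHammer`, `IronHammer`, `drive_nail`, and the numeric multipliers) is hypothetical and chosen only for illustration; the point is that the calling code depends on the interface and on the user, not on the material.

```python
from typing import Protocol

class Hammer(Protocol):
    """Interface: anything a user can swing to deliver force all at once."""
    def swing(self, user_strength: float) -> float:
        """Return the force delivered for a user of the given strength."""
        ...

class StoneHammer:
    def swing(self, user_strength: float) -> float:
        return 3.0 * user_strength  # made-up multiplier

class IronHammer:
    def swing(self, user_strength: float) -> float:
        return 5.0 * user_strength  # made-up multiplier

def drive_nail(tool: Hammer, user_strength: float, force_needed: float) -> bool:
    # Uses only the interface; the substance of the tool never appears here.
    return tool.swing(user_strength) >= force_needed

print(drive_nail(IronHammer(), 2.0, 10.0))
print(drive_nail(StoneHammer(), 2.0, 10.0))
print(drive_nail(IronHammer(), 0.0, 10.0))  # a user who cannot swing it
```

Note that `swing` takes the user's strength as an argument: whether an object functions as a hammer in this sketch depends on who is using it, which mirrors the way the definition relativizes to the potential user.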

What sort of functions does a mind have? Observation, prediction, planning, modeling, acting, and so on. Now, the million-dollar question: Who is (actually or potentially) using it for these functions?

There are about three different answers to this:

  1. The mind itself. I use my mind for functions including planning and observation. It functions as a mind as long as I can use it this way.
  2. Someone or something else. A corporation, a boss, a customer, the government. Someone or something who wants to use another mind for some purpose.
  3. It’s objective. Things have functions or not independent of the standpoint.

I’ll note that 1 and 2 are both standpoint-dependent, thus subjectivist. They can’t be used to locate minds in physics; there would have to be some starting point, of having someone/something intending to use a mind for something.

3 is interesting. However, we now have a disanalogy from the hammer case, where we could identify some potential user. It’s also rather theological, in saying the world has an observer-independent telos. I find the theological implications of functionalism to be quite interesting and even inspiring, but that still doesn’t help physicalism, because physicalist ontology doesn’t contain standpoint-independent telos. We could, perhaps, say that physicalism plus theism yields objective functionalism. And this requires adding a component beyond the physical equation of the universe, if we wish to find observers in it.

Causality versus logic

Causality contains the idea that things “could” go one way or another. Else, causal claims reduce to claims about state; there wouldn’t be a difference between “if X, then Y” and “X causes Y”.

Pearlian causality makes this explicit; causal relations are defined in terms of interventions, which come from outside the causal network itself.
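
A toy model may help show why conditioning and intervening differ. The sketch below is a hypothetical example of my own, not taken from Pearl or from the argument above: a hidden common cause Z determines both X and Y, and X has no effect on Y. Observing X=1 is informative about Y, while setting X=1 from outside the network is not.

```python
from fractions import Fraction

# Hypothetical toy model: a hidden common cause Z drives both X and Y.
P_Z = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def x_mechanism(z: int) -> int:
    return z

def y_mechanism(z: int, x: int) -> int:
    return z  # ignores x: there is no arrow from X to Y

def p_y_given_x(x_obs: int) -> Fraction:
    """Observational P(Y=1 | X=x_obs): conditioning inside the network."""
    num = sum(p for z, p in P_Z.items()
              if x_mechanism(z) == x_obs and y_mechanism(z, x_obs) == 1)
    den = sum(p for z, p in P_Z.items() if x_mechanism(z) == x_obs)
    return num / den

def p_y_do_x(x_set: int) -> Fraction:
    """Interventional P(Y=1 | do(X=x_set)): X's own mechanism is overridden
    from outside, so Z keeps its prior distribution."""
    return sum(p for z, p in P_Z.items() if y_mechanism(z, x_set) == 1)

print(p_y_given_x(1))  # what seeing X=1 tells us about Y
print(p_y_do_x(1))     # what forcing X=1 does to Y
```

In this model "if X=1 then Y=1" holds with certainty, yet "X causes Y" is false; the two come apart only because the intervention is allowed to bypass the mechanism that normally sets X.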

The ontology of physics itself is causal. It is asserted, not just that some state will definitely follow some previous state, but that there are dynamics that push previous states to new states, in a necessary way. (This is clear in the case of dynamical systems.)

Indeed, since experiments may be thought of as interventions, it is entirely sensible that a physical theory that predicts the results of these interventions must be causal.

These “coulds” have a difficult status in relation to logic. Someone who already knows the initial state of a system can logically deduce its eventual state. To them, there is inevitability, and no logically possible alternative.

It appears that, while “could”s exist from the standpoint of an experimenter, they do not exist from the standpoint of someone capable of predicting the experimenter, such as Laplace’s demon.

This is not much of a problem if we’ve already accepted fundamental deixis and rejected the view-from-nowhere. But it is a problem for those who haven’t.

Trying to derive decision-theoretic causality from physical causality results in causal decision theory, which is known to have a number of bugs, due to its reliance on hypothetical extra-physical interventions.

An alternative is to try to develop a theory of “logical causality”, by which some logical facts (such as “the output of my decision process”, assuming you know your source code) can cause others. However, this is oxymoronic, because logic does not contain the affordance for intervention. Logic contains the affordance for constructing and checking proofs. It does not contain the affordance for causing 3+4 to equal 8. A sufficiently good reasoner can immediately see that “3+4=8” runs into contradiction; there is no way to construct a possible world in which 3+4=8.
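
The arithmetic point can be stated as a checked proof. The following is a minimal Lean 4 sketch, assuming the natural-number reading of the numerals; it illustrates that what logic offers is proof construction and checking, with no operation that would make the false equation hold.

```lean
-- 3 + 4 evaluates to 7, so this holds by computation.
example : 3 + 4 = 7 := rfl

-- The claim "3 + 4 = 8" is refutable: equality on natural numbers is decidable.
example : 3 + 4 ≠ 8 := by decide
```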

Hence, it is hard to say that “coulds” exist in a standpoint-independent way. We may, then, accept standpoint-dependence of causation (as I do), or reject causation entirely.


Conclusion

My claim isn’t that physicalism is false, or that there don’t exist physicalist answers to these puzzles. My claim, rather, is that these puzzles are at least somewhat difficult, and that sufficient contemplation on them will destabilize many forms of physicalism. The current way I answer these puzzles is through a critical agential framework, but other ways of answering them are possible as well.

A conversation on theory of mind, subjectivity, and objectivity

I recently had a Twitter conversation with Roko Mijic. I believe it contains ideas that a wider philosophical/rationalist audience may find valuable, and so include here a transcript (quoted with permission).

Jessica: There are a number of “runs on top of” relations in physicalism:

  • mind runs on top of body
  • discrete runs on top of continuous
  • choice runs on top of causality

My present philosophy inverts the metaphysical order: mind/discrete/choice is more basic.

This is less of a problem than it first appears, because mind/discrete/choice can conceptualize, hypothesize, and learn about body/continuous/causality, and believe in an “effectively runs on” relation between the two.

In contrast, starting from body/continuous/causality has trouble with getting to mind/discrete/choice as even being conceptualizable, hence tending towards eliminativism.

Roko: Eliminativism has a good track record though.

Jessica: Nah, it can’t account for what an “observation” is so can’t really explain observations.

Roko: I don’t really see a problem here. It makes perfect sense within a reductionist or eliminativist paradigm for a robot to have some sensors and to sense its environment. You don’t need a soul, or god, or strong free will, or objective person-independent values for that.

Jessica: Subjective Occam’s razor (incl. Solomonoff induction) says I should adopt the explanation that best explains my observations. Eliminativism can’t really say what “my” means here. If it believed in “my observations” it would believe in consciousness.

It has to do some ontological reshuffling around what “observations” are that, I think, undermines the case for believing in physics in the first place, which is that it explains my observations.

Roko: It means the observations that are caused by sensors plugged into the hardware that your algorithm instance is running on.

Jessica: That means “my algorithm instance” exists. Sounds like a mental entity. Can’t really have those under eliminativism (but can under functionalism etc).

Roko: I don’t want to eliminate my mental instance from my philosophy, that would be kind of ridiculous.

Jessica: Well, yes, so eliminativism is false. I understand eliminativism to mean there is only physical, no mental. Believing mental runs on physical could be functionalism, property dualism, or some other non-eliminativist position.

Roko: I think it makes more sense to think of mental things as existing subjectively (i.e. if they belong to you) and physical things as existing objectively. I definitely think that dualism is making a mistake in thinking of objectively-existing mental things.

Jessica: I don’t think this objective/subjective dichotomy works out. I haven’t seen a good positive case, and my understanding of deixis leads me to believe that references to the objective must be resolved subjectively. See also On the Origin of Objects.

Basically I don’t see how we can, in a principled way, have judgments like “X exists but only subjectively, not objectively”. It would appear that by saying “X exists” I am asserting that X is an existent object (i.e. I’m saying something objective).

See also Thomas Nagel’s The View From Nowhere. Spoiler alert: there isn’t a view from nowhere, it’s an untenable concept.

Roko: My sensation of the flavor of chocolate exists but only subjectively.

Jessica: We’re now talking about the sensation of the flavor of the chocolate though. Is this really that different from talking about “that car over there”? I don’t see how some entities can, in a principled way, be classified as objective and some as subjective.

Like, in talking about “X” I’m porting something in my mental world-representation into the discursive space. I don’t at all see how to classify some of these portings as objective and some as subjective.

See also writing on the difficulty of the fact/opinion distinction.

Roko: It’s not actually the flavor “of” the chocolate though. It’s the sensation of flavor that your brain generates for you only, in response to certain nerve stimuli.

> I don’t see how some entities can, in a principled way, be classified as objective and some as subjective.

It’s very easy actually. Subjectives are the things that you cannot possibly be mistaken about, the “I think therefore I am’s”.

No deceiving demon can fool you into thinking that you’re experiencing the taste of chocolate, the color purple, or an orgasm. No deceiving demon can fool you into thinking that you’re visualizing the number 4.

Jessica: I don’t think this is right. The thought follows the experience. There can be mistranslations along the way. This might seem like a pedantic point but we’re talking about linguistic subjective statements so it’s relevant.

Translating the subjective into words can introduce errors. It’s at least as hard as, say, adding small numbers. So your definition means “1+1=2” is also subjective.

Roko: I think that it’s reasonable to see small number math instances as subjectives. I can see 3 pens. I can conceive of 3 dots, that’s a subjective thing. It’s in the same class as seeing red or smelling a rose.

[continuing from the deceiving demon thread] These are the things that are inherently part of your instance or mind. The objective, on the other hand, is always somewhat uncertain and inferred. Things are out there and they send signals to you. But you are inferring their existence.

Jessica: Okay, I agree with this sort of mental/outside-mental distinction, and you can define subjective/objective to mean that. This certainly doesn’t bring in other connotations of the objective, such as view-from-nowhere or observer-independence; I can be wrong about indexicals too.

Roko: Well it happens to be a property of our world that when different people infer the shape of the objective (i.e. draw maps), they always converge. This is what being in a shared reality means.

I mean they always converge if they follow the right principles, e.g. complexity priors, and those same principles are the ones that allow us to successfully manipulate reality via actions. That’s what the objective world out there is.

Jessica: Two reasons they could converge:

  1. Symmetry (this explains math)
  2. Existence of same entities (e.g. landmarks)

I’m fine with calling 1 observer-independent. Problem: your view of, and references to, 2, depend on your standpoint. Because of deixis.

Obvious deictic references are things like “the car over there” or “the room I’m in”. It is non-obvious but, I think, true, that all physical references are deictic. Which makes sense because physical causality is deictic (locally causal and symmetric).

Even “the Great Wall of China” refers to the Great Wall of China on our Earth. It couldn’t refer to the one on the twin Earth. And the people on twin Earth have “the Great Wall of China” refer to the one on the twin Earth, not ours.

At the same time, maps created starting from different places can be patched together, in a collage. However, pasting these together requires taking into account the standpoint-dependence of the individual maps being pasted together.

And at no point does this pasting-together result in a view from nowhere. It might seem that way because it keeps getting bigger and more zoomed-out. But at each individual time it’s finite.

Roko: Yes this is all nice but I think the point where we get to hard questions is when we think about mental phenomena that I would classify as subjectives as being part of the objective reality.

This is the petrl.org problem, or @reducesuffering worrying about whether plankton or insects “really do” have subjective experiences etc

Jessica: In my view “my observation” is an extremely deictic reference, to something maximally here-and-now, such that there isn’t any stabilization to do. Intermediate maps paste these extremely deictic maps together into less-deictic, but still deictic, maps. It never gets non-deictic.

It’s hard to pin down intersubjectively precisely because it’s so deictic. I can’t really port my here-and-now to your here-and-now without difficulty.