Dialogue on Appeals to Consequences

[note: the following is essentially an expanded version of this LessWrong comment on whether appeals to consequences are normative in discourse. I am exasperated that this is even up for debate, but I figure that making the argumentation here explicit is helpful]

Carter and Quinn are discussing charitable matters in the town square, with a few onlookers.

Carter: “So, this local charity, People Against Drowning Puppies (PADP), is nominally opposed to drowning puppies.”

Quinn: “Of course.”

Carter: “And they said they’d saved 2170 puppies last year, whereas their total spending was $1.2 million, so they estimate they save one puppy per $553.”

Quinn: “Sounds about right.”

Carter: “So, I actually checked with some of their former employees, and if what they say and my corresponding calculations are right, they actually only saved 138 puppies.”
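
[note: a quick check of the arithmetic in the dialogue, using only the figures Carter cites ($1.2 million in spending, 2,170 claimed rescues, 138 rescues by his corrected count):]

$$\frac{\$1{,}200{,}000}{2{,}170} \approx \$553 \text{ per claimed puppy}, \qquad \frac{\$1{,}200{,}000}{138} \approx \$8{,}700 \text{ per puppy actually saved.}$$

[The second figure is what Quinn later rounds to “about one puppy per $10,000”.]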

Quinn: “Hold it right there. Regardless of whether that’s true, it’s bad to say that.”

Carter: “That’s an appeal to consequences, well-known to be a logical fallacy.”

Quinn: “Is that really a fallacy, though? If saying something has bad consequences, isn’t it normative not to say it?”

Carter: “Well, for my own personal decisionmaking, I’m broadly a consequentialist, so, yes.”

Quinn: “Well, it follows that appeals to consequences are valid.”

Carter: “It isn’t logically valid. If saying something has bad consequences, that doesn’t make it false.”

Quinn: “But it is decision-theoretically compelling, right?”

Carter: “In theory, if it could be proven, yes. But, you haven’t offered any proof, just a statement that it’s bad.”

Quinn: “Okay, let’s discuss that. My argument is: PADP is a good charity. Therefore, they should be getting more donations. Saying that they didn’t save as many puppies as they claimed they did, in public (as you just did), is going to result in them getting fewer donations. Therefore, your saying that they didn’t save as many puppies as they claimed to is bad, and is causing more puppies to drown.”

Carter: “While I could spend more effort to refute that argument, I’ll initially note that you only took into account a single effect (people donating less to PADP) and neglected other effects (such as people having more accurate beliefs about how charities work).”

Quinn: “Still, you have to admit that my case is plausible, and that some onlookers are convinced.”

Carter: “Yes, it’s plausible, in that I don’t have a full refutation, and my models have a lot of uncertainty. This gets into some complicated decision theory and sociological modeling. I’m afraid we’ve gotten sidetracked from the relatively clear conversation, about how many puppies PADP saved, to a relatively unclear one, about the decision theory of making actual charity effectiveness clear to the public.”

Quinn: “Well, sure, we’re into the weeds now, but this is important! If it’s actually bad to say what you said, it’s important that this is widely recognized, so that we can have fewer… mistakes like that.”

Carter: “That’s correct, but I feel like I might be getting trolled. Anyway, I think you’re shooting the messenger: when I started criticizing PADP, you turned around and made the criticism about me saying that, deflecting attention away from PADP’s possible fraudulent activity.”

Quinn: “You still haven’t refuted my argument. If you don’t do so, I win by default.”

Carter: “I’d really rather that we just outlaw appeals to consequences, but, fine, as long as we’re here, I’m going to do this, and it’ll be a learning experience for everyone involved. First, you said that PADP is a good charity. Why do you think this?”

Quinn: “Well, I know the people there and they seem nice and hardworking.”

Carter: “But, they said they saved over 2000 puppies last year, when they actually only saved 138, indicating some important dishonesty and ineffectiveness going on.”

Quinn: “Allegedly, according to your calculations. Anyway, saying that is bad, as I’ve already argued.”

Carter: “Hold up! We’re in the middle of evaluating your argument that saying that is bad! You can’t use the conclusion of this argument in the course of proving it! That’s circular reasoning!”

Quinn: “Fine. Let’s try something else. You said they’re being dishonest. But, I know them, and they wouldn’t tell a lie, consciously, although it’s possible that they might have some motivated reasoning, which is totally different. It’s really uncivil to call them dishonest like that. If everyone did that with the willingness you had to do so, that would lead to an all-out rhetorical war…”

Carter: “God damn it. You’re making another appeal to consequences.”

Quinn: “Yes, because I think appeals to consequences are normative.”

Carter: “Look, at the start of this conversation, your argument was that saying PADP only saved 138 puppies is bad.”

Quinn: “Yes.”

Carter: “And now you’re in the course of arguing that it’s bad.”

Quinn: “Yes.”

Carter: “Whether it’s bad is a matter of fact.”

Quinn: “Yes.”

Carter: “So we have to be trying to get the right answer, when we’re determining whether it’s bad.”

Quinn: “Yes.”

Carter: “And, while appeals to consequences may be decision theoretically compelling, they don’t directly bear on the facts.”

Quinn: “Yes.”

Carter: “So we shouldn’t have appeals to consequences in conversations about whether the consequences of saying something are bad.”

Quinn: “Why not?”

Carter: “Because we’re trying to get to the truth.”

Quinn: “But aren’t we also trying to avoid all-out rhetorical wars, and puppies drowning?”

Carter: “If we want to do those things, we have to do them by getting to the truth.”

Quinn: “The truth, according to your opinion-”

Carter: “God damn it, you just keep trolling me, so we never get to discuss the actual facts. God damn it. Fuck you.”

Quinn: “Now you’re just spouting insults. That’s really irresponsible, given that I just accused you of doing something bad, and causing more puppies to drown.”

Carter: “You just keep controlling the conversation by OODA looping faster than me, though. I can’t refute your argument, because you appeal to consequences again in the middle of the refutation. And then we go another step down the ladder, and never get to the truth.”

Quinn: “So what do you expect me to do? Let you insult well-reputed animal welfare workers by calling them dishonest?”

Carter: “Yes! I’m modeling the PADP situation using decision-theoretic models, which require me to represent the knowledge states and optimization pressures exerted by different agents (both conscious and unconscious), including when these optimization pressures are towards deception, and even when this deception is unconscious!”

Quinn: “Sounds like a bunch of nerd talk. Can you speak more plainly?”

Carter: “I’m modeling the actual facts of how PADP operates and how effective they are, not just how well-liked the people are.”

Quinn: “Wow, that’s a strawman.”

Carter: “Look, how do you think arguments are supposed to work, exactly? Whoever is best at claiming that their opponent’s argumentation is evil wins?”

Quinn: “Sure, isn’t that the same thing as who’s making better arguments?”

Carter: “If we argue by proving our statements are true, we reach the truth, and thereby reach the good. If we argue by proving each other evil, we reach neither the truth nor the good.”

Quinn: “In this case, though, we’re talking about drowning puppies. Surely, the good in this case is causing fewer puppies to drown, and directing more resources to the people saving them.”

Carter: “That’s under contention, though! If PADP is lying about how many puppies they’re saving, they’re making the epistemology of the puppy-saving field worse, leading to fewer puppies being saved. And, they’re taking money away from the next-best-looking charity, which is probably more effective if, unlike PADP, they’re not lying.”

Quinn: “How do you know that, though? How do you know the money wouldn’t go to things other than saving drowning puppies if it weren’t for PADP?”

Carter: “I don’t know that. My guess is that the money might go to other animal welfare charities that claim high cost-effectiveness.”

Quinn: “PADP is quite effective, though. Even if your calculations are right, they save about one puppy per $10,000. That’s pretty good.”

Carter: “That’s not even that impressive, but even if their direct work is relatively effective, they’re destroying the epistemology of the puppy-saving field by lying. So effectiveness basically caps out there instead of getting better due to better epistemology.”

Quinn: “What an exaggeration. There are lots of other charities that have misleading marketing (which is totally not the same thing as lying). PADP isn’t singlehandedly destroying anything, except instances of puppies drowning.”

Carter: “I’m beginning to think that the difference between us is that I’m anti-lying, whereas you’re pro-lying.”

Quinn: “Look, I’m only in favor of lying when it has good consequences. That makes me different from pro-lying scoundrels.”

Carter: “But you have really sloppy reasoning about whether lying, in fact, has good consequences. Your arguments for doing so, when you lie, are made of Swiss cheese.”

Quinn: “Well, I can’t deductively prove anything about the real world, so I’m using the most relevant considerations I can.”

Carter: “But you’re using reasoning processes that systematically protect certain cached facts from updates, and use these cached facts to justify not updating. This was very clear when you used outright circular reasoning, to use the cached fact that denigrating PADP is bad, to justify terminating my argument that it wasn’t bad to denigrate them. Also, you said the PADP people were nice and hardworking as a reason I shouldn’t accuse them of dishonesty… but, the fact that PADP saved far fewer puppies than they claimed actually casts doubt on those facts, and the relevance of them to PADP’s effectiveness. You didn’t update when I first told you that fact, you instead started committing rhetorical violence against me.”

Quinn: “Hmm. Let me see if I’m getting this right. So, you think I have false cached facts in my mind, such as PADP being a good charity.”

Carter: “Correct.”

Quinn: “And you think those cached facts tend to protect themselves from being updated.”

Carter: “Correct.”

Quinn: “And you think they protect themselves from updates by generating bad consequences of making the update, such as fewer people donating to PADP.”

Carter: “Correct.”

Quinn: “So you want to outlaw appeals to consequences, so facts have to get acknowledged, and these self-reinforcing loops go away.”

Carter: “Correct.”

Quinn: “That makes sense from your perspective. But, why should I think my beliefs are wrong, and that I have lots of bad self-protecting cached facts?”

Carter: “If everyone were as willing as you to lie, the history books would be full of convenient stories, the newspapers would be parts of the matrix, the schools would be teaching propaganda, and so on. You’d have no reason to trust your own arguments that speaking the truth is bad.”

Quinn: “Well, I guess that makes sense. Even though I lie in the name of good values, not everyone agrees on values or beliefs, so they’ll lie to promote their own values according to their own beliefs.”

Carter: “Exactly. So you should expect that, as a reflection of your lying to the world, the world lies back to you. So your head is full of lies, like the ‘PADP is effective and run by good people’ one.”

Quinn: “Even if that’s true, what could I possibly do about it?”

Carter: “You could start by not making appeals to consequences. When someone is arguing that a belief of yours is wrong, listen to the argument at the object level, instead of jumping to the question of whether saying the relevant arguments out loud is a good idea, which is a much harder question.”

Quinn: “But how do I prevent actually bad consequences from happening?”

Carter: “If your head is full of lies, you can’t really trust ad-hoc object-level arguments against speech, like ‘saying PADP didn’t save very many puppies is bad because PADP is a good charity’. You can instead think about what discourse norms lead to the truth being revealed, and which lead to it being obscured. We’ve seen, during this conversation, that appeals to consequences tend to obscure the truth. And so, if we share the goal of reaching the truth together, we can agree not to do those.”

Quinn: “That still doesn’t answer my question. What about things that are actually bad, like privacy violations?”

Carter: “It does seem plausible that there should be some discourse norms that protect privacy, so that some facts aren’t revealed, if such norms have good consequences overall. Perhaps some topics, such as individual people’s sex lives, are considered to be banned topics (in at least some spaces), unless the person consents.”

Quinn: “Isn’t that an appeal to consequences, though?”

Carter: “Not really. Deciding what privacy norms are best requires thinking about consequences. But, once those norms have been decided on, it is no longer necessary to prove that privacy violations are bad during discussions. There’s a simple norm to appeal to, which says some things are out of bounds for discussion. And, these exceptions can be made without allowing appeals to consequences in full generality.”

Quinn: “Okay, so we still have something like appeals to consequences at the level of norms, but not at the level of individual arguments.”

Carter: “Exactly.”

Quinn: “Does this mean I have to say a relevant true fact, even if I think it’s bad to say it?”

Carter: “No. Those situations happen frequently, and while some radical honesty practitioners try not to suppress any impulse to say something true, this practice is probably a bad idea for a lot of people. So, of course you can evaluate consequences in your head before deciding to say something.”

Quinn: “So, in summary: if we’re going to have suppression of some facts being said out loud, we should have that through either clear norms designed with consequences (including consequences for epistemology) in mind, or individuals deciding not to say things, but otherwise our norms should be protecting true speech, and outlawing appeals to consequences.”

Carter: “Yes, that’s exactly right! I’m glad we came to agreement on this.”

Why artificial optimism?

Optimism bias is well-known. Here are some examples.

  • It’s conventional to answer the question “How are you doing?” with “well”, regardless of how you’re actually doing. Why?
  • People often believe that it’s inherently good to be happy, rather than thinking that their happiness level should track the actual state of affairs (and thus be a useful tool for emotional processing and communication). Why?
  • People often think their project has an unrealistically high chance of succeeding. Why?
  • People often avoid looking at horrible things clearly. Why?
  • People often want to suppress criticism but less often want to suppress praise; in general, they hold criticism to a higher standard than praise. Why?

The parable of the gullible king

Imagine a kingdom ruled by a gullible king. The king gets reports from different regions of the kingdom (managed by different vassals). These reports detail how things are going in these different regions, including particular events, and an overall summary of how well things are going. He is quite gullible, so he usually believes these reports, although not if they’re too outlandish.

When he thinks things are going well in some region of the kingdom, he gives the vassal more resources, expands the region controlled by the vassal, encourages others to copy the practices of that region, and so on. When he thinks things are going poorly in some region of the kingdom (in a long-term way, not as a temporary crisis), he gives the vassal fewer resources, contracts the region controlled by the vassal, encourages others not to copy the practices of that region, possibly replaces the vassal, and so on. This behavior makes sense if he’s assuming he’s getting reliable information: it’s better for practices that result in better outcomes to get copied, and for places with higher economic growth rates to get more resources.

Initially, this works well, and good practices are adopted throughout the kingdom. But, some vassals get the idea of exaggerating how well things are going in their own region, while denigrating other regions. This results in their own region getting more territory and resources, and their practices being adopted elsewhere.

Soon, these distortions become ubiquitous, as the king (unwittingly) encourages everyone to adopt them, due to the apparent success of the regions distorting information this way. At this point, the vassals face a problem: while they want to exaggerate their own region and denigrate others, they don’t want others to denigrate their own region. So, they start forming alliances with each other. Vassals that ally with each other promise to say only good things about each other’s regions. That way, both vassals mutually benefit, as they both get more resources, expansion, etc compared to if they had been denigrating each other’s regions. These alliances also make sure to keep denigrating those not in the same coalition.

While these “praise coalitions” are locally positive-sum, they’re globally zero-sum: any gains that come from them (such as resources and territory) are taken from other regions. (However, having more praise overall helps the vassals currently in power, as it means they’re less likely to get replaced with other vassals).

Since praise coalitions lie, they also suppress the truth in general in a coordinated fashion. It’s considered impolite to reveal certain forms of information that could imply that things aren’t actually going as well as claimed. Prying too closely into a region’s actual state of affairs (and, especially, sharing this information) is considered a violation of privacy.

Meanwhile, the actual state of affairs has gotten worse in almost all regions, though the regions prop up their lies with Potemkin villages, so the gullible king isn’t shocked when he visits the region.

At some point, a single praise coalition wins. Vassals notice that it’s in their interest to join this coalition, since (as mentioned before) it’s in the interests of the vassals as a class to have more praise overall, which means they’re less likely to get replaced. (Of course, it’s also in their class interests to have things actually be going well in their regions, so the praise doesn’t get too out of hand, and criticism is sometimes accepted.) At this point, it’s conventional for vassals to always praise each other and punish vassals who denigrate other regions.

Optimism isn’t ubiquitous, however. There are a few strategies by which vassals can use pessimism to claim more resources. Among these are:

  • Blame: By claiming a vassal is doing something wrong, another vassal may be able to take power away from that vassal, sometimes getting a share of that power for themselves. (Blame is often not especially difficult, given that everyone’s inflating their impressions)
  • Pity: By showing that their region is undergoing a temporary but fixable crisis (perhaps with the help of other vassals), vassals can claim that they should be getting more resources. But, the problem has to be solvable; it has to be a temporary crisis, not a permanent state of decay. (One form of pity is claiming to be victimized by another vassal; this mixes blame and pity)
  • Doomsaying: By claiming that there is some threat to the kingdom (such as wolves), vassals can claim that they should be getting resources in order to fight this threat. Again, the threat has to be solvable; the king has little reason to give someone more resources if there is, indeed, nothing to do about the threat.

Pity and doomsaying could be seen as two sides of the same coin: pity claims things are going poorly (but fixably) locally, while doomsaying claims things are going poorly (but fixably) globally. However, all of these strategies are limited to a significant degree by the overall praise coalition, so they don’t get out of hand.
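
The incentive gradient in the parable can be made concrete with a minimal toy simulation (a sketch; the payoff rule and all parameters are invented purely for illustration, not part of the parable): the king reallocates a fixed pool of resources each round in proportion to reported output, and vassals differ only in how much they inflate their reports.

```python
import random

# Toy model of the gullible king: each round, the king reallocates a fixed
# resource pool in proportion to *reported* output. Vassals differ only in
# how much they exaggerate their reports.
random.seed(0)

NUM_VASSALS = 10
ROUNDS = 50

vassals = [
    {"exaggeration": 1.0 + 0.1 * i, "resources": 10.0}
    for i in range(NUM_VASSALS)
]

for _ in range(ROUNDS):
    # Actual output tracks resources (with noise); reports are inflated.
    reports = [
        v["resources"] * random.uniform(0.9, 1.1) * v["exaggeration"]
        for v in vassals
    ]
    pool = sum(v["resources"] for v in vassals)
    total_report = sum(reports)
    for v, report in zip(vassals, reports):
        v["resources"] = pool * report / total_report  # allocation follows reports

for v in sorted(vassals, key=lambda v: -v["resources"]):
    print(f"exaggeration {v['exaggeration']:.1f} -> resources {v['resources']:7.2f}")
```

Running this, the most exaggerating vassals end up with nearly the entire pool, and the honest vassal (exaggeration 1.0) is starved, even though output per unit of resources is identical everywhere. Praise coalitions are the pairwise version of the same move, with allies inflating each other’s reports rather than only their own.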

Back to the real world

Let’s relate the parable of the gullible king back to the real world.

  • The king is sometimes an actual person (such as a CEO, as in Moral Mazes, or a philanthropist), but is more often a process distributed among many people that is evaluating which things are good/bad, in a pattern-matching way.
  • Everyone’s a vassal to some degree. People who have more power-through-appearing-good are vassals with more territory, who have more of an interest in maintaining positive impressions.
  • Most (almost all?) coalitions in the real world have aspects of praise coalitions. They’ll praise those in the coalition while denigrating those outside it.
  • Politeness and privacy are, in fact, largely about maintaining impressions (especially positive impressions) through coordinating against the revelation of truth.
  • Maintaining us-vs-them boundaries is characteristic of the political right, while dissolving them (and punishing those trying to set them up) is characteristic of the political left. So, non-totalizing praise coalitions are more characteristic of the right, and total ones that try to assimilate others (such as the one that won in the parable) are more characteristic of the left. (Note, totalizing praise coalitions still denigrate/attack ones that can’t be safely assimilated; see the paradox of tolerance)
  • Coalitions may be fractal, of course.
  • A lot of the distortionary dynamics are subconscious (see: The Elephant in the Brain).

This model raises an important question (with implications for the real world): if you’re a detective in the kingdom of the gullible king who is at least somewhat aware of the reality of the situation and the distortionary dynamics, and you want to fix the situation (or at least reduce harm), what are your options?

The AI Timelines Scam

[epistemic status: that’s just my opinion, man. I have highly suggestive evidence, not deductive proof, for a belief I sincerely hold]

“If you see fraud and do not say fraud, you are a fraud.” – Nassim Taleb

I was talking with a colleague the other day about an AI organization that claims:

  1. AGI is probably coming in the next 20 years.
  2. Many of the reasons we have for believing this are secret.
  3. They’re secret because if we told people about those reasons, they’d learn things that would let them make an AGI even sooner than they would otherwise.

His response was (paraphrasing): “Wow, that’s a really good lie! A lie that can’t be disproven.”

I found this response refreshing, because he immediately jumped to the most likely conclusion.

Near predictions generate more funding

Generally, entrepreneurs who are optimistic about their project get more funding than ones who aren’t. AI is no exception. For a recent example, see the Human Brain Project. The founder, Henry Markram, predicted in 2009 that the project would succeed in simulating a human brain by 2019, and the project was already widely considered a failure by 2013. (See his TED talk, at 14:22)

The Human Brain Project got 1.3 billion euros of funding from the EU.

It’s not hard to see why this is. To justify receiving large amounts of money, the leader must make a claim that the project is actually worth that much. And, AI projects are more impactful if it is, in fact, possible to develop AI soon. So, there is an economic pressure towards inflating estimates of the chance AI will be developed soon.

Fear of an AI gap

The missile gap was a lie by the US Air Force to justify building more nukes, by falsely claiming that the Soviet Union had more nukes than the US.

Similarly, there’s historical precedent for an AI gap lie used to justify more AI development. Fifth Generation Computer Systems was an ambitious 1982 project by the Japanese government (funded for $400 million in 1992, or $730 million in 2019 dollars) to create artificial intelligence through massively parallel logic programming.

The project is widely considered to have failed.  From a 1992 New York Times article:

A bold 10-year effort by Japan to seize the lead in computer technology is fizzling to a close, having failed to meet many of its ambitious goals or to produce technology that Japan’s computer industry wanted.

That attitude is a sharp contrast to the project’s inception, when it spread fear in the United States that the Japanese were going to leapfrog the American computer industry. In response, a group of American companies formed the Microelectronics and Computer Technology Corporation, a consortium in Austin, Tex., to cooperate on research. And the Defense Department, in part to meet the Japanese challenge, began a huge long-term program to develop intelligent systems, including tanks that could navigate on their own.

The Fifth Generation effort did not yield the breakthroughs to make machines truly intelligent, something that probably could never have realistically been expected anyway. Yet the project did succeed in developing prototype computers that can perform some reasoning functions at high speeds, in part by employing up to 1,000 processors in parallel. The project also developed basic software to control and program such computers. Experts here said that some of these achievements were technically impressive.

In his opening speech at the conference here, Kazuhiro Fuchi, the director of the Fifth Generation project, made an impassioned defense of his program.

“Ten years ago we faced criticism of being too reckless,” in setting too many ambitious goals, he said, adding, “Now we see criticism from inside and outside the country because we have failed to achieve such grand goals.”

Outsiders, he said, initially exaggerated the aims of the project, with the result that the program now seems to have fallen short of its goals.

Some American computer scientists say privately that some of their colleagues did perhaps overstate the scope and threat of the Fifth Generation project. Why? In order to coax more support from the United States Government for computer science research.

(emphasis mine)

This bears similarity to some conversations on AI risk I’ve been party to in the past few years. The fear is that Others (DeepMind, China, whoever) will develop AGI soon, so We have to develop AGI first in order to make sure it’s safe, because Others won’t make sure it’s safe and We will. Also, We have to discuss AGI strategy in private (and avoid public discussion), so Others don’t get the wrong ideas. (Generally, these claims have little empirical/rational backing to them; they’re based on scary stories, not historically validated threat models)

The claim that others will develop weapons and kill us with them by default implies a moral claim to resources, and a moral claim to be justified in making weapons in response. Such claims, if exaggerated, justify claiming more resources and making more weapons. And they weaken a community’s actual ability to track and respond to real threats (as in The Boy Who Cried Wolf).

How does the AI field treat its critics?

Hubert Dreyfus, probably the most famous historical AI critic, published “Alchemy and Artificial Intelligence” in 1965, which argued that the techniques popular at the time were insufficient for AGI. Subsequently, he was shunned by other AI researchers:

The paper “caused an uproar”, according to Pamela McCorduck.  The AI community’s response was derisive and personal.  Seymour Papert dismissed one third of the paper as “gossip” and claimed that every quotation was deliberately taken out of context.  Herbert A. Simon accused Dreyfus of playing “politics” so that he could attach the prestigious RAND name to his ideas. Simon said, “what I resent about this was the RAND name attached to that garbage.”

Dreyfus, who taught at MIT, remembers that his colleagues working in AI “dared not be seen having lunch with me.”  Joseph Weizenbaum, the author of ELIZA, felt his colleagues’ treatment of Dreyfus was unprofessional and childish.  Although he was an outspoken critic of Dreyfus’ positions, he recalls “I became the only member of the AI community to be seen eating lunch with Dreyfus. And I deliberately made it plain that theirs was not the way to treat a human being.”

This makes sense as anti-whistleblower activity: ostracizing, discrediting, or punishing people who break the conspiracy to the public. Does this still happen in the AI field today?

Gary Marcus is a more recent AI researcher and critic. In 2012, he wrote:

Deep learning is important work, with immediate practical applications.

Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like “sibling” or “identical to.” They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems … use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning.

In 2018, he tweeted an article in which Yoshua Bengio (a deep learning pioneer) seemed to agree with these previous opinions. This tweet received a number of mostly-critical replies. Here’s one, by AI professor Zachary Lipton:

There’s a couple problems with this whole line of attack. 1) Saying it louder ≠ saying it first. You can’t claim credit for differentiating between reasoning and pattern recognition. 2) Saying X doesn’t solve Y is pretty easy. But where are your concrete solutions for Y?

The first criticism is essentially a claim that everybody knows that deep learning can’t do reasoning. But, this is essentially admitting that Marcus is correct, while still criticizing him for saying it [ED NOTE: the phrasing of this sentence is off (Lipton publicly agrees with Marcus on this point), and there is more context, see Lipton’s reply].

The second is a claim that Marcus shouldn’t criticize if he doesn’t have a solution in hand. This policy deterministically results in the short AI timelines narrative being maintained: to criticize the current narrative, you must present your own solution, which constitutes another narrative for why AI might come soon.

Deep learning pioneer Yann LeCun’s response is similar:

Yoshua (and I, and others) have been saying this for a long time.
The difference with you is that we are actually trying to do something about it, not criticize people who don’t.

Again, the criticism is not that Marcus is wrong in saying deep learning can’t do certain forms of reasoning, the criticism is that he isn’t presenting an alternative solution. (Of course, the claim could be correct even if Marcus doesn’t have an alternative!)

Apparently, it’s considered bad practice in AI to criticize a proposal for making AGI without presenting an alternative solution. Clearly, such a policy causes large distortions!

Here’s another response, by Steven Hansen (a research scientist at DeepMind):

Ideally, you’d be saying this through NeurIPS submissions rather than New Yorker articles. A lot of the push-back you’re getting right now is due to the perception that you haven’t been using the appropriate channels to influence the field.

That is: to criticize the field, you should go through the field, not through the press. This is standard guild behavior. In the words of Adam Smith: “People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices.”

(Also see Marcus’s medium article on the Twitter thread, and on the limitations of deep learning)

[ED NOTE: I’m not saying these critics on Twitter are publicly promoting short AI timelines narratives (in fact, some are promoting the opposite), I’m saying that the norms by which they criticize Marcus result in short AI timelines narratives being maintained.]

Why model sociopolitical dynamics?

This post has focused on sociopolitical phenomena involved in the short AI timelines phenomenon. For this, I anticipate criticism along the lines of “why not just model the technical arguments, rather than the credibility of the people involved?” To which I pre-emptively reply:

  • No one can model the technical arguments in isolation. Basic facts, such as the accuracy of technical papers on AI, or the filtering processes determining what you read and what you don’t, depend on sociopolitical phenomena. This is far more true for people who don’t themselves have AI expertise.
  • “When AGI will be developed” isn’t just a technical question. It depends on what people actually choose to do (and what groups of people actually succeed in accomplishing), not just what can be done in theory. And so basic questions like “how good is the epistemology of the AI field about AI timelines?” matter directly.
  • The sociopolitical phenomena are actively making technical discussion harder. I’ve had a well-reputed person in the AI risk space discourage me from writing publicly about the technical arguments, on the basis that getting people to think through them might accelerate AI timelines (yes, really).

Which is not to say that modeling such technical arguments is not important for forecasting AGI. I certainly could have written a post evaluating such arguments, and I decided to write this post instead, in part because I don’t have much to say on this issue that Gary Marcus hasn’t already said. (Of course, I’d have written a substantially different post, or none at all, if I believed the technical arguments that AGI is likely to come soon had merit to them)

What I’m not saying

I’m not saying:

  1. That deep learning isn’t a major AI advance.
  2. That deep learning won’t substantially change the world in the next 20 years (through narrow AI).
  3. That I’m certain that AGI isn’t coming in the next 20 years.
  4. That AGI isn’t existentially important on long timescales.
  5. That it isn’t possible that some AI researchers have asymmetric information indicating that AGI is coming in the next 20 years. (Unlikely, but possible)
  6. That people who have technical expertise shouldn’t be evaluating technical arguments on their merits.
  7. That most of what’s going on is people consciously lying. (Rather, covert deception hidden from conscious attention (e.g. motivated reasoning) is pervasive; see The Elephant in the Brain)
  8. That many people aren’t sincerely confused on the issue.

I’m saying that there are systematic sociopolitical phenomena that cause distortions in AI estimates, especially towards shorter timelines. I’m saying that people are being duped into believing a lie. And at the point where 73% of tech executives say they believe AGI will be developed in the next 10 years, it’s a major one.

This has happened before. And, in all likelihood, this will happen again.

Self-consciousness wants to make everything about itself

Here’s a pattern that shows up again and again in discourse:

A: This thing that’s happening is bad.

B: Are you saying I’m a bad person for participating in this? How mean of you! I’m not a bad person, I’ve done X, Y, and Z!

It isn’t always this explicit; I’ll discuss more concrete instances in order to clarify. The important thing to realize is that A is pointing at a concrete problem (and likely one that is concretely affecting them), and B is changing the subject to be about B’s own self-consciousness. Self-consciousness wants to make everything about itself; when some topic is being discussed that has implications related to people’s self-images, the conversation frequently gets redirected to be about these self-images, rather than the concrete issue. Thus, problems don’t get discussed or solved; everything is redirected to being about maintaining people’s self-images.

Tone arguments

A tone argument criticizes an argument not for being incorrect, but for having the wrong tone. Common phrases used in tone arguments are: “More people would listen to you if…”, “you should try being more polite”, etc.

It’s clear why tone arguments are epistemically invalid. If someone says X, then X’s truth value is independent of their tone, so talking about their tone is changing the subject. (Now, if someone is saying X in a way that breaks epistemic discourse norms, then defending such norms is epistemically sensible; however, tone arguments aren’t about epistemic norms, they’re about people’s feelings).

Tone arguments are about people protecting their self-images when they or a group they are part of (or a person/group they sympathize with) is criticized. When a tone argument is made, the conversation is no longer about the original topic, it’s about how talking about the topic in certain ways makes people feel ashamed/guilty. Tone arguments are a key way self-consciousness makes everything about itself.

Tone arguments are practically always in bad faith. They aren’t made by people trying to help an idea be transmitted to and internalized by more others. They’re made by people who want their self-images to be protected. Protecting one’s self-image from the truth, by re-directing attention away from the epistemic object level, is acting in bad faith.

Self-consciousness in social justice

A documented phenomenon in social justice is “white women’s tears”. Here’s a case study (emphasis mine):

A group of student affairs professionals were in a meeting to discuss retention and wellness issues pertaining to a specific racial community on our campus. As the dialogue progressed, Anita, a woman of color, raised a concern about the lack of support and commitment to this community from Office X (including lack of measurable diversity training, representation of the community in question within the staff of Office X, etc.), which caused Susan from Office X, a White woman, to feel uncomfortable. Although Anita reassured Susan that her comments were not directed at her personally, Susan began to cry while responding that she “felt attacked”. Susan further added that: she donated her time and efforts to this community, and even served on a local non-profit organization board that worked with this community; she understood discrimination because her family had people of different backgrounds and her closest friends were members of this community; she was committed to diversity as she did diversity training within her office; and the office did not have enough funding for this community’s needs at that time.

Upon seeing this reaction, Anita was confused because although her tone of voice had been firm, she was not angry. From Anita’s perspective, the group had come together to address how the student community’s needs could be met, which partially meant pointing out current gaps where increased services were necessary. Anita was very clear that she was critiquing Susan’s office and not Susan, as Susan could not possibly be solely responsible for the decisions of her office.

The conversation of the group shifted at the point when Susan started to cry. From that moment, the group did not discuss the actual issue of the student community. Rather, they spent the duration of the meeting consoling Susan, reassuring her that she was not at fault. Susan calmed down, and publicly thanked Anita for her willingness to be direct, and complimented her passion. Later that day, Anita was reprimanded for her ‘angry tone,’ as she discovered that Susan complained about her “behavior” to both her own supervisor as well as Anita’s supervisor. Anita was left confused by the mixed messages she received with Susan’s compliment, and Susan’s subsequent complaint regarding her.

The key relevance of this case study is that, while the conversation was originally about the issue of student community needs, it became about Susan’s self-image. Susan made everything about her own self-image, ensuring that the actual concrete issue (that her office was not supporting the racial community) was not discussed or solved.

Shooting the messenger

In addition to crying, Susan also shot the messenger, by complaining about Anita to both her and Anita’s supervisors. This makes sense as ego-protective behavior: if she wants to maintain a certain self-image, she wants to discourage being presented with information that challenges it, and also wants to “one-up” the person who challenged her self-image, by harming that person’s image (so Anita does not end up looking better than Susan does).

Shooting the messenger is an ancient tactic, deployed especially by powerful people to silence providers of information that challenges their self-image. Shooting the messenger is asking to be lied to, using force. Obviously, if the powerful person actually wants information, this tactic is counterproductive, hence the standard advice to not shoot the messenger.

Self-consciousness as privilege defense

It’s notable that, in the cases discussed so far, self-consciousness is more often a behavior of the privileged and powerful, rather than the disprivileged and powerless. This, of course, isn’t a hard-and-fast rule, but there certainly seems to be a relation. Why is that?

Part of this is that the less-privileged often can’t get away with redirecting conversations by making everything about their self-image. People’s sympathies are more often with the privileged.

Another aspect is that privilege is largely about being rewarded for one’s identity, rather than one’s works. If you have no privilege, you have to actually do something concretely effective to be rewarded, like cleaning. Whereas, privileged people, almost by definition, get rewarded “for no reason” other than their identity.

Maintenance of a self-image makes less sense as an individual behavior than as a collective behavior. The phenomenon of bullshit jobs implies that much of the “economy” is performative, rather than about value-creation. While almost everyone can pretend to work, some people are better at it than others. The best people at such pretending are those who look the part, and who maintain the act. That is: privileged people who maintain their self-images, and who tie their self-images to their collective, as Susan did. (And, to the extent that e.g. school “prepares people for real workplaces”, it trains such behavior.)

Redirection away from the object level isn’t merely about defending self-image; it has the effect of causing issues not to be discussed, and problems not to be solved. Such effects maintain the local power system. And so, power systems encourage people to tie their self-images to the power system, resulting in self-consciousness acting as a defense of the power system.

Note that, while less-privileged people do often respond negatively to criticism from more-privileged people, such responses are more likely to be based in fear/anger rather than guilt/shame.

Stop trying to be a good person

At the root of this issue is the desire to maintain a narrative of being a “good person”. Susan responded to the criticism of her office by listing out reasons why she was a “good person” who was against racial discrimination.

While Anita wasn’t actually accusing Susan of racist behavior, it is, empirically, likely that some of Susan’s behavior is racist, as implicit racism is pervasive (and, indeed, Susan silenced a woman of color speaking on race). Susan’s implicit belief is that there is such a thing as “not being racist”, and that one gets there by passing some threshold of being nice to marginalized racial groups. But, since racism is a structural issue, it’s quite hard to actually stop participating in racism, without going and living in the woods somewhere. In societies with structural racism, ethical behavior requires skillfully and consciously reducing harm given the fact that one is a participant in racism, rather than washing one’s hands of the problem.

What if it isn’t actually possible to be “not racist” or otherwise “a good person”, at least on short timescales? What if almost every person’s behavior is morally depraved a lot of the time (according to their standards of what behavior makes someone a “good person”)? What if there are bad things that are your fault? What would be the right thing to do, then?

Calvinism has a theological doctrine of total depravity, according to which every person is utterly unable to stop committing evil, to obey God, or to accept salvation when it is offered. While I am not a Calvinist, I appreciate this teaching, because quite a lot of human behavior is simultaneously unethical and hard to stop, and because accepting this can get people to stop chasing the ideal of being a “good person”.

If you accept that you are irredeemably evil (with respect to your current idea of a good person), then there is no use in feeling self-conscious or in blocking information coming to you that implies your behavior is harmful. The only thing left to do is to steer in the right direction: make things around you better instead of worse, based on your intrinsically motivating discernment of what is better/worse. Don’t try to be a good person, just try to make nicer things happen. And get more foresight, perspective, and cooperation as you go, so you can participate in steering bigger things on longer timescales using more information.

Paradoxically, in accepting that one is irredeemably evil, one can start accepting information and steering in the right direction, thus developing merit, and becoming a better person, though still not “good” in the original sense. (This, I know from personal experience)

(See also: What’s your type: Identity and its Discontents; Blame games; Bad intent is a disposition, not a feeling)

Writing children’s picture books

Here’s an exercise for explaining and refining your opinions about some domain, X:

Imagine writing a 10-20 page children’s picture book about topic X. Be fully honest and don’t hide things (assume the child can handle being told the truth, including being told non-standard or controversial facts).

Here’s a dialogue, meant to illustrate how this could work:

A: What do you think about global warming?

B: Uhh…. I don’t know, it seems real?

A: How would you write a 10-20 page children’s picture book about global warming?

B: Oh, I’d have a diagram showing carbon dioxide exiting factories and cars, floating up in the atmosphere, and staying there. Then I’d have a picture of sunlight coming through the atmosphere, bounding off the earth, then going back up, but getting blocked by the carbon dioxide, so it goes back to the earth and warms up the earth a second time. Oh, wait, if the carbon dioxide prevents the sunlight from bouncing from the earth to the sky, wouldn’t it also prevent the sunlight from entering the atmosphere in the first place? Oh, I should look that up later [NOTE: the answer is that CO2 blocks thermal radiation much more than it blocks sunlight].

Anyway, after that I’d have some diagrams showing global average temperature versus global CO2 level that show how the average temperature is tracking CO2 concentration, with some lag time. Then I’d have some quotes about scientists and information about the results of surveys. I’d show a graph showing how much the temperature would increase under different conditions… I think I’ve heard that, with substantial mitigation effort, the temperature difference might be 2 degrees Celsius from now until the end of the century [NOTE: it’s actually 2 degrees from pre-industrial times till the end of the century, which is about 1 degree from now]. And I’d want to show what 2 degrees Celsius means, in terms of, say, a fraction of the difference between winter and summer.

I’d also want to explain the issue of sea level rise, by showing a diagram of a glacier melting. Ice floats, so if the glacier is free-floating, then it melting doesn’t cause a sea level rise (there’s some scientific principle that says this, I don’t remember what it’s called), but if the glacier is on land, then when it melts, it causes the sea level to rise. I’d also want to show a map of the areas that would get flooded. I think some locations, like much of Florida, get flooded, so the map should show that, and there should also be a pie chart showing how much of the current population would end up underwater if they didn’t move (my current guess is that it’s between 1 percent and 10 percent, but I could be pretty wrong about this [NOTE: the answer is 30 to 80 million people, which is between about 0.4% and 1.1%]).
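
[NOTE: the principle B can’t recall here is Archimedes’ principle, and a one-line check (standard physics, not specific to this post) shows the point. A floating iceberg of mass $m_{\text{ice}}$ displaces a volume of water $m_{\text{ice}} / \rho_{\text{water}}$, and when it melts, the meltwater has (to a close approximation, ignoring the small freshwater/seawater density difference) that same volume:

$$V_{\text{displaced}} = \frac{m_{\text{ice}}}{\rho_{\text{water}}} = V_{\text{meltwater}},$$

so the meltwater just fills the space the ice was already displacing, and sea level is unchanged. Land ice isn’t displacing seawater before it melts, which is why it does raise sea level.]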

I’d also want to talk about possible mitigation efforts. Obviously, it’s possible to reduce energy consumption (and also meat consumption, because cows produce methane which is also a greenhouse gas). So I’d want to show a chart of which things produce the most greenhouse gases (I think airplane flights and beef are especially bad), and showing the relationship between possible reductions in that and the temperature change.

Also, trees take CO2 out of the atmosphere, so preserving forests is a way to prevent global warming. I’m confused about where the CO2 goes, exactly, since there’s some cycle it goes through in the forest; does it end up underground? I’d have to look this up.

I’d also want to talk about the political issues, especially the disinformation in the space. There’s a dynamic where companies that pollute want to deny that man-made global warming is a real, serious problem, so there won’t be regulations. So, they put out disinformation on television, and they lobby politicians. Sometimes, in the discourse, people go from saying that global warming isn’t real, to saying it’s real but not man-made, to saying it’s real and man-made but it’s too late to do anything about it. That’s a clear example of motivated cognition. I’d want to explain how this is trying to deny that any changes should be made, and speculate about why people might want to, such as because they don’t trust the process that causes changes (such as the government) to do the right thing.

And I’d also want to talk about geoengineering. There are a few proposals I know of. One is to put some kind of sulfur-related chemical in the atmosphere, to block out sunlight. This doesn’t solve ocean acidification, but it does reduce the temperature. But, it’s risky, because if you stop putting the chemical in the atmosphere, then that causes a huge temperature swing.

I also know it’s possible to put iron in the ocean, which causes a plankton bloom, which… does something to capture CO2 and store it in the bottom of the ocean? I’m really not sure how this works, I’d want to look it up before writing this section.

There’s also the proposal of growing and burning trees, and capturing and storing the carbon. When I looked this up before, I saw that this takes quite a lot of land, and anyway there’s a lot of labor involved, but maybe some of it can be automated.

There are also political issues with geoengineering. There are people who don’t trust the process of doing geoengineering to make things better instead of worse, because they expect that people’s attempts to reason about it will make lots of mistakes (or people will have motivated cognition and deceive themselves and each other), and then the resulting technical models will make things that don’t work. But, the geoengineering proposals don’t seem harder than things that humans have done in the past using technical knowledge, like rockets, so I don’t agree that this is such a big problem.

Furthermore, some people want to shut down discussion of geoengineering, because such discussion would make it harder to morally pressure people into reducing carbon emissions. I don’t know how to see this as anything other than an adversarial action against reasonable discourse, but I’m sure there is some motivation at play here. Perhaps it’s a motivation to have everyone come together as one, all helping together, in a hippie-ish way. I’m not sure if I’m right here, I’d want to read something written by one of these people before making any strong judgments.

Anyway, that’s how I’d write a picture book about global warming.


So, I just wrote that dialogue right now, without doing any additional research. It turns out that I do have quite a lot of opinions about global warming, and am also importantly uncertain in some places, some of which I just now became aware of. But I’m not likely to produce these opinions if asked “what do you think about global warming?”

Why does this technique work? I think it’s because, if asked for one’s opinions in front of an adult audience, it’s assumed that there is a background understanding of the issue, and you have to say something new, and what you decide to say says something about you. Whereas, if you’re explaining to a child, then you know they lack most of the background understanding, and so it’s obviously good to explain that.

With adults, there are things that everyone acts like “everyone knows”, and it can be annoying to restate them, since it’s kind of like talking down to the audience. Whereas, when explaining to children, the illusion or reality that “everyone knows” is broken.

The countervailing force is that people are tempted to lie to children. Of course, it’s necessary to not lie to children to do the exercise right, and also to raise or help raise children who don’t end up in an illusory world of confusion and dread. I would hope that someone who has tendencies to hide things from children would at least be able to notice and confront these tendencies in the process of imagining writing children’s picture books.

I think this technique can be turned into a generalized process for making world models. If someone wrote a new sketch of a children’s picture book (about a new topic) every day, and did the relevant research when they got stuck somewhere, wouldn’t they end up with a good understanding of both the world and of their own models of the world after a year? It’s also a great starting point from which to compare your opinions to others’ opinions, or to figure out how to explain things to either children or adults.

Anyway, I haven’t done this exercise for very many topics yet, but I plan on writing more of these.

Occamian conjecturalism: we posit structures of reality

Here’s my current explicit theory of ontology and meta-epistemology. I haven’t looked into the philosophical literature that much, but this view has similarities to both conjectural realism and to minimum description length.

I use “entity” to mean some piece of data in the mind, similar to an object in an object-oriented programming language. They’re the basic objects perception and models are made of.

Humans start with primitive entities, which include low-level physical percepts, and perhaps other things, though I’m not sure.

We posit entities to explain other entities, using Occam/probability rules; some entities are rules about how entities predict/explain other entities. Occam says to posit few entities to explain many. Probability says explanations may be stochastic (e.g. dogs are white with 30% probability). See minimum description length for more on how Occam and probability interact.
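
A standard way to make the Occam/probability tradeoff precise is the two-part code from minimum description length (restating the textbook MDL criterion here for concreteness, not adding anything beyond it): prefer the hypothesis $H$ that minimizes the length of describing $H$ plus the length of describing the data $D$ given $H$,

$$L(H) + L(D \mid H) = -\log_2 P(H) - \log_2 P(D \mid H).$$

Positing few entities and rules keeps $L(H)$ small; allowing explanations to be stochastic (“dogs are white with 30% probability”) is what keeps $L(D \mid H)$ finite and honest for noisy data.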

High-level percepts get posited to explain low-level percepts, e.g. a color splotch gets posited to explain all the individual colored points that are close to each other. A line gets posited to explain a bunch of individual colored points that are in, well, a line.

Persistent objects are posited (object permanence) to explain regularities in high-level percepts over spacetime. Object-types get posited to explain similarities between different objects.

Generalities (e.g. “that swans are white”) get posited to explain regularities between different objects. Generalities may be stochastic (coins turn up heads half the time when flipped). It’s hard to disentangle generalities from types themselves (is being white a generality about swans, or a defining feature?). Logical universals (such as modus ponens) are generalities.

Some generalities are causal relations, e.g. that striking a match causes a flame. Causal relations explain “future” events from “past” events, in a directed acyclic graph structure.
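
As a sketch of how a causal entity could be represented as a data structure (Python; the node names and probabilities are invented for illustration, not a claim about how minds store this), a causal relation is an edge in a directed acyclic graph, and each event’s probability depends only on its “past” parents:

```python
import random

# A tiny causal model as a DAG: each event's probability depends only on its
# parent ("past") events. Names and numbers are invented for illustration.
causal_graph = {
    "match_struck": {"parents": [], "prob": lambda parents: 0.1},
    "flame": {
        "parents": ["match_struck"],
        "prob": lambda parents: 0.9 if parents["match_struck"] else 0.01,
    },
    "smoke": {
        "parents": ["flame"],
        "prob": lambda parents: 0.8 if parents["flame"] else 0.0,
    },
}

def sample_world(graph):
    """Sample events in causal order: 'past' events explain 'future' ones."""
    world = {}
    for name, node in graph.items():  # insertion order is topological here
        parent_values = {p: world[p] for p in node["parents"]}
        world[name] = random.random() < node["prob"](parent_values)
    return world

print(sample_world(causal_graph))
```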

So far, the picture is egocentric, in that percepts are taken to be basic. If I adopt a percept-based ontology, I will believe that the world moves around me as I walk, rather than believing that I move through the world. Things are addressed in coordinates relative to my position, not relative to the ground. (This is easy to see if you pay attention to your visual field while walking around)

Whence objectivity? As I walk, most of those things around me “don’t move” if I posit that the ground is stable, as they have the same velocity as the ground. So by positing the ground is still while I move, I posit fewer motions. While I could in theory continue using an egocentric reference frame and posit laws of motion to explain why the world moves around me, this ends up more complicated and epicyclical than simply positing that the ground is still while I move. Objectivity-in-general is a result of these shifts in reference frame, where things are addressed relative to some common ground rather than egocentrically.

Objectivity implies theory of mind, in that I take my mental phenomena to be “properties of me-the-person” rather than “the mental phenomena that are apparent”, as an egocentric reference frame would take them to be. I posit other minds like my own, which is a natural result of the generalization that human bodies are inhabited by minds. Empathy is the connection I effectively posit between my own mental phenomena and others’ through this generalization.

An ontology shift happens when we start positing different types of entities than we did previously. We may go from thinking in terms of color splotches to thinking in terms of objects, or from thinking in terms of chemical essences to thinking in terms of molecules. Each step is justified by the Occam/probability rules; the new ontology must make the overall structure simpler.

Language consists of words, which are themselves entities that explain lower-level percepts (phonemes, dots of ink on paper, etc). Children learning language find that these entities are correlated with the reality they have already posited. (This is clear in the naive case, where teachers simply use language to honestly describe reality, but correlation is still present when language use is dishonest). The combination of objectivity and language has the result of standardizing a subset of ontology between different speakers, though nonverbal ontology continues to exist.

Mathematical entities (e.g. numbers) are posited to explain regularities in entities, such as the regularity between “two things over here” and “two things over there”, and between linguistic entities such as the word “two” and the actual “two things over here”. Mathematical generalizations are posited to explain mathematical entities.

Fictional worlds are posited to explain fictional media. We, in some sense, assume that a fiction book is an actual description of some world. Unlike with nonfiction media, we don’t expect this world to be the same as the one we move through in everyday life; it isn’t the actual world. Reality is distinguished from fantasy by their differing correlational structures.

If everything but primitive entities is posited, in what sense are these things “ultimately real”? There is no notion of “ultimately real” outside the positing structure. We may distinguish reality from fantasy within the structure, as the previous paragraph indicates. We may also distinguish illusion from substance, as we expect substance but not illusion to generate concordant observations upon being viewed differently. We may distinguish persistent ontology (which stays the same as we get more data) from non-persistent ontology (which changes as we get more data). And we may distinguish primitive entities from posited ones. But, there doesn’t seem to be a notion of ultimate reality beyond these particular distinctions and ones like them. I think this is a feature, not a bug. However, it’s at least plausible that when I learn more, my ontology will stabilize to the point where I have a natural sense of ultimate reality.

What does it mean for propositions to be true or false? A proposition is some sentence (an entity) corresponding to a predicate on worlds; it is true if and only if the predicate is true of the world. For example, “snow is white” is true if and only if snow is white. This is basically a correspondence theory, where we may speak of correspondences between the (already-ontologized) territory and ontological representations of it.

But, what about ontological uncertainty? It’s hard to say whether an ontology, such as the ontology of objects, is “true” or “false”. We may speak of it “fitting the territory well” or “fitting the territory badly”, which is not the same thing as “true” or “false” in a propositional sense. If we expect our ontologies to shift in the future (and I expect mine to shift), then, from the perspective of our new ontology, our current ontology will be false, the way Newtonian mechanics is false. However, we don’t have access to this hypothetical future ontology yet, so we can’t use it to judge our current ontology as false; the judgment that the original ontology is false comes along with a new worldview, which we don’t have yet. What we can say is whether or not we expect our reasoning processes to produce ontology shifts when exposed to future data.

May non-falsifiable entities be posited? Yes, if they explain more than they posit. Absent the ability to gain more historical data, many historical events are non-falsifiable. Still, positing such an event explains the data (e.g. artifacts and records supposedly left at the site of the event) better than alternatives (e.g. positing that the records were produced by people who happened to share the same delusion). So, entities need not be falsifiable in general, although ones that are completely unrelated to any observational consequences will never be posited in the first place.

Is reality out there, or is it all in our heads? External objects are out there; they aren’t in your brain, or they would be damaging your brain tissue. Yet, our representations of such objects are in our heads. Objects differ from our representations of them; they’re in different places, are different sizes, and are shaped differently. When I speak of posited structures, I speak of representations, not the objects themselves, although our posited structures constitute our sense of all that is.

Reductionism and physicalism

But isn’t reality made of atoms (barring quantum mechanics), not objects? We posit atoms to explain objects and their features. Superficially, positing so many atoms violates Occamian principles, but this is not an issue in probabilistic epistemologies, where we may (implicitly) sum over many possible atomic configurations. The brain doesn’t actually do such a sum; in practice we rarely posit particular atoms, and instead posit generalities about atoms and their relation to other entities (such as chemical types). Objects still exist in our ontologies, and are explained by atoms. Atoms explain, but do not explain away, objects.

But couldn’t you get all the observations you’re using objects to explain using atoms? Perhaps an AI can do this, but a human can’t. Humans continue to posit objects upon learning about atoms. The ontology shift to believing in only-atoms would be computationally intractable.

But doesn’t that mean the ultimate reality is atoms, not objects? “Ultimate reality” is hard to define, as explained previously. Plausibly, I would believe in atoms and not believe in objects if I thought much faster than I actually do. This would make objects a non-persistent ontology, as opposed to the more-persistent atomic ontology. However, this counterfactual is strange, as it assumes my brain is larger than the rest of the universe. Even then, I would be unable to model my brain as atomic. So it seems that, as an epistemic fact, atoms aren’t all there are; I would never shift to an atom-only ontology, no matter how big my brain was.

But isn’t this confusing the territory and the best map of the territory? As explained previously, our representations are not the territory. Our sense of the territory itself (not just of our map of it) contains objects, or, to drop the quotation, the territory itself contains objects. (Why drop the quotation? I’m describing my sense of the territory to you; there is nothing else I could proximately describe, other than my sense of the territory; in reaching for the territory itself, I proximately find my sense of it)

This discussion is going towards the idea of supervenience, which is that high-level phenomena (such as objects) are entirely determined by low-level phenomena (such as atoms). Supervenience is a generality that relates high-level phenomena to low-level ones. Importantly, supervenience is non-referential (and thus vacuous) if there are no high-level phenomena.

If everything supervenes on atoms, then there are high-level phenomena (such as objects), not just atoms. Positing supervenience yields all the effective predictions that physicalism could yield (in our actual brains, not in theoretical super-AIs). Supervenience may imply physicalism, depending on the definition of physicalism, but it doesn’t imply that atoms are the only entities.

Supervenience leaves open a degree of freedom, namely, the function mapping low-level phenomena to high-level phenomena. In the case of consciousness as the high-level phenomenon, this function will, among other things, resolve indexical/anthropic uncertainty (which person are the experiences I see happening to?) and uncertainty about the hard problem of consciousness (which physical structures are conscious, and of what?).

Doesn’t this imply that p-zombies are conceivable? We may distinguish “broad” notions of conceivability, under which just about any posited structure is conceivable (and under which p-zombies are conceivable), and “narrow” notions, where the structure must satisfy certain generalities, such as logic and symmetry. Adding p-zombies to the posited structure might break important general relations we expect will hold, such as logic, symmetry of function from physical structure to mental structure, or realization-independence. I’m not going to resolve the zombie argument in this particular post, but will conclude that it is at least not clear that zombies are conceivable in the narrow sense.

Conclusion

This is my current best simple, coherent view of ontology and meta-epistemology. If I were to give it a name, it would be “Occamian conjecturalism”, but it’s possible it has already been named. I’m interested in criticism of this view, or other thoughts on it.

Conditional revealed preference

There’s a pretty common analysis of human behavior that goes something like this:

“People claim that they want X. However, their actions are optimizing towards Y instead of X. If they really cared about X, they would do something else instead. Therefore, they actually want Y, and not X.”

This is revealed preference analysis. It’s quite useful, in that if people’s actions are effectively optimizing for Y and not X, then an agent-based model of the system will produce better predictions by predicting that people want Y and not X.

So, revealed preference analysis is great for analyzing a multi-agent system in equilibrium. However, it often has trouble predicting what would happen when a major change happens to the system.

As an example, consider a conclusion Robin Hanson gives on school:

School isn’t about learning “material,” school is about learning to accept workplace domination and ranking, and tolerating long hours of doing boring stuff exactly when and how you are told.

(note that I don’t think Hanson is claiming things about what people “really want” in this particular post, although he does make such claims in other writing)

Hanson correctly infers from the fact that most schools are highly authoritarian that school is effectively “about” learning to accept authoritarian work environments. We could make “about” more specific: the agents who determine what happens in schools (administrators, teachers, voters, parents, politicians, government employees) are currently taking actions that cause schools to be authoritarian, in a coordinated fashion, with few people visibly resisting this optimization.

This revealed preference analysis is highly useful. However, it leaves degrees of freedom open in what the agents terminally want. These degrees of freedom matter when predicting how those agents will act under different circumstances (their conditional revealed preferences). For example:

  • Perhaps many of the relevant agents actually do want schools to help children learn, but were lied to about what forms of school are effective for learning. This would predict that, upon receiving credible evidence that free schools are more effective for learning while being less authoritarian, they would support free schools instead.
  • Perhaps many of the relevant agents want school to be about learning, but find themselves in a grim trigger equilibrium where they expect to get punished for speaking out about the actual nature of school, and also to be punished for not punishing those who speak out. This would predict that, upon seeing enough examples of people speaking out and not being punished, they would join the new movement.
  • Perhaps many of the relevant agents have very poor world models of their own, and must therefore navigate according to imitation and to “official reality” narratives, which constrain them to acting as if school is for learning. This would predict that, upon gaining much more information about the world and gaining experience in navigating it according to their models (rather than the official narratives), they would favor free schools over authoritarian schools.

It’s hard to tell which of these hypotheses (or other hypotheses) are true given only information about how people act in the current equilibrium. These hypotheses make conditional and counterfactual predictions: they predict what people would do, given different circumstances than their current ones.
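
To make the point concrete, here is a toy sketch (with made-up value and belief numbers) of two hypothetical agents whose revealed preferences are identical in the current equilibrium but diverge under a counterfactual change in information:

```python
def choose_school(values, beliefs):
    """Pick the school type whose believed features best serve what the agent wants."""
    return max(beliefs, key=lambda school: sum(
        values[goal] * beliefs[school][goal] for goal in values))

# Two hypothetical terminal-value profiles.
wants_learning = {"learning": 1.0, "obedience": 0.0}
wants_obedience = {"learning": 0.0, "obedience": 1.0}

# Current (mis)information: authoritarian schools are believed slightly better
# for learning, and much better for obedience training.
status_quo_beliefs = {
    "authoritarian": {"learning": 0.6, "obedience": 0.9},
    "free":          {"learning": 0.5, "obedience": 0.1},
}

# After credible evidence that free schools are better for learning.
informed_beliefs = {
    "authoritarian": {"learning": 0.4, "obedience": 0.9},
    "free":          {"learning": 0.8, "obedience": 0.1},
}

for label, values in [("wants learning", wants_learning),
                      ("wants obedience", wants_obedience)]:
    print(label,
          "| status quo:", choose_school(values, status_quo_beliefs),
          "| informed:", choose_school(values, informed_beliefs))
# Both agents choose "authoritarian" in the status quo (identical revealed
# preference); only the counterfactual (informed) condition distinguishes them.
```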

This is not to say that people’s stories about what they want are to be taken at face value; the gold standard for determining what people want is not what they say, but what they actually optimize for under various circumstances, including ones substantially different from present ones. (Obviously, their words can be evidence about their counterfactual actions, to the extent that they are imaginative and honest about the counterfactual scenarios)

To conclude, I suggest the following heuristics:

  • In analyzing an equilibrium, look mainly at what people actually optimize for with their actions, not what they say they’re optimizing for.
  • In guessing what they “really want”, additionally imagine their actions in alternative scenarios where they e.g. have more information and more ability to coordinate with those who have similar opinions.
  • Actually find data about these alternative scenarios, by e.g. actually informing people, or finding people who were informed and seeing how their actions changed.

Boundaries enable positive material-informational feedback loops

(Also posted on LessWrong)

[epistemic status: obvious once considered, I think]

If you want to get big things done, you almost certainly need positive feedback loops. Unless you can already do all the necessary things, you need to do/make things that allow you to do/make more things in the future. This dynamic can be found in RPG and economy-management games, and in some actual economic systems, such as industrializing economies.

Material, information, and economy

Some goods that can be used in a positive feedback loop, such as software and inventions, are informational. Once produced, they can be used indefinitely in the future. In economic terms, they are nonrivalrous.

Other goods are material, such as manufactured goods and energy. They can’t be copied cheaply. In economic terms, they are rivalrous.

In practice, any long-lasting positive feedback loop contains both informational and material goods, as production of information requires a physical substrate. While ensuring that informational goods can be used in the future is an organization and communication problem (a subject beyond the scope of this post), the problem of ensuring that material goods can be used in the future is additionally a security problem.

An important question to ask is: why haven’t material-informational positive feedback loops already taken over the world? Why don’t we have so much stuff by now that providing for people’s material needs (such as food and housing) is trivial?

To some extent, material-informational positive feedback loops have taken over the world, but they seem much slower than one would naively expect. See cost disease. As an example of cost disease, the average cost of a new house in the USA has quadrupled over a 60-year period (adjusted for inflation!), whereas models of capitalism based on economy-management games such as Factorio (or, more academically, on the labor/capital economic models of classical economists such as David Ricardo) would suggest that houses would be plentiful by now. (And no, this isn’t just because of land prices; it costs about $300K to build a house in the US as of 2018.)

Security and boundaries

I’ve already kind of answered this question by saying that ensuring that material goods can be used in the future is a security problem. If you use one of your material goods to produce another material good, and someone takes this new good, then you can’t put this good back into your production process. Thus, what would have been a positive feedback loop is instead a negative feedback loop, as it leaks goods faster than it produces them.
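
A minimal sketch of this dynamic, with made-up rates: whether the loop is net positive or net negative feedback depends on whether production outpaces leakage across the boundary.

```python
def simulate(initial_goods, production_rate, leakage_rate, steps=10):
    """Each step, the stock of goods produces more goods at production_rate,
    and a fraction leakage_rate of the stock is lost across the boundary."""
    goods = initial_goods
    for _ in range(steps):
        goods += goods * production_rate  # positive feedback from production
        goods -= goods * leakage_rate     # leakage across the boundary
    return goods

# Made-up rates for illustration.
print(simulate(100, production_rate=0.10, leakage_rate=0.05))  # grows: net positive feedback
print(simulate(100, production_rate=0.10, leakage_rate=0.15))  # shrinks: net negative feedback
```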

Solving security issues generally requires boundaries. You need to draw a boundary in material space somewhere, differentiating the inside from the outside, such that material goods (such as energy) on the inside don’t leak out, and can potentially have positive feedback loops. There are many ways to prevent leaks across a boundary while still allowing informational and material goods to pass through sometimes, such as semiporous physical barriers and active policing. Regardless of the method used to enforce the boundary, the boundary has to exist in some geometrical sense for it to make sense to say that e.g. energy increases within this system.

Not all security issues are from other agents; some are from non-agentic processes. Consider a homeostatic animal. If the animal expends energy to warm its body, and this warmth escapes, the animal will fail to realize gains from the energy expenditure. Thus, the animal has a boundary (namely, skin) to solve this “security problem”. The cold air particles that take away heat from the animal are analogous to agents that directly take resources, though obviously less agentic. While perhaps my usage of the word “security” to include responses to nonagentic threats is nonstandard, I hope it is clear that these are on the same spectrum as agentic threats, and can be dealt with in some of the same ways.

It is also worth thinking about semi-agentic entities, such as microorganisms. One of the biggest threats to a food store is microorganisms (i.e. rotting), and slowing the negative feedback loops depleting food stores requires solving this security problem using a boundary (such as a sealed container or a subset of the air that is colder than the outside air, such as in a refrigerator).

Property rights are a simple example of boundaries. Certain goods are considered to be “owned” by different parties, such that there is common agreement about who owns what, and people are for one reason or another not motivated to take other people’s stuff. Such division of goods into sets owned by different parties is a set of boundaries enabling positive feedback loops, which are especially salient in capitalism.

What about trust between different entities? A complex ecosystem will contain entities satisfying a variety of niches, which include parasitism and predation (which are on the same spectrum). A trust network can be thought of as a way for different entities to draw various boundaries, often fuzzy ones, that mostly exclude parasites/predators, such that there are few leaks from inside this boundary to outside this boundary (which would include parasitism/predation by entities outside the boundary). There are “those who you trust” and “those who you don’t trust” (both fuzzy sets), and you assign more utility to giving resources to those you trust, as this allows for positive feedback loops within a system that contains you (namely, the trust network).

Externalities and sustainability

Since no subsystem of the world is causally closed, all positive feedback loops have externalities. By definition, the outside world is only directly affected by these externalities, and is only affected by what happens within the boundary to the extent that this eventually leads to externalities. A wise designer of a positive feedback loop will anticipate its externalities, and set it up such that the externalities are overall desirable to the designer. After all, there is no point to creating a positive feedback loop unless its externalities are mostly positive.

A positive feedback loop’s externalities modify its environment, affecting its own ability to continue; for example, a positive feedback loop of microorganisms eating food will exhaust itself by consuming the food. So, different positive feedback loops are environmentally sustainable to different extents. Both production and conquest generate positive feedback loops, as Ben Hoffman discusses in this post, but production is much more environmentally sustainable than conquest.

One way to increase environmental sustainability is to move more processes to the inside of the boundary. For example, a country that is consuming large amounts of iron (driving up iron prices) may consider setting up its own iron mines. Thus, the inside of the boundary becomes more like an economy of its own. This is sometimes known as import replacement.

Of course, the environmental sustainability of a positive feedback loop can also be a negative, as it is better for some processes (such as rotting) to limit or exhaust themselves, thus transitioning to negative feedback or a combination of positive and negative feedback. Processes that include intentionally-designed positive and negative feedback can be much more environmentally sustainable than processes that only have positive feedback loops designed in, since they can limit their growth when such growth would be unsustainable.

While in theory the philosophy of effective altruism (EA) would imply a strong (and likely overwhelming) emphasis on creating and maintaining environmentally sustainable positive feedback loops with positive externalities, typically-recommended EA practices (such as giving away 10% of one’s income) are negative feedback loops (the more you make, the more you give away). While in theory the place the resources are given to could have a faster positive feedback loop than just investing in yourself, your friends, and your projects, in practice I rarely believe claims of this form that come from the EA movement; for example, if a country has a high rate of poverty, that indicates that the negative feedback loops (such as corruption) are likely stronger than the positive ones, and that giving resources is ineffective. Thus, I cannot in good conscience allow anything like current EA ideology to substantially control resource allocation in most systems I create, even though EA philosophy taken to its logical conclusion would get the right answer on the importance of securing the boundaries of positive feedback loops.

Policy suggestions

How do these ideas translate to action? One suggestion is that, if you are trying to do something big, you use one or more positive feedback loops, and ask yourself the following questions about each one:

  1. What’s the generator of my positive feedback loop (i.e. what’s the process that turns stuff into more stuff)?
  2. What is the boundary within which the positive feedback increases resources?
  3. How am I reducing leakage across this boundary?
  4. What are the externalities of this positive feedback loop?
  5. How environmentally sustainable is this positive feedback loop?
  6. Are there built-in negative feedback loops that increase environmental sustainability?

(thanks to Bryce Hidysmith for a conversation that led to this post)

Act of Charity

(Also posted on LessWrong)

The stories and information posted here are artistic works of fiction and falsehood. Only a fool would take anything posted here as fact.

—Anonymous

Act I.

Carl walked through the downtown. He came across a charity stall. The charity worker at the stall called out, “Food for the Africans. Helps with local autonomy and environmental sustainability. Have a heart and help them out.” Carl glanced at the stall’s poster. Along with pictures of emaciated children, it displayed infographics about how global warming would cause problems for African communities’ food production, and numbers about how easy it is to help out with money. But something caught Carl’s eye. In the top left, in bold font, the poster read, “IT IS ALL AN ACT. ASK FOR DETAILS.”

Carl: “It’s all an act, huh? What do you mean?”

Worker: “All of it. This charity stall. The information on the poster. The charity itself. All the other charities like us. The whole Western idea of charity, really.”

Carl: “Care to clarify?”

Worker: “Sure. This poster contains some correct information. But a lot of it is presented in a misleading fashion, and a lot of it is just lies. We designed the poster this way because it fits with people’s idea of what a good charity they should give money to looks like. It’s a prop in the act.”

Carl: “Wait, the stuff about global warming and food production is a lie?”

Worker: “No, that part is actually true. But in context we’re presenting it as some kind of imminent crisis that requires an immediate infusion of resources, when really it’s a very long-term problem that will require gradual adjustment of agricultural techniques, locations, and policies.”

Carl: “Okay, that doesn’t actually sound like more of a lie than most charities tell.”

Worker: “Exactly! It’s all an act.”

Carl: “So why don’t you tell the truth anyway?”

Worker: “Like I said before, we’re trying to fit with people’s idea of what a charity they should give money to looks like. More to the point, we want them to feel compelled to give us money. And they are compelled by some acts, but not by others. The idea of an immediate food crisis creates more moral and social pressure towards immediate action than the idea that there will be long-term agricultural problems that require adjustments.”

Carl: “That sounds…kind of scammy?”

Worker: “Yes, you’re starting to get it! The act is about violence! It’s all violence!”

Carl: “Now hold on, that seems like a false equivalence. Even if they were scammed by you, they still gave you money of their own free will.”

Worker: “Most people, at some level, know we’re lying to them. Their eyes glaze over ‘IT IS ALL AN ACT’ as if it were just a regulatory requirement to put this on charity posters. So why would they give money to a charity that lies to them? Why do you think?”

Carl: “I’m not nearly as sure as you that they know this! Anyway, even if they know at some level it’s a lie, that doesn’t mean they consciously know, so to their conscious mind it seems like being completely heartless.”

Worker: “Exactly, it’s emotional blackmail. I even say ‘Have a heart and help them out’. So if they don’t give us money, there’s a really convenient story that says they’re heartless, and a lot of them will even start thinking about themselves that way. Having that story told about them opens them up to violence.”

Carl: “How?”

Worker: “Remember Martin Shkreli?”

Carl: “Yeah, that asshole who jacked up the Daraprim prices.”

Worker: “Right. He ended up going to prison. Nominally, it was for securities fraud. But it’s not actually clear that whatever security fraud he did was worse than what others in his industry were doing. Rather, it seems likely that he was especially targeted because he was a heartless asshole.”

Carl: “But he still broke the law!”

Worker: “How long would you be in jail if you got punished for every time you had broken the law?”

Carl: “Well, I’ve done a few different types of illegal drugs, so… a lot of years.”

Worker: “Exactly. Almost everyone is breaking the law. So it’s really, really easy for the law to be enforced selectively, to punish just about anyone. And the people who get punished the most are those who are villains in the act.”

Carl: “Hold on. I don’t think someone would actually get sent to prison because they didn’t give you money.”

Worker: “Yeah, that’s pretty unlikely. But things like it will happen. People are more likely to give if they’re walking with other people. I infer that they believe they will be abandoned if they do not give.”

Carl: “That’s a far cry from violence.”

Worker: “Think about the context. When you were a baby, you relied on your parents to provide for you, and abandonment by them would have meant certain death. In the environment of evolutionary adaptation, being abandoned by your band would have been close to a death sentence. This isn’t true in the modern world, but people’s brains mostly don’t really distinguish abandonment from violence, and we exploit that.”

Carl: “That makes some sense. I still object to calling it violence, if only because we need a consistent definition of ‘violence’ to coordinate, well, violence against those that are violent. Anyway, I get that this poster is an act, and the things you say to people walking down the street are an act, but what about the charity itself? Do you actually do the things you say you do?”

Worker: “Well, kind of. We actually do give these people cows and stuff, like the poster says. But that isn’t our main focus, and the main reason we do it is, again, because of the act.”

Carl: “Because of the act? Don’t you care about these people?”

Worker: “Kind of. I mean, I do care about them, but I care about myself and my friends more; that’s just how humans work. And if it doesn’t cost me much, I will help them. But I won’t help them if it puts our charity in a significantly worse position.”

Carl: “So you’re the heartless one.”

Worker: “Yes, and so is everyone else. Because the standard you’ve set for ‘not heartless’ is not one that any human actually achieves. They just deceive themselves about how much they care about random strangers; the part of their brain that inserts these self-deceptions into their conscious narratives is definitely not especially altruistic!”

Carl: “According to your own poster, there’s going to be famine, though! Is the famine all an act to you?”

Worker: “No! Famine isn’t an act, but most of our activities in relation to it are. We give people cows because that’s one of the standard things charities like ours are supposed to do, and it looks like we’re giving these people local autonomy and stuff.”

Carl: “Looks like? So this is all just optics?”

Worker: “Yes! Exactly!”

Carl: “I’m actually really angry right now. You are a terrible person, and your charity is terrible, and you should die in a fire.”

Worker: “Hey, let’s actually think through this ethical question together. There’s a charity pretty similar to ours that’s set up a stall a couple blocks from here. Have you seen it?”

Carl: “Yes. They do something with water filtering in Africa.”

Worker: “Well, do you think their poster is more or less accurate than ours?”

Carl: “Well, I know yours is a lie, so…”

Worker: “Hold on. This is Gell-Mann amnesia. You know ours is a lie because I told you. This should adjust your model of how charities work in general.”

Carl: “Well, it’s still plausible that they are effective, so I can’t condemn—”

Worker: “Stop. In talking of plausibility rather than probability, you are uncritically participating in the act. You are taking symbols at face value, unless there is clear disproof of them. So you will act like you believe any claim that’s ‘plausible’, in other words one that can’t be disproven from within the act. You have never, at any point, checked whether either charity is doing anything in the actual, material world.”

Carl: “…I suppose so. What’s your point, anyway?”

Worker: “You’re shooting the messenger. All or nearly all of these charities are scams. Believe me, we’ve spent time visiting these other organizations, and they’re universally fraudulent, they just have less self-awareness about it. You’re only morally outraged at the ones that don’t hide it. So your moral outrage optimizes against your own information. By being morally outraged at us, you are asking to be lied to.”

Carl: “Way to blame the victim. You’re the one lying.”

Worker: “We’re part of the same ecosystem. By rewarding a behavior, you cause more of it. By punishing it, you cause less of it. You reward lies that have plausible deniability and punish truth, when that truth is told by sinners. You’re actively encouraging more of the thing that is destroying your own information!”

Carl: “It still seems pretty strange to think that they’re all scams. Like, some of my classmates from college went into the charity sector. And giving cows to people who have food problems actually seems pretty reasonable.”

Worker: “It’s well known to development economists that aid generally creates dependence: by giving cows to people, we disrupt their local economy’s cow market, reducing the incentive to raise cattle. In theory it could still be worth it, but our preliminary calculations indicate that it probably isn’t.”

Carl: “Hold on. You actually ran the calculation, found that your intervention was net harmful, and then kept doing it?”

Worker: “Yes. Again, it is all—”

Carl: “What the fuck, seriously? You’re a terrible person.”

Worker: “Do you think any charity other than us would have run the calculation we did, and then actually believe the result? Or would they have fudged the numbers here and there, and when even a calculation with fudged numbers indicated that the intervention was ineffective, come up with a reason to discredit this calculation and replace it with a different one that got the result they wanted?”

Carl: “Maybe a few… but I see your point. But there’s a big difference between acting immorally because you deceived yourself, and acting immorally with a clear picture of what you’re doing.”

Worker: “Yes, the second one is much less bad!”

Carl: “What?”

Worker: “All else being equal, it’s better to have clearer beliefs than muddier ones, right?”

Carl: “Yes. But in this case, it’s very clear that the person with the clear picture is acting immorally, while the self-deceiver, uhh..”

Worker: “…has plausible deniability. Their stories are plausible even though they are false, so they have more privilege within the act. They gain privilege by muddying the waters, or in other words, destroying information.”

Carl: “Wait, are you saying self-deception is a choice?”

Worker: “Yes! It’s called ‘motivated cognition’ for a reason. Your brain runs something like a utility-maximization algorithm to tell when and how you should deceive yourself. It’s epistemically correct to take the intentional stance towards this process.”

Carl: “But I don’t have any control over this process!”

Worker: “Not consciously, no. But you can notice the situation you’re in, think about what pressures there are on you to self-deceive, and think about modifying your situation to reduce these pressures. And you can do this to other people, too.”

Carl: “Are you saying everyone is morally obligated to do this?”

Worker: “No, but it might be in your interest, since it increases your capabilities.”

Carl: “Why don’t you just run a more effective charity, and advertise on that? Then you can outcompete the other charities.”

Worker: “That’s not fashionable anymore. The ‘effectiveness’ branding has been tried before; donors are tired of it by now. Perhaps this is partially because there aren’t functional systems that actually check which organizations are effective and which aren’t, so scam charities branding themselves as effective end up outcompeting the actually effective ones. And there are organizations claiming to evaluate charities’ effectiveness, but they’ve largely also become scams by now, for exactly the same reasons. The fashionable branding now is environmentalism.”

Carl: “This is completely disgusting. Fashion doesn’t help people. Your entire sector is morally depraved.”

Worker: “You are entirely correct to be disgusted. This moral depravity is a result of dysfunctional institutions. You can see it outside charity too; schools are authoritarian prisons that don’t even help students learn, courts put people in cages for not spending enough on a lawyer, the US military blows up civilians unnecessarily, and so on. But you already knew all that, and ranting about these things is itself a trope. It is difficult to talk about how broken the systems are without this talking itself being interpreted as merely a cynical act. That’s how deep this goes. Please actually update on this rather than having your eyes glaze over!”

Carl: “How do you even deal with this?”

Worker: “It’s already the reality you’ve lived in your whole life. The only adjustment is to realize it, and be able to talk about it, without this destroying your ability to participate in the act when it’s necessary to do so. Maybe functional information-processing institutions will be built someday, but we are stuck with this situation for now, and we’ll have no hope of building functional institutions if we don’t understand our current situation.”

Carl: “You are wasting so much potential! With your ability to see social reality, you could be doing all kinds of things! If everyone who were as insightful as you were as pathetically lazy as you, there would be no way out of this mess!”

Worker: “Yeah, you’re right about that, and I might do something more ambitious someday, but I don’t really want to right now. So here I am. Anyway… food for the Africans. Helps with local autonomy and environmental sustainability. Have a heart and help them out.”

Carl sighed, fished a 10 dollar bill from his wallet, and gave it to the charity worker.

Decision theory and zero-sum game theory, NP and PSPACE

(Also posted on LessWrong)

At a rough level:

  • Decision theory is about making decisions to maximize some objective function.
  • Zero-sum game theory is about making decisions to maximize some objective function while someone else is making decisions to minimize that same objective function.

These are quite different.

Decision theory and NP

Decision theory roughly corresponds to the NP complexity class.  Consider the following problem:

Given a set of items, each of which has an integer-valued value and weight, does there exist a subset with total weight less than w and total value at least v?

(It turns out that finding a solution is not much harder than determining whether there is a solution; if you know how to tell whether there is a solution to arbitrary problems of this form, you can in particular tell if there is a solution that uses any particular item.)

This is the knapsack problem, and it is in NP.  Given a candidate solution, it is easy to check whether it actually is a solution: you just count the values and the weights.  Since this solution would constitute a proof that the answer to the question is “yes”, and a solution exists whenever the answer is “yes”, this problem is in NP.
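
As a sketch of what “easy to check” means here, a verifier only needs to sum the weights and values of the candidate subset. The items and thresholds below are made up for illustration:

```python
def verify_knapsack(items, subset_indices, max_weight, min_value):
    """Check a candidate solution to the knapsack decision problem in time
    linear in the number of chosen items: just sum weights and values."""
    total_weight = sum(items[i][0] for i in subset_indices)
    total_value = sum(items[i][1] for i in subset_indices)
    return total_weight < max_weight and total_value >= min_value

# Hypothetical items as (weight, value) pairs.
items = [(3, 4), (5, 7), (2, 3), (4, 6)]
print(verify_knapsack(items, subset_indices=[0, 2, 3], max_weight=10, min_value=12))  # True
```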

The following is a general form for NP problems:

\exists x_1 \in \{0, 1\} \exists x_2 \in \{0, 1\} \ldots \exists x_k \in \{0, 1\} f(x_1, \ldots, x_k)

where f is a specification of a circuit (say, made of AND, OR, and NOT gates) that outputs a single Boolean value.  That is, the problem is to decide whether there is some assignment of values to x_1, \ldots, x_k that f outputs true on.  This is a variant of the Boolean satisfiability problem.
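
A brute-force sketch of this all-existential form, with a stand-in lambda for the circuit f: try every assignment and accept if any of them satisfies the circuit. The search is exponential in general, but a “yes” answer comes with a short witness that is easy to check.

```python
from itertools import product

def exists_satisfying(f, k):
    """Decide the all-existential form by trying every assignment of
    x_1, ..., x_k.  Exponential time, but any satisfying assignment
    is a short, easily checked witness."""
    return any(f(*bits) for bits in product([False, True], repeat=k))

# Stand-in circuit: true iff x1 AND (NOT x3).
print(exists_satisfying(lambda x1, x2, x3: x1 and not x3, k=3))  # True
```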

In decision theory (and in NP), all optimization is in the same direction.  The only quantifier is \exists.

Zero-sum game theory and PSPACE

Zero-sum game theory roughly corresponds to the PSPACE complexity class.  Consider the following problem:

Given a specification of a Reversi game state (on an arbitrarily-large square board), does there exist a policy for the light player that guarantees a win?

(It turns out that winning the game is not much harder than determining whether there is a winning policy; if you know how to tell whether there is a solution to arbitrary problems of this form, then in particular you can tell if dark can win given a starting move by light.)

This problem is in PSPACE: it can be solved by a Turing machine using a polynomial amount of space.  This Turing machine works through the minimax algorithm: it simulates all possible games in a backtracking fashion.

The following is a general form for PSPACE problems:

\exists x_1 \in \{0, 1\} \forall y_1 \in \{0, 1\} \ldots \exists x_k \in \{0, 1\} \forall y_k \in \{0, 1\} f(x_1, y_1, \ldots, x_k, y_k)

where f is a specification of a circuit (say, made of AND, OR, and NOT gates) that outputs a single Boolean value.  That is, the problem is to determine whether it is possible to set the x values interleaved with an opponent setting the y values such that, no matter how the opponent acts, f(x_1, y_1, \ldots, x_k, y_k) is true.  This is a variant of the quantified Boolean formula problem.  (Interpreting a logical formula containing \exists and \forall as a game is standard; see game semantics).
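
Here is a sketch of evaluating this alternating form as a game tree, minimax-style, again with a stand-in lambda for the circuit f; the recursion uses space proportional to the number of variables, matching the PSPACE picture.

```python
def qbf_value(f, k, assignment=()):
    """Evaluate the alternating exists/forall form by recursion over a game
    tree: the existential player needs SOME continuation to be true, the
    universal player needs EVERY continuation to be true.  This is minimax,
    using space proportional to the number of variables."""
    if len(assignment) == 2 * k:
        return f(*assignment)
    # exists-player moves at even depths, forall-player at odd depths
    quantifier = any if len(assignment) % 2 == 0 else all
    return quantifier(qbf_value(f, k, assignment + (bit,)) for bit in (False, True))

# Stand-in circuit with k = 1: exists x1 forall y1 . (x1 or not y1)?
print(qbf_value(lambda x1, y1: x1 or not y1, k=1))  # True: the exists-player picks x1 = True
```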

In zero-sum game theory, all optimization is in one of two completely opposite directions.  There is literally no difference between something that is good for one player and something that is bad for the other.  The opposing quantifiers \exists and \forall, representing decisions by the two opponents, are interleaved.

Different cognitive modes

The comparison to complexity classes suggests that there are two different cognitive modes for decision theory and zero-sum game theory, as there are two different types of algorithms for NP-like and PSPACE-like problems.

In decision theory, you plan with no regard to any opponents interfering with your plans, allowing you to plan on arbitrarily long time scales.  In zero-sum game theory, you plan on the assumption that your opponent will interfere with your plans (your \exists quantifiers are interleaved with your opponent’s \forall quantifiers), so you can only plan as far as your opponent lacks the ability to interfere with these plans.  You must have a short OODA loop, or your opponent’s interference will make your plans useless.

In decision theory, you can mostly run on naïve expected utility analysis: just do things that seem like they will work.  In zero-sum game theory, you must screen your plans for defensibility: they must be resistant to possible attacks.  Compare farming with border defense, mechanical engineering with computer security.

High-reliability engineering is an intermediate case: designs must be selected to work with high probability across a variety of conditions, but there is normally no intelligent optimization power working against the design.  One could think of nature as an “adversary” selecting some condition to test the design against, and represent this selection by a universal quantifier; however, this is qualitatively different from a true adversary, who applies intentional optimization to break a design rather than haphazardly selecting conditions.

Conclusion

These two types of problems do not cover all realistic situations an agent might face.  Decision problems involving agents with different but not completely opposed objective functions are different, as are zero-sum games with more than two players.  But realistic situations share some properties with each of these, and I suspect that there might actually be a discrete distinction between cognitive modes for NP-like decision theory problems and PSPACE-like zero-sum games.

What’s the upshot?  If you want to know what is going on, one of the most important questions (perhaps the most important question) is: what kind of game are you playing?  Is your situation more like a decision theory problem or a zero-sum game?  To what extent is optimization by different agents going in the same direction, opposing directions, or orthogonal directions?  What would have to change for the nature of the game to change?


Thanks to Michael Vassar for drawing my attention to the distinction between decision theory and zero-sum game theory as a distinction between two cognitive modes.

Related: The Face of the Ice