The AI Timelines Scam

[epistemic status: that’s just my opinion, man. I have highly suggestive evidence, not deductive proof, for a belief I sincerely hold]

“If you see fraud and do not say fraud, you are a fraud.” – Nassim Taleb

I was talking with a colleague the other day about an AI organization that claims:

  1. AGI is probably coming in the next 20 years.
  2. Many of the reasons we have for believing this are secret.
  3. They’re secret because if we told people about those reasons, they’d learn things that would let them make an AGI even sooner than they would otherwise.

His response was (paraphrasing): “Wow, that’s a really good lie! A lie that can’t be disproven.”

I found this response refreshing, because he immediately jumped to the most likely conclusion.

Near predictions generate more funding

Generally, entrepreneurs who are optimistic about their project get more funding than ones who aren’t. AI is no exception. For a recent example, see the Human Brain Project. The founder, Henry Markram, predicted in 2009 that the project would succeed in simulating a human brain by 2019, and the project was already widely considered a failure by 2013. (See his TED talk, at 14:22)

The Human Brain Project got 1.3 billion Euros of funding from the EU.

It’s not hard to see why. To justify receiving large amounts of money, the leader must claim that the project is actually worth that much. And AI projects are more impactful if it is, in fact, possible to develop AI soon. So there is economic pressure towards inflating estimates of the chance AI will be developed soon.

Fear of an AI gap

The missile gap was a lie by the US Air Force to justify building more nukes, by falsely claiming that the Soviet Union had more nukes than the US.

Similarly, there’s historical precedent for an AI gap lie used to justify more AI development. Fifth Generation Computer Systems was an ambitious 1982 project by the Japanese government (funded for $400 million in 1992, or $730 million in 2019 dollars) to create artificial intelligence through massively parallel logic programming.

The project is widely considered to have failed.  From a 1992 New York Times article:

A bold 10-year effort by Japan to seize the lead in computer technology is fizzling to a close, having failed to meet many of its ambitious goals or to produce technology that Japan’s computer industry wanted.

That attitude is a sharp contrast to the project’s inception, when it spread fear in the United States that the Japanese were going to leapfrog the American computer industry. In response, a group of American companies formed the Microelectronics and Computer Technology Corporation, a consortium in Austin, Tex., to cooperate on research. And the Defense Department, in part to meet the Japanese challenge, began a huge long-term program to develop intelligent systems, including tanks that could navigate on their own.

The Fifth Generation effort did not yield the breakthroughs to make machines truly intelligent, something that probably could never have realistically been expected anyway. Yet the project did succeed in developing prototype computers that can perform some reasoning functions at high speeds, in part by employing up to 1,000 processors in parallel. The project also developed basic software to control and program such computers. Experts here said that some of these achievements were technically impressive.

In his opening speech at the conference here, Kazuhiro Fuchi, the director of the Fifth Generation project, made an impassioned defense of his program.

“Ten years ago we faced criticism of being too reckless,” in setting too many ambitious goals, he said, adding, “Now we see criticism from inside and outside the country because we have failed to achieve such grand goals.”

Outsiders, he said, initially exaggerated the aims of the project, with the result that the program now seems to have fallen short of its goals.

Some American computer scientists say privately that some of their colleagues did perhaps overstate the scope and threat of the Fifth Generation project. Why? In order to coax more support from the United States Government for computer science research.

(emphasis mine)

This bears similarity to some conversations on AI risk I’ve been party to in the past few years. The fear is that Others (DeepMind, China, whoever) will develop AGI soon, so We have to develop AGI first in order to make sure it’s safe, because Others won’t make sure it’s safe and We will. Also, We have to discuss AGI strategy in private (and avoid public discussion), so Others don’t get the wrong ideas. (Generally, these claims have little empirical/rational backing to them; they’re based on scary stories, not historically validated threat models)

The claim that others will develop weapons and kill us with them by default implies a moral claim to resources, and a moral claim to be justified in making weapons in response. Such claims, if exaggerated, justify claiming more resources and making more weapons. And they weaken a community’s actual ability to track and respond to real threats (as in The Boy Who Cried Wolf).

How does the AI field treat its critics?

Hubert Dreyfus, probably the most famous historical AI critic, published “Alchemy and Artificial Intelligence” in 1965, which argued that the techniques popular at the time were insufficient for AGI. Subsequently, he was shunned by other AI researchers:

The paper “caused an uproar”, according to Pamela McCorduck.  The AI community’s response was derisive and personal.  Seymour Papert dismissed one third of the paper as “gossip” and claimed that every quotation was deliberately taken out of context.  Herbert A. Simon accused Dreyfus of playing “politics” so that he could attach the prestigious RAND name to his ideas. Simon said, “what I resent about this was the RAND name attached to that garbage.”

Dreyfus, who taught at MIT, remembers that his colleagues working in AI “dared not be seen having lunch with me.”  Joseph Weizenbaum, the author of ELIZA, felt his colleagues’ treatment of Dreyfus was unprofessional and childish.  Although he was an outspoken critic of Dreyfus’ positions, he recalls “I became the only member of the AI community to be seen eating lunch with Dreyfus. And I deliberately made it plain that theirs was not the way to treat a human being.”

This makes sense as anti-whistleblower activity: ostracizing, discrediting, or punishing people who break the conspiracy to the public. Does this still happen in the AI field today?

Gary Marcus is a more recent AI researcher and critic. In 2012, he wrote:

Deep learning is important work, with immediate practical applications.

Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like “sibling” or “identical to.” They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems … use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning.

In 2018, he tweeted an article in which Yoshua Bengio (a deep learning pioneer) seemed to agree with these previous opinions. This tweet received a number of mostly-critical replies. Here’s one, by AI professor Zachary Lipton:

There’s a couple problems with this whole line of attack. 1) Saying it louder ≠ saying it first. You can’t claim credit for differentiating between reasoning and pattern recognition. 2) Saying X doesn’t solve Y is pretty easy. But where are your concrete solutions for Y?

The first criticism is essentially a claim that everybody knows that deep learning can’t do reasoning. But, this is essentially admitting that Marcus is correct, while still criticizing him for saying it [ED NOTE: the phrasing of this sentence is off (Lipton publicly agrees with Marcus on this point), and there is more context, see Lipton’s reply].

The second is a claim that Marcus shouldn’t criticize if he doesn’t have a solution in hand. This policy deterministically results in the short AI timelines narrative being maintained: to criticize the current narrative, you must present your own solution, which constitutes another narrative for why AI might come soon.

Deep learning pioneer Yann LeCun’s response is similar:

Yoshua (and I, and others) have been saying this for a long time.
The difference with you is that we are actually trying to do something about it, not criticize people who don’t.

Again, the criticism is not that Marcus is wrong in saying deep learning can’t do certain forms of reasoning, the criticism is that he isn’t presenting an alternative solution. (Of course, the claim could be correct even if Marcus doesn’t have an alternative!)

Apparently, it’s considered bad practice in AI to criticize a proposal for making AGI without presenting an alternative solution. Clearly, such a policy causes large distortions!

Here’s another response, by Steven Hansen (a research scientist at DeepMind):

Ideally, you’d be saying this through NeurIPS submissions rather than New Yorker articles. A lot of the push-back you’re getting right now is due to the perception that you haven’t been using the appropriate channels to influence the field.

That is: to criticize the field, you should go through the field, not through the press. This is standard guild behavior. In the words of Adam Smith: “People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices.”

(Also see Marcus’s medium article on the Twitter thread, and on the limitations of deep learning)

[ED NOTE: I’m not saying these critics on Twitter are publicly promoting short AI timelines narratives (in fact, some are promoting the opposite), I’m saying that the norms by which they criticize Marcus result in short AI timelines narratives being maintained.]

Why model sociopolitical dynamics?

This post has focused on sociopolitical phenomena involved in the short AI timelines phenomenon. For this, I anticipate criticism along the lines of “why not just model the technical arguments, rather than the credibility of the people involved?” To which I pre-emptively reply:

  • No one can model the technical arguments in isolation. Basic facts, such as the accuracy of technical papers on AI, or the filtering processes determining what you read and what you don’t, depend on sociopolitical phenomena. This is far more true for people who don’t themselves have AI expertise.
  • “When AGI will be developed” isn’t just a technical question. It depends on what people actually choose to do (and what groups of people actually succeed in accomplishing), not just what can be done in theory. And so basic questions like “how good is the epistemology of the AI field about AI timelines?” matter directly.
  • The sociopolitical phenomena are actively making technical discussion harder. I’ve had a well-reputed person in the AI risk space discourage me from writing publicly about the technical arguments, on the basis that getting people to think through them might accelerate AI timelines (yes, really).

Which is not to say that modeling such technical arguments is not important for forecasting AGI. I certainly could have written a post evaluating such arguments, and I decided to write this post instead, in part because I don’t have much to say on this issue that Gary Marcus hasn’t already said. (Of course, I’d have written a substantially different post, or none at all, if I believed the technical arguments that AGI is likely to come soon had merit to them)

What I’m not saying

I’m not saying:

  1. That deep learning isn’t a major AI advance.
  2. That deep learning won’t substantially change the world in the next 20 years (through narrow AI).
  3. That I’m certain that AGI isn’t coming in the next 20 years.
  4. That AGI isn’t existentially important on long timescales.
  5. That it isn’t possible that some AI researchers have asymmetric information indicating that AGI is coming in the next 20 years. (Unlikely, but possible)
  6. That people who have technical expertise shouldn’t be evaluating technical arguments on their merits.
  7. That most of what’s going on is people consciously lying. (Rather, covert deception hidden from conscious attention (e.g. motivated reasoning) is pervasive; see The Elephant in the Brain)
  8. That many people aren’t sincerely confused on the issue.

I’m saying that there are systematic sociopolitical phenomena that cause distortions in AI estimates, especially towards shorter timelines. I’m saying that people are being duped into believing a lie. And at the point where 73% of tech executives say they believe AGI will be developed in the next 10 years, it’s a major one.

This has happened before. And, in all likelihood, this will happen again.

Self-consciousness wants to make everything about itself

Here’s a pattern that shows up again and again in discourse:

A: This thing that’s happening is bad.

B: Are you saying I’m a bad person for participating in this? How mean of you! I’m not a bad person, I’ve done X, Y, and Z!

It isn’t always this explicit; I’ll discuss more concrete instances in order to clarify. The important thing to realize is that A is pointing at a concrete problem (and likely one that is concretely affecting them), and B is changing the subject to be about B’s own self-consciousness. Self-consciousness wants to make everything about itself; when some topic is being discussed that has implications related to people’s self-images, the conversation frequently gets redirected to be about these self-images, rather than the concrete issue. Thus, problems don’t get discussed or solved; everything is redirected to being about maintaining people’s self-images.

Tone arguments

A tone argument criticizes an argument not for being incorrect, but for having the wrong tone. Common phrases used in tone arguments are: “More people would listen to you if…”, “you should try being more polite”, etc.

It’s clear why tone arguments are epistemically invalid. If someone says X, then X’s truth value is independent of their tone, so talking about their tone is changing the subject. (Now, if someone is saying X in a way that breaks epistemic discourse norms, then defending such norms is epistemically sensible; however, tone arguments aren’t about epistemic norms, they’re about people’s feelings).

Tone arguments are about people protecting their self-images when they or a group they are part of (or a person/group they sympathize with) is criticized. When a tone argument is made, the conversation is no longer about the original topic, it’s about how talking about the topic in certain ways makes people feel ashamed/guilty. Tone arguments are a key way self-consciousness makes everything about itself.

Tone arguments are practically always in bad faith. They aren’t made by people trying to help an idea be transmitted to and internalized by more others. They’re made by people who want their self-images to be protected. Protecting one’s self-image from the truth, by re-directing attention away from the epistemic object level, is acting in bad faith.

Self-consciousness in social justice

A documented phenomenon in social justice is “white women’s tears”. Here’s a case study (emphasis mine):

A group of student affairs professionals were in a meeting to discuss retention and wellness issues pertaining to a specific racial community on our campus. As the dialogue progressed, Anita, a woman of color, raised a concern about the lack of support and commitment to this community from Office X (including lack of measurable diversity training, representation of the community in question within the staff of Office X, etc.), which caused Susan from Office X, a White woman, to feel uncomfortable. Although Anita reassured Susan that her comments were not directed at her personally, Susan began to cry while responding that she “felt attacked”. Susan further added that: she donated her time and efforts to this community, and even served on a local non-profit organization board that worked with this community; she understood discrimination because her family had people of different backgrounds and her closest friends were members of this community; she was committed to diversity as she did diversity training within her office; and the office did not have enough funding for this community’s needs at that time.

Upon seeing this reaction, Anita was confused because although her tone of voice had been firm, she was not angry. From Anita’s perspective, the group had come together to address how the student community’s needs could be met, which partially meant pointing out current gaps where increased services were necessary. Anita was very clear that she was critiquing Susan’s office and not Susan, as Susan could not possibly be solely responsible for the decisions of her office.

The conversation of the group shifted at the point when Susan started to cry. From that moment, the group did not discuss the actual issue of the student community. Rather, they spent the duration of the meeting consoling Susan, reassuring her that she was not at fault. Susan calmed down, and publicly thanked Anita for her willingness to be direct, and complimented her passion. Later that day, Anita was reprimanded for her ‘angry tone,’ as she discovered that Susan complained about her “behavior” to both her own supervisor as well as Anita’s supervisor. Anita was left confused by the mixed messages she received with Susan’s compliment, and Susan’s subsequent complaint regarding her.

The key relevance of this case study is that, while the conversation was originally about the issue of student community needs, it became about Susan’s self-image. Susan made everything about her own self-image, ensuring that the actual concrete issue (that her office was not supporting the racial community) was not discussed or solved.

Shooting the messenger

In addition to crying, Susan also shot the messenger, by complaining about Anita to both her and Anita’s supervisors. This makes sense as ego-protective behavior: if she wants to maintain a certain self-image, she wants to discourage being presented with information that challenges it, and also wants to “one-up” the person who challenged her self-image, by harming that person’s image (so Anita does not end up looking better than Susan does).

Shooting the messenger is an ancient tactic, deployed especially by powerful people to silence providers of information that challenges their self-image. Shooting the messenger is asking to be lied to, using force. Obviously, if the powerful person actually wants information, this tactic is counterproductive, hence the standard advice to not shoot the messenger.

Self-consciousness as privilege defense

It’s notable that, in the cases discussed so far, self-consciousness is more often a behavior of the privileged and powerful, rather than the disprivileged and powerless. This, of course, isn’t a hard-and-fast rule, but there certainly seems to be a relation. Why is that?

Part of this is that the less-privileged often can’t get away with redirecting conversations by making everything about their self-image. People’s sympathies are more often with the privileged.

Another aspect is that privilege is largely about being rewarded for one’s identity, rather than one’s works. If you have no privilege, you have to actually do something concretely effective to be rewarded, like cleaning. Whereas, privileged people, almost by definition, get rewarded “for no reason” other than their identity.

Maintenance of a self-image makes less sense as an individual behavior than as a collective behavior. The phenomenon of bullshit jobs implies that much of the “economy” is performative, rather than about value-creation. While almost everyone can pretend to work, some people are better at it than others. The best people at such pretending are those who look the part, and who maintain the act. That is: privileged people who maintain their self-images, and who tie their self-images to their collective, as Susan did. (And, to the extent that e.g. school “prepares people for real workplaces”, it trains such behavior.)

Redirection away from the object level isn’t merely about defending self-image; it has the effect of causing issues not to be discussed, and problems not to be solved. Such effects maintain the local power system. And so, power systems encourage people to tie their self-images with the power system, resulting in self-consciousness acting as a defense of the power system.

Note that, while less-privileged people do often respond negatively to criticism from more-privileged people, such responses are more likely to be based in fear/anger rather than guilt/shame.

Stop trying to be a good person

At the root of this issue is the desire to maintain a narrative of being a “good person”. Susan responded to the criticism of her office by listing out reasons why she was a “good person” who was against racial discrimination.

While Anita wasn’t actually accusing Susan of racist behavior, it is, empirically, likely that some of Susan’s behavior is racist, as implicit racism is pervasive (and, indeed, Susan silenced a woman of color speaking on race). Susan’s implicit belief is that there is such a thing as “not being racist”, and that one gets there by passing some threshold of being nice to marginalized racial groups. But, since racism is a structural issue, it’s quite hard to actually stop participating in racism, without going and living in the woods somewhere. In societies with structural racism, ethical behavior requires skillfully and consciously reducing harm given the fact that one is a participant in racism, rather than washing one’s hands of the problem.

What if it isn’t actually possible to be “not racist” or otherwise “a good person”, at least on short timescales? What if almost every person’s behavior is morally depraved a lot of the time (according to their standards of what behavior makes someone a “good person”)? What if there are bad things that are your fault? What would be the right thing to do, then?

Calvinism has a theological doctrine of total depravity, according to which every person is utterly unable to stop committing evil, to obey God, or to accept salvation when it is offered. While I am not a Calvinist, I appreciate this teaching, because quite a lot of human behavior is simultaneously unethical and hard to stop, and because accepting this can get people to stop chasing the ideal of being a “good person”.

If you accept that you are irredeemably evil (with respect to your current idea of a good person), then there is no use in feeling self-conscious or in blocking information coming to you that implies your behavior is harmful. The only thing left to do is to steer in the right direction: make things around you better instead of worse, based on your intrinsically motivating discernment of what is better/worse. Don’t try to be a good person, just try to make nicer things happen. And get more foresight, perspective, and cooperation as you go, so you can participate in steering bigger things on longer timescales using more information.

Paradoxically, in accepting that one is irredeemably evil, one can start accepting information and steering in the right direction, thus developing merit, and becoming a better person, though still not “good” in the original sense. (This, I know from personal experience)

(See also: What’s your type: Identity and its Discontents; Blame games; Bad intent is a disposition, not a feeling)

Writing children’s picture books

Here’s an exercise for explaining and refining your opinions about some domain, X:

Imagine writing a 10-20 page children’s picture book about topic X. Be fully honest and don’t hide things (assume the child can handle being told the truth, including being told non-standard or controversial facts).

Here’s a dialogue, meant to illustrate how this could work:

A: What do you think about global warming?

B: Uhh…. I don’t know, it seems real?

A: How would you write a 10-20 page children’s picture book about global warming?

B: Oh, I’d have a diagram showing carbon dioxide exiting factories and cars, floating up in the atmosphere, and staying there. Then I’d have a picture of sunlight coming through the atmosphere, bouncing off the earth, then going back up, but getting blocked by the carbon dioxide, so it goes back to the earth and warms up the earth a second time. Oh, wait, if the carbon dioxide prevents the sunlight from bouncing from the earth to the sky, wouldn’t it also prevent the sunlight from entering the atmosphere in the first place? Oh, I should look that up later [NOTE: the answer is that CO2 blocks thermal radiation much more than it blocks sunlight].

Anyway, after that I’d have some diagrams showing global average temperature versus global CO2 level that show how the average temperature is tracking CO2 concentration, with some lag time. Then I’d have some quotes about scientists and information about the results of surveys. I’d show a graph showing how much the temperature would increase under different conditions… I think I’ve heard that, with substantial mitigation effort, the temperature difference might be 2 degrees Celsius from now until the end of the century [NOTE: it’s actually 2 degrees from pre-industrial times till the end of the century, which is about 1 degree from now]. And I’d want to show what 2 degrees Celsius means, in terms of, say, a fraction of the difference between winter and summer.

I’d also want to explain the issue of sea level rise, by showing a diagram of a glacier melting. Ice floats, so if the glacier is free-floating, then it melting doesn’t cause a sea level rise (there’s some scientific principle that says this, I don’t remember what it’s called), but if the glacier is on land, then when it melts, it causes the sea level to rise. I’d also want to show a map of the areas that would get flooded. I think some locations, like much of Florida, get flooded, so the map should show that, and there should also be a pie chart showing how much of the current population would end up underwater if they didn’t move (my current guess is that it’s between 1 percent and 10 percent, but I could be pretty wrong about this [NOTE: the answer is 30 to 80 million people, which is between about 0.4% and 1.1%]).

I’d also want to talk about possible mitigation efforts. Obviously, it’s possible to reduce energy consumption (and also meat consumption, because cows produce methane which is also a greenhouse gas). So I’d want to show a chart of which things produce the most greenhouse gases (I think airplane flights and beef are especially bad), and showing the relationship between possible reductions in that and the temperature change.

Also, trees take CO2 out of the atmosphere, so preserving forests is a way to prevent global warming. I’m confused about where the CO2 goes, exactly, since there’s some cycle it goes through in the forest; does it end up underground? I’d have to look this up.

I’d also want to talk about the political issues, especially the disinformation in the space. There’s a dynamic where companies that pollute want to deny that man-made global warming is a real, serious problem, so there won’t be regulations. So, they put out disinformation on television, and they lobby politicians. Sometimes, in the discourse, people go from saying that global warming isn’t real, to saying it’s real but not man-made, to saying it’s real and man-made but it’s too late to do anything about it. That’s a clear example of motivated cognition. I’d want to explain how this is trying to deny that any changes should be made, and speculate about why people might want to, such as because they don’t trust the process that causes changes (such as the government) to do the right thing.

And I’d also want to talk about geoengineering. There are a few proposals I know of. One is to put some kind of sulfur-related chemical in the atmosphere, to block out sunlight. This doesn’t solve ocean acidification, but it does reduce the temperature. But, it’s risky, because if you stop putting the chemical in the atmosphere, then that causes a huge temperature swing.

I also know it’s possible to put iron in the ocean, which causes a plankton bloom, which… does something to capture CO2 and store it in the bottom of the ocean? I’m really not sure how this works, I’d want to look it up before writing this section.

There’s also the proposal of growing and burning trees, and capturing and storing the carbon. When I looked this up before, I saw that this takes quite a lot of land, and anyway there’s a lot of labor involved, but maybe some of it can be automated.

There are also political issues with geoengineering. There are people who don’t trust the process of doing geoengineering to make things better instead of worse, because they expect that people’s attempts to reason about it will make lots of mistakes (or people will have motivated cognition and deceive themselves and each other), and then the resulting technical models will make things that don’t work. But, the geoengineering proposals don’t seem harder than things that humans have done in the past using technical knowledge, like rockets, so I don’t agree that this is such a big problem.

Furthermore, some people want to shut down discussion of geoengineering, because such discussion would make it harder to morally pressure people into reducing carbon emissions. I don’t know how to see this as anything other than an adversarial action against reasonable discourse, but I’m sure there is some motivation at play here. Perhaps it’s a motivation to have everyone come together as one, all helping together, in a hippie-ish way. I’m not sure if I’m right here, I’d want to read something written by one of these people before making any strong judgments.

Anyway, that’s how I’d write a picture book about global warming.
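(As a tangent: the greenhouse mechanism B is reaching for in the dialogue can be checked with a standard zero-dimensional energy-balance model. The sketch below is mine, not part of the dialogue; the effective emissivity of 0.61 is a textbook fudge factor standing in for the greenhouse effect, and the other numbers are standard physical constants.)

```python
# Zero-dimensional energy balance: absorbed sunlight = emitted thermal radiation,
#   (1 - albedo) * S / 4 = eps * sigma * T**4
# The dialogue's point (CO2 blocks outgoing thermal radiation much more than
# incoming sunlight) shows up here as an effective emissivity eps < 1.
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W / (m^2 K^4)
S = 1361.0        # solar constant, W / m^2
ALBEDO = 0.3      # fraction of sunlight reflected straight back to space

def equilibrium_temperature(emissivity):
    absorbed = (1 - ALBEDO) * S / 4          # averaged over the whole sphere
    return (absorbed / (emissivity * SIGMA)) ** 0.25

bare = equilibrium_temperature(1.0)    # no greenhouse: roughly 255 K (-18 C)
actual = equilibrium_temperature(0.61) # with greenhouse: roughly 288 K (15 C)
```

The gap between the two numbers is the greenhouse effect; adding CO2 lowers the effective emissivity further and pushes the equilibrium temperature up.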

So, I just wrote that dialogue right now, without doing any additional research. It turns out that I do have quite a lot of opinions about global warming, and am also importantly uncertain in some places, some of which I just now became aware of. But I’m not likely to produce these opinions if asked “what do you think about global warming?”

Why does this technique work? I think it’s because, if asked for one’s opinions in front of an adult audience, it’s assumed that there is a background understanding of the issue, and you have to say something new, and what you decide to say says something about you. Whereas, if you’re explaining to a child, then you know they lack most of the background understanding, and so it’s obviously good to explain that.

With adults, it’s assumed there are things that people act like “everyone knows”, where it might be considered annoying to restate them, since it’s kind of like talking down to them. Whereas, the illusion or reality that “everyone knows” is broken when explaining to children.

The countervailing force is that people are tempted to lie to children. Of course, it’s necessary to not lie to children to do the exercise right, and also to raise or help raise children who don’t end up in an illusory world of confusion and dread. I would hope that someone who has tendencies to hide things from children would at least be able to notice and confront these tendencies in the process of imagining writing children’s picture books.

I think this technique can be turned into a generalized process for making world models. If someone wrote a new sketch of a children’s picture book (about a new topic) every day, and did the relevant research when they got stuck somewhere, wouldn’t they end up with a good understanding of both the world and of their own models of the world after a year? It’s also a great starting point from which to compare your opinions to others’ opinions, or to figure out how to explain things to either children or adults.

Anyway, I haven’t done this exercise for very many topics yet, but I plan on writing more of these.

Occamian conjecturalism: we posit structures of reality

Here’s my current explicit theory of ontology and meta-epistemology. I haven’t looked into the philosophical literature that much, but this view has similarities to both conjectural realism and to minimum description length.

I use “entity” to mean some piece of data in the mind, similar to an object in an object-oriented programming language. They’re the basic objects perception and models are made of.

Humans start with primitive entities, which include low-level physical percepts, and perhaps other things, though I’m not sure.

We posit entities to explain other entities, using Occam/probability rules; some entities are rules about how entities predict/explain other entities. Occam says to posit few entities to explain many. Probability says explanations may be stochastic (e.g. dogs are white with 30% probability). See minimum description length for more on how Occam and probability interact.
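The Occam/probability tradeoff can be made concrete with a toy two-part code. This is my own illustrative sketch, not something from the post: the `mdl_score` function and the specific bit counts are invented. Each hypothesis is scored by the bits needed to describe it plus the bits needed to encode the data under it; the lower total wins.

```python
import math

def mdl_score(model_bits, probs):
    """Two-part MDL: bits to describe the model, plus bits to
    encode the data given the model (negative log2-likelihood)."""
    data_bits = -sum(math.log2(p) for p in probs)
    return model_bits + data_bits

# Data: 100 swan sightings, 90 white and 10 black.
# Hypothesis A: "essentially all swans are white" -- cheap to state
# (say 5 bits), but each black swan is very expensive to encode.
probs_a = [0.999] * 90 + [0.001] * 10
# Hypothesis B: "swans are white with 90% probability" -- costlier
# to state (say 10 bits), but it fits the data far better.
probs_b = [0.9] * 90 + [0.1] * 10

score_a = mdl_score(5, probs_a)   # ~104.8 bits
score_b = mdl_score(10, probs_b)  # ~56.9 bits
# The stochastic generality pays for its extra description
# length by explaining the exceptions cheaply.
```

This is the sense in which a stochastic generality like “dogs are white with 30% probability” can beat a simpler but worse-fitting rule: the comparison is over total bits, not over model size alone.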

High-level percepts get posited to explain low-level percepts, e.g. a color splotch gets posited to explain all the individual colored points that are close to each other. A line gets posited to explain a bunch of individual colored points that are in, well, a line.

Persistent objects are posited (object permanence) to explain regularities in high-level percepts over spacetime. Object-types get posited to explain similarities between different objects.

Generalities (e.g. “that swans are white”) get posited to explain regularities between different objects. Generalities may be stochastic (coins turn up heads half the time when flipped). It’s hard to disentangle generalities from types themselves (is being white a generality about swans, or a defining feature?). Logical universals (such as modus ponens) are generalities.

Some generalities are causal relations, e.g. that striking a match causes a flame. Causal relations explain “future” events from “past” events, in a directed acyclic graph structure.
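The directed-acyclic-graph structure of causal relations can be sketched as a plain data structure. The event names and the `is_acyclic` helper below are illustrative assumptions of mine, not part of the post; the check is just Kahn's topological-sort algorithm.

```python
# A toy causal model: each event is explained by its parent causes.
causes = {
    "strike_match": [],
    "oxygen_present": [],
    "flame": ["strike_match", "oxygen_present"],
    "smoke": ["flame"],
}

def is_acyclic(graph):
    """Verify the 'acyclic' part by attempting a topological
    sort (Kahn's algorithm): repeatedly remove events whose
    causes are all accounted for."""
    remaining = {node: set(parents) for node, parents in graph.items()}
    while remaining:
        ready = [n for n, ps in remaining.items() if not ps]
        if not ready:
            return False  # a cycle: every event awaits an unexplained cause
        for n in ready:
            del remaining[n]
        for ps in remaining.values():
            ps.difference_update(ready)
    return True
```

Acyclicity is what lets “past” events explain “future” ones without circular explanation: a topological order of the graph is exactly an ordering in which causes precede their effects.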

So far, the picture is egocentric, in that percepts are taken to be basic. If I adopt a percept-based ontology, I will believe that the world moves around me as I walk, rather than believing that I move through the world. Things are addressed in coordinates relative to my position, not relative to the ground. (This is easy to see if you pay attention to your visual field while walking around)

Whence objectivity? As I walk, most of those things around me “don’t move” if I posit that the ground is stable, as they have the same velocity as the ground. So by positing the ground is still while I move, I posit fewer motions. While I could in theory continue using an egocentric reference frame and posit laws of motion to explain why the world moves around me, this ends up more complicated and epicyclical than simply positing that the ground is still while I move. Objectivity-in-general is a result of these shifts in reference frame, where things are addressed relative to some common ground rather than egocentrically.

Objectivity implies theory of mind, in that I take my mental phenomena to be “properties of me-the-person” rather than “the mental phenomena that are apparent”, as an egocentric reference frame would take them to be. I posit other minds like my own, which is a natural result of the generalization that human bodies are inhabited by minds. Empathy is the connection I effectively posit between my own mental phenomena and others’ through this generalization.

An ontology shift happens when we start positing different types of entities than we did previously. We may go from thinking in terms of color splotches to thinking in terms of objects, or from thinking in terms of chemical essences to thinking in terms of molecules. Each step is justified by the Occam/probability rules; the new ontology must make the overall structure simpler.

Language consists of words, which are themselves entities that explain lower-level percepts (phonemes, dots of ink on paper, etc). Children learning language find that these entities are correlated with the reality they have already posited. (This is clear in the naive case, where teachers simply use language to honestly describe reality, but correlation is still present when language use is dishonest). The combination of objectivity and language has the result of standardizing a subset of ontology between different speakers, though nonverbal ontology continues to exist.

Mathematical entities (e.g. numbers) are posited to explain regularities in entities, such as the regularity between “two things over here” and “two things over there”, and between linguistic entities such as the word “two” and the actual “two things over here”. Mathematical generalizations are posited to explain mathematical entities.

Fictional worlds are posited to explain fictional media. We, in some sense, assume that a fiction book is an actual description of some world. Unlike with nonfiction media, we don’t expect this world to be the same as the one we move through in everyday life; it isn’t the actual world. Reality is distinguished from fantasy by their differing correlational structures.

If everything but primitive entities is posited, in what sense are these things “ultimately real”? There is no notion of “ultimately real” outside the positing structure. We may distinguish reality from fantasy within the structure, as the previous paragraph indicates. We may also distinguish illusion from substance, as we expect substance but not illusion to generate concordant observations upon being viewed differently. We may distinguish persistent ontology (which stays the same as we get more data) from non-persistent ontology (which changes as we get more data). And we may distinguish primitive entities from posited ones. But, there doesn’t seem to be a notion of ultimate reality beyond these particular distinctions and ones like them. I think this is a feature, not a bug. However, it’s at least plausible that when I learn more, my ontology will stabilize to the point where I have a natural sense of ultimate reality.

What does it mean for propositions to be true or false? A proposition is some sentence (an entity) corresponding to a predicate on worlds; it is true if and only if the predicate is true of the world. For example, “snow is white” is true if and only if snow is white. This is basically a correspondence theory, where we may speak of correspondences between the (already-ontologized) territory and ontological representations of it.

But, what about ontological uncertainty? It’s hard to say whether an ontology, such as the ontology of objects, is “true” or “false”. We may speak of it “fitting the territory well” or “fitting the territory badly”, which is not the same thing as “true” or “false” in a propositional sense. If we expect our ontologies to shift in the future (and I expect mine to shift), then, from the perspective of our new ontology, our current ontology will be false, the way Newtonian mechanics is false. However, we don’t have access to this hypothetical future ontology yet, so we can’t use it to judge our current ontology as false; the judgment that the original ontology is false comes along with a new worldview, which we don’t have yet. What we can say is whether or not we expect our reasoning processes to produce ontology shifts when exposed to future data.

May non-falsifiable entities be posited? Yes, if they explain more than they posit. Absent ability to gain more historical data, many historical events are non-falsifiable. Still, positing such an event explains the data (e.g. artifacts supposedly left at the site of the event) better than alternatives (e.g. positing that the writing was produced by people who happened to have the same delusion). So, entities need not be falsifiable in general, although ones that are completely unrelated to any observational consequences will never be posited in the first place.

Is reality out there, or is it all in our heads? External objects are out there; they aren’t in your brain, or they would be damaging your brain tissue. Yet, our representations of such objects are in our heads. Objects differ from our representations of them; they’re in different places, are different sizes, and are shaped differently. When I speak of posited structures, I speak of representations, not the objects themselves, although our posited structures constitute our sense of all that is.

Reductionism and physicalism

But isn’t reality made of atoms (barring quantum mechanics), not objects? We posit atoms to explain objects and their features. Superficially, positing so many atoms violates Occamian principles, but this is not an issue in probabilistic epistemologies, where we may (implicitly) sum over many possible atomic configurations. The brain doesn’t actually do such a sum; in practice we rarely posit particular atoms, and instead posit generalities about atoms and their relation to other entities (such as chemical types). Objects still exist in our ontologies, and are explained by atoms. Atoms explain, but do not explain away, objects.

But couldn’t you get all the observations you’re using objects to explain using atoms? Perhaps an AI can do this, but a human can’t. Humans continue to posit objects upon learning about atoms. The ontology shift to believing in only-atoms would be computationally intractable.

But doesn’t that mean the ultimate reality is atoms, not objects? “Ultimate reality” is hard to define, as explained previously. Plausibly, I would believe in atoms and not believe in objects if I thought much faster than I actually do. This would make objects a non-persistent ontology, as opposed to the more-persistent atomic ontology. However, this counterfactual is strange, as it assumes my brain is larger than the rest of the universe. Even then, I would be unable to model my brain as atomic. So it seems that, as an epistemic fact, atoms aren’t all there are; I would never shift to an atom-only ontology, no matter how big my brain was.

But isn’t this confusing the territory and the best map of the territory? As explained previously, our representations are not the territory. Our sense of the territory itself (not just of our map of it) contains objects, or, to drop the quotation, the territory itself contains objects. (Why drop the quotation? I’m describing my sense of the territory to you; there is nothing else I could proximately describe, other than my sense of the territory; in reaching for the territory itself, I proximately find my sense of it)

This discussion is going towards the idea of supervenience, which is that high-level phenomena (such as objects) are entirely determined by low-level phenomena (such as atoms). Supervenience is a generality that relates high-level phenomena to low-level ones. Importantly, supervenience is non-referential (and thus vacuous) if there are no high-level phenomena.

If everything supervenes on atoms, then there are high-level phenomena (such as objects), not just atoms. Positing supervenience yields all the effective predictions that physicalism could yield (in our actual brains, not in theoretical super-AIs). Supervenience may imply physicalism, depending on the definition of physicalism, but it doesn’t imply that atoms are the only entities.

Supervenience leaves open a degree of freedom, namely, the function mapping low-level phenomena to high-level phenomena. In the case of consciousness as the high-level phenomenon, this function will, among other things, resolve indexical/anthropic uncertainty (which person are the experiences I see happening to?) and uncertainty about the hard problem of consciousness (which physical structures are conscious, and of what?).

Doesn’t this imply that p-zombies are conceivable? We may distinguish “broad” notions of conceivability, under which just about any posited structure is conceivable (and under which p-zombies are conceivable), and “narrow” notions, where the structure must satisfy certain generalities, such as logic and symmetry. Adding p-zombies to the posited structure might break important general relations we expect will hold, such as logic, symmetry of function from physical structure to mental structure, or realization-independence. I’m not going to resolve the zombie argument in this particular post, but will conclude that it is at least not clear that zombies are conceivable in the narrow sense.


This is my current best simple, coherent view of ontology and meta-epistemology. If I were to give it a name, it would be “Occamian conjecturalism”, but it’s possible it has already been named. I’m interested in criticism of this view, or other thoughts on it.

Conditional revealed preference

There’s a pretty common analysis of human behavior that goes something like this:

“People claim that they want X. However, their actions are optimizing towards Y instead of X. If they really cared about X, they would do something else instead. Therefore, they actually want Y, and not X.”

This is revealed preference analysis. It’s quite useful, in that if people’s actions are effectively optimizing for Y and not X, then an agent-based model of the system will produce better predictions by predicting that people want Y and not X.

So, revealed preference analysis is great for analyzing a multi-agent system in equilibrium. However, it often has trouble predicting what would happen when a major change happens to the system.

As an example, consider a conclusion Robin Hanson gives on school:

School isn’t about learning “material,” school is about learning to accept workplace domination and ranking, and tolerating long hours of doing boring stuff exactly when and how you are told.

(note that I don’t think Hanson is claiming things about what people “really want” in this particular post, although he does make such claims in other writing)

Hanson correctly infers from the fact that most schools are highly authoritarian that school is effectively “about” learning to accept authoritarian work environments. We could make “about” more specific: the agents who determine what happens in schools (administrators, teachers, voters, parents, politicians, government employees) are currently taking actions that cause schools to be authoritarian, in a coordinated fashion, with few people visibly resisting this optimization.

This revealed preference analysis is highly useful. However, it leaves degrees of freedom open in what the agents terminally want. These degrees of freedom matter when predicting how those agents will act under different circumstances (their conditional revealed preferences). For example:

  • Perhaps many of the relevant agents actually do want schools to help children learn, but were lied to about what forms of school are effective for learning. This would predict that, upon receiving credible evidence that free schools are more effective for learning while being less authoritarian, they would support free schools instead.
  • Perhaps many of the relevant agents want school to be about learning, but find themselves in a grim trigger equilibrium where they expect to get punished for speaking out about the actual nature of school, and also to be punished for not punishing those who speak out. This would predict that, upon seeing enough examples of people speaking out and not being punished, they would join the new movement.
  • Perhaps many of the relevant agents have very poor world models of their own, and must therefore navigate according to imitation and to “official reality” narratives, which constrain them to acting as if school is for learning. This would predict that, upon gaining much more information about the world and gaining experience in navigating it according to their models (rather than the official narratives), they would favor free schools over authoritarian schools.

It’s hard to tell which of these hypotheses (or other hypotheses) are true given only information about how people act in the current equilibrium. These hypotheses make conditional and counterfactual predictions: they predict what people would do, given different circumstances than their current ones.

This is not to say that people’s stories about what they want are to be taken at face value; the gold standard for determining what people want is not what they say, but what they actually optimize for under various circumstances, including ones substantially different from present ones. (Obviously, their words can be evidence about their counterfactual actions, to the extent that they are imaginative and honest about the counterfactual scenarios)

To conclude, I suggest the following heuristics:

  • In analyzing an equilibrium, look mainly at what people actually optimize for with their actions, not what they say they’re optimizing for.
  • In guessing what they “really want”, additionally imagine their actions in alternative scenarios where they e.g. have more information and more ability to coordinate with those who have similar opinions.
  • Actually find data about these alternative scenarios, by e.g. actually informing people, or finding people who were informed and seeing how their actions changed.

Boundaries enable positive material-informational feedback loops

(Also posted on LessWrong)

[epistemic status: obvious once considered, I think]

If you want to get big things done, you almost certainly need positive feedback loops. Unless you can already do all the necessary things, you need to do/make things that allow you to do/make more things in the future. This dynamic can be found in RPG and economy-management games, and in some actual economic systems, such as industrializing economies.

Material, information, and economy

Some goods that can be used in a positive feedback loop, such as software and inventions, are informational. Once produced, they can be used indefinitely in the future. In economic terms, they are nonrivalrous.

Other goods are material, such as manufactured goods and energy. They can’t be copied cheaply. In economic terms, they are rivalrous.

In practice, any long-lasting positive feedback loop contains both informational and material goods, as production of information requires a physical substrate. While ensuring that informational goods can be used in the future is an organization and communication problem (a subject beyond the scope of this post), the problem of ensuring that material goods can be used in the future is additionally a security problem.

An important question to ask is: why haven’t material-informational positive feedback loops already taken over the world? Why don’t we have so much stuff by now that providing for people’s material needs (such as food and housing) is trivial?

To some extent, material-informational positive feedback loops have taken over the world, but they seem much slower than one would naively expect. See cost disease. As an example of cost disease, the average cost of a new house in the USA has quadrupled over a 60-year period (adjusted for inflation!), whereas models of capitalism based on economy-management games such as Factorio (or, more academically, the labor/capital-based economic models of classical economists such as David Ricardo) would suggest that houses would be plentiful by now. (And no, this isn’t just because of land prices; it cost about $300K to build a house in the US in 2018)

Security and boundaries

I’ve already kind of answered this question by saying that ensuring that material goods can be used in the future is a security problem. If you use one of your material goods to produce another material good, and someone takes this new good, then you can’t put this good back into your production process. Thus, what would have been a positive feedback loop is instead a negative feedback loop, as it leaks goods faster than it produces them.

Solving security issues generally requires boundaries. You need to draw a boundary in material space somewhere, differentiating the inside from the outside, such that material goods (such as energy) on the inside don’t leak out, and can potentially have positive feedback loops. There are many ways to prevent leaks across a boundary while still allowing informational and material goods to pass through sometimes, such as semiporous physical barriers and active policing. Regardless of the method used to enforce the boundary, the boundary has to exist in some geometrical sense for it to make sense to say that e.g. energy increases within this system.

Not all security issues are from other agents; some are from non-agentic processes. Consider a homeostatic animal. If the animal expends energy to warm its body, and this warmth escapes, the animal will fail to realize gains from the energy expenditure. Thus, the animal has a boundary (namely, skin) to solve this “security problem”. The cold air particles that take away heat from the animal are analogous to agents that directly take resources, though obviously less agentic. While perhaps my usage of the word “security” to include responses to nonagentic threats is nonstandard, I hope it is clear that these are on the same spectrum as agentic threats, and can be dealt with in some of the same ways.

It is also worth thinking about semi-agentic entities, such as microorganisms. One of the biggest threats to a food store is microorganisms (i.e. rotting), and slowing the negative feedback loops depleting food stores requires solving this security problem using a boundary (such as a sealed container or a subset of the air that is colder than the outside air, such as in a refrigerator).

Property rights are a simple example of boundaries. Certain goods are considered to be “owned” by different parties, such that there is common agreement about who owns what, and people are for one reason or another not motivated to take other people’s stuff. Such division of goods into sets owned by different parties is a set of boundaries enabling positive feedback loops, which are especially salient in capitalism.

What about trust between different entities? A complex ecosystem will contain entities satisfying a variety of niches, which include parasitism and predation (which are on the same spectrum). A trust network can be thought of as a way for different entities to draw various boundaries, often fuzzy ones, that mostly exclude parasites/predators, such that there are few leaks from inside this boundary to outside this boundary (which would include parasitism/predation by entities outside the boundary). There are “those who you trust” and “those who you don’t trust” (both fuzzy sets), and you assign more utility to giving resources to those you trust, as this allows for positive feedback loops within a system that contains you (namely, the trust network).

Externalities and sustainability

Since no subsystem of the world is causally closed, all positive feedback loops have externalities. By definition, the outside world is only directly affected by these externalities, and is only affected by what happens within the boundary to the extent that this eventually leads to externalities. A wise designer of a positive feedback loop will anticipate its externalities, and set it up such that the externalities are overall desirable to the designer. After all, there is no point to creating a positive feedback loop unless its externalities are mostly positive.

A positive feedback loop’s externalities modify its environment, affecting its own ability to continue; for example, a positive feedback loop of microorganisms eating food will exhaust itself by consuming the food. So, different positive feedback loops are environmentally sustainable to different extents. Both production and conquest generate positive feedback loops, as Ben Hoffman discusses in this post, but production is much more environmentally sustainable than conquest.

One way to increase environmental sustainability is to move more processes to the inside of the boundary. For example, a country that is consuming large amounts of iron (driving up iron prices) may consider setting up its own iron mines. Thus, the inside of the boundary becomes more like an economy of its own. This is sometimes known as import replacement.

Of course, the environmental sustainability of a positive feedback loop can also be a negative, as it is better for some processes (such as rotting) to limit or exhaust themselves, thus transitioning to negative feedback or a combination of positive and negative feedback. Processes that include intentionally-designed positive and negative feedback can be much more environmentally sustainable than processes that only have positive feedback loops designed in, since they can limit their growth when such growth would be unsustainable.

While in theory the philosophy of effective altruism (EA) would imply a strong (and likely overwhelming) emphasis on creating and maintaining environmentally sustainable positive feedback loops with positive externalities, typically-recommended EA practices (such as giving away 10% of one’s income) are negative feedback loops (the more you make, the more you give away). While in theory the place the resources are given to could have a faster positive feedback loop than just investing in yourself, your friends, and your projects, in practice I rarely believe claims of this form that come from the EA movement; for example, if a country has a high rate of poverty, that indicates that the negative feedback loops (such as corruption) are likely stronger than the positive ones, and that giving resources is ineffective. Thus, I cannot in good conscience allow anything like current EA ideology to substantially control resource allocation in most systems I create, even though EA philosophy taken to its logical conclusion would get the right answer on the importance of securing the boundaries of positive feedback loops.

Policy suggestions

How do these ideas translate to action? One suggestion is that, if you are trying to do something big, you use one or more positive feedback loops, and ask yourself the following questions about each one:

  1. What’s the generator of my positive feedback loop (i.e. what’s the process that turns stuff into more stuff)?
  2. What is the boundary within which the positive feedback increases resources?
  3. How am I reducing leakage across this boundary?
  4. What are the externalities of this positive feedback loop?
  5. How environmentally sustainable is this positive feedback loop?
  6. Are there built-in negative feedback loops that increase environmental sustainability?

(thanks to Bryce Hidysmith for a conversation that led to this post)

Act of Charity

(Also posted on LessWrong)

The stories and information posted here are artistic works of fiction and falsehood. Only a fool would take anything posted here as fact.


Act I.

Carl walked through the downtown. He came across a charity stall. The charity worker at the stall called out, “Food for the Africans. Helps with local autonomy and environmental sustainability. Have a heart and help them out.” Carl glanced at the stall’s poster. Along with pictures of emaciated children, it displayed infographics about how global warming would cause problems for African communities’ food production, and numbers about how easy it is to help out with money. But something caught Carl’s eye. In the top left, in bold font, the poster read, “IT IS ALL AN ACT. ASK FOR DETAILS.”

Carl: “It’s all an act, huh? What do you mean?”

Worker: “All of it. This charity stall. The information on the poster. The charity itself. All the other charities like us. The whole Western idea of charity, really.”

Carl: “Care to clarify?”

Worker: “Sure. This poster contains some correct information. But a lot of it is presented in a misleading fashion, and a lot of it is just lies. We designed the poster this way because it fits with people’s idea of a good charity they should give money to. It’s a prop in the act.”

Carl: “Wait, the stuff about global warming and food production is a lie?”

Worker: “No, that part is actually true. But in context we’re presenting it as some kind of imminent crisis that requires an immediate infusion of resources, when really it’s a very long-term problem that will require gradual adjustment of agricultural techniques, locations, and policies.”

Carl: “Okay, that doesn’t actually sound like more of a lie than most charities tell.”

Worker: “Exactly! It’s all an act.”

Carl: “So why don’t you tell the truth anyway?”

Worker: “Like I said before, we’re trying to fit with people’s idea of what a charity they should give money to looks like. More to the point, we want them to feel compelled to give us money. And they are compelled by some acts, but not by others. The idea of an immediate food crisis creates more moral and social pressure towards immediate action than the idea that there will be long-term agricultural problems that require adjustments.”

Carl: “That sounds…kind of scammy?”

Worker: “Yes, you’re starting to get it! The act is about violence! It’s all violence!”

Carl: “Now hold on, that seems like a false equivalence. Even if they were scammed by you, they still gave you money of their own free will.”

Worker: “Most people, at some level, know we’re lying to them. Their eyes glaze over ‘IT IS ALL AN ACT’ as if it were just a regulatory requirement to put this on charity posters. So why would they give money to a charity that lies to them? Why do you think?”

Carl: “I’m not nearly as sure as you that they know this! Anyway, even if they know at some level it’s a lie, that doesn’t mean they consciously know, so to their conscious mind it seems like being completely heartless.”

Worker: “Exactly, it’s emotional blackmail. I even say ‘Have a heart and help them out’. So if they don’t give us money, there’s a really convenient story that says they’re heartless, and a lot of them will even start thinking about themselves that way. Having that story told about them opens them up to violence.”

Carl: “How?”

Worker: “Remember Martin Shkreli?”

Carl: “Yeah, that asshole who jacked up the price of Daraprim.”

Worker: “Right. He ended up going to prison. Nominally, it was for securities fraud. But it’s not actually clear that whatever security fraud he did was worse than what others in his industry were doing. Rather, it seems likely that he was especially targeted because he was a heartless asshole.”

Carl: “But he still broke the law!”

Worker: “How long would you be in jail if you got punished for every time you had broken the law?”

Carl: “Well, I’ve done a few different types of illegal drugs, so… a lot of years.”

Worker: “Exactly. Almost everyone is breaking the law. So it’s really, really easy for the law to be enforced selectively, to punish just about anyone. And the people who get punished the most are those who are villains in the act.”

Carl: “Hold on. I don’t think someone would actually get sent to prison because they didn’t give you money.”

Worker: “Yeah, that’s pretty unlikely. But things like it will happen. People are more likely to give if they’re walking with other people. I infer that they believe they will be abandoned if they do not give.”

Carl: “That’s a far cry from violence.”

Worker: “Think about the context. When you were a baby, you relied on your parents to provide for you, and abandonment by them would have meant certain death. In the environment of evolutionary adaptation, being abandoned by your band would have been close to a death sentence. This isn’t true in the modern world, but people’s brains mostly don’t really distinguish abandonment from violence, and we exploit that.”

Carl: “That makes some sense. I still object to calling it violence, if only because we need a consistent definition of ‘violence’ to coordinate, well, violence against those that are violent. Anyway, I get that this poster is an act, and the things you say to people walking down the street are an act, but what about the charity itself? Do you actually do the things you say you do?”

Worker: “Well, kind of. We actually do give these people cows and stuff, like the poster says. But that isn’t our main focus, and the main reason we do it is, again, because of the act.”

Carl: “Because of the act? Don’t you care about these people?”

Worker: “Kind of. I mean, I do care about them, but I care about myself and my friends more; that’s just how humans work. And if it doesn’t cost me much, I will help them. But I won’t help them if it puts our charity in a significantly worse position.”

Carl: “So you’re the heartless one.”

Worker: “Yes, and so is everyone else. Because the standard you’ve set for ‘not heartless’ is not one that any human actually achieves. They just deceive themselves about how much they care about random strangers; the part of their brain that inserts these self-deceptions into their conscious narratives is definitely not especially altruistic!”

Carl: “According to your own poster, there’s going to be famine, though! Is the famine all an act to you?”

Worker: “No! Famine isn’t an act, but most of our activities in relation to it are. We give people cows because that’s one of the standard things charities like ours are supposed to do, and it looks like we’re giving these people local autonomy and stuff.”

Carl: “Looks like? So this is all just optics?”

Worker: “Yes! Exactly!”

Carl: “I’m actually really angry right now. You are a terrible person, and your charity is terrible, and you should die in a fire.”

Worker: “Hey, let’s actually think through this ethical question together. There’s a charity pretty similar to ours that’s set up a stall a couple blocks from here. Have you seen it?”

Carl: “Yes. They do something with water filtering in Africa.”

Worker: “Well, do you think their poster is more or less accurate than ours?”

Carl: “Well, I know yours is a lie, so…”

Worker: “Hold on. This is Gell-Mann amnesia. You know ours is a lie because I told you. This should adjust your model of how charities work in general.”

Carl: “Well, it’s still plausible that they are effective, so I can’t condemn—”

Worker: “Stop. In talking of plausibility rather than probability, you are uncritically participating in the act. You are taking symbols at face value, unless there is clear disproof of them. So you will act like you believe any claim that’s ‘plausible’, in other words one that can’t be disproven from within the act. You have never, at any point, checked whether either charity is doing anything in the actual, material world.”

Carl: “…I suppose so. What’s your point, anyway?”

Worker: “You’re shooting the messenger. All or nearly all of these charities are scams. Believe me, we’ve spent time visiting these other organizations, and they’re universally fraudulent, they just have less self-awareness about it. You’re only morally outraged at the ones that don’t hide it. So your moral outrage optimizes against your own information. By being morally outraged at us, you are asking to be lied to.”

Carl: “Way to blame the victim. You’re the one lying.”

Worker: “We’re part of the same ecosystem. By rewarding a behavior, you cause more of it. By punishing it, you cause less of it. You reward lies that have plausible deniability and punish truth, when that truth is told by sinners. You’re actively encouraging more of the thing that is destroying your own information!”

Carl: “It still seems pretty strange to think that they’re all scams. Like, some of my classmates from college went into the charity sector. And giving cows to people who have food problems actually seems pretty reasonable.”

Worker: “It’s well known by development economists that aid generally creates dependence, that in giving cows to people we disrupt their local economy’s cow market, reducing the incentive to raise cattle. And in theory it could still be worth it, but our preliminary calculations indicate that it probably isn’t.”

Carl: “Hold on. You actually ran the calculation, found that your intervention was net harmful, and then kept doing it?”

Worker: “Yes. Again, it is all—”

Carl: “What the fuck, seriously? You’re a terrible person.”

Worker: “Do you think any charity other than us would have run the calculation we did, and then actually believe the result? Or would they have fudged the numbers here and there, and when even a calculation with fudged numbers indicated that the intervention was ineffective, come up with a reason to discredit this calculation and replace it with a different one that got the result they wanted?”

Carl: “Maybe a few… but I see your point. But there’s a big difference between acting immorally because you deceived yourself, and acting immorally with a clear picture of what you’re doing.”

Worker: “Yes, the second one is much less bad!”

Carl: “What?”

Worker: “All else being equal, it’s better to have clearer beliefs than muddier ones, right?”

Carl: “Yes. But in this case, it’s very clear that the person with the clear picture is acting immorally, while the self-deceiver, uhh…”

Worker: “…has plausible deniability. Their stories are plausible even though they are false, so they have more privilege within the act. They gain privilege by muddying the waters, or in other words, destroying information.”

Carl: “Wait, are you saying self-deception is a choice?”

Worker: “Yes! It’s called ‘motivated cognition’ for a reason. Your brain runs something like a utility-maximization algorithm to tell when and how you should deceive yourself. It’s epistemically correct to take the intentional stance towards this process.”

Carl: “But I don’t have any control over this process!”

Worker: “Not consciously, no. But you can notice the situation you’re in, think about what pressures there are on you to self-deceive, and think about modifying your situation to reduce these pressures. And you can do this to other people, too.”

Carl: “Are you saying everyone is morally obligated to do this?”

Worker: “No, but it might be in your interest, since it increases your capabilities.”

Carl: “Why don’t you just run a more effective charity, and advertise on that? Then you can outcompete the other charities.”

Worker: “That’s not fashionable anymore. The ‘effectiveness’ branding has been tried before; donors are tired of it by now. Perhaps this is partially because there aren’t functional systems that actually check which organizations are effective and which aren’t, so scam charities branding themselves as effective end up outcompeting the actually effective ones. And there are organizations claiming to evaluate charities’ effectiveness, but they’ve largely also become scams by now, for exactly the same reasons. The fashionable branding now is environmentalism.”

Carl: “This is completely disgusting. Fashion doesn’t help people. Your entire sector is morally depraved.”

Worker: “You are entirely correct to be disgusted. This moral depravity is a result of dysfunctional institutions. You can see it outside charity too; schools are authoritarian prisons that don’t even help students learn, courts put people in cages for not spending enough on a lawyer, the US military blows up civilians unnecessarily, and so on. But you already knew all that, and ranting about these things is itself a trope. It is difficult to talk about how broken the systems are without this talking itself being interpreted as merely a cynical act. That’s how deep this goes. Please actually update on this rather than having your eyes glaze over!”

Carl: “How do you even deal with this?”

Worker: “It’s already the reality you’ve lived in your whole life. The only adjustment is to realize it, and be able to talk about it, without this destroying your ability to participate in the act when it’s necessary to do so. Maybe functional information-processing institutions will be built someday, but we are stuck with this situation for now, and we’ll have no hope of building functional institutions if we don’t understand our current situation.”

Carl: “You are wasting so much potential! With your ability to see social reality, you could be doing all kinds of things! If everyone who were as insightful as you were as pathetically lazy as you, there would be no way out of this mess!”

Worker: “Yeah, you’re right about that, and I might do something more ambitious someday, but I don’t really want to right now. So here I am. Anyway… food for the Africans. Helps with local autonomy and environmental sustainability. Have a heart and help them out.”

Carl sighed, fished a 10 dollar bill from his wallet, and gave it to the charity worker.

Decision theory and zero-sum game theory, NP and PSPACE

(Also posted on LessWrong)

At a rough level:

  • Decision theory is about making decisions to maximize some objective function.
  • Zero-sum game theory is about making decisions to optimize some objective function while someone else is making decisions to minimize this objective function.

These are quite different.

Decision theory and NP

Decision theory roughly corresponds to the NP complexity class.  Consider the following problem:

Given a set of items, each of which has an integer-valued value and weight, does there exist a subset with total weight less than w and total value at least v?

(It turns out that finding a solution is not much harder than determining whether there is a solution; if you know how to tell whether there is a solution to arbitrary problems of this form, you can in particular tell if there is a solution that uses any particular item.)

This is the knapsack problem, and it is in NP.  Given a candidate solution, it is easy to check whether it actually is a solution: you just count the values and the weights.  Since this solution would constitute a proof that the answer to the question is “yes”, and a solution exists whenever the answer is “yes”, this problem is in NP.
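The easy-verification property is concrete enough to write down. Here is a minimal Python sketch of the polynomial-time checker (the items, bounds, and function name are invented for illustration):

```python
def verify_knapsack(items, subset, w, v):
    """Check a candidate certificate for the knapsack decision problem:
    does `subset` (a list of indices into `items`) have total weight
    less than w and total value at least v?  Runs in linear time."""
    total_value = sum(items[i][0] for i in subset)
    total_weight = sum(items[i][1] for i in subset)
    return total_weight < w and total_value >= v

items = [(10, 5), (6, 4), (7, 3)]  # (value, weight) pairs
print(verify_knapsack(items, [0, 2], 9, 15))  # weight 8 < 9, value 17 >= 15
```

Finding a valid subset may take exponential search, but checking one is trivial; that asymmetry is exactly what membership in NP requires.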

The following is a general form for NP problems:

\exists x_1 \in \{0, 1\} \exists x_2 \in \{0, 1\} \ldots \exists x_k \in \{0, 1\} f(x_1, ..., x_k)

where f is a specification of a circuit (say, made of AND, OR, and NOT gates) that outputs a single Boolean value.  That is, the problem is to decide whether there is some assignment of values to x_1, \ldots, x_k that f outputs true on.  This is a variant of the Boolean satisfiability problem.
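Since every quantifier is existential, a solver only needs to search one-directionally over assignments. A brute-force sketch (the lambda and function name are invented for illustration):

```python
from itertools import product

def sat(f, k):
    """Brute-force Boolean satisfiability: search for any assignment of
    k Booleans that makes the circuit f output True.  This mirrors the
    'exists x1 ... exists xk' form: all quantifiers point one way."""
    return any(f(*bits) for bits in product((False, True), repeat=k))

print(sat(lambda a, b, c: (a or b) and not c, 3))  # True: e.g. a=True, c=False
```

The search is exponential in k, but the structure is a single flat disjunction over assignments, with no adversary to respond to.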

In decision theory (and in NP), all optimization is in the same direction.  The only quantifier is \exists.

Zero-sum game theory and PSPACE

Zero-sum game theory roughly corresponds to the PSPACE complexity class.  Consider the following problem:

Given a specification of a Reversi game state (on an arbitrarily-large square board), does there exist a policy for the light player that guarantees a win?

(It turns out that winning the game is not much harder than determining whether there is a winning policy; if you know how to tell whether there is a solution to arbitrary problems of this form, then in particular you can tell if dark can win given a starting move by light.)

This problem is in PSPACE: it can be solved by a Turing machine using a polynomial amount of space.  This Turing machine works through the minimax algorithm: it simulates all possible games in a backtracking fashion.
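The space bound comes from the backtracking structure: minimax keeps only the current path in memory, not the whole tree. A toy sketch (the tree and payoffs here are invented):

```python
def minimax(state, maximizing):
    """Backtracking minimax.  Leaves (ints) are payoffs for the
    maximizing player; inner lists are available moves.  The whole
    game tree is explored, but only the current path lives on the
    stack, so space used is proportional to the game's depth."""
    if isinstance(state, int):
        return state
    results = (minimax(child, not maximizing) for child in state)
    return max(results) if maximizing else min(results)

# A toy game tree of depth 3.
tree = [[3, [5, 1]], [[0, 9], 2]]
print(minimax(tree, True))  # 3
```

Time is exponential in depth, but space is polynomial, which is what places such games in PSPACE.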

The following is a general form for PSPACE problems:

\exists x_1 \in \{0, 1\} \forall y_1 \in \{0, 1\} \ldots \exists x_k \in \{0, 1\} \forall y_k \in \{0, 1\} f(x_1, y_1, \ldots, x_k, y_k)

where f is a specification of a circuit (say, made of AND, OR, and NOT gates) that outputs a single Boolean value.  That is, the problem is to determine whether it is possible to set the x values interleaved with an opponent setting the y values such that, no matter how the opponent acts, f(x_1, y_1, \ldots, x_k, y_k) is true.  This is a variant of the quantified Boolean formula problem.  (Interpreting a logical formula containing \exists and \forall as a game is standard; see game semantics).
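The alternating-quantifier form can also be evaluated directly by recursion, one quantifier per level. A minimal sketch (the function name and example formula are invented for illustration):

```python
def qbf_value(f, k, assignment=()):
    """Evaluate: exists x1 forall y1 ... exists xk forall yk
    f(x1, y1, ..., xk, yk), by recursing one quantifier at a time.
    `f` takes 2k Booleans; `assignment` is the prefix chosen so far."""
    n = len(assignment)
    if n == 2 * k:
        return f(*assignment)
    # Even positions are existential (any), odd positions universal (all).
    branch = any if n % 2 == 0 else all
    return branch(qbf_value(f, k, assignment + (b,)) for b in (False, True))

# Example: exists x forall y (x or y) -- true, since x = True wins.
print(qbf_value(lambda x, y: x or y, 1))  # True
```

Compare this to the SAT case: the only change is that half the `any` branches become `all` branches, i.e. half the optimization now runs in the opposite direction.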

In zero-sum game theory, all optimization is in one of two completely opposite directions.  There is literally no difference between something that is good for one player and something that is bad for the other.  The opposing quantifiers \exists and \forall, representing decisions by the two opponents, are interleaved.

Different cognitive modes

The comparison to complexity classes suggests that there are two different cognitive modes for decision theory and zero-sum game theory, as there are two different types of algorithms for NP-like and PSPACE-like problems.

In decision theory, you plan with no regard to any opponents interfering with your plans, allowing you to plan on arbitrarily long time scales.  In zero-sum game theory, you plan on the assumption that your opponent will interfere with your plans (your \exists quantifiers are interleaved with your opponent’s \forall quantifiers), so you can only plan as far as your opponent lacks the ability to interfere with these plans.  You must have a short OODA loop, or your opponent’s interference will make your plans useless.

In decision theory, you can mostly run on naïve expected utility analysis: just do things that seem like they will work.  In zero-sum game theory, you must screen your plans for defensibility: they must be resistant to possible attacks.  Compare farming with border defense, mechanical engineering with computer security.

High-reliability engineering is an intermediate case: designs must be selected to work with high probability across a variety of conditions, but there is normally no intelligent optimization power working against the design.  One could think of nature as an “adversary” selecting some condition to test the design against, and represent this selection by a universal quantifier; however, this is qualitatively different from a true adversary, who applies intentional optimization to break a design rather than haphazard selection of conditions.


These two types of problems do not cover all realistic situations an agent might face.  Decision problems involving agents with different but not completely opposed objective functions are different, as are zero-sum games with more than two players.  But realistic situations share some properties with each of these, and I suspect that there might actually be a discrete distinction between cognitive modes for NP-like decision theory problems and PSPACE-like zero-sum games.

What’s the upshot?  If you want to know what is going on, one of the most important questions (perhaps the most important question) is: what kind of game are you playing?  Is your situation more like a decision theory problem or a zero-sum game?  To what extent is optimization by different agents going in the same direction, opposing directions, or orthogonal directions?  What would have to change for the nature of the game to change?

Thanks to Michael Vassar for drawing my attention to the distinction between decision theory and zero-sum game theory as a distinction between two cognitive modes.

Related: The Face of the Ice

In the presence of disinformation, collective epistemology requires local modeling

In Inadequacy and Modesty, Eliezer describes modest epistemology:

How likely is it that an entire country—one of the world’s most advanced countries—would forego trillions of dollars of real economic growth because their monetary controllers—not politicians, but appointees from the professional elite—were doing something so wrong that even a non-professional could tell? How likely is it that a non-professional could not just suspect that the Bank of Japan was doing something badly wrong, but be confident in that assessment?

Surely it would be more realistic to search for possible reasons why the Bank of Japan might not be as stupid as it seemed, as stupid as some econbloggers were claiming. Possibly Japan’s aging population made growth impossible. Possibly Japan’s massive outstanding government debt made even the slightest inflation too dangerous. Possibly we just aren’t thinking of the complicated reasoning going into the Bank of Japan’s decision.

Surely some humility is appropriate when criticizing the elite decision-makers governing the Bank of Japan. What if it’s you, and not the professional economists making these decisions, who have failed to grasp the relevant economic considerations?

I’ll refer to this genre of arguments as “modest epistemology.”

I see modest epistemology as attempting to defer to a canonical perspective: a way of making judgments that is a Schelling point for coordination. In this case, the Bank of Japan has more claim to canonicity than Eliezer does regarding claims about Japan’s economy. I think deferring to a canonical perspective is key to how modest epistemology functions and why people find it appealing.

In social groups such as effective altruism, canonicity is useful when it allows for better coordination. If everyone can agree that charity X is the best charity, then it is possible to punish those who do not donate to charity X. This is similar to law: if a legal court makes a judgment that is not overturned, that judgment must be obeyed by anyone who does not want to be punished. Similarly, in discourse, it is often useful to punish crackpots by requiring deference to a canonical scientific judgment.

It is natural that deferring to a canonical perspective would be psychologically appealing, since it offers a low likelihood of being punished for deviating while allowing deviants to be punished, creating a sense of unity and certainty.

An obstacle to canonical perspectives is that epistemology requires using local information. Suppose I saw Bob steal my wallet. I have information about whether he actually stole my wallet (namely, my observation of the theft) that no one else has. If I tell others that Bob stole my wallet, they might or might not believe me depending on how much they trust me, as there is some chance I am lying to them. Constructing a more canonical perspective (e.g. in a court of law) requires integrating this local information: for example, I might tell the judge that Bob stole my wallet, and my friends might vouch for my character.

If humanity formed a collective superintelligence that integrated local information into a canonical perspective at the speed of light using sensible rules (e.g. something similar to Bayesianism), then there would be little need to exploit local information except to transmit it to this collective superintelligence. Obviously, this hasn’t happened yet. Collective superintelligences made of humans must transmit information at the speed of human communication rather than the speed of light.

In addition to limits on communication speed, collective superintelligences made of humans have another difficulty: they must prevent and detect disinformation. People on the internet sometimes lie, as do people off the internet. Self-deception is effectively another form of deception, and is extremely common as explained in The Elephant in the Brain.

Mostly because of this, current collective superintelligences leave much to be desired. As Jordan Greenhall writes in this post:

Take a look at Syria. What exactly is happening? With just a little bit of looking, I’ve found at least six radically different and plausible narratives:

• Assad used poison gas on his people and the United States bombed his airbase in a measured response.

• Assad attacked a rebel base that was unexpectedly storing poison gas and Trump bombed his airbase for political reasons.

• The Deep State in the United States is responsible for a “false flag” use of poison gas in order to undermine the Trump Insurgency.

• The Russians are responsible for a “false flag” use of poison gas in order to undermine the Deep State.

• Putin and Trump collaborated on a “false flag” in order to distract from “Russiagate.”

• Someone else (China? Israel? Iran?) is responsible for a “false flag” for purposes unknown.

And, just to make sure we really grasp the level of non-sense:

• There was no poison gas attack, the “white helmets” are fake news for purposes unknown and everyone who is in a position to know is spinning their own version of events for their own purposes.

Think this last one is implausible? Are you sure? Are you sure you know the current limits of the war on sensemaking? Of sock puppets and cognitive hacking and weaponized memetics?

All I am certain of about Syria is that I really have no fucking idea what is going on. And that this state of affairs — this increasingly generalized condition of complete disorientation — is untenable.

We are in a collective condition of fog of war. Acting effectively under fog of war requires exploiting local information before it has been integrated into a canonical perspective. In military contexts, units must make decisions before contacting a central base using information and models only available to them. Syrians must decide whether to flee based on their own observations, observations of those they trust, and trustworthy local media. Americans making voting decisions based on Syria must decide which media sources they trust most, or actually visit Syria to gain additional info.

While I have mostly discussed differences in information between people, there are also differences in reasoning ability and willingness to use reason. Most people most of the time aren’t even modeling things for themselves, but are instead parroting socially acceptable opinions. The products of reasoning could perhaps be considered as a form of logical information and treated similar to other information.

In the past, I have found modest epistemology aesthetically appealing on the basis that sufficient coordination would lead to a single canonical perspective that you can increase your average accuracy by deferring to (as explained in this post). Since then, aesthetic intuitions have led me to instead think of the problem of collective epistemology as one of decentralized coordination: how can good-faith actors reason and act well as a collective superintelligence in conditions of fog of war, where deception is prevalent and creation of common knowledge is difficult? I find this framing of collective epistemology more beautiful than the idea of immediately deferring to a canonical perspective, and it is a better fit for the real world.

I haven’t completely thought through the implications of this framing (that would be impossible), but so far my thinking has suggested a number of heuristics for group epistemology:

  • Think for yourself. When your information sources are not already doing a good job of informing you, gathering your own information and forming your own models can improve your accuracy and tell you which information sources are most trustworthy. Outperforming experts often doesn’t require complex models or extraordinary insight; see this review of Superforecasting for a description of some of what good amateur forecasters do.
  • Share the products of your thinking. Where possible, share not only opinions but also the information or model that caused you to form the opinion. This allows others to verify and build on your information and models rather than just memorizing “X person believes Y”, resulting in more information transfer. For example, fact posts will generally be better for collective epistemology than a similar post with fewer facts; they will let readers form their own models based on the info and have higher confidence in these models.
  • Fact-check information people share by cross-checking it against other sources of information and models. The more this shared information is fact-checked, the more reliably true it will be. (When someone is wrong on the internet, this is actually a problem worth fixing).
  • Try to make information and models common knowledge among a group when possible, so they can be integrated into a canonical perspective. This allows the group to build on this, rather than having to re-derive or re-state it repeatedly. Contributing to a written canon that some group of people is expected to have read is a great way to do this.
  • When contributing to a canon, seek strong and clear evidence where possible. This can result in a question being definitively settled, which is great for the group’s ability to reliably get the right answer to the question, rather than having a range of “acceptable” answers that will be chosen from based on factors other than accuracy.
  • When taking actions (e.g. making bets), use local information available only to you or a small number of others, not only canonical information. For example, when picking organizations to support, use information you have about these organizations (e.g. information about the competence of people working at this charity) even if not everyone else has this info. (For a more obvious example to illustrate the principle: if I saw Bob steal my wallet, then it’s in my interest to guard my possessions more closely around Bob than I otherwise would, even if I can’t convince everyone that Bob stole my wallet).

Against unreasonably high standards

Consider the following procedure:

  1. Create unreasonably high standards that people are supposed to follow.
  2. Watch as people fail to meet them and thereby accumulate “debt”.
  3. Provide a way for people to discharge their debt by sacrificing their agency to some entity (concrete or abstract).

This is a common way to subjugate people and extract resources from them.  Some examples:

  • Christianity: Christianity defines many natural human emotions and actions as “sins” (i.e. things that accumulate debt), such that almost all Christians sin frequently.  Even those who follow all the rules have “original sin”.  Christianity allows people to discharge their debt by asking Jesus to bear their sins (thus becoming subservient to Jesus/God).
  • The Western education system: Western schools (and many non-Western schools) create unnatural standards of behavior that are hard for students to follow.  When students fail to meet these standards, they are told they deserve punishments including public humiliation and being poor as an adult.  School doesn’t give a way to fully discharge debts, leading to anxiety and depression in many students and former students, but people can partially discharge debt by admitting that they are in an important sense subservient to the education system (e.g. accepting domination from the more-educated boss in the workplace).
  • Effective altruism: The drowning child argument (promoted by effective altruists such as Peter Singer) argues that middle-class Americans have an obligation to sacrifice luxuries to save the lives of children in developing countries, or do something at least this effective (in practice, many effective altruists instead support animal welfare or existential risk organizations).  This is an unreasonably high standard; nearly no one actually sacrifices all their luxuries (living in poverty) to give away more money.  Effective altruism gives a way to discharge this debt: you can just donate 10% of your income to an effective charity (sacrificing some of your agency to it), or change your career to a more good-doing one.  (This doesn’t work for everyone, and many “hardcore EAs” continue to struggle with scrupulosity despite donating much more than 10% of their income or changing their career plans significantly, since they always could be doing more).
  • The rationalist community: I hesitate to write this section for a few reasons (specifically, it’s pretty close to home and is somewhat less clear given that some rationalists have usefully criticized some of the dynamics I’m complaining about).  But a subtext I see in the rationalist community says something like: “You’re biased so you’re likely to be wrong and make bad decisions that harm other people if you take actions in the world, and it’ll be your fault.  Also, the world is on fire and you’re one of the few people who knows about this, so it’s your responsibility to do something about it.  Luckily, you can discharge some of your debts by improving your own rationality, following the advice of high-level rationalists, and perhaps giving them money.”  That’s clearly an instance of this pattern; no one is unbiased, “high-level rationalists” included.  (It’s hard to say where exactly this subtext comes from, and I don’t think it’s anyone’s “fault”, but it definitely seems to exist; I’ve been affected by it myself, and I think it’s part of what causes akrasia in many rationalists.)

There are many more examples; I’m sure you can think of some.  Setting up a system like this has some effects:

  • Hypocrisy: Almost no one actually follows the standards, but they sometimes pretend they do.  Since standards are unreasonably high, they are enforced inconsistently, often against the most-vulnerable members of a group, while the less-vulnerable maintain the illusion that they are actually following the standards.
  • Self-violence: Buying into unreasonably high standards will make someone turn their mind against itself.  Their mind will split between the “righteous” part that is trying to follow and enforce the unreasonably high standards, and the “sinful” part that is covertly disobeying these standards in order to get what the mind actually wants (which is often in conflict with the standards).  Through neglect and self-violence, the “sinful” part of the mind develops into a shadow.  Self-hatred is a natural result of this process.
  • Distorted perception and cognition: The righteous part of the mind sometimes has trouble looking at ways in which the person is failing to meet standards (e.g. it will avoid looking at things that the person might be responsible for fixing).  Consciousness will dim when there’s risk of seeing that one is not meeting the standards (and sometimes also when there’s risk of seeing that others are not meeting the standards).  Concretely, one can imagine someone who gets lost surfing the internet to avoid facing some difficult work they’re supposed to do, or someone who avoids thinking about the ways in which their project is likely to fail.  Given the extent of the high standards and the debt that most people feel they are in, this will often lead to extremely distorted perception and cognition, such that coming out of it feels like waking from a dream.
  • Motivational problems: Working is one way to discharge debt, but working is less motivating if all products of your work go to debt-collectors rather than yourself.  The “sinful” part of the mind will resist work, as it expects to derive little benefit from it.
  • Fear: Accumulating lots of debt gives one the feeling that, at any time, debt-collectors could come and demand anything of you.  This causes the scrupulous to live in fear.  Sometimes, there isn’t even a concretely-identifiable entity they’re afraid of, but it’s clear that they’re afraid of something.

Systems involving unreasonably high standards could theoretically be justified if they were good coordination mechanisms.  But it seems implausible that they are.  Why not just make the de jure norms ones that people are actually likely to follow?  Surely a sufficient set of norms exists, since people are already following the de facto ones.  You can coordinate a lot without optimizing your coordination mechanism for putting everyone in debt!

I take the radical position that TAKING UNREASONABLY HIGH STANDARDS SERIOUSLY IS A REALLY BAD IDEA and ALL OF MY FRIENDS AND PERHAPS ALL HUMANS SHOULD STOP DOING IT.  Unreasonably high standards are responsible for a great deal of violence against life, epistemic problems, and horribleness in general.

(It’s important to distinguish having unreasonably high standards from having a preference ordering whose most-preferred state is impractical to attain; the second does not lead to the same problems unless there’s some way of obligating people to reach an unreasonably good state in the preference ordering.  Attaining a decent but non-maximally-preferred state should perhaps feel annoying or aesthetically displeasing, but not anxiety-inducing.)

My advice to the scrupulous: you are being scammed and you are giving your life away to scammers.  The debts that are part of this scam are fake, and you can safely ignore almost all of them since they won’t actually be enforced.  The best way to make the world better involves first refusing to be scammed, so that you can benefit from the products of your own labor (thereby developing intrinsic motivation to do useful things) instead of using them to pay imaginary debts, and so you can perceive the world accurately without fear.  You almost certainly have significant intrinsic motivation for helping others; you are more likely to successfully help them if your help comes from intrinsic motivation and abundance rather than fear and obligation.