“Self-Blackmail” and Alternatives

Ziz has been in the news lately. Instead of discussing that, I’ll discuss an early blog post, “Self-Blackmail”. This is a topic I also talked with Ziz about in person, although not a lot.

Let’s start with a very normal thing people do: make New Year’s resolutions. They might resolve that, for example, they will do strenuous exercise at least 2 times a week for the next year. Conventional wisdom is that these are not very effective.

Part of the problem is that breaking commitments even once cheapens the commitment: once you have “cheated” once, there’s less of a barrier to cheating in the future. So being sparing about these explicit commitments can make them more effective:

I once had a file I could write commitments in. If I ever failed to carry one out, I knew I’d forever lose the power of the file. It was a self-fulfilling prophecy. Any successful use of the file after failing would be proof that a single failure didn’t have the intended effect, so there’d be no extra incentive.

If you always fulfill the commitments, there is an extra incentive to fulfill additional commitments, namely, it can preserve the self-fulfilling prophecy that you always fulfill commitments. Here’s an example in my life: sometimes, when I have used addictive substances (e.g. nicotine), I have made a habit of tracking usage. I’m not trying to commit not to use them; rather, I’m trying to commit to tracking usage. This doesn’t feel hard to maintain, and it has benefits, such as noticing changes in the amount of substance consumed. And it’s in an area, addictive substances, where conventional wisdom is that human intuition is faulty and willpower is especially useful.

Ziz describes using this technique more extensively, in order to do more work:

I used it to make myself do more work. It split me into a commander who made the hard decisions beforehand, and commanded who did the suffering but had the comfort of knowing that if I just did the assigned work, the benevolent plans of a higher authority would unfold. As the commanded, responsibility to choose wisely was lifted from my shoulders. I could be a relatively shortsighted animal and things’d work out fine. It lasted about half a year until I put too much on it with too tight a deadline. Then I was cursed to be making hard decisions all the time. This seems to have improved my decisions, ultimately.

Compared to my “satisficer” usage of self-blackmail to track substance usage, this is more of a “maximizer” style where Ziz tries to get a lot of work out of it. This leads to more problems, because the technique relies on consistency, which is more achievable with light “satisficer” commitments.

There’s a deeper problem, though. Binding one’s future self is confused at a psychological and decision-theoretic level:

Good leadership is not something you can do only from afar. Hyperbolic discounting isn’t the only reason you can’t see/feel all the relevant concerns at all times. Binding all your ability to act to the concerns of the one subset of your goals manifested by one kind of timeslice of you is wasting potential, even if that’s an above-average kind of timeslice.

If you’re not feeling motivated to do what your thesis advisor told you to do, it may be because you only understand that your advisor (and maybe grad school) is bad for you and not worth it when it is directly and immediately your problem. This is what happened to me. But I classified it as procrastination out of “akrasia”.

Think back to the person who made a New Year’s resolution to strenuously exercise twice a week. This person may, in week 4, have the thought, “I made this commitment, and I really need to exercise today to make it, but I’m so busy, and tired. I don’t want to do this. But I said I would. It’s important. I want to keep the commitment that is in my long-term interest, not just do whatever seems right in the moment.” This is a self-conflicted psychological mode. Such self-conflict corresponds to decision-theoretic irrationality.

One type of irrationality is the aforementioned hyperbolic discounting; self-blackmail could, theoretically, be a way of correcting dynamic inconsistencies in time preference. However, as Ziz notes, there are also epistemic and computational problems: the self who committed to a New Year’s resolution has thought little about the implications, and lacks information relevant to the future decisions, such as how busy they will be over the year.
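
To make the dynamic-inconsistency point concrete, here is a minimal sketch in Python of how hyperbolic discounting produces preference reversals. The Mazur-style discount function is standard in the literature, but the discount rate and reward numbers below are made-up illustrations, not anything from Ziz’s post:

```python
# Minimal sketch: preference reversal under hyperbolic discounting.
# The discount rate k and the reward amounts/delays are illustrative
# assumptions, not empirical figures.

def hyperbolic_value(amount, delay_days, k=1.0):
    # Mazur-style hyperbolic discounting: perceived present value of a
    # reward that is `delay_days` away.
    return amount / (1 + k * delay_days)

small_soon = (50, 1)    # smaller reward, 1 day away
large_late = (100, 5)   # larger reward, 5 days away

for days_in_advance in (30, 0):
    v_small = hyperbolic_value(small_soon[0], small_soon[1] + days_in_advance)
    v_large = hyperbolic_value(large_late[0], large_late[1] + days_in_advance)
    winner = "large-late" if v_large > v_small else "small-soon"
    print(f"{days_in_advance:>2} days in advance: "
          f"small={v_small:.2f}, large={v_large:.2f} -> prefer {winner}")

# Output: evaluated 30 days in advance, the agent prefers the larger,
# later reward; evaluated at the moment of choice, the preference
# reverses. An exponential discounter (amount * d**delay) never
# reverses, which is why hyperbolic discounting is "dynamically
# inconsistent" and why commitment devices look tempting.
```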

A sometimes severe problem is that the self-conflicted psychological state can have a lot of difficulty balancing different considerations and recruiting the brain’s resources towards problem-solving. This is often experienced as “akrasia”. A commitment to, for example, a grad school program can generate akrasia, due to the self-conflict between the student’s feeling that they should finish the program and other considerations that could lead to not doing so, but which are suppressed from consideration because they seem un-virtuous. In psychology, this is sometimes known as “topdog vs. underdog”.

Personally, I have had the repeated experience of being excited about a project and working on it with others, but becoming demotivated over time and eventually quitting. This is expensive, in both time and money. At the time, I often have difficulty generating reasons why continuing to work on the project is a bad idea. But, usually, a year later, it’s very easy to come up with reasons why quitting was a good idea.

Ziz is glad that the self-blackmail technique ultimately failed. There are variations that have more potential sustainability, such as Beeminder:

These days there’s Beeminder. It’s a far better designed commitment mechanism. At the core of typical use is the same threat by self fulfilling prophecy. If you lie to Beeminder about having accomplished the thing you committed to, you either prove Beeminder has no power over you, or prove that lying to Beeminder will not break its power over you, which means it has no consequences, which means Beeminder has no power over you.

But Beeminder lets you buy back into its service.

It’s worse than a crutch, because it doesn’t just weaken you through lack of forced practice. You are practicing squashing down your capacity to act on “What do I want?, What do I have?, and How can I best use the latter to get the former?” in the moment. When you set your future self up to lose money if they don’t do what you say, you are practicing being blackmailed.

Beeminder is a method for staking money on completing certain goals. Since lying to Beeminder is psychologically harder than simply breaking a commitment you wrote to yourself, use of Beeminder can last longer than use of the original self-blackmail technique. Also, being able to buy back into the service makes a “reset” possible, which was not possible with the original technique.

Broadly, I agree with Ziz that self-blackmail techniques, and variations like Beeminder, are imprudent to use ambitiously. I think there are beneficial “satisficer” usages of these techniques, such as for tracking addictive substance usage; in these cases, one is not tempted to stack up big, hard-to-keep commitments.

What interests me more, though, are better ways to handle commitments in general, both commitments to the self and to others. I see a stronger case for explicit commitments with enforcement when dealing with other agents. For example, a contract to rent a car has terms signed by both parties, with potential legal enforcement for violating the terms.

This has obvious benefits. Even if you could theoretically get the benefits of car rental contracts with the ideal form of TDT spiritual love between moral agents, that’s computationally expensive at best. Contract law is a common part of successful mercantile cultures for a reason.

And, as with the original self-blackmail technique, there are potential self-fulfilling ways of keeping your word to another; you can be trusted more to fulfill commitments in the future if you have always fulfilled commitments made in the past. (Of course, always fulfilling commitments requires being sparing about making them.)

Let’s now consider, rather than inter-personal commitments, self-commitments. Consider alternatives to making a New Year’s resolution to exercise twice a week. Suppose you actually believe that you will do resistance training about twice a week for the next year. Then, perhaps it is prudent to invest in a home gym. Investing in the gym is, in a way, a “bet” about your future actions: it will turn out not to have been worth it if you rarely use it. Though it’s an unusual type of bet, in that its outcome is determined by your own future actions (thus potentially being influenced by self-fulfilling prophecies).

A more general formula: Instead of making a commitment from sheer force of will, think about the range of possible worlds where you actually fulfill the commitment. Think about what would be good decisions right now, conditional on fulfilling the commitment in the future. These are “bets” on fulfilling the commitment, and are often well thought of as “investments”. Now, ask two questions:

  1. If I take these initial steps, do I expect that I’ll fulfill the commitment?
  2. If I take these initial steps, and then fulfill the commitment, do I overall like the result, compared to the default alternative?

If the answers to both are “yes”, that suggests that the commitment-by-bet is overall prudent, compared with the default. (Of course, there are more possible actions if the answer to either question is “no”, including re-thinking the commitment or the initial steps, or going ahead with the initial steps anyway on expected value grounds.)
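
For what it’s worth, here is a minimal sketch in Python of this two-question check, with an expected-value fallback for the “no” cases. The numbers (gym cost, benefit of a year of training, follow-through probability) are hypothetical placeholders, and reducing question 1 to a probability threshold is my own simplification:

```python
# Minimal sketch of the "commitment-by-bet" check. All numbers are
# hypothetical placeholders.

def commitment_by_bet(upfront_cost, benefit_if_followed, p_follow_through,
                      default_value=0.0):
    # Question 1: taking these initial steps, do I expect to fulfill
    # the commitment? (Reduced here to a probability threshold.)
    expect_to_follow_through = p_follow_through > 0.5

    # Question 2: conditional on fulfilling the commitment, do I like
    # the result compared to the default alternative?
    value_if_followed = benefit_if_followed - upfront_cost
    better_than_default = value_if_followed > default_value

    # If either answer is "no", one option is still to go ahead on
    # expected-value grounds, averaging over both outcomes.
    expected_value = (p_follow_through * value_if_followed
                      + (1 - p_follow_through) * -upfront_cost)
    return expect_to_follow_through and better_than_default, expected_value

# E.g. a $1500 home gym, valued at $4000 if used for a year of regular
# training, with a 70% chance of actually following through:
prudent, ev = commitment_by_bet(1500, 4000, 0.7)
print(f"both answers yes: {prudent}; expected value: {ev:.0f}")
```

Note that the chance of not following through enters here only as a cost; as discussed below, my advised decision process counts an unused investment as a negative, not as a useful self-motivating punishment.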

The overall idea here is to look for natural decision-theoretic commitment opportunities. Investing in a home gym, for example, is a good idea for people who make some sorts of decisions in the future (like regular resistance training), and a bad idea for people who make different sorts of decisions in the future. It’s not an artificial mechanism like giving your stuff to a friend who only gives it back if you exercise enough. It’s a feature of the decision-theoretic landscape, where making certain decisions ahead of time is only prudent conditional on certain future actions.

Something hard to model here is the effect of such investments/bets on a person’s future actions through “self-fulfilling-prophecy” or “hyperstitional” means. For example, if you actually invest in a home gym, people, including you, will think of you as the sort of person who benefits from a home gym, that is, the sort of person who exercises regularly. Such a change to one’s self-image, and external image, can influence what it feels natural to do in the future.

To be clear, I’m not recommending making performative investments in things corresponding to what you would like to be doing in the future. Instead, I’m advising thinking through what would actually be a good investment conditional on the imagined future actions. For example, even if you are going to exercise regularly, it’s not clear that a home gym is a good investment: a gym membership may be a better idea. And it’s prudent to take into account the chance of not exercising in the future, making the investment useless: my advised decision process counts this as a negative, not a useful self-motivating punishment. The details will, of course, depend on the specific situation.

This sort of commitment-by-bet can be extended to inter-personal situations, to some degree. For example, suppose two people like the idea of living together long-term. They could, as an alternative to making promises to each other about this, think of bets/investments that would be a good idea conditional on living together long-term, such as getting a shared mortgage on a house. That’s more likely to be prudent conditional on them living together long-term. And the cost of not living together is denominated more materially and financially, rather than in broken promises.

To summarize: as an alternative to making explicit commitments that they will feel bound by in the future, I suggest people could consider locating commitment opportunities that are already out there, in the form of decisions that are only prudent conditional on some future actions; taking such an opportunity constitutes a “bet” or “investment” on taking those actions in the future. This overall seems more compatible with low levels of psychological self-conflict, which has broad benefits to the committer’s ability to un-confusedly model the world and act agentically.

9 thoughts on ““Self-Blackmail” and Alternatives”

  1. The paradigm here seems backwards. Options are valuable, so commitments are costs, so commitmentmaxxing is costmaxxing. External incentives are already enough pressure for the sorts of cooperation we want more of. But if there’s no Pareto-compatible incentive to do it, in what sense can we want it? For instance, if two people like the idea of living together long-term, you suggest they could “invest” in ways that make backing out costly, like getting a shared mortgage. But this is only a good investment if they’ll want to live together in the future. It’s a cost, not a benefit, that buying a house together requires a prediction about future decisions. If it wouldn’t have otherwise been helpful to buy a house together, and in the future they still want to live together, then what will stop them from doing so? And if not, what good will being stuck with a shared house do them?

    The only case I can see for preferring commitments is when this makes it easier to resist social pressure from a perverse culture that favors commitments over preferences. For instance, I face less (perceived) social pressure to stay out late or do similar things that I don’t like and never did, now that I “have to” be home with my kids at night. People with a mortgage payment can resist pressure to spend all their income because they “can’t” skip paying down the principal.

    Looking for ways to “invest” in one’s exercise plan in order to identify otherwise-neglected investment opportunities is reasonable, but doing so in order to make it costly to renege is perverse. Reading your post caused me to notice how much of our culture’s ambient advice about fitness resolutions and self-help in general is advice to hurt yourself on purpose.

    If exercise seems good prospectively, then you should expect it to seem good to you in the future when you’re deciding whether to exercise now, and on the occasions when it doesn’t you should expect there to be some reason for this. It’s a cost, not a benefit, that fitness equipment or a gym membership requires a prediction about your future decisions.

    If you expect (maybe based on past experience) that when it comes time to exercise your future self will avoid it for bad rather than good reasons, that’s information about a way in which you’re confused, which is more valuable to investigate and solve in the general case than hack for this one thing at the cost of going to war against yourself. Likewise if you notice anxious-avoidant attachment patterns and predict you’ll flee an otherwise good relationship; you fix this by resolving the confusions causing the anxious-avoidant attachment patterns, not by buying a house to trap yourself in a relationship you’re terrified of.

    This is probably the biggest immediately-usable information I got from reading through Spinoza’s Ethics: I stopped thinking of self-help in terms of a progression from “My behavior does not match my idea of what good behavior is” to “I process this as prediction error and cybernetically regulate my behavior to match my idea of good behavior”, and started thinking of it as a progression from “My behavior does not match my idea of what good behavior is” to “I have motives unaccounted for in my idea of good behavior” to investigating those motives, and figuring things out that (in whole or part) resolve the conflict one way or another.

    Sometimes you’re too stupid to figure yourself out fast enough, and forcing is an available option. But then you should be able to force yourself next time if it’s still a good idea – and if you keep having to do that, it suggests it should have been worth the overhead to solve the problem in the first place.

    I think that if I could send a message to my past self that would only have the effect of getting him not to exercise when he didn’t feel like it, his health at my age would be better, not worse. I wish I’d invested more, and known how to invest more, earlier, in learning what different kinds and levels of exertion felt like and how they made me feel the next hour, the next day, and the next week.

    When we’re cooperating with sufficiently identical autonomous agents (e.g. our future and past selves) it makes more sense to think in terms of decisions than commitments. If agents who aren’t identical enough (or – which might be the same thing – don’t have enough common knowledge of identity) to be making decisions together share a legal system and see an opportunity in shared investment, the obvious way to structure the contract is to insure each other; if I don’t fulfill my part of the deal, I compensate you for the loss you incurred. The main exception is where it’s not possible (or undesirably expensive) to compensate the other party fully, but incomplete compensation can function as a deterrent. For instance, punishments for murder are generally insufficient to reanimate the murdered party, which is why murder is usually frowned on even if the perpetrator freely confesses and willingly accepts imprisonment, while taking a sandwich from a deli is considered acceptable as long as you pay for it. If and only if we can’t afford to insure each other (or want to free up the capital we’d have reserved for that, for other purposes), it makes sense to try to size penalties to make the incentive they create for the penalized party adequate to ensure fulfillment of the contract.

    For relationships with large shared but potentially asymmetric commitments like raising children, it can make sense to specify compensation for one party exiting the arrangement. If you don’t think you can avail yourself of an enforceable well-specified contract, commitment devices like buying a house together can sometimes be a very imperfect substitute.

    1. Not sure if you realized this, but I think I addressed whether making it hard to back out is a negative or a positive?

      And it’s prudent to take into account the chance of not exercising in the future, making the investment useless: my advised decision process counts this as a negative, not a useful self-motivating punishment. The details will, of course, depend on the specific situation.

      That is, it seems I mostly agree with you, but it doesn’t seem like you think you mostly agree with me, so there is something here to disagree about.

      According to the “just optimize” mindset, options are in general good. Sometimes people have the feeling that they are not acting optimally when they are doing a sort of behavior that looks, on the surface, like “maximizing optionality”. That feeling is somewhat correct: “failure to make good investments” is one way the surface-level “maximizing optionality” behavior is often sub-optimal.

      I think you’re talking about using contract law for a lot of things, and I would agree that contract law works for some purposes, as I talked about.

      Perhaps one thing that seems strange to you about my framing is that I am starting with a confused intention (that is, an intention to make a promise to do something in the future) and then thinking of something similar that is more decision-theoretic, like looking for investment opportunities. Perhaps directly introspecting on the confusion, rather than looking for an alternative path of action to the confused one, is going to work better.

      I think the reason why I am talking about this is that there are both feelings in favor of and against making “new year’s resolution” type commitments, and it doesn’t seem like simply saying “the feelings against making commitments are right, the feelings in favor are wrong” makes very much progress. So I am trying to think of a way of acting that is not confused, which is looking for investment opportunities, which is a natural sort of commitment. I think someone acting pretty much rationally would do this, and that this sort of behavior could be approved of by both the “in favor of making commitments” and “against making commitments” type feelings. (Which doesn’t mean trying to act more like this on the margin is going to help; that’s mostly a guess about what is advisable, about what people might make progress by trying.)

      1. Commitment-minimization in the name of optionality can be irrational – all decisions, even the best ones, have an opportunity cost. My objection isn’t to making commitments when they’re helpful (like buying exercise equipment that will make exercise more convenient or effective), but to seeking ways to make backing out costly as an end in itself.

        If you find yourself trying to solve the problem “commit to exercise in the future” rather than “exercise in the future” or just “be stronger / feel better in the future,” something funny is going on. Investigating that paradox – understanding why you feel you need to bind yourself – seems way more important than successfully doing so, even in a way that looks less like intentional self-harm and is easier to defend as though it were a rational solution to a constrained-optimization problem.

      2. (can’t reply to benquo’s reply to me so I reply here)

        I agree. I think I was confused because it seemed like you were disagreeing with the original post. That is, you said something about the framing being backwards, so I interpreted it as you thinking that the framing of the original post is backwards. So I was defending the original post against such a criticism. But it seems like we agree.

      3. Experimenting with more commitments does seem like a valid exploration strategy if you’re stuck irrationally commitment-averse and can’t think of anything better-targeted to do. I’m proposing a more precise response.

      4. I didn’t notice myself disagreeing with any of the object-level judgments you were making, just the paradigm within which you were making them. You’re not getting wrong answers to the questions you’re asking, but asking these questions rather than other questions implies a mistake that I’m trying to address directly.

      5. Ah. I think there are personal and interpersonal cases where I am probably making bad decisions with respect to commitment:

        i. Sometimes it seems like I am lazy and not working on something. It might look from the outside like depressive behavior. Maybe my behavior is well-explained by irrationally high time preference. It’s not clear that the behavior in question is irrational, but the combination of that behavior and my feeling bad about it (which is caused by things like the causes of the behavior not being transparent) is irrational.
        ii. Sometimes I have expressed an intention to work on a project. Then later I don’t feel like working on it, but I don’t have transparent access to why. My current guess is that this is much less of a problem if I am an *owner* of the project (investor, director) than an employee. And I think that has to do with my modeling the commitment as an “investment-type commitment”, not a “promise-type commitment”.
        iii. Relatedly, people around me sometimes ask whether I am “committed” to a project, or worry about whether I am. It seems like I want to be able to explain to them a model of commitment that is plausibly trying to be rational. Then there is a more productive discussion than me having to engage with the confused frame where I’m “committed” to a project and we aren’t talking about specific contracts. It might be that they want clarity about what I expect to do in the future, and that looking at what bets I am making about my future actions is one way to get that clarity. (Or maybe they are doing something more confused/malicious than that.)
        iv. It seems like there are arenas, like sex and reproduction, where people normalize “commitment”, and the normal frame is really clearly confused, but also clearly has at least some functions. There are DIY alternatives to the normal methods, like thinking through specific contracts, which might be worth doing, though they are non-standard. I think figuring out which commitment-like things are more compatible with general rationality, and with not getting into agency-sapping situations, is pretty useful.

        So, there are a number of areas where the normal thing is “aha, here we use a commitment!” and my response is like, “by default this is basically black magic and is going to blow up, right? but, I want to figure out what you’re trying to get here, and think about something that has some of the functions you’re trying to get, but isn’t black magic that is going to blow up.”

      6. These cases make more sense to me as a motive for asking these sorts of questions; thanks for explaining. Often – most clearly, I think, in the cases iii and iv you describe – we’re presented with social constructs, such as various forms of (frequently reified) commitment, as a proxy for something they are supposed to produce. Often proxy X closely resembles a good standardized solution to a constrained-optimization problem for desideratum Y.

        For instance, many of the expectations around marriage correspond to behaviors that make sense for collaborating to make children, but most people talk about marriage as though it were literally how babies are made, even when they’re clearly aware of exceptions.

        I think you might profitably model this sort of presentation as caused by some combination of the following factors:

        (A) An expression of an orientation towards (and thus a functional preference for) coercion, e.g. social conservatives censorious towards those who have sex outside of marriage.

        (B) Acceptance of ritualized or stereotyped expectations from people who are accepting proxy X as a stage 3 simulacrum or master signifier, e.g. people with feelings about “marriage” and “wedding” rather than about the person they’re marrying, or about their expectations of reproductive success and of their mate’s parental investment.

        (C) People who consciously want desideratum Y, and in more or less bad faith accept proxy X as magically necessary to produce desideratum Y, e.g. supposed liberals who can’t seem to remember that I’m not married or in some cases outright contradict me and insist I am married, because I’m living with my reproductive partner and our children and we clearly intend to cooperate over a long time on the sort of high shared investment parenting marriage is supposed to enable.

        (D) Naïve people who assume that the language used by A, B, and C is simply an imprecise way of describing a solution to a constrained optimization problem, and use the same language because they think that makes them easier to understand.

        This is important information about your counterparty! If you’re trying to figure out how to align incentives and make mutual intelligibility easier for a trade relationship, and they’re trying to figure out how to arrange to be co-enslaved, or trying to use ritual magic to achieve the shared goal, or just trying to earn magical value tokens, then if you’re not explicitly modeling the deep translation problems involved, at least one party is going to be badly disappointed at the result.

        When dealing with type A, B, or C counterparties, I’ve found it’s often worth being strategic – with a hospital bureaucrat who calls my partner my wife, I’ll just let it slide if I don’t expect any practical problems. But with someone I’m hoping for high-integrity mutually intelligible interactions with, I’ll correct the misunderstanding and carefully assess their response, as this tells me a lot about what sort of behavior I can count on from them. Type As are often worth actively avoiding even for interactions where they’re ostensibly bound by rules to be helpful.

        I suspect that i and ii are at least in part due to internal distortions to limit damage caused by failing to model the adversarial component in iii and iv. When faced with social systems that try to extract commitments through confused framing, it’s often better to halt your participation than to try to navigate them while unseeing the coercion. But best to see the coercion, think about it, and decide what to do.

      7. Oh, I think I see what you mean. I’m doing something like a first-order mistake-theoretic analysis, which is probably better than getting confounded by not having done that analysis, but you’re saying that by distinguishing different types of counterparties who in some ways imitate each other, there is a better response that explicitly models the coercion. That makes a lot of sense, thanks!
