War Doves, War Norms, War Moms

Over two years ago, I started writing a lot about the emerging pandemic. That crisis unfolded with a quaint stateliness and simplicity compared to the situation in Ukraine. (I also had a personal perspective formed by earlier writing about pandemics and work and travel around China so I wrote sooner and more often about that topic.) While the pandemic hit different populations in somewhat similar ways across the globe depending on infrastructure, medical care, policy, social beliefs, and more, the situation in Ukraine is different.

There are different camps of support, countries will be impacted differently by changing commodities costs and social preferences, and the military situation is still a question. But speed is one notable characteristic.

War Doves

People are again making a big deal out of Pope Francis’ 2014 dove-release-for-Ukraine-peace gone wrong.

“At the Vatican, Pope Francis called for an end to violence in the Ukraine before releasing two white doves as a symbol of peace. Moments later, a black crow and a seagull attacked the doves in front of the horrified crowd.”


Narrative Capture

Before we talk about narrative capture, let’s look at capture of another type.

Regulatory capture

Regulatory capture involves situations where a regulator ends up serving the interests of an industry, specific company, or other group. The people who are supposed to be making the rules end up following the lead of the very groups that they are supposed to be regulating.

Sometimes this is intentionally planned and financially supported and sometimes it just happens because of system design.

For a glimpse of thinking about regulatory capture during the late 1800s attempt to regulate railroads in the US, we have this attorney’s letter to a railroad president:

“My impressions would be that, looking at the matter from a railroad point of view exclusively, [repeal of the Interstate Commerce Act] would not be a wise thing to undertake…. The attempt would not be likely to succeed; if it did not succeed, and were made on the ground of the inefficiency and uselessness of the Commission, the result would very probably be giving it the power it now lacks. The Commission, as its functions have now been limited by the courts, is, or can be made, of great use to the railroads. It satisfies the popular clamor for a government supervision of railroads, at the same time that that supervision is almost entirely nominal. Further, the older such a Commission gets to be, the more inclined it will be found to take the business and railroad view of things…. The part of wisdom is not to destroy the Commission, but to utilize it.” (Richard Olney, letter to Charles E. Perkins, 1892)

Or, as I’ve heard someone say, “I like having a big board to report to because they never get anything done.”

A New Morality of Attainment (Goodhart’s Law)

Peter Drucker said “if you can’t measure it you can’t improve it,” but he didn’t mention the second-order effects of that statement. What changes after people get used to the measurements? What if we measure things that are only partly relevant to what we’re trying to improve?

Tracking metrics can tell us something new, but can also create problems. Let’s look at how Goodhart’s Law leads to unintended consequences.

A New Morality of Attainment

Goodhart’s original quote, about monetary policy in the UK (of all things) was:

“Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”

But Goodhart’s original is often reinterpreted so that we can talk about more than economics. Anthropologist Marilyn Strathern, in “Improving Ratings: Audit in the British University System,” summarized Goodhart’s Law as:

“When a measure becomes a target, it ceases to be a good measure.”

This is the version of the law that most people use today.

Other commonly used variations include the Lucas Critique (from Robert Lucas’s work on macroeconomic policy):

“Given that the structure of an econometric model consists of optimal decision rules of economic agents, and that optimal decision rules vary systematically with changes in the structure of series relevant to the decision maker, it follows that any change in policy will systematically alter the structure of econometric models.”

And also Campbell’s Law:

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”

I’m going to stick with Strathern’s description because it’s simpler and more people seem to know it.

Strathern wrote about the emergence, in Cambridge in the mid 1700s, of written and oral exams as a way to rate university students.

Those students’ results were supposed to show how well the students had learned the subject matter, but they also showed how well the professors taught and the quality of the university.

But to determine how well students, professors, and universities had performed, the exams couldn’t be graded in traditional, qualitative ways. They needed a way to rank the students.

“This culminated in 1792 in a proposal that all answers be marked numerically, so that… the best candidate will be declared Number One… The idea of an examination as the formal testing of human activity joined with quantification (that is, a numerical summary of attainment) and with writing, which meant that results were permanently available for inspection. With measurement came a new morality of attainment. If human performance could be measured, then targets could be set and aimed for.”

Strathern also described the difficulties of the rankings.

“When a measure becomes a target, it ceases to be a good measure. The more examination performance becomes an expectation, the poorer it becomes as a discriminator of individual performances. [T]argets that seem measurable become enticing tools for improvement…. This was articulated in Britain for the first time around 1800 as ‘the awful idea of accountability’….”

Again from Strathern’s paper:

“Education finds itself drawn into the rather bloated phenomenon I am calling the audit culture… The enhanced auditing of performance returns not to the process of examining students, then, but to other parts of the system altogether. What now are to be subject to ‘examination’ are the institutions themselves—to put it briefly, not the candidates’ performance but the provision that is made for getting the candidates to that point. Institutions are rendered accountable for the quality of their provision.

“This applies with particular directness in Teaching Quality Assessment (TQA), which scrutinizes the effectiveness of teaching—that is, the procedures the institution has in place for teaching and examining, assessed on a department by department basis within the university’s overall provision…. TQA focuses on the means by which students are taught and thus on the outcome of teaching in terms of its organization and practice, rather than the outcome in terms of students’ knowledge. The Research Assessment Exercise (RAE), on the other hand… specifically rates research outcome as a scholarly product. Yet here, too, means are also acknowledged. Good research is supposed to come out of a good ‘research culture’. If that sounds a bit like candidates getting marks for bringing their pencils into the exam, or being penalized for the examination room being stuffy, it is a reminder that, at the end of the day, it is the institution as such that is under scrutiny. Quality of research is conflated with quality of research department (or centre). 1792 all over again!”

Measuring the quality of teaching (or quality of research) rather than what students learn (or research results) is an odd outcome. But it makes sense. Adherence to a process seems related to the goals, so the process becomes what’s measured.

John Gall, a popular systems writer, also outlined something like this in his book Systemantics. Here he describes how a university researcher gets pulled into metrics-driven work.

“[The department head] fires off a memo to the staff… requiring them to submit to him, in triplicate, by Monday next, statements of their Goals and Objectives….

“Trillium [the scientist who has to respond] goes into a depression just thinking about it.

“Furthermore, he cannot afford to state his true goals [he just likes studying plants]. He must at all costs avoid giving the impression of an ineffective putterer or a dilettante…. His goals must be well-defined, crisply stated, and must appear to lead somewhere important. They must imply activity in areas that tend to throw reflected glory on the Department.”

Universities may have started the focus on metrics, but today we see metrics used all the time. Here are some examples of metrics used to achieve goals and the problems they created.

Some Inappropriate Metrics

Robert McNamara and the Vietnam War. Vietnam War-era Secretary of Defense Robert McNamara was one of the “Whiz Kids” in Statistical Control, a management science operation in the military. McNamara went on to work for Ford Motor Company and became its president, only to resign shortly afterward when Kennedy asked him to become Secretary of Defense.

In his new role, McNamara brought a statistician’s mind to the Vietnam War, with disastrous results. People on his team presented skewed data, put them in models that told a desired story, and couldn’t assess more qualitative issues like willingness of the Viet Cong and the US to fight. The focus on numbers even when they don’t tell the full story is called the McNamara Fallacy.

The documentary The Fog of War presents 11 lessons about what McNamara learned as a senior government official. The numbered list of lessons:

1) Empathize with your enemy; 2) Rationality alone will not save us; 3) There’s something beyond one’s self; 4) Maximize efficiency; 5) Proportionality should be a guideline in war; 6) Get the data; 7) Belief and seeing are both often wrong; 8) Be prepared to reexamine your reasoning; 9) In order to do good, you may have to engage in evil; 10) Never say never; 11) You can’t change human nature.

Many of those 11 lessons deal with issues of the McNamara Fallacy, including at least numbers 1, 2, 4, 6, 7, and 8.

Easy, rather than meaningful, measurements. If we need to measure something as a step toward our goals, we might choose what is easier to measure instead of what might be more helpful. Examples: a startup ecosystem tracking startup funding rounds (often publicly shared) rather than startup success (which takes years, with results that are often private). Or the unemployment rate, which tracks how many people looking for work have not found it, rather than how many have given up looking or have taken poorly paid jobs.

Vanity metrics. In startups we often talk about “vanity metrics” as being ones that look good but aren’t helpful. A vanity metric would be growth in users (rather than accompanying metrics around revenue or retention) or website visits when those visits may come from expensive ad buys.

For example, when Groupon was getting ready to IPO, they quickly hired thousands of sales people in China. The purpose was to increase their valuation at IPO, not to bring in more revenue in a new market. When potential investors saw that Groupon had a large China team, they thought the company would succeed.

Vanity metrics can occur anywhere. When some police departments started to track crime statistics, they also started to underreport certain crimes since the police were judged based on how many crimes occurred.

Surrogate metrics in health care. There’s a trade-off when measuring efficacy of a medicine. How long do we wait to prove results? If there are proxies for knowing whether a patient seems to be on the path to recovery, when do we choose the proxy rather than the actual outcome? As described in Time to Review the Role of Surrogate End Points in Health Policy:

“The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA), have a long tradition of licensing technologies solely on the basis of evidence of their effects on biomarkers or intermediate end points that act as so-called surrogate end points. The role of surrogates is becoming increasingly important in the context of programs initiated by the FDA and the EMA to offer accelerated approval to promising new medicines. The key rationale for the use of a surrogate end point is to predict the benefits of treatment in the absence of data on patient-relevant final outcomes. Evidence from surrogate end points may not only expedite the regulatory approval of new health technologies but also inform coverage and reimbursement decisions.”

Nudging. Nudges are little encouragements used by governments and businesses to change what individuals choose to do. Sometimes the nudge comes in the form of information about what other people do. For example, in the UK, tax compliance increased when people received a letter stating that “9 out of 10 people in your area are up to date with tax payments.”

But what if the goals of the nudges don’t consider other outcomes?

In “The Power of Suggestion: Inertia in 401(k) Participation and Savings Behavior” the authors show how changing default choices in employee retirement decisions resulted in more people choosing to save, but also resulted in more people continuing with default conservative money market investment choices.

The nudge achieved its goal of higher employee savings participation but also created a situation where more employees gave up the higher returns they could have had from long-term equity investing.

Artistic and sports performances. Judged artistic competitions are scored in different ways and judging criteria sometimes change. Tim Ferriss realized that tango dance competitions ranked turns highly, so he did lots of turns and won despite being a relative novice compared with the other competitors. Something similar happened when Olympic skating changed to value the technical difficulty of each component of the skaters’ performance. As a result, skaters check off more technical moves and include fewer of the subjective artistic ones, which can leave performances less beautiful to watch.

Alpha Chimp. Metrics are a modern invention, but there are versions of them in other societies. In Jane Goodall’s book In the Shadow of Man, we learn of a low-ranking chimpanzee “Mike” who suddenly became alpha male. The top-ranked male was often the toughest chimpanzee in the group, a position backed up by size, intimidation, and what the rest of the group accepted. Mike was at the bottom of the adult male hierarchy (attacked by almost all the other males, last access to bananas). But Mike realized that he could use some empty oil cans that Goodall had left at her campsite in a new way. He ran through the group of chimpanzees banging the cans together. The other chimpanzees had never heard noise like that before and scattered. Maybe Mike proved that the top-ranked position, which should be based on who best leads the group, was actually based on who was scariest.

Four and More Types of Goodhart

One of the best papers digging into variations of the metric and goal problem is Categorizing Variants of Goodhart’s Law, by David Manheim and Scott Garrabrant. Their paper outlines four types of Goodhart’s Law and why they happen.

Regressional. When selecting a metric also selects a lot of noise. Example: choosing to do whatever the winners of “person of the year” or “best company” awards did. You might not see that the person was chosen to send a political message or that the company was manipulating numbers and will fall next year.
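The regressional case can be illustrated with a small simulation. This is only a sketch under assumptions that are not in Manheim and Garrabrant's paper: each candidate's metric is modeled as true quality plus independent noise, both drawn from a standard normal distribution, and the function name and parameters are made up for illustration.

```python
import random
import statistics


def simulate_regressional_goodhart(n_candidates=10_000, n_selected=100, seed=0):
    """Pick the top candidates by a noisy metric and compare metric vs. true quality.

    Assumption (illustrative only): metric = true quality + independent noise.
    """
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_candidates):
        true_quality = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Store the metric first so sorting ranks candidates by the metric.
        candidates.append((true_quality + noise, true_quality))

    # Select the "award winners": the highest metric scores.
    top = sorted(candidates, reverse=True)[:n_selected]

    mean_metric_of_top = statistics.fmean(metric for metric, _ in top)
    mean_quality_of_top = statistics.fmean(quality for _, quality in top)
    mean_quality_overall = statistics.fmean(quality for _, quality in candidates)
    return mean_metric_of_top, mean_quality_of_top, mean_quality_overall


if __name__ == "__main__":
    metric, quality, overall = simulate_regressional_goodhart()
    print(f"winners' average metric:       {metric:.2f}")
    print(f"winners' average true quality: {quality:.2f}")
    print(f"everyone's average quality:    {overall:.2f}")
```

In this toy setup the winners are better than average, but their metric overstates how much better, because picking the highest scores also picks the luckiest noise.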

Extremal. This comes from out-of-sample projections. When our initial information is within a specific boundary we may still want to project what could happen out of the boundary. In those extreme cases, the relationship between the metric and the goal may break down.

Causal. Where the regulator (the intermediary between the proxy metric and the goal) causes the problem. For example, when pain became the fifth vital sign, doctors were measured by their ability to make their patients more comfortable. If doctors start to prescribe pain medication too often or too easily they may increase addiction.

Adversarial. Where agents have different goals from the regulator and find a loophole that harms the goal. For example, colonial powers wanting to decrease the number of cobras in India or rats in Vietnam and paying a bounty for dead cobras or rat tails. People discovered that they could raise their own cobras to kill or cut off rat tails and release the rats. This is known as the Cobra Effect.

Beyond Manheim and Garrabrant’s four types, others argue that the law itself takes different forms.

Right vs Wrong. Noah Smith splits Goodhart’s Law into wrong and right versions:

“The ‘law’ actually comes in several forms, one of which seems clearly wrong, one of which seems clearly right. Here’s the wrong one:

“Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.

“That’s obviously false. An easy counterexample is the negative correlation between hand-washing and communicable disease. Before the government made laws to encourage hand-washing by food preparation workers (and to teach hand-washing in schools), there was a clear negative correlation between frequency of hand-washing in a population and the incidence of communicable disease in that population. Now the government has placed pressure on that regularity for control purposes, and the correlation still holds…”

And here’s Smith’s version of Goodhart’s Law that seems true to him:

“As soon as the government attempts to regulate any particular set of financial assets, these become unreliable as indicators of economic trends.

“This seems obviously true if you define ‘economic trends’ to mean economic factors other than the government’s actions. In fact, you don’t even need any kind of forward-looking expectations for this to be true; all you need is for the policy to be effective.”

My Summary. I summarize Goodhart’s Law as coming from two places. One is the behavior change that occurs when people start trying to achieve a metric rather than a goal. The other is the problem of metrics being imperfect proxies for goals.

Post Goodhart

There are many cases of metrics being poor proxies for a goal. There are also many cases of people changing their behavior to meet metric targets rather than goals.

More awareness of Goodhart’s Law should hopefully lead to fewer cases of it, though maybe I’m too optimistic.

We also have many examples of the way we work, live, and play that are without measurement. We often measure haphazardly but survive or measure nothing and also survive.

Or are we actually making some subconscious measurements that we don’t recognize? If so, could those subconscious measurements be beyond Goodhart-style effects?

Tracking metrics can fool us into seeing relationships that don’t exist.

Here are some ways we can try to avoid Goodhart’s Law:

  • Be careful with situations where our morals skew the metrics we track. In Morals of the Moment I wrote about how bad metric choices led to bad outcomes in forest fires, college enrollment, and biased hiring.
  • Check for signs of vanity metrics (metrics that can only improve or which have weak ties to outcomes).
  • Check whether we’re too process-focused (audit culture) rather than outcomes-focused.
  • Regularly update metrics (and goals) as we see behavior change or find better ways to track progress toward a goal.
  • Keep some metrics secret to avoid “self-defeating prophecies.”
  • Use “counter metrics,” a concept from Julie Zhuo. “For each success metric, come up with a good counter metric that would convince you that you’re not simply plugging one hole with another. (For example, a common counter metric for measuring an increase in production is also measuring the quality of each thing produced.)”
  • Allow qualitative judgements that fall beyond numbers. This creates unintended consequences of its own, but can be a good check on our efforts. This is the difference between what Daniel Kahneman calls system one (quick intuition) and system two (slow, analytical, rational thinking). Is there room to act when a situation doesn’t feel right?
  • Be more mindful where our choice of metric can impact many people (metrics that change outcomes for large groups should be studied carefully).

That we even have Goodhart’s Law is a symptom of a more complex and connected society. In a more localized world there wouldn’t be as much of a problem with tracking metrics to achieve goals, either because our impact would be local or because we just wouldn’t track things at all.

We also might not have an incentive to learn from Goodhart’s Law. Why try to change the way we set metrics if we are not the ones penalized? Why care about skewed outcomes if our timescale of measurement is short and Goodhart outcomes take a longer time?

Three Wagers

It’s easy to write or talk about an issue without staking anything on its outcome. After all, that’s what casual forecasters do all the time. But having something in play — reputation or money or something else — can make sure that people remember their claims. This can keep us honest.

There is obviously a whole set of games that typically include betting. And there are many famous bets, from Ashley Revell gambling his life savings on one spin of the roulette wheel, to John Gutfreund and John Meriwether’s proposed but then aborted $10 million bet on a single hand of liar’s poker, to the dice bet over the sailors’ lives in Rime of the Ancient Mariner.

The following three examples are a bit different. The people making these wagers are trying to drive research, prove their model of the world is correct, or use logic to guide decision-making. It’s something we might consider when we make our own wagers.

‘Oumuamua’s Wager

Purpose: drive research in a specific area. “It’s good for us.”

In 2017 a strangely shaped object moved through the solar system. It was named ‘Oumuamua, a Hawaiian word for “scout.”

‘Oumuamua was the first observed object passing through the solar system from elsewhere. Its strange shape (possibly like a giant cigar or pancake) led to speculation that it was alien in origin, including that ‘Oumuamua may be an alien-made solar sail or space junk.

These claims came from Avi Loeb, a professor of astronomy at Harvard who calls his speculation “‘Oumuamua’s Wager.”

But most astronomers are dismissive of the alien technology theory. Related to that, Loeb outlines the difficulty of setting a new direction for research. “And in terms of risk, in science, we are supposed to put everything on the table. We cannot just avoid certain ideas because we worry about the consequences of discussing them, because there is great risk in that, too. That would be similar to telling Galileo not to speak about Earth moving around the sun and to avoid looking in his telescope because it was dangerous to the philosophy of the day…. In the context of ‘Oumuamua, I say the available evidence suggests this particular object is artificial, and the way to test this is to find more [examples] of the same and examine them. It’s as simple as that.”

Loeb claims that believing ‘Oumuamua is alien in origin would be a net good because the invigorated search for alien life or technology would drive many other parts of scientific inquiry. In doing so, we would learn much more about the universe than otherwise.

Simon-Ehrlich Wager

Purpose: back up one’s theories. “Let’s prove who’s right.”

The Simon-Ehrlich Wager grew out of the doomsday writing of Paul Ehrlich, author of The Population Bomb, a 1968 book that forecast overpopulation would lead to global famines and resource shortages in the 1970s and 1980s.

Taking the other side of Ehrlich’s claims was Julian Simon, a professor of business. Simon believed that the world would not face the extreme shortages that Ehrlich forecast. The question was, how to gauge whether there was a change in resource availability?

As a solution, Simon proposed that Ehrlich choose any raw materials he wanted and a future date. Simon would win the wager if the prices of those items had decreased by that time. Ehrlich chose copper, chromium, nickel, tin, and tungsten and a date 10 years in the future (September 29, 1990).

The wager’s payoff was to be the difference between the $1000 of materials and the future prices. Ehrlich lost and mailed a check to Simon for $576.07.
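The payoff rule can be written out as simple arithmetic. This sketch uses only the two figures given above (the $1000 basket and the $576.07 check); the function name and sign convention are assumptions for illustration, and any settlement details the text does not describe are ignored.

```python
from decimal import Decimal

INITIAL_BASKET_USD = Decimal("1000.00")  # value of the materials when the wager was made
CHECK_USD = Decimal("576.07")            # amount Ehrlich mailed to Simon


def wager_payoff(initial_value, final_value):
    """Return the amount owed to Simon (assumed convention).

    Positive: prices fell, so Ehrlich pays Simon.
    Negative: prices rose, so Simon would pay Ehrlich.
    """
    return initial_value - final_value


# The check implies what the same basket was worth at the end of the wager.
implied_final_value = INITIAL_BASKET_USD - CHECK_USD

if __name__ == "__main__":
    print(f"implied final basket value: ${implied_final_value}")
    print(f"payoff to Simon: ${wager_payoff(INITIAL_BASKET_USD, implied_final_value)}")
```

Read this way, the check says the basket that cost $1000 at the start was valued at $423.93 when the wager ended.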

Personally, I like that it was actually a business professor that won this one.

But notably, Ehrlich did not include anything other than the check and seemed to be bitter about the loss.

Since Simon was willing to wager again, Ehrlich (and climatologist Stephen Schneider) proposed a new wager — a set of 15 trends, including average temperature, emissions, oceanic harvests, availability of firewood in developing nations, and more.

But Simon passed on the new proposed wagers. As he explained:

“Let me characterize their offer as follows. I predict, and this is for real, that the average performances in the next Olympics will be better than those in the last Olympics. On average, the performances have gotten better, Olympics to Olympics, for a variety of reasons. What Ehrlich and others says is that they don’t want to bet on athletic performances, they want to bet on the conditions of the track, or the weather, or the officials, or any other such indirect measure.”

Pascal’s Wager

Purpose: attribute risk to outcomes and choose the best path. “It benefits me.”

Saying this is about personal benefit may be a bit odd since the wager weighs outcomes given the existence or absence of God.

There are only four outcomes of this wager:

  • God exists, people believe in God and receive infinite reward,
  • God exists, people do not believe in God and miss the infinite reward (or receive infinite punishment),
  • God does not exist, people believe in God and mildly inconvenience themselves,
  • God does not exist, people do not believe in God and have a finite amount of personal benefit.

Given that the outcomes include infinite reward or punishment and only minor costs, it would therefore be rational to believe in God.
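The four outcomes can be laid out as a small decision table. This is a sketch, not Pascal's own formulation: the finite payoffs (-1 and +1) are placeholder numbers, non-belief when God exists is modeled as infinite punishment (one of the two readings above), and the names are made up for illustration.

```python
import math

# Payoff for each (choice, state of the world) pair, following the four outcomes above.
# The finite values are illustrative placeholders only.
PAYOFFS = {
    ("believe", "god_exists"): math.inf,         # infinite reward
    ("believe", "no_god"): -1.0,                 # mild inconvenience
    ("disbelieve", "god_exists"): -math.inf,     # infinite punishment (assumed reading)
    ("disbelieve", "no_god"): 1.0,               # finite personal benefit
}


def expected_value(choice, p_god):
    """Expected payoff of a choice, given a probability that God exists."""
    if not 0.0 < p_god < 1.0:
        raise ValueError("p_god must be strictly between 0 and 1")
    return (p_god * PAYOFFS[(choice, "god_exists")]
            + (1.0 - p_god) * PAYOFFS[(choice, "no_god")])


def best_choice(p_god):
    """Return the choice with the higher expected payoff."""
    return max(("believe", "disbelieve"), key=lambda c: expected_value(c, p_god))


if __name__ == "__main__":
    for p in (0.5, 0.01, 1e-9):
        print(p, best_choice(p))
```

However small the probability, as long as it is above zero the infinite term swamps the finite ones, which is the whole force of the wager.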

Pascal’s Wager has been compared to the Precautionary Principle:

“When an activity raises threats of harm to human health or the environment, precautionary measures should be taken even if some cause and effect relationships are not fully established scientifically.”

This principle states that we should work to avoid very bad outcomes even when their chance of occurring is tiny and cause and effect is not known.

Useful Wagers

What makes for a useful wager? There are a few elements.

  • The wager’s outcomes can’t be easily gamed. That is, those taking sides in the Simon-Ehrlich commodities wager can’t drive prices up or down.
  • Who wins is not debatable. This makes wagering on some issues problematic. A wager that is based on things becoming “better,” without a clear definition of how better is measured, isn’t useful.
  • Those making the wagers must ride them to the end. One can’t make a wager and then remove oneself from it. Otherwise, people could pick and choose which wagers they commit to.
  • The wager helps us learn something new. We come away with a different understanding of the world after noting the wager’s results. Or, we create new knowledge needed to figure out who won the wager.

So go make wagers when it helps you train your view of the world.

Consider

  • Even Loeb admits that he might not have promoted the alien idea if he didn’t have tenure plus other academic positions. But as he says, “what’s the worst thing that can happen to me? I’ll be relieved of my administrative duties? This will bring the benefit that I’ll have more time for science.”
  • Ehrlich, in spite of being wrong in his book and wager, is better known than Simon.
  • Pascal’s wager does not seek to prove God’s existence, but rather to bring rationality to belief.

Proposition 22 Paradox

Apart from the US presidential election, another well-funded campaign from 2020 was California ballot measure Proposition 22. If you live outside of California, you may have never heard of Prop 22, even though it could come to impact you.

Prop 22 was a California referendum that dealt with a question specifically addressed to rideshare and app delivery companies. Namely, should rideshare drivers legally be considered contractors or employees?

Now that attention to the presidential election and inauguration has passed, let’s go back to look at Prop 22, its Yes vote, and how its implementation led to a system change.

The Fallout

Less than a month after Prop 22 came into effect, related companies took the following actions:

This is just the beginning, but each of the above outcomes has been met with some amount of shock and outrage. Why now?

The Ballot

What Proposition 22 actually said on the voter guide:

“Prop 22: Exempts App-based Transportation And Delivery Companies From Providing Employee Benefits To Certain Drivers. Initiative Statute.

“Summary: Classifies app-based drivers as “independent contractors,” instead of “employees,” and provides independent-contractor drivers other compensation, unless certain criteria are met. Fiscal Impact: Minor increase in state income taxes paid by rideshare and delivery company drivers and investors.”

As with many ballot propositions, it took lengthy explanations to show what voters were actually choosing. Rather than list all the arguments and rebuttals here, see the lengthy voter guide for the full details voters received.

Set aside the reality that few voters may actually read propositions carefully. Also set aside the issue that common knowledge of Prop 22 probably came more from ad campaigns than from the language in the voter information guide. To me, the most noticeable thing was what was missing from the information provided. How would rideshare and related systems change if Prop 22 received a Yes or No vote?

The Campaign

Prop 22 put the business models of rideshare and delivery companies at risk as well as threatening to change gig worker treatment. The risk for either side was in their relative change. What followed were ad campaigns in support of either side.

But a ballot initiative doesn’t require that both arguments be heard equally or presented equally well. Rideshare and delivery companies (Uber, Lyft, Postmates, DoorDash, and Instacart) spent approximately $200 million promoting a Yes vote. The opposition spent only $20 million.

How should we judge spending on Prop 22? To compare, in the last couple decades, we’ve seen dramatic increases in fundraising for presidential campaigns. In 2000, Bush and Gore together raised around $450 million. Another contentious presidential campaign in 2004 saw the campaigns raise $1 billion for the first time. For the record-breaking 2020 campaign, Biden and Trump collectively raised $1.7 billion (counting all the other candidates would double that amount).

Of all the 2020 presidential campaign money, $288 million came from donors in California. That gives us some perspective.

So perhaps the $220 million spent on Prop 22 in California makes this single issue similarly important to a presidential campaign. The financial support is different though. Companies backing the proposition can model their own benefit to the point that they know what a win is worth. That’s harder to do with a presidential campaign.

App companies also had distribution on their side. That is, they already had a direct line to customers and gig workers and could push messages like this.

Uber later updated the popup as follows.

Is it possible that 72% of drivers supported Prop 22? With that majority, shouldn’t voters follow the drivers?

This is where you might use the phrase “lies, damned lies, and statistics.” Uber drivers have different needs. Few regularly work over 15 hours a week just for Uber (the minimum needed to qualify for the Prop 22 benefits) while many work fewer hours and also split their time between other gig companies. Those in the second category would likely lose their ability to drive for Uber in the event of a No vote that classifies app workers as employees. That could explain why so many Uber drivers and delivery people supported the proposition.

Proposition 22 passed with 58% of the vote.

Comparisons

We need to back up a moment to an earlier bill. Prop 22 was itself a follow-up to Assembly Bill 5 (AB 5), a 2019 bill which provided a three-part test for whether someone is an employee or a contractor.

The AB 5 bill’s three requirements (the ABC test) are as follows:

  1. “The person is free from the control and direction of the hiring entity in connection with the performance of the work, both under the contract for the performance of the work and in fact.
  2. “The person performs work that is outside the usual course of the hiring entity’s business.
  3. “The person is customarily engaged in an independently established trade, occupation, or business of the same nature as that involved in the work performed.”

AB 5 also granted numerous business-type exemptions, but not to app companies.

Exempted businesses were notably not individually well-funded or from Silicon Valley, though they may have held political sway with California legislators. These AB 5 business exemptions include doctors, dentists, lawyers, architects, engineers, accountants, commercial fishermen, designers, artists, barbers, and more.

That political sway was most clearly seen in the business awards a legislator bragged about for writing exemptions into AB 5.

AB 5 threatened the gig economy business model where drivers, delivery people, and other workers did not receive employee benefits. That threat was bound to receive a response from the companies that had the most at risk.

But how was Prop 22 described? From the Los Angeles Times:

“The ballot measure would require the companies to provide an hourly wage for time spent driving equal to 120% of either a local or statewide minimum wage. It would not pay drivers for the time they spend waiting for an assignment. It also requires that drivers receive a stipend for purchasing health insurance coverage when driving time averages at least 15 hours a week, a stipend that grows larger if average driving time rises to 25 hours a week.”

This is an example of where it’s difficult to assess outcomes from a quick read.

15 hours might sound like a low bar, but this is active driving time. Including driver wait time between fares could potentially double that amount. Also, the active driving time is tracked per company. That is, 10 hours driving for Uber and 5 hours driving for Lyft do not qualify as 15 hours. Just meeting that 15-hour minimum will prove difficult for many drivers.
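The per-company rule is easy to misread, so here is a minimal sketch of it. It assumes only what the Los Angeles Times description above says: a stipend at an average of 15 active hours a week and a larger one at 25, counted separately for each company. The function name and tier labels are hypothetical, and this is not the legal text of the measure.

```python
# Thresholds taken from the LA Times description quoted above.
STIPEND_HOURS = 15         # average weekly active hours needed for a stipend
LARGER_STIPEND_HOURS = 25  # average weekly active hours for the larger stipend


def stipend_tiers(active_hours_by_company):
    """Return a hypothetical stipend tier for each company, judged separately."""
    tiers = {}
    for company, hours in active_hours_by_company.items():
        if hours >= LARGER_STIPEND_HOURS:
            tiers[company] = "larger stipend"
        elif hours >= STIPEND_HOURS:
            tiers[company] = "stipend"
        else:
            tiers[company] = "none"
    return tiers


# The example from the text: 15 active hours split across two apps.
split_driver = stipend_tiers({"Uber": 10, "Lyft": 5})

# The same 15 active hours, all with one app.
single_app_driver = stipend_tiers({"Uber": 15})

print(split_driver)
print(single_app_driver)
```

The two drivers put in the same active hours, but only the one who stays on a single app reaches a stipend tier.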

Further, an earlier analysis of Prop 22 by the UC Berkeley Labor Center reassessed the 120% of minimum wage benchmark ($15.60) and claimed that drivers with more than 15 active hours per week would actually earn only $5.64 per hour.
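A quick arithmetic check on those two numbers, using only the figures quoted above. This does not reproduce the Labor Center's methodology. It only shows the minimum wage implied by the $15.60 benchmark and how far the $5.64 estimate falls below it. The variable names are my own.

```python
# Figures quoted in this post, in dollars per hour.
benchmark_wage = 15.60   # 120% of the minimum wage
estimated_wage = 5.64    # Labor Center estimate of actual earnings

# The minimum wage implied by the 120% benchmark.
implied_minimum_wage = benchmark_wage / 1.20

# The estimate as a share of the headline benchmark.
share_of_benchmark = estimated_wage / benchmark_wage

# The gap per hour between the headline and the estimate.
hourly_gap = benchmark_wage - estimated_wage

print(f"Implied minimum wage: ${implied_minimum_wage:.2f}")
print(f"Estimate as share of benchmark: {share_of_benchmark:.0%}")
print(f"Gap per hour: ${hourly_gap:.2f}")
```

If the estimate holds, drivers would see a bit over a third of the headline figure.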

Why would drivers choose to work if their pay declines? There could be a number of reasons, including that they prefer driving to other work or can’t find other work. A distinction of gig economy work is that you typically don’t need to commit to a fixed schedule. In the case of driving for rideshare and delivery businesses, gig workers can work just about any hours they want. That flexibility may make up for the higher pay they could earn doing something else.

Rideshare companies are famous for basing their distribution model on moving into new municipalities in advance of any legal right to do so. Uber entered the New York City market without a license to operate as a transportation company. As a result, taxi drivers saw both their fares and medallion values fall, while riders benefitted from lower fares and more supply.

A No vote on Prop 22 was supposed to challenge not the distribution part of rideshare businesses, but their cost structure.

But after the Yes vote, some fees have also been passed on to customers. For Uber and Lyft, the fees depend on location and range from $0.30 to $1.50 per ride. Postmates now charges a driver benefits fee of between $0.50 and $2.50, depending on location.

Prop 22 Paradox

Why did we see public shock and outrage in January when affected companies fired workers, changed policy, and added fees? Surely Proposition 22 voters would have expected some type of negative outcome, however they happened to vote.

It’s probably more likely that many Prop 22 voters didn’t really think about negative outcomes. Either their chief concern was what positive outcomes they would see, they didn’t believe themselves negatively affected by potential changes, or they just didn’t have the habit of thinking about trade-offs.

A look at Prop 22 outcomes.

Yes (drivers classified as contractors). Rideshare companies start to provide some benefits. Drivers continue to set their own hours. Price increases passed on to customers.

Other potential outcomes: Slower driving? Drivers have an incentive to reach 15 hours of active driving time. That’s the point at which benefits kick in. If the value of the benefits is greater than the value of finishing trips sooner and driving fewer hours, then drivers may slow down or at least not speed.

No (drivers classified as employees). Drivers receive pay increases. Drivers have to work fixed hours. Driver supply falls. Fares rise. Ride demand falls. Rideshare companies have difficulty operating in California. Price increases passed on to customers.

In the case of a No vote, affected companies estimated needed price increases of 25% to over 100%. Outside researchers estimated 5% to 10%.

Alison Stein, an economist at Uber, made this series of public projections for price increases, trips lost, and work opportunities lost.

By Alison Stein, Economist at Uber (link to larger version above)

Given Stein’s role, we perhaps need to take the projections as one perspective. Notably, price increases are most extreme in the less populated parts of California (more inactive driver time between rides). But these price increases also stand out because Uber had subsidized rides to become a less expensive option than legacy taxis.

Presenting how a system change could lead to different outcomes is not unattainable, especially for something like Proposition 22, which is relatively straightforward. I say relatively because, unlike, say, electing an individual to political office, who might say one thing and do something entirely different, Prop 22 was more limited in scope.

That’s also not to say that putting out a systems map would produce one single correct answer. People can still weigh different outcomes differently. I just don’t like shock and outrage after a legal change goes into effect and companies act. The shock and outrage should have happened with the outcome of the vote itself.

Scaling a Scam (The Twitter Hack)

Today I was reminded of the first post I ever wrote on this blog (Voice AI, Telecom, Scams, and Co-evolution), back in 2018. My first article was focused on second-order effects of emerging voice AI capabilities and projected a number of scams that the technology would enable.

While this tech also has many positives, I always try to get a fuller picture by looking at the system.

The world is full of trade-offs. In the case of voice AI in the article, we have the ability to scale up scams that historically worked, but only in small doses. The “hey grandma” scam was one example. As I wrote back then:

“An older scam that this tech will scale is what’s known as the “Hey, Grandma” scam, where a grandparent gets a call from a “grandchild” in distress. There are different flavors of this. For US grandparents the story is often that the grandchild got into legal trouble and needs money wired. In China and Taiwan, it’s often that the grandchild has been kidnapped and is being beaten up. Again, wire the money.”

Continue reading “Scaling a Scam (The Twitter Hack)”

Inevitable Surveillance?

What is the purpose of surveilling a domestic population? Is it inevitable?

Surveillance and spying are a little different but the benefits of each have long been understood. The purposes of spying are to know when an enemy is going to attack, their capabilities, the potential to attack them first, or what one might gain in making an attack, state to state or tribe to tribe. Learn plans, intentionally mislead, survive.

Domestic surveillance is different or at least thought of as being different. For some types of domestic surveillance the purpose seems to be that the population harbors enemies (overlapping with spying above), whether this means enemies of the state itself or those harmful to the rest of the population.

A version of that is that if there are people who have “wrong thinking,” then their “wrong thinking” can infect their neighbors, and eventually lead to violence or chaos.

Continue reading “Inevitable Surveillance?”

Changes in Value (Part 2)

While I discussed silver, tulips, and drugs in Changes in Value Part I, here I look at education, art, spices, chicken feet, and conformity. What systems influence the value of things? Why does value change?

At the end I provide suggestions to assess your own situations.

Education

I’ve been critical of higher education on this blog before, but for other reasons. When it comes to the price of a college degree (and here I’m mostly talking about the price of American college tuition), we’ve seen a doubling in price, adjusting for inflation, over the last 30 years. A number of factors combine to drive up the price.

Continue reading “Changes in Value (Part 2)”

Changes in Value (Part 1)

When something changes in financial value quickly, unintended consequences abound. When this change happens at scale, affecting many people, the consequences are even more extreme. These changes impact supply and demand and social change around the world.

Let’s look at some examples of value change causing havoc. This week I’m intentionally (well, almost entirely) not writing about the topic you can’t escape.

Continue reading “Changes in Value (Part 1)”

Do We Create Shoplifters?

Those of you who work in a large organization might occasionally find yourselves shaking your head thinking about a colleague: “What do they do all day?” Some of you might even think that about yourselves. Or you might think that about people in another department, especially those with whom you have an adversarial relationship.

At the same time, you also might be uncomfortable with the automation of certain tasks and possibly seeing those jobs disappear. Even those jobs of the unproductive humans you shook your head at. Fear of job automation and its unintended consequences has people thinking, but what are the roots of this thought?

Isn’t the history of technology about removing humans from a task and replacing them with machines, even simple ones?

Here’s an example from Vaclav Smil’s book Energy in World History.

Do you really want to be a glass polisher? And do the unintended consequences of job automation include creating shoplifters?

Continue reading “Do We Create Shoplifters?”