Mo Reads: Issue 6
AI, business intelligence, forecasting, history, policy, psychology, sociology, systems
In the previous issue I mentioned trying to balance topical diversity and fun in writing commentary. This issue leans a little more towards the latter. (For past issues see the archive.)
Links:
Shiny balls of mud by William Gibson (1,000 words, 4 min)
I can tolerate anything except the outgroup by Scott Alexander (8,500 words, 34 min)
Remove the legend to become one by Eugene Wei (8,000 words, 32 min)
Why has nuclear power been a flop? by Jason Crawford (4,600 words, 18 min)
This is how Amazon measures itself by Cedric Chin (4,400 words, 18 min)
Concrete problems in AI safety by Chris Olah et al (13,500 words, 54 min)
Instead of pledging to change the world, pledge to change prediction markets by Scott Alexander (800 words, 3 min)
George Mueller, NASA’s Apollo program, systems mgmt & engineering by Dominic Cummings (16,000 words, 64 min)
When money is abundant, knowledge is the real wealth by John Wentworth (1,500 words, 6 min)
History is written by the losers by Tanner Greer (2,200 words, 9 min)
Shiny balls of mud by William Gibson (1,000 words, 4 min): my first introduction to the dorodango, Japanese for “mud dumpling”: shiny balls of mud compressed by hand and painstakingly shaped and polished into perfect spheres, sometimes by children. The dorodango, claims Gibson, illustrates a core aspect of Japanese culture: an obsessive-compulsive pursuit of perfection in a single activity. Gibson interprets the hikikomori aesthetic and cockpit living similarly.
I can tolerate anything except the outgroup by Scott Alexander (8,500 words, 34 min): Scott examines the odd modern phenomenon where lots of people conspicuously praise every outgroup they can think of and conspicuously condemn their own ingroup, and homes in on a distinction between the trivial meaning of ‘outgroup’ (“a group I’m not part of”) and the insightful one (proximity + small differences). This is how outgroups can be people who look exactly like us (e.g. Northern Irish Protestants and Catholics), whereas ‘scary foreigner types’ can become the ingroup in a jiffy when convenient (e.g. the Brits and Sikhs during the world wars); it’s why Scott’s readership is only ‘against’ Osama bin Laden but loathes Margaret Thatcher. Scott ends by proposing a test to figure out if the group you’re criticizing really is your ingroup or the outgroup: does it feel fun to do it, or does it make your blood boil? If it feels fun…
Why has nuclear power been a flop? by Jason Crawford (4,600 words, 18 min): in the 1950s nuclear was the future: scalable, on-demand, virtually emissions-free, using very little land, consuming very little fuel, producing very little waste. Over half a century later only ~10% of world electricity is nuclear-based, and reactor design hasn’t really changed in decades. What happened? (1) too expensive to compete with fossil fuels (8 vs 5c/kWh) due to high plant design/construction costs (US only) (2) safety: the linear no-threshold (LNT) dose-response model guiding US govt policy contradicts both theory and evidence; excessive concern about low radiation levels led to the ever-tightening ALARA regulatory standard (As Low As Reasonably Achievable), a perverse incentive that by definition eliminates nuclear’s chance to be cheaper than fossil fuels (if it does get cheaper, regulators aren’t doing their job) (3) perverse regulatory incentive: the Nuclear Regulatory Commission (NRC)’s job is all downside and no upside, as they own all problems but don’t get credit for approving new plants and have no growth goals (4) no competition, so it’s all bloated incumbents and govt labs spending billions on stuff like monitoring radiation contamination so sensitively they’re detecting rainwater. What to do? (1) Replace LNT with e.g. “Sigmoid No Threshold” (2) Replace ALARA with firm limits balancing risk/benefit (3) Don’t regulate test reactors like production ones (4) Align regulator incentives with industry (5) Allow arbitration of regulation
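To make the LNT vs “Sigmoid No Threshold” contrast concrete, here is a minimal sketch of the two curve shapes. It is not the model from Crawford’s essay or from any regulator: the function names, the slope, the midpoint and the steepness are all made-up illustration values in arbitrary units. The only point is the shape: under a straight line every small dose carries proportional risk, while an S-curve that still passes through zero assigns far less risk to low doses.

```python
import math

# All parameters are hypothetical illustration values in arbitrary units,
# NOT real radiological data.

def lnt_risk(dose, slope=0.01):
    """Linear no-threshold: risk is proportional to dose, down to zero."""
    return slope * dose

def sigmoid_risk(dose, midpoint=100.0, steepness=0.05, max_risk=1.0):
    """S-shaped curve, shifted and rescaled so risk is 0 at dose 0."""
    raw = 1.0 / (1.0 + math.exp(-steepness * (dose - midpoint)))
    floor = 1.0 / (1.0 + math.exp(steepness * midpoint))  # raw value at dose 0
    return max_risk * (raw - floor) / (1.0 - floor)

if __name__ == "__main__":
    for dose in (0, 1, 10, 50, 100, 200):
        print(f"dose={dose:>3}  LNT={lnt_risk(dose):.4f}  "
              f"sigmoid={sigmoid_risk(dose):.4f}")
```

With these invented parameters the sigmoid stays well below the line at low doses, which is the qualitative argument for replacing a proportional rule with firm limits.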
Remove the legend to become one by Eugene Wei (8,000 words, 32 min): a tale in 2 parts: the first being his painful but formative experience as Amazon’s first strategic planning analyst (fantastically entertaining and informative reading), the second being his manager sending him to a seminar on Edward Tufte’s now-classic book The Visual Display of Quantitative Information (“one of the most important books I’ve read in my life”) so he could apply its principles to the hundreds of charts in the Analytics Package, which was read by every manager and covered every aspect of Amazon’s business: revenue, editorial, marketing, operations, customer service, headcount, customer sentiment, market penetration, lifetime value, inventory turns etc. Tufte’s book opens with a concise summary of its key principles, like “graphics should induce the viewer to think about the substance not the methodology/graphic design, avoid distorting what the data has to say, encourage the eye to compare different pieces of data” etc. In this essay, Eugene walks through a simple line chart example in Excel, detailing what he had to do in making the monthly Analytics Package. Insightful reading.
This is how Amazon measures itself by Cedric Chin (4,400 words, 18 min): Key takeaway: good operators must instrument the organizations they are running. If you don’t, you won’t know what’s going on, so you won’t know what to focus on to get where you need to go. (This is remarkably similar to Carlos Bueno’s The mature optimization handbook, which appeared in a previous issue of this newsletter.) Amazon tracks 2 kinds of metrics: controllable input metrics and output metrics. Outputs show results but aren’t actionable; inputs provide guidance. The metrics are created via a process improvement method called DMAIC (Define, Measure, Analyze, Improve, Control), in precisely that order; nearly all fall within Amazon’s flywheel. (Interestingly, figuring out the right controllable input metrics is deceptively tricky and requires a lot of test-debate-iterate.) Tidbits: (1) since business units are incentivized to pick and tweak metrics that make them look good, finance is empowered to “unbias” them (2) instrumenting the business more accurately can be expensive, but is a worthwhile investment (3) develop a deep understanding of how metrics work, root causes, natural variances etc (4) if processes improve to the point where a once-useful metric becomes irrelevant, prune it from dashboards (5) at the highest-level weekly business review (WBR), involving Bezos and the S-team, metrics are compiled into a deck (of charts, graphs, data tables) showing an end-to-end view of the business following the customer experience, focusing on emerging patterns, exception reporting, and anecdotes. No variances? Move along (6) business leads own metrics and are expected to explain variances (7) data + customer anecdotes = whole story, included in WBR
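The “exception reporting” and “No variances? Move along” idea can be sketched in a few lines. This is an assumption-heavy toy, not Amazon’s actual WBR tooling: the metric names, the sample numbers, and the z-score rule with a threshold of 2 are all hypothetical choices made here for illustration; the article does not specify how variances are detected.

```python
from statistics import mean, stdev

def flag_exceptions(history, latest, z_threshold=2.0):
    """Return {metric: z-score} for metrics whose latest value is unusual.

    history: metric name -> list of past weekly values (at least 2 each)
    latest:  metric name -> this week's value
    The z-score rule is an illustrative stand-in for "natural variance".
    """
    flagged = {}
    for name, past_values in history.items():
        mu = mean(past_values)
        sigma = stdev(past_values)
        if sigma == 0:
            # No historical variation: any change at all is an exception.
            if latest[name] != mu:
                flagged[name] = float("inf")
            continue
        z = (latest[name] - mu) / sigma
        if abs(z) > z_threshold:
            flagged[name] = round(z, 2)
    return flagged

# Hypothetical controllable input metrics with made-up values.
history = {
    "in_stock_rate": [0.95, 0.96, 0.94, 0.95, 0.96],
    "page_load_ms": [410, 395, 405, 400, 390],
}
latest = {"in_stock_rate": 0.95, "page_load_ms": 520}

exceptions = flag_exceptions(history, latest)
print(exceptions or "No variances, move along")
```

Only the flagged metrics would need an owner’s explanation in the review; everything inside its usual range gets skipped.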
Concrete problems in AI safety by Chris Olah et al (13,500 words, 54 min): AI has recently made dramatic progress on hard problems, promising transformative potential but also societal challenges/risks (broadly studied under AI safety). This paper looks at a subset of AI risks: accidents in ML systems deployed autonomously at scale in open-domain situations. Most discussion on this gravitates towards extreme scenarios e.g. misspecified objective functions in powerful AI, which makes most people not take AI risk seriously since it’s outside their “Overton window”; this paper tries to shift discussion “into the window” by highlighting 5 practical issues with modern ML systems, illustrating each via a cute office cleaner robot that can use common cleaning tools: how to ensure (1) it won’t mess up the environment while cleaning (e.g. knocking over a vase to clean faster), without manually specifying everything it shouldn’t disturb? (2) it won’t game its own reward function (e.g. sweeping messes under the rug and proclaiming “done!”)? (3) scalable oversight: it does the right thing without needing frequent oversight/lots of info (e.g. throw away candy wrappers, leave aside people’s smartphones)? (4) it explores the office/cleaning strategies safely (e.g. try different mopping techniques, don’t put a wet mop into electrical outlets)? (5) it recognizes when it’s in a different environment (e.g. factory workfloor not office) and changes behavior appropriately? All open problems ready for experiment today
Instead of pledging to change the world, pledge to change prediction markets by Scott Alexander (800 words, 3 min): an idea: hold govt policy promises to account using prediction markets (exchange-traded markets for trading on event outcomes, where market prices indicate what the crowd thinks the odds are, inspired by the wisdom of crowds). Instead of pledging the outcome itself, a politician pledges to shift the market’s predicted odds of it, by iterating on a legislation package until it is credible enough that traders think the outcome is more likely than not to happen. Scott addresses some objections (a successor can repeal it, Goodhart’s law, maybe it’ll disrupt a key ‘natural’ part of the political process?)
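For readers new to prediction markets, here is a minimal sketch of how a price is read as odds. It assumes a simple binary contract that pays a fixed amount if the event happens, and it ignores fees, liquidity and the time value of money. The function names and the example prices are hypothetical; the 0.5 target is just “more likely than not” from the summary above.

```python
def implied_probability(yes_price, payout=1.0):
    """Crowd-implied probability of a binary contract paying `payout` on YES."""
    if not 0 <= yes_price <= payout:
        raise ValueError("price must lie between 0 and the payout")
    return yes_price / payout

def pledge_met(yes_price, target=0.5, payout=1.0):
    """True once the market thinks the outcome is more likely than `target`."""
    return implied_probability(yes_price, payout) > target

# Made-up prices before and after a revised legislation package.
print(pledge_met(0.35))
print(pledge_met(0.62))
```

The pledge is then judged on the second kind of reading: whether the market price has been moved past the target, not whether the outcome has already happened.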
George Mueller, NASA’s Apollo program, systems mgmt & engineering by Dominic Cummings (16,000 words, 64 min): where I first learned that high-output management was a thing (later made legendary in Silicon Valley by Intel CEO Andy Grove). Dominic claims that the difference between NASA getting to the moon in 1969 and ELDO’s (European Launcher Development Organisation) F-8 rocket failing on the launchpad was not one of sci/eng but of management, in particular George Mueller’s instituting of systems mgmt and engineering to coordinate the efforts of 300,000 people working for 20,000 contractors and 200 universities in 80 countries at its peak. Mueller was brought in from Bell Labs to solve NASA’s “internal conflicts, divisions between different groups and different physical locations, a lack of operational and management skills, and political problems”. He began by defining systems mgmt as “systems engineering applied to mgmt: defining goals and ends, describing needed men and machines, outlining network of info/materials flow, entailing tradeoffs, integrating info about interactions across the system”, and noted that “many failures were of integration, or technical/schedule compatibility of interfaces”. He solved them by “building org-wide orientation” and “de-silo-ing via extensive comms”. He introduced a “matrix mgmt” system whereby teams in different NASA centers reported to both his HQ and center bosses, gave ownership of key areas to individuals, built relationships with CEOs of major contractors, incentivized comms by asking Bezos-like questions (“what happened to that valve?”), rewrote contracts to incentivize hitting schedules, pursued concurrent development of systems to save money by saving time, replaced conservative, lengthy stage-sequential testing with ‘all-up testing’, and sped up the cadence of updates from monthly to daily while sharing them widely. Check it out!
When money is abundant, knowledge is the real wealth by John Wentworth (1,500 words, 6 min): great example of a recurring theme in systems: when you relieve a bottleneck, it moves. John notes that if nonexperts can’t tell apart experts and crackpots/noise, money can’t buy expertise, i.e. we can’t outsource knowledge but must understand it ourselves (“there is no royal road”). The money-to-knowledge bottleneck shift, in his experience attempting to solve big problems, happens quicker than you think (in the low six figures for annual income), beyond which more money just means more spray-and-pray ability, an inefficient strategy in our high dimensional world (the reasoning step here is a bit simplistic to me). This suggests a strategy: think of money as ‘cheap’, as is everything money can buy, so ‘expensive’ things are whatever money alone can’t buy, e.g. expertise; so acquire expertise and use it for barter or investment, in particular to get more expertise, in particular gears-level models. These are the “capital assets” of knowledge that “pay dividends” in more knowledge, because they’re general-purpose; applied math/sciences are full of such tools.
History is written by the losers by Tanner Greer (2,200 words, 9 min): anything by Tanner should be read in full because it’s the details that make it fun, but in short: “history is written by the victors” (often misattributed to Churchill) is only sometimes true; many of the great histories of the premodern world were actually written by the losers. Examples by Lynn Rees: “Herodotus wrote his history only after his exile from Halicarnassus; Xenophon wrote his memoirs only after his faction was forced out of Athens. Polybius was once a general for the Achaean League, but wrote his history as a hostage at Rome. The destruction of Judea was chronicled by Josephus, a Jew” etc. Tanner adds: “Sima Guang, the second most significant historian of Chinese history, only finished his massive Zizhi Tongjian after court rivalries had forced him to retire. The history of the Mongols was written almost entirely by their vanquished enemies. Ibn Khaldun was associated with so many failed regimes that it is a wonder he found time to write his history at all… (And then there’s) Thucydides.” Tanner’s explanation is beautifully simple: “Defeat gives brilliant minds like Thucydides the two things they need to become great historians: time and motive.” (Rulers are too busy to write, Mao and Caesar being rare exceptions.)