Mo Reads: Issue 2
The first issue was perhaps a bit too long. This time round I’m aiming for succinct commentary. (A newsletter, after all, is an infinite game.)
Links:
Partial Derivatives and Partial Narratives by John Nerst (2,700 words, 11 mins)
Compress to impress by Eugene Wei (3,600 words, 15 mins)
The lava layer antipattern by Mike Hadlow (1,800 words, 7 mins)
The physics of space war by Rebecca Reesman et al (10,000 words, 40 mins)
Three wild speculations from amateur quantitative macrohistory by Luke Muehlhauser (1,200 words, 5 mins)
Message length by Zack Davis (3,000 words, 12 mins)
Could a Quantum Computer Have Subjective Experience? by Scott Aaronson (6,800 words, 27 mins)
There’s no such thing as a tree (phylogenetically) by Georgia Ray (2,100 words, 8 mins)
What is the upper limit of value? by David Manheim et al (15,000 words, 60 mins)
Shopping for happiness by Jacob Falkovich (3,000 words, 12 mins)
Partial Derivatives and Partial Narratives by John Nerst (2,700 words, 11 mins): reality’s causal dynamics are complicated, far too multivariate for our puny brains, and under no obligation to make sense. Narratives are stories of events/entities and their causal relationships, chains of “X caused Y” simple enough to fit into our brains. So representing reality via narratives is like taking partial derivatives of multivariate functions. Implications: (1) when integrating over narratives to reconstruct (get ideas about) reality, you have no idea how complicated or significant the “integration constant” is, since derivatives/narratives destroy information; (2) sometimes there’s no such thing as “the right narrative” (contradictory narratives can each be accurate), just like how there’s no such thing as “the right partial derivative” (they’re all correct), so savage political fights can happen without any factual or value disagreement; (3) disagreeing on signal vs noise is like differentiating w.r.t. different variables. So (John argues) e.g. Ayn Rand isn’t right/wrong so much as ridiculously partial, like all good propaganda pieces are.
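The information-loss point can be made concrete with a toy function (my own illustration, not from John’s post):

```latex
\[
f(x,y) = x^2 + xy + y^3 \quad \text{(``reality'')}
\]
\[
\frac{\partial f}{\partial x} = 2x + y
\qquad
\frac{\partial f}{\partial y} = x + 3y^2
\]
% Both partials are correct, yet each tells a different story and
% neither mentions the other's terms. Integrating the first back up
% recovers only
\[
\int (2x + y)\,dx = x^2 + xy + g(y),
\]
% where the ``integration constant'' $g(y)$ -- here silently hiding all
% of $y^3$ -- can be arbitrarily complicated: taking the narrative
% destroyed that information.
```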
Compress to impress by Eugene Wei (3,600 words, 15 mins): org leaders face Chinese-whispers problems in strategic alignment, which get horrible at scale. A time-tested solution to ensuring message integrity in lossy media (oral traditions, hierarchies) is to encode it in a distinctive format; this is the thrust of rhetorical tricks like rhythms and rhymes. One of Jeff Bezos’ great strengths as a communicator was encoding Amazon’s most important strategies in concise memorable form, like “Day 1”, “resist proxies”, and “high-velocity decision making”, all arising from pressing needs in the moment. Jeff understood, argues Eugene (an early employee at Amazon), that time spent deriving the right words to package a key concept memorably, so people would say what he’d say when he wasn’t in the room, was time well spent. (Interestingly, his information inflows were the opposite: minimally compressed raw data.) Other great leaders do this too: “lean in”, “yes we can”, “move fast and break things”, “I have a dream”, “software is eating the world”, “information wants to be free” etc.
The lava layer antipattern by Mike Hadlow (1,800 words, 7 mins): successive well-intentioned changes to architecture and technology throughout an application’s lifetime can result in a codebase that’s fragmented and hard to maintain; sometimes it’s better to favor consistent legacy tech. “Lava layer” refers to what a hypothetical codebase visualization would look like: half-breadth lava layers of different patterns/tech solving the same problem in different places. Mike notes that it’s “especially prevalent in situations where the software is large, mission critical, long-lived and where there is high staff turn-over”. One way to mitigate this antipattern is to acknowledge that we may never finish refactoring. Also called spaghetti towers; pervasive in evolution and bureaucracies, not just software. It’s one form of technical debt.
The physics of space war by Rebecca Reesman et al (10,000 words, 40 mins): Star Wars dogfights make for great cinema, but won’t be how real space engagements are fought, due to orbital-dynamics constraints. 5 key concepts: (1) satellites move quickly, (2) satellites move predictably, (3) satellites maneuver slowly, (4) space is big, and (5) timing is everything. Different paradigm: unlike on Earth, where competitors fight to dominate a physical location, in space nobody occupies a single location over time, so the focus is instead on reducing/eliminating enemy satellite capabilities (comms, nav, intel). Goals: deceive, disrupt/degrade/deny capability, deter/defend against counterattacks. Weapons: kinetic vs standoff, ground- vs space-based, reversible or not. Because space is big, engagements are either intense or long, not both. Too much to summarize, check it out!
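To put numbers on “satellites move quickly and predictably”: for circular orbits, speed and period follow directly from the orbit radius. A standard two-body sketch (my own, not from the paper):

```python
import math

MU_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6          # mean Earth radius, m

def circular_orbit(altitude_m):
    """Speed (m/s) and period (s) of a circular orbit at a given altitude."""
    r = R_EARTH + altitude_m
    speed = math.sqrt(MU_EARTH / r)                   # v = sqrt(mu / r)
    period = 2 * math.pi * math.sqrt(r**3 / MU_EARTH)  # Kepler's third law
    return speed, period

# A satellite in low Earth orbit (~400 km up) moves at roughly 7.7 km/s
# and circles the globe in about 92 minutes; at geostationary altitude
# (~35,786 km) it slows to about 3.1 km/s with a ~24-hour period.
v_leo, t_leo = circular_orbit(400e3)
v_geo, t_geo = circular_orbit(35_786e3)
```

The “predictable” part is the flip side of the same equations: given the orbit, position at any future time is fully determined, which is why maneuvering (changing the orbit) is what costs fuel and time.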
Three wild speculations from amateur quantitative macrohistory by Luke Muehlhauser (1,200 words, 5 mins): Luke looked at five proxies of global average human well-being & empowerment going back thousands of years — physical health (life expectancy at birth), economic well-being (GDP per capita), energy capture (kilocalories/person/day), tech empowerment (war-making capacity), and political freedom to live desired lives (% population living in democracies) — and plotted them all on a graph. From this he makes 3 speculations: (1) everything was awful throughout recorded human history, and then the industrial revolution happened — the impact of the wheel or money or writing or cavalry, the rise of major religions, the conquering of nations, the scientific revolution, the Black Death were all negligible fluctuations vs the hockey-stick trajectory of the post-industrial era; (2) most of the variance in historical human well-being is explained by a primary factor for productivity and a secondary one for political freedom; (3) it seems it would take a lot of deaths (say 15+% of world pop.) to knock civilization off its current trajectory — the deadliest events ever (the Black Death and Genghis Khan’s conquests, each at ~10%) came close but didn’t do it.
Message length by Zack Davis (3,000 words, 12 mins): Occam’s razor (“explanations should be kept as simple as possible, but no simpler, given available evidence”) is a heuristic preference guiding scientists towards conservatism when developing theories. But that’s a vague qualitative statement; how to make it quantitative, hence applicable? In the general case this is impractical, but in a toy scenario it’s enlightening. Zack’s toy scenario is a broadcast of a stream of (0,1)-bits: given this evidence, can we predict what comes next? 1st idea: since there are slightly more 1s than 0s, maybe the sequence represents flips of a biased coin. 2nd idea: since it contains longer runs of 1s and 0s than you’d expect in a 500-bit sequence, maybe they aren’t independent of each other but were generated by a Markov chain. 3rd idea: since there’s also a long run of alternations, maybe the generator is a higher-order MC. (This makes biased coins 0-th order MCs.) The best guess for the freq-of-0 parameter is the observed frequency (maximum likelihood estimation). Now we can write a program that takes the sequence and a degree n and computes the MLE for the n-th order MC that might have generated it (i.e. “best theory of degree n”), which lets us compare theories on predictive odds. (Since the odds are ultra-small, it’s sensible to take the negative log base 2; this is log loss.) Problem: higher n monotonically improves predictive odds via overfitting. Solution: penalize higher n by 2^n (the parameter count for n-th order MC), since more complex explanations need more evidence. This lets you Occam-justify a 3rd-order MC as “the true best theory” to predict next bits in sequence!
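A minimal sketch of that recipe (my own code and function names, not Zack’s implementation): fit the maximum-likelihood n-th order Markov chain to the bit string, score it by log loss, add the 2^n parameter-count penalty, and pick the order minimizing the total.

```python
import math
from collections import defaultdict

def log_loss_bits(bits, n):
    """Negative log2-likelihood of the sequence under the MLE n-th order
    Markov chain (each length-n context predicts its observed next-bit
    frequencies)."""
    counts = defaultdict(lambda: [0, 0])  # context -> [# next-bit 0, # next-bit 1]
    for i in range(n, len(bits)):
        counts[bits[i - n:i]][int(bits[i])] += 1
    loss = 0.0
    for c0, c1 in counts.values():
        total = c0 + c1
        for c in (c0, c1):
            if c:  # 0 * log(0) treated as 0
                loss -= c * math.log2(c / total)
    return loss

def best_order(bits, max_n=5):
    """Score each order as log loss + 2^n (an n-th order binary chain has
    2^n parameters); return the minimizing order plus all scores."""
    scores = {n: log_loss_bits(bits, n) + 2**n for n in range(max_n + 1)}
    return min(scores, key=scores.get), scores

# "110" repeated has exact 2nd-order structure (context "11" -> 0,
# "10" -> 1, "01" -> 1): order 2 predicts perfectly, and higher orders
# only pay more penalty, so Occam picks 2.
order, scores = best_order("110" * 100)
```

Without the penalty term, `min` would always drift toward `max_n`: more contexts can only fit the observed bits better. The 2^n charge is exactly the “more complex explanations need more evidence” clause made numerical.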
Could a Quantum Computer Have Subjective Experience? by Scott Aaronson (6,800 words, 27 mins): in philosophy of mind, computationalism is the view that the mind is an info-processing system, and cognition/consciousness a kind of computation. Once a fringe view, it’s now the dominant paradigm, so obvious it’s hardly worth stating. In Scott’s talk (this is the prepared version), he raises a number of questions it leaves unanswered: what exactly does a computational process have to do to count as “conscious”? The warmup problem has left me uneasy ever since I first read it, years ago: if your brain were simulated by a gigantic lookup table hardwiring inputs (sensory stimuli) to outputs (brain responses), would that bring about your consciousness? Would it make a difference if nobody actually consulted this table, if it just “sat there doing nothing”? Or consider pancomputationalism, the idea that every physical system performs every computation: if true, the claim that a system is performing a certain computation becomes trivially true hence vacuous, as it fails to distinguish it from any other, so either every physical system is conscious or nothing is! (Reductio much?)
There’s no such thing as a tree (phylogenetically) by Georgia Ray (2,100 words, 8 mins): instead, think of the tree — a big long-lived self-supporting plant with leaves and wood — as a convergent evolutionary strategy (as are fish and fruit and wood). There are no unique “tree genes” and no coherent phylogenetic category that includes only trees — this means e.g. mulberries and blackberries aren’t related despite looking identical sans color, and the last common ancestor of an apple and a peach was probably not a tree. By analogy, says Georgia, imagine if “there were amphibian birds and mammal birds and insect birds flying all around, and they all looked pretty much the same… and you had to be a real bird expert to be able to tell an insect bird from a mammal bird”. (All that said, this isn’t grounds for arguing that “trees” aren’t a useful or comprehensible category.)
What is the upper limit of value? by David Manheim et al (15,000 words, 60 mins): unless our current understanding of physics is fundamentally wrong, there’s an upper limit to how much value our decisions can create. It arises from the physics of information, which bounds physical beings’ ability to place value on outcomes, and (far more restrictively) from the definition of economic growth combined with the speed of light. This, argue Manheim et al, means that infinite ethics (e.g. Pascal’s Wager-type issues) is irrelevant for long-term ethical decision making, ruling out a whole class of challenges to consequentialism. (A corollary of light speed constraining economic growth is that long-term growth is at most polynomial, i.e. the current exponential trajectory is short-term.) “Value” here is predicated on choice: a comparison between outcomes, or a decision made about them; outside this choice framework, Manheim et al consider “value” meaningless. Value comparisons induce at least an ordinal preference, implying finite value (count the world-states!). But reasoning about preferences under uncertainty (e.g. “prefer a 10% chance of A to a 50% chance of B”) requires cardinal utility, which reopens several doors to infinitarian paralysis; Manheim et al close them one by one.
Shopping for happiness by Jacob Falkovich (3,000 words, 12 mins): a while back I read Elizabeth Dunn’s paper If Money Doesn't Make You Happy Then You Probably Aren't Spending It Right, whose title stuck with me ever since. Jacob’s post starts from Dunn’s book version of that paper, distills its five principles into a single equation for kicks, and adds great examples. I’ll just copy his summary: “Buy experiences, which include things that are cheap and you use for a long time in short increments. Make it a treat because the joy declines as you get used to anything, inject novelty where you can to refresh the source of delight. Calculating the value of your time, even very roughly, will let you make good trade-offs whether you’re buying time or paying with time. Pay now (cash) and consume later (memories) just make sure not to overestimate the time you’ll spend reminiscing and the amount of suffering you’ll have to pay. Invest in others, but don’t confuse the warm feeling of treating a friend with the ethical imperative to make the world a better place.”