The mathematics community lost a titan with the passing last month of Isadore “Is” Singer. Born in Detroit in 1924, Is was a visionary, transcending divisions between fields of mathematics as well as those between mathematics and quantum physics. He pursued deep questions and inspired others in his original research, wide-ranging lectures, mentoring of young researchers and advocacy in the public sphere.

Mathematical discussions with Is were freewheeling. He wanted to get to the essence of the matter at hand, to understand and present new ideas in his own way. He constantly asked questions and provoked others to look deeper and more broadly. He knew no boundaries, which led him to forge deep connections between fields. Above all, he valued his freedom — the freedom to explore, the freedom to err, the freedom to create. A social mathematician by nature, he sought out and nurtured friendships with like-spirited collaborators.

Is came to mathematics relatively late, having majored in physics as an undergraduate at the University of Michigan before going off to war in 1944. Upon his return, he went to graduate school in mathematics at the University of Chicago to better understand relativity and quantum mechanics, only to discover that mathematics was his true intellectual home.

The “A-roof genus” is a topological invariant of manifolds from 1950s topology, guaranteed a priori only to be a rational number — a ratio of whole numbers. But topologists had proved that it is actually an integer for manifolds with a particular geometric feature: a spin structure. (Spinors, which are a kind of square root of vectors, had been introduced in algebra and also in physics as part of Paul Dirac’s theory of the electron. A spin structure on a manifold allows such square roots to exist.) Even though Michael Atiyah and his collaborator Fritz Hirzebruch had proved the integrality as a consequence of their development of *K*-theory, an important innovation in topology, Michael still sought the insight that those proofs do not provide. The desire for a more profound understanding resonated deeply with Is, who was immediately hooked by the question. A positive whole number should count something, and a whole number that can be negative may be the difference between two positive whole numbers, each of which counts something. In the Atiyah-Singer worldview, geometry is paramount, so whatever is being counted should be geometric.

Here, Is’ mastery of differential geometry came into play. Presumably inspired by the geometry of other integer invariants of the period, and based on his knowledge of Dirac’s theory, Is devised a version of the Dirac equation in differential geometry — it requires the spin structure which is at the heart of Michael’s question — and he conjectured that the A-roof genus measures the existence and uniqueness of solutions to that equation. This was Is’ response to the question. Michael immediately saw how to incorporate it into the *K*-theory that he and Fritz had developed, and he quickly had the statement of the general index theorem in hand. The first proof, which involves a large dose of analysis, was done within the year.

The statement, the proof and the immediate applications which flowed from them brought together mathematical subdisciplines which in the prevailing ethos of the day often existed in noninteracting orbits. The circle of mathematical ideas around the Atiyah-Singer theorem grew in the ensuing years. When the scope expanded even further in the mid-1970s, Is again played a central role.

The impetus this time was the self-duality equation, a nonlinear differential equation that arises in quantum field theory. Is was very familiar with the so-called Wu-Yang dictionary, which related gauge theory in physics to the structures in differential geometry he had learned from Shiing-Shen Chern, one of his teachers at the University of Chicago a quarter century earlier. The dictionary had grown out of dialogues at Stony Brook University with Jim Simons, whom Is had mentored years before at MIT. (Simons went on to found the Simons Foundation, which also funds this editorially independent publication.) Is saw a role for geometry in illuminating what the physicists had discovered, and on a trip to Oxford in 1977 he posed the question, laying out the problem in a series of lectures. Those lectures and similar ones elsewhere inspired a burst of activity. Significantly, the nonlinearity of the equation led to a fascinating new web of mathematics in which algebraic geometry, topology, differential geometry and analysis are beautifully entwined, now with physics in the mix as well.

Is went further. He grasped early on, at a time when this was in no way apparent, that the “quantum” in quantum field theory is something that we geometers need to engage with directly and incorporate into our mathematics. It is difficult to convey how visionary Is was at that time. He led the way by grappling with the physics and presenting it on his own terms in courses on quantum field theory, supersymmetry and string theory at both Berkeley and MIT. His Tuesday seminar at Berkeley often featured physicists explaining the latest results, followed by a Chinese dinner at which our communal education in physics continued. Is’ position in the mathematics community and the force of his ideas brought more and more mathematicians on board. As a result of his leadership, by the mid-1980s there was a vigorous interaction between quantum field theorists and geometers. As Is foresaw, the relationship grew and deepened. It continues to bear fruit for both fields.

Is’ leadership in mathematics and science extended to policy and community, where he engaged at a high level. To mention just one achievement, in the early 1980s Is teamed up with Chern and Cal Moore to found a new home for mathematics research, the Mathematical Sciences Research Institute. The fact that research institutes throughout the world now emulate this model is a tribute to the founders’ vision.

Of course, Is also enjoyed many aspects of life beyond mathematics and physics. His devotion to his family, and theirs to him, was paramount. Ambrose taught him to love jazz, another lifelong passion. And there was always tennis — Is played vigorously and enthusiastically into his 90s, constantly working on his game.

In fact, his longtime coach and close friend Jeff Bearup conjured an image of Is that perfectly captures the effect he had on people. He would arrive at his tennis club — or it could easily have been a math or physics department or a conference — and everyone would turn, smile, and shout out with admiration and respect: “Is!”

Mathematicians were disturbed, centuries ago, to find that calculating the properties of certain curves demanded the seemingly impossible: numbers that, when multiplied by themselves, turn negative.

Any number on the number line, when squared, yields a nonnegative result; 2^{2} = 4, and (-2)^{2} = 4. Mathematicians started calling those familiar numbers “real” and the apparently impossible breed of numbers “imaginary.”

Imaginary numbers, labeled with units of *i* (where, for instance, (2*i*)^{2} = -4), gradually became fixtures in the abstract realm of mathematics. For physicists, however, real numbers sufficed to quantify reality. Sometimes, so-called complex numbers, with both real and imaginary parts, such as 2 + 3*i*, have streamlined calculations, but in apparently optional ways. No instrument has ever returned a reading with an *i*.
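That arithmetic is easy to check by machine. Python, for one, has complex numbers built in, written with the suffix `j` rather than *i* (a minimal illustration, unrelated to any experiment described here):

```python
i = 1j                        # Python writes the imaginary unit as j
print(i * i)                  # (-1+0j): squaring i yields -1
print((2 * i) * (2 * i))      # (-4+0j), matching (2i)^2 = -4
print((2 + 3j) * (2 - 3j))    # (13+0j): a number times its conjugate is real
```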

Yet physicists may have just shown for the first time that imaginary numbers are, in a sense, real.

A group of quantum theorists designed an experiment whose outcome depends on whether nature has an imaginary side. Provided that quantum mechanics is correct — an assumption few would quibble with — the team’s argument essentially guarantees that complex numbers are an unavoidable part of our description of the physical universe.

“These complex numbers, usually they’re just a convenient tool, but here it turns out that they really have some physical meaning,” said Tamás Vértesi, a physicist at the Institute for Nuclear Research at the Hungarian Academy of Sciences who, years ago, argued the opposite. “The world is such that it really requires these complex” numbers, he said.

In quantum mechanics, the behavior of a particle or group of particles is encapsulated by a wavelike entity known as the wave function, or ψ. The wave function forecasts possible outcomes of measurements, such as an electron’s possible position or momentum. The so-called Schrödinger equation describes how the wave function changes in time — and this equation features an *i*.
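In standard textbook notation (with $\hat{H}$ the Hamiltonian operator, which encodes the system’s energy, and $\hbar$ the reduced Planck constant), the equation reads

$$i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi,$$

with the imaginary unit sitting right at the front.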

Physicists have never been entirely sure what to make of this. When Erwin Schrödinger derived the equation that now bears his name, he hoped to scrub the *i* out. “What is unpleasant here, and indeed directly to be objected to, is the use of complex numbers,” he wrote to Hendrik Lorentz in 1926. “ψ is surely a fundamentally real function.”

Schrödinger’s desire was certainly plausible from a mathematical perspective: Any property of complex numbers can be captured by combinations of real numbers plus new rules to keep them in line, opening up the mathematical possibility of an all-real version of quantum mechanics.
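That translation can be made concrete. One standard encoding (a general mathematical fact, not the specific construction Schrödinger pursued) represents a + b*i* as the 2 × 2 real matrix [[a, -b], [b, a]]; matrix addition and multiplication then mimic complex arithmetic using real numbers only:

```python
def to_matrix(a, b):
    # Encode the complex number a + bi as a 2x2 real matrix.
    return [[a, -b], [b, a]]

def mat_mul(m, n):
    # Ordinary 2x2 matrix multiplication -- all real arithmetic.
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# (2i) * (2i) = -4, computed without any imaginary numbers:
product = mat_mul(to_matrix(0, 2), to_matrix(0, 2))
print(product)  # [[-4, 0], [0, -4]], the encoding of -4 + 0i
```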

Indeed, the translation proved simple enough that Schrödinger almost immediately discovered what he believed to be the “true wave equation,” one that eschewed *i*. “Another heavy stone has been rolled away from my heart,” he wrote to Max Planck less than a week after his letter to Lorentz. “It all came out exactly as one would have it.”

But using real numbers to simulate complex quantum mechanics is a clunky and abstract exercise, and Schrödinger recognized that his all-real equation was too cumbersome for daily use. Within a year he was describing wave functions as complex, just as physicists think of them today.

“Anybody wanting to get work done uses the complex description,” said Matthew McKague, a quantum computer scientist at the Queensland University of Technology in Australia.

Yet the real formulation of quantum mechanics has lingered as evidence that the complex version is merely optional. Teams including Vértesi and McKague, for instance, showed in 2008 and 2009 that — without an *i* in sight — they could perfectly predict the outcome of a famous quantum physics experiment known as the Bell test.

The new research, which was posted on the scientific preprint server arxiv.org in January, finds that those earlier Bell test proposals just didn’t go far enough to break the real-number version of quantum physics. It proposes a more intricate Bell experiment that seems to demand complex numbers.

The earlier research led people to conclude that “in quantum theory complex numbers are only convenient, but not necessary,” wrote the authors, who include Marc-Olivier Renou of the Institute of Photonic Sciences in Spain and Nicolas Gisin of the University of Geneva. “Here we prove this conclusion wrong.”

The group declined to discuss their paper publicly because it is still under peer review.

The Bell test demonstrates that pairs of far-apart particles can share information in a single “entangled” state. If a quarter in Maine could become entangled with one in Oregon, for instance, repeated tosses would show that whenever one coin landed on heads, its distant partner would, bizarrely, show tails. Similarly, in the standard Bell test experiment, entangled particles are sent to two physicists, nicknamed Alice and Bob. They measure the particles and, upon comparing measurements, find that the results are correlated in a way that can’t be explained unless information is shared between the particles.

The upgraded experiment adds a second source of particle pairs. One pair goes to Alice and Bob. The second pair, originating from a different place, goes to Bob and a third party, Charlie. In quantum mechanics with complex numbers, the particles Alice and Charlie receive don’t need to be entangled with each other.

No real-number description, however, can replicate the pattern of correlations that the three physicists will measure. The new paper shows that treating the system as real requires introducing extra information that usually resides in the imaginary part of the wave function. Alice’s, Bob’s, and Charlie’s particles must all share this information in order to reproduce the same correlations as standard quantum mechanics. And the only way to accommodate this sharing is for all of their particles to be entangled with one another.

In the previous incarnations of the Bell test, Alice’s and Bob’s particles came from a single source, so the extra information they had to carry in the real-number description wasn’t a problem. But in the two-source Bell test, where Alice’s and Charlie’s particles come from independent sources, the fictitious three-party entanglement doesn’t make physical sense.

Even without recruiting an Alice, a Bob and a Charlie to actually perform the experiment that the new paper imagines, most researchers feel extremely confident that standard quantum mechanics is correct and that the experiment would therefore find the expected correlations. If so, then real numbers alone cannot fully describe nature.

“The paper in fact establishes that there are genuine, complex quantum systems,” said Valter Moretti, a mathematical physicist at the University of Trento in Italy. “This result is quite unexpected to me.”

Nevertheless, odds are that the experiment will happen someday. It wouldn’t be simple, but no technical obstacles exist. And a deeper understanding of the behavior of more complicated quantum networks will grow only more relevant as researchers continue to link numerous Alices, Bobs and Charlies over emerging quantum internets.

“We therefore trust that a disproof of real quantum physics will arrive in a near future,” the authors wrote.

I have a confession. As a producer working on the new season of the *Joy of x* podcast, sometimes I find myself editing episodes and thinking: Who let me in the room? It’s almost as if I’ve wandered into some hotel bar, and the last open seat is next to two folks deep in conversation. I’m listening in as they talk with intensity, passion. But they aren’t a romantic couple — they are sharing intimate details of their professional lives. Tales full of exploration and discovery. Of course, this scene is playing out in my imagination; the conversation I’m eavesdropping on is only in my headphones.

Our host, the mathematician and author Steven Strogatz, has a voracious intellectual curiosity, but it’s his warm and empathetic nature that makes listening to these interviews such a rewarding, even moving experience.

Every episode, I find myself learning something profound. Like when one of this season’s guests, the chemical nanoengineer Sharon Glotzer, talked about how, under the right circumstances, an “amorphous blob” of material can spontaneously change into “something with exquisite order to it.” Her professional life has been a quest to understand the interplay of forces in matter that create and destroy order.

Or when our guest Frank Wilczek, the Nobel Prize-winning physicist, marvels at the puzzles of subatomic forces: “What holds the nucleus together when electromagnetism wants to blow it apart?” he asked. These are explorers who have charted some of the universe’s great unknowns, and I’m riding shotgun with them as they take Steve on a tour through key moments in their journeys.

But right now, my second grader is rolling on the floor, screaming. Her teacher is shouting her name, trying to get her to return to the class Zoom, to unmute herself, to turn on her camera. This is a daily occurrence in our house.

Yours too, perhaps. We are not alone. Life during the pandemic has not only stressed and isolated us, it has heaped new distractions on us, making it harder to carve out time and mental space to reflect and learn. Kids aren’t the only ones in danger of falling behind intellectually.

That’s part of why I take such comfort in Steve’s intimate and lively conversations with his guests. His genuine curiosity about them and their work — and their sincerity in sharing about it — creates a calm but invigorating space for effortless learning. For those of us who love science, there is something deeply comforting about pressing play on *The Joy of x.*

But there’s more to it than that. We’ve been witnessing a global debate about the value and process of science playing out in our culture. The collective discussion of the SARS-CoV-2 virus and how it is transmitted has provided evidence for another crisis: The public’s scientific literacy, and its ability to engage with a crisis like this one, are fragile. For the last year, our collective hopes for ending the pandemic have been pinned on vaccine development, even as 31 million people follow anti-vaccine groups.

It strikes me that there’s never been a better time to get to know scientists — who they are, what makes them tick. Beyond teaching me about science, *The Joy of x*, like nothing else I’ve ever heard, makes the argument that a life in science is a life well spent. Steve Strogatz and his 12 guests this season are inspirational. The future needs scientists — of all kinds. Perhaps more than ever, we need people willing to dedicate their lives to this ambitious, optimistic, desperately important pursuit of truth.

*You can subscribe to the podcast and listen to episodes from the first season on the* Quanta Magazine *website, Apple Podcasts, Google Podcasts, Spotify or wherever you get your podcasts. The first full episode is being posted today, and new episodes will premiere every Tuesday.*

In the mid-1980s, the mathematician Jean Bourgain thought up a simple question about high-dimensional shapes. And then he remained stuck on it for the rest of his life.

Bourgain, who died in 2018, was one of the preeminent mathematicians of the modern era. A winner of the Fields Medal, mathematics’ highest honor, he was known as a problem-solver extraordinaire — the kind of person you might talk to about a problem you’d been working on for months, only to have him solve it on the spot. Yet Bourgain could not answer his own question about high-dimensional shapes.

“I was told once by Jean that he had spent more time on this problem and had dedicated more efforts to it than to any other problem he had ever worked on,” wrote Vitali Milman of Tel Aviv University earlier this year.

In the years since Bourgain formulated his problem, it has become what Milman and Bo’az Klartag of the Weizmann Institute of Science in Israel called the “opening gate” to understanding a wide range of questions about high-dimensional convex shapes — shapes that always contain the entire line segment connecting any two of their points. High-dimensional convex shapes are a central object of study not just for pure mathematicians but also for statisticians, machine learning researchers and other computer scientists working with high-dimensional data sets.

Bourgain’s problem boils down to the following simple question: Suppose a convex shape has volume 1 in your favorite choice of units. If you consider all the ways to slice through the shape using a flat plane one dimension lower, could these slices all have extremely low area, or must at least one be fairly substantial?

Bourgain guessed that some of these lower-dimensional slices must have substantial area. In particular, he conjectured that there is some universal constant, independent of the dimension, such that every shape contains at least one slice with area greater than this constant.

At first glance, Bourgain’s conjecture might seem obviously true. After all, if the shape were extremely skinny in every direction, how could it have enough substance to form one unit of volume?

“Come on — how hard can it be?” Ronen Eldan, a high-dimensional geometer at the Weizmann Institute, remembers thinking when he first heard of the problem. “And then, the more you think about it, the more you understand how delicate it really is.”

The difficulty is that high-dimensional shapes often behave in ways that defy our human, low-dimensional intuition. For example, in dimensions 10 and up, it is possible to build a cube and a ball such that the cube has larger volume than the ball, but every slice through the center of the cube has smaller area than the corresponding slice through the center of the ball.
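This can be checked with a short computation. The volume of a d-dimensional ball of radius r is π^{d/2} r^{d} / Γ(d/2 + 1), and a theorem of Keith Ball says the largest central slice of the unit cube has area at most √2 in every dimension. A sketch in dimension 10 (the target volume of 0.999 is an arbitrary illustrative choice):

```python
import math

def ball_volume(d, r):
    # Volume of the d-dimensional Euclidean ball of radius r.
    return math.pi ** (d / 2) * r ** d / math.gamma(d / 2 + 1)

d = 10
# Radius at which the 10-dimensional ball has volume 0.999 --
# slightly less than the unit cube's volume of 1.
r = (0.999 * math.gamma(d / 2 + 1) / math.pi ** (d / 2)) ** (1 / d)

# Every central slice of the ball is a 9-dimensional ball of radius r.
ball_slice = ball_volume(d - 1, r)
max_cube_slice = math.sqrt(2)  # Ball's theorem: the cube's largest slice

print(ball_slice > max_cube_slice)  # True: the ball's slices beat all cube slices
```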

“The beauty of high-dimensional geometry is exactly that it doesn’t look anything like dimension two,” said Sébastien Bubeck of Microsoft Research in Redmond, Washington.

Bourgain’s slicing conjecture is a vote for high-dimensional tameness — a guess that high-dimensional shapes conform to our intuition in at least some ways.

Now, Bourgain’s guess has been vindicated: A paper posted online in November has proved, not quite Bourgain’s full conjecture, but a version so close that it puts a strict limit on high-dimensional weirdness, for all practical purposes.

Bourgain, said Klartag, “would have dreamt” of achieving a result this strong.

The new paper, by Yuansi Chen — a postdoctoral researcher at the Swiss Federal Institute of Technology Zurich who is about to join the math faculty at Duke University — gets at the Bourgain slicing problem via an even more far-reaching question about convex geometry called the KLS conjecture. This 25-year-old conjecture, which asks about the best way to slice a shape into two equal portions, implies Bourgain’s conjecture. What’s more, the KLS conjecture lies at the heart of many questions in statistics and computer science, such as how long it will take for heat to diffuse through a convex shape, or how many steps a random walker must take from a starting point before reaching a truly random location.

Random walks are pretty much the only effective methods available for sampling random points, Eldan said. And for a wide range of different computer science problems, he said, “the most important subroutine in the algorithm is, you want to sample a random point.”
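The kind of random walk at issue can be sketched in a few lines. The “ball walk” below is one of the simplest variants (a minimal illustration; the samplers used in practice, whose mixing times the KLS bounds govern, are considerably more refined):

```python
import math
import random

def ball_walk(inside, x, step=0.1, n_steps=2000):
    """From x, repeatedly propose a uniform random point in a small ball
    around the current position, moving there only if it stays inside
    the convex body described by the indicator function `inside`."""
    d = len(x)
    for _ in range(n_steps):
        g = [random.gauss(0, 1) for _ in range(d)]       # random direction
        norm = math.sqrt(sum(v * v for v in g))
        radius = step * random.random() ** (1 / d)       # uniform in the ball
        y = [xi + radius * gi / norm for xi, gi in zip(x, g)]
        if inside(y):
            x = y
    return x

# Example: a near-uniform random point from the 5-dimensional unit cube.
cube = lambda p: all(0 <= v <= 1 for v in p)
point = ball_walk(cube, [0.5] * 5)
```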

Chen’s new result gives instant improvements to the known running times of algorithms for tasks such as computing the volume of a convex shape or sampling from an assortment of machine learning models. Chen’s work doesn’t quite prove the full KLS conjecture. But when it comes to computer science applications, Bubeck said, “you don’t need the full conjecture to get the full impact.”

Chen is not a convex geometer by training — instead, he is a statistician who became interested in the KLS conjecture because he wanted to get a handle on random sampling. “No one knows Yuansi Chen in our community,” Eldan said. “It’s pretty cool that you have this guy coming out of nowhere, solving one of [our] most important problems.”

Like the Bourgain slicing problem, the KLS conjecture (named after its creators, Ravi Kannan, László Lovász and Miklós Simonovits) asks a simple question: Suppose you want to cut a convex shape — maybe an apple without dimples — into two equal-size portions, and you’re planning to put one aside for later. The exposed surface is going to turn brown and unappetizing, so you want to make it as small as possible. Among all the possible cuts, which will minimize the exposed surface?

It’s not too hard to answer this question, at least approximately, if you are limited to straight cuts. But if you’re allowed to make curved cuts, all bets are off. In dimension two, mathematicians know that the best cut will always be a straight line or an arc of a circle. But in dimension three, the best cut is understood only for a few simple shapes, and for higher-dimensional shapes, mathematicians usually don’t even have a hope of finding the optimal cut.

Since the optimal curved cut is so hard to pin down, Kannan, Lovász and Simonovits wondered how much worse things would be if you only allowed straight cuts. In 1995, they conjectured that this restriction will never make things too much worse: There is some universal constant such that the surface area of the best flat cut is at most that constant times the surface area of the best overall cut.

“It was a brilliant insight,” said Santosh Vempala of the Georgia Institute of Technology, even though Kannan, Lovász and Simonovits couldn’t prove their conjecture. Instead of establishing a universal constant, the best they could do was establish a factor that works out to roughly the square root of the dimension the shape lives in. So for a 100-dimensional convex shape, for example, they knew that the best straight cut will expose at most about 10 times as much surface area as the very best cut.

Exposing 10 times as much surface area might not sound so great. But since many attributes of high-dimensional shapes grow exponentially as the dimension grows, a square root’s worth of growth is modest by comparison. “It’s already an indication that there is a nice phenomenon in high dimensions,” Bubeck said. “Things are not as crazy as they could be.”

But researchers were eager to improve on this result, and not just from academic interest: They knew that the KLS factor encapsulates a world of information about how random processes behave within a convex shape. That’s because the smaller the best cut is, the harder it is for a random process to spread around the shape quickly.

Think of a dumbbell, with two massive balls connected by a narrow bridge. The fact that you can divide it into two equal pieces with just a small cut precisely captures the notion that the bridge is a bottleneck. A heat source or a random walker in one of the two balls will usually take a long time to reach the other ball, since it has to find its way through the bottleneck.

Of course, a dumbbell is not convex. A convex shape cannot have a disproportionately small flat cut like the one in the dumbbell, but perhaps it could have a disproportionately small curved cut. The KLS conjecture essentially asks if a high-dimensional convex shape can contain a hidden, twisty sort of dumbbell that slows down random mixing.

Kannan, Lovász and Simonovits’ square root bound put a limit on how extreme these hidden dumbbells could be. And in 2012, Eldan lowered their bound to the cube root of the dimension by introducing a technique called stochastic localization that, roughly speaking, envisions tilting the convex shape and sliding its points around in one direction after another until they have piled up in a particular region. It’s easy to prove the KLS conjecture for a highly concentrated mass, which is about as different from a dumbbell as it gets. By showing that the tilting process hadn’t changed things too much, Eldan was able to calculate a KLS bound for the original shape. “It’s a very, very beautiful process,” Bubeck said.

A few years later, Vempala and Yin-Tat Lee of the University of Washington refined Eldan’s stochastic localization to lower the KLS factor even further, to the fourth root of the dimension. And for a brief, glorious moment, they thought they had done something much stronger. If the dimension is called *d*, then the square root is *d*^{1/2}, the cube root is *d*^{1/3}, and the fourth root is *d*^{1/4}. By introducing a new technique called bootstrapping, Lee and Vempala thought that they could lower the KLS bound all the way down to *d* raised to a power of 0 plus a little fudge factor. Since *d*^{0} always equals 1, Lee and Vempala’s bound was more or less a constant.
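To see what these exponents mean in practice, compare the bounds in a space of a million dimensions (the dimension here is an arbitrary choice for illustration):

```python
d = 1_000_000
print(round(d ** (1 / 2)))  # 1000: the original Kannan-Lovasz-Simonovits bound
print(round(d ** (1 / 3)))  # 100:  Eldan's cube-root improvement
print(round(d ** (1 / 4)))  # 32:   the Lee-Vempala fourth-root bound
print(d ** 0)               # 1:    a constant, the conjectured ideal
```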

“This is amazing,” Bubeck remembers saying when Lee told him of the result. “In my feeling it was a big deal, worthy of the highest praise,” he said.

Lee and Vempala posted their paper online. But within a few days, Klartag found a gap that undermined their proof of the *d*^{0} bound. Lee and Vempala quickly posted a revised draft that only claimed a *d*^{1/4} bound. And for several years, researchers thought that perhaps this was the end of the KLS story.

That’s because Eldan and Klartag had previously shown that any KLS bound instantly translates into a Bourgain slicing bound — for instance, Lee and Vempala’s *d*^{1/4} bound means that in the Bourgain slicing problem, there is always a slice whose surface area is at least about 1/*d*^{1/4}. But mathematicians already knew several ways to prove a 1/*d*^{1/4} bound for Bourgain slicing. So maybe, they thought, Lee and Vempala had reached the natural endpoint of the KLS question.

“I was starting to feel, ‘Yeah, maybe this is the truth,’” Eldan said.

“There was the feeling that strong people worked on that method, and whatever could be exploited was exploited,” Klartag said. But that was before Yuansi Chen came along.

When Lee and Vempala posted their revised paper, they preserved in it their ideas about how a proof of the roughly *d*^{0} bound might work. Only one piece of their proof, they explained, had fallen through.

Their paper caught the eye of Chen, then a statistics graduate student at the University of California, Berkeley, who was studying the mixing rates of random sampling methods. Random sampling is a key ingredient in many types of statistical inference, such as Bayesian statistics, a framework for updating beliefs based on new evidence. “You deal with this [random] sampling every day if you want to do Bayesian statistics,” Chen said.

Lee and Vempala’s paper introduced Chen to the idea of stochastic localization. “I thought it was one of the most beautiful proof techniques I had seen for a while,” Chen said.

Chen dived into the literature and spent several weeks trying to fill the gap in Lee and Vempala’s proof, but to no avail. Periodically over the next few years, some idea for how to modify stochastic localization would pop into his head and he would ponder it for a few hours before giving up. Then finally, one of his ideas bore fruit: There was a way, he realized, not to prove the missing statement in Lee and Vempala’s proof, but to get around the need for such a strong statement at all.

Through what Chen called “some little tricks” but Vempala called an “elegant and important new insight,” Chen figured out how to make Lee and Vempala’s bootstrapping method work. This method takes a recursive approach to lowering the KLS bound, by showing that if you can make the bound fairly small, then there’s a way to make it even smaller. Applied repeatedly, this bootstrapping approach achieves the approximately constant bound for the KLS conjecture, and also for the Bourgain slicing problem.

When Chen posted his work online, “I immediately basically stopped everything I was doing and checked this paper,” Klartag said. Researchers were wary, given the previous incorrect proof and the fact that most of them had never heard of Chen. But his contribution turned out to be easy to verify. “This paper is 100% correct,” Klartag said. “There’s no question about it.”

Chen’s result means that the best 50-50 cut of a convex shape isn’t that much smaller than the best flat cut — in other words, high-dimensional convex shapes don’t contain hidden dumbbells with very narrow bridges. From a pure math perspective, “it is a big deal, because it was such a gaping hole in our understanding,” Bubeck said.

And from a practical standpoint, it means that a random walk is guaranteed to mix through a convex shape much faster than researchers could previously prove. Among other things, this understanding will help computer scientists to prioritize among different random sampling techniques — to figure out when the most basic random walk is best, and when a more sophisticated but computationally expensive algorithm will perform better.

In the end, considering how many people tried and failed to prove the *d*^{0} bound, the proof was surprisingly simple, Vempala said. Bourgain, he speculated, would probably have thought, “How did I miss that?”

Bourgain would have been thrilled by this development, mathematicians agree. Just a few months before his death in 2018, he contacted Milman, inquiring if there had been any progress. “He wanted to know the answer before he would leave,” Milman wrote.

Viruses evolve. It’s what they do. That’s especially true for a pandemic virus like SARS-CoV-2, the one behind COVID-19. When a population lacks immunity and transmission is extensive, we expect viral mutations to appear frequently simply due to the number of viruses replicating in a short period of time. And the growing presence of immune individuals means that the viruses that can still transmit in these partially immune populations will be favored over the original version. Sure enough, that’s what we’ve been seeing, as news reports warn of the appearance of novel variants (viruses with several mutations, making them distinct from their ancestors) and strains (variants that are confirmed to behave differently from the original).

To be clear, mutations are random errors that occur when a virus reproduces. In the case of SARS-CoV-2, which has an RNA genome based on adenine, cytosine, guanine and uracil, sometimes mistakes happen. Maybe an adenine gets swapped with a uracil (a substitution mutation that could also occur with any of the other bases), or perhaps one or more bases get inserted or deleted. If a mutation actually changes the protein encoded by that part of the RNA sequence, it’s referred to as a non-synonymous mutation. Mutations that do not result in a protein change are referred to as synonymous, or silent, mutations.
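
As a toy illustration (not from the article), a short Python sketch can make the synonymous/non-synonymous distinction concrete. The codon table here is a small excerpt of the standard genetic code, covering only aspartate and glycine:

```python
# Excerpt of the standard genetic code (RNA alphabet); only the codons
# needed for this example are included.
CODON_TABLE = {
    "GAU": "D", "GAC": "D",                          # aspartate
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
}

def classify(original_codon, mutated_codon):
    """Label a single-codon change as synonymous or non-synonymous."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutated_codon]
    return "synonymous" if before == after else "non-synonymous"

print(classify("GAU", "GAC"))  # synonymous: both codons encode aspartate
print(classify("GAU", "GGU"))  # non-synonymous: aspartate becomes glycine
```

The second call mirrors the kind of change behind D614G: a single-base substitution that swaps aspartate for glycine.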

Luckily, the mutation rate of coronaviruses generally is relatively slow, due to a proofreading ability in the virus that allows for some correction of replication mistakes. Typically SARS-CoV-2 will accumulate only two mutations per month across its genome’s 30,000 bases; that’s half the rate of an influenza virus, and a quarter of the rate of HIV. But with more than 100 million people infected to date, non-synonymous mutations are inevitable. The bigger issue is determining which mutations actually provide the virus enough of an advantage to increase its spread through the population.
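
Using the article’s round numbers, a back-of-the-envelope calculation converts that rate into the per-site figure virologists usually quote:

```python
genome_length = 30_000      # approximate SARS-CoV-2 genome size, in bases
mutations_per_month = 2     # accumulation rate cited above

# Convert to the conventional units: substitutions per site per year.
per_site_per_year = mutations_per_month * 12 / genome_length
print(per_site_per_year)    # 0.0008
```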

Fortunately, at this point we have the knowledge to answer some of the most pressing questions.

The first mutation we learned about was the D614G mutation, first reported in March 2020. When a mutation causes a change in the protein sequence, its name refers to the ancestral amino acid, its location and then the new amino acid. This mutation changed the amino acid aspartate (abbreviated as D) at the 614th position in the virus spike protein into glycine (G). Because the spike protein enables the virus to bind to host cells, the change is significant; mutations here could help it to bind more efficiently to the host receptor (called ACE2).

However, it’s not clear yet if that’s the case with D614G. The authors of a paper describing the mutation suggested that the rapid spread of variants carrying this mutation, combined with in vitro analyses of viral behavior and clinical data involving people infected with it, meant that D614G provided a selective advantage to these variants, and the mutation was therefore spreading. Others were not convinced, suggesting an alternative rationale for the dominance of the D614G mutation: the shift in the geographic focus of the epidemic, from China to Europe (especially Italy) to the U.S. In China, the original version of the virus, with aspartate (D) in the 614th position, was most prevalent; in Europe, and subsequently in the U.S., it was the new one, with glycine. With additional exported cases including the D614G mutation, this variant may have become the major lineage due merely to luck or the “founder effect” — meaning that the lineage dominated simply because it was the first one to populate that area — rather than a selective advantage. We’re still not sure.
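
The naming convention described above is mechanical enough to parse. A minimal sketch (the function name and regex are ours, not from any official tool):

```python
import re

def parse_mutation(name):
    """Split a mutation name like 'D614G' into
    (ancestral amino acid, 1-based position, new amino acid)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a point-mutation name: {name!r}")
    ancestral, position, new = match.groups()
    return ancestral, int(position), new

print(parse_mutation("D614G"))  # ('D', 614, 'G')
print(parse_mutation("N501Y"))  # ('N', 501, 'Y')
```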

Since September 2020, a number of other SARS-CoV-2 mutations have been identified around the globe. Some of the variants currently circulating in the population do seem to be more evolutionarily fit than their older counterparts, with improved transmission, lethality or both. Now that the virus has spread almost everywhere, when we see new variants overtake a population, it is much more likely to be due to selection — improved fitness — than the founder effect. This is supported by the fact that many of the variants show signs of convergent evolution: The viruses have independently landed on the same mutations that make them more transmissible, giving them an evolutionary advantage over preexisting strains.

The most well-known is probably variant B.1.1.7, first detected in the U.K. in September of 2020. Here the name derives from a system called Pango lineages, where A and B represent early lineages, and the numbers after the letter represent branches from those lineages. B.1.1.7 contains 23 mutations that differentiate it from its wild-type ancestor. A study suggested that the variant is 35%-45% more transmissible and that it was likely introduced into the U.S. via international travel at least eight times. While increased transmission — but not lethality — seems to be a hallmark of this variant, one group has reported that B.1.1.7 may also be associated with an increased risk of death.

Meanwhile, in December 2020, another variant dubbed B.1.351 was first identified in South Africa, and soon after a variant called P.1 was found in Manaus, Brazil, during a second surge of infections in that city. (Manaus had already been hard hit in April, and officials thought herd immunity had been reached.) Both of these variants also seem to make the virus easier to catch.

As they all appear to have a transmission advantage over the established lineages, we will likely see these variants continue to spread. Recent work predicted that the B.1.1.7 variant could become the dominant lineage and account for more than half of identified cases in the U.S. by mid-March.

As with D614G, many mutations involve changes to the spike protein. A key mutation in B.1.1.7 is called N501Y, which changes the residue of an amino acid named asparagine (N) to one named tyrosine (Y) at the 501st position along the spike protein. Exactly why this may make the virus more transmissible isn’t yet understood; perhaps it allows for better binding to host cells, higher amounts of the virus in the respiratory system, improved viral replication, a combination of these, or something else entirely. Experiments to figure this out are underway in labs around the globe.

B.1.351 and P.1 have the N501Y mutation and another one called E484K, which replaces glutamic acid (E) with lysine (K) at spike protein position 484. This mutation is especially concerning as it seems to be better at escaping antibody-mediated immunity: It makes it more difficult for the body’s antibodies to bind to the spike protein and thus prevent the virus from entering cells.

In addition to these specific changes, the B.1.351 and P.1 lineages also have approximately 20 additional unique mutations each. If both variants are indeed better than older viruses at escaping immunity, this could explain some of the second surge in Manaus, and it may leave previously infected individuals at risk of reinfection by these variants. Indeed, several case reports in Brazil have already documented such reinfections with variants containing the E484K mutation.

Will the vaccines still work against these new variants? Yes, but perhaps not quite as effectively.

In a pair of recent manuscripts, the developers of the Moderna and Pfizer-BioNTech vaccines examined whether antibodies from vaccinated individuals would neutralize (prevent from replicating) viruses containing mutated forms of the SARS-CoV-2 spike protein in cell culture. The antibodies functioned well against a virus carrying the B.1.1.7 mutations, but neutralization was reduced when the B.1.351 mutations were introduced. However, both companies expect the vaccines to work well even against this variant; the lower level of protective antibodies is still considered enough to prevent infection. The Novavax and Johnson & Johnson vaccines, not yet available in the U.S., also appeared to be less effective against the B.1.351 and P.1 variants in trials.

Boosters tailored to new variants may be required in the future, and many are already in development.

How did these variants arise? We’re not sure. For the B.1.1.7 strain in the U.K., there don’t appear to be any clear “intermediate” viral variants to demonstrate that this strain evolved from the prior dominant strains, accumulating mutations slowly over time in a stepwise pattern.

Instead, scientists are beginning to think there may have been a massive evolutionary leap, which could have occurred in a single individual with a lingering infection. A case report from December 2020 describes a SARS-CoV-2 infection in a man who was severely immunocompromised. Over time, scientists found that the population of viruses he harbored underwent “accelerated viral evolution,” likely due to the inability of his immune system to keep the virus in check. When examining the specific mutations, doctors spotted both N501Y and E484K — also part of the B.1.351 and P.1 variants that showed up around the same time, even though the man didn’t have either variant himself.

Now imagine this process happening again and again, around the globe. It only takes one variant replicating in the right person and in the right setting to take off and spread in the population.

Are we tracking the variants in the U.S.? Not as much as we need to, but more than we were. As of February 7, 2021, the U.S. ranked 36th in the world in terms of sequencing our viral isolates, carrying out genomic analyses of only 0.36% of our confirmed cases. For comparison, the U.K. sequences approximately 10% of its cases, and Denmark 50%. The Biden administration has increased sequencing goals dramatically and earmarked additional funds for viral sequencing.

As far as stopping them is concerned, we must continue to do what we’ve been doing all along: wear masks, maintain social distancing, stay home, wash hands. We can now also add getting vaccinated as soon as a vaccine is available to you. This is important even if the variants reduce vaccine effectiveness somewhat, as at least the B.1.351 and P.1 variants seem to do — decreased effectiveness is still better than no effectiveness, and even a vaccine that is less effective at preventing infection can still protect against serious illness.

The key is to provide less tinder for the fire: Reduce susceptible hosts for the variants and stop their replication by following basic public health interventions and getting vaccinated. When the virus chances upon beneficial mutations, it’s as if it won the lottery; as virologist Angela Rasmussen suggests, we need to “stop selling it tickets.”

It often goes unmentioned that protons, the positively charged matter particles at the center of atoms, are part antimatter.

We learn in school that a proton is a bundle of three elementary particles called quarks — two “up” quarks and a “down” quark, whose electric charges (+2/3 and −1/3, respectively) combine to give the proton its charge of +1. But that simplistic picture glosses over a far stranger, as-yet-unresolved story.

In reality, the proton’s interior swirls with a fluctuating number of six kinds of quarks, their oppositely charged antimatter counterparts (antiquarks), and “gluon” particles that bind the others together, morph into them and readily multiply. Somehow, the roiling maelstrom winds up perfectly stable and superficially simple — mimicking, in certain respects, a trio of quarks. “How it all works out, that’s quite frankly something of a miracle,” said Donald Geesaman, a nuclear physicist at Argonne National Laboratory in Illinois.

Thirty years ago, researchers discovered a striking feature of this “proton sea.” Theorists had expected it to contain an even spread of different types of antimatter; instead, down antiquarks seemed to significantly outnumber up antiquarks. Then, a decade later, another group saw hints of puzzling variations in the down-to-up antiquark ratio. But the results were right on the edge of the experiment’s sensitivity.

So, 20 years ago, Geesaman and a colleague, Paul Reimer, embarked on a new experiment to investigate. That experiment, called SeaQuest, has finally finished, and the researchers report their findings today in the journal *Nature.* They measured the proton’s inner antimatter in more detail than ever before, finding that there are, on average, 1.4 down antiquarks for every up antiquark.

The data immediately favors two theoretical models of the proton sea. “This is the first real evidence backing up those models that has come out,” said Reimer.

One is the “pion cloud” model, a popular, decades-old approach that emphasizes the proton’s tendency to emit and reabsorb particles called pions, which belong to a group of particles known as mesons. The other model, the so-called statistical model, treats the proton like a container full of gas.

Planned future experiments will help researchers choose between the two pictures. But whichever model is right, SeaQuest’s hard data about the proton’s inner antimatter will be immediately useful, especially for physicists who smash protons together at nearly light speed in Europe’s Large Hadron Collider. When they know exactly what’s in the colliding objects, they can better pick through the collision debris looking for evidence of new particles or effects. Juan Rojo of VU University Amsterdam, who helps analyze LHC data, said the SeaQuest measurement “could have a big impact” on the search for new physics, which is currently “limited by our knowledge of the proton structure, in particular of its antimatter content.”

For a brief period around half a century ago, physicists thought they had the proton sorted.

In 1964, Murray Gell-Mann and George Zweig independently proposed what became known as the quark model — the idea that protons, neutrons and related rarer particles are bundles of three quarks (as Gell-Mann dubbed them), while pions and other mesons are made of one quark and one antiquark. The scheme made sense of the cacophony of particles spraying from high-energy particle accelerators, since their spectrum of charges could all be constructed out of two- and three-part combos. Then, around 1970, researchers at Stanford’s SLAC accelerator seemed to triumphantly confirm the quark model when they shot high-speed electrons at protons and saw the electrons ricochet off objects inside.

But the picture soon grew murkier. “As we started trying to measure the properties of those three quarks more and more, we discovered that there were some additional things going on,” said Chuck Brown, an 80-year-old member of the SeaQuest team at the Fermi National Accelerator Laboratory who has worked on quark experiments since the 1970s.

Scrutiny of the three quarks’ momentum indicated that their masses accounted for a minor fraction of the proton’s total mass. Furthermore, when SLAC shot faster electrons at protons, researchers saw the electrons ping off of more things inside. The faster the electrons, the shorter their wavelengths, which made them sensitive to more fine-grained features of the proton, as if they’d cranked up the resolution of a microscope. More and more internal particles were revealed, seemingly without limit. There’s no highest resolution “that we know of,” Geesaman said.

The results began to make more sense as physicists worked out the true theory that the quark model only approximates: quantum chromodynamics, or QCD. Formulated in 1973, QCD describes the “strong force,” the strongest force of nature, in which particles called gluons connect bundles of quarks.

QCD predicts the very maelstrom that scattering experiments observed. The complications arise because gluons feel the very force that they carry. (They differ in this way from photons, which carry the simpler electromagnetic force.) This self-dealing creates a quagmire inside the proton, giving gluons free rein to arise, proliferate and split into short-lived quark-antiquark pairs. From afar, these closely spaced, oppositely charged quarks and antiquarks cancel out and go unnoticed. (Only three unbalanced “valence” quarks — two ups and a down — contribute to the proton’s overall charge.) But physicists realized that when they shot in faster electrons, they were hitting the small targets.

Yet the oddities continued.

Mary Alberg, a nuclear physicist at Seattle University, and her co-authors have long argued for the significance of the pion in shaping the identity of the proton.

Self-dealing gluons render the QCD equations generally unsolvable, so physicists couldn’t — and still can’t — calculate the theory’s precise predictions. But they had no reason to think gluons should split more often into one type of quark-antiquark pair — the down type — than the other. “We would expect equal amounts of both to be produced,” said Mary Alberg, a nuclear theorist at Seattle University, explaining the reasoning at the time.

Hence the shock when, in 1991, the New Muon Collaboration in Geneva scattered muons, the heavier siblings of electrons, off of protons and deuterons (consisting of one proton and one neutron), compared the results, and inferred that more down antiquarks than up antiquarks seemed to be splashing around in the proton sea.

Theorists soon came out with a number of possible ways to explain the proton’s asymmetry.

One involves the pion. Since the 1940s, physicists have seen protons and neutrons passing pions back and forth inside atomic nuclei like teammates tossing basketballs to each other, an activity that helps link them together. In mulling over the proton, researchers realized that it can also toss a basketball to itself — that is, it can briefly emit and reabsorb a positively charged pion, turning into a neutron in the meantime. “If you’re doing an experiment and you think you’re looking at a proton, you’re fooling yourself, because some of the time that proton is going to fluctuate into this neutron-pion pair,” said Alberg.

Specifically, the proton morphs into a neutron and a pion made of one up quark and one down antiquark. Because this phantasmal pion has a down antiquark (a pion containing an up antiquark can’t materialize as easily), theorists such as Alberg, Gerald Miller and Tony Thomas argued that the pion cloud idea explains the proton’s measured down antiquark surplus.

Several other arguments emerged as well. Claude Bourrely and collaborators in France developed the statistical model, which treats the proton’s internal particles as if they’re gas molecules in a room, whipping about at a distribution of speeds that depend on whether they possess integer or half-integer amounts of angular momentum. When tuned to fit data from numerous scattering experiments, the model divined a down-antiquark excess.

The models did not make identical predictions. Much of the proton’s total mass comes from the energy of individual particles that burst in and out of the proton sea, and these particles carry a range of energies. Models made different predictions for how the ratio of down and up antiquarks should change as you count antiquarks that carry more energy. Physicists measure a related quantity called the antiquark’s momentum fraction.

When the “NuSea” experiment at Fermilab measured the down-to-up ratio as a function of antiquark momentum in 1999, their answer “just lit everybody up,” Alberg recalled. The data suggested that among antiquarks with ample momentum — so much, in fact, that they were right at the edge of the apparatus’s range of detection — up antiquarks suddenly became more prevalent than downs. “Every theorist was saying, ‘Wait a minute,’” said Alberg. “Why, when those antiquarks get a bigger share of the momentum, should this curve start to turn over?”

As theorists scratched their heads, Geesaman and Reimer, who worked on NuSea and knew that data at the very edge of an instrument’s range isn’t always trustworthy, set out to build an experiment that could comfortably explore a larger antiquark momentum range. They called it SeaQuest.

Long on questions about the proton but short on cash, they started assembling the experiment out of used parts. “Our motto was: Reduce, reuse, recycle,” Reimer said.

They acquired some old scintillators from a lab in Hamburg, leftover particle detectors from Los Alamos National Laboratory, and radiation-blocking iron slabs first used in a cyclotron at Columbia University in the 1950s. They could repurpose NuSea’s room-size magnet, and they could run their new experiment off of Fermilab’s existing proton accelerator. The Frankenstein assemblage was not without its charms. The beeper indicating when protons were flowing into their apparatus dated back five decades, said Brown, who helped find all the pieces. “When it beeps, it gives you a warm feeling in your tummy.”

Gradually they got it working. In the experiment, protons strike two targets: a vial of hydrogen, which is essentially protons, and a vial of deuterium — atoms with one proton and one neutron in the nucleus.

When a proton hits either target, one of its valence quarks sometimes annihilates with one of the antiquarks in the target proton or neutron. “When annihilation occurs, it has a unique signature,” Reimer said, yielding a muon and an antimuon. These particles, along with other “junk” produced in the collision, then encounter those old iron slabs. “The muons can go through; everything else stops,” he said. By detecting the muons on the other side and reconstructing their original paths and speeds, “you can work backwards to work out what momentum fraction the antiquarks carry.”

Because protons and neutrons mirror each other — each has up-type particles in place of the other’s down-type particles, and vice versa — comparing the data from the two vials directly indicates the ratio of down antiquarks to up antiquarks in the proton — directly, that is, after 20 years of work.

In 2019, Alberg and Miller calculated what SeaQuest should observe based on the pion cloud idea. Their prediction matches the new SeaQuest data well.

The new data — which shows a gradually rising, then plateauing, down-to-up ratio, not a sudden reversal — also agrees with Bourrely and company’s more flexible statistical model. Yet Miller calls this rival model “descriptive, rather than predictive,” since it’s tuned to fit data rather than to identify a physical mechanism behind the down antiquark excess. By contrast, “the thing I’m really proud of in our calculation is that it was a true prediction,” Alberg said. “We didn’t dial any parameters.”

In an email, Bourrely argued that “the statistical model is more powerful than that of Alberg and Miller,” since it accounts for scattering experiments in which particles both are and aren’t polarized. Miller vehemently disagreed, noting that pion clouds explain not only the proton’s antimatter content but various particles’ magnetic moments, charge distributions and decay times, as well as the “binding, and therefore existence, of all nuclei.” He added that the pion mechanism is “important in the broad sense of why do nuclei exist, why do we exist.”

In the ultimate quest to understand the proton, the deciding factor might be its spin, or intrinsic angular momentum. A muon scattering experiment in the late 1980s showed that the spins of the proton’s three valence quarks account for no more than 30% of the proton’s total spin. The “proton spin crisis” is: What contributes the other 70%? Once again, said Brown, the Fermilab old-timer, “something else must be going on.”

At Fermilab, and eventually at Brookhaven National Laboratory’s planned Electron-Ion Collider, experimenters will probe the spin of the proton sea. Already Alberg and Miller are working on calculations of the full “meson cloud” surrounding protons, which includes, along with pions, rarer “rho mesons.” Pions don’t possess spin, but rho mesons do, so they must contribute to the overall spin of the proton in a way Alberg and Miller hope to determine.

Fermilab’s SpinQuest experiment, involving many of the same people and parts as SeaQuest, is “almost ready to go,” Brown said. “With luck we’ll take data this spring; it will depend” — at least, partly — “on the progress of the vaccine against the virus. It’s sort of amusing that a question this deep and obscure inside the nucleus is depending on the response of this country to the COVID virus. We’re all interconnected, aren’t we?”

To climate scientists, clouds are powerful, pillowy paradoxes: They can simultaneously reflect away the sun’s heat but also trap it in the atmosphere; they can be products of warming temperatures but can also amplify their effects. Now, while studying the atmospheric chemistry that produces clouds, researchers have uncovered an unexpectedly potent natural process that seeds their growth. They further suggest that, as the Earth continues to warm from rising levels of greenhouse gases, this process could be a major new mechanism for accelerating the loss of sea ice at the poles — one that no global climate model currently incorporates.

This discovery emerged from studies of aerosols, the tiny particles suspended in air onto which water vapor condenses to form clouds. As described this month in a paper in *Science*, researchers have identified a powerful overlooked source of cloud-making aerosols in pristine, remote environments: iodine.

The full climate impact of this mechanism still needs to be assessed carefully, but tiny modifications in the behavior of aerosols, which are treated as an input in climate models, can have huge consequences, according to Andrew Gettelman, a senior scientist at the National Center for Atmospheric Research (NCAR) who helps run the organization’s climate models and who was not involved in the study. And one consequence “will definitely be to accelerate melting in the Arctic region,” said Jasper Kirkby, an experimental physicist at CERN who leads the Cosmics Leaving Outdoor Droplets (CLOUD) experiment and a coauthor of the new study.

Jasper Kirkby, the project leader, sits inside the CLOUD chamber at CERN in 2009, when the experiment was launched to understand the influence of galactic cosmic rays on aerosols, clouds and the climate.

The results could also help scientists understand how much the planet will warm on average when carbon dioxide levels double compared with pre-industrial levels. For decades, estimates have put this number, called the equilibrium climate sensitivity, between 1.5 and 4.5 degrees Celsius (2.7 to 8.1 degrees Fahrenheit) of warming, a range of uncertainty that has remained stubbornly wide. If Earth were no more complicated than a billiard ball flying through space, calculating this number would be easy: just under 1 degree C, Kirkby said. But that calculation doesn’t account for amplifying feedback loops from natural systems that introduce tremendous uncertainty into climate models.

Aerosols’ overall role in climate sensitivity remains unclear; estimates in the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report suggest a moderate cooling effect, but the error bars range from a net warming effect to a more significant cooling effect. Clouds generally cool the planet, as the white tops of the clouds reflect sunlight into space. But in polar regions, snowpack has a similar albedo, or reflectivity, as cloud tops, so an increase in clouds would reflect little additional sunlight. Instead, it would trap longwave radiation from the ground, creating a net warming effect.

Now atmospheric scientists can try to confirm whether what they observed in the CLOUD chamber occurs in nature. “What they’ve accomplished gives us a target to shoot for in the atmosphere, so now we know what instruments to take on our aircraft and what molecules to look for to see that these processes are actually occurring in the atmosphere,” Brock said.

To be sure, while these findings are a step in the right direction, Gettelman said, there are many other factors that remain large sources of uncertainty in global climate models, such as the structure and role of ice in cloud formation. In 2019, NCAR’s model projected a climate sensitivity well above IPCC’s average upper bound and 32% higher than its previous estimate — a warming of 5.3 degrees C (10.1 degrees F) if atmospheric carbon dioxide is doubled — mostly as a result of the way that clouds and their interactions with aerosols are represented in their new model. “But we fix one problem and reveal another one,” Gettelman said.

Brock remains hopeful that future research into new particle formation will help to chip away at the uncertainty in climate sensitivity. “I think we’re gaining an appreciation for the complexity of these new particle sources,” he said.

Joseph Silverman remembers when he began connecting the dots that would ultimately lead to a new branch of mathematics: April 25, 1992, at a conference at Union College in Schenectady, New York.

It happened by accident while he was at a talk by the decorated mathematician John Milnor. Milnor’s subject was a field called complex dynamics, which Silverman knew little about. But as Milnor introduced some basic ideas, Silverman started to see a striking resemblance to number theory, the field in which he was an expert.

“If you just change a couple of the words, there’s an analogous sort of problem,” he remembers thinking to himself.

Silverman, a mathematician at Brown University, left the room inspired. He asked Milnor some follow-up questions over breakfast the next day and then set to work pursuing the analogy. His goal was to create a dictionary that would translate between dynamical systems and number theory.

At first glance, the two look like unrelated branches of mathematics. But Silverman recognized that they complement each other in a particular way. While number theory looks for patterns in sequences of numbers, dynamical systems actually produce sequences of numbers — like the sequence that defines a planet’s position in space at regular intervals of time. The two merge when mathematicians look for number-theoretic patterns hidden in those sequences.
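
As a toy illustration (our example, not Silverman’s), iterating even a simple polynomial map produces exactly the kind of number sequence that arithmetic dynamics examines for hidden patterns:

```python
def orbit(f, start, steps):
    """Return the sequence start, f(start), f(f(start)), ..."""
    values = [start]
    for _ in range(steps):
        values.append(f(values[-1]))
    return values

# Iterate the map x -> x^2 - 1 from the integer starting point 2.
print(orbit(lambda x: x * x - 1, 2, 4))  # [2, 3, 8, 63, 3968]
```

A number theorist might then ask, for instance, which terms of such an orbit are prime, or which starting points make the orbit eventually repeat.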

In the decades since Silverman attended Milnor’s talk, mathematicians have dramatically expanded the connections between the two branches of math and built the foundations of an entirely new field: arithmetic dynamics.

The field’s reach continues to grow. In a paper published in *Annals of Mathematics* last year, a trio of mathematicians extended the analogy to one of the most ambitious and unexpected places yet. In doing so, they resolved part of a decades-old problem in number theory that didn’t previously seem to have any clear connection to dynamical systems at all.

The new proof quantifies the number of times that a type of curve can intersect special points in a surrounding space. Number theorists previously wondered if there is a cap on just how many intersections there can be. The authors of the proof used arithmetic dynamics to prove there is an upper limit for a particular collection of curves.

“We wanted to understand the number theory. We didn’t care if there was a dynamical system, but since there was one, we were able to use it as a tool,” said Laura DeMarco, a mathematician at Harvard University and co-author of the paper along with Holly Krieger of the University of Cambridge and Hexi Ye of Zhejiang University.

In May 2010, a group of mathematicians gathered at a small research institute in Barbados where they spent sunny days discussing math just a few dozen feet from the beach. Even the lecture facilities — with no walls and simple wooden benches — left them as close to nature as possible.

“One evening when it was raining you couldn’t even hear people, because of the rain on the metal roof,” said Silverman.

The conference was a pivotal moment in the development of arithmetic dynamics. It brought together experts from number theory, like Silverman, and dynamical systems, like DeMarco and Krieger. Their goal was to expand the types of problems that could be addressed by combining the two perspectives.

Their starting point was one of the central objects in number theory: elliptic curves. Just like circles and lines, elliptic curves are both numbers and shapes. They are pairs of numbers, *x* and *y*, that serve as solutions to an algebraic equation like *y*^{2} = *x*^{3} − 2*x*. The graph of those solutions creates a geometric shape that looks vaguely like a vertical line extruding a bubble.
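Since an elliptic curve is just the solution set of an equation, checking whether a point lies on it takes one line of arithmetic. Here is a minimal sketch for the curve *y*^{2} = *x*^{3} − 2*x* mentioned above; the sample points are illustrative choices, not from the article.

```python
# Check whether specific (x, y) pairs solve y^2 = x^3 - 2x.
# The test points below are illustrative, not from the article.
def on_curve(x, y):
    return y * y == x**3 - 2 * x

print(on_curve(0, 0))   # True:  0 == 0
print(on_curve(2, 2))   # True:  4 == 8 - 4
print(on_curve(1, 1))   # False: 1 != 1 - 2
```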


Mathematicians have long been interested in quantifying and classifying various properties of these curves. The most prominent result to date is Andrew Wiles’ famed 1994 proof of Fermat’s Last Theorem, a question about which equations have solutions that are whole numbers. The proof relied heavily on the study of elliptic curves. In general, mathematicians focus on elliptic curves because they occupy the sweet spot of inquiry: They’re not easy enough to be trivial and not so hard that they’re impossible to study.

“Elliptic curves are still mysterious enough that they’re generating new math all the time,” said Matt Baker, a mathematician at the Georgia Institute of Technology.

Mathematicians are particularly interested in points on elliptic curves that act like a home base for a special way of moving around on the curves. On an elliptic curve, you can add points to each other using standard addition, but this approach is not very useful: the sum is unlikely to be another point on the curve.

But elliptic curves come packaged with a special internal structure that creates a different type of arithmetic. This structure is called a group, and the result of adding points together using its self-contained arithmetic rules is quite different.

If you add two points on an elliptic curve according to the group structure, the sum is always a third point on the curve. And if you continue this process by, for example, adding a point to itself over and over, the result is an infinite sequence of points that all lie along the elliptic curve.

Different starting points will result in different sequences. The “home base” points are starting points with a remarkable property: if you repeatedly add one of these points to itself, it does not generate an infinite sequence of new points. Instead, it creates a loop that returns to the point you started with.
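The group law described above can be made concrete with the classical chord-and-tangent formulas. The sketch below is a simplified illustration, not the authors' machinery: it repeatedly adds a point to itself on the curve *y*^{2} = *x*^{3} + 1, and the starting point (2, 3), an illustrative choice, happens to be a torsion point, so the process loops back to the identity after six steps.

```python
from fractions import Fraction

# Chord-and-tangent addition on y^2 = x^3 + A*x + B.
# O stands for the identity element (the point at infinity).
A, B = 0, 1          # the curve y^2 = x^3 + 1 (an illustrative choice)
O = None

def add(P, Q):
    if P is O: return Q
    if Q is O: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O                          # vertical chord: sum is the identity
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)  # tangent slope (doubling)
    else:
        lam = (y2 - y1) / (x2 - x1)         # chord slope
    x3 = lam * lam - x1 - x2
    return (x3, lam * (x1 - x3) - y1)

P = (Fraction(2), Fraction(3))
assert P[1] ** 2 == P[0] ** 3 + A * P[0] + B  # sanity check: P lies on the curve

Q, n = P, 1
while Q is not O:        # keep adding P to itself under the group law
    Q = add(Q, P)
    n += 1
print(n)                 # 6: the orbit loops back, so P is a torsion point
```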


These special starting values that create loops are called torsion points. They are of immediate interest to number theorists. They also have a striking correspondence to a specific type of point on dynamical systems — and it was this correspondence that really set arithmetic dynamics in motion.

“That’s truly the basis of why this field has become a field,” said Krieger.

Dynamical systems are often used to describe real-world phenomena that move forward in time according to a repeated rule, like the ricocheting of a billiard ball in accordance with Newton’s laws. You begin with a value, plug it into a function, and get an output that becomes your new input.

Some of the most interesting dynamical systems are driven by functions like *f*(*x*) = *x*^{2} − 1, which are associated with intricate fractal pictures known as Julia sets. If you use complex numbers (numbers with a real part and an imaginary part) and apply the function over and over — feeding each output back into the function as the next input — you generate a sequence of points in the complex plane.

This is just one example of what’s called a quadratic polynomial, in which the variable is raised to the second power. Quadratic polynomials are the foundation of research in dynamical systems, just as elliptic curves are the focus of a lot of basic inquiry in number theory.

“Quadratic polynomials [in dynamical systems] play a similar role as elliptic curves in number theory,” said Baker. “They’re the ground that we always seem to return to to try to actually prove something.”

Dynamical systems generate sequences of numbers as they evolve. Take for example that quadratic function *f*(*x*) = *x*^{2} − 1. If you start with the value *x* = 2, you generate the infinite sequence 2, 3, 8, 63, and so on.

But not all starting values trigger a series that grows larger forever. If you begin with *x* = 0, that same function generates a very different type of sequence: 0, −1, 0, −1, 0, and so on. Instead of an infinite string of distinct numbers, you end up in a small, closed loop.
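Both orbits are easy to reproduce. A minimal sketch, with the two starting values taken from the text:

```python
# Iterate f(x) = x^2 - 1, feeding each output back in as the next input.
def f(x):
    return x * x - 1

def orbit(x, steps):
    seq = [x]
    for _ in range(steps):
        x = f(x)
        seq.append(x)
    return seq

print(orbit(2, 3))   # [2, 3, 8, 63]      -- grows forever
print(orbit(0, 4))   # [0, -1, 0, -1, 0]  -- a small closed loop
```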

In the world of dynamical systems, starting points whose sequences eventually repeat are called finite orbit points. They are a direct analog of torsion points on elliptic curves. In both cases, you start with a value, apply the rules of the system or curve, and end up in a cycle. This is the analogy that the three mathematicians exploit in their new proof.

“This simple observation — that torsion points on the elliptic curve are the same as finite orbit points for a certain dynamical system — is what we use in our paper over and over and over again,” said DeMarco.

Both Krieger and Ye received their doctorates from the University of Illinois, Chicago in 2013 under DeMarco’s supervision. The trio reconvened in August 2017 at the American Institute of Mathematics in San Jose, California, which hosts intensive, short-term research programs.

“We stayed in a room for five days. We needed to work through some questions,” said Ye.


During this period, they began to envision a way to extend the crucial analogy between torsion points of elliptic curves and finite orbit points of dynamical systems. They knew that they could transform a seemingly unrelated problem into one where the analogy was directly applicable. That problem arises out of something called the Manin-Mumford conjecture.

The Manin-Mumford conjecture is about curves that are more complicated than elliptic curves, such as *y*^{2} = *x*^{6} + *x*^{4} + *x*^{2} − 1. Each of these curves comes with an associated larger geometric object called a Jacobian, which mimics certain properties of the curve and is often easier for mathematicians to study than the curve itself. A curve sits inside its Jacobian the way a piece sits inside a jigsaw puzzle.

Unlike elliptic curves, these more complicated curves don’t have a group structure that enables adding points on a curve to get other points on the curve. But the associated Jacobians do. The Jacobians also have torsion points, just like elliptic curves, which circle back on themselves under repeated internal addition.

The Manin-Mumford conjecture has to do with how many times one of these complicated curves, nestled inside its Jacobian, intersects the torsion points of the Jacobian. It predicts that these intersections only occur finitely many times. The conjecture reflects the interrelationship between the algebraic nature of a curve (in the way that torsion points are special solutions to the equations defining the curve) and its life as a geometric object (reflecting how the curve is embedded inside its Jacobian, like one shape inside another). Torsion points are crowded in every region of the Jacobian. If you zoom in on any tiny part of it, you will find them. But the Manin-Mumford conjecture predicts that, surprisingly, the nestled curve still manages to miss all but a finite number of them.

In 1983 Michel Raynaud proved the conjecture true. Since then, mathematicians have been trying to upgrade his result. Instead of just knowing that the number of intersections is finite, they’d like to know it’s below some specific value.

“Now that you know that they have only finitely many points in common, then every mathematician you would meet would say, well, how many?” said Krieger.

But the effort to count the intersection points was impeded by the lack of a clear framework in which to think about the complex numbers that define those points. Arithmetic dynamics ended up providing one.

In their 2020 paper, DeMarco, Krieger and Ye established that there is an upper bound on the intersection number for a family of curves. A newer paper by another mathematician, Lars Kühne of the University of Copenhagen, presents a proof establishing an upper bound for all curves. That paper was posted in late January and has not been fully vetted.

Raynaud’s result established only that the number of intersections is finite; it left open the possibility that this finite number could be arbitrarily large from one curve to the next. The trio’s new proof establishes what’s called a uniform bound, a cap on how big that finite number of intersections can be. DeMarco, Krieger and Ye didn’t identify that cap exactly, but they proved it exists, and they also identified a long series of steps that future work could take to calculate the number.

Laura DeMarco joined with two of her former students to demonstrate how dynamical systems can solve problems about elliptic curves.

Their proof relies on a unique property of the Jacobians associated to this special family of curves: They can be split apart into two elliptic curves.

The elliptic curves that make up the Jacobians take their solutions from the complex numbers, which gives their graphs a bulkier appearance than the graphs of elliptic curves whose solutions come from the real numbers. Instead of a wiggly line, they look like the surface of a doughnut. The specific family of curves that DeMarco, Krieger and Ye studied has Jacobians that look like two-holed doughnuts. They break apart nicely into two regular doughnuts, each of which is the graph of one of the two constituent elliptic curves.

The new work focuses on the torsion points of those elliptic curves. The three mathematicians knew that the number they were interested in — the number of intersection points between complicated curves and the torsion points of their Jacobians — could be reframed in terms of the number of times that torsion points from one of those elliptic curves overlap torsion points from the other. So, to put a bound on the Manin-Mumford conjecture, all the authors had to do was count the number of intersections between those torsion points.

They knew this could not be accomplished directly. The two elliptic curves and their torsion points could not be immediately compared because they do not necessarily overlap. The torsion points are sprinkled on the surfaces of the elliptic curves, but the two curves might have very different shapes. It’s like comparing points on the surface of a sphere to points on the surface of a cube — the points can have similar relative positions without actually overlapping.

“You can’t really compare the points on those elliptic curves, because they’re in different places; they’re living on different geometric objects,” said Krieger.


But while the torsion points don’t actually necessarily overlap, it’s possible to think of pairs of them as being in the same relative position on each doughnut. And pairs of torsion points that occupy the same relative position on their respective doughnuts can be thought of as intersecting.

In order to determine precisely where these intersections take place, the authors had to lift the torsion points off their respective curves and transpose them over each other — almost the way you’d fit a star chart to the night sky.

Mathematicians knew about these star charts, but they didn’t have a good perspective that allowed them to count the overlapping points. DeMarco, Krieger and Ye managed it using arithmetic dynamics. They translated the two elliptic curves into two different dynamical systems. The two dynamical systems generated points on the same actual space, the complex plane.

“It’s easier to think of one space with two separate dynamical systems, versus two separate spaces with one dynamical system,” said DeMarco.

The finite orbit points of the two dynamical systems corresponded to the torsion points of the underlying elliptic curves. Now, to put a bound on the Manin-Mumford conjecture, the mathematicians just needed to count the number of times these finite orbit points overlapped. They used techniques from dynamical systems to solve the problem.

In order to count the number of overlaps, DeMarco, Krieger and Ye turned to a tool which measures how much the value of an initial point grows as it’s repeatedly added to itself.

The torsion points on elliptic curves have no growth or long-term change, since they circle back to themselves. Mathematicians measure this growth, or lack of it, using a “height function.” It equals zero when applied to the torsion points of elliptic curves. Similarly, it equals zero when applied to the finite orbit points of dynamical systems. Height functions are an essential tool in arithmetic dynamics because they can be used on either side of the divide between the two branches.

The authors studied how often points of zero height coincide for the dynamical systems representing the elliptic curves. They showed that these points are sufficiently scattered around the complex plane so that they are unlikely to coincide — so unlikely, in fact, that they can’t do it more than a specific number of times.

That number is difficult to compute, and it’s probably much larger than the actual number of coinciding points, but the authors proved that this hard ceiling does exist. They then translated the problem back into the language of number theory to determine a maximum number of shared torsion points on two elliptic curves — the key to their original question and a provocative demonstration of the power of arithmetic dynamics.

“They’re able to answer a specific question that already existed just within number theory and that nobody thought had anything to do with dynamical systems,” said Patrick Ingram of York University in Toronto. “That got a lot of attention.”

Shortly after DeMarco, Krieger and Ye first posted their proof of a uniform bound for the Manin-Mumford conjecture, they released a second, related paper. The follow-up work is about a question in dynamical systems, instead of number theory, but it uses similar methods. In that sense, the pair of papers is a quintessential product of the analogy Silverman noticed almost 30 years earlier.

“In some sense, it’s the same argument applied to two different families of examples,” said DeMarco.

The two papers synthesized many of the ideas that mathematicians working in arithmetic dynamics have developed over the last three decades while also adding wholly new techniques. But Silverman sees the papers as suggestive more than conclusive, hinting at an even wider influence for the new discipline.

“The specific theorems are special cases of what the big conjectures should be,” said Silverman. “But even those individual theorems are really, really beautiful.”

**Correction:** February 23, 2021
*This article has been revised to avoid implying that Lars Kühne’s new work uses arithmetic dynamics.*

In 2007, some of the leading thinkers behind deep neural networks organized an unofficial “satellite” meeting at the margins of a prestigious annual conference on artificial intelligence. The conference had rejected their request for an official workshop; deep neural nets were still a few years away from taking over AI. The bootleg meeting’s final speaker was Geoffrey Hinton of the University of Toronto, the cognitive psychologist and computer scientist responsible for some of the biggest breakthroughs in deep nets. He started with a quip: “So, about a year ago, I came home to dinner, and I said, ‘I think I finally figured out how the brain works,’ and my 15-year-old daughter said, ‘Oh, Daddy, not again.’”

The audience laughed. Hinton continued, “So, here’s how it works.” More laughter ensued.

Hinton’s jokes belied a serious pursuit: using AI to understand the brain. Today, deep nets rule AI in part because of an algorithm called backpropagation, or backprop. The algorithm enables deep nets to learn from data, endowing them with the ability to classify images, recognize speech, translate languages, make sense of road conditions for self-driving cars, and accomplish a host of other tasks.

But real brains are highly unlikely to be relying on the same algorithm. It’s not just that “brains are able to generalize and learn better and faster than the state-of-the-art AI systems,” said Yoshua Bengio, a computer scientist at the University of Montreal, the scientific director of the Quebec Artificial Intelligence Institute and one of the organizers of the 2007 workshop. For a variety of reasons, backpropagation isn’t compatible with the brain’s anatomy and physiology, particularly in the cortex.

Geoffrey Hinton, a cognitive psychologist and computer scientist at the University of Toronto, is responsible for some of the biggest breakthroughs in deep neural network technology, including the development of backpropagation.

Bengio and many others inspired by Hinton have been thinking about more biologically plausible learning mechanisms that might at least match the success of backpropagation. Three of them — feedback alignment, equilibrium propagation and predictive coding — have shown particular promise. Some researchers are also incorporating the properties of certain types of cortical neurons and processes such as attention into their models. All these efforts are bringing us closer to understanding the algorithms that may be at work in the brain.

“The brain is a huge mystery. There’s a general impression that if we can unlock some of its principles, it might be helpful for AI,” said Bengio. “But it also has value in its own right.”

For decades, neuroscientists’ theories about how brains learn were guided primarily by a rule introduced in 1949 by the Canadian psychologist Donald Hebb, which is often paraphrased as “Neurons that fire together, wire together.” That is, the more correlated the activity of adjacent neurons, the stronger the synaptic connections between them. This principle, with some modifications, was successful at explaining certain limited types of learning and visual classification tasks.

But it worked far less well for large networks of neurons that had to learn from mistakes; there was no directly targeted way for neurons deep within the network to learn about discovered errors, update themselves and make fewer mistakes. “The Hebbian rule is a very narrow, particular and not very sensitive way of using error information,” said Daniel Yamins, a computational neuroscientist and computer scientist at Stanford University.

Nevertheless, it was the best learning rule that neuroscientists had, and even before it dominated neuroscience, it inspired the development of the first artificial neural networks in the late 1950s. Each artificial neuron in these networks receives multiple inputs and produces an output, like its biological counterpart. The neuron multiplies each input by a so-called “synaptic” weight — a number signifying the importance assigned to that input — and then sums up the weighted inputs. This sum is the neuron’s output. By the 1960s, it was clear that such neurons could be organized into a network with an input layer and an output layer, and the artificial neural network could be trained to solve a certain class of simple problems. During training, a neural network settled on the best weights for its neurons to eliminate or minimize errors.
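In code, such a neuron is just a weighted sum. A minimal sketch, where the inputs and weights are arbitrary illustrative numbers:

```python
# One artificial neuron: multiply each input by its synaptic weight, then sum.
def neuron(inputs, weights):
    return sum(x * w for x, w in zip(inputs, weights))

print(neuron([1, 2, -1], [2, 1, 3]))  # 2 + 2 - 3 = 1
```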


However, it was obvious even in the 1960s that solving more complicated problems required one or more “hidden” layers of neurons sandwiched between the input and output layers. No one knew how to effectively train artificial neural networks with hidden layers — until 1986, when Hinton, the late David Rumelhart and Ronald Williams (now of Northeastern University) published the backpropagation algorithm.

The algorithm works in two phases. In the “forward” phase, when the network is given an input, it infers an output, which may be erroneous. The second “backward” phase updates the synaptic weights, bringing the output more in line with a target value.

To understand this process, think of a “loss function” that describes the difference between the inferred and desired outputs as a landscape of hills and valleys. When a network makes an inference with a given set of synaptic weights, it ends up at some location on the loss landscape. To learn, it needs to move down the slope, or gradient, toward some valley, where the loss is minimized to the extent possible. Backpropagation is a method for updating the synaptic weights to descend that gradient.

In essence, the algorithm’s backward phase calculates how much each neuron’s synaptic weights contribute to the error and then updates those weights to improve the network’s performance. This calculation proceeds sequentially backward from the output layer to the input layer, hence the name backpropagation. Do this over and over for sets of inputs and desired outputs, and you’ll eventually arrive at an acceptable set of weights for the entire neural network.
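The two-phase loop can be sketched on the smallest possible network: two weights in a chain, one training example, and a squared-error loss. This is a toy illustration of gradient descent via backpropagation, not anyone's production code; all numeric values are made up.

```python
# Toy network y = w2 * (w1 * x), trained by repeated forward/backward passes.
w1, w2 = 0.5, 0.5
x, t = 1.0, 1.0     # one input and its desired output (illustrative values)
lr = 0.1            # step size down the loss landscape

for _ in range(50):
    # forward phase: infer an output
    h = w1 * x
    y = w2 * h
    # backward phase: carry error information from the output back to the input
    dy = 2 * (y - t)   # slope of the loss with respect to the output
    dw2 = dy * h       # how much w2 contributed to the error
    dh = dy * w2       # error signal passed back to the hidden layer
    dw1 = dh * x       # how much w1 contributed
    w1 -= lr * dw1     # update each weight, descending the gradient
    w2 -= lr * dw2

print(round(w1 * w2, 3))  # approaches 1.0, so the inferred y matches the target
```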

The invention of backpropagation immediately elicited an outcry from some neuroscientists, who said it could never work in real brains. The most notable naysayer was Francis Crick, the Nobel Prize-winning co-discoverer of the structure of DNA who later became a neuroscientist. In 1989 Crick wrote, “As far as the learning process is concerned, it is unlikely that the brain actually uses back propagation.”

Backprop is considered biologically implausible for several major reasons. The first is that while computers can easily implement the algorithm in two phases, doing so for biological neural networks is not trivial. The second is what computational neuroscientists call the weight transport problem: The backprop algorithm copies or “transports” information about all the synaptic weights involved in an inference and updates those weights for more accuracy. But in a biological network, neurons see only the outputs of other neurons, not the synaptic weights or internal processes that shape that output. From a neuron’s point of view, “it’s OK to know your own synaptic weights,” said Yamins. “What’s not okay is for you to know some other neuron’s set of synaptic weights.”


Any biologically plausible learning rule also needs to abide by the limitation that neurons can access information only from neighboring neurons; backprop may require information from more remote neurons. So “if you take backprop to the letter, it seems impossible for brains to compute,” said Bengio.

Nonetheless, Hinton and a few others immediately took up the challenge of working on biologically plausible variations of backpropagation. “The first paper arguing that brains do [something like] backpropagation is about as old as backpropagation,” said Konrad Kording, a computational neuroscientist at the University of Pennsylvania. Over the past decade or so, as the successes of artificial neural networks have led them to dominate artificial intelligence research, the efforts to find a biological equivalent for backprop have intensified.

Take, for example, one of the strangest solutions to the weight transport problem, courtesy of Timothy Lillicrap of Google DeepMind in London and his colleagues in 2016. Their algorithm, instead of relying on a matrix of weights recorded from the forward pass, used a matrix initialized with random values for the backward pass. Once assigned, these values never change, so no weights need to be transported for each backward pass.

To almost everyone’s surprise, the network learned. Because the forward weights used for inference are updated with each backward pass, the network still descends the gradient of the loss function, but by a different path. The forward weights slowly align themselves with the randomly selected backward weights to eventually yield the correct answers, giving the algorithm its name: feedback alignment.
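Feedback alignment can be illustrated at scalar scale: replace the forward weight in the backward pass with a fixed random number and watch the network learn anyway. Everything below is an illustrative toy under made-up values, not the 2016 algorithm at full scale.

```python
import random

# Scalar sketch of feedback alignment: the backward pass uses a fixed
# random value b in place of the forward weight w2.
random.seed(0)
b = random.uniform(0.1, 1.0)   # assigned once at the start, never updated
w1, w2 = 0.2, 0.3
x, t, lr = 1.0, 1.0, 0.1

for _ in range(200):
    h = w1 * x
    y = w2 * h                 # forward pass (inference)
    dy = 2 * (y - t)
    w2 -= lr * dy * h          # output weight: same update as backprop
    w1 -= lr * dy * b * x      # hidden weight: random b stands in for w2

print(abs(w2 * w1 * x - t) < 1e-3)   # True: the network still learns
```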

“It turns out that, actually, that doesn’t work as bad as you might think it does,” said Yamins — at least for simple problems. For large-scale problems and for deeper networks with more hidden layers, feedback alignment doesn’t do as well as backprop: Because the updates to the forward weights are less accurate on each pass than they would be from truly backpropagated information, it takes much more data to train the network.

Yoshua Bengio, an artificial intelligence researcher and computer scientist at the University of Montreal, is one of the scientists seeking learning algorithms that are as effective as backpropagation but more biologically plausible.

Researchers have also explored ways of matching the performance of backprop while maintaining the classic Hebbian learning requirement that neurons respond only to their local neighbors. Backprop can be thought of as one set of neurons doing the inference and another set of neurons doing the computations for updating the synaptic weights. Hinton’s idea was to work on algorithms in which each neuron was doing both sets of computations. “That was basically what Geoff’s talk was [about] in 2007,” said Bengio.

Building on Hinton’s work, Bengio’s team proposed a learning rule in 2017 that requires a neural network with recurrent connections (that is, if neuron A activates neuron B, then neuron B in turn activates neuron A). If such a network is given some input, it sets the network reverberating, as each neuron responds to the push and pull of its immediate neighbors.

Eventually, the network reaches a state in which the neurons are in equilibrium with the input and each other, and it produces an output, which can be erroneous. The algorithm then nudges the output neurons toward the desired result. This sets another signal propagating backward through the network, setting off similar dynamics. The network finds a new equilibrium.

“The beauty of the math is that if you compare these two configurations, before the nudging and after nudging, you’ve got all the information you need to find the gradient,” said Bengio. Training the network involves simply repeating this process of “equilibrium propagation” iteratively over lots of labeled data.
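Stripped to a single linear neuron, the two-phase idea looks like this: compute the free equilibrium, compute the nudged equilibrium, and read a weight update off the difference between the two configurations. This is a drastically simplified sketch with made-up values, not the team's actual formulation.

```python
# One-neuron equilibrium propagation with energy E(s) = s^2/2 - s*w*x.
# Free phase: s minimizes E, so s = w*x.
# Nudged phase: adding beta/2*(s - t)^2 pulls the output toward target t.
w, x, t = 0.2, 1.0, 1.0        # illustrative values
beta, lr = 0.1, 0.5

for _ in range(100):
    s_free = w * x                               # equilibrium before nudging
    s_nudged = (w * x + beta * t) / (1 + beta)   # equilibrium after nudging
    # comparing the two configurations yields the gradient estimate
    grad = (x / beta) * (s_free - s_nudged)
    w -= lr * grad

print(abs(w * x - t) < 1e-6)   # True: the neuron's output reaches the target
```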

The constraint that neurons can learn only by reacting to their local environment also finds expression in new theories of how the brain perceives. Beren Millidge, a doctoral student at the University of Edinburgh and a visiting fellow at the University of Sussex, and his colleagues have been reconciling this new view of perception — called predictive coding — with the requirements of backpropagation. “Predictive coding, if it’s set up in a certain way, will give you a biologically plausible learning rule,” said Millidge.

Predictive coding posits that the brain is constantly making predictions about the causes of sensory inputs. The process involves hierarchical layers of neural processing. To produce a certain output, each layer has to predict the neural activity of the layer below. If the highest layer expects to see a face, it predicts the activity of the layer below that can justify this perception. The layer below makes similar predictions about what to expect from the one beneath it, and so on. The lowest layer makes predictions about actual sensory input — say, the photons falling on the retina. In this way, predictions flow from the higher layers down to the lower layers.
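The core loop of that picture — predict the activity of the layer below, measure the mismatch, revise the belief — fits in a few lines. The sketch below is a bare-bones toy with one higher-layer belief, one sensory value, and made-up numbers.

```python
# Two-layer predictive coding toy: a belief r predicts the lower layer's
# activity as w*r, and r is adjusted to shrink the prediction error.
w = 2.0          # how the belief maps to predicted lower-layer activity
s = 3.0          # actual sensory input at the lowest layer
r, lr = 0.0, 0.1

for _ in range(200):
    prediction = w * r
    error = s - prediction    # mismatch between prediction and input
    r += lr * w * error       # the belief moves to explain the input

print(abs(w * r - s) < 1e-6)  # True: the prediction now matches the input
```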

Daniel Yamins, a computational neuroscientist and computer scientist at Stanford University, is working on ways to identify which algorithms are active in biological brains.

Pieter Roelfsema of the Netherlands Institute for Neuroscience thinks the brain’s solution to the problem is in the process of attention. In the late 1990s, he and his colleagues showed that when monkeys fix their gaze on an object, neurons that represent that object in the cortex become more active. The monkey’s act of focusing its attention produces a feedback signal for the responsible neurons. “It is a highly selective feedback signal,” said Roelfsema. “It’s not an error signal. It is just saying to all those neurons: You’re going to be held responsible [for an action].”

Roelfsema’s insight was that this feedback signal could enable backprop-like learning when combined with processes revealed in certain other neuroscientific findings. For example, Wolfram Schultz of the University of Cambridge and others have shown that when animals perform an action that yields better results than expected, the brain’s dopamine system is activated. “It floods the whole brain with neural modulators,” said Roelfsema. The dopamine levels act like a global reinforcement signal.

In theory, the attentional feedback signal could prime only those neurons responsible for an action to respond to the global reinforcement signal by updating their synaptic weights, said Roelfsema. He and his colleagues have used this idea to build a deep neural network and study its mathematical properties. “It turns out you get error backpropagation. You get basically the same equation,” he said. “But now it became biologically plausible.”

The team presented this work at the Neural Information Processing Systems online conference in December. “We can train deep networks,” said Roelfsema. “It’s only a factor of two to three slower than backpropagation.” As such, he said, “it beats all the other algorithms that have been proposed to be biologically plausible.”

Nevertheless, concrete empirical evidence that living brains use these plausible mechanisms remains elusive. “I think we’re still missing something,” said Bengio. “In my experience, it could be a little thing, maybe a few twists to one of the existing methods, that’s going to really make a difference.”

Meanwhile, Yamins and his colleagues at Stanford have suggestions for how to determine which, if any, of the proposed learning rules is the correct one. By analyzing 1,056 artificial neural networks implementing different models of learning, they found that the type of learning rule governing a network can be identified from the activity of a subset of neurons over time. It’s possible that such information could be recorded from monkey brains. “It turns out that if you have the right collection of observables, it might be possible to come up with a fairly simple scheme that would allow you to identify learning rules,” said Yamins.

Given such advances, computational neuroscientists are quietly optimistic. “There are a lot of different ways the brain could be doing backpropagation,” said Kording. “And evolution is pretty damn awesome. Backpropagation is useful. I presume that evolution kind of gets us there.”

When the first black hole collision was detected in 2015, it was a watershed moment in the history of astronomy. With gravitational waves, astronomers were observing the universe in an entirely new way. But this first event didn’t revolutionize our understanding of black holes — nor could it. This collision would be the first of many, astronomers knew, and only with that bounty would answers come.

“The first discovery was the thrill of our lives,” said Vicky Kalogera, an astrophysicist at Northwestern University and part of the Laser Interferometer Gravitational-Wave Observatory (LIGO) collaboration that made the 2015 detection. “But you cannot do astrophysics with one source.”

Now, gravitational wave physicists like Kalogera say they are entering a new era of black hole astronomy, driven by a rapid increase in the number of black holes they are observing.

The latest catalog of these so-called black hole binary mergers — the result of two black holes spiraling inward toward each other and colliding — has quadrupled the black hole merger data available to study. There are now almost 50 mergers for astrophysicists to scrutinize, with dozens more expected in the next few months and hundreds more in the coming years.

“Black hole astrophysics is being revolutionized by gravitational waves because the numbers are so big. And the numbers are allowing us to ask qualitatively different questions,” said Kalogera. “We’ve opened a treasure trove.”

On the strength of this data, new statistically driven studies are beginning to reveal the secrets of these enigmatic objects: how black holes form, and why they merge. This growing black hole inventory could also offer a novel way to probe cosmological evolution — from the Big Bang through the birth of the first stars and the growth of galaxies.

“I definitely didn’t expect that we’d be looking at these questions so soon after the first detection,” said Maya Fishbach, an astrophysicist at Northwestern. “The field has exploded.”

Before black holes can be used to study the cosmos as a whole, astrophysicists must first figure out how they are made. Two theories have dominated the debate so far.

Some astronomers suggest that most black holes originate inside crowded clusters of stars — regions that are sometimes a million times denser than our own galactic backyard. Each time a very massive star explodes, it leaves behind a black hole that sinks to the middle of the star cluster. The center of the cluster becomes thick with black holes, which become entwined by gravity into a fateful cosmic dance. Astronomers call this “dynamical” black hole formation.

Others suggest that black hole binaries start out as pairs of stars in comparatively desolate areas of galaxies. After a long and chaotic life together, they too explode, creating a pair of “isolated” black holes that continue to orbit each other.

“There’s been this perception that it’s a fight between the dynamical and the isolated models,” said Daniel Holz, an astrophysicist at the University of Chicago.

The tendency of many theorists to advocate for just one black hole binary formation channel partly stems from the practicalities of working with very little data. “Each event was lovingly analyzed, obsessed over and fussed over,” said Holz. “We would make a detection and people would try to abstract very broad statements from a sample size of one or two black holes.”

Indeed, astrophysicists used that first detection to argue for opposing conclusions. LIGO found its first black hole merger extremely quickly — before the official start of observation, in fact — which suggested that black hole binary systems are very common in the universe. Since isolated black holes can form in a broad range of astrophysical environments, theories that favor isolated black holes predict that we’ll see a lot of mergers.

Others pointed out that the first merger featured unusually large black holes, and that the existence of these giants supported the dynamical theory. Such large black holes, they reasoned, could only be made in the early universe, when star clusters are also thought to have formed.

Yet with a sample size of one, such assertions could only be an “educated guess,” said Carl Rodriguez, an astrophysicist at Carnegie Mellon University.

Now data from LIGO’s latest catalog shows that black hole binaries are far less common than expected. In fact, the rate of merging black holes now observed could be “entirely explained” by star clusters, according to a paper posted by Rodriguez and his collaborators on the scientific preprint site arxiv.org late last month. (The paper’s conclusion is more measured and suggests that both the dynamical and isolated processes are important.)

In addition, the new mergers have enabled a fresh approach to the puzzle of where black holes come from. Despite their elusive nature, black holes are very simple. Aside from mass and charge, the only trait a black hole can have is spin — a measure of how quickly it rotates. If a pair of black holes, and the stars from which they form, live their whole lives together, the constant push and pull will align their spins. But if two black holes happen to encounter each other later in life, their spins will be random.

After measuring the spin of the black holes in the LIGO data set, astronomers now suggest that the dynamical and isolated scenarios are almost equally likely. There is no “one channel to rule them all,” wrote the astrophysicist Michael Zevin and collaborators in a recent preprint outlining an array of different pathways that together can explain this new and growing population of black hole binaries.

“The simplest answer is not always the correct one,” said Zevin. “It’s a more complicated landscape, and it’s certainly a bigger challenge. But I think it’s a more fun problem to address as well.”

LIGO and its sister observatory Virgo have also grown more sensitive over time, which means they can now see colliding black holes that are much farther away from Earth and much further back in time. “We’re listening to a really big chunk of the universe, out to when the universe was much younger than it is today,” said Fishbach.

In a recent preprint, Fishbach and her collaborators found indications of differences in the types of black holes observed at different points in cosmic history. In particular, heavier black holes seem to be more common earlier in the universe’s history.

This came as no surprise to many astrophysicists; they expect that the first stars in the universe formed from huge clouds of hydrogen and helium, which would make them much bigger than later stars. Black holes created from these stars should then also be huge.

But it’s one thing to predict what happened in the early universe, and another to observe it. “You can really start to use [black holes] as a tracer of how the universe formed stars over cosmic time and how the galaxies that form those stars and star clusters are assembled. And that starts to get really cool,” said Rodriguez.

The study is a first step toward using large data sets of black holes as a radical tool to explore the cosmos. Astronomers have created an astonishingly accurate model of how the universe evolved, known as Lambda-CDM. But no model is perfect. Gravitational waves offer a way to measure the universe that is completely independent of every other method in the history of cosmology, said Salvatore Vitale, an astrophysicist at the Massachusetts Institute of Technology. “If you get the same results, you’ll sleep better at night. If you don’t, that points to a potential misunderstanding.”

Theorists are now building models that include multiple black hole formation scenarios and unscrambling how each one evolves across the universe’s history. Gravitational wave physicists are hopeful that in the coming months and years they’ll be able to answer these questions with confidence.

“We’re just scratching the surface,” said Kalogera. “The sample is still too small to give us a robust answer, but when we have 100 or 200 of these [mergers], then I think we’ll have clear answers.

“We’re not that far away.”

*This article was reprinted on TheAtlantic.com.*

Po-Shen Loh has resurrected the U.S. International Mathematical Olympiad team, leading it to four first-place finishes in the last six years as its head coach.

But in 2002, when a friend suggested Loh apply for an open position as a grader with the team, he hesitated. “I had never thought to apply before,” Loh said. “Not because I didn’t want to. But because I thought there are better people out there.”

He eventually agreed, and by the end of the team’s June 2002 training program, he’d made an impression. “Somehow I got voted best lecturer,” he said. In 2013 the Mathematical Association of America, which coordinates the team, asked Loh to become the head coach. He accepted, and two years later the U.S. achieved a top ranking in the IMO for the first time in 21 years.

Math has always been a part of Loh’s life. He grew up in Madison, Wisconsin. His father was a statistics professor, and his mother had taught math in Singapore. A local news story from 2015 dubbed them the capital city’s “first family of math.” It’s in this context that Loh learned early in life that tackling hard math problems requires persistence and, often, unorthodox thinking.

Loh brings this perspective to his work as IMO coach. Every June, the mathematical association invites 60 high schoolers to a national training camp in Pittsburgh. Based on a battery of exams given the preceding year, six of them have already been tapped to represent the U.S. at the IMO in July. One of Loh’s main innovations at the camp has been to invite Olympiad teams from many different countries to practice alongside the 60-person U.S. contingent. “At first people were surprised, because we paid for all expenses, with no catch,” he said.

Loh also works hard to expose kids from all backgrounds to mathematical ideas. Before the pandemic he toured the country giving math talks like a “traveling salesman of mathematics,” he said. In 2014 he launched an app called Expii, which uses interactive puzzles to teach basic math concepts. He also made a YouTube video in which he helped coach a young cheerleading team in New Jersey by explaining the basic math behind their choreography.

When he’s not filming videos or hosting international training sessions, Loh works as a mathematician at Carnegie Mellon University in Pittsburgh. His specialty is extremal combinatorics, which studies objects like very large graphs or networks. In particular, he studies how overarching characteristics of those networks affect their small-scale features. One of his results quantifies how the total number of nodes and edges in a certain class of large networks constrains the number of nodes that can be disconnected from the others.

*Quanta Magazine* recently spoke with Loh about his approach to coaching, why he enrolled in improv comedy classes and the reason he often runs from place to place. The conversation has been condensed and edited for clarity.

When I was around fifth grade or so, my father found a book of interesting math puzzles. These were not like “Do 100 arithmetic problems as fast as possible.” They were problems that made you think outside the box, like: you have six toothpicks; can you arrange them so that they form four triangles? The solution is that you make a tetrahedron. Which is not what you expect, right?

That introduced me to the practice of thinking about something for a while and then finding out, “Oh, my gosh, you do it that way.”

I found out that you could meet interesting people, and I found out that the problems were insanely hard. That actually appealed to a certain element of my personality. I really love people. I love doing things to help people. And I like something that’s supposed to be ridiculously hard and feels good if you solve it. So that’s actually what drove me onwards.

I’m not competitive in the sense of trying to beat other people, but I’m very competitive with myself. Even crazy things like, if Google Maps tells me it’ll take me this amount of time to walk from here to there, I might challenge that just because I feel like it.

The young people on the IMO team are very interesting people that we have a chance to touch. They are people who could lead science, technology and innovation in this country in the next few decades. My goal as coach wasn’t going to be to win, it was going to be to maximize the number of them that I read about in The New York Times in 20 years.

Because if you put your mind on a finish line, and the finish line is going to happen within two months, you’re going to calibrate for that final sprint. For example, if I have two months left before the International Math Olympiad, what should we really be doing? We should spend all day and all night helping the students know competition tricks inside and out, so that when they get to the actual exam, they’ll just tear it apart like robots.

We use our time together to also expose them to what people do with all of these math skills. I organize evening seminars where people talk about the kinds of math or applications they’re doing now. You might have somebody talking about quantum algorithms for factoring numbers even though there’s not going to be anything about quantum mechanics on the International Math Olympiad.

The most obvious change is we invite people from other countries’ teams to train with the United States. We teach them all the secrets. We treat them as our own. And we just have fun for three weeks learning all kinds of crazy mathematics.

If you let 60 Americans see and work with and build relationships with their peers who are going to be leaders from other countries, this is extremely valuable. If I were a student, would I want this? Of course. Because if I were a student, one of those 60 people, I would be pretty sure I’d never make the International Math Olympiad team — what’s my chance of being one of the top six in the U.S.? Instead of sending six people to the IMO, we bring the IMO to 60 people.

Loh at the 1999 IMO in Bucharest, Romania, where he won a silver medal.

You need to make sure there are enough people who are trying to pick up these very unusual skills that are in the math Olympiads. When I think about the issues of diversity, I think about what is involved in getting people interested. So when I give a talk, I can tell who feels comfortable and who doesn’t feel comfortable. And actually one of my goals is to go and try to help the people who look like they don’t think they belong and to help them feel like they can.

I think I like retrospectively obvious observations. I just thought it was beautiful to be able to bring in some different angle, and suddenly, because of that different perspective, you’re cutting through it. It’s like if you try to cut wood and go against the grain, it’s a totally different experience from going with the grain, and pop, the whole thing falls apart.

Say you’re trying to make some kind of a network that has some complicated properties, and we just can’t imagine what it would look like. Sometimes it turns out that you can’t make it, but you can prove that it exists by using probability. I’ll show you how. Give me a decimal between zero and one.

So 0.2 is what’s called a rational number. It’s two over 10, or one over five; we can write it as a whole number over a whole number. But it turns out that if you pick a random number between zero and one, the probability that it’s rational is zero. This is because there are way more irrational numbers than rational numbers. So what’s kind of funny is that you didn’t give me an irrational number, which is actually the most common kind of number there is.

In the context of extremal combinatorics, you can use probability to show that networks with certain properties actually exist, even if you might have a hard time coming up with an example of one yourself.
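The probabilistic argument Loh describes can be sketched in a few lines of Python. The specific target here — a two-coloring of the edges of the complete graph K_16 with no single-color K_6, a classic Ramsey-type example — is an illustration chosen for this sketch, not an example from the interview. The expected number of one-color K_6’s in a random coloring is C(16,6) · 2^(1−15) ≈ 0.49 < 1, so a coloring with none must exist, and random search finds one quickly:

```python
import itertools
import random

def has_mono_clique(coloring, n, k):
    """Check whether any k vertices have all their mutual edges in one color."""
    for sub in itertools.combinations(range(n), k):
        colors = {coloring[(a, b)] for a, b in itertools.combinations(sub, 2)}
        if len(colors) == 1:
            return True
    return False

def random_coloring(n):
    """Color each edge of the complete graph on n vertices red (0) or blue (1)."""
    return {(a, b): random.randint(0, 1)
            for a, b in itertools.combinations(range(n), 2)}

n, k = 16, 6
# Expected number of monochromatic K_6's in a random coloring of K_16:
# C(16,6) * 2**(1 - C(6,2)) = 8008 / 16384 ≈ 0.49 < 1, so by the
# probabilistic method a coloring with no monochromatic K_6 exists.
random.seed(0)
tries = 0
while True:
    tries += 1
    c = random_coloring(n)
    if not has_mono_clique(c, n, k):
        break
print(f"found a K_6-free 2-coloring of K_16 after {tries} tries")
```

The point is exactly the one Loh makes: the random object witnesses existence even when no one can write down an explicit construction by hand.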

In improv the principle is if you start doing something, you’re not supposed to suddenly go and say, “Oh, no, sorry, that was wrong.” Solving a math problem, it’s also the same. You can’t just sit there and say: “I don’t know if this idea will work. I don’t know if that idea will work. I’m not going to try any idea.” No, you’ve got to dive in. You have to already have the attitude that “I don’t know where this idea is taking me, but I’m going to push it all the way through.”

If it’s a math problem, it’s hard to get something that would make me drop it, unless somehow something was proved that said that it is not possible. To me, “drop” is a really strong term. Because what if in 10 years a new technique is developed? It’s a new weapon; you should try it.

Loh giving a guest lecture to students underserved in STEM fields.

I would say that when it comes to teaching, there’s a big problem that we haven’t solved yet, which is how to solve the issue of mathematics not necessarily being something that everyone thinks is great and everyone wants to do. That’s a big problem.

Mathematics is the heart of what helps people think. If you want to live in as reasonable a society as possible, it helps if everyone’s comfortable reasoning. And I’m not talking about mathematics from the point of view of sines and cosines. I’m just referring to mathematics from the point of view of logic. But it’s hard to learn logic in a vacuum.

I would not be surprised if that affects how I’m perceived. But at the same time, that is not necessarily the thing that I would use to decide what I would do. I decide what to do based on what I think is going to have the most impact. And I don’t mean this in any disparaging way to people who don’t choose to do these things.

I’m actually lucky that in my field, we’re quite close-knit. I think that virtually everyone in my field knows what I’ve been up to. And we also have mutual respect, in the sense that I have tons of respect for people who spend their time primarily proving theorems. I think that’s great.

It’s because I’m always trying to milk every second to do something. And then, by the time you finish, you realize that there’s no time left to work. So you run. I always think: just one more task. And by the time you’re at that point, you say, “Oh, no, now I need to hurry.”

Identical twins have nothing on black holes. Twins may grow from the same genetic blueprints, but they can differ in a thousand ways — from temperament to hairstyle. Black holes, according to Albert Einstein’s theory of gravity, can have just three characteristics — mass, spin and charge. If those values are the same for any two black holes, it is impossible to discern one twin from the other. Black holes, as the saying goes, have no hair.

“In classical general relativity, they would be exactly identical,” said Paul Chesler, a theoretical physicist at Harvard University. “You can’t tell the difference.”

Yet scientists have begun to wonder if the “no-hair theorem” is strictly true. In 2012, a mathematician named Stefanos Aretakis — then at the University of Cambridge and now at the University of Toronto — suggested that some black holes might have instabilities on their event horizons. These instabilities would effectively give some regions of a black hole’s horizon a stronger gravitational pull than others. That would make otherwise identical black holes distinguishable.

However, his equations only showed that this was possible for so-called extremal black holes — ones whose spin or charge has reached the maximum value possible for their mass. And as far as we know, “these black holes cannot exist, at least exactly, in nature,” said Chesler.

But what if you had a near-extremal black hole, one that approached these extreme values but didn’t quite reach them? Such a black hole should be able to exist, at least in theory. Could it have detectable violations of the no-hair theorem?

A paper published late last month shows that it could. Moreover, this hair could be detected by gravitational wave observatories.

“Aretakis basically suggested there was some information that was left on the horizon,” said Gaurav Khanna, a physicist at the University of Massachusetts and the University of Rhode Island and one of the co-authors. “Our paper opens up the possibility of measuring this hair.”

In particular, the scientists suggest that remnants either of the black hole’s formation or of later disturbances, such as matter falling into the black hole, could create gravitational instabilities on or near the event horizon of a near-extremal black hole. “We would expect that the gravitational signal we would see would be quite different from ordinary black holes that are not extremal,” said Khanna.

If black holes do have hair — thus retaining some information about their past — this could have implications for the famous black hole information paradox put forward by the late physicist Stephen Hawking, said Lia Medeiros, an astrophysicist at the Institute for Advanced Study in Princeton, New Jersey. That paradox distills the fundamental conflict between general relativity and quantum mechanics, the two great pillars of 20th-century physics. “If you violate one of the assumptions [of the information paradox], you might be able to solve the paradox itself,” said Medeiros. “One of the assumptions is the no-hair theorem.”

The ramifications of that could be broad. “If we can prove the actual space-time of the black hole outside of the black hole is different from what we expect, then I think that is going to have really huge implications for general relativity,” said Medeiros, who co-authored a paper in October that addressed whether the observed geometry of black holes is consistent with predictions.

Perhaps the most exciting aspect of this latest paper, however, is that it could provide a way to merge observations of black holes with fundamental physics. Detecting hair on black holes — perhaps the most extreme astrophysical laboratories in the universe — could allow us to probe ideas such as string theory and quantum gravity in a way that has never been possible before.

“One of the big issues [with] string theory and quantum gravity is that it’s really hard to test those predictions,” said Medeiros. “So if you have anything that’s even remotely testable, that’s amazing.”

There are major hurdles, however. It’s not certain that near-extremal black holes exist. (The best simulations at the moment typically produce black holes that are 30% away from being extremal, said Chesler.) And even if they do, it’s not clear if gravitational wave detectors would be sensitive enough to spot these instabilities from the hair.

What’s more, the hair is expected to be incredibly short-lived, lasting just fractions of a second.

But the paper itself, at least in principle, seems sound. “I don’t think that anybody in the community doubts it,” said Chesler. “It’s not speculative. It just turns out Einstein’s equations are so complicated that we’re discovering new properties of them on a yearly basis.”

The next step would be to see what sort of signals we should be looking for in our gravitational wave detectors — either LIGO and Virgo, operating today, or future space-based instruments like the European Space Agency’s LISA.

“One should now build upon their work and really compute what would be the frequency of this gravitational radiation, and understand how we could measure and identify it,” said Helvi Witek, an astrophysicist at the University of Illinois, Urbana-Champaign. “The next step is to go from this very nice and important theoretical study to what would be the signature.”

There are plenty of reasons to want to do so. While the chances of a detection that would prove the paper correct are slim, such a discovery would not only challenge Einstein’s theory of general relativity but prove the existence of near-extremal black holes.

“We would love to know if nature would even allow for such a beast to exist,” said Khanna. “It would have pretty dramatic implications for our field.”

**Correction:** February 11, 2021

The original version of this article implied that theorists are unable to simulate black holes closer than 30% away from being extremal. In fact, they can simulate near-extremal black holes, but their typical simulations are 30% away from being extremal.

*This article was reprinted on Wired.com and in Spanish at Investigacionyciencia.es*.