Understanding and predicting extreme events as well as the related anomalous statistics is a grand challenge in complex natural systems. Deep convolutional neural networks provide a useful tool to learn the essential model dynamics directly from data. A deep learning strategy is proposed to predict the extreme events that appear in turbulent dynamical systems. A truncated KdV model displaying distinctive statistics from near-Gaussian to highly skewed distributions is used as the test model. The neural network is trained using data only from the near-Gaussian regime without the occurrence of large extreme values. The optimized network demonstrates uniformly high skill in successfully capturing the solution structures in a wide variety of statistical regimes, including the highly skewed extreme events.

Extreme events and the related anomalous statistics are ubiquitously observed in many natural systems, and the development of efficient methods to understand and accurately predict such representative features remains a grand challenge. Here, we investigate the skill of deep learning strategies in the prediction of extreme events in complex turbulent dynamical systems. Deep neural networks have been successfully applied to many image-processing problems involving big data, and have recently shown potential for the study of dynamical systems. We propose to use a densely connected mixed-scale network model to capture the extreme events appearing in a truncated Korteweg–de Vries (tKdV) statistical framework, which creates anomalous skewed distributions consistent with recent laboratory experiments for shallow water waves across an abrupt depth change, where a remarkable statistical phase transition is generated by varying the inverse temperature parameter in the corresponding Gibbs invariant measures. The neural network is trained using data without knowing the explicit model dynamics, and the training data are drawn only from the near-Gaussian regime of the tKdV model solutions, without the occurrence of large extreme values. A relative entropy loss function, together with empirical partition functions, is proposed for measuring the accuracy of the network output, with emphasis on the dominant structures in the turbulent field. The optimized network is shown to gain uniformly high skill in accurately predicting the solutions in a wide variety of statistical regimes, including highly skewed extreme events. The technique shows promise for further application to other complicated high-dimensional systems.
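
The abstract does not spell out the relative entropy loss, but its ingredients (an empirical partition function that normalizes a turbulent field into a probability distribution, and a Kullback–Leibler divergence comparing target and prediction) can be sketched. The version below is a minimal, hypothetical illustration: the weighting parameter `beta` and both function names are assumptions, not the paper's definitions.

```python
import numpy as np

def empirical_distribution(u, beta):
    # Gibbs-like empirical weights over grid points of the field u: an
    # assumed stand-in for the paper's empirical partition function.
    # Larger beta emphasizes the large-amplitude (extreme) structures.
    w = np.exp(beta * u)
    return w / w.sum()

def relative_entropy_loss(u_pred, u_true, beta=1.0):
    # KL divergence D(p_true || p_pred) between the two normalized fields;
    # zero iff the normalized structures agree, and it penalizes mismatch
    # most heavily where p_true concentrates (the dominant structures).
    p = empirical_distribution(u_true, beta)
    q = empirical_distribution(u_pred, beta)
    return float(np.sum(p * np.log(p / q)))
```

Because the loss compares normalized distributions rather than raw amplitudes, a network trained with it is pushed to reproduce the shape of the dominant structures, which is consistent with the emphasis described above.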

A striking feature in bivalved seashells is that the 2 valves fit together perfectly when closed. This trait has evolved in 2 phyla from a common shell-less ancestor and has been described for hundreds of years. While its functional advantage is clear, there is no understanding of how this feature is generated. A mathematical model of the shell growth process explains how geometry and mechanics conspire to generate an interlocking pattern. This model provides a physical explanation for a prominent example of convergent evolution. By showing how variations in the mechanism create a wide variety of morphological trends, the model provides insight into how biophysical processes, probably modulated by genetic factors, are manifest across scales to produce a predictable pattern.

Brachiopods and mollusks are 2 shell-bearing phyla that diverged from a common shell-less ancestor more than 540 million years ago. Brachiopods and bivalve mollusks have also convergently evolved a bivalved shell that displays an apparently mundane, yet striking feature from a developmental point of view: When the shell is closed, the 2 valve edges meet each other in a commissure that forms a continuum with no gaps or overlaps despite the fact that each valve, secreted by 2 mantle lobes, may present antisymmetric ornamental patterns of varying regularity and size. Interlocking is maintained throughout the entirety of development, even when the shell edge exhibits significant irregularity due to injury or other environmental influences, which suggests a dynamic physical process of pattern formation that cannot be genetically specified. Here, we derive a mathematical framework, based on the physics of shell growth, to explain how this interlocking pattern is created and regulated by mechanical instabilities. By close consideration of the geometry and mechanics of 2 lobes of the mantle, constrained both by the rigid shell that they secrete and by each other, we uncover the mechanistic basis for the interlocking pattern. Our modeling framework recovers and explains a large diversity of shell forms and highlights how parametric variations in the growth process result in morphological variation. Beyond the basic interlocking mechanism, we also consider the intricate and striking multiscale-patterned edge in certain brachiopods. We show that this pattern can be explained as a secondary instability that matches morphological trends and data.

Many real-world systems can be described by a set of differential equations. Knowing these equations allows researchers to predict the system’s behavior under interventions, such as manipulations of initial or environmental conditions. For many complex systems, the differential equations are unknown. Deriving them by hand is infeasible for large systems, and data science is used to learn them from observational data. Existing techniques yield models that predict the observational data well, but fail to explain the effect of interventions. We propose an approach, CausalKinetiX, that explicitly takes into account stability across different experiments. This allows us to draw a more realistic picture of the system’s underlying causal structure and is a first step toward increasing reproducibility.

Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete-time, noisy observations generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model, a key criterion for reproducible research. Results on both simulated and real-world examples suggest that learning the structure of kinetic systems benefits from a causal perspective. The identified variables and models allow for a concise description of the dynamics across multiple experimental settings and can be used for prediction in unseen experiments. We observe significant improvements compared to well-established approaches focusing solely on predictive performance, especially for out-of-sample generalization.
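
The stability criterion at the heart of this approach can be caricatured in a few lines: fit one invariant model jointly across all experiments and score each candidate variable set by its worst-case fit over the experiments, so that only models that transfer across settings rank well. The sketch below is a hedged illustration under strong simplifications (kinetics linear in the candidate variables, exhaustive search over small variable sets); all function names are illustrative and are not the CausalKinetiX package's API.

```python
import itertools
import numpy as np

def stability_score(X_by_exp, dydt_by_exp, model_vars):
    # Pool all experiments and fit ONE linear kinetic model on the
    # candidate variable subset, then score by the WORST mean-squared
    # residual across experiments: an invariant (causal) model should
    # describe every experiment with the same parameters.
    X_all = np.vstack([X[:, model_vars] for X in X_by_exp])
    y_all = np.concatenate(dydt_by_exp)
    coef, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)
    return max(float(np.mean((X[:, model_vars] @ coef - y) ** 2))
               for X, y in zip(X_by_exp, dydt_by_exp))

def rank_models(X_by_exp, dydt_by_exp, n_vars, max_size=2):
    # Enumerate small candidate variable sets and rank by stability.
    candidates = [c for k in range(1, max_size + 1)
                  for c in itertools.combinations(range(n_vars), k)]
    return sorted(candidates,
                  key=lambda c: stability_score(X_by_exp, dydt_by_exp, list(c)))
```

A variable that is merely correlated with the derivative in one experimental setting, but not in another, inflates the worst-case residual and is ranked down, which is the causal intuition described above.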

Natural or synthetic genetic modules can lose their function over long-term evolution if the function is costly. How populations can evolve to restore such broken function is poorly understood. To test the reversibility of evolutionary breakdown, we use yeast cell populations with a chromosomally integrated synthetic gene circuit. In previous evolution experiments the gene circuit lost its costly function through various mutations. By exposing such mutant populations to conditions where regaining gene circuit function would be beneficial, we find adaptation scenarios with or without repairing lost gene circuit function. These results are important for drug resistance or future synthetic biology applications where evolutionary loss and regain of function play a significant role.

Evolutionary reversibility—the ability to regain a lost function—is an important problem both in evolutionary and synthetic biology, where repairing natural or synthetic systems broken by evolutionary processes may be valuable. Here, we use a synthetic positive-feedback (PF) gene circuit integrated into haploid *Saccharomyces cerevisiae* cells to test if the population can restore lost PF function. In previous evolution experiments, mutations in a gene eliminated the fitness costs of PF activation. Since PF activation also provides drug resistance, exposing such compromised or broken mutants to both drug and inducer should create selection pressure to regain drug resistance and possibly PF function. Indeed, evolving 7 PF mutant strains in the presence of drug revealed 3 adaptation scenarios through genomic, PF-external mutations that elevate PF basal expression, possibly by affecting transcription, translation, degradation, and other fundamental cellular processes. Nonfunctional mutants gained drug resistance without ever developing high expression, while quasifunctional and dysfunctional PF mutants developed high expression nongenetically, which then diminished, although more slowly for dysfunctional mutants where revertant clones arose. These results highlight how intracellular context, such as the growth rate, can affect regulatory network dynamics and evolutionary dynamics, which has important consequences for understanding the evolution of drug resistance and developing future synthetic biology applications.

Cell migration requires energy, but the metabolic cost of migration has not been quantitatively explored in detail. Here, we use a 2-phase model of the cell cytoplasm to compute cell velocities and energy efficiencies during cell movement. This model predicts that actin polymerization-driven migration is very inefficient in high-hydraulic-resistance environments. Instead, cells can adopt the water-driven mechanism. Therefore, the energetics and mechanical efficiencies of cell movement are predicted to depend on the physical environment.

In this work, we explore fundamental energy requirements during mammalian cell movement. Starting with the conservation of mass and momentum for the cell cytosol and the actin-network phase, we develop useful identities that compute dissipated energies during extensions of the cell boundary. We analyze 2 complementary mechanisms of cell movement: actin-driven and water-driven. The former mechanism occurs on 2-dimensional cell-culture substrates without appreciable external hydraulic resistance, while the latter mechanism is prominent in confined channels where external hydraulic resistance is high. By considering various forms of energy input and dissipation, we find that the water-driven cell-migration mechanism is inefficient and requires more energy. However, in environments with sufficiently high hydraulic resistance, the efficiency of actin-polymerization-driven cell migration decreases considerably, and the water-based mechanism becomes more efficient. Hence, the most efficient way for cells to move depends on the physical environment. This work can be extended to higher dimensions and has implications for understanding the energetics of morphogenesis in early embryonic development and cancer-cell metastasis, and it provides a physical basis for understanding the changing metabolic requirements for cell movement in different conditions.

Message passing, a celebrated family of methods for performing calculations on networks, has led to many important results in physics, statistics, computer science, and other areas. The technique allows one to divide large network calculations into manageable pieces and hence solve them either analytically or numerically. However, the method has a substantial and widely recognized shortcoming, namely that it works poorly on networks that contain short loops. Unfortunately, most real-world networks contain many such loops, which limits the applicability of the method. In this paper we give a solution for this problem, demonstrating how message passing can be extended to any network, regardless of structure, allowing it to become a general tool for the quantitative study of network phenomena.

Message passing is a fundamental technique for performing calculations on networks and graphs with applications in physics, computer science, statistics, and machine learning, including Bayesian inference, spin models, satisfiability, graph partitioning, network epidemiology, and the calculation of matrix eigenvalues. Despite its wide use, however, it has long been recognized that the method has a fundamental flaw: It works poorly on networks that contain short loops. Loops introduce correlations that can cause the method to give inaccurate answers or to fail completely in the worst cases. Unfortunately, most real-world networks contain many short loops, which limits the usefulness of the message-passing approach. In this paper we demonstrate how to rectify this shortcoming and create message-passing methods that work on any network. We give 2 example applications, one to the percolation properties of networks and the other to the calculation of the spectra of sparse matrices.
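For concreteness, the sketch below implements the classical tree-based message passing for bond percolation, which is the baseline the paper generalizes: each directed message `H[(i, j)]` is the probability that node `i` fails to reach the giant cluster through neighbor `j`. On graphs with short loops, such as the small complete graph used in the test, this baseline is only an approximation, which is precisely the flaw addressed above. Function name and fixed-point iteration scheme are illustrative.

```python
import numpy as np

def percolation_message_passing(edges, n, p, iters=200):
    # Tree-based message passing for bond percolation with occupation
    # probability p. Message update (exact on locally tree-like graphs):
    #   H[i->j] = 1 - p + p * prod over k in N(j)\{i} of H[j->k]
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    H = {(i, j): 0.5 for i, j in edges}
    H.update({(j, i): 0.5 for i, j in edges})
    for _ in range(iters):
        H = {(i, j): 1 - p + p * np.prod([H[(j, k)]
                                          for k in neighbors[j] if k != i])
             for (i, j) in H}
    # Probability that each node belongs to the giant cluster: it must
    # fail to connect through every one of its neighbors to be excluded.
    return [1 - np.prod([H[(i, j)] for j in neighbors[i]])
            for i in range(n)]
```

On the complete graph on 4 nodes the symmetric fixed point satisfies h = 1 - p + p*h^2, giving h = 1/9 at p = 0.9 and hence a per-node giant-cluster probability of 1 - (1/9)^3, while at p = 0.1 the method correctly settles on the nonpercolating solution.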

Modeling intracellular processes has long relied on the markovian assumption. However, as soon as a reactant interacts with its environment, molecular memory necessarily exists, and its effects cannot be neglected. Because Markov theory does not translate directly to the modeling and analysis of nonmarkovian processes, significant challenges arise. We develop a formulation, namely the stationary generalized chemical-master equation, to model intracellular processes with molecular memory. This formulation converts a nonmarkovian question to a markovian one while keeping the stationary probabilistic behavior unchanged. Both a stationary generalized Fokker–Planck equation and a generalized linear noise approximation are further developed for the fast evaluation of fluctuations. These formulations can have broad applications and may help us discover new biological knowledge.

Many cellular processes are governed by stochastic reaction events. These events do not necessarily occur in single steps of individual molecules; rather, each birth or death of a macromolecule (e.g., a protein) may involve several small reaction steps, creating a memory between individual events and thus leading to nonmarkovian reaction kinetics. Characterizing these kinetics is challenging. Here, we develop a systematic approach for a general reaction network with arbitrary intrinsic waiting-time distributions, which includes the stationary generalized chemical-master equation (sgCME), the stationary generalized Fokker–Planck equation, and the generalized linear-noise approximation. The first formulation converts a nonmarkovian issue into a markovian one by introducing effective transition rates (which explicitly decode the effect of molecular memory) for the reactions in an equivalent reaction network with the same substrates but without molecular memory. Nonmarkovian features of the reaction kinetics can be revealed by solving the sgCME. The latter 2 formulations can be used in the fast evaluation of fluctuations. These formulations can have broad applications, and, in particular, they may help us discover new biological knowledge underlying memory effects. When they are applied to generalized stochastic models of gene-expression regulation, we find that molecular memory is in effect equivalent to a feedback and can induce bimodality, fine-tune the expression noise, and induce switching.
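
The sgCME itself requires the full waiting-time machinery, but the core phenomenon, nonexponential waiting times creating memory that reshapes fluctuations, can be illustrated with a plain renewal process: Erlang-distributed gaps (a birth composed of several small steps) give sub-Poissonian counting noise, while exponential gaps recover the memoryless Fano factor of 1. The sketch below is only this illustration, not the paper's formulation; all names are assumptions.

```python
import numpy as np

def fano_factor(rng, draw_gap, t_window=10.0, n_windows=2000):
    # Fano factor (variance/mean) of event counts in consecutive fixed
    # windows for a renewal process whose waiting times are sampled by
    # draw_gap. Nonexponential gaps make the counting process
    # nonmarkovian, and the Fano factor deviates from 1.
    T = t_window * n_windows
    t, times = 0.0, []
    while True:
        t += draw_gap(rng)
        if t > T:
            break
        times.append(t)
    counts = np.histogram(times, bins=n_windows, range=(0.0, T))[0]
    return counts.var() / counts.mean()
```

With mean waiting time 1 in both cases, Erlang-2 gaps (a two-step birth, i.e., memory) push the Fano factor toward 1/2, a concrete instance of memory fine-tuning the noise.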

Matrix completion finds numerous applications in data science, ranging from information retrieval to medical imaging. While substantial progress has been made in designing estimation algorithms, it remains unknown how to perform optimal statistical inference on the unknown matrix given the obtained estimates—a task at the core of modern decision making. We propose procedures to debias the popular convex and nonconvex estimators and derive distributional characterizations for the resulting debiased estimators. This distributional theory enables valid inference on the unknown matrix. Our procedures 1) yield optimal construction of confidence intervals for missing entries and 2) achieve optimal estimation accuracy in a sharp manner.

Noisy matrix completion aims at estimating a low-rank matrix given only partial and corrupted entries. Despite remarkable progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained estimates and how to perform efficient statistical inference on the unknown matrix (e.g., constructing a valid and short confidence interval for an unseen entry). This paper takes a substantial step toward addressing such tasks. We develop a simple procedure to compensate for the bias of the widely used convex and nonconvex estimators. The resulting debiased estimators admit nearly precise nonasymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals/regions for, say, the missing entries and the low-rank factors. Our inferential procedures do not require sample splitting, thus avoiding unnecessary loss of data efficiency. As a byproduct, we obtain a sharp characterization of the estimation accuracy of our debiased estimators in both rate and constant. Our debiased estimators are tractable algorithms that provably achieve full statistical efficiency.
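The abstract does not give the debiasing formula, but the mechanism behind it can be illustrated in the fully observed case: nuclear-norm regularization acts by soft-thresholding singular values, which biases every retained singular value downward by the regularization weight, and debiasing adds that shrinkage back. The sketch below is a hypothetical, fully observed stand-in, not the paper's partial-observation estimator.

```python
import numpy as np

def svt(Y, lam):
    # Singular-value thresholding: the proximal operator of the nuclear
    # norm. This is the shrinkage that biases convex estimators downward
    # (shown here in the fully observed case for illustration).
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

def debias(Z, lam):
    # Compensate the shrinkage: add lam back to the retained (nonzero)
    # singular values. A minimal stand-in for the paper's debiasing step.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.where(s > 1e-10, s + lam, 0.0)) @ Vt
```

In this idealized setting, debiasing exactly undoes the shrinkage on the retained part of the spectrum; the paper's contribution is to make a correction of this flavor work under partial, noisy observations and to characterize the resulting estimator's distribution.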

Sensitivity of optimization algorithms to problem and algorithmic parameters leads to tremendous waste in time and energy, especially in applications with millions of parameters, such as deep learning. We address this by developing stochastic optimization methods that are demonstrably more robust, in both theory and experiments, and that enjoy optimal convergence guarantees for a variety of stochastic optimization problems. Additionally, we highlight the importance of a method's sensitivity to problem difficulty and algorithmic parameters.

Standard stochastic optimization methods are brittle, sensitive to stepsize choice and other algorithmic parameters, and they exhibit instability outside of well-behaved families of objectives. To address these challenges, we investigate models for stochastic optimization and learning problems that exhibit better robustness to problem families and algorithmic parameters. With appropriately accurate models—which we call the aprox family—stochastic methods can be made stable, provably convergent, and asymptotically optimal; even modeling that the objective is nonnegative is sufficient for this stability. We extend these results beyond convexity to weakly convex objectives, which include compositions of convex losses with smooth functions common in modern machine learning. We highlight the importance of robustness and accurate modeling with experimental evaluation of convergence time and algorithm sensitivity.
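
One member of this family, the truncated model, shows how "modeling that the objective is nonnegative" yields stability: each step minimizes the linearization of f at the current point, truncated below at 0 (a valid lower bound whenever f >= 0), plus a proximal term. That subproblem has a closed-form solution. The sketch below is written under exactly those assumptions; the function name is illustrative.

```python
import numpy as np

def truncated_sgd_step(x, f_val, grad, alpha):
    # One aprox ("truncated model") step:
    #   argmin_y  max(f_val + <grad, y - x>, 0) + ||y - x||^2 / (2*alpha)
    # whose minimizer is x - min(alpha, f_val/||grad||^2) * grad.
    # The step length is capped where the linear model hits zero, so the
    # iterate can never overshoot past the model's zero crossing,
    # regardless of how large the stepsize alpha is.
    g2 = float(np.dot(grad, grad))
    if g2 == 0.0:
        return x
    return x - min(alpha, f_val / g2) * grad
```

For f(x) = |x| at x = 5 with a huge stepsize, plain SGD would jump to -95, while the truncated step stops exactly at the minimizer: this insensitivity to the stepsize is the robustness emphasized above.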

Governing equations are essential to the study of physical systems, providing models that can generalize to predict previously unseen behaviors. There are many systems of interest across disciplines where large quantities of data have been collected, but the underlying governing equations remain unknown. This work introduces an approach to discover governing models from data. The proposed method addresses a key limitation of prior approaches by simultaneously discovering coordinates that admit a parsimonious dynamical model. Developing parsimonious and interpretable governing models has the potential to transform our understanding of complex systems, including in neuroscience, biology, and climate science.

The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam’s razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. In this work, we design a custom deep autoencoder network to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented. Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. This method places the discovery of coordinates and models on an equal footing.
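The SINDy half of this framework is concrete enough to sketch: given a library of candidate functions evaluated along a (latent) trajectory, sequentially thresholded least squares finds a sparse set of coefficients describing the dynamics. The autoencoder's role is to learn the coordinates z in which such a sparse fit exists; the sketch below assumes those coordinates are already given, and its names and default parameters are illustrative.

```python
import numpy as np

def sindy(Theta, dZdt, threshold=0.1, iters=10):
    # Sequentially thresholded least squares (the core SINDy regression):
    # fit dZ/dt = Theta @ Xi, then repeatedly zero out small coefficients
    # and refit on the surviving library terms, promoting parsimony.
    Xi, *_ = np.linalg.lstsq(Theta, dZdt, rcond=None)
    for _ in range(iters):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dZdt.shape[1]):
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big],
                                                 dZdt[:, k], rcond=None)
    return Xi
```

For a trajectory of dz/dt = -2z and a library of {1, z, z^2}, the regression recovers the single active term with coefficient -2, which is the kind of parsimonious, interpretable model the combined autoencoder-SINDy framework seeks in learned coordinates.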
