In studying predator-prey interactions, it is widely agreed that dynamic behavioral responses by both predators and prey should be considered. In practice, however, such consideration rarely occurs. Instead, predators are often treated as abstract sources of risk while prey behavior is observed, or prey are treated as fixed resources and predator behavior is observed. I conducted experiments in which predators (sevenspotted lady beetles, Coccinella septempunctata) and prey (pea aphids, Acyrthosiphon pisum) could move simultaneously between patches of habitat that varied in the amount of the prey's resource (Vicia faba) they contained. I used these experiments to test the predictions of ideal free distribution (IFD) models that consider simultaneous patch choices by predators and prey. The distributions of prey supported many IFD models since in the presence or absence of predators they matched their resource: the proportion of aphids in a patch was approximately equal to the proportion of resources in that patch. Interestingly, predators also reached an IFD, but they apparently accomplished this not by matching the distribution of prey per se, but rather by matching the prey's resource. I discuss some interesting consequences of this behavior by predators and of anti-predator behavior by prey. I then use these results to introduce and discuss a new reaction-diffusion-advection model of predator-prey habitat selection and movement.
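The resource-matching prediction mentioned above (the proportion of aphids in a patch approximately equals the proportion of resource in that patch) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the counts in the usage note are hypothetical and are not taken from the experiments.

```python
def proportions(counts):
    """Convert per-patch counts (or amounts) into proportions of the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not sum to zero")
    return [c / total for c in counts]


def matching_deviation(occupants, resources):
    """Largest absolute gap between occupant and resource proportions.

    A value near 0 is consistent with resource matching; larger values
    indicate under- or over-use of some patch.
    """
    occ = proportions(occupants)
    res = proportions(resources)
    return max(abs(o - r) for o, r in zip(occ, res))
```

For example, with hypothetical counts of 30 and 10 aphids in two patches holding 3 and 1 units of resource, `matching_deviation([30, 10], [3, 1])` is zero, whereas an even split of 20 and 20 aphids over the same patches gives a deviation of 0.25.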
Emerging infectious diseases present critical issues of public health and economic welfare. As demonstrated by the coordinated international response to SARS, novel diseases are being addressed via rapid genomic sequencing. However, our ability to make sense of these data lags behind acquisition.
First, genomic analyses such as the reconstruction of phylogenetic trees are computationally difficult, requiring novel algorithmic approaches and high performance computers. Next, even when phylogenetic trees are produced, we have hardly begun to understand how disease-causing organisms evolve and travel over various hosts and geography to become epidemics. To these ends we have created an interactive genomic and geographic map using phylogenetic trees and Google Earth to reconstruct the evolution and spread of avian influenza lineages (H5N1) over the past decade. By examining a phylogenetic tree of H5N1 projected onto the globe we have studied visually and statistically whether and where key genotypes in viral proteins are correlated with host shifts and resistance to therapeutic drugs.
I will provide other examples of how our workflow system, available in prototype at supramap.osu.edu, can be used to inspire and test retrospective and predictive hypotheses of the evolution and geographic spread of microbial pathogens in animal and human populations.
When we seek to compare shapes parameterized as a set of unlabeled points, we face the twin problems of i) shape correspondence and ii) shape deformation. While the problems of determining optimal shape correspondences or shape deformations may not arise in indexing situations, they are important in deformable shape registration - the problem of taking one shape onto another while least deforming the ambient space. Over the past few years, we have shown the efficacy of i) simultaneously solving for the correspondences and the deformation: TPS-RPM, ii) simultaneously clustering and matching the two shapes: JCM, iii) using the Jensen-Shannon divergence to solve for the deformation without parameterizing the correspondences, and iv) finding a deformation which minimizes a closed-form distance between two Gaussian mixture models for the shapes. Furthermore, we have shown that groupwise registration and atlas construction of point-sets can be performed in an unbiased manner using the aforementioned distance measures. The clinical problem that we are interested in is the retrospective and prospective classification of subjects with either left or right anterior temporal epileptic foci (and scheduled for lobectomy) in the left or right hippocampus respectively. We empirically demonstrate the importance of shape-based features (as opposed to volume-based features) in the automated classification of LATL and RATL subjects.
Many problems in biology and engineering can be formulated as pattern recognition problems. In such problems, linear methods are preferred for their simplicity and tractability. Unfortunately, linear methods have many limitations, of which several are still unknown. Understanding these limitations is key to advancing the current state of the art. In this talk, we will address these issues within the context of feature extraction and classification. We will define where linear feature extraction methods do not work and how this knowledge can be used to propose algorithms that are guaranteed to work in a large number of applications. Possible applications in biology will be discussed. Time permitting, we will sketch the problem posed by classical normalization procedures, in particular norm normalization, which is generally used to make shape descriptors invariant to scale and rotation and to model mtDNA in genetics. Open problems will be outlined during the course of the talk.
Bio: Aleix M. Martinez is an assistant professor in the Department of Electrical and Computer Engineering at The Ohio State University (OSU), where he is the founder and director of the Computational Biology and Cognitive Science Lab. He is also affiliated with the Department of Biomedical Engineering and with the Center for Cognitive Science. Prior to joining OSU, he was affiliated with the Electrical and Computer Engineering Department at Purdue University and with the Sony Computer Science Lab. He currently serves as an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and of Image and Vision Computing.
In organisms with RNA editing the messenger RNA is modified by base substitutions, deletions, or insertions with respect to the genomic template. Especially in the case of deletions and insertions, this makes it very difficult to identify genes in organisms with RNA editing. I will discuss several computational approaches based on the iteration of discrete transfer matrices that address this challenge. In Physarum polycephalum, one of the model organisms for RNA editing, these methods have led to the discovery of a significant number of new genes and even of a new type of editing.
DNA is commonly known as a long double-stranded molecule. Each strand is a polymer of simple units called nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T). The strands are anti-parallel (oppositely directed) and complementary: A in one DNA strand is chemically bound to T in the opposite strand, C bound to G, and vice versa. Nucleotide order defines the basic properties of DNA as a transmitter of hereditary information. From the bioinformatics standpoint, DNA is commonly considered as a nucleotide sequence.
However, for many biological processes (particularly those related to gene regulation) DNA properties are defined by the order of dinucleotides (successive nucleotides) in the sequence. This is shown with several examples from our research.
Overall, although the basic properties of DNA are defined by the sequence of nucleotides, its biophysical properties essential for gene regulation are largely defined by the sequence of dinucleotides.
Angiogenesis in the zebrafish embryo begins after the first day of development. During this time the intersegmental vessels in the trunk develop from the dorsal aorta in the first wave of embryonic angiogenesis. Previous work suggests a link between VEGF and Syndecan-2, which may function as a co-receptor for VEGF. We are currently developing equations that include terms expressing reaction, diffusion, and cell movement biased by "convection" like terms to model this interaction. These terms model the chemotactic influences on cells, and hence the interaction of the cells with the extracellular matrix that results in their directed movement towards the diffusible growth factor. Using this approach as a framework, we expect to develop mathematical models for angiogenesis for zebrafish that are both predictive and descriptive of growth factor signaling and extracellular matrix interactions during cell migration. Based on the high degree of conservation of signaling pathways involved in angiogenesis, we expect that modeling these processes in zebrafish will be directly applicable to tumor angiogenesis.
Genomics researchers produce vast quantities of data that require detailed analysis. The amount of information makes it impossible to manually analyze the data. Thus, many bioinformatics software tools have been developed for the purpose of analyzing large-scale data distributed across numerous public data repositories. The discipline of bioinformatics, like the field of genomics, is in its infancy. This talk will illustrate this point by highlighting recent findings of the international ENCyclopedia Of DNA Elements initiative (referred to as the ENCODE project), and by presenting exciting opportunities for computational genomics research.
CGH array experiments have become a powerful technique for analyzing changes in DNA by comparing a test DNA to a reference DNA. These experiments produce a huge amount of data and special statistical techniques are required for the detection of the alterations. Our intention was to design a method that is able to find gene copy number changes in highly noisy data. The quantile smoothing approach (Eilers and Menezes, 2005) is used for pre-processing of the data. Then, based on the assumption of rank order dependence of the probes and the jump character of gene copy number changes, the breakpoints are detected. Our method is sequential and is based on monitoring changes in variability of the distribution of the log ratios using a moving window of fixed width. The variability of the distribution of log ratios is estimated in each window applying a median absolute deviation concept. The underlying idea is that the variability is increased in those windows which cover breakpoints. When the variability of a window exceeds some critical level, the breakpoint is detected. The critical level is derived as a quantile of the empirical distribution of variability of the dataset. Finally, merging (Willenbrock and Fridlyand, 2005) is used to control the false positives. Performance of our method is demonstrated using simulated and publicly available data sets in comparison with DNAcopy (Olshen et al., 2004).
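The moving-window step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it omits the quantile-smoothing pre-processing and the final merging step, and the function names, the default window width, the default quantile `q`, and the choice to report the centre of the most variable window in each flagged run are all hypothetical.

```python
import statistics


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def window_variability(log_ratios, width):
    """MAD of the log ratios in every window of fixed width."""
    return [mad(log_ratios[i:i + width])
            for i in range(len(log_ratios) - width + 1)]


def detect_breakpoints(log_ratios, width=10, q=0.95):
    """Flag windows whose MAD exceeds an empirical quantile of all window MADs.

    Consecutive flagged windows are treated as one event, and the centre of
    the most variable window in each run is reported as the breakpoint index.
    """
    scores = window_variability(log_ratios, width)
    ranked = sorted(scores)
    # Critical level: the q-quantile of the empirical distribution of MADs.
    critical = ranked[min(len(ranked) - 1, int(q * len(ranked)))]
    flagged = [i for i, s in enumerate(scores) if s > critical]

    breakpoints, run = [], []
    for i in flagged + [None]:  # None acts as a sentinel closing the last run
        if run and (i is None or i != run[-1] + 1):
            best = max(run, key=lambda k: scores[k])
            breakpoints.append(best + width // 2)
            run = []
        if i is not None:
            run.append(i)
    return breakpoints
```

Because the MAD is a robust estimator, it rises sharply only when the jump lies near the middle of the window, which is why the sketch reports the window centre. On real data the quantile `q` and the window width would need tuning against the noise level.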
Our research program focuses on understanding the biochemical connections between genetic and environmental factors linked to schizophrenia, with a long-term goal of preventing and/or reducing illness severity. Disparate results by us and others on schizophrenia and related severe neuropsychiatric illnesses point to common metabolic perturbations and/or hypersensitivities in the balance between folate, methionine, and sulfur metabolism. These intersecting pathways require the essential nutrients: folate (vitamin B9), cobalamin (vitamin B12), vitamin B6 and methionine. Folate derivatives are required for the biosynthesis of nucleic acids (RNA and DNA), glutathione, S-adenosyl methionine (SAM), proteins, and lipids. The synthesis of glutathione, the major intracellular antioxidant and product of the transsulfuration pathway, competes with the synthesis of SAM, the major product of the methionine pathway. More than 100 methyl transferases use SAM in vivo including enzymes linked to DNA methylation (an epigenetic phenomenon), dopamine metabolism and schizophrenia. Schizophrenia is linked to multiple enzymes in these pathways through genetic studies. Environmental factors (e.g. paternal age, maternal starvation, folate deficiency, and infections) linked to disease impact these pathways and occur early in development. Early life is uniquely associated with exceptionally large amounts of DNA replication and post-replication DNA methylation (epigenetic) changes. Severe perturbations to the intersecting metabolic pathways and/or the balance between them will be lethal to a cell. However, less severe perturbations will have multiple complex cellular effects. In fact, models of common human diseases like schizophrenia require an understanding of the affected pathway(s) dynamics, including gene-environment interactions, exemplified by these studies.
The growth rate of platelet thrombi in vessels as affected by blood flow rate has been the subject of in vivo experiments, theory and computer modeling. In this talk, we will first review a relatively new mesoscopic method, Dissipative Particle Dynamics or DPD, which is particularly suited to modeling complex fluids. DPD can be seamlessly interfaced with MD and the Navier-Stokes equations for a multiscale modeling approach. I will then apply the method to platelet aggregation with and without the presence of red blood cells (RBCs). Finally, I will discuss recent results on modeling RBCs at the spectrin level and how to coarse-grain RBCs models using mean field theory to obtain effective properties.
I will present an overview of recent work focused on constructing computer models of developmental systems. In particular, we have devised an algorithm, called the Subcellular Element Model, which is able to simulate large numbers (thousands) of deformable cells in three dimensions. I will describe the inner workings of the algorithm, and indicate that the method is capable of modeling developmental systems over a wide range of scales - capturing cell visco-elasticity at small scales, and long-ranged coordinated cell movement at large scales. Modeling of primitive streak extension in the chick embryo will be discussed as a concrete application.
There is much interest in creating 3D models of cellular architecture of various tissue samples. Such tissue samples are often obtained to support phenotyping studies for cancer research. They are available for later analysis in the form of 3D microscopy images (confocal, histology, etc.). We describe a method that uses a series of segmentation and geometric visualization algorithms to reconstruct the cellular architecture as 3D models. By inspecting these 3D models one can fathom the changes wrought to cellular arrangements with the progression of disease. To demonstrate the efficacy of our methods, we describe their deployment on two specific phenotyping studies.
I will discuss two projects I have worked on this fall. The first I'll title "Degeneracy-driven dynamics of selective repertoires". A repertoire of recognizers (say antibodies, enzymes, predators, ...) relies on a diversity of resources (respectively, signals like antigens, substrates, prey, ...). Biological interactions are often not specific. This limited specificity is called "degeneracy" (Edelman and Gally, PNAS, 2001). We will introduce degeneracy into population models (the Verhulst and Lotka-Volterra models) and show some preliminary analysis and numerical behavior of the generalized Verhulst model. The second topic is titled "Identifying dendritic tree structure from voltage measurements". Given various assumptions on the class of dendrites, what types of voltage and current data imposed at the soma and/or terminal points can one use to determine the topological structure of the dendrite? I will discuss some background on dendrites and introduce assumptions and an approach to address one such inverse problem.
I will present a probability density approach to modeling localized Ca influx via L-type Ca channels and Ca-induced Ca release mediated by clusters of ryanodine receptors during excitation-contraction coupling in cardiac myocytes. Coupled advection-reaction equations are derived relating the time-dependent probability density of subsarcolemmal subspace and junctional sarcoplasmic reticulum [Ca] conditioned on "Ca release unit" state. When these equations are solved numerically using a high-resolution finite difference scheme and the resulting probability densities are coupled to ordinary differential equations for the bulk myoplasmic and sarcoplasmic reticulum [Ca], a realistic but minimal model of cardiac excitation-contraction coupling is produced. Modeling Ca release unit activity using this probability density approach avoids the computationally demanding task of resolving spatial aspects of global Ca signaling, while accurately representing heterogeneous local Ca signals in a population of diadic subspaces and junctional sarcoplasmic reticulum depletion domains. The probability density approach is validated for a physiologically realistic number of Ca release units and benchmarked for computational efficiency by comparison to traditional Monte Carlo simulations. [This is joint work with George S. B. Williams, Marco A. Huertas, Eric A. Sobie, and M. Saleet Jafri.]
We present a workflow designed to quantitatively characterize the 3-D structural attributes of macroscopic tissue specimens acquired at a micron level resolution using light microscopy. This workflow includes four major components: (i) Serial-section image acquisition, (ii) image preprocessing, (iii) image analysis involving 2-D pair-wise registration, 2-D segmentation and 3-D reconstruction, and (iv) visualization and quantification of phenotyping parameters. Several new algorithms have been developed within each workflow component. The biological applications of our work include a study of the morphological change in a mouse placenta induced by knocking out the retinoblastoma gene and our ongoing study on breast tumor microenvironment modeling.
Hibernation in small mammals involves extreme physiological changes, as well as some puzzling dynamics. Every one to two weeks, hibernating mammals arouse to normal temperatures but do not eat or drink. This consumes most of their fat reserves. Why do they do it? After reviewing the phenomenon of mammalian hibernation, a simple model based on one possible answer to that question will be presented (joint work with Matthew T. Andrews). Some future directions in modeling, experiment, and bioinformatics will be described as well.
It is commonly thought that virus evolution in vivo can contribute to or correlate with the progression of HIV infection from the asymptomatic phase towards AIDS. The virus evolves towards immune escape, increased replication kinetics, and a higher degree of cell killing, leading to the depletion of the T helper cell population. Mathematical models of in vivo HIV evolution have been useful in shaping our understanding of the disease process. However, the models considered so far assume that one cell can only harbor one virus particle. Recent data, however, indicate that one cell can be infected by more than one virus particle, a process called co-infection. I will discuss a mathematical model that studies the effect of co-infection on HIV evolution in vivo and on the process of disease progression. This gives rise to some counter-intuitive insights that find some support in experimental data. It also gives rise to a theory for why natural SIV infection does not progress to AIDS despite the presence of high virus loads and high virus diversity in some cases.
Accurate assessment of tumor margins and the recognition of occult disease within adjacent peritumoral tissues and regional lymph node basins are important oncologic principles that help to minimize recurrence rates and improve long-term patient outcomes. However, most currently utilized modalities for cancer detection and imaging are single modal, static, and focus primarily on preoperative image acquisition. Although each modality contributes a piece of a complex puzzle, co-registration between different modalities becomes a significant challenge, especially in a dynamic, intraoperative environment such as an operating room. We are developing portable multi-modal imaging systems for intraoperative, dynamic cancer detection and imaging. In this talk, I will address the current status of the technology development, the clinical potential, and the need for mathematicians' contributions.
Many statistical models have algebraic structure in that they are defined either parametrically or implicitly by polynomial conditions on a natural parameter space. These algebraic statistical models are ubiquitous in statistics and the algebraic structure can often be exploited to answer statistical or probabilistic questions. I will try to illustrate these two points with examples of Gaussian conditional independence models, conditional inference for log-linear models, and algebraic invariants of phylogenetic models.
Many motor control systems contain central pattern generator (CPG) neural networks. CPG networks typically can produce one or more stable rhythmic behaviors, for example corresponding to different quadruped gaits. In the marine mollusk Aplysia californica, a CPG network controlling feeding movements can produce biting, swallowing and rejection behaviors by changing the phase relationships of activity in individual network units. We investigate simplified D_4-equivariant models of a CPG neural network in which two stable limit cycles coexist, with the goal of understanding how noise (stochastic perturbations) might facilitate switching from one activity pattern to another. In particular we investigate conditions under which random perturbations of the deterministic network dynamics are simultaneously weak enough to preserve the form of the deterministic limit cycle attractors, and strong enough to induce spontaneous switching between them.
Joint work with Hillel J. Chiel (Biology) Case Western Reserve University.
I will present our recent work on the existence of multiple stationary solutions and convergence of dynamics (every solution converges to one of the equilibria as time goes to infinity) for Hopfield-type neural networks with delays. The theory is obtained through formulating parameter conditions motivated by a geometrical observation on the single neuron equation. We first construct 3^n equilibria for the n-neuron network. Positively invariant regions for the flows generated by the system and basins of attraction for 2^n of these 3^n stationary solutions are established. The theory is also extended to the existence of 2^n limit cycles for the networks with time-periodic inputs. For the convergence of dynamics, some conditions can be imposed to show that the system is strongly order preserving so that quasiconvergence is generic for the networks, as the self-feedback time lags are small for the neurons with negative self-connection weights.
A modified formulation in the spirit of ignoring the delays can also be developed to derive certain componentwise dynamical properties. An iteration argument is then constructed to conclude that every solution of the network converges to a single equilibrium as time tends to infinity.
This talk will describe some of our combined experimental and computational methods for determining the specificity of DNA-binding proteins and for discovering regulatory sites in genomic DNA sequences. It will cover aspects of the algorithms we have developed and the types of experiments we employ to test the predictions and refine the models. Examples from bacteria, yeast and worms will be described.
In this talk, I will introduce a quantile approach for analyzing GeneChip microarray data to detect differentially expressed genes through analyzing probe level measurements. The developed test makes no distributional assumptions, and it does not require estimating the unknown error density function. Our empirical studies with real experimental data show that detecting differences in the quartiles for the probe level data is a valuable complement to the usual mixed model analysis based on Gaussian likelihood. Aiming to improve the efficiency of the quantile rank score test at small samples, we propose an enhanced method to calibrate the intra-subject correlation estimation by sharing information across the "interesting" genes. The enhanced method shrinks the gene-specific correlation estimates towards a common value, with the degree of shrinkage depending on the variability of correlation coefficients across genes.
In current studies on cell movement in tissues, Friedl et al. have observed single metastatic cancer cells as they move through collagen network tissue. They show a characteristic form of movement, called "mesenchymal motion". Based on their observations, I will derive mathematical models for mesenchymal motion. On a mesoscopic level, I will formulate transport equations. To obtain macroscopic models in the form of advection-diffusion equations, I will use hyperbolic and parabolic scaling techniques. Numerical simulations of these models show interesting pattern formation in the form of networks. I will discuss specific applications and present new results on steady states.
The reverse engineering of biological networks is a major focus of research in the post-omics era. Gene networks are conceptual representations of interactions between genes and may provide important information about the regulatory aspects of the biological system under study. Applications in biomedical engineering include the design of specific drug targets that could maximize the effect of its action across the network. A multitude of methods are available to infer gene networks from data, some of which have specific data requirements in order to satisfy their theoretical framework. I propose to present a new method to reverse engineer gene networks from time series data based on the estimation of gene interactions by least squares fitting. By iteratively selecting genes to be perturbed (i.e., to be knocked out), constraints can be imposed on the network, thereby helping in the inference process.
The adaptive immune system has the convenient feature of being able to remember and defend the body against previously encountered pathogens, rendering long-term immunity to an individual who survives an initial acute infection. T-cell populations accomplish this task through their expansion and differentiation into subtypes of cells with effector (useful for eliminating pathogen) and memory (surviving) capabilities. Simple mathematical models using systems of ordinary differential equations can capture the dynamics of typical immune responses, and these models are useful for predicting proliferation and death rates of various subcategories of T cells. We discuss some findings based on parameter fitting in these basic models which assume a variety of differentiation pathways. We also present and discuss a T-cell population model that assumes that differentiation to memory cells is a continuous process dependent on the strength and duration of antigen exposure. This new model consists of a coupled pair of partial differential equations and results in a translating solution of the heat equation. Interestingly, this same mathematical model has been used to describe and analyze transport along nerve axons.
The biotechnological advances in the last decade have enabled the possibility of a reverse problem formulation for the modeling of systems structure and dynamics of genetic and metabolic networks. Some major challenges for the development of these reverse engineering methods are related to the construction of efficient algorithms to build robust models with respect to data noise and feasible ways to combine gene expression data with a priori knowledge to produce functional predictions of such networks.
In this talk, we will introduce an evolutionary computation based reverse engineering algorithm for constructing the underlying network structure and dynamics from gene expression data and combine it, when available, with a priori knowledge; in our proposed method, gene expression data include wildtype time courses as well as knockout perturbations. Our framework is that of polynomial dynamical systems (PDS), enabling the use of computational algebra tools to efficiently describe structural characteristics of the desired models. Experiments on artificial genetic networks, such as the segment polarity gene network in D. melanogaster, show the performance of the proposed algorithm in constructing a robust (with respect to data noise) mathematical model.
The fungus Magnaporthe grisea, commonly referred to as the rice blast fungus, is responsible for destroying from 10% to 30% of the world's rice crop each year. The fungus attaches to the rice leaf and forms a dome-shaped structure, the appressorium, in which enormous pressures are generated that are used to blast a penetration peg through the rice cell walls and infect the plant. We develop models for both the appressorial development and the penetration peg using exact, nonlinear, elasticity theory for shells and membranes. The model for appressorial design explains the shape of the appressorium, and its ability to maintain that shape under enormous increases in turgor pressure that can occur during the penetration phase. The model for the penetration peg provides the means of studying the effects of external surface stresses and the normal motion of material points on the cell surface.
Orexin-producing neurons are clearly essential for the regulation of wakefulness and sleep as loss of these cells produces narcolepsy. However, little is understood about how these neurons dynamically interact with other wake- and sleep-regulatory nuclei to control behavioral states. Using survival analysis of wake bouts in wild type and orexin knockout mice, we characterized the fragmentation of wakefulness observed in orexin knockout mice and identified a surprisingly delayed onset (> 1 min) of functional orexin effects. We incorporated these findings into a mathematical model of the mouse sleep/wake network, and the resulting simulated behavior accurately reflects the fragmented sleep/wake behavior of narcolepsy. Analysis of the model geometry provides insight into the mechanism associated with behavioral state instability in the simulated data and leads to several predictions.
In mammals, the respiratory rhythm is maintained under a wide range of conditions, depending on age, metabolic demand, and environmental factors. This rhythm is driven by a pacemaker system in the brainstem. Hence, a central question is, how does this pacemaker system generate such robust, adaptable rhythms? One component of the respiratory pacemaker system is the pre-Botzinger complex (pBC), a collection of neurons that can exhibit bursts of activity under appropriate conditions and that are coupled with synaptic excitation. I will discuss the mathematical analysis of the mechanisms by which synaptic coupling and heterogeneity can promote rhythmic activity in a model pBC network. This analysis incorporates fast-slow decomposition, bifurcation analysis, reduction of differential equations to maps, and a bit of graph theory.
Bacterial biofilms are the most ubiquitous form of life on the planet: more than 90% of bacteria live in aggregations called biofilms. Biofilms are a primary cause of death in people with cystic fibrosis (CF), cause Legionnaires' disease, are a major source of nosocomial infections, damage ships, and clog fluid-based industrial and food processing machinery, causing billions of dollars of damage annually. Biofilms are also used to improve performance of fertilizers, to manufacture many household products, and to clean industrial runoff. Biofilms exhibit complex behavior such as varying surface morphology, cell-to-cell communication, and symbiotic relationships. Consequently, it is important for many reasons to understand the formation, growth, and characteristics of bacterial biofilms so that they can be inhibited where they are undesirable and controlled where they are used to our advantage. In this talk I will discuss our work on modeling and simulation of bacterial biofilms. In particular, I will discuss two biofilm systems: Pseudomonas aeruginosa biofilms, which are the most common cause of death for people with CF, and autotroph/heterotroph systems that are used for nitrate and ammonia removal from waste water in activated sludge reactors.
A well-known problem in protein modeling is the determination of the structure of a protein with a given set of inter-atomic or inter-residue distances obtained from either physical experiments or theoretical estimates. A general form of the problem is known as the distance geometry problem in mathematics, the graph embedding problem in computer science, and the multidimensional scaling problem in statistics. The problem has applications in many other scientific and engineering fields as well, such as sensor network localization, image recognition, and protein classification. We describe the formulations and complexities of the problem in its various forms, and introduce a geometric buildup approach to the problem. Central to this approach is the idea that the coordinates of the atoms in a protein can be determined one atom at a time, with the distances from the determined atoms to the undetermined ones. The determination of each atom requires the solution of a small system of distance equations, which can usually be obtained in constant time. Therefore, in ideal cases, the coordinates of n atoms can be determined by a geometric buildup algorithm with O(n) distances in O(n) computing time instead of O(n^2) distances in O(n^2) computing time as required by a conventional singular-value decomposition algorithm. We present the general algorithm and discuss the methods for controlling the propagation of the numerical errors in the buildup process, for determining rigid vs. unique structures, and for handling problems with inexact distances (distances with errors). We show the results from applying the algorithm to a set of model protein problems with varying degrees of availability and accuracy of the distances and justify the potential use of the algorithm in protein modeling practice.
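The single-atom step described above can be illustrated with a minimal sketch. It assumes exact distances from the new atom to four already-placed, non-coplanar atoms; subtracting the first distance equation from the other three leaves a 3x3 linear system, solved here by Cramer's rule. The function names `det3` and `place_atom` are hypothetical, and this is not the speaker's implementation.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def place_atom(base, dists):
    """Locate one atom from its distances to four placed atoms.

    base  -- four (x, y, z) tuples, assumed not coplanar
    dists -- the four distances from the unknown atom to those points
    """
    p1, d1 = base[0], dists[0]
    n1 = sum(c * c for c in p1)
    A, b = [], []
    # |x - p_i|^2 = d_i^2 minus |x - p_1|^2 = d_1^2 gives the linear row
    # 2 (p_i - p_1) . x = |p_i|^2 - |p_1|^2 - d_i^2 + d_1^2
    for p, d in zip(base[1:], dists[1:]):
        A.append([2.0 * (p[k] - p1[k]) for k in range(3)])
        b.append(sum(c * c for c in p) - n1 - d * d + d1 * d1)
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("base atoms are (nearly) coplanar")
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = b[r]  # Cramer's rule: swap in the right-hand side
        solution.append(det3(M) / det)
    return tuple(solution)

# Usage: recover a known position from its four distances.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = (0.3, -1.2, 2.5)
dists = [math.dist(target, p) for p in base]
recovered = place_atom(base, dists)
```

Each call does a fixed amount of work, so placing n atoms this way costs O(n) in the ideal case; with inexact distances the errors can accumulate from atom to atom, which is the propagation issue the abstract mentions.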
There is a growing interest in establishing the transcriptional regulatory networks that govern the expression of all the genes in an organism, a necessary step towards building an "in silico" cell. The wealth of available sequence information has provided a fertile ground for data mining, making bioinformatics a valuable tool in aiding the establishment of connections in the complex web of gene interactions. However, there are significant experimental hurdles that slow down the production of experimental data that is essential for confirming predictions or for identifying connections difficult to anticipate by other means. I will describe some of the main experimental approaches involved in establishing gene regulatory motifs and explain how they are utilized in my laboratory to understand two aspects of plant epidermal cell differentiation: the formation of leaf hairs (trichomes) and the formation of pores (stomata) that allow gas exchange.
In this presentation I will introduce the Affymetrix genotype and copy-number microarray platform and show how it can be used to estimate whole-genome copy numbers (CNs). In July 2003 Affymetrix released the so-called "10K SNP" chip, which was designed for genotyping 10,000 single-nucleotide polymorphisms (SNPs), although early on various groups also proposed methods for estimating CNs. Since then, in less than four years, Affymetrix has released an additional four genotyping and CN assays in which the density of markers has increased by an order of magnitude with each generation. With the release of the GenomeWideSNP_6 ("GWS6") chip in June 2007 we now have 900,000 SNPs and 900,000 non-polymorphic loci at hand, averaging one CN marker per 1600 base pairs. This continuous and rapid development of marker density, together with an increasing number of samples per project, provides us not only with new opportunities but also with statistical and computational challenges. I will present a low-level single-locus CN method, together with a bounded-memory algorithm, that controls for PCR effects, non-balanced enzyme mixtures, cross-talk between alleles due to sequence homologies, and offset in obtained probe signals. The method is evaluated by comparing it with other available methods.
The ability to measure the expression of thousands of mRNA transcripts simultaneously using high-throughput genomic technology has revolutionized the field of genetics. In our study, RNA from a collection of lymphoblastoid cell lines was hybridized onto Affymetrix GeneChip arrays. The goal of this study is to discover the genes that are highly variable across human population subgroups. An ANOVA model was fitted, and the effect of population subgroup was isolated and tested after adjusting for confounding effects such as ChipLot, Operator, and Gender.
Here the complexity of our analysis lies in the fact that the covariates Gender and Population subgroup have gene-specific effects while the covariates ChipLot and Operator have a global effect (that is, they are common to all genes). In addition, our microarray data are highly unbalanced; therefore, no results in the microarray literature are of any help.
In this talk we discuss how multifactorial analysis of variance containing both global and gene-specific parameters can be carried out efficiently in spite of the large size of microarray data. We first derive an analytical form of the solutions of the normal equations and use these solutions to suggest a low-cost two-stage analysis. Our procedure can be viewed as an extension of the work by Kerr et al. (2000) for balanced (orthogonal) designs. We also review permutation tests for both balanced and unbalanced ANOVA designs. All these results are applied to our unbalanced microarray data. We also discuss how to get around the computational complexity of computing tens of thousands of empirical p-values efficiently with a large number of permutations (more than 100,000).
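To make the notion of an empirical p-value concrete, here is a minimal sketch of a permutation test for a single gene. It is deliberately simplified: it compares two groups by their difference in means rather than fitting the multifactorial ANOVA described above, and it does not attempt the two-stage procedure. The function name `permutation_pvalue` and the toy data are hypothetical.

```python
import random

def permutation_pvalue(values, labels, n_perm=2000, seed=0):
    """Empirical p-value for a two-group difference in means.

    values -- expression measurements for one gene
    labels -- 0/1 group label for each measurement
    """
    rng = random.Random(seed)

    def stat(lab):
        a = [v for v, g in zip(values, lab) if g == 0]
        b = [v for v, g in zip(values, lab) if g == 1]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between label and value
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Usage on toy data with two clearly separated groups.
values = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p = permutation_pvalue(values, labels)
```

Because the smallest attainable value is 1 / (n_perm + 1), resolving very small p-values for tens of thousands of genes requires a large number of permutations per gene, which is where the computational burden comes from.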
Systems biology addresses the biochemical and genetic networks responsible for cellular activity. This activity must be closely regulated in order for the cell to survive in a variable and unpredictable environment. The mathematical tools of systems and control theory were developed to aid in the design and analysis of man-made self-regulating systems. Those same techniques can be used to provide insight into the reverse-engineering of biochemical and genetic self-regulating systems. This talk will present such an analysis: a treatment of the role of negative feedback in the distribution of robustness.
A 1-D mathematical model for ameboid cell movement, based on linear viscoelastic fluid dynamics with a free-boundary formulation, will be discussed together with a model-based inverse problem. Based on the model, the inverse problem can be posed: given the constitutive relations and governing equations, what characteristic properties must the model parameters and unknowns have in order to reproduce a given movement of the cell, provided that the velocity field at any point is given? The inverse problem provides the model parameters that give some insight, principally into the mechanical aspect, but also, through qualitative reasoning, into chemical and biophysical aspects of the cell. Some numerical analysis and results of the inverse problem are also discussed.
This talk will consist of three segments which cover some of my past and present research on modeling, understanding, and controlling the acute inflammatory response. In the first part, a four-dimensional differential equation model of the acute inflammatory response is presented in the context of repeated endotoxin administrations. Lipopolysaccharide (LPS) or endotoxin can induce an acute inflammatory response comparable to a bacterial infection. In experiments with repeated endotoxin administration, the observation that a preconditioning dose can blunt the inflammatory response is known as endotoxin tolerance. Our findings support the hypothesis that endotoxin tolerance and other related phenomena can be considered as dynamic manifestations of a unified acute inflammatory response. The second part will touch on some mathematical results, regarding the behavior of transients, that were inspired by the endotoxin tolerance work. In the third and final part we use the previously mentioned model to investigate a prospective tool known as nonlinear model predictive control (NMPC), which may help determine suitable dose regimens in complex clinical settings. The advantage of this approach over other control algorithms is that it combines both a prediction of the future state of the system from a mathematical model and feedback from real-time data measurements to successively update a sequence of control moves that will help to optimize the desired outcome for a specific scenario.
Biofilms are accumulations of bacteria on a surface in aqueous systems. The bacteria attach to the surface and produce a gel-like polymeric matrix in which the cells themselves are embedded and in which vivid microbial communities develop. It is well accepted that biofilm communities are more difficult to eradicate with biocides than planktonic communities, which causes serious problems in medical and industrial contexts.
Among experimental biofilm researchers there is little doubt that the hydrodynamic conditions in the environment affect biofilm processes. In contrast, most biofilm modellers set their studies in a hydrostatic context, in part due to the increased complexity that comes with the Navier-Stokes equations.
We study a mathematical model for the formation of spatially heterogeneous biofilm morphologies and their response to biocides. This model will be coupled with a simplified (compared to the full Navier-Stokes equations) description of bulk hydrodynamics. We show some analytical results and numerical simulations.
It has long been known that metabolic rate, heart rate, and lifespan scale in a systematic and inter-related way with body size and temperature across species. These scaling relationships hold over an astronomical range in body size (~21 orders of magnitude) and across taxonomically diverse organisms that live in a myriad of environments. Moreover, these relationships for body mass are usually well approximated by power laws with exponents that are simple multiples of 1/4, and for body temperature by exponential Boltzmann-Arrhenius factors. I will describe a model to explain these relationships that focuses on the cardiovascular system and the kinetics of biochemical reactions. I will also discuss recent work of mine that shows how finite-size corrections and asymmetric branching can refine the original model's predictions. I will then present my work that builds on these scaling relationships to examine critical physiological and ecological processes. At the physiological level, I will discuss models to explore, for example, tumor growth dynamics, cell size, and why an elephant sleeps much less than a mouse. At the ecological level, I will outline a trait-based framework to investigate the effects of fluctuating environments on ecosystems and the effects of temperature on predator-prey interactions. Together, these have the potential to gauge the impact of climate change on ecosystem dynamics and stability.
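For orientation, the scaling relationships mentioned above are commonly written in the following form. The symbols are conventional notation assumed here rather than taken from the abstract: B is metabolic rate, M body mass, T absolute temperature, k Boltzmann's constant, E an effective activation energy, B_0 a normalization constant, f_H heart rate, and t_L lifespan.

```latex
B = B_0 \, M^{3/4} \, e^{-E/(kT)}, \qquad
f_H \propto M^{-1/4}, \qquad
t_L \propto M^{1/4}
```

The exponents 3/4, -1/4, and 1/4 are the "simple multiples of 1/4" for mass, and the exponential term is the Boltzmann-Arrhenius factor for temperature.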
A growing body of evidence indicates that animals do not simply move by performing a Brownian walk. By playing with the time spent before taking the next steps as well as with the length of those steps, an animal can perform more exotic random walks and optimize its movement statistics according to the situation. I will show how mathematical formalisms such as the so-called Generalized Master Equation or the differential equation describing Fractional Brownian motion are good candidates for describing anomalous statistics in movement patterns. Application of these ideas to movement patterns of some Mediterranean seabirds (Puffinus mauretanicus and Calonectris diomedea) as well as some rodent species, such as the deer mouse, has been instrumental in quantifying their food searching strategies.
In this talk, I will introduce two spatial problems in theoretical ecology together with their mathematical solutions.
The first part of the talk concerns competition between plants for sunlight. In it, I use a mechanistic Kolmogorov-type competition model to connect plant population vertical leaf profiles (or VLPs) to the asymptotic behavior of the resulting dynamical system. For different VLPs, conditions can be obtained for either competitive exclusion to occur or stable coexistence at one or more equilibrium points.
The second part of the talk concerns the spatial spread of infectious diseases. Here, I use a family of SI-type models to examine the ability of a disease, such as rabies, to invade or persist in a spatially heterogeneous habitat. I will discuss properties of the disease-free equilibrium and the behavior of the endemic equilibrium as the mobility of healthy individuals becomes very small relative to that of infecteds. The family of disease models consists variously of systems of difference equations (which I will emphasize), ODEs, and reaction-diffusion equations.
T cells of the immune system are activated by interactions with antigen-presenting cells (APC). T cell receptors (TCR) on the T cell surface transiently bind to defined signatures of infection (antigens) on the APC. Productive TCR-antigen binding leads to biochemical signals within the T cell and an immune response. T cell responses occur even when the antigen is present at very low concentrations. It has been suggested that during the T cell-APC interaction, over the course of minutes to hours, presented antigens can bind to a series of TCR and that such "serial engagement" is a key determinant of the T cell response. In this talk I will describe the biological questions in more detail and show how mathematical models can be used to interpret experimental data and propose experimentally testable hypotheses.
Bistable systems are very common modules in natural biological systems. In this work, well-characterized biological components are used to construct a genetic toggle switch in S. cerevisiae through mutual inhibition. Mathematical modeling is combined with molecular biology to design and construct the genetic toggle switch. We show that, guided by modeling predictions, we can achieve bistability by tuning the system. I will illustrate the artificial "cell differentiation", both experimentally and mathematically, by starting the switch from the third state, in which expression of both repressors is turned off. This work demonstrates the use of synthetic gene networks to uncover general regulatory mechanisms in natural biological systems.
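Bistability through mutual inhibition can be illustrated with a minimal sketch of a generic two-repressor model in dimensionless form, integrated with forward Euler. This is a textbook-style model assumed for illustration, not the speaker's yeast model; the function name `simulate_toggle` and the parameter values `alpha` and `n` are hypothetical choices.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate du/dt = alpha/(1+v^n) - u and dv/dt = alpha/(1+u^n) - v.

    u, v  -- dimensionless levels of the two mutually repressing proteins
    alpha -- maximal synthesis rate (assumed value)
    n     -- cooperativity of repression (assumed value)
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u  # v represses synthesis of u
        dv = alpha / (1.0 + u ** n) - v  # u represses synthesis of v
        u += dt * du
        v += dt * dv
    return u, v

# Usage: two nearby starting points settle into opposite stable states.
u_a, v_a = simulate_toggle(2.0, 0.1)   # ends with u high, v low
u_b, v_b = simulate_toggle(0.1, 2.0)   # ends with v high, u low
```

With these assumed parameters the model has two stable states and an unstable symmetric one. A start with both levels exactly equal, such as both repressors off, stays on the symmetric line in this deterministic sketch, so in a real cell it would be noise or a small asymmetry that selects the final state.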
Mathematical modeling and computer simulations play increasingly important roles in cancer research, the most important of which is providing a framework to capture the various mechanisms underlying tumor growth and angiogenesis (new blood vessel formation). A tumor develops through avascular, angiogenic, and vascularized stages to become malignant. Initially a tumor utilizes nutrients diffusing from parent vessels to grow, but only up to a limited size (~2 mm in diameter). The key to reaching the malignant stage is the angiogenesis process, through which the tumor obtains an extra source of nutrients and begins to grow out of control. In this talk, I will first present a mathematical model that serves as a framework for cancer research by simulating the growth of a tumor through all these stages, then describe a chemotherapy model to predict drug efficacy under conditions of vascular and morphological heterogeneity. Finally I will present a new angiogenesis model that addresses the proper relationship between the proliferation and migration of endothelial cells (which line the blood vessels), which is the key to understanding vasculogenesis (formation of the blood vessel plexus in the embryo) and angiogenesis processes.
We will explain the need for stochastic reaction-diffusion models appropriate for studying the dynamics of gene and signaling networks within biological cells. In particular, we will describe our work developing a stochastic reaction-diffusion method that can incorporate the complex geometry of cellular architecture, and the application of this method to a model for eukaryotic gene expression and nuclear transport. This work raised the question of what the reaction-diffusion master equation (RDME), a lattice-based stochastic reaction-diffusion model, approximates as the lattice spacing is decreased. We will discuss our recent work proving that in the continuum limit reaction effects are lost in the RDME model. While this may seem a negative feature, we will also show how the RDME for finite lattice spacings may be interpreted as an asymptotic approximation to a spatially-continuous stochastic reaction-diffusion model due to Smoluchowski. We will conclude with a brief introduction to a new, long-term modeling project we have begun, developing stochastic reaction-diffusion models of gene/signaling networks involved in several cardiac conditions.
Tiling planforms dominated by diamonds (such as the diamond-shaped seeds on a sunflower head), hexagons, or ridges (such as those on saguaro cacti) are observed on many plants. We analyze PDE models for the formation of these patterns that incorporate the effects of growth and biophysical and biochemical mechanisms. The aim is to understand both the underlying symmetries and the information specific to the mechanisms. The patterns are compared to Voronoi tessellations, and we will start to draw a bigger picture of growth and symmetry in biological systems.
Mutations in the breast cancer associated tumor suppressor-1 protein, BRCA1, are linked to hereditary breast cancer, and low protein expression has been associated with non-familial (sporadic) breast cancer. My laboratory has been working to determine what biochemical reactions are at the heart of how BRCA1 functions to block tumorigenesis. We have used biochemical and cell biology experiments in order to identify biochemical reactions regulated by the enzymatic activity of BRCA1, and lately we have begun to use systems biology approaches in order to identify protein networks in which BRCA1 is key. In this talk, a collaborative study will be described in which a single large gene expression dataset was used to identify genes/proteins in a BRCA1-centered network. The basic principle applied in this study was that genes that function together in a complex process, such as breast tumorigenesis, would have their mRNA expression co-regulated. Thus, to find genes that function together, we analyzed co-expression with specific reference genes in order to find members of the network. Other systems data were applied to these to rank identified genes, and several top-ranked genes/proteins were analyzed experimentally. One gene, called HMMR, was found to interact with BRCA1 at the centrosome and to regulate centrosome number in concert with BRCA1. Further, the genotype of HMMR was analyzed in patient cases, and specific haplotypes of the HMMR gene were associated with an increased risk of breast cancer. Another gene/protein identified using the co-expression analysis was the Aurora-A kinase, and this gene was shown to regulate the enzymatic activity of BRCA1. This bioinformatics approach proved a powerful means for identifying key BRCA1 interactions that are important for controlling the centrosome and that have an important role in the etiology of breast cancer.
The broad problem of data integration as I see it is three-fold: How to enable the study of integrated data within the context of a biological question; how to integrate differing data types such as categorical or continuous data; and how to compensate for variations between data collected by different groups and/or technologies as well as biological diversity between different samples or cell types. I will lead a cursory discussion on these three issues and spend the remainder of the time introducing novel methods that have been proposed for addressing the question of data integration.