Author response: Efficient and accurate extraction of in vivo calcium signals from microendoscopic video data

2018 · Open Access · DOI: https://doi.org/10.7554/elife.28728.031
Abstract

In vivo calcium imaging through microendoscopic lenses enables imaging of previously inaccessible neuronal populations deep within the brains of freely moving animals. However, it is computationally challenging to extract single-neuronal activity from microendoscopic data, because of the very large background fluctuations and high spatial overlaps intrinsic to this recording modality. Here, we describe a new constrained matrix factorization approach to accurately separate the background and then demix and denoise the neuronal signals of interest. We compared the proposed method against previous independent components analysis and constrained nonnegative matrix factorization approaches. On both simulated and experimental data recorded from mice, our method substantially improved the quality of extracted cellular signals and detected more well-isolated neural signals, especially in noisy data regimes. These advances can in turn significantly enhance the statistical power of downstream analyses, and ultimately improve scientific conclusions derived from microendoscopic data. https://doi.org/10.7554/eLife.28728.001

Introduction

Monitoring the activity of large-scale neuronal ensembles during complex behavioral states is fundamental to neuroscience research. Continued advances in optical imaging technology are greatly expanding the size and depth of neuronal populations that can be visualized. Specifically, in vivo calcium imaging through microendoscopic lenses and the development of miniaturized microscopes have enabled deep brain imaging of previously inaccessible neuronal populations of freely moving mice (Flusberg et al., 2008; Ghosh et al., 2011; Ziv and Ghosh, 2015).
This technique has been widely used to study the neural circuits in cortical, subcortical, and deep brain areas, such as hippocampus (Cai et al., 2016; Ziv et al., 2013; Jimenez et al., 2018; Rubin et al., 2015), entorhinal cortex (Kitamura et al., 2015; Sun et al., 2015), hypothalamus (Jennings et al., 2015), prefrontal cortex (PFC) (Pinto and Dan, 2015), premotor cortex (Markowitz et al., 2015), dorsal pons (Cox et al., 2016), basal forebrain (Harrison et al., 2016), striatum (Barbera et al., 2016; Carvalho Poyraz et al., 2016; Klaus et al., 2017), amygdala (Yu et al., 2017), and other brain regions. Although microendoscopy has potential applications across numerous neuroscience fields (Ziv and Ghosh, 2015), methods for extracting cellular signals from this data are currently limited and suboptimal. Most existing methods are specialized for two-photon or light-sheet microscopy. However, these methods are not suitable for analyzing single-photon microendoscopic data because of its distinct features: specifically, this data typically displays large, blurry background fluctuations due to fluorescence contributions from neurons outside the focal plane. In Figure 1, we use a typical microendoscopic dataset to illustrate these effects (see Video 1 for the raw video). Figure 1A shows an example frame of the selected data, which contains large fluorescence signals in addition to the neurons visible in the focal plane. These extra fluorescence signals contribute a background that contaminates the single-neuronal signals of interest. In turn, standard methods based on local correlations for visualizing cell outlines (Smith and Häusser, 2010) are not effective here, because the correlations in the fluorescence of nearby pixels are dominated by background signals (Figure 1B). For some neurons with strong visible signals, we can manually draw regions of interest (ROIs) (Figure 1C).
Following (Barbera et al., 2016; Pinto and Dan, 2015), we used the mean fluorescence trace of the surrounding pixels (blue, Figure 1D) to roughly estimate this background fluctuation; subtracting it from the raw trace in the neuron ROI yields a relatively good estimate of the neuron's signal (red, Figure 1D). Figure 1D shows that the background (blue) has much larger variance than the relatively sparse neural signal (red); moreover, the background signal fluctuates on similar timescales as the single-neuronal signal, so we cannot simply temporally filter the background away after extraction of the mean signal within the ROI. This large background signal is likely due to a combination of local fluctuations resulting from out-of-focus fluorescence or neuropil activity, hemodynamics of blood vessels, and global fluctuations shared more broadly across the field of view (photo-bleaching effects, drifts in z of the focal plane, etc.), as illustrated schematically in Figure 1E.

Figure 1. Microendoscopic data contain large background signals with rapid fluctuations due to multiple sources. (A) An example frame of microendoscopic data recorded in dorsal striatum (see Materials and methods section for experimental details). (B) The local 'correlation image' (Smith and Häusser, 2010) computed from the raw video data. Note that it is difficult to discern neuronal shapes in this image due to the high background spatial correlation level. (C) The mean-subtracted data within the cropped area (green) in (A). Two ROIs were selected and coded with different colors. (D) The mean fluorescence traces of pixels within the two selected ROIs (magenta and blue) shown in (C), and the difference between the two traces. (E) Cartoon illustration of various sources of fluorescence signals in microendoscopic data. 'BG' abbreviates 'background'.
https://doi.org/10.7554/eLife.28728.002

Video 1. An example of typical microendoscopic data. The video was recorded in dorsal striatum; experimental details can be found above. https://doi.org/10.7554/eLife.28728.003

The existing methods for extracting individual neural activity from microendoscopic data can be divided into two classes: semi-manual ROI analysis (Barbera et al., 2016; Klaus et al., 2017; Pinto and Dan, 2015) and PCA/ICA analysis (Mukamel et al., 2009). Unfortunately, both approaches have well-known flaws (Resendez et al., 2016). For example, ROI analysis does not effectively demix the signals of spatially overlapping neurons, and drawing ROIs is laborious for large population recordings. More importantly, in many cases the background contamination is not adequately corrected, and thus the extracted signals are not sufficiently clean for downstream analyses. As for PCA/ICA analysis, it is a linear demixing method and therefore typically fails when the neural components exhibit strong spatial overlaps (Pnevmatikakis et al., 2016), as is the case in the microendoscopic setting. Recently, constrained nonnegative matrix factorization (CNMF) approaches were proposed to simultaneously denoise, deconvolve, and demix calcium imaging data (Pnevmatikakis et al., 2016). However, current implementations of the CNMF approach were optimized for two-photon and light-sheet microscopy, where the background has a simpler spatiotemporal structure. When applied to microendoscopic data, CNMF often has poor performance because the background is not modeled sufficiently accurately (Barbera et al., 2016). In this paper, we significantly extend the CNMF framework to obtain a robust approach for extracting single-neuronal signals from microendoscopic data.
Specifically, our extended CNMF for microendoscopic data (CNMF-E) approach utilizes a more accurate and flexible spatiotemporal background model that is able to handle the properties of the strong background signal illustrated in Figure 1, along with new specialized algorithms to initialize and fit the model components. After a brief description of the model and algorithms, we first use simulated data to illustrate the power of the new approach. Next, we compare CNMF-E with PCA/ICA analysis comprehensively on both simulated data and four experimental datasets recorded in different brain areas. The results show that CNMF-E outperforms PCA/ICA in terms of detecting more well-isolated neural signals, extracting cellular signals with higher signal-to-noise ratio (SNR), and obtaining more robust results in low-SNR regimes. Finally, we show that downstream analyses of calcium imaging data can substantially benefit from these improvements.

Model and model fitting

CNMF for microendoscope data (CNMF-E)

The recorded video data can be represented by a matrix Y∈ℝ+^(d×T), where d is the number of pixels in the field of view and T is the number of frames observed. In our model, each neuron i is characterized by its spatial 'footprint' vector 𝒂i∈ℝ+^d, characterizing the cell's shape and location, and its 'calcium activity' time series 𝒄i∈ℝ+^T, modeling (up to a multiplicative and additive constant) cell i's mean fluorescence signal at each frame. Here, both 𝒂i and 𝒄i are constrained to be nonnegative because of their physical interpretations. The background fluctuation is represented by a matrix B∈ℝ+^(d×T). If the field of view contains a total of K neurons, then the observed movie data is modeled as a superposition of all neurons' spatiotemporal activity, plus time-varying background and additive noise:

(1) Y = ∑_{i=1}^{K} 𝒂i⋅𝒄iᵀ + B + E = AC + B + E,

where A = [𝒂1,…,𝒂K] and C = [𝒄1,…,𝒄K]ᵀ.
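To make the notation concrete, the generative model of Equation (1) can be simulated in a few lines of NumPy. This is a minimal sketch, not the authors' code; all sizes, sparsity levels, and parameter values here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, K = 64, 200, 3  # pixels, frames, neurons (illustrative sizes)

# Nonnegative spatial footprints A (d x K): each column is one neuron's
# shape, sparse because a neuron covers few pixels.
A = np.abs(rng.normal(size=(d, K)))
A[rng.random((d, K)) < 0.8] = 0.0

# Nonnegative calcium traces C (K x T): sparse spikes passed through a
# slow exponential decay (AR(1) dynamics; cf. Equation (3) below).
S = (rng.random((K, T)) < 0.02).astype(float)
C = np.zeros((K, T))
for t in range(T):
    C[:, t] = (0.95 * C[:, t - 1] if t > 0 else 0.0) + S[:, t]

# Background B (d x T): a smooth low-rank term standing in for the
# out-of-focus fluorescence and global fluctuations.
B = np.abs(rng.normal(size=(d, 2))) @ np.abs(rng.normal(size=(2, T)))

E = 0.1 * rng.normal(size=(d, T))  # spatially/temporally uncorrelated noise
Y = A @ C + B + E                  # observed movie, Equation (1)
```

The factorization task is then the inverse problem: given only Y, recover A, C, and B under the constraints described below.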
The noise term E∈ℝ^(d×T) is modeled as Gaussian, E(t)∼𝒩(0,Σ), where Σ is a diagonal matrix, indicating that the noise is spatially and temporally uncorrelated. Estimating the model parameters A, C in model (1) gives us all neurons' spatial footprints and their denoised temporal activity. This can be achieved by minimizing the residual sum of squares (RSS), aka the Frobenius norm of the matrix Y−(AC+B),

(2) ∥Y−(AC+B)∥F²,

while requiring the model variables A, C, and B to follow the desired constraints, discussed below.

Constraints on neuronal spatial footprints A and neural temporal traces C

Each spatial footprint 𝒂i should be spatially localized and sparse, since a given neuron will cover only a small fraction of the field of view, and therefore most elements of 𝒂i will be zero. Thus, we need to incorporate spatial locality and sparsity constraints on A (Pnevmatikakis et al., 2016); we discuss details further below. Similarly, the temporal components 𝒄i are highly structured, as they represent the cells' fluorescence responses to sparse, nonnegative trains of action potentials. Following (Vogelstein et al., 2010; Pnevmatikakis et al., 2016), we model the calcium dynamics 𝒄i of each neuron with a stable autoregressive (AR) process of order p,

(3) ci(t) = ∑_{j=1}^{p} γj(i) ci(t−j) + si(t),

where si(t) ≥ 0 is the number of spikes that neuron i fired in the t-th frame. (Note that there is no further noise input into ci(t) beyond the spike signal si(t).) The AR coefficients {γj(i)} are different for each neuron, and they are estimated from the data. In practice, we usually pick p=2, thus incorporating both a nonzero rise time and a decay time of calcium transients in response to a spike; then Equation (3) can be expressed in matrix form as

(4) Gi⋅𝒄i = 𝒔i, with

G_i = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\gamma_1^{(i)} & 1 & 0 & \cdots & 0 \\ -\gamma_2^{(i)} & -\gamma_1^{(i)} & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\gamma_2^{(i)} & -\gamma_1^{(i)} & 1 \end{bmatrix}.
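The AR(2) recursion in Equation (3) and its matrix form Gi⋅𝒄i = 𝒔i (Equation 4) can be verified numerically. The coefficients and spike times below are hypothetical, chosen only so that the AR process is stable:

```python
import numpy as np

T = 100
g1, g2 = 1.6, -0.64  # AR(2) coefficients; both roots at 0.8, inside the unit circle

# Sparse nonnegative "spike" signal s and the resulting calcium trace c,
# following Equation (3): c(t) = g1*c(t-1) + g2*c(t-2) + s(t).
s = np.zeros(T)
s[[10, 40, 41, 70]] = [1.0, 0.5, 0.8, 1.2]
c = np.zeros(T)
for t in range(T):
    c[t] = s[t]
    if t >= 1:
        c[t] += g1 * c[t - 1]
    if t >= 2:
        c[t] += g2 * c[t - 2]

# Equation (4): G has ones on the diagonal and -g1, -g2 on the first two
# subdiagonals, so applying G inverts the dynamics and recovers the spikes.
G = np.eye(T) - g1 * np.eye(T, k=-1) - g2 * np.eye(T, k=-2)
assert np.allclose(G @ c, s)
```

Because G is lower-triangular and banded, deconvolving a trace into spikes (and vice versa) costs only O(pT) operations per neuron.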
The neural activity 𝒔i is nonnegative and typically sparse; to enforce sparsity, we can penalize the ℓ0 (Jewell and Witten, 2017) or ℓ1 (Pnevmatikakis et al., 2016; Vogelstein et al., 2010) norm of 𝒔i, or limit the minimum size of nonzero spike counts (Friedrich et al., 2017b). When the rise time constant is small compared to the time-bin width (i.e., at low imaging frame rates), we typically use a simpler AR(1) model (with an instantaneous rise following a spike) (Pnevmatikakis et al., 2016).

Constraints on background activity B

In the above we have largely followed previously described CNMF approaches (Pnevmatikakis et al., 2016) for modeling calcium imaging signals. However, to accurately model the background effects in microendoscopic data, we need to depart significantly from these previous approaches. Constraints on the background term B in Equation (1) are essential to the success of CNMF-E: clearly, if B were completely unconstrained, we could simply absorb the observed data Y entirely into B, which would lead to recovery of no neural activity. At the same time, we need to prevent the residual of the background term (i.e. B−B̂, where B̂ denotes the estimated spatiotemporal background) from corrupting the estimated neural signals AC in model (1); otherwise, the extracted neuronal activity would be mixed with background fluctuations, leading to artificially high correlations between nearby cells. This problem is even worse in the microendoscopic context, because the background fluctuation usually has significantly larger variance than the isolated cellular signals of interest (Figure 1D), and therefore even small errors in the estimation of B can severely corrupt the estimated neural signal AC. In (Pnevmatikakis et al., 2016), B is modeled as a rank-1 nonnegative matrix, B = 𝒃⋅𝒇ᵀ, where 𝒃∈ℝ+^d and 𝒇∈ℝ+^T. This model mainly captures the global fluctuations within the field of view (FOV).
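For concreteness, the rank-1 background model B = 𝒃⋅𝒇ᵀ can be fit by simple alternating nonnegative least squares. This sketch is illustrative (it is not the implementation in the cited work), and the synthetic movie here is rank-1 by construction:

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 50, 120
# Synthetic movie dominated by one shared global fluctuation (rank-1 truth).
b_true = np.abs(rng.normal(size=(d, 1)))
f_true = np.abs(rng.normal(size=(1, T)))
Y = b_true @ f_true + 0.01 * np.abs(rng.normal(size=(d, T)))

# Alternating nonnegative least squares for B ≈ b f^T.
b = np.ones((d, 1))
f = np.ones((1, T))
for _ in range(50):
    f = np.maximum(b.T @ Y / (b.T @ b), 0.0)  # optimal f given b
    b = np.maximum(Y @ f.T / (f @ f.T), 0.0)  # optimal b given f

rel_err = np.linalg.norm(Y - b @ f) / np.linalg.norm(Y)
```

Because this toy movie is nearly rank-1 the fit is almost exact; the point made next is that real microendoscopic backgrounds are not rank-1, so this model leaves large residuals.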
In applications to two-photon or light-sheet data, this rank-1 model has been shown to be sufficient for relatively small spatial regions; the simple low-rank model does not hold for larger fields of view, but in that case we can simply divide large FOVs into smaller patches for parallel processing (Pnevmatikakis et al., 2016; Giovannucci et al., 2017b). (See [Pachitariu et al., 2016] for an alternative approach.) However, as we will see below, the local rank-1 model fails in many microendoscopic datasets, where multiple large overlapping background sources exist even within modestly sized FOVs. Thus, we propose a new model to constrain the background term B. We first decompose the background into two terms:

(5) B = Bf + Bc,

where Bf represents fluctuating activity and Bc = 𝒃0⋅𝟏ᵀ models constant baselines (𝟏∈ℝ^T denotes a vector of T ones). To model Bf, we exploit the fact that background sources (largely due to blurred out-of-focus fluorescence) are empirically much coarser spatially than the average neuron soma size l. Thus, we model Bf at each pixel as a linear combination of the background fluorescence in pixels which are chosen to be nearby, but not nearest neighbors:

(6) Bf_it = ∑_{j∈Ωi} wij⋅Bf_jt, ∀ t = 1,…,T, where Ωi = {j | dist(𝒙i,𝒙j) ∈ [ln, ln+1)},

with dist(𝒙i,𝒙j) the Euclidean distance between pixels i and j. Thus, Ωi selects only the neighboring pixels at a distance of approximately ln from the i-th pixel (the green dot and black pixels in Figure 2B illustrate i and Ωi, respectively); here ln is a parameter that we choose to be greater than l (the size of the typical soma in the FOV), e.g., ln = 2l. This choice of ln ensures that pixels i and j in Equation (6) share similar background fluctuations but do not belong to the same soma.

Figure 2. CNMF-E can accurately separate and recover the background fluctuations in simulated data.
(A) An example frame of simulated microendoscopic data, formed by summing up the fluorescent signals from the multiple sources illustrated in Figure 1E. (B) A zoomed-in version of the circle in (A). The green dot indicates the pixel of interest. The surrounding black pixels are its neighbors at a distance of 15 pixels. The red area approximates the size of a typical neuron in the simulation. (C) Raw fluorescence traces of the selected pixel and some of its neighbors on the black ring. Note the high correlation. (D) Fluorescence traces (raw data; true and estimated background; true and initial estimate of neural signal) from the center pixel selected in (B). Note that the background dominates the raw data in this pixel, but nonetheless we can accurately estimate the background and subtract it away. Scale bars: 10 s. Panels (E–G) show the cellular signals in the same frame as (A). (E) Ground-truth neural activity. (F) The residual of the raw frame after subtracting the background estimated with CNMF-E; note the close correspondence with (E). (G) Same as (F), but with the background estimated by rank-1 NMF. A video showing (E–G) for all frames can be found in Video 2. (H) The mean correlation coefficient (over all pixels) between the true background fluctuations and the estimated background fluctuations. The rank of the NMF model varies, and we run randomly initialized NMF 10 times for each rank. The red line is the performance of CNMF-E, which requires no selection of the NMF rank. (I) The performance of CNMF-E and rank-1 NMF in recovering the background fluctuations from data superimposed with an increasing number of background sources. https://doi.org/10.7554/eLife.28728.004
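The ring-shaped support of the weight matrix W in Equation (6), illustrated by the black pixels in Figure 2B, can be constructed as follows. The grid size and ring radius are illustrative, and uniform ring weights are used as a placeholder; CNMF-E estimates the actual weights from the data:

```python
import numpy as np

n = 20                      # n x n field of view, d = n*n pixels
xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
d = coords.shape[0]

l_n = 6.0                   # ring radius, chosen larger than a typical soma

# W_ij may be nonzero only for pixels j at distance [l_n, l_n + 1) from i,
# matching the support constraint Omega_i in Equation (6).
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
support = (dist >= l_n) & (dist < l_n + 1.0)
W = support / support.sum(axis=1, keepdims=True)  # uniform ring weights

# Given a residual R = Y - A C - b0 1^T, the relaxed model (Equation 8)
# estimates the fluctuating background as Bf = W @ R: each pixel's
# background is a weighted average over its ring of neighbors.
```

Note that a pixel never uses itself (or same-soma neighbors) to predict its own background, which is what prevents neural signals from leaking into the background estimate.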
Video 2. Comparison of CNMF-E with rank-1 NMF in estimating background fluctuations in simulated data. Top left: the simulated fluorescence data from Figure 2. Bottom left: the ground truth of neuron signals in the simulation. Top middle: the background estimated from the raw video data (top left) using CNMF-E. Bottom middle: the residual of the raw video after subtracting the background estimated with CNMF-E. Top right and bottom right: same as top middle and bottom middle, but with the background estimated by rank-1 NMF. https://doi.org/10.7554/eLife.28728.005

We can rewrite Equation (6) in matrix form:

(7) Bf = WBf, where Wij = 0 if dist(𝒙i,𝒙j) ∉ [ln, ln+1).

In practice, this hard constraint is difficult to enforce computationally and is overly stringent given the noisy observed data. We relax the model by replacing Bf on the right-hand side with the more convenient closed-form expression

(8) Bf = W⋅(Y − AC − 𝒃0⋅𝟏ᵀ).

According to Equations (1) and (5), this change ignores the noise term E; since the elements of E are spatially uncorrelated, W⋅E contributes only a very small disturbance to B̂f on the left-hand side. We found that this substitution for B̂f led to significantly faster and more robust model fitting.

Fitting the CNMF-E model

Table 1 lists the variables in the proposed CNMF-E model. Now we can formulate the estimation of all model variables as a single optimization meta-problem:

(P-All) minimize over A, C, S, Bf, W, 𝒃0: ∥Y − AC − 𝒃0⋅𝟏ᵀ − Bf∥F²
subject to: A ≥ 0, A is sparse and spatially localized;
𝒄i ≥ 0, 𝒔i ≥ 0, G(i)𝒄i = 𝒔i, 𝒔i is sparse, ∀ i = 1,…,K;
Bf⋅𝟏 = 0;
Bf = W⋅(Y − AC − 𝒃0⋅𝟏ᵀ);
Wij = 0 if dist(𝒙i,𝒙j) ∉ [ln, ln+1).

Table 1. Variables used in the CNMF-E model and algorithm. ℝ: real numbers; ℝ+: positive real numbers; ℕ: natural numbers; ℕ+: positive integers.
https://doi.org/10.7554/eLife.28728.006

Name | Description | Domain
d | number of pixels | ℕ+
T | number of frames | ℕ+
K | number of neurons | ℕ
Y | motion-corrected video data | ℝ+^(d×T)
A | spatial footprints of all neurons | ℝ+^(d×K)
C | temporal activities of all neurons | ℝ+^(K×T)
B | background activity | ℝ+^(d×T)
E | observation noise | ℝ^(d×T)
W | weight matrix to reconstruct B using neighboring pixels | ℝ^(d×d)
𝒃0 | constant baseline for all pixels | ℝ+^d
σi | standard deviation of the noise at pixel i | ℝ+

We call this a meta-problem because we have not yet fully specified the sparsity and spatial locality constraints on A; these can be implemented in different ways (see details in Materials and methods). Also note that 𝒔i is completely determined by 𝒄i and G(i), and that Bf is not optimized freely but, as discussed above, can be estimated in closed form via Equation (8); so we optimize with respect to the remaining variables. The problem of optimizing all variables simultaneously is nonconvex and difficult, but it can be divided into simpler subproblems that we solve iteratively: estimating A, given estimates of the other variables, under the constraint that A is sparse and spatially localized; estimating C, given the other variables, under the constraint that each 𝒔i is sparse; and estimating the background, given A and C. For each of these subproblems, we are able to use existing algorithms (e.g. Friedrich et al., 2017b; Pnevmatikakis et al., 2016). By iterating over these subproblems, we obtain estimates for all model variables in problem (P-All). Importantly, this framework gives us the flexibility of further potential interventions or refinements in the optimization: for example, incorporating further prior information on neurons' shapes or activity, merging or splitting spatial components, and manually detecting missed neurons from the residuals. These interventions can significantly improve the quality of the final model fit; this is an important advantage compared with PCA/ICA, which offers no comparable options for incorporating extra prior information or manual guidance. More details on the algorithms for solving these subproblems are given in the Materials and methods.

Results

CNMF-E can estimate large background fluctuations

We first use simulated data to illustrate the background model in CNMF-E and compare its performance against the low-rank NMF model used in the CNMF approach (Pnevmatikakis et al., 2016).
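The alternating scheme for fitting the CNMF-E model, described in the model-fitting section above, can be sketched schematically. In this toy loop the constrained solvers are replaced with plain nonnegative least squares, so it shows only the overall structure, not the actual CNMF-E algorithms (which add sparsity, locality, and the AR constraints):

```python
import numpy as np

def fit_cnmfe_sketch(Y, W, A, C, n_iters=5):
    """Schematic alternating updates for problem (P-All).

    Y: movie (d x T); W: ring weight matrix (d x d);
    A, C: initial footprints (d x K) and traces (K x T).
    Hypothetical helper, illustrative only.
    """
    b0 = Y.mean(axis=1, keepdims=True)
    for _ in range(n_iters):
        # (1) Background, given A and C: closed form of Equation (8).
        Bf = W @ (Y - A @ C - b0)
        # (2) Traces C, given A and the background (plain NNLS stand-in).
        resid = Y - Bf - b0
        C = np.maximum(np.linalg.lstsq(A, resid, rcond=None)[0], 0.0)
        # (3) Footprints A, given C and the background.
        A = np.maximum(np.linalg.lstsq(C.T, resid.T, rcond=None)[0].T, 0.0)
        # Refresh the constant baseline b0.
        b0 = (Y - A @ C - Bf).mean(axis=1, keepdims=True)
    return A, C, Bf, b0
```

Each step decreases (or leaves unchanged) the residual of its own subproblem, which is what makes the block-coordinate strategy workable despite the overall nonconvexity.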
We generated the observed fluorescence Y by summing up simulated fluorescent signals from multiple sources, as shown in Figure 1E, plus additive noise (Figure 2A). An example pixel (Figure 2B, green dot) was selected to illustrate the background model in CNMF-E, which assumes that each pixel's background activity can be reconstructed using its neighboring pixels'. The selected neighbors form a ring, and their distances to the center pixel are larger than a typical neuron size (Figure 2B). Figure 2C shows that the fluorescence traces of the center pixel and its neighbors are highly correlated, due to the shared large background fluctuations. Here, for simplicity, we fit the background by solving problem (P-All) while omitting the neural-signal term AC. This should make the background estimation more challenging, since it risks absorbing true neural components into the background; but nonetheless, in Figure 2 we see that the background fluctuation was recovered accurately (Figure 2D). Subtracting this estimated background from the observed fluorescence in the center pixel yields a good estimate of the cellular signal (Figure 2D). Thus, this example shows that we can reconstruct a background trace while preserving the neural signal. For the example frame in Figure 2A, the true cellular signals are sparse and localized (Figure 2E). When we subtract the background estimated using CNMF-E from the raw data, we obtain a good recovery of the true signal (Figure 2F). For comparison, we estimated the background activity with a rank-1 NMF model as used in the original CNMF approach; the resulting image is still severely contaminated by the background (Figure 2G). This is because the spatiotemporal background signal in microendoscopic data typically has a rank higher than one, due to the various signal sources illustrated in Figure 1E, and therefore a rank-1 NMF background model is insufficient. A natural approach would be to simply increase the rank of the NMF background model.
Figure 2H shows that this approach is problematic: higher-rank NMF does improve the mean accuracy of the background estimate, but with high variability and low reproducibility due to the random initialization of NMF; moreover, as the NMF rank grows, many single-neuronal signals of interest are picked up in the estimated background signal and are not recovered. In contrast, CNMF-E recovered the background signal more accurately than any of the NMF models. Note also that in real data analysis the appropriate NMF rank is unknown a priori, and selecting its value is a nontrivial problem. We then simulated data with different numbers of local background sources and used a single parameter setting to run CNMF-E on all of these simulations. Figure 2I shows that the performance of CNMF-E does not degrade as we add more background sources, in contrast to rank-1 NMF: CNMF-E can recover the background accurately across a range of background complexities.

CNMF-E accurately initializes single-neuronal spatial and temporal components

Next, we used simulated data to test our proposed initialization procedure (Figure 3). In this example, we simulated neurons with strong spatial overlaps (Figure 3A). One of the first steps in our initialization is to apply a spatial filter to the data, to reduce the background and boost the power of single-neuronal signals in the filtered data. In Figure 3C, we see that the local correlation image (Smith and Häusser, 2010) computed on the spatially filtered data provides a good initial visualization of neuron locations; compare to Figure 1B, where the correlation image computed on the raw data was highly corrupted by background signals.

Figure 3. CNMF-E accurately initializes individual neurons' spatial and temporal components in simulated data. (A) An example frame of the simulated data. Colored squares correspond to panels (D) and (E) below. (B) The temporal mean of the cellular activity in the simulation. (C) The correlation image computed using the spatially filtered data.
(D) An example of an isolated neuron. Pixels were selected at the neuron center, on the neuron boundary, and outside of the neuron; the corresponding raw traces and spatially filtered traces are shown, together with the true neural signal and the spike times of the selected neuron from the simulation. (E) Same as (D), but for two neurons that are spatially overlapping. Note that in both examples the neural activity is clearly visible in the filtered traces, and the initial estimates of the spatial footprints are accurate with respect to the ground truth. (F) The contours of all neurons, plotted on top of the correlation image shown in (C); colors represent the rank of the neurons' SNR, from high (red) to low. The contours of the true footprints are shown for comparison. (G) The spatial similarity and the temporal correlation between each simulated neuron and its counterpart in the initialization. (H) The local correlation and the peak-to-noise ratio for pixels in the central area of each neuron (blue) and for other pixels; the red lines indicate the thresholds used in our initialization procedure. A video illustrating the initialization procedure can be found in Video 3.

Video 3. Initialization procedure for the simulated data in Figure 3. Top left: the correlation image of the filtered data. Top middle: candidate seed pixels (red) for initializing neurons, shown on top of the correlation image; the large red dot indicates the currently selected pixel. Top right: the correlation image surrounding the selected pixel, or the spatial footprint of the initialized neuron; bottom: the fluorescence trace at the selected pixel, or the initialized temporal activity (raw and filtered).

We chose two example ROIs to illustrate how CNMF-E removes the background and demixes nearby neural signals to initialize neurons' shapes and activity accurately. In the first example, we chose a well-isolated neuron (Figure 3D). We selected pixels in the center, on the boundary, and outside of the neuron, and show the fluorescence traces in both the raw data and the spatially filtered data (Figure 3D). The raw traces are noisy and highly correlated, but the filtered traces show relatively clean neural signals. This is because spatial filtering removes the shared background activity and boosts the neural signals in the data. Similarly, Figure 3E shows an example of CNMF-E demixing two overlapping neurons. The filtered traces at the centers of the two neurons still preserve their distinct temporal activity.
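The local correlation image (Smith and Häusser, 2010) used throughout these comparisons can be computed in plain NumPy; in CNMF-E it is applied to the spatially filtered movie rather than the raw data. This is a sketch of the standard definition (mean temporal correlation of each pixel with its four nearest neighbors), not the authors' implementation:

```python
import numpy as np

def correlation_image(movie):
    """Mean temporal correlation of each pixel with its 4 nearest neighbors.

    movie: array of shape (T, nx, ny). Hypothetical helper, for illustration.
    """
    m = movie - movie.mean(axis=0)
    m = m / (np.linalg.norm(m, axis=0) + 1e-12)  # unit-norm centered traces
    cn = np.zeros(movie.shape[1:])
    cnt = np.zeros(movie.shape[1:])
    # horizontal neighbor pairs
    c = (m[:, :, :-1] * m[:, :, 1:]).sum(axis=0)
    cn[:, :-1] += c
    cn[:, 1:] += c
    cnt[:, :-1] += 1
    cnt[:, 1:] += 1
    # vertical neighbor pairs
    c = (m[:, :-1, :] * m[:, 1:, :]).sum(axis=0)
    cn[:-1, :] += c
    cn[1:, :] += c
    cnt[:-1, :] += 1
    cnt[1:, :] += 1
    return cn / cnt
```

Pixels inside a neuron share that neuron's transients and so light up in this image, but only if the shared background has been suppressed first, which is exactly why CNMF-E computes it on the filtered data.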
After initializing the neurons' traces using the spatially filtered data, we initialize our estimates of their spatial footprints. Note that simply estimating these spatial footprints from the spatially filtered data does not work well, since the resulting shapes are distorted by the spatial filtering. We found that it was more effective to initialize each spatial footprint by regressing the initial neuron traces against the raw movie data (see Materials and methods for details). The initial estimates match the simulated ground truth closely (Figure 3F–G). In this