Coupled Transformer Autoencoder for Disentangling Multi-Region Neural Latent Dynamics

Abstract

Simultaneous large-scale multielectrode recordings (e.g., with Neuropixels probes) from thousands of neurons across multiple brain areas reveal rich mixtures of activity shared between regions and dynamics unique to each region. Existing alignment and multi-view methods for such datasets neglect temporal structure, whereas dynamical latent-variable models capture temporal dependencies but are usually restricted to a single area or conflate shared and private signals. We introduce the Coupled Transformer Autoencoder (CTAE), an end-to-end sequence model designed to uncover latent population dynamics while simultaneously addressing (i) non-stationary, non-linear dynamics and (ii) the separation of shared from region-specific structure. CTAE employs Transformer encoders and decoders to capture long-range neural dynamics and explicitly partitions each region’s latent space into orthogonal shared and private subspaces through reconstruction, alignment, and orthogonality losses. It also scales to many brain regions without a parameter explosion. We demonstrate the effectiveness of CTAE on two high-density electrophysiology datasets of simultaneous multi-region recordings: one from motor cortical areas, dorsal premotor cortex (PMd) and primary motor cortex (M1), and the other from sensory areas, primary (V1) and secondary (V2) visual cortices. In the PMd-M1 motor dataset (Perich et al.), CTAE identifies a communication subspace that captures almost all reach-related variance: linear decoders trained on CTAE’s shared latents predict hand kinematics more accurately than decoders trained on shared latents extracted by prior methods or on the raw spike data. In V1-V2 recordings of responses to drifting gratings (Semedo et al.), a linear classifier achieves high orientation-decoding accuracy using CTAE latents.
The model captures direction-relevant information in both shared and private subspaces, reproduces the expected periodicity of grating responses (most strongly in V1-private latents), and avoids the tendency of existing techniques to concentrate all behaviorally relevant variance in a single shared space. By providing a simple and scalable computational tool, CTAE offers an interpretable representation for dissecting how multiple brain areas interact through shared and private latent factors to modulate behavior. Each factor may serve as a potential target for causal manipulation in closed-loop perturbation experiments.
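The abstract names three loss terms (reconstruction, alignment, and orthogonality) without giving their form. The following is a minimal NumPy sketch of how such terms might be combined for two regions. It is an illustration under stated assumptions, not the authors' implementation: the function names, the layout of latents as a `(time, dims)` array whose first `n_shared` columns are the shared part, the mean-squared-error forms, the cross-covariance orthogonality penalty, and the weights `w_align` and `w_orth` are all hypothetical.

```python
import numpy as np


def split_latents(z, n_shared):
    """Split a (time, dims) latent array into shared and private parts.

    Assumption: the first n_shared columns are shared, the rest private.
    """
    return z[:, :n_shared], z[:, n_shared:]


def reconstruction_loss(x, x_hat):
    """Mean squared error between recorded and reconstructed activity."""
    return float(np.mean((x - x_hat) ** 2))


def alignment_loss(s_a, s_b):
    """Mean squared difference between two regions' shared latents."""
    return float(np.mean((s_a - s_b) ** 2))


def orthogonality_loss(s, p):
    """Squared Frobenius norm of the shared/private cross-covariance."""
    s_c = s - s.mean(axis=0, keepdims=True)
    p_c = p - p.mean(axis=0, keepdims=True)
    cross = s_c.T @ p_c / s.shape[0]
    return float(np.sum(cross ** 2))


def combined_loss(x_a, x_hat_a, z_a, x_b, x_hat_b, z_b, n_shared,
                  w_align=1.0, w_orth=1.0):
    """Weighted sum of the three terms for two regions (hypothetical weights)."""
    s_a, p_a = split_latents(z_a, n_shared)
    s_b, p_b = split_latents(z_b, n_shared)
    rec = reconstruction_loss(x_a, x_hat_a) + reconstruction_loss(x_b, x_hat_b)
    align = alignment_loss(s_a, s_b)
    orth = orthogonality_loss(s_a, p_a) + orthogonality_loss(s_b, p_b)
    return rec + w_align * align + w_orth * orth
```

In this sketch, the alignment term pulls the shared latents of the two regions toward each other, while the orthogonality term penalizes any linear correlation between a region's shared and private latents. In a trained model the latents and reconstructions would come from the Transformer encoders and decoders, and the loss would be minimized with an automatic-differentiation framework rather than evaluated in NumPy.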

Date
Nov 17, 2025 12:00 AM
Event
SfN, 2025
Location
San Diego, USA