Applying the matched-filter technique to the search for dark matter transients with networks of quantum sensors

There are several networks of precision quantum sensors in existence, including networks of atomic clocks, magnetometers, and gravitational wave detectors. These networks can be re-purposed for searches of exotic physics, such as direct dark matter searches. Here we explore a detection strategy for macroscopic dark matter objects with such networks using the matched-filter technique. Such “clumpy” dark matter objects would register as transients sweeping through the network at galactic velocities. As a specific example, we consider a network of atomic clocks aboard the Global Positioning System (GPS) satellites. We apply the matched-filter technique to simulated GPS atomic clock data and study its utility and performance. The analysis and the developed methodology have a discovery reach up to three orders of magnitude above the previous GPS results and have a wide applicability to other networks of quantum sensors.

optical cavities. The particular hypothesized form of coupling of DM fields to baryonic matter determines the type of the sensor to be used in the DM search [3].
Here we focus exclusively on ultralight fields. In general, one may consider free noninteracting and self-interacting fields. We are interested in models with self-interaction that can form macroscopic dark-matter objects, such as topological defects or Q-balls [4][5][6][7]. Interactions of such DM constituents with standard model (SM) particles and fields can induce variations in fundamental constants of nature. Such variations may be detected by observing atomic frequencies in atomic clocks [8][9][10]. As the DM constituents sweep through a device, the variation would register as a transient perturbation. Geographicallydistributed networks can resolve the velocities of the transients and provide a powerful vetoing of the potential events as the sweep velocity must be consistent with standard halo model priors. These ideas form the basis of dark matter searches with distributed networks of atomic clocks [11][12][13][14][15], magnetometers [16,17], and gravitational wave detectors [18][19][20] alike.
Our GPS.DM group focuses on searching for DM-induced transients in GPS atomic clock data [12,13]. Here the advantage is the public availability of nearly two decades of archival data enabling relatively inexpensive data mining. A dark matter signature would consist of a correlated propagation of atomic clock perturbations through the GPS constellation at galactic (∼300 km/s) velocities. Previously, our GPS.DM collaboration performed an analysis of the archival GPS data in search for domain walls (a particular type of topological defect) [13]. Although no DM signatures were found, new limits were placed on certain DM couplings to atoms that were several orders of magnitude more stringent than prior astrophysical limits. The original search [13] focused on finding large DM signals well above the instrument noise. In Ref. [14], we have shown that the application of more sophisticated Bayesian search techniques can extend the discovery reach by several orders of magnitude both in terms of sensitivity and size/geometry of the DM objects.
Here we study the performance of the matched-filter technique (MFT) as an alternative frequentist search method. In addition, we develop analytic results for idealized network of white-noise sensors with cross-node correlation.
The MFT is a relatively ubiquitous technique, utilized, for example, by the Laser Interferometer Gravitational Wave Observatory (LIGO) in gravitational wave detection (see, e.g. [21]). It is also used in a variety of applications, such as astrophysics [22], geophysics [23], and searches for exotic physics [24]. Of a special interest to us is the performance of the MFT for large networks, as the GPS.DM sensor array may include over a hundred instruments if we take into account all GNSS constellations and terrestrial clocks. A technical complication in applying the MFT to these networks is that the noise is correlated between spatially separated nodes of the network. Understanding the consequences of this cross-node correlation is of importance to our analysis.
The structure of this paper is as follows. Section 1.1 reviews the MFT for data analysis and previous applications. Section 1.2 formalizes requirements for a network detector of clumpy DM. Section 1.3 reviews the relevant theory and how the GPS network can be repurposed for DM searches of this sort. A summary of processing the GPS data is provided in Sect. 1.4. In Sect. 2, we describe our methodology, including the formulation of our detection statistic and the search algorithm. The detection threshold, detection probability and parameter estimation capabilities are provided in Sects. 3.2, 3.3, and 3.4, respectively. Lastly, the projected discovery reach for this method is provided in Sect. 3

.5 and
Sect. 4 draws conclusions. The paper also contains appendices where we discuss network covariance matrix and its inversion, inverse transform sampling, and present supporting derivations. Since the intended audience includes both atomic and particle physics communities, we restore and c in the formulas in favor of using natural or atomic units. We also use the rationalized Heaviside-Lorentz units for electromagnetism.

The matched-filter technique
The matched-filter technique is often used to search for hidden signals within data streams in cases where the signal's "shape" is known but the signal's strength is not. In this case, the general shape can be compared to the data stream to search for an underlying correlation that is not immediately evident to the un-aided eye. The matched-filter itself is the best estimate of the unknown signal strength by employing an optimal filter. In the most general sense, an optimal filter is a particular combination of data that optimizes a quantity deemed to be significant, usually relating to signal detection within data sets [21]. Usual quantities of interest include the detection probability for a given signal strength and the signal-to-noise ratio (SNR), though many other application-specific statistics can be devised.
Since the MFT requires a predefined signal shape, this approach cannot be used for unmodeled signals. When a hypothesized signal signature is able to be modeled, there often exist many candidate shapes along with the unknown signal strength. Thus, one forms a collection of signal shape templates that approximately spans the range of possible shapes. One could think of the MFT as a technique that maximizes an overlap between the templates and the data stream. This maximization is done with the help of a matched-filter statistic, such as a SNR. However, it is not usually the value of the SNR alone that determines the level of overlap, but rather the value of the SNR compared to a threshold. Additionally, one of the most promising aspects of the MFT is the ability to align the shape of the detected signal with the template that produced an SNR above the detection threshold, leading to immediate signal parameter estimation.
The efficacy of the MFT depends on how well one can distinguish a weak signal from intrinsic device noise. Thus, a network of devices can offer a better sensitivity and higher confidence in the event of a positive detection since all network sensors would experience a signal from the same event. a If a hypothesized signal shape can be modeled for a network of devices, the MFT becomes a powerful tool for weak signal detection and signal parameter estimation.

Examples of the MFT in practice
Perhaps the most well-known application of the MFT comes from the gravitational wave detection by LIGO (see e.g., [25]). A detailed outline of search techniques and matchedfiltering is provided in Ref. [21] and guided much of our development of the method discussed in this paper. However, LIGO's use of the MFT involved a small network of devices that exhibit uncorrelated noise. The black hole merger gravitational wave detection from 2015, for instance, used only two spatially separated interferometers in the waveform template matching analysis [25]. Another previous application of the MFT used 15-20 station from the International Deployment of Accelerometers (IDA), where geophysicists were been able to identify previously undetected global seismic events in archival IDA data [23]. The method has also found use in galaxy cluster identification [22] and in the search for neutrino-less beta decay [24].
All of these examples consider networks of devices far smaller than that available (∼100) to our DM search. Furthermore, an essential feature of our GPS.DM network search is a cross-node correlated noise (due to a reference clock common to all the nodes, see Sect. 1.4), which, to the best of our knowledge, has not been addressed in the literature. A related theoretical development for ultralight (non self-interacting) DM fields is presented in Ref. [26]. There a quantum sensor network was considered for detecting DM waves and a SNR statistic was developed in the frequency domain. By contrast, here we focus on network detection of dark matter transients and develop a SNR statistic in the time domain.

Network desiderata
Here we are devising a strategy for detecting macroscopic DM objects that sweep through a distributed network of N D sensors. We use the words "sensor" and "node" interchangeably. In particular, a single geographical location may host several instruments, yet each individual sensor is referred to as a distinct node for our purposes. We will assume that DM objects interact (non-gravitationally) with the instruments and the interaction only occurs when the bulk of the DM object overlaps with a sensor.
There are several criteria for such networks: (i) The network should be sufficiently dense so that the finite-size DM object can overlap with at least several geographically-distinct nodes. (ii) The network size should be sufficiently large to increase the likelihood of encountering compact DM objects spread throughout the galaxy. (iii) As per the standard halo model (SHM) [27] the DM objects sweep through the network at galactic velocities (v g ∼ 300 km/s), the sampling rate should be sufficiently high to enable tracking the propagation of the DM object through the network. The tracking enables reconstructing the geometry of the encounter. (iv) Although not necessary, it is desired that the encounters of DM objects with the network are sufficiently rare, so that only a single DM object interacts with the network at any given time. While most of these requirements are apparent, criterion (iii) deserves further discussion. For example, while a setup [28] of two co-located clocks with a shared optical cavity can be considered as a rudimentary two-node network, such network does not satisfy criterion (iii) since even if both clocks were to register a DM signal, there would be no galactic velocity/direction signature to support the signal's DM origin. Thus, such low-sampling rate networks can only be used to constrain couplings to the DM sector.
For concreteness, we focus on topological defects. Inside the defect, the amplitude of the DM field A and the average energy density of the defect is related by ρ inside = A 2 /( cd 2 ), where d is the width or spatial scope of the defect (we use the convention where the field has units of energy). The DM object width d is treated as a free observational parameter and, for TD models, may be linked to the mass of the DM field particles m φ through the healing length which is on the order of the Compton wavelength d = /(m φ c). Further, the local DM energy density ρ DM may be linked to d and A by assuming that these objects saturate the local DM energy density, where τ avg ∼ d/v g is the average duration of crossing through a point-like instrument and T is the average time between subsequent encounters of the DM objects with the device [13].
As for the non-gravitational interactions, to be specific and consistent with our earlier work [14], we assume the quadratic scalar portal, where m f are the fermion masses, φ is the scalar DM field (measured in units of energy), Γ X are coupling constants that quantify the DM interaction strength, and ψ f and F μη are the SM fermion fields and the electromagnetic Faraday tensor, respectively. The SM fermions f in the above equation are summed over implicitly. Such interactions appear naturally for DM fields possessing either Z 2 or U(1) intrinsic symmetries. The above Lagrangian leads to an effective redefinition of fundamental masses and coupling constants, where ω 0 is the nominal clock frequency, X runs over relevant fundamental constants, and κ X are dimensionless sensitivity coefficients. For convenience, we introduced the effective constant, Γ eff ≡ X κ X Γ X , which depends on the specific clock. The effective coupling constants for the GPS network microwave atomic clocks (Rb, Cs and H) read (using computations [40,41], see [14] for details, and Ref. [42] for illucidating the underlying logic) Although linear combinations of the coupling constants differ for each type of clock in the GPS network, individual coupling constants Γ X (or, equivalently, individual energy scales Λ X ) can be obtained by combining the results from different clock types. A laboratory optical Sr clock has provided the most stringent constrains on Λ α for specific regions of the (d, T ) parameter space (see, e.g. [28]). More recently, new constraints have been placed on Λ α , Λ m e , and Λ m q by our GPS.DM collaboration [13] and on Λ α by a global network of optical laboratory clocks [15]. These two papers reported null results for domain wall searches.

Thin domain walls
In this paper we will focus on a specific type of DM signal-"thin" domain walls. While retaining the main features of more complicated signals from other types of DM "clumps", this signal offers a sufficiently simple, analytically treatable signature. Domain wall-like signatures can appear naturally in the context of bubbles, i.e., domain walls closed on themselves [13]. Locally, one can neglect the bubble curvature as long as the bubble radius is much larger than the spatial extent of the sensor network. Another example are Q-balls that couple to SM fields through derivative couplings, φφ * → ∂ μ φ∂ μ φ * in Eq. (2). Q-balls are spherically symmetric objects with a nearly flat density profile in the bulk. Thus the dominant part of the interaction would occur at the Q-ball walls. Again one needs to require the radii of these objects to be much larger then the network size. Since bubbles and Q-balls are spherically symmetric, gravitationally interacting ensembles of these DM objects are a subject to the equation of state for pressureless cosmological fluid as required by the ΛCDM paradigm.
We distinguish between "thin" and "thick" walls based on the sampling rate, which is finite for any realistic device. If the interaction time with the device d/v g is shorter than the sampling interval τ 0 , the exact arrival time of the DM clump is not resolved, and neither its shape. Thus the DM object is "thin" for observational purposes if its size d v g τ 0 . For domain walls, strictly speaking, the relevant velocity is its component normal to the wall, v ⊥ .
For the GPS sampling interval of τ 0 = 30 s, the above arguments translate into domain walls of thicknesses below the Earth size, d 300 km s -1 × 30 s ≈ 10 4 km. Any domain wall with a thickness larger than this value is characterized as "thick" and is discussed in more detail in Ref. [14].
The thin wall network signature is formalized in Sect. 2.2. For thin walls, the value of the effective coupling relates to the maximum DM-induced accumulated clock phase (time) τ = d/v ⊥ being the interaction time between the wall moving at velocity v and an individual device. Again, for the wall to be "thin", we require τ to be less than the sampling time interval τ 0 . With the theoretical background established, we now review GPS data and characterize the utility of applying the matched-filter technique for DM search in network data streams. This includes establishing a signal-to-noise ratio test statistic and benchmarking the method via simulation.

Overview of GPS data
A detailed description of modern GPS data acquisition and processing techniques and their application in precision geodesy can be found in Ref. [43]. Details relevant to DM searches with GPS constellation are given in Ref. [13]. Here, we briefly review the main aspects of GPS atomic clock data and introduce relevant concepts and terminology.
In our search, we analyze the GPS data generated by the Jet Propulsion Laboratory (JPL) [44]. This data consists of clock biases, the difference in clock phases (i.e., the operational "time" as counted by the clocks) between a given satellite clock and a fixed reference clock, and are sampled at τ 0 = 30 s intervals. The same reference clock is used for the entire network of satellite clocks on any given day. The data set also provides the satellite orbits, so we know the locations of the networks nodes (satellites). The JPL performs the initial GPS data processing [44]. In their processing, they do not limit clock bias behavior, meaning that real transient signals are not removed as outliers.
Each clock's bias data, denoted d (0) j where the subscript enumerates data points (epochs), is non-stationary and is dominated by random walk process. Prior to our analysis, we "whitten" the data by performing the first-order differencing and define a new data stream This differencing procedure is sufficient for the Rb satellite clocks while a second-order differencing procedure (d (2) is often preferred for Cs clocks. We refer to the differenced data d (1) j as the pseudo-frequency due to its proportionality to the discrete clock bias derivative. The units of pseudo-frequency d (1) j are nanoseconds. As shown in Ref. [14] the pseudo-frequency noise is dominated by the Gaussian white noise. To streamline notation, for the rest of the paper, d j ≡ d (1) j . Such data standard deviation σ is related to the commonly used Allan deviation σ y (τ 0 ) as σ = τ 0 σ y (τ 0 ).
An important aspect of the GPS time series data is that it consists of the individual clock noise and the noise of the reference clock. So, for each clock a, the noise component can be represented as where e j is the individual clock noise and c j is the contribution from the reference clock noise common to all data streams. Here and below the superscript enumerates sensors. While both sources of noise in pseudo-frequencies are dominated by the Gaussian white noise, in our simulations we will include realistic auto-correlation functions for the GPS clocks computed in [14].

Simulating GPS data
Characterizing the efficacy of the MFT is contingent on our ability to simulate the GPS atomic clock data. A detailed description of GPS data simulation along with a direct comparison to the GPS archival data is provided in Ref. [14]. The essence of the simulation method comes from utilizing the known power spectral densities for each clock (from JPL) to "color" pseudo-random white noise [45]. Moreover, we are able to simulate crossclock correlation by adding an extra set of simulated white noise with standard deviation equal to that of a typical reference clock (σ × ≈ 0.006 ns) to all of the simulated satellite clock data streams. This effectively acts as the common reference clock noise component to the satellite data in Eq. (11).
With the necessary background provided, we pivot to a description of our methodology in next section.

Methodology
We wish to determine whether there is significant evidence that a thin wall DM signal is present in the GPS archival atomic clock data or not. This can be formalized by the following two-sided hypothesis test: where h represents the strength of the possible hidden DM signal. Note that the alternative hypothesis H 1 can be thought of a union of all possible alternatives H h where h may be any non-zero real number. Furthermore, note that we allow the signal strength h to be both positive and negative, which is different than a typical signal strength search that treats h as an amplitude (and therefore non-negative). This is because the sign of the DM interaction coupling Γ is not known a priori. That is, we do not know if the DM-induced perturbations will cause the GPS atomic clocks to tick faster or slower.
To perform the aforementioned hypothesis test, we must formulate a detection statistic ρ, which in this case will be a signal-to-noise ratio and is formulated in the following section. Then, in order to claim detection, we must first establish a detection threshold ρ * (provided in Sect. 3.2) to compare to our observed statistic ρ.
The cases in which our search produces an observed SNR that exceeds the established threshold or not are treated differently. If our search results in a detection statistic larger than the threshold, we then wish to estimate the parameters relating to the DM interaction event. This is discussed in Sect. 3.4. On the other hand, if our search does not result in a detection statistic indicating an event, we wish to place limits on the signal strength h which translates into into limits on the DM coupling Γ via Eq. (9), see Sect. 3.5.

Formulation of test statistic
Consider a candidate DM model M that would leave a coherent signal in network sensor data set D. The data set may or may not include a candidate model-prescribed signal s(θ ), where θ is the specific set of parameters that define the signal signature, such as the DM object's velocity, orientation, arrival time and strength. If there is no signal within the data, then s = 0. Taking a frequentist approach, all one needs is a likelihood function for the data set. This is given by the following Gaussian: where K is the normalization factor and Here d a j is data for the ath clock at epoch j and E is the covariance matrix for the network. To streamline notation, we dropped the explicit reference to template parameters θ . The indices a, b run over N D sensors (excluding the reference clock for GPS) and j, l range over the epochs (data points) in the observation time window of length J W . In an equivalent vector notation, where d is the data stream and s is the signal stream. The likelihood in Eq. (12) is a multivariate function of the signal parameters θ . An important aspect of our method is the signal linearity with respect to its strength h so that we can define a "unit" signal that is scaled by its strength This way, our likelihood becomes Since we do not know the shape or parameters of the unit signal s (or if there exists a signal within the data at all), we span the space of all possible signals by forming a large repository of unit signal templates (a discussion of how we form the repository of templates is provided in Sect. 2.2). Suppose we form a unit signal template s i with a set of randomly (but strategically) chosen parameters. We then compare this template to the data stream via the likelihood function in Eq. (16). The template specific likelihood is then a function of only the signal strength h since each of the other signal parameters have been fixed to form s i . In this case, the template-specific likelihood can be re-cast as a function of h alone Hereĥ is the signal strength that maximizes the template-specific likelihood and σ h is the template-specific likelihood standard deviation. We quantify how well the signal template s i matches the signal within the data via a signal-to-noise ratio statistic defined as We emphasize that our use of the SNR as a statistic rather than its square is to retain the dependence on the sign of the DM coupling. The SNR statistic depends on the inverse of the covariance matrix E. Properties of the covariance matrix and its inversion techniques are discussed in Appendix 1. Note that Eq. (20) is general in that it applies to any modelled signal (monopoles, strings, walls, etc.), though here we treat the case of thin walls only.
In general, due to the central limit theorem, we can assume that the intrinsic noise of the network sensors is Gaussian (the noise may be colored). This was the underlying assumption in writing the likelihood (12). Then the SNR statistic (20) is a linear combination of Gaussian random variables, and as such, the SNR is also a random Gaussian variable. Now, recall that we wish to span the space of possible model-prescribed signals with a repository of unit signal templates. Then, given a repository of M randomly generated templates, we define our detection statistic ρ as the template-specific SNR with the largest magnitude Choosing this as the detection statistic results in finding the signal template that maximizes the multivariate likelihood (12). With our detection statistic defined, in the next sections we outline our method of unit template generation and provide an overview of our procedure for searching the GPS datastreams for DM events.

Template generation
Each signal template is determined by the DM model used (in this case, the thin domain wall) and the necessary parameters associated with the event: velocity (and incident direction), time of the event and thickness of the DM object. Since it would take an infinite number of model-prescribed signal templates to cover such a continuous parameter space, we strategically generate our finite repositories (template banks) of signals with a Monte-Carlo approach using prior distributions for individual parameters. When generating signals for a template bank, we employ importance sampling for each parameter according to these prior distributions in an effort to approximately span the continuous parameter space with a finite sample. This approach is formalized in Appendix 3.
We use the SHM to generate necessary parameter prior distributions. The velocity distribution for DM objects in the halo is quasi-Maxwellian and isotropic with a dispersion around v ≈ 300 km s -1 [27]. In addition, there is a motion of the Solar system through the halo at galactic velocities of v g ≈ 220 km s -1 . The resulting most probable incident direction of a DM object is along the path of the Sun's orbit in the galaxy, toward the Cygnus constellation. This implies that over 90% of DM events would come from the forward facing hemisphere [13]. Further discussion of priors for event parameters such as domain wall width and event rate is provided in Ref. [14]. The event parameters (velocity, incident direction, etc.) determine in which order the nodes are swept, as well as how quick the sweep is. These characteristics distinguish the templates within the template bank.
After generating the template parameters, we form the signal templates. The thin domain wall is characterized by an interaction time with each device (clock) of less than the sampling time interval; thereby, the profile contains delta-functions of time. The DMinduced clock bias (phase) of a given clock a is proportional to an integral of the frequency shift (5). Further, the bias data stream is given with respect to a fixed reference clock R, which is also affected by the domain wall. We thus distinguish between maximum signals h a and h R as shown in Eq. (9). The signal in the differenced data stream s a j (1) then reads (for the case when the wall encounters the clock a prior to the reference clock R) where the time at epoch j is j × τ 0 , a discrete time on the sampling grid, and j a , j R are the epochs in which the satellite clock and reference clock interact with the DM object, respectively. This template is shown graphically in Fig. 1.
In a homogeneous network of clocks, the values of h a and h R are the same, thereby allowing us to split a single h from the differenced signal to form the templates with unit spikes at j a and j R . This is also true for any network under the assumption that the Γ m e coupling dominates over other couplings (i.e., |Γ m e | |Γ α,q |) since the Γ m e contribution in Eqs. (6)-(8) is the same for all clock types. Assuming that any other coupling dominates, we may still split a single h from the differenced signal but the unit templates will contain a unit spike at j a and a spike of magnitude η ≡ h R /h a = Γ R eff /Γ a eff and of opposite sign at j R .

Figure 1
Time series for a differenced thin domain wall signal. Here clock a is affected by the DM object prior to the reference clock R. When the thin wall object interacts with clock a between epochs j a -1 and j a , a spike of magnitude h a is seen at epoch j a . Then, as the thin wall object sweeps the reference clock between epochs j R -1 and j R , a spike of opposite sign and magnitude h R is seen at epoch j R We would also like to highlight the importance of a well-spaced network for the MFT approach and template generation. Well-spaced meaning that few (if any) satellite clocks are affected by the DM wall within the same 30 s time period as the reference device. If the network nodes were not sufficiently spatially separated, the signal templates from Eq. (22) would collapse into "null" templates, where all elements of the signal stream are zero. This is due to the node devices and reference device de-synchronizing and re-synchronizing all within the time period of one epoch, effectively eliminating detectable DM interaction effects on the data stream.
Beyond individual template generation, we must choose an appropriate number of templates in the repository to accurately span the parameter space. In Sect. 3.3 we gauge how the number of templates affects the DM signal detection capabilities.

Analytic results for idealized network
Now we turn to the general SNR (20) and determine the statistical properties of SNR for thin domain wall signals (22). In this section, we consider an analytically treatable case of an idealized network comprised of N D identical white noise sensors. We additionally incorporate a white noise reference sensor common to all the sensors. This common noise reference sensor is especially relevant to GPS clock network, where it arises due to all clock biases reported with respect to a single reference clock. We will denote the intrinsic noise variance of the network sensors and the reference sensor as σ 2 and σ 2 × , respectively. Both the sensors and the reference can be affected by dark matter transients.
We will determine the expected distribution of the template-specific SNR values ρ i given that there is no signal in the data stream, Prob (ρ i |H 0 ), as well as the distribution of the detection statistic given that there is a signal of strength h present, Prob (ρ|H h ), for this idealized sensor network.
As discussed in Sect. 1.1, the central quantity of interest, SNR (20), is a Gaussian random variable and as such its probability distribution is fully characterized by its mean value and variance. Because it is random, the SNR can fluctuate. Due to these fluctuation, even in the absence of the DM signal, the SNR may attain large values that can be falsely misinterpreted as the presence of the DM signal. The larger the SNR variance, the larger the fluctuations are, and the larger detection threshold must be.
For the idealized network, the inverse of the covariance matrix needed to compute the SNR statistic can be found analytically (see Appendix 1) where ξ ≡ N D σ 2 × /σ 2 . Now, if there is a signal present in the data stream, each individual sensor's data is given by d a j = e a jc j + hs a j (a sum of an individual sensor noise, reference noise and a signal term). When the signal is absent, one can simply set h → 0. Our explicit computation using Eq. (20) with s i = s (see Appendix 2 for derivation) results in a Gaussian distribution for ρ with a mean of and variance of Here η ≡ h R /h a = Γ R eff /Γ a eff is the ratio between the strength of the signal experienced for the reference clock to that of the satellite clocks. Note that we assumed that device degeneracy (multiple sensors experiencing a signal at the same epoch) can be ignored. Note that when the DM signal is absent (h = 0), μ ρ = 0 while σ ρ remains constant. Moreover, the standard deviation of template-specific SNR σ ρ i is also constant at 1. The probability density distribution for SNR (for a fixed, matching template) is given by Assuming that none of the couplings Γ X dominates the DM interaction with GPS devices, η ≈ 1 for any of the satellite-reference clock combination (see Eqs. (6)(7)(8)). This remains true if either Γ α or Γ m e dominates the interaction. The only major deviation of η from a value near 1 is when Γ m q is the dominate coupling, for which the use of a network of Rb clocks with an H-maser reference clock will result in η ≈ 2. For the following analysis, we will assume that η = 1.
In the limit ξ 1, i.e., σ × σ / √ N D , we arrive at μ ρ = h √ 2N D /σ , recovering the known result for a network of uncorrelated devices (see e.g., Ref [21]). For networks with large cross-correlation or large number of devices (ξ 1), we arrive at μ ρ = h √ N D /σ , a factor of √ 2 less than the uncorrelated network. Regardless of the level of cross-correlation, the network sensitivity grows with the sensor number as √ N D . In our search, we do not use the exact inversion of the covariance matrix as was used to derive the expressions in this section. Instead we incorporate a perturbative inversion (see Appendix 1) which assumes that the reference clock noise is small compared to the noise of the satellite clocks.

Multiple events
The Bayesian technique outlined in [14] assumed there to be at most one DM interaction event in any particular time window of the archival GPS data. So far, here we have also only treated the case of a single thin wall interaction event occurring in a time period of J W epochs. However, if we consider dark matter encounters to be Poisson distributed in time, with an average time between consecutive events T , over the 20 years of archival data we would expect there to be N E = (20 years)/T events. Then, extending our search window J W to contain the total number of epochs in the entire two-decade window of GPS data, and assuming that consecutive events are non-overlapping, we find that the mean of our detection statistic (24) increases by a factor of √ N E , while the variance of the statistic remains unchanged. This ultimately improves our sensitivity by √ N E . Qualitatively, this is due to the fact that we measure the signal strength N E times.
While this section analyzed an idealized network, simplifying assumptions will be lifted in full numerical simulations in later sections. We will use real colored noise autocorrelation functions for heterogeneous networks of GPS clocks.

Determination of a detection threshold
In order to find a detection threshold, we must determine how the detection statistic behaves in the case when there is no signal present in the data. This way we can determine whether the computed statistic provides a significant evidence for rejecting the null hypothesis and claiming DM detection. Rather than obtaining the distribution for our test statistic given the null hypothesis is true, Prob(ρ|H 0 ), we determine our detection threshold in a different yet equivalent fashion given the nature of our test statistic.
Recall that we define our detection statistic as the template-specific SNR with the maximum magnitude out of a repository of M templates. Instead of assessing the distribution of the maximum magnitude ρ i , we simply assess the distribution of ρ i . To this end, we performed Monte-Carlo simulations consisting of ≈10 6 SNR calculations on event free simulated data and confirmed that the distributions for ρ i given the null hypothesis is true are Gaussian with a mean of zero and standard deviation σ ρ i . The results of the simulations for various simulated clock networks (clock networks for the years 2000, 2005, 2010, and 2015; see Table 1) are provided in Fig. 2. Given that the template-specific SNR behaves in this fashion, the probability that any of the templates in the repository produces an SNR value larger in magnitude than some SNR threshold ρ * = n * σ ρ i , is  where M is the number of templates used in the template repository. This is a false positive rate per epoch. A reliable SNR threshold will ensure that we can expect less that 1 false positive in Z epochs. The value of n * that meets this criterion is The most reliable threshold would allow less than one false positive in the entire span of data. For 20 years of archival 30-second GPS data, Z = 2.1 × 10 7 epochs. With M = 1024 templates in the repository, the value of n * is 6.57, corresponding to a threshold SNR of ρ * = 6.57σ ρ i . However, less strict detection thresholds may be used to identify possible weak candidate events for further investigation. Note that the value for n * depends weakly (logarithmically) on the number of templates M, thereby it does not vary significantly for different sized template repositories. Ultimately, our detection threshold is given by ρ * = n * σ ρ i . Using the distributions in Fig. 2, we calculate the detection thresholds for each network allowing for 1 false positive per day, 10 false positives per year, and 1 false positive in 20 years. Table 2 summarizes the results.

Future networks and alternative data processing
As mentioned in Sect. 3.1, in our simulations we use a perturbative inverse of the covariance matrix; see Appendix 1. This approximation relies on the noise level of the satellite clocks being sufficiently larger than the noise of the reference clock. The GPS clock networks for the years 2000 through 2015 satisfy the quiet reference clock requirement. In recent years, however, more stable Rb-IIF clocks with noise comparable to that of the reference clocks have been added to the GPS constellation thereby weakening the justification for the perturbative approximation for E -1 . Moreover, future GNSS networks will contain a plethora of ground-and satellite-based H-maser clocks to be exploited in our searches (Galileo satellites already house stable H-masers [46]). Switching our method to use the exact inversion mitigates, but with the trade-off of computational overhead. We wish to avoid using an exact inversion of the covariance matrix due to the fact that it drastically increases computation time and would add a considerable amount of time to a search through the GPS data.
Considering the insufficiency of the perturbative approximation for more recent clock networks, here we offer a possible mitigation technique. As more accurate satellite clocks are being placed in orbit, more reference clocks are being placed around the globe. We propose pairing each of the satellite clocks with their own reference clock, thereby eliminating the cross-correlation caused by the use of a single reference clock that is inhibiting current search techniques. Suppose there are N D atomic clocks, satellite-and Earth-based, at our disposal with half of them being Earth-based. The large level of cross-correlation that restricts the perturbative inversion may be eliminated by using data from N D /2 satellite-Earth clock pairs. The application of the matched-filter technique can be reformulated for a network of device pairs and is left for future work when such networks become a reality.

Detecting events
In the event of a weak DM signal presence in the data stream, it may not be immediately noticeable in the atomic clock data due to the randomness of the clock noise. The advantage of the SNR (and a detection statistic in general) is to provide a clear gauge of the signal presence. We verify that the SNR statistic is capable of detecting sought DM signals via simulation.
To this end, we simulated 2 hours of data for a network of N D = 30 clocks that exhibit Gaussian white noise with a standard deviation of σ = 0.05 ns along with a white noise reference clock contribution with noise level σ × = 0.006 ns. We then injected a signal of strength h = 0.1 ns in the middle of the data stream with normal velocity of v ⊥ ≈ 300 km s -1 and incident direction angles θ ≈ 1.7π rad and φ ≈ 0.2π rad (this in the Earth-centered Inertial (ECI) J2000 frame), which are the most probable event parameters according to the SHM and our previous calculations (see [14]). The results of performing our search technique on this data set is shown in Fig. 3  While the injected signal is not recognizable by the eye in the simulated data streams, a spike in the detection statistic at the time of the injected event is apparent.
Note that the search method is not aware of the event's strength, speed, direction, time of occurrence, or the fact that there was an injected signal at all. The injected signals were generated independently of the search routine and the template bank.

Detection probability
The main figure of merit of the MFT algorithm is a detection probability curve for the various clock networks that have been collecting data for the past two decades. The detection probability is defined as the probability that the observed detection statistic exceeds the detection threshold given that the alternative hypothesis is true: Prob(ρ > ρ * |H h ). We wish to determine the detection probability for our various clock networks as a function of the signal strength h and obtain a 95% detection probability signal strength, denoted h 95%,D.P. .
Monte-Carlo simulations produced the detection probability curves shown in Fig. 4. The simulation scheme consisted of 128 trials where a randomly-generated thin-wall signal of strength h was injected into a data stream for a particular clock network and the detection statistic was calculated for every epoch within the simulated data stream. An event was considered found if the computed SNR exceeded the network's threshold within one epoch of the injected event time. We compared the calculated SNR values with two different detection thresholds: one that allows for 10 false positive events per year and another one that allows for less than 1 false positive event in the 20 year span of the GPS data. The number of found events divided by the number of iterations gave us the detection probability. This detection probability was computed for a range of injected signal strengths.  Along with the detection probability curves, we plot the average of the 128 SNR calculations at the epoch where the signal was injected as a function of the signal strength. This is also shown in Fig. 4. We can see that the SNR is a linear function of h, as expected. This fact helps form a confidence interval for the signal strength h in the event that a DM signal is found.
To verify that our detection probability curve provides us with the correct value for h 95%,D.P. , we performed an auxiliary simulation. Here we injected signals of strength h 95%,D.P. = 0.045 ns for the 2010 clock network into a simulated data stream. For each signal injection with random parameters, we calculate the detection statistic ρ. The histogram of the detection statistic for 10 5 of these simulations is provided in Fig. 5. The resulting histogram confirms that the distribution for Prob(ρ|H h , h = h 95%,D.P. ) is indeed Gaussian. Moreover, using a Gaussian distribution with the same mean and standard deviation as calculated, we find that Prob(ρ > ρ * |H h , h = h 95%,D.P. ) = 0.946, almost exactly as expected.
A major factor associated with detection probability is the number of devices in the network N D . A more complete discussion of this using the analytic results from Sect. 3.1 is provided in Sect. 3.5.1. Our analysis of detection probability using simulated GPS data was continued by the varying the number of devices in the network N D . We injected signals of varying strength into simulated homogeneous networks of 20, 30, and 50 white noise devices. The percentage of events found as a function of the injected signal strength for these networks is shown in Fig. 6. The average value of the detection statistics for each signal strength and clock network is also provided in the same figure. It is clear that our sensitivity to weaker signals improves as the number of devices in the network increases. We have found that h 95%,D.P. ∝ 1/ √ N D , as expected. To complete our analysis of factors affecting detection probability, we tested the effect of the template repository size M. To this end, we simulated a network of 30 homogeneous devices with standard deviation σ = 0.05 ns and injected events of varying strengths into the data streams. The simulated reference clock had a standard deviation of σ × = 0.006 ns. We then calculated detection statistics for the event using repositories of 256, 1024, and 4096 templates. The effect of template size on sensitivity in provided in Fig. 7 along with the corresponding detection statistic. Notice that an increased template repository size results in better sensitivity and larger SNR values. However, increasing the number of templates results in an increase in the false positive rate [Eq. (27)] along with an increase in computation time. To balance the false positive rate, detection probability and computation time, we typically use 1024 templates in our template repositories.

Parameter estimation
In the event that we find a DM signal in the data stream, our main goal is to estimate the parameters associated with the DM object that caused the signal. Among the parameters of interest are the incident speed, incident direction, event time, and signal strength. The estimates we provide on these parameters correspond to the parameters associated with the model-prescribed signal template that results in an SNR above the detection threshold.
In order to test the efficacy of our parameter estimation, we performed ≈ 20,000 iterations of injecting a DM signal of considerable strength (twice the level of the noise standard deviation, σ = 0.05 ns) with random parameters into a stream of simulated white-noise data for N D = 30 clocks. For each iteration, we calculate the SNR for every epoch in the simulated time window and store the event parameters that resulted in an SNR above the detection threshold. These extracted parameters are then compared to the injected parameters to check the precision of our parameter extraction. Histograms depicting our precision are shown in Fig. 8. Our resulting resolution was the following: ±27 km s -1 for velocity, ±0.05π radians for incident angle θ , and (though not shown in Fig. 8) ±5 s for the event time. The average detection statistic value ρ of the 128 trials at the time the event was injected

Placing limits
Suppose we do not observe a DM interaction signature in the GPS atomic clock data stream. This means that there were no SNR values with a magnitude above the detection threshold ρ * . We may then establish the lower and upper limits on the DM signal strength h. For the upper limit, suppose the largest SNR value we observed was ρ obs . We define the 95% confidence upper limit h 95%,U.L. as the minimum value of h for which Prob(ρ > ρ obs |H h ) = 0.95.
That is, we find the minimum value of h for which we would observe an SNR value larger than ρ obs 95% of the time if there was in fact a signal of strength h in the data stream.

Maximum and minimum sensitivity given clock network characteristics
Before analyzing the data, we can project a minimum upper limit and maximum lower limit on h by replacing ρ obs → ρ obs . Then, the minimum 95% confidence upper limit for h is the minimum value of h for which We will denote the value of h that satisfies this requirement by h * . Assuming that events are weak, i.e., well below the noise floor, it is clear by the nature of the SNR ρ obs → 0. Once again, the maximum lower limit is defined in a similar fashion, resulting in -h * serving as the maximum lower limit for the signal strength. The maximum possible exclusion limits can be placed on the magnitude of h with 95% confidence by bounding it by h * : |h| < |h * |.
For the idealized sensor network of Sect. 3.1, we are able to find the exact relation between h * and the network characteristics (N D , σ , ξ , and η) using probability distribution (26). Ultimately, we find that where K is determined by the level of confidence (for 95% confidence K = 1.64). Notice that when ξ = N D σ 2 × /σ 2 1 (i.e., when cross-correlation is negligible) our sensitivity is Kσ / N D N E (1 + η 2 ). However, when cross-correlation is considerable, or the network is large (ξ 1), the sensitivity becomes Kσ / √ N D N E . Thus, when the reference sensor is noisy, its sensitivity encoded by the constant η = Γ R eff /Γ a eff is effectively suppressed. We can also estimate our minimum sensitivity by projecting a maximum upper limit and minimum lower limit on h by using our detection threshold value as the maximum possible observed SNR value for which we do not claim a detection. Then, the minimum 95% confidence lower limit is given by the minimum value of h for which Prob(ρ > ρ * |H h ) = 0.95. It should be clear that the maximum upper limit above is the same as the 95% detection probability signal strength h 95%,D.P. . In this case, h 95%,D.P. scales in the same fashion as h * above, but the constant of proportionality K increases as a function of the false positive rate and template repository size (for less than 1 false positive in 20 years and 1024 templates in the repository, K = 8.2). Ultimately, this makes the minimum sensitivity reach nearly an order of magnitude below the maximum predicted reach.

Projected sensitivity and discovery reach
Given a DM model type (domain wall, monopole, etc.), the signal strength h links to the specific field parameters for those models. For a thin domain wall, the average strength h is related to the effective coupling and DM object parameters by in agreement with Ref. [13].
If we assume that a specific coupling Γ X dominates the effective coupling then Γ eff → κ X Γ X in Eq. (32). The limit on h, |h avg | ≤ h * translates into a limit on the coupling constant for a particular coupling to a fundamental constant Γ X where for idealized network h * is given by Eq. (31). This provides projected exclusion limits on the effective energy scale Λ X = 1/ √ |Γ X |, Our projected discovery reach using h * for the 2010 GPS network (σ ≈ 0.02 ns, σ × ≈ 0.006 ns, N D = 33 clocks, and η ≈ 1), is plotted in Fig. 9 along with existing constraints. This discovery reach includes the possibility of multiple DM interaction events occurring within the time window of the search, which results in a sensitivity that is comparable to that of optical clocks [15,28]. Notice that the projected sensitivity reach in Fig. 9 exhibits a sharp cutoff for domain-walls of thickness larger than 10 4 km and for average times between events larger than T = 20 years. This is due to the fact that DM objects of size larger than 10 4 km will affect satellite clocks and the reference clock simultaneously, resulting in a no detectable signal is the data stream. Moreover, for thin domain walls, we require that the signal be present for just a single epoch. The regime for d ≥ 10 4 km belongs to "thick" domain walls (see Ref. [14]). The sharp cutoff for average time between events larger than 20 years comes from the fact that only two decades of archival data exists.
Also notice that for a fixed DM wall thickness, the increase in sensitivity from the previous GPS results is larger for shorter average times between interaction events. This is due to the fact that the sensitivity of this approach is proportional to √ N E (see Sect. 3.1.1) while previous GPS work did not consider the case of multiple events. Thus, as T decreases, the expected number of events increases making the gap between previous GPS constraints ( [13,14]) and the predicted reach of this work larger-an effect that is exaggerated for small T .

Conclusion
In this paper we focused on detecting dark matter transients with networks of atomic sensors. We formalized the desired characteristics of such networks and developed applications of matched-filter technique in the network settings. We extended the previous literature to the practically important case of a network with cross-node correlations. This setting is especially relevant to GPS atomic clock network. Our simulations have proved the method's signal detection and event parameter estimation capabilities.
While our paper deals with classical networks of quantum sensors, it is worth noting recent proposals [48,49] for massively entangled networks of atomic clocks. In these networks, entanglement is spread not only over an atomic ensemble at a single node, but also over nodes. We leave generalization of our paper to entangled networks for the future work when such entangled networks become a reality.

Figure 9
Matched-filter technique discovery reach. Projected discovery reach for thin wall dark matter objects using the matched-filter technique along with existing constraints. The red dotted lines represent the least stringent and most stringent discovery reaches for the 2010 GPS atomic clock network. The shaded blue regions are the constraints coming from astrophysics [47] while the salmon shaded regions are the constraints placed by previous work from the GPS.DM collaboration [14]. The green shaded region contains the constraints placed by optical clock experiments [28], while the yellow region contains the constraints from a global terrestrial network of laboratory clocks [15]

Appendix 1: The network covariance matrix and its perturbative inversion A.1 Properties of the network covariance matrix
The covariance matrix E is given by the ensemble average where n a j is the noise in the datastream of the ath device at the temporal grid point (epoch) j, and n a j = 0 is assumed. Here subscripts j and l range over epochs and the superscripts a and b span network sensors. When a = b the covariance refers to a single instrument, while cross-node correlations are given by the a = b elements. The matrix E ab jl can be visualized as a 2D matrix with super-indexes (aj) and (bl): E (aj)(bl) . The dimension of the matrix is determined by the number of devices in the network (excluding the reference clock) and the number of points in data window, N D × J W . Because the data streams are stationary, the covariance matrix only depends on the lag |j -l|. From the definition (35), it is apparent that the covariance matrix is symmetric with respect to swapping the (aj) and (bl) superindexes. Further, the covariance matrix is positive (semi-) definite.
For the GPS constellation, as discussed in Sect. 1.4, the noise component entering the definition (35) can be represented as where e a j is the individual clock noise and c j is the contribution from the reference clock noise common to all data streams. Then as the reference and the node clock noises are uncorrelated. While the definition of the covariance matrix (35) explicitly refers to noise, in practice [14] we use data d a j to compute this matrix (this assumes that DM events are exceedingly rare, so that most of contributions into the above values comes from the intrinsic noise of the network). Notice that in our approximation, the covariance matrix does not depend on the spatial geometry of the network-however it does depend on the network composition. For example, for GPS, if the reference clock is switched to a different clock or a satellite clock is swapped, the covariance matrix is affected and needs to be recomputed.
To gain an insight into the structure of the covariance matrix, consider a simplifying case: suppose the network is comprised from white-noise devices (including reference clock). Then where σ 2 a and σ 2 × are variances for the individual nodes and the reference clock respectively. The common noise source contributes to all the clocks. For example, for N D = 2 nodes and J W = 3 time window, the covariance matrix reads The block structure of the covariance matrix is apparent. Each block corresponds to individual sensors and the elements inside each block refer to epochs. For colored noise sensors each block is assembled from elements of the auto-correlation functions A a (|j -l|) = e a j e a l /σ 2 a and A × (|j -l|) = c j c l /σ 2 × , so that Eq. (37) becomes Thereby, the covariance matrix is a block matrix: the block diagonals are composed of the sum of cross-node and individual device auto-correlation functions while the off-diagonal blocks contain the cross-node correlation. By the definition of the auto-correlation function, |A a,× (|j -l|)| ≤ 1, and typically inside each block the elements with larger lag (further away from diagonals) become smaller. A a,× (0) = 1 by definition. Based on these observations, and to aid in computer implementation, we can introduce blocks A ab and X ab , so that the corresponding blocks of the covariance matrix E ab = A ab + X ab , with each block internally assembled as (cf., Eq. (40)) Because X ab does not depend on the particular sensor, we simply refer to all such blocks as X, i.e., X ab ≡ X.
We are interested in the inverse of the covariance matrix required for computing the SNR statistic (20). For our white noise example (39), the inversion of the first matrix is trivial as it is diagonal. The second (x-node covariance matrix) contribution introduces off-diagonal matrix elements making the inversion difficult. Moreover, the ×-node covariance matrix is singular.
It is instructive to rewrite the definition of the inverse, EE -1 = I, in our notation, We derived the covariance matrix inverse in a closed form for a special case of white noise devices (38), where ξ ≡ N D σ 2 × /σ 2 . One can verify directly that Eq. (43) is satisfied. The inverse retains the same block structure as the original matrix (38). Its derivation is outlined in the following section.
Additionally, certain simplifications can be obtained using discrete Fourier transformation (DFT) (see, e.g., Ref. [26], where DFT for a network covariance matrix was carried out). In DFT, the transformed matrix becomes block-diagonal, each block being of dimension N D , thus simplifying the inversion procedure. However, this approach also requires DFT of the sought DM signal. In our work, this signal is non-oscillatory making the interpretation of the DFT procedure non-transparent; we leave the DFT implementation for future work if the computational speed-up is needed. In this work, the full covariance matrix (40) is inverted numerically using Cholesky decomposition. Below we outline a perturbative method which holds in the limit when the noise of the reference sensor is well below that of the network sensors. We used this perturbative inversion for older (pre-2015) GPS data, where this approximation remains valid.

A.2 Perturbative inversion
This approximation to inverting the network covariance matrix was used in our earlier GPS.DM work [14] and we will detail it below. It relies on the von Neumann series expansion, where G and F are matrices and λ is the expansion (book-keeping) parameter. This identity can be proven by matching terms of the same power of λ in the definition (G + λF) -1 (G + λF) = I. The series converges as long as the absolute values of eigenvalues of the product matrix FG -1 are smaller than 1/|λ|. Returning to our covariance matrix E, we make the following identification using our block decomposition (42) and λ = 1. Notice that G is a block-diagonal matrix (inverse of such a matrix is again a block-diagonal matrix composed of inverses of original blocks). Now we illustrate this perturbative technique for a network of white-noise devices. Here the covariance matrix is given by Eq. (38). Then the above decomposition leads to G ab jl = σ 2 a δ ab δ jl , F ab jl = σ 2 × δ jl . Then which are the two leading terms in the expansion of the exact result (44) in ξ . Nominally, the contribution of the second, perturbative, term is suppressed when σ × min a (σ a ), i.e., this approximation can only be used for networks of sensors that have noise levels far greater than that of the reference sensor. In cases when the reference sensor noise is comparable to that of the network sensors, the approximation breaks down and either an exact inversion must be used or mitigation techniques must be implemented to eliminate or minimize the reference sensor contribution to the individual sensor data streams (see Sect. 3.2.1).
For the general case of colored noise sensors, we return to our block decomposition (42) of the network covariance matrix and focus on a single block, Because G is block-diagonal, (G -1 ) ab = δ ab (A aa ) -1 . To simplify the second term in the expansion, recall the product rule for block matrices, This rule parallels the conventional matrix multiplication with individual matrix elements replaced with blocks. Then Further approximation may consist in neglecting off-diagonal matrix elements of the ×-node correlation function in the above expression [14], X ≡ σ 2 × I. In this secondary approximation, This is the approximation used in our calculations for pre-2015 GPS network generations.
We also point out that the exact covariance matrix inversion (44) for our idealized network of white noise sensors can be derived by following these block-matrix steps and summing the von Neumann series (45) to all orders.

A.3 Performance comparison between exact and perturbative inversion
In order to utilize the perturbative inversion outlines in the above section, we require that the reference device noise be sufficiently smaller than that of the node devices. To verify the inadequacy of the perturbative inversion for networks with a noisy reference sensor, we simulated signal-free data for a homogeneous network of 30 white noise node devices for various reference device noise levels (σ × /σ = 0, 0.1, 0.5, and 1). We then calculated standard deviation of ≈ 25,000 template-specific SNR values for the simulated data streams using both the exact inversion and the approximate inversion of E. The results of these simulations are provided in Table 3. We find that the results using the perturbative inversion are nearly identical to the exact inversion for small levels of cross-correlation (σ × /σ ≤ 0.1). However, when the noise of the reference sensor is large, the approximate inversion behaves poorly compared to the expected value of σ ρ . Note that the deviations from σ ρ = 1 in the exact inversion column can be attributed to sampling error.

Appendix 2: Derivation of SNR for idealized network
Consider a homogeneous network of N D devices each with Gaussian white noise profiles with zero mean and standard deviation σ , along with a reference sensor also with a Gaus- sian white noise profile with a standard deviation of σ × and also with zero mean. That is, e j ∼ Normal 0, σ 2 and c j ∼ Normal 0, σ 2 × .
Data that contains a dark matter transient signal will be of the form where e j is the node sensor Gaussian noise at epoch j (with variance σ 2 ), c j is the reference sensor Gaussian noise at epoch j (with variance σ 2 × ), and s j is the "unit-ized" DM signal which is scaled by the DM signal strength h (the strength of the signal felt by the network sensors). Since the network is assumed to be homogeneous, h is the same for all network sensors, though we allow for the possibility of the strength of the DM interaction with the reference device to be different than that of the satellite nodes. In this case, the reference sensor experiences a signal of strength h R from the same event in which the network sensors experience a signal of strength h. Then, the unit signal for a sensor a will be of the form (in the case that h > 0 and the network sensor interacts with the DM wall prior to the reference device) s a = {0, . . . , 0, 1, 0, . . . 0, -η, 0, . . . , 0}, where η = h R /h. In order to calculate the detection statistic mean [Eq. (24)] and its variance [Eq. (25)], we must utilize the inverse of the covariance matrix from Appendix 1. This is given by where ξ = N D σ 2 × /σ 2 . Recall the definition of the template-specific SNR from Eq. (20), and suppose that the template repository contains the exact signal that lies in the data stream, i.e., s i = s for some signal template in the repository. Then, the detection statistic is given by ρ = d T E -1 s/ s T E -1 s. Thus, calculating the expectation value of ρ and its variance will consist of calculating s T E -1 s as well as d T E -1 s.
Using the thin wall template from Eq. (53), the vector-matrix-vector product is computed as the sum where N D is the number of network sensors and J W is the number of epochs in the given time window. Factoring out 1/σ 2 , one can sum over the terms separated by the subtraction independently: (1) sum over s a j δ jl δ ab s b l and (2) sum over -s a j δ jl 1 N D ξ 1+ξ s b l . The sum in (1) is just the sum of the squares of all the signal terms from Eq. (53) multiplied by the number of devices. Since the signal terms are all zero except at epochs where the satellite clock a and the reference clock R are affected, the sum simplifies immensely. Now, to compute the sum in (2), it must be dissected further. The Kronecker delta δ jl in (2) collapses the sum over l and we split the sum in (2) based on whether the signal terms come from distinct sensors or not Notice that the first term on the right side of this equation is the same as (1) above. Now, since every clock experiences the same signal term at the epoch when the reference sensor interacts with the DM wall (epoch j R ), the second term on the right can undergo yet another dissection The first term in parentheses on the right side is simply η 2 . Since the signal template values at all epochs consists of entirely null values besides the epochs where the individual sensors and the reference sensor interact with the DM wall, the second term on the right is only non-zero when there are disparate sensors that are affected by the DM object at the same epoch. We denote the ratio of network sensors that are close enough spatially to interact with the DM wall within the sampling time interval τ 0 as λ. For GPS, τ 0 = 30 s and at galactic velocities, this "fractional degeneracy" factor λ ≈ 0.2. Ultimately, we arrive at s T E -1 s = 1 σ 2 1 + η 2 N D -ξ 1 + ξ 1 + η 2 + (N D -1) η 2 + λ .
For a time window large enough to contain multiple non-overlapping events, it is clear that s T E -1 s → N E s T E -1 s, where N E is the number of events contained within the J W time window.
To derive d T E -1 s, we recall that each network sensor data term is the sum of noise (comprised of node sensor noise and reference sensor noise) and a signal [Eq. (52)]. Then, where the elements of n are given by Eq. (36). Since the elements of n are Gaussian random variables with zero mean, the quantity n T E -1 s will also be Gaussian distributed with a mean of zero. Furthermore, since s T E -1 s is a constant given by Eq. (57), we find that d T E -1 s is Gaussian distributed with a mean of hs T E -1 s and variance equal to the variance of n T E -1 s. This implies that the nature of the SNR detection statistic from Eq. (20) is Gaussian as well, with the same mean and standard deviation as d T E -1 s scaled by 1/ s T E -1 s. Ultimately, using Eq. (57), the mean of the SNR when a signal is present is given by where we dropped the dependence on the fractional degeneracy factor λ. The variance of the SNR, is computed in Sect. B.1, where we prove that σ 2 ρ = 1 in the most general case; this holds, in particular, for our idealized network.

B.1 SNR variance
The goal of this section is to prove that Var d T E -1 s = s T E -1 s, implying that the SNR variance (60) is σ 2 ρ = 1. The proof holds regardless of the nature of the covariance matrix-for example it applies to colored noise network with arbitrary cross-node correlations. Explicitly, where n is the intrinsic noise. To streamline notation, we will use Greek letters to index combinations (a, i), where as in Sect. By the definition of the covariance matrix, n α n β = E αβ . Further, β E αβ (E -1 ) ββ = δ αβ , which reduces as we intended to prove. From Eq. (60), it follows that σ 2 ρ = 1.

Appendix 3: Inverse transform sampling (importance sampling)
Consider a prior probability density function on one of the DM model parameters p(θ ) (e.g., the standard halo model velocity distribution). The cumulative distribution function (CDF) for the prior is defined as We can then define g(u) = θ = C -1 (u). This is particularly useful for sampling from known probability distributions: if u is randomly drawn from a uniform [0:1] distribution, then the θ = g(u) values will be drawn from the prior p(θ ) distribution. This has the affect of concentrating the sampled points in the regions where p(θ ) is large (and thus naturally reducing the probability of false-positives where p(θ ) is small). Thereby, the priors are taken into account implicitly in the template generation procedure. Note that this just the standard method of inverse transform sampling.