Lomb-Scargle Periodogram
An astronomical time series is a sequence of discrete measurements with either even or uneven sampling. It can be expressed mathematically as the pointwise product of a continuous signal, a rectangular function (the finite observing window) and a Dirac comb (the sampling times).
The Fourier transform of this product is the convolution of the Fourier transforms of the three functions in omega (frequency) space.
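The squared modulus of the Fourier transform is proportional to the power of each periodic signal in the time series, or more accurately, for N measurements y_n taken at times t_n, the classical periodogram is:
P(omega) = (1/N) |sum_n y_n exp(-i omega t_n)|**2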
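As a rough illustration, the classical periodogram can be evaluated directly from its definition: the squared modulus of the Fourier sum over the samples, divided by the number of samples N. The sketch below uses only the Python standard library. The function name `classical_periodogram` and the toy data are assumptions made for this example, not part of any library.

```python
import cmath
import math

def classical_periodogram(t, y, omegas):
    """Classical periodogram |sum_n y_n exp(-i omega t_n)|**2 / N (hypothetical helper)."""
    n = len(t)
    mean = sum(y) / n
    resid = [v - mean for v in y]  # subtract the mean so a constant offset adds no power
    powers = []
    for w in omegas:
        ft = sum(r * cmath.exp(-1j * w * tn) for r, tn in zip(resid, t))
        powers.append(abs(ft) ** 2 / n)
    return powers

# Toy data (assumed): unit-amplitude sinusoid with period 5, sampled every 0.5 time units
t = [0.5 * k for k in range(100)]
y = [math.sin(2 * math.pi * tn / 5.0) for tn in t]

# Evaluate at three trial frequencies; the true frequency is 0.2 cycles per time unit
omegas = [2 * math.pi * f for f in (0.1, 0.2, 0.3)]
powers = classical_periodogram(t, y, omegas)
```

For this evenly sampled toy signal, the power is concentrated at the true frequency and is essentially zero at the other two trial frequencies.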
Scargle showed that a slightly modified form of this periodogram is equivalent to fitting the time series data with the model
y(t) = A sin(omega t + phi)
using least-squares fitting. The resulting chi-square at each frequency indicates how much power that frequency contributes to the time series. Essentially, the difference between the chi-square assuming no variability (a constant model) and the chi-square assuming a sinusoidal variation in the time series is proportional to the power, which is the periodogram derived by Lomb.
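The chi-square picture can be sketched for a single trial frequency as follows. This is a minimal illustration that assumes unit measurement uncertainties and writes the sinusoid as a*cos(omega t) + b*sin(omega t), which is the same model as an amplitude and a phase. The function name `chi2_difference` and the toy data are hypothetical. Under these assumptions, half of the returned difference corresponds to the unnormalized power at that frequency.

```python
import math
import random

def chi2_difference(t, y, omega):
    """Return (chi2_constant - chi2_sinusoid, fitted amplitude) at one omega > 0."""
    n = len(t)
    mean = sum(y) / n
    r = [v - mean for v in y]            # residuals of the constant (non-variable) model
    c = [math.cos(omega * tn) for tn in t]
    s = [math.sin(omega * tn) for tn in t]
    # Normal equations of the linear least-squares fit r ~ a*cos + b*sin
    cc = sum(x * x for x in c)
    ss = sum(x * x for x in s)
    cs = sum(p * q for p, q in zip(c, s))
    rc = sum(p * q for p, q in zip(r, c))
    rs = sum(p * q for p, q in zip(r, s))
    det = cc * ss - cs * cs              # vanishes at omega = 0, so use omega > 0
    a = (rc * ss - rs * cs) / det
    b = (rs * cc - rc * cs) / det
    chi2_const = sum(x * x for x in r)
    chi2_sin = sum((ri - a * ci - b * si) ** 2 for ri, ci, si in zip(r, c, s))
    return chi2_const - chi2_sin, math.hypot(a, b)

# Toy data (assumed): unevenly sampled sinusoid, amplitude 2, frequency 0.1
rng = random.Random(1)
t = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
y = [2.0 * math.sin(2 * math.pi * 0.1 * tn + 0.3) for tn in t]

d_true, amp_true = chi2_difference(t, y, 2 * math.pi * 0.1)
d_wrong, amp_wrong = chi2_difference(t, y, 2 * math.pi * 0.37)
```

The chi-square improvement is large at the true frequency and small at an unrelated one, and the fitted amplitude at the true frequency is close to the input amplitude.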
Scargle’s Method of Calculating the Periodogram (The Lomb-Scargle Periodogram)
In Scargle’s approach, we first define a frequency range in which we believe the signal's frequency of oscillation lies, and then fit y(t) to the data at each individual frequency, with the amplitude and phase as the free parameters.
Defining the frequency grid:
If the total time of observation is T, then a signal with frequency f = 1/T completes exactly one full cycle within the observing period. This can be set as the lower limit of our frequency grid. One can also set it to 0 for simplicity. The maximum frequency that is detectable in the case of evenly sampled data is determined by the Nyquist limit: if dt is the cadence of observation (the time between consecutive samples), then the maximum observable frequency in the time series is 1 / (2 dt).
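A frequency grid along these lines might be built as in the sketch below. The lower limit 1/T and the Nyquist upper limit follow the text. The grid spacing of 1/(oversampling * T) with `oversampling=5` is an assumption added for this example, as is the function name `frequency_grid`.

```python
import math

def frequency_grid(total_time, cadence, oversampling=5):
    """Frequencies from 1/T up to the Nyquist limit 1/(2*cadence) (hypothetical helper)."""
    f_min = 1.0 / total_time
    f_max = 1.0 / (2.0 * cadence)
    # Assumed spacing: a few grid points per natural peak width 1/T
    df = 1.0 / (oversampling * total_time)
    # Small epsilon guards against floating-point round-off at the upper edge
    n = int(math.floor((f_max - f_min) / df + 1e-9)) + 1
    return [f_min + k * df for k in range(n)]

# Example (assumed values): 100 time units of data sampled every 0.5 time units
freqs = frequency_grid(total_time=100.0, cadence=0.5)
```

A finer grid (larger `oversampling`) costs more computation but makes it less likely that a narrow peak falls between two grid points.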
Filtering out spurious signals and finding periodicities in the signal
False Alarm Probability
One crucial pitfall of the Lomb-Scargle periodogram is that it fits a sinusoid at every frequency, whether or not the data are actually periodic. This means that the periodogram will be plagued by false-positive peaks even if the actual time series does not have any periodicity. A simulation reveals that the periodogram of pure Gaussian white noise also shows peaks.
The advantage of the technique employed by Scargle is that, in pure white noise, the power at any given frequency follows an exponential distribution with probability density exp(-p), so noise peaks can be filtered out using the False Alarm Probability.
The probability that noise alone produces a power of p or higher at a given frequency omega is Pr[P(omega) > p] = exp(-p). The probability that the power at that frequency is lower than p is 1 - Pr. The probability that the power is lower than p at every frequency is (1 - Pr)**Ni, where Ni is the number of independent frequencies. Horne & Baliunas (1986) give the following equation for Ni:
Ni = -6.362 + 1.193 No + 0.00098 No**2,
where No is the number of data points in the time series.
The False Alarm Probability is defined as FAP = 1 - (1 - exp(-p))**Ni, which is the probability that noise alone produces at least one peak with power > p.
So, one would calculate the periodogram and then the FAP of the peak with the highest power. The value of the FAP is the probability that pure noise would produce a peak at least this high, so a small FAP indicates that the peak is likely due to an actual signal and not noise.
It is also important to note that the periodogram must be normalized by the variance of the data for the FAP to yield accurate results.
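The FAP formula and the Horne & Baliunas estimate of Ni translate directly into a few lines of Python. The sketch below assumes the power p has already been normalized by the variance of the data. The function names are hypothetical, and the inverse helper `power_for_fap` is simply the FAP formula solved for p.

```python
import math

def independent_frequencies(n_points):
    """Horne & Baliunas (1986) estimate of Ni from the number of data points No."""
    return -6.362 + 1.193 * n_points + 0.00098 * n_points ** 2

def false_alarm_probability(power, n_points):
    """FAP = 1 - (1 - exp(-p))**Ni for a variance-normalized power p."""
    ni = independent_frequencies(n_points)
    return 1.0 - (1.0 - math.exp(-power)) ** ni

def power_for_fap(fap, n_points):
    """Power a peak must reach for a chosen FAP (the formula above solved for p)."""
    ni = independent_frequencies(n_points)
    return -math.log(1.0 - (1.0 - fap) ** (1.0 / ni))

# Example (assumed values): a peak of normalized power 10 in a 100-point time series
fap_peak = false_alarm_probability(10.0, 100)
threshold_1pct = power_for_fap(0.01, 100)
```

Note that the Ni estimate becomes negative for very short time series (fewer than about six points), so this sketch is only meaningful for reasonably long data sets.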
Bootstrapping
Another method to detect spurious signals is to reshuffle the time series data. First, we calculate the Lomb-Scargle periodogram (LSP) of the original lightcurve, which contains the peaks of actual signals as well as noise peaks. Then the lightcurve is randomized by shuffling the measured values among the observation times, and the LSP is computed again. This is repeated for many (say 1000) iterations. Shuffling destroys the peak of an actual signal, whereas peaks of noise-level height keep appearing in the randomized periodograms. Peaks in the original LSP that are no higher than these persisting noise peaks can be discarded.
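One possible way to turn the shuffling idea into a number is sketched below: count the fraction of shuffled lightcurves whose highest peak reaches the highest peak of the original data. This particular statistic is an assumption of the example rather than something prescribed above. The periodogram uses Scargle's formula with the time offset tau and is normalized by the sample variance. All function names and the toy data are hypothetical, and only 200 shuffles are used to keep the pure-Python run short.

```python
import math
import random

def make_periodogram(t, omegas):
    """Precompute Scargle's trig terms for fixed times; return a function y -> powers."""
    terms = []
    for w in omegas:
        # Time offset tau that makes the sine and cosine terms orthogonal
        tau = math.atan2(sum(math.sin(2 * w * tn) for tn in t),
                         sum(math.cos(2 * w * tn) for tn in t)) / (2 * w)
        c = [math.cos(w * (tn - tau)) for tn in t]
        s = [math.sin(w * (tn - tau)) for tn in t]
        terms.append((c, s, sum(x * x for x in c), sum(x * x for x in s)))

    def periodogram(y):
        n = len(y)
        mean = sum(y) / n
        r = [v - mean for v in y]
        var = sum(x * x for x in r) / (n - 1)   # variance normalization
        out = []
        for c, s, cc, ss in terms:
            yc = sum(p * q for p, q in zip(r, c))
            ys = sum(p * q for p, q in zip(r, s))
            out.append((yc * yc / cc + ys * ys / ss) / (2 * var))
        return out

    return periodogram

def shuffle_test(t, y, omegas, n_shuffles=200, seed=0):
    """Return (highest observed power, fraction of shuffles with a peak at least as high)."""
    rng = random.Random(seed)
    periodogram = make_periodogram(t, omegas)
    observed = max(periodogram(y))
    shuffled = list(y)
    exceed = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)                   # times stay fixed, values are permuted
        if max(periodogram(shuffled)) >= observed:
            exceed += 1
    return observed, exceed / n_shuffles

# Toy data (assumed): noisy sinusoid at frequency 0.2 on uneven sampling
rng = random.Random(42)
t = sorted(rng.uniform(0.0, 50.0) for _ in range(60))
y = [math.sin(2 * math.pi * 0.2 * tn) + rng.gauss(0.0, 0.5) for tn in t]
omegas = [2 * math.pi * (0.02 + 0.01 * k) for k in range(60)]

peak_power, shuffle_fraction = shuffle_test(t, y, omegas)
```

A fraction near zero suggests the highest peak is unlikely to be produced by noise, while a large fraction means peaks of that height survive the randomization and can be discarded. Because the observation times never change, the trigonometric terms are computed once and reused for every shuffle.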