# Spectral Envelope

The **spectral envelope** is a tool to study cyclic behaviors in categorical data. It is more informative than the traditional approach of attributing a different number to each category for power-spectral density estimation.
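The limitation of a fixed numeric coding can be seen in a small sketch. It uses plain Julia with no packages; the helper `normed_power`, the toy sequence and both codings are made up for illustration and are not part of CategoricalTimeSeries. The same categorical sequence is coded in two ways and the variance-normalized periodogram is evaluated at one frequency.

```julia
# Illustrative sketch; `normed_power` is a hypothetical helper, not a package function.

# Variance-normalized periodogram of a numeric series x at frequency ω
# (in cycles per sample), computed with a direct DFT sum.
function normed_power(x::AbstractVector{<:Real}, ω::Real)
    n = length(x)
    μ = sum(x) / n
    v = sum(abs2, x .- μ) / n                                  # variance of the series
    d = sum((x[t] - μ) * cis(-2π * ω * (t - 1)) for t in 1:n)  # DFT at ω
    return abs2(d) / (n * v)
end

# Toy sequence in which "A" occupies every other position (period 2).
seq = repeat(["A", "C", "A", "G"], 15)

# Two arbitrary numeric codings of the same three categories.
coding_a = Dict("A" => 1.0, "C" => 2.0, "G" => 3.0)
coding_b = Dict("A" => 1.0, "C" => 0.0, "G" => 0.0)

p_a = normed_power([coding_a[s] for s in seq], 0.5)
p_b = normed_power([coding_b[s] for s in seq], 0.5)
println("normed power at 0.5, coding a: ", p_a)
println("normed power at 0.5, coding b: ", p_b)
```

For this toy sequence, coding b yields a larger normed power at frequency 0.5 than coding a, so the strength of a detected cycle depends on an arbitrary choice of numbers.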

For each frequency in the spectrum, the **spectral envelope** finds the optimal real-valued mapping of the categories, that is, the mapping that maximizes the normed power-spectral density at that frequency. Therefore, no matter what mapping is chosen for the different categories, the normed power-spectral density is always bounded above by the spectral envelope.
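This can be stated compactly in the notation of the paper cited below. The formulation here is a summary under stated assumptions and not necessarily the package's internal convention. Suppose the series takes $k$ categories, let $Y_t$ be the $(k-1)$-dimensional indicator vector of the category observed at time $t$ (one category serving as reference), $V$ its covariance matrix, assumed nonsingular, and $f_Y^{re}(\omega)$ the real part of its spectral density matrix. A mapping assigns the values $\beta$ to the categories, giving the real-valued series $x_t(\beta) = \beta^\top Y_t$, and the spectral envelope is

$$
\lambda(\omega) = \sup_{\beta \neq 0} \frac{\beta^\top f_Y^{re}(\omega)\,\beta}{\beta^\top V\,\beta}
$$

Equivalently, $\lambda(\omega)$ is the largest eigenvalue of the generalized problem $f_Y^{re}(\omega)\,\beta = \lambda\,V\,\beta$, and the maximizing $\beta(\omega)$ is the optimal mapping at that frequency.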

The spectral envelope was introduced in David S. Stoffer, David E. Tyler and Andrew J. McDougall, *Spectral analysis for categorical time series: Scaling and the spectral envelope*.

## Main functions

**spectral_envelope — Function**

```
spectral_envelope(ts; m = 3)
```

Computes the spectral envelope of an input categorical time-series.

The degree of smoothing can be chosen by the user.

Parameters:

- `ts` (Array{Any,1}): 1-D array containing the input categorical time series.
- `m` (Int): Smoothing parameter. Corresponds to how many neighboring points are involved in the smoothing (weighted average). Defaults to 3.

Returns: `(freq, se, eigvecs)`, with `freq` the frequencies of the power spectrum, `se` the values of the spectral envelope at each frequency in `freq`, and `eigvecs` the optimal real-valued mapping for each frequency point.
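As a rough illustration of what these returned values represent, the sketch below computes a raw (unsmoothed) spectral envelope value and the associated mapping at a single frequency, using only the LinearAlgebra standard library. `envelope_at` is a hypothetical helper written for this page, not the package's implementation. It assumes the textbook formulation (largest generalized eigenvalue of the real part of the indicator periodogram matrix relative to the indicator covariance matrix), and its normalization may differ from the `se` values returned by `spectral_envelope`.

```julia
using LinearAlgebra   # standard library

# Hypothetical helper (not the package's code): raw, unsmoothed spectral
# envelope of a categorical sequence at frequency ω (cycles per sample).
function envelope_at(seq::AbstractVector, ω::Real)
    cats = sort(unique(seq))
    n, k = length(seq), length(cats)
    # Indicator matrix with k-1 columns; the last category is the reference.
    Y = [seq[t] == cats[j] ? 1.0 : 0.0 for t in 1:n, j in 1:k-1]
    Yc = Y .- sum(Y, dims = 1) ./ n          # center each column
    V = Symmetric(Yc' * Yc ./ n)             # covariance matrix of the indicators
    # DFT of each indicator column at ω, then the periodogram matrix.
    w = [cis(-2π * ω * (t - 1)) for t in 1:n]
    d = transpose(Yc) * w                    # complex DFT values, one per column
    P = Symmetric(real.(d * d') ./ n)        # real part of the periodogram matrix
    # Largest generalized eigenvalue of P β = λ V β, with its eigenvector.
    F = eigen(P, V)
    i = argmax(F.values)
    β = F.vectors[:, i]
    # The reference category is mapped to 0.
    return F.values[i], Dict(cats[j] => (j < k ? β[j] : 0.0) for j in 1:k)
end

# Toy sequence in which "A" occupies every other position (period 2).
seq = repeat(["A", "C", "A", "G"], 15)
λ, scaling = envelope_at(seq, 0.5)
println("envelope at 0.5: ", λ)
println("scaling: ", scaling)
```

On this toy input the optimal mapping separates "A" from the other two categories, which receive (numerically) the same value. That matches the structure of the sequence: only the alternation between "A" and "not A" repeats with period 2.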

**get_mappings — Function**

```
get_mappings(data, freq; m = 3)
```

Computes, for a given frequency `freq`, the optimal mappings for the categories in `data`. Scans the vicinity of `freq` to find the maximum of the spectral envelope, prints a summary and returns the obtained mappings.

Parameters:

- `data` (Array{Any,1}): 1-D array containing the input categorical time series.
- `freq` (Float): Frequency for which the mappings are wanted. The vicinity of `freq` is scanned to find the maximal value of the spectral envelope.
- `m` (Int): Smoothing parameter. Corresponds to how many neighboring points are involved in the smoothing (weighted average). Defaults to 3.

Returns: `mappings`, the optimal mappings for the maximum found around `freq`.

## Example

Applying the spectral envelope to study a segment of DNA from the Epstein-Barr virus and plotting the results:

```
using DelimitedFiles, Plots
using CategoricalTimeSeries

# Locate the DNA test file shipped with the package and load it.
data_path = joinpath(dirname(dirname(pathof(CategoricalTimeSeries))), "test", "DNA_data.txt")
data = readdlm(data_path, ',')

# Compute the spectral envelope with the smoothing parameter set to 0, then plot it.
f, se, eigvecs = spectral_envelope(data; m = 0)
plot(f, se, xlabel = "Frequency", ylabel = "Intensity", title = "test data: extract of Epstein-Barr virus DNA", label = "spectral envelope")
```

To get the associated optimal mapping for the peak at frequency 0.33:

```
mappings = get_mappings(data, 0.33; m = 0)
# prints: position of peak: 0.33 strengh of peak: 0.02
print(mappings)
# prints: Dict{SubString{String}, Float64} with 4 entries:
#   "A" => -0.59
#   "T" => 0.55
#   "C" => 0.0
#   "G" => 0.6
```
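Once a mapping is available, it can be applied to turn the categorical sequence into a real-valued series for further analysis. The sketch below is self-contained: the dictionary values are copied from the output shown above, and the short toy sequence is an assumption standing in for the DNA data.

```julia
# Values copied from the printed mapping; `seq` is a made-up toy sequence.
mappings = Dict("A" => -0.59, "T" => 0.55, "C" => 0.0, "G" => 0.6)
seq = ["A", "C", "G", "T", "T", "A"]

# Replace each category by its optimal real value.
x = [mappings[s] for s in seq]
println(x)
```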