## Spectral Analysis of wide variety of data in MATLAB

Signals can be any time-varying or space-varying quantity. Examples: speech, temperature readings, seismic data, stock price fluctuations.

A signal has one or more frequency components in it and can be viewed from two different standpoints: time-domain and frequency domain. In general, signals are recorded in time-domain but analyzing signals in frequency domain makes the task easier. For example, differential and convolution operations in time domain become simple algebraic operation in the frequency domain.

## Fourier Transform of a signal:

We can go between time-domain and frequency domain by using a tool called Fourier transform.

Now get comfortable with Fourier transform, let’s take an example in MATLAB:

clear; close all; clc

%%Creating dataset
fs=100;    %sampling frequency (samples/sec)
t=0:1/fs:1.5-1/fs;%time
f1=10;  %frequency1
f2=20;  %frequency2
f3=30;  %frequency3

x=1*sin(2*pi*f1*t+0.3)+2*sin(2*pi*f2*t+0.2)+3*sin(2*pi*f3*t+0.4);

We represent the signal by the variable $x$. It is the summation of three sinusoidal signals with different amplitude, frequency and phase shift. We plot the signal first.

%%Visualizing data
figure(1)
plot(x)
legend('Two-tone signal')
saveas(gcf,'signal_plot.jpg')

We then do the Fourier transform of the signal to obtain its magnitude and phase spectrum.

%%Fourier transform of the data
X=fft(x);
X_mag=abs(fftshift(X));   %magnitude
X_phase=angle(fftshift(X)); %phase

%%Frequency bins
N=length(x);
Fbins=(-N/2:N/2-1)/N*fs;

%%Magnitude response
figure(2);
plot(Fbins,X_mag)
xlabel('Frequency(Hz)')
ylabel('Magnitude')
title('FFT - Magnitude response')
grid on
saveas(gcf,'fft_mag_response.jpg')

%%Phase response
figure(3);
plot(Fbins,X_phase/pi)
xlabel 'Frequency (Hz)'
ylabel 'Phase / \pi'
title('FFT - Phase response')
grid on
saveas(gcf,'fft_phase_response.jpg')

## Power spectrum estimation using the Welch’s method:

Now we compute the power spectrum of the using the Welch’s method. We can use the function “pwelch” in Matlab to obtain the desired result.

pwelch(x,[],[],[],fs) %one-sided power spectral density
saveas(gcf,'power_spectral_plot.jpg')

Pwelch is a spectrum estimator. It computes an averaged squared magnitude of the Fourier transform of a signal. In short, it computes a set of smaller FFTs (using sliding windows), computes the magnitude square of each and averages them.

By default, $latex x$ is divided into the longest possible segments to obtain as close to but not exceed 8 segments with 50% overlap. Each segment is windowed with a Hamming window. The modified periodograms are averaged to obtain the PSD estimate. If you cannot divide the length of $latexx$  exactly into an integer number of segments with 50% overlap, $latex x$ is truncated accordingly.

Note the peak at 10, 20 and 30 Hz. Also, note the display default for Pwelch is in dB (logarithmic).

Let us inspect another data using the “pwelch” function of the Matlab.

clear; close all; clc

load officetemp1; %16 weeks 2 samples/hr

%%Visualizing data
figure(1);
plot(t,tempC)
ylabel(sprintf('Temperature (%c C)', char(176)))
xlabel('Time (weeks)')
grid on
saveas(gcf,'tempReadings.jpg')

tempnorm=tempC-mean(tempC); %temperature residual

fs=2*24*7; %2 samples per hour

figure(2);
pwelch(tempnorm,[],[],[],fs);
axis([0 16 -30 0])
xlabel('Occurrence/week')
saveas(gcf,'power_spectral_tempReadings.jpg')

## Resolving two close frequency components using “pwelch”:

Let’s first plot the data (having the frequency components at 500 and 550 Hz) using the default parameters of “pwelch” function:

clear; close all; clc

%%

pwelch(s,[],[],[],Fs);
title('pwelch with default input- f1: 500Hz, f2: 550Hz')
set(gca,'YLim',[-120,0])
set(gca,'XLim',[0,5])

saveas(gcf,'pwelchDefaultPlot.jpg')

Here, we can see that the frequency component at 500Hz can be resolved but the frequency component at 550 Hz is barely visible.

One can obtain better frequency resolution by increasing the window size:

figure;
filename = 'pwelchWindowAnalysis.gif';
for len=500:300:N
pwelch(s,len,[],len,Fs);
title(sprintf('Hamming Window size: %d',len))
set(gca,'YLim',[-120,0])
set(gca,'XLim',[0,1])
drawnow;
frame = getframe(1);
im = frame2im(frame);
[imind,cm] = rgb2ind(im,256);
if len == 500;
imwrite(imind,cm,filename,'gif', 'Loopcount',inf);
else
imwrite(imind,cm,filename,'gif','WriteMode','append');
end
end

Here, we can see that by increasing the window width, we can resolve the two components. By increasing the window width, we lose the temporal resolution but at the same time, we gain the spectral resolution.

## Resolving the frequency component using the Kaiser window:

“pwelch” function uses Hamming window by default.

L = 64;
wvtool(hamming(L))

Because of the inherent property of the Hamming window (fixed side lobe), sometimes the signal can get masked.

clear; close all; clc

figure;
pwelch(s,[],[],[],Fs);
title('pwelch with default input- f1: 500Hz, f2: 5kHz')
set(gca,'YLim',[-90,0])
set(gca,'XLim',[0,10])
saveas(gcf,'pwelchComplexDefaultPlot.jpg')

In the above figure, we can see that the frequency component of the signal at 5kHz is barely distinguishable.

To resolve this component of frequency, we use Kaiser window instead of the default Hamming window.

% %% Kaiser window
figure;
len=2050;
h1=kaiser(len,4.53); %side lobe attenuation of 50dB
pwelch(s,h1,[],len,Fs);
title('Kaiser window with 50dB side lobe attenuation')
saveas(gcf,'pwelchComplexKaise50dBsidelobe.jpg')

% %% Kaiser window
figure;
len=2050;
h2=kaiser(len,12.26); %side lobe attenuation of 50dB
pwelch(s,h2,[],len,Fs);
title('Kaiser window with 120dB side lobe attenuation')
saveas(gcf,'pwelchComplexKaise120dBsidelobe.jpg')

To obtain a Kaiser window that designs an FIR filter with sidelobe attenuation of α dB, we use the following β :

kaiser(len,beta)

## Amplitude of one frequency component is much lower than the other:

We can have some signal where the amplitude of one frequency component is much lower than the others and is inundated in noise.

To deal with such signals, we need to get rid of the noise using the averaging of the window.

clear; close all; clc

% %% Kaiser window
figure;
len=2050;
h1=kaiser(len,4.53); %side lobe attenuation of 50dB
pwelch(s,h1,[],len,Fs);
set(gca,'XLim',[8,18]);
set(gca,'YLim',[-60,-20]);
title('Kaiser window with 50dB side lobe attenuation')
saveas(gcf,'pwelchAvgComplexKaise50dBsidelobe.jpg')

In the above signal plot, the second frequency component at 14kHz is undetectable. We can get rid of the noise using the averaging approach.

figure;
filename = 'pwelchAveraging.gif';
for len=2050:-100:550
h2=kaiser(len,4.53); %side lobe attenuation of 50dB
pwelch(s,h2,[],len,Fs);
drawnow;
frame = getframe(1);
im = frame2im(frame);
[imind,cm] = rgb2ind(im,256);
if len == 2050;
imwrite(imind,cm,filename,'gif', 'Loopcount',inf);
else
imwrite(imind,cm,filename,'gif','WriteMode','append');
end
end

Here, we take the smaller window in steps to show the effect of averaging. Smaller window in frequency domain is equivalent to the larger window in time domain.

## Dealing with data having missing samples:

clear; close all; clc
%%Signals having missing samples
plot(1:length(wgt),wgt,'LineWidth',1.2)
ylabel('Weight in kg'); xlabel('Day of the year')
grid on; axis tight; title('Weight readings of a person')
saveas(gcf,'weightReading.jpg')

In case, we have missing samples in the data, i.e., the data is not regularly recorded, then we cannot apply the “pwelch” function of matlab to retrieve its frequency components. But thankfully, we have the function “plomb” which can be applied in such cases.

figure;
[p,f]=plomb(wgt,7,'normalized');
plot(f,p)
xlabel('Cycles per week');ylabel('Magnitude')
grid on; axis tight
saveas(gcf,'plombspectrum.jpg')

## Analyzing time and frequency domain simultaneously:

Sometimes, we need the time and frequency informations simultaneously. For a long series of data, we need to know which frequency component is recorded first and at what time. This can be done by making the spectrogram.

clear; close all; clc

%%
figure;
pwelch(x,[],[],[],Fs)
saveas(gcf,'powerspectrum_dtmf.jpg')

% %%
figure;
spectrogram(x,[],[],[],Fs,'yaxis'); colorbar; %default window is hamming window
saveas(gcf,'spectrogramDefault_dtmf.jpg')

Since, the time resolution is higher for smaller window and frequency resolution is lower at small window length. So there is the trade off between the time and frequency domain window length. We need to figure out the optimal point for better resolution in both time and frequency domain.

segLen = [120, 240,480,600,800,1000,1200,1600];

figure;
filename='spectrogramAnalysis.gif';
for itr = 1:numel(segLen)
spectrogram(x,segLen(itr),[],segLen(itr),Fs,'yaxis');
set(gca,'YLim',[0,1.7]);
title(sprintf('window length: %d',segLen(itr)))
colorbar;
drawnow;
frame = getframe(1);
im = frame2im(frame);
[imind,cm] = rgb2ind(im,256);
if itr == 1;
imwrite(imind,cm,filename,'gif', 'Loopcount',inf);
else
imwrite(imind,cm,filename,'gif','WriteMode','append');
end
end

## Complex Moving Waves

In the previous case, we have seen how can we model a simple wave travelling with one frequency. In nature, usually we encounter waves as an ensemble of many frequencies.

Here, let us try to add more frequencies in the previous scenario:

MATLAB code

clear; close all; clc

fs=1000;    %sampling frequency

t=0:1/fs:0.5-1/fs;%time

f=[1 2 3];  %frequency1

a=[1 2 3];  %amplitude

c=2; %wave speed

T=1./f;  %time period

w=2*pi*f;   %angular frequency

lb=c*T; %wavelength

k=(2*pi)./lb; %wavenumber

x=0:pi/200:(2.5*pi)-pi/200;

figure('Position',[440 378 800 500])

for i=1:length(t)/2

%     y=a*sin(k*x-w*t(i));    %waveform

y=a(1)*sin((k(1)*x)-(w(1)*t(i))+0.3)+a(2)*sin((k(2)*x)-(w(2)*t(i))+0.4)+a(3)*sin((k(3)*x)-(w(3)*t(i))+0.5);

plot(x,y,'--*')

title('Propagation of waves')

xlabel('x')

ylabel('Amplitude')

grid on

pause(0.05)

end

In the above code, we plotted wave containing three frequencies.

We have modelled the wave in which each frequency is travelling with the same velocity. We can add some complexity where every frequency in our wave is travelling with different speed. This is popularly known as dispersion.

Let us model our waves such that the wave speed for 1,2 and 3 Hz is 0.5,1 and 2 km/s.

In this case, we can notice that the waves are travelling in groups and its shape keep changing.

We can use the concept of hilbert transform to model the propagation of the group.

## Modelling Waves (MATLAB)

We can use waves to model almost everything in the world from the thing we can see or touch to the things which we can’t.

Here, we try to model the waves itself.

### Moving Waves

clear; close all; clc

a=1;    %amplitude

f=5;    %frequency

T=1/f;  %time period

w=2*pi*f;   %angular frequency

lb=2*T; %wavelength

k=2*pi/lb; %wavenumber

x=0:pi/200:10*pi;

t=0:0.01:2; %time

figure(1)

for i=1:length(t)

y=a*sin(k*x-w*t(i));    %waveform

plot(x,y)

pause(0.1)

end

### Fourier Transform to analyse the amplitude spectrum

clear; close all; clc

fs=1000;    %sampling frequency

t=0:1/fs:1.5-1/fs;%time

f1=10;  %frequency1

f2=20;  %frequency2

f3=30;  %frequency3

%x=5*sin(2*pi*10*t+3);

x=1*sin(2*pi*f1*t+0.3)+2*sin(2*pi*f2*t+0.2)+3*sin(2*pi*f3*t+0.4);

plot(t,x)

figure(1)

grid on

xlabel('Time')

ylabel('Amplitude')

title('Plot of 2*sin(2*pi*f1*t+0.3)-3*sin(2*pi*f2*t+0.2)+5*cos(2*pi*f3*t+0.4)')

X=fft(x);

fre=fs/length(t);

fre_hz=(0:length(t)/2-1)*fre;

X_mag=abs(X);   %X is complex

figure(2)

plot(fre_hz,X_mag(1:length(t)/2))

grid on

axis([0 40 -inf inf])

xlabel('Frequency (in hz)')

ylabel('Magnitude')

title('Magnitude spectrum of 5*sin(2*pi*10*t+3)')

### Hilbert Transform and get the envelope of the waveform

clear; close all; clc

fs = 1e4;   %sampling frequency

t = 0:1/fs:1;   %time

f1=10;

f2=20;

f3=30;

x=1*sin(2*pi*f1*t+0.3)+2*sin(2*pi*f2*t+0.2)+3*sin(2*pi*f3*t+0.4);

%x=5*sin(2*pi*10*t+3);

y = hilbert(x);

figure(1)

plot(t,real(y),t,imag(y))

% %xlim([0.01 0.03])

legend('real','imaginary')

title('Hilbert Function')

figure(2)

env=abs(y);

plot(t,x)

xlabel('Time')

title('Envelope')

hold on

plot(t,env)

legend('original','envelope')

### Application of Hilbert Transform: Edge Detection and comparison with the classical derivative method

clear; close all; clc

x = 0:0.1:100;

y = channel(x,[10 30 70]);  %channel function to define a trapezoidal channel

plot(x,y)

figure(1)

subplot(311)

plot(x,y)

grid on

title('Channel')

subplot(312)

deriv =diff(y); %dervative of the channel

plot(x(2:end),deriv)

title('Detection by derivative method')

grid on

subplot(313)

hil = hilbert(y);   %hilbert transform of the channel

env=abs(hil);

plot(x,env)

grid on

title('Detection by hilbert transform')

Channel Function

• – Utpal Kumar, IES, Academia Sinica

## Seismic Resolution

Seismic resolution and fidelity are the two important measures of the quality of the seismic record and the seismic images. Seismic resolution quantifies the level of precision, such as the finest size of the subsurface objects detectable by the seismic data whereas the seismic fidelity quantifies the truthfulness such as the genuineness of the data or the level to which the imaged target position matches its true subsurface position.

Let us try to understand this by making a synthetic data and doing the analysis over it.

Seismic Resolution Analysis: 1

Estimation of the Width of the Fresnel’s Zone

The Fresnel’s resolution quantifies the resolvability of seismic wave perpendicular to the direction of wave propagation. Fresnel’s resolution is defined as the width of the first Fresnel’s zone due to interference of the spherical waves from the source and from the receiver.

Seismic Resolution Analysis: 2

Fidelity

To investigate the fidelity of our data, let us consider the technique of resampling. For our case we consider the method of “bootstrapping”. Bootstrapping basically relies on random sampling with replacement. The other popular method for resampling is “jackknifing” which predates “bootstrapping”. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset and calculating the estimate and then finding the average of these calculations.

The principle behind “bootstrapping” is that a dataset is taken, the total dataset is divided into two by randomly sampling with replacement. The newly sampled data are now used to invert for the model using some kernel function. If the two models correlate high enough then we can say that the prominent features in the model come from consistent signals in the data.

We don’t have the data set to make the velocity model. So instead, we can take random Gaussian distribution data and play with it.

Let’s pose the null hypothesis that the two sets of data come from the same probability distribution (not necessarily Gaussian). Under the null hypothesis, the two sets of data are interchangeable, so if we aggregate the data points and randomly divide the data points into two sets, then the results should be comparable to the results obtained with the original data. So, the strategy is to generate random datasets, with replacement (bootstrapping), compute difference in means (or difference in medians or any other reliable statistic), and then compare the resulting values to the statistic computed from the original data.

Seismic Resolution Analysis: 3

## Analytical signal and Hilbert Transform

For a real time series x(t), its analytic signal x(t) is defined as

x(t) = x(t) – iH[x(t)]

Let us consider an example of a monochromatic signal 𝑥(𝑡) = 5 sin(10𝑡 + 3).

Figure 1

Now, let us consider a more complex function

x(t) = 1*sin(2𝜋10𝑡 + 0.3) + 2*sin(2𝜋20𝑡 + 0.2) + 3*sin(2𝜋30𝑡 + 0.4).

Figure 2

We can clearly observe that the Hilbert transform estimates the instantaneous frequency of a signal for monocomponent signals only.

Matlab Codes:

clear; close all; clc

fs = 1e4;

t = 0:1/fs:1;

f1=10;

f2=20;

f3=30;

x=1*sin(2*pi*f1*t+0.3)+2*sin(2*pi*f2*t+0.2)+3*sin(2*pi*f3*t+0.4);

%x=5*sin(2*pi*10*t+3);

y = hilbert(x);

figure(1)

plot(t,real(y),t,imag(y))

%xlim([0.01 0.03])

legend('real','imaginary')

title('Hilbert Function')

figure(2)

subplot(3,1,1)

env=abs(y);

plot(t,x)

xlabel('Time')

title('Envelope')

hold on

plot(t,env)

legend('original','envelope')

subplot(312)

instph=fs/(2*pi)*unwrap(angle(y));

plot(t,instph)

xlabel('Time')

grid on

title('Instantaneous Phase')

subplot(313)

instfreq = fs/(2*pi)*diff(unwrap(angle(y)));

plot(t(2:end),instfreq)

xlabel('Time')

ylabel('Hz')

grid on

title('Instantaneous Frequency')

Application of Hilbert Transform:

Figure 3

## Aliasing and Nyquist Condition

Let us consider a simple sinusoidal signal

x(t) = 5sin(10t + 3)

This is a signal with an amplitude of 5, the angular frequency of 10 and phase of 3. The Nyquist condition demands that the angular frequency, mod(𝜔) ≤ pi/delta t.

For the above figure 1, we took 𝜔 to be 10 and pi/delta t is 99.5, so the Nyquist condition satisfies and we do not see the aliasing. Here the sampling rate, Δ𝑡 is 0.0316.

Let us see what happens when we take the pi/delta t to be less than 𝜔.

In this case, we took the sampling rate, Δ𝑡 to be 0.3307. Here the pi/delta t is 9.5000 which is less than 𝜔. Hence the Nyquist condition is not satisfied and there is a clear aliasing. The sinusoidal signal does not appear to be sinusoidal.

Figures