Which FFT descriptors should be used as features to implement classification or clustering algorithms?


Question

But I'm not sure which descriptor to use as a frequency-domain feature, since a signal has an amplitude spectrum, a power spectrum and a phase spectrum. I've read some references but am still confused about their significance. And what distance (similarity) function should be used as the measure when running learning algorithms on frequency-domain feature vectors (Euclidean distance? Cosine distance? Gaussian function? Chi-kernel? Something else)?

Edit

import numpy as np
import matplotlib.pyplot as plt
# `signal` is the asker's 1-D histogram data (not shown here)
sp = np.fft.fft(signal)
freq = np.fft.fftfreq(signal.shape[-1], d=1.)  # time slot of the histogram is 1 hour
plt.plot(freq, np.log10(np.abs(sp) ** 2))  # log power spectrum
plt.show()

And I have several trivial questions to ask, to make sure I fully understand your suggestion:

In your second suggestion you say "ignore all these values."

"You may search for the two, three largest peaks and use their location and probably widths as 'Features' for further classification."

Edit

I replaced np.fft.fft with np.fft.rfft to calculate only the positive-frequency part, and plotted both the power spectrum and the log power spectrum.

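# plt.subplots (plural) returns the figure and an array of axes
f, axarr = plt.subplots(2, sharex=True)
axarr[0].plot(freq, np.abs(sp) ** 2)            # power spectrum
axarr[1].plot(freq, np.log10(np.abs(sp) ** 2))  # log power spectrum
plt.show()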

[Figure: power spectrum (top) and log power spectrum (bottom)]

Please correct me if I'm wrong:

I also see some references that suggest applying a window function (e.g. a Hamming window) before doing the FFT to avoid spectral leakage. My raw data is sampled every 5 to 15 seconds and I've applied a histogram over the sampling time. Is that equivalent to applying a window function, or do I still need to apply one to the histogram data?

Recommended answer

Generally you should extract just a small number of "features" out of the complete FFT spectrum.

First: use the log power spectrum. Complex numbers and phase are useless in these circumstances, because they depend on where you start/stop your data acquisition (among many other things).
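
A minimal sketch of why phase is a poor feature here. It assumes a synthetic sine that fits the record exactly, and it uses a circular shift (np.roll) as a stand-in for starting the acquisition 13 samples later; both choices are illustrative assumptions, not part of the original data.

```python
import numpy as np

n = 256
t = np.arange(n)
x = np.sin(2 * np.pi * 8 * t / n)   # 8 full cycles in the record
x_shifted = np.roll(x, 13)          # same signal, "started" 13 samples later

X = np.fft.rfft(x)
Xs = np.fft.rfft(x_shifted)

# The power spectrum is unchanged by the shift ...
power_same = np.allclose(np.abs(X) ** 2, np.abs(Xs) ** 2)
# ... but the phase at the peak (bin 8) is not.
phase_at_peak = np.angle(X[8])
phase_at_peak_shifted = np.angle(Xs[8])
print(power_same, phase_at_peak, phase_at_peak_shifted)
```

The power values agree to rounding error, while the two phase values differ, which is why the power spectrum is the safer descriptor when the start time is arbitrary.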

Second: you will see a "noise level", i.e. most values are below a certain threshold. Ignore all these values.
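
One possible way to make "ignore everything below the noise level" concrete is sketched below. The synthetic test signal, the use of the median as the noise-floor estimate and the 20 dB margin are all assumptions chosen for illustration; a suitable threshold depends on your data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
# Synthetic stand-in for real data: one periodic component plus noise.
signal = np.sin(2 * np.pi * 40 * t / n) + 0.1 * rng.standard_normal(n)

power = np.abs(np.fft.rfft(signal)) ** 2
log_power = 10 * np.log10(power + 1e-12)   # dB; small offset avoids log(0)

noise_floor_db = np.median(log_power)      # assumption: most bins contain only noise
threshold_db = noise_floor_db + 20.0       # assumption: 20 dB safety margin
significant_bins = np.flatnonzero(log_power > threshold_db)
print(significant_bins)
```

Only the bins in significant_bins would then be carried forward as candidates for features.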

Third: if you are lucky, i.e. your data has some harmonic content (cycles, repetitions), you will see a few prominent peaks.

If there are clear peaks, it is even easier to detect the noise: everything between the peaks should be considered noise.

Now you may search for the two or three largest peaks and use their locations and possibly their widths as "features" for further classification.

Location is the x-value of the peak, i.e. the "frequency". It tells you how "fast" the cycles in your input data are.

If your cycles don't have a constant frequency during the measuring interval (or you use a window before calculating the FFT), the peak will be broader than one bin. So the width of the peak says something about the "stability" of your cycles.
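
The peak search described above could be sketched roughly as follows, using NumPy only. The helper name peak_features, the default of two peaks and the 10 dB drop used to measure width are hypothetical choices for this illustration, and the test signal is synthetic.

```python
import numpy as np

def peak_features(log_power_db, freqs, n_peaks=2, drop_db=10.0):
    """Return (frequency, width_in_bins) for the n_peaks strongest local maxima."""
    p = np.asarray(log_power_db)
    # Interior local maxima: higher than the left neighbour, not lower than the right.
    is_peak = (p[1:-1] > p[:-2]) & (p[1:-1] >= p[2:])
    idx = np.flatnonzero(is_peak) + 1
    idx = idx[np.argsort(p[idx])[::-1][:n_peaks]]   # strongest first
    features = []
    for i in idx:
        level = p[i] - drop_db
        lo = i
        while lo > 0 and p[lo - 1] > level:          # walk left while still within drop_db
            lo -= 1
        hi = i
        while hi < len(p) - 1 and p[hi + 1] > level:  # walk right likewise
            hi += 1
        features.append((float(freqs[i]), int(hi - lo + 1)))
    return features

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
x = (np.sin(2 * np.pi * 40 * t / n)
     + 0.5 * np.sin(2 * np.pi * 100 * t / n)
     + 0.05 * rng.standard_normal(n))
log_power = 10 * np.log10(np.abs(np.fft.rfft(x)) ** 2 + 1e-12)
freqs = np.fft.rfftfreq(n, d=1.0)
features = peak_features(log_power, freqs)
print(features)
```

Because both synthetic components keep a constant frequency and fit the record exactly, each peak comes out one bin wide; drifting cycles or a window would widen them.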

Based on this: two patterns are similar if the biggest peaks of both have a similar frequency and a similar width, and so on.
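
As a rough illustration of that notion of similarity, one option is a scaled Euclidean distance on (peak frequency, peak width) vectors. The function name, the feature values and the scale factors below are hypothetical; the answer does not prescribe a particular distance.

```python
import numpy as np

def feature_distance(a, b, scale):
    """Euclidean distance after dividing each feature by a typical scale."""
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.linalg.norm((a - b) / scale))

# Hypothetical feature vectors: (frequency of biggest peak, width in bins).
pattern_a = [0.040, 1]
pattern_b = [0.041, 1]
pattern_c = [0.120, 3]
scale = [0.01, 1]   # assumption: 0.01 in frequency matters about as much as 1 bin of width

d_ab = feature_distance(pattern_a, pattern_b, scale)
d_ac = feature_distance(pattern_a, pattern_c, scale)
print(d_ab, d_ac)
```

Scaling first keeps the frequency (small numbers) from being swamped by the width (bin counts); without it the distance would depend almost entirely on width.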

Edit

Very interesting to see a logarithmic power spectrum of one of your examples.

Now it's clear that your input contains a single harmonic (periodic, oscillating) component with a frequency (repetition rate, cycle duration) of about f0 = 0.04. (This is a relative frequency, expressed as a fraction of your sampling frequency, which is the inverse of the time between individual measurement points.)
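
To turn the relative frequency into a physical period, divide the sample spacing by f0. The sketch below takes the spacing of 1 hour from the comment in the question's code; the variable names are only illustrative.

```python
f0 = 0.04                     # relative frequency, in cycles per sample
sample_interval_hours = 1.0   # from the question: one histogram slot is 1 hour

samples_per_cycle = 1.0 / f0
period_hours = sample_interval_hours / f0
print(samples_per_cycle, period_hours)
```

So a peak at f0 = 0.04 corresponds to a cycle that repeats about every 25 samples, or roughly every 25 hours under that assumption.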

It is not a pure sine wave, but some "interesting" waveform. Such waveforms produce peaks at 1*f0, 2*f0, 3*f0 and so on. (So using an FFT for further analysis turns out to be a very good idea.)
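
The harmonic pattern can be reproduced with any periodic, non-sinusoidal waveform. The sketch below uses a sawtooth with a period of 25 samples (so f0 = 0.04) purely as an assumed stand-in for the real data.

```python
import numpy as np

n, period = 1000, 25              # period of 25 samples -> f0 = 1/25 = 0.04
t = np.arange(n)
x = (t % period) / period         # sawtooth: periodic but not sinusoidal
x = x - x.mean()                  # remove the DC offset

power = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)
strongest = np.sort(np.argsort(power)[-3:])   # indices of the three strongest lines
print(freqs[strongest])
```

The three strongest lines fall at 0.04, 0.08 and 0.12, i.e. at 1*f0, 2*f0 and 3*f0.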

At this point you should produce spectra of several measurements and see what makes measurements similar and how different measurements differ. What are the "important" features that distinguish your measurements? Things to look out for:

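Absolute amplitude: height of the dominant (leftmost, highest) peaks.
Pitch (main cycle rate, speed of changes): this is the position of the first peak, or the distance between consecutive peaks.
Exact waveform: relative amplitude of the first few peaks.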
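
The three items above could be collected into one feature record along these lines. The function name spectral_features and the dictionary keys are hypothetical, and the sketch assumes that the strongest non-DC line is the fundamental, which will not hold for every signal.

```python
import numpy as np

def spectral_features(x, n_harmonics=3):
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    k0 = int(np.argmax(amp[1:])) + 1   # dominant peak, skipping the DC bin
    harmonics = [k0 * m for m in range(1, n_harmonics + 1) if k0 * m < len(amp)]
    return {
        "amplitude": float(amp[k0]),                      # absolute amplitude (unscaled FFT units)
        "pitch": float(freqs[k0]),                        # position of the first peak
        "waveform": (amp[harmonics] / amp[k0]).tolist(),  # relative harmonic amplitudes
    }

# Synthetic sawtooth with a period of 25 samples as example input.
features = spectral_features((np.arange(1000) % 25) / 25.0)
print(features)
```

Comparing the "waveform" entries between measurements then compares their shape independently of overall level.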

If your most important feature is absolute amplitude, you're better off calculating the RMS (root mean square) level of your input signal.
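
A minimal sketch of the RMS level, computed directly in the time domain; the sine used to exercise it is an assumed example.

```python
import numpy as np

def rms(x):
    """Root mean square level of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

t = np.arange(1000)
level = rms(2.0 * np.sin(2 * np.pi * 40 * t / 1000))   # sine with amplitude 2
print(level)
```

For a sine over a whole number of cycles this equals the amplitude divided by the square root of 2.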

If pitch is important, you're better off calculating the ACF (auto-correlation function) of your input signal.
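
One simple way to read a pitch off the ACF is to take the lag of its highest peak after the initial lobe around lag 0. The helper acf_period and the rule of skipping ahead to the first negative value are assumptions for this sketch; more robust pitch estimators exist.

```python
import numpy as np

def acf_period(x):
    """Estimate the cycle length in samples from the autocorrelation function."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0 .. n-1
    acf = acf / acf[0]
    negative = np.flatnonzero(acf < 0)   # skip the lobe around lag 0
    if negative.size == 0:
        return None                      # no clear periodicity found
    start = negative[0]
    return int(start + np.argmax(acf[start:]))

# Synthetic sawtooth with a period of 25 samples as example input.
period = acf_period((np.arange(1000) % 25) / 25.0)
print(period)
```

The pitch as a relative frequency is then 1 / period, which for this input matches the f0 = 0.04 seen in the spectrum.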

Don't focus on the rightmost peaks; these come from the high-frequency components in your input and tend to vary as much as the noise floor.

Windows

For a high-quality analysis it is important to apply a window to the input data before applying the FFT. This reduces the influence of the "jump" between the end of your input vector and the beginning of your input vector, because the FFT treats the input as a single cycle of a periodic signal.

There are several popular windows, which represent different choices in an unavoidable trade-off: precision of a single peak vs. level of side-lobes:

You chose a "rectangular window" (equivalent to no window at all, just start/stop your measurement). This gives excellent precision for your peaks, which now have a width of just one sample. Your side-lobes (the small peaks left and right of your main peaks) are at -21 dB, very tolerable given your input data. In your case this is an excellent choice.

A Hanning window is a single cosine wave. It makes your peaks slightly broader but reduces side-lobe levels.

The Hamming window (a cosine wave, slightly raised above 0.0) also broadens the peaks, but suppresses side-lobes to about -42 dB. This is a good choice if you expect further weak (but important) components between your main peaks, or generally if you have complicated signals like speech, music and so on.
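
The trade-off can be seen on a synthetic sine using NumPy's np.hanning and np.hamming. The -10 dB level used to count peak width and the 20-bin distance used to measure leakage are arbitrary assumptions for this sketch.

```python
import numpy as np

n = 1024
t = np.arange(n)
windows = {
    "rectangular": np.ones(n),
    "hann": np.hanning(n),
    "hamming": np.hamming(n),
}

def spectrum_db(x, w):
    p = np.abs(np.fft.rfft(x * w)) ** 2
    return 10 * np.log10(p / p.max() + 1e-30)   # dB relative to the strongest line

on_bin = np.sin(2 * np.pi * 40.0 * t / n)    # exactly 40 cycles in the record
off_bin = np.sin(2 * np.pi * 40.5 * t / n)   # 40.5 cycles: the end does not meet the start

results = {}
for name, w in windows.items():
    width = int(np.sum(spectrum_db(on_bin, w) > -10.0))   # bins within 10 dB of the peak
    far = spectrum_db(off_bin, w)[61:]                    # bins at least 20 away from the peak
    results[name] = (width, float(far.max()))
    print(f"{name:12s} width={width} bins, far leakage={far.max():.1f} dB")
```

The rectangular window keeps the on-bin peak one bin wide but leaks the most when the cycle does not fit the record; both cosine windows widen the peak and push the distant leakage down.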

Scaling

Correct scaling of a spectrum is a complicated thing, because the values of the FFT lines depend on many things like sampling rate, length of the FFT, window, and even implementation details of the FFT algorithm (there exist several different accepted conventions).

After all, the FFT should reflect the underlying conservation of energy: the RMS of the input signal should be the same as the RMS (energy) of the spectrum.
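
With NumPy's default (unscaled) forward transform, this energy relation (Parseval's theorem) holds once the spectrum is divided by the FFT length. The random test signal below is only an assumed example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
X = np.fft.fft(x)   # NumPy's default: no scaling on the forward transform

rms_time = np.sqrt(np.mean(x ** 2))
rms_freq = np.sqrt(np.sum(np.abs(X) ** 2)) / len(x)   # divide by N for this convention
print(rms_time, rms_freq)
```

Other libraries or the norm argument of np.fft.fft place the scaling factor differently, which is one reason absolute spectrum values are hard to compare across tools.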

On the other hand: if used for classification, it is enough to maintain relative amplitudes. As long as the parameters mentioned above do not change, the result can be used for classification without further scaling.
