本文介绍了Python高斯拟合模拟高斯噪声数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要使用高斯拟合对来自仪器的数据进行插值。为此,我想到了使用 scipy 中的 curve_fit 函数。
因为我想先在假数据上测试此功能,然后再在仪器上尝试使用,所以我编写了以下代码来生成嘈杂的高斯数据并将其拟合:

  from scipy.optimize import curve_fit 
import numpy
import pylab

#创建一个高斯函数

def gaussian(x,a,b,c):
val = a * numpy.exp(-(x-b)** 2 / /(2 * c ** 2))
return val

#生成伪造数据。
zMinEntry = 80.0 * 1E-06
zMaxEntry = 180.0 * 1E-06
zStepEntry = 0.2 * 1E-06

x = numpy.arange(zMinEntry,
zMaxEntry,
zStepEntry,
dtype = numpy.float64)

n = len(x)

meanY = zMinEntry +(zMaxEntry-zMinEntry) / 2
sigmaY = 10.0E-06

a = 1.0 /(sigmaY * numpy.sqrt(2 * numpy.pi))
y = gaussian(x,a,meanY, sigmaY)+ a * 0.1 * numpy.random.normal(0,1,size = len(x))


#拟合
popt,pcov = curve_fit(高斯, x,y)

#打印结果
print( Scale =%.3f +/-%.3f%(popt [0],numpy.sqrt(pcov [0,0 ])))
print( Offset =%.3f +/-%.3f%(popt [1],numpy.sqrt(pcov [1,1])))
print( Sigma =%.3f +/-%.3f%(popt [2],numpy.sqrt(pcov [2,2])))


pylab.plot(x, y,'ro')
pylab.plot(x,高斯(x,popt [0],popt [1],popt [2]))
pylab.grid(True)
pylab.show()

不幸的是t他无法正常工作,代码输出如下:

 比例= 6174.816 +/- 7114424813.672 
偏移= 429.319 +/- 3919751917.830
西格玛= 1602.869 +/- 17923909301.176

结果是(蓝色是拟合函数,红色点是嘈杂的输入数据):





我还尝试查看


I need to interpolate data coming from an instrument using a gaussian fit. To this end I thought about using the curve_fit function from scipy.Since I'd like to test this functionality on fake data before trying it on the instrument I wrote the following code to generate noisy gaussian data and to fit it:

from scipy.optimize import curve_fit
import numpy
import pylab

# Create a gaussian function

def gaussian(x, a, b, c):
    val = a * numpy.exp(-(x - b)**2 / (2*c**2))
    return val

# Generate fake data.
zMinEntry = 80.0*1E-06
zMaxEntry = 180.0*1E-06
zStepEntry = 0.2*1E-06

x = numpy.arange(zMinEntry,
                 zMaxEntry,
                 zStepEntry,
                 dtype = numpy.float64)

n = len(x)

meanY = zMinEntry + (zMaxEntry - zMinEntry)/2
sigmaY = 10.0E-06

a = 1.0/(sigmaY*numpy.sqrt(2*numpy.pi))
y = gaussian(x, a, meanY, sigmaY) + a*0.1*numpy.random.normal(0, 1, size=len(x))


# Fit
popt, pcov = curve_fit(gaussian, x, y)

# Print results
print("Scale =  %.3f +/- %.3f" % (popt[0], numpy.sqrt(pcov[0, 0])))
print("Offset = %.3f +/- %.3f" % (popt[1], numpy.sqrt(pcov[1, 1])))
print("Sigma =  %.3f +/- %.3f" % (popt[2], numpy.sqrt(pcov[2, 2])))


pylab.plot(x, y, 'ro')
pylab.plot(x, gaussian(x, popt[0], popt[1], popt[2]))
pylab.grid(True)
pylab.show()

Unfortunately this does not work properly, the output of the code is the following:

Scale =  6174.816 +/- 7114424813.672
Offset = 429.319 +/- 3919751917.830
Sigma =  1602.869 +/- 17923909301.176

And the plotted result is (blue is the fit function, red dots is the noisy input data):

I also tried to look at this answer, but couldn't figure out where my problem is.Am I missing something here? Or am I using the curve_fit function in the wrong way? Thanks in advance!

解决方案

I agree with Olaf in so far as it is a question of scale. The optimal parameters differ by many orders of magnitude. However, scaling the parameters with which you generated your toy data does not seem to solve the problem for your actual application. curve_fit uses lestsq, which numerically approximates the Jacobian, where numerical problems arise because of the differences in scale (try to use the full_output keyword in curve_fit).

In my experience it is often best to use fmin which does not rely on approximated derivatives but uses only function values. You now have to write your own least-squares function that is to be optimized.

Starting values are still important. In your case you can make sufficiently good guesses by taking the maximum amplitude for a and the corresponding x-values for band c.

In code, it looks like this:

from scipy.optimize import curve_fit,fmin
import numpy
import pylab

# Create a gaussian function

def gaussian(x, a, b, c):
    val = a * numpy.exp(-(x - b)**2 / (2*c**2))
    return val

# Generate fake data.
zMinEntry = 80.0*1E-06
zMaxEntry = 180.0*1E-06
zStepEntry = 0.2*1E-06

x = numpy.arange(zMinEntry,
                 zMaxEntry,
                 zStepEntry,
                 dtype = numpy.float64)

n = len(x)

meanY = zMinEntry + (zMaxEntry - zMinEntry)/2
sigmaY = 10.0E-06

a = 1.0/(sigmaY*numpy.sqrt(2*numpy.pi))
y = gaussian(x, a, meanY, sigmaY) + a*0.1*numpy.random.normal(0, 1, size=len(x))

print a, meanY, sigmaY
# estimate starting values from the data
a = y.max()
b = x[numpy.argmax(a)]
c = b

# define a least squares function to optimize
def minfunc(params):
    return sum((y-gaussian(x,params[0],params[1],params[2]))**2)

# fit
popt = fmin(minfunc,[a,b,c])

# Print results
print("Scale =  %.3f" % (popt[0]))
print("Offset = %.3f" % (popt[1]))
print("Sigma =  %.3f" % (popt[2]))


pylab.plot(x, y, 'ro')
pylab.plot(x, gaussian(x, popt[0], popt[1], popt[2]),lw = 2)
pylab.xlim(x.min(),x.max())
pylab.grid(True)
pylab.show()

这篇关于Python高斯拟合模拟高斯噪声数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-10 22:21