在噪声的2维数组中进行峰值检测
在噪声的2维数组中进行峰值检测
我想尽可能接近地让 python
返回以下图片中最明显聚类的中心:
在我之前的问题中,我问如何获得二维数组的全局最大值和本地最大值,给出的答案工作得非常好。问题是,我通过对不同bin大小获得的全局最大值取平均得到的中心估计总是稍微偏离我会亲自设定的中心,因为我只考虑了最大的bin</ strong>而不是一组最大的bin</ strong>(就像我们通过眼睛所做的一样)。
我尝试调整对此问题的回答,但结果表明我的图像对该算法来说太嘈杂了。这是实施该答案的代码:
import numpy as np from scipy.ndimage.filters import maximum_filter from scipy.ndimage.morphology import generate_binary_structure, binary_erosion import matplotlib.pyplot as pp from os import getcwd from os.path import join, realpath, dirname # Save path to dir where this code exists. mypath = realpath(join(getcwd(), dirname(__file__))) myfile = 'data_file.dat' x, y = np.loadtxt(join(mypath,myfile), usecols=(1, 2), unpack=True) xmin, xmax = min(x), max(x) ymin, ymax = min(y), max(y) rang = [[xmin, xmax], [ymin, ymax]] paws = [] for d_b in range(25, 110, 25): # Number of bins in x,y given the bin width 'd_b' binsxy = [int((xmax - xmin) / d_b), int((ymax - ymin) / d_b)] H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy) paws.append(H) def detect_peaks(image): """ Takes an image and detect the peaks usingthe local maximum filter. Returns a boolean mask of the peaks (i.e. 1 when the pixel's value is the neighborhood maximum, 0 otherwise) """ # define an 8-connected neighborhood neighborhood = generate_binary_structure(2,2) #apply the local maximum filter; all pixel of maximal value #in their neighborhood are set to 1 local_max = maximum_filter(image, footprint=neighborhood)==image #local_max is a mask that contains the peaks we are #looking for, but also the background. #In order to isolate the peaks we must remove the background from the mask. #we create the mask of the background background = (image==0) #a little technicality: we must erode the background in order to #successfully subtract it form local_max, otherwise a line will #appear along the background border (artifact of the local maximum filter) eroded_background = binary_erosion(background, structure=neighborhood, border_value=1) #we obtain the final mask, containing only peaks, #by removing the background from the local_max mask detected_peaks = local_max - eroded_background return detected_peaks #applying the detection and plotting results for i, paw in enumerate(paws): detected_peaks = detect_peaks(paw) pp.subplot(4,2,(2*i+1)) pp.imshow(paw) pp.subplot(4,2,(2*i+2) ) pp.imshow(detected_peaks) pp.show()
这是该算法的结果情况(变化bin大小):
显然,我的背景对于该算法来说太嘈杂了,因此问题是: 如何使该算法对噪声不那么敏感?如果存在替代解决方案,请让我知道。
编辑
按照 Bi Rico 的建议,我尝试在将其传递给本地最大值查找器之前对我的二维数组进行平滑处理,如下所示:
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy) H1 = gaussian_filter(H, 2, mode='nearest') paws.append(H1)
这是sigma
为2、4和8时的结果:
编辑2
mode =\'constant\'
比nearest
要好得多。当最大箱子大小为sigma=2
时,它会收敛于正确的中心位置:
那么,我如何获得出现在上述最后一幅图中的最大值的坐标?
我在Stack Overflow上还是个n00b,无法对Alejandro在其他地方的答案进行评论。我会对他的代码进行一些改进,以使用预分配的numpy数组作为输出:
def get_max(image,sigma,alpha=3,size=10): from copy import deepcopy import numpy as np # preallocate a lot of peak storage k_arr = np.zeros((10000,2)) image_temp = deepcopy(image) peak_ct=0 while True: k = np.argmax(image_temp) j,i = np.unravel_index(k, image_temp.shape) if(image_temp[j,i] >= alpha*sigma): k_arr[peak_ct]=[j,i] # this is the part that masks already-found peaks. x = np.arange(i-size, i+size) y = np.arange(j-size, j+size) xv,yv = np.meshgrid(x,y) # the clip here handles edge cases where the peak is near the # image edge image_temp[yv.clip(0,image_temp.shape[0]-1), xv.clip(0,image_temp.shape[1]-1) ] = 0 peak_ct+=1 else: break # trim the output for only what we've actually found return k_arr[:peak_ct]
通过使用他的示例图像对这个代码和Alejandro的代码进行剖析,发现这个代码要快大约33%(Alejandro的代码为0.03秒,我的代码为0.02秒)。我预计在峰值更多的图像上,它的速度会更快-将输出附加到列表中对于更多峰值来说会变得越来越慢。
回答您问题的最后部分,当您在一幅图像中有点时,可以通过搜索图像的局部极大值来找到它们的坐标。在情况是您的数据不是点源时,您可以对每个峰应用掩膜,以避免峰周围在未来的搜索中成为极大值。 我提出以下代码:\n
import matplotlib.image as mpimg import matplotlib.pyplot as plt import numpy as np import copy def get_std(image): return np.std(image) def get_max(image,sigma,alpha=20,size=10): i_out = [] j_out = [] image_temp = copy.deepcopy(image) while True: k = np.argmax(image_temp) j,i = np.unravel_index(k, image_temp.shape) if(image_temp[j,i] >= alpha*sigma): i_out.append(i) j_out.append(j) x = np.arange(i-size, i+size) y = np.arange(j-size, j+size) xv,yv = np.meshgrid(x,y) image_temp[yv.clip(0,image_temp.shape[0]-1), xv.clip(0,image_temp.shape[1]-1) ] = 0 print xv else: break return i_out,j_out #reading the image image = mpimg.imread('ggd4.jpg') #computing the standard deviation of the image sigma = get_std(image) #getting the peaks i,j = get_max(image[:,:,0],sigma, alpha=10, size=10) #let's see the results plt.imshow(image, origin='lower') plt.plot(i,j,'ro', markersize=10, alpha=0.5) plt.show()
\n测试用的图像ggd4可以从以下网址下载:\nhttp://www.ipac.caltech.edu/2mass/gallery/spr99/ggd4.jpg\n第一步是获取有关图像中噪声的一些信息。我通过计算整个图像(实际上最好选择没有信号的小矩形)的标准偏差来实现这一点。这告诉我们图像中存在多少噪声。\n获取波峰的想法是要求连续的峰值,这些峰值高于某个阈值(假设为噪声的3、4、5、10或20倍)。这就是函数get_max实际在做的事情。它一直搜索最大值,直到其中一个低于噪声所强加的阈值为止。为了避免多次找到同一峰值,有必要将峰值从图像中移除。通常,为此所使用的掩膜形状强烈依赖于要解决的问题。对于星星的情况,最好使用高斯函数或类似的内容来去掉星星。 为了简单起见,我选择了一个方函数,函数的大小(以像素为单位)为变量“size”。\n我认为任何人都可以通过添加更一般的内容来改进代码,从这个例子中学习到各种东西。\n编辑:\n原始图像如下:\n\n在识别出光亮点后,图像如下:\n