Reading a CSV file into a grid, using columns 1 and 2 as coordinates and column 3 as the value

Question

14 views · February 16, 2023

Anonymous · February 16, 2023

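Hi everyone, I'm still quite new to Python and learning as I go. I'm trying to read a CSV file with three columns: the first two are coordinates and the third is a value. Below is a sample of the file's contents.

I need to read it in the following layout:

(322000.235 582999.865 149.309 ) (322000.485 582999.865 149.249 ) (322000.735 582999.865 149.193 ) (322000.985 582999.865 149.156 )
(322000.235 582999.615 149.29 ) (322000.485 582999.615 149.217 ) (322000.735 582999.615 149.159 ) (322000.985 582999.615 149.128 )
(322000.235 582999.365 149.276 ) (322000.485 582999.365 149.224 ) (322000.735 582999.365 149.179 ) (322000.985 582999.365 149.16 )
...

I wrote some code that compares each value in the third column with the adjacent values using .shift(-1) and .shift(1). This does the job, but it returns a lot of unwanted data. What I actually want is to compare each value not just with the adjacent rows but with the neighbours that surround it in the grid, which in most cases means 4 checks. In the linked example, the red value is the one to be compared with all of the adjacent blue markers.

Here is my code so far. I hope this is clear enough; I don't know whether it can be modified or whether I should start over. I hope someone can help.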

from __future__ import print_function
import pandas as pd
import os
import re
Dir = os.getcwd()
Blks = []
CSV = []
for f in os.listdir(Dir):
    if re.search('.txt', f):
        Blks = [each for each in os.listdir(Dir) if each.endswith('.txt')]
print(Blks)
for f in os.listdir(Dir):
    if re.search('.csv', f):
        CSV = [each for each in os.listdir(Dir) if each.endswith('.csv')]
print(CSV)
limit = 3
tries = 0
while True:
    print("----------------------------------------------------")
    spikewell = float(raw_input("Please Enter Parameters: "))
    tries += 1
    
    if tries == 4:
        print("----------------------------------------------------")
        print("Entered incorrectly too many times.....Exiting")
        print("----------------------------------------------------")
        break
    else:
        if spikewell > 50:
            print("Parameters past limit (20)")
            print("----------------------------------------------------")
            print(tries)
            continue
        elif spikewell < 0:
            print("Parameters can't be negative")
            print("----------------------------------------------------")
            print(tries)
            continue
        else:
            spikewell
            print("Parameters are set")
            print(spikewell)
            print("Searching files")
            print("----------------------------------------------------")
    
    for z in Blks:
        df = pd.read_csv(z, sep=r'\s+', names=['X', 'Y', 'Z'])
        z = sum(df['Z'])
        average = z / len(df['Z'])
    
    for terrain in Blks:
        for df in terrain:
            df = pd.read_csv(terrain, sep=r'\s+', names=['X', 'Y', 'Z'])
            spike_zleft = df['Z'] - df['Z'].shift(1)
            spike_zright = df['Z'] - df['Z'].shift(-1)
            wzdown = -(df['Z'] - df['Z'].shift(-1))
            wzup_abs = abs(df['Z'] - df['Z'].shift(1))
            wzdown_abs = abs(wzdown)
            spikecsv = ('spikes.csv')
            wellcsv = ('wells.csv')
            spikes_search = df.loc[(spike_zleft > spikewell) & (spike_zright > spikewell)]
            
            with open(spikecsv, 'a') as f:
                spikes_search[['X', 'Y', 'Z']].to_csv(f, sep='\t', index=False)
            
            well_search = df.loc[(wzup_abs > spikewell) & (wzdown > spikewell)]
            
            with open(wellcsv, 'a') as f:
                well_search[['X', 'Y', 'Z']].to_csv(f, sep='\t', index=False)
            
            print("----------------------------------------------------")
            print('Search completed')
            
            if len(spikes_search) == 0:
                print("0 SPIKES FOUND")
            elif len(spikes_search) > 0:
                print(terrain)
                print(str(len(spikes_search)) + " SPIKES FOUND")
            
            if len(well_search) == 0:
                print("0 WELLS FOUND")
            elif len(well_search) > 0:
                print(str(len(well_search)) + " WELLS FOUND")
            
            break
        break

1 Answer

Anonymous · Answer 1 · 2023-08-03T16:17:15+00:00

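Cause of the problem:

1. The script provided does not make clear what operations are being performed on the data, which makes it hard to follow.

2. The csv module is not used to read the CSV file, which makes the code verbose and error-prone.

Solution:

1. Use the csv module to read the CSV file, for example: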

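import csv

# 'FILENAME' is a placeholder for the path to your CSV file.
with open('FILENAME', 'r') as f:
    data = []
    # csv.reader splits on commas by default; pass delimiter=... if the
    # file uses a different separator.
    readr = csv.reader(f)
    for line in readr:
        # Convert every field of the row to float: [x, y, value].
        data.append([float(i) for i in line])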

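2. For numerical work, the numpy module is recommended. It already provides a lot of functionality and may well have a function that fits your needs. See the official NumPy documentation (http://www.numpy.org/).

3. Once the data is in a numpy array, you can draw on existing solutions to the same kind of problem. For finding local extrema, for example, see the following links: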

- [Find all local Maxima and Minima when x and y values are given as numpy arrays](https://stackoverflow.com/questions/31070563)

- [Get coordinates of local maxima in 2D array above certain value](https://stackoverflow.com/questions/9111711)
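
To connect these suggestions to the layout asked for in the question, here is a minimal sketch of how (x, y, value) rows could be arranged into a 2D numpy array. The function name `xyz_to_grid` is made up for this illustration, and the sketch assumes the points lie on a regular grid whose coordinates repeat exactly; if they differ by floating-point noise, they would need rounding first.

```python
import numpy as np


def xyz_to_grid(points):
    """Arrange (x, y, z) rows into a 2D array of z values.

    Rows run from the largest y down to the smallest and columns from the
    smallest x up to the largest, matching the layout in the question.
    Cells without a sample stay NaN.
    """
    points = np.asarray(points, dtype=float)
    # np.unique returns the sorted distinct coordinates plus, for every
    # point, the index of its coordinate in that sorted list.
    xs, col = np.unique(points[:, 0], return_inverse=True)
    ys, row = np.unique(points[:, 1], return_inverse=True)
    grid = np.full((ys.size, xs.size), np.nan)
    # The sort is ascending, so flip the row index to put the largest y on top.
    grid[ys.size - 1 - row, col] = points[:, 2]
    return xs, ys[::-1], grid


# A few values copied from the sample in the question.
sample = [
    [322000.235, 582999.865, 149.309],
    [322000.485, 582999.865, 149.249],
    [322000.235, 582999.615, 149.290],
    [322000.485, 582999.615, 149.217],
]
xs, ys, grid = xyz_to_grid(sample)
print(grid)
```

The returned `xs` and `ys` give the coordinate of each column and row, so any cell found later in `grid` can be mapped back to its original X and Y.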

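Finally, it was noted that the data is terrain data, related to GIS and coordinate systems, and that the goal is to find spikes more than 5 meters high. The original poster apologized for the overall state of the code and said he had taken note of the code in the links provided. He will look through the attached links and try to solve the problem.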
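
The 4 checks described in the question, combined with a 5 meter threshold, could look roughly like the sketch below. It assumes the values have already been arranged as a 2D numpy array; the function name `find_spikes_and_wells` and the small test grid are invented for illustration, and this is not taken from the linked answers.

```python
import numpy as np


def find_spikes_and_wells(grid, threshold):
    """Flag cells that differ from every existing 4-neighbour
    (up, down, left, right) by more than `threshold`."""
    grid = np.asarray(grid, dtype=float)
    # Pad with NaN so edge cells simply have fewer neighbours to check.
    padded = np.pad(grid, 1, mode="constant", constant_values=np.nan)
    neighbours = np.stack([
        padded[:-2, 1:-1],   # up
        padded[2:, 1:-1],    # down
        padded[1:-1, :-2],   # left
        padded[1:-1, 2:],    # right
    ])
    with np.errstate(invalid="ignore"):
        diff = grid - neighbours          # positive where the cell is higher
        missing = np.isnan(diff)          # padding or gaps in the data
        has_neighbour = ~missing.all(axis=0)
        # A spike is higher than all of its existing neighbours by more
        # than the threshold; a well is lower by more than the threshold.
        spikes = has_neighbour & ((diff > threshold) | missing).all(axis=0)
        wells = has_neighbour & ((diff < -threshold) | missing).all(axis=0)
    return spikes, wells


# Made-up heights with one obvious spike in the middle.
grid = np.array([
    [149.3, 149.2, 149.2],
    [149.3, 156.0, 149.1],
    [149.3, 149.2, 149.2],
])
spikes, wells = find_spikes_and_wells(grid, threshold=5.0)
print(np.argwhere(spikes))   # row/column indices of the spikes
print(np.argwhere(wells))    # row/column indices of the wells
```

Unlike `.shift(1)` and `.shift(-1)` on a flat column, which only look at the previous and next rows of the file, this compares each cell with the cells directly above, below, left, and right of it in the grid.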