Using strings and "for" statements to condense repetitive code - Python
I'm very new to "for" statements in Python, and I can't get something that I think should be simple to work. My code is:
    import pandas as pd

    df1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
    df2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
    df3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
    DF1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
    DF2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
    DF3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
Then:
    A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
    Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
    A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])
    Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])
    A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
    Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])
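As you can see, this is a lot of repeated code in which only the identifying number differs. So my first attempt at using a "for" statement was: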
    Numbers = [1,2,3]
    for i in Numbers:
        "A" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
        "Z" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])
This resulted in SyntaxError: "can't assign to operator". So I tried:
    Numbers = [1,2,3]
    for i in Numbers:
        A = "A" + str(i)
        Z = "Z" + str(i)
        A = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
        Z = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])
This resulted in AttributeError: 'str' object has no attribute 'loc'. I tried a few other approaches, such as:
    Numbers = [1,2,3]
    for i in Numbers:
        A = "A" + str(i)
        Z = "Z" + str(i)
        df = "df" + str(i)
        DF = "DF" + str(i)
        A = len(df.loc[df['Column1'] <= DF['Column1'].iloc[2]])
        Z = len(df.loc[df['Column1'] >= DF['Column1'].iloc[3]])
But this only gave me the same error. Ultimately, what I want is something like this:
    Numbers = [1,2,3]
    for i in Numbers:
        Ai = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[2]])
        Zi = len(dfi.loc[dfi['Column1'] >= DFi['Column1'].iloc[3]])
where the output is equivalent to what I typed out by hand:
    A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
    Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
    A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])
    Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])
    A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
    Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])
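In the code above, the same block is repeated several times, each copy performing the same operation for a different data set (a different animal, in the code that follows). Such repetition makes the code redundant and hard to maintain.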
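To solve this, the repeated blocks can be replaced with a single "for" loop. Storing the animal identifiers and their conditions as lists of strings makes both easy to manage. The refactored code is: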
    import pandas as pd
    import pybursts

    # Change/Add animals and conditions here, make sure they match up directly
    Animal = ['26', '45', '46', '47', '51', '58', '64', '65', '69', '72', '84']
    Cond = ['Stomach', 'Intestine', 'Stomach', 'Stomach', 'Intestine', 'Intestine',
            'Intestine', 'Stomach', 'Cut', 'Cut', 'Cut']
    d = []

    def CuSO4():
        for i in range(len(Animal)):
            animal = Animal[i]
            condition = Cond[i]
            # Load in spike data
            A = pd.read_csv('TXT/INJ/' + animal + '.txt', delimiter=r"\s+", skiprows=15, header=None, usecols=range(1))
            B = pd.read_csv('TXT/EKG/' + animal + '.txt', skiprows=3)
            C = pd.read_csv('TXT/ESO/' + animal + '.txt', skiprows=3)
            D = pd.read_csv('TXT/TRACH/' + animal + '.txt', skiprows=3)
            E = pd.read_csv('TXT/BP/' + animal + '.txt', delimiter=r"\s+").rename(columns={"4 BP": "BP"})
            # Count number of beats before/after injection, divide by 10/30 minutes for average BPM.
            F = len(B.loc[B['EKG-evt'] <= A[0].iloc[0]]) / 10
            G = len(B.loc[B['EKG-evt'] >= A[0].iloc[-1]]) / 30
            # Count number of esophageal events before/after injection
            H = len(C.loc[C['Eso-evt'] <= A[0].iloc[0]])
            I = len(C.loc[C['Eso-evt'] >= A[0].iloc[-1]])
            # Find Trach events after injection
            J = D.loc[D['Trach-evt'] >= A[0].iloc[-1]]
            # Count number of breaths before/after injection, divide by 10/30 min for average breaths/min
            K = len(D.loc[D['Trach-evt'] <= A[0].iloc[0]]) / 10
            L = len(J) / 30
            # Use Trach events from J to find the number of EE
            M = pd.DataFrame(pybursts.kleinberg(J['Trach-evt'], s=4, gamma=0.1))
            N = M.last_valid_index()
            # Use N and M to determine the latency, set value to MaxTime (1800 s) if EE = 0
            O = 1800 if N == 0 else M.iloc[1][1] - A[0].iloc[-1]
            # Find BP values before/after injection, then determine the mean value
            P = E.loc[E['Time'] <= A[0].iloc[0]]
            Q = E.loc[E['Time'] >= A[0].iloc[-1]]
            R = P["BP"].mean()
            S = Q["BP"].mean()
            # Collect all factors for this animal as one row
            d.append({'EE': N, 'EE-lat': O, 'BPM_Base': F, 'BPM_Test': G,
                      'Eso_Base': H, 'Eso_Test': I, 'Trach_Base': K, 'Trach_Test': L,
                      'BP_Base': R, 'BP_Test': S})

    CuSO4()

    # Create shell DF with animal numbers and their conditions.
    DF = pd.DataFrame({'Animal': pd.Series(Animal), 'Cond': pd.Series(Cond)})
    # Pull appended rows from CuSO4 and make them a pd.DF
    Df = pd.DataFrame(d)
    # Combine the two DF's
    df = pd.concat([DF, Df], axis=1)
    df
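Looping over the string lists with "for" removes the repetition. The code is easier to read and maintain, and adding or changing an animal or condition only requires editing the two lists.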
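Applied to the three-frame example from the question, the same idea might look like the sketch below. The list names `lower_frames` and `upper_frames` and the result column names `Number`, `A` and `Z` are assumptions made for illustration; the data are the toy frames from the question.

```python
import pandas as pd

# Toy data from the question: six identical single-column frames.
values = [1, 2, 3, 4, 5, 6]
df1, df2, df3 = (pd.DataFrame({'Column1': values}) for _ in range(3))
DF1, DF2, DF3 = (pd.DataFrame({'Column1': values}) for _ in range(3))

# Hypothetical names: keep the frames in lists instead of numbered variables.
lower_frames = [df1, df2, df3]
upper_frames = [DF1, DF2, DF3]

rows = []
for number, (df, DF) in enumerate(zip(lower_frames, upper_frames), start=1):
    # Same two counts as A1/Z1, A2/Z2, A3/Z3 in the question.
    a = len(df.loc[df['Column1'] <= DF['Column1'].iloc[2]])
    z = len(df.loc[df['Column1'] >= DF['Column1'].iloc[3]])
    rows.append({'Number': number, 'A': a, 'Z': z})

# One row per frame pair instead of six separate variables.
summary = pd.DataFrame(rows)
print(summary)
```

Each result is then read from `summary` by row rather than from a variable such as `A2`.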
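The question asks how to condense repeated code using a for loop and strings. The underlying problem is that creating variables by name inside a loop is awkward: it can be done, but it is best avoided. Instead of generating many variables, the approach taken in the code is to reach the same goal with generator expressions and list comprehensions.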
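One common way to avoid creating variables by name is to store the results in a dictionary whose keys are the names you would otherwise have built. The sketch below is only an illustration: the `frames` and `counts` dictionaries and the sample values are made up, and for brevity each threshold is taken from the same frame rather than from a separate `DF` frame as in the question.

```python
import pandas as pd

# Hypothetical data: one frame per identifier, stored under that identifier.
frames = {
    1: pd.DataFrame({'Column1': [1, 2, 3, 4, 5, 6]}),
    2: pd.DataFrame({'Column1': [2, 4, 6, 8, 10, 12]}),
    3: pd.DataFrame({'Column1': [5, 5, 5, 5, 5, 5]}),
}

# `"A" + str(i) = ...` is a syntax error because the left-hand side is an
# expression, not a name. A dictionary key, however, can be any string
# that is built at run time.
counts = {}
for i, df in frames.items():
    col = df['Column1']
    counts["A" + str(i)] = len(df.loc[col <= col.iloc[2]])
    counts["Z" + str(i)] = len(df.loc[col >= col.iloc[3]])

print(counts)
```

A value is then looked up as `counts["A2"]` instead of through a variable called `A2`.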
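In the given code, two lists of DataFrames, Hanimals and Ianimals, are created first. A for loop over these two lists then builds the BPM DataFrame, with list comprehensions and generator expressions computing the BPM_Base and BPM_Test values.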
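The code this paragraph describes is not reproduced here, so the following is only a rough sketch of what such a computation could look like. It assumes that `Hanimals` holds one frame of heartbeat event times per animal (column `EKG-evt`) and that `Ianimals` holds the matching injection times in column `0`; the contents of both lists are made up.

```python
import pandas as pd

# Assumption: Hanimals holds heartbeat event times, Ianimals injection times.
# The original lists are not shown, so these small frames are invented.
Hanimals = [
    pd.DataFrame({'EKG-evt': [1.0, 2.0, 3.0, 7.0, 8.0]}),
    pd.DataFrame({'EKG-evt': [0.5, 4.0, 9.0]}),
]
Ianimals = [
    pd.DataFrame({0: [3.0, 6.0]}),
    pd.DataFrame({0: [1.0, 5.0]}),
]

# List comprehension over the paired lists; the inner generator expression
# counts matching events without building an intermediate frame.
BPM_Base = [
    sum(1 for t in ekg['EKG-evt'] if t <= inj[0].iloc[0]) / 10
    for ekg, inj in zip(Hanimals, Ianimals)
]
BPM_Test = [
    sum(1 for t in ekg['EKG-evt'] if t >= inj[0].iloc[-1]) / 30
    for ekg, inj in zip(Hanimals, Ianimals)
]

BPM = pd.DataFrame({'BPM_Base': BPM_Base, 'BPM_Test': BPM_Test})
print(BPM)
```

The divisors 10 and 30 follow the baseline and test window lengths, in minutes, used in the refactored code earlier.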
The updated code refines the solution further. It uses only two for loops and builds the BPM DataFrame from a list of tuples, which is more efficient.
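As a hedged illustration of the tuple-based variant, the sketch below collects one `(base, test)` tuple per animal in a single pass and hands the list to `pd.DataFrame`. The `recordings` list and its values are hypothetical and stand in for data that would normally be read from files.

```python
import pandas as pd

# Made-up data for each animal:
# (heartbeat event times, first injection time, last injection time)
recordings = [
    ([1.0, 2.0, 3.0, 7.0, 8.0], 3.0, 6.0),
    ([0.5, 4.0, 9.0], 1.0, 5.0),
]

rows = []
for events, first_inj, last_inj in recordings:
    # Beats before the first injection, averaged over 10 minutes.
    base = sum(1 for t in events if t <= first_inj) / 10
    # Beats after the last injection, averaged over 30 minutes.
    test = sum(1 for t in events if t >= last_inj) / 30
    rows.append((base, test))

# Each tuple becomes one row; the column names are given separately.
BPM = pd.DataFrame(rows, columns=['BPM_Base', 'BPM_Test'])
print(BPM)
```

Building the rows first and constructing the DataFrame once at the end avoids growing a DataFrame inside the loop.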
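In short, the problem comes from the limits on creating variables in a loop. The solution is to condense the repeated code with list comprehensions and generator expressions, which produce the required results more efficiently.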