Using strings and "for" statements to condense repetitive code - Python

I'm very new to Python's "for" statements and can't get something working that I think should be simple. My code is as follows:

import pandas as pd
df1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})

Then:

A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])  
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])  
Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])
A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])  
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])

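As you can see, this is a lot of repeated code in which only the identifying number differs. So my first attempt at using a "for" statement was: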

Numbers = [1,2,3]
for i in Numbers:
    "A" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
    "Z" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])

This resulted in SyntaxError: "can't assign to operator". So I tried:

Numbers = [1,2,3]
for i in Numbers:
    A = "A" + str(i)
    Z = "Z" + str(i)
    A = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
    Z = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])

This resulted in AttributeError: 'str' object has no attribute 'loc'. I tried a few other things, such as:

Numbers = [1,2,3]
for i in Numbers:
    A = "A" + str(i)
    Z = "Z" + str(i)
    df = "df" + str(i)
    DF = "DF" + str(i)
    A = len(df.loc[df['Column1'] <= DF['Column1'].iloc[2]])
    Z = len(df.loc[df['Column1'] >= DF['Column1'].iloc[3]])

But this just gave me the same error. Ultimately, what I want is something like this:

Numbers = [1,2,3]
for i in Numbers:
    Ai = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[2]])
    Zi = len(dfi.loc[dfi['Column1'] >= DFi['Column1'].iloc[3]])

where the output is equivalent to what I would get by typing:

A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])  
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])
Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])
A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])  
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])

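In the code above, the same block of code is repeated, each copy performing the same operations for a different animal. This leads to redundant code that is hard to maintain.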

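To solve this, you can use a Python "for" statement in place of the repeated blocks. Keeping the animals and conditions in lists of strings makes them easier to manage. Here is the refactored code: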

import pandas as pd
import pybursts

# Change/Add animals and conditions here, make sure they match up directly
Animal = ['26','45','46','47','51','58','64','65', '69','72','84']
Cond = ['Stomach','Intestine','Stomach','Stomach','Intestine','Intestine','Intestine','Stomach','Cut','Cut','Cut']    
d = []
def CuSO4():
    for i in range(len(Animal)):
        animal = Animal[i]
        condition = Cond[i]
        # load in Spike data
        A = pd.read_csv('TXT/INJ/' + animal + '.txt', delimiter=r"\s+", skiprows=15, header=None, usecols=range(1))
        B = pd.read_csv('TXT/EKG/' + animal + '.txt', skiprows=3)
        C = pd.read_csv('TXT/ESO/' + animal + '.txt', skiprows=3) 
        D = pd.read_csv('TXT/TRACH/' + animal + '.txt', skiprows=3)
        E = pd.read_csv('TXT/BP/' + animal + '.txt', delimiter=r"\s+").rename(columns={"4 BP": "BP"})
        
        # Count number of beats before/after injection, divide by 10/30 minutes for average BPM.
        F = len(B.loc[B['EKG-evt'] <= A[0].iloc[0]]) / 10   
        G = len(B.loc[B['EKG-evt'] >= A[0].iloc[-1]]) / 30
        
        # Count number of esophageal events before/after injection
        H = len(C.loc[C['Eso-evt'] <= A[0].iloc[0]])
        I = len(C.loc[C['Eso-evt'] >= A[0].iloc[-1]])
        
        # Find Trach events after injection
        J = D.loc[D['Trach-evt'] >= A[0].iloc[-1]]
        
        # Count number of breaths before/after injection, divide by 10/30 min for average breaths/min
        K = len(D.loc[D['Trach-evt'] <= A[0].iloc[0]]) / 10
        L = len(J) / 30
        
        # Use Trach events from J to find the number of EE
        M = pd.DataFrame(pybursts.kleinberg(J['Trach-evt'], s=4, gamma=0.1))
        N = M.last_valid_index()
        
        # Use N and M to determine the latency; set value to MaxTime (1800 s) if EE = 0
        O = 1800 if N == 0 else M.iloc[1][1] - A[0].iloc[-1]
        
        # Find BP value before/after injection, then determine the mean value
        P = E.loc[E['Time'] <= A[0].iloc[0]]
        Q = E.loc[E['Time'] >= A[0].iloc[-1]]
        R = P["BP"].mean()
        S = Q["BP"].mean()
        
        # Combine all factors into one DF
        d.append({'EE': N, 'EE-lat': O,
                  'BPM_Base': F, 'BPM_Test': G,
                  'Eso_Base': H, 'Eso_Test': I,
                  'Trach_Base': K, 'Trach_Test': L,
                  'BP_Base': R, 'BP_Test': S})
CuSO4()
# Create shell DF with animal numbers and their conditions.
DF = pd.DataFrame({'Animal': pd.Series(Animal), 'Cond': pd.Series(Cond)})
# Pull appended DF from CuSO4 and make it a pd.DF
Df = pd.DataFrame(d)
# Combine the two DF's
df = pd.concat([DF, Df], axis=1)
df

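Using a "for" loop and lists of strings condenses the code and avoids the repetition. This improves readability and maintainability, and makes it easier to add or change animals and conditions.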
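As a side note, the same pairing of animals and conditions can be walked with `zip` instead of `range(len(...))`. The following is only a minimal sketch of that pattern: the shortened lists and the `EKG_file` key are illustrative assumptions, and no files are actually read.

```python
# Shortened, illustrative lists; the real ones are defined in the code above
animals = ['26', '45', '46']
conditions = ['Stomach', 'Intestine', 'Stomach']

records = []
for animal, condition in zip(animals, conditions):
    # The string is used as data (part of a file path), never as a variable name
    ekg_path = 'TXT/EKG/' + animal + '.txt'
    # 'EKG_file' is a hypothetical key, added only to show the path being built
    records.append({'Animal': animal, 'Cond': condition, 'EKG_file': ekg_path})

for record in records:
    print(record)
```

Because `zip` yields each animal together with its condition, there is no index to keep in sync between the two lists.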

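This question is about condensing repetitive code with a for loop and strings. The root of the problem is that creating variables inside a loop is limited: it can be done, but it is best avoided. Instead of generating many variables, use the approach given in the code, which reaches the same goal with generator expressions and list comprehensions.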
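Applied to the toy data from the question, one way to avoid numbered variables is to keep the DataFrames in dictionaries keyed by the identifying number. This is a minimal sketch, not the only way to do it; the names `lower` and `upper` are assumptions standing in for df1..df3 and DF1..DF3.

```python
import pandas as pd

numbers = [1, 2, 3]

# 'lower' plays the role of df1..df3 and 'upper' of DF1..DF3 (hypothetical names)
lower = {i: pd.DataFrame({'Column1': [1, 2, 3, 4, 5, 6]}) for i in numbers}
upper = {i: pd.DataFrame({'Column1': [1, 2, 3, 4, 5, 6]}) for i in numbers}

# One dict comprehension per result replaces A1..A3 and Z1..Z3
A = {i: len(lower[i].loc[lower[i]['Column1'] <= upper[i]['Column1'].iloc[2]])
     for i in numbers}
Z = {i: len(lower[i].loc[lower[i]['Column1'] >= upper[i]['Column1'].iloc[3]])
     for i in numbers}

print(A)
print(Z)
```

`A[2]` then takes the place of `A2`, and adding a fourth dataset only means adding another key.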

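The given code first creates two lists of DataFrames, Hanimals and Ianimals. A for loop over those two lists then builds the BPM DataFrame, with list comprehensions and generator expressions computing the BPM_Base and BPM_Test values.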

The updated code refines this further: it uses only two for loops and builds the BPM DataFrame from a list of tuples. This approach is more efficient.
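The list-of-tuples idea might look roughly like the sketch below. The contents of Hanimals and Ianimals are not shown here, so `injections` and `ekg_events` are hypothetical stand-ins with made-up values; the column names and the division by 10 and 30 minutes follow the first answer.

```python
import pandas as pd

# Hypothetical stand-ins: one injection-time frame and one EKG-event frame per animal
injections = [pd.DataFrame({0: [100.0, 160.0]}),
              pd.DataFrame({0: [200.0, 260.0]})]
ekg_events = [pd.DataFrame({'EKG-evt': [10.0, 50.0, 90.0, 170.0, 300.0]}),
              pd.DataFrame({'EKG-evt': [20.0, 150.0, 270.0, 280.0, 290.0]})]

# One (BPM_Base, BPM_Test) tuple per animal: beats before the first injection
# time over 10 minutes, beats after the last injection time over 30 minutes
rows = [
    (len(ekg.loc[ekg['EKG-evt'] <= inj[0].iloc[0]]) / 10,
     len(ekg.loc[ekg['EKG-evt'] >= inj[0].iloc[-1]]) / 30)
    for inj, ekg in zip(injections, ekg_events)
]

# A list of tuples converts directly into a DataFrame with named columns
BPM = pd.DataFrame(rows, columns=['BPM_Base', 'BPM_Test'])
print(BPM)
```

Each tuple becomes one row, so no per-animal variables are ever created.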

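In short, the problem comes from the limits on creating variables in a loop, and the solution is to condense the repeated code with list comprehensions and generator expressions, which produce the required results more efficiently.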
