How to unpack files from an online source and use them in Python without saving them

Question

6 views · June 5, 2023

Anonymous · June 5, 2023

url = 'https://ihmecovid19storage.blob.core.windows.net/latest/ihme-covid19.zip'
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
# Retrieve the ZIP file from the URL
resp = urlopen(url)
zipfile = ZipFile(BytesIO(resp.read()))
# Print the second last and last files in the ZIP file
print(zipfile.namelist()[-2])
print(zipfile.namelist()[-1])
# Open the second last file and convert it to a DataFrame
a = zipfile.open(zipfile.namelist()[-2])
df2 = pd.DataFrame(a)
# Print the DataFrame
print(df2)
# You mentioned that the DataFrame has just one column. To get rid of it, you can try:
# Remove the first row (assuming it contains the column names)
df2 = df2[1:]
# Reset the index
df2.reset_index(drop=True, inplace=True)
# Print the updated DataFrame
print(df2)

I'm having trouble with the reading step. When I convert "a" to a DataFrame, I get only one column. How can I fix this?

1 Answer

Anonymous · Answer 1 · 2023-06-23T06:58:28+00:00

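Problem: How do you unpack a file from an online source and use it in Python without saving it?

Reason: Sometimes we need to download a file from an online source and use it in Python without saving it, either because the file is only needed temporarily or because we do not want to accumulate many files locally.

Solution: The following steps achieve this: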

1. First, download the file from the online source. The urlopen function in Python's urllib.request module does this. Example:

import urllib.request
url = 'https://www.example.com/file.zip'
response = urllib.request.urlopen(url)
# Read the whole response body into memory as bytes
data = response.read()
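
The download step can also be wrapped in a small function so that the connection is always closed and a slow server cannot hang the script. This is only a sketch: the name fetch_bytes and the 30-second default timeout are assumptions made for illustration, not part of the original answer.

```python
from urllib.request import urlopen


def fetch_bytes(url, timeout=30):
    """Download url and return its body as bytes (hypothetical helper)."""
    # The with-block closes the connection once the body has been read.
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# A data: URL keeps this demonstration offline; an https:// URL is fetched the same way.
payload = fetch_bytes("data:text/plain;base64,aGVsbG8=")
print(payload)
```

For the URL in the question, the call would be fetch_bytes(url), and the returned bytes play the role of data in the steps that follow.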

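2. Next, open the archive. Python's zipfile module accepts any file-like object, so wrapping the downloaded bytes in io.BytesIO lets us read the ZIP entirely in memory, without writing a temporary file or extracting anything to disk. Example:

import io
import zipfile
# Wrap the downloaded bytes in an in-memory buffer and open it as a ZIP archive
zip_ref = zipfile.ZipFile(io.BytesIO(data))
# List the archive members to find the file you need
print(zip_ref.namelist())

3. Now we can use the files inside the archive. For a CSV file, pass the opened member to pandas.read_csv, which parses it into columns. Passing the file object to pd.DataFrame instead treats each raw line as a single value, which is most likely why the DataFrame in the question has only one column. Example:

import pandas as pd
# Open a CSV member straight from the in-memory archive and parse it
# ('file.csv' is a placeholder; use a name from zip_ref.namelist())
with zip_ref.open('file.csv') as a:
    df = pd.read_csv(a)
print(df.head())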
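
On the one-column result in the question specifically: a file opened from a ZIP archive yields one bytes object per line when iterated, so nothing splits the line into fields. The sketch below shows the difference using only the standard library. The archive is built in memory as a stand-in for the downloaded bytes, and the member name sample.csv and its contents are made up for illustration.

```python
import csv
import io
import zipfile

# Build a tiny ZIP in memory to stand in for the downloaded bytes
# (the member name and contents are made up for illustration).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("sample.csv", "location,date,deaths\nA,2020-04-01,3\nB,2020-04-01,5\n")
data = buffer.getvalue()

with zipfile.ZipFile(io.BytesIO(data)) as archive:
    # Iterating the opened member yields one bytes object per line,
    # so each row arrives as a single unparsed value.
    with archive.open("sample.csv") as member:
        raw_lines = list(member)

    # Decoding the bytes and parsing them as CSV splits each line into fields.
    with archive.open("sample.csv") as member:
        text = io.TextIOWrapper(member, encoding="utf-8", newline="")
        rows = list(csv.reader(text))

print(raw_lines[0])
print(rows[0])
```

pandas.read_csv does this decoding and splitting itself when given the opened member, which is why it is the usual replacement for pd.DataFrame(a) here.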

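By following these steps, we can unpack files from an online source and use them in Python without saving them. This saves disk space and gives more flexibility when working with online data.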