Convert Pandas DataFrame to partially nested JSON
I have a problem similar to this question. However, I need my JSON to be partially nested. Currently, my DataFrame looks like this:
import pandas as pd

df = pd.DataFrame({
    'subsidary': ['company name', 'company name'],
    'purchase_order_number': ['PO Num', 'PO Num'],
    'invoice_date': ['2018-10-15', '2018-10-15'],
    'vendor_invoice_number': ['777', '777'],
    'vendor_sku': ['SKU888', 'SKU888'],
    'quantity': ['10', '20'],
    'rate': ['12.00', '11.00'],
    'amount': ['120.00', '220.00'],
    'freight': ['5.00', '5.00'],
    'taxes': ['0.00', '0.00'],
})
Using the link above and the code below:
import json

# Group by the invoice-level columns; the group keys stay in the index,
# so reset_index() restores them as columns next to the list of line items.
j = (df.groupby(['subsidary', 'purchase_order_number', 'invoice_date', 'vendor_invoice_number'])
       .apply(lambda x: x[['vendor_sku', 'quantity', 'rate', 'amount']].to_dict('records'))
       .reset_index()
       .rename(columns={0: 'item_charges'})
       .to_json(orient='records'))
print(json.dumps(json.loads(j), indent=2, sort_keys=False))
I get the following result:
[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]
However, I would like it to look like this:
[
  {
    "subsidary": "Natural Partners",
    "purchase_order_number": "AZ003387-PO",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "76947",
    "item_charges": [
      {
        "vendor_sku": "SUP002",
        "quantity": "12.00",
        "rate": "14.50",
        "amount": "174.00"
      },
      {
        "vendor_sku": "SUP004",
        "quantity": "3.00",
        "rate": "8.75",
        "amount": "26.25"
      }
    ],
    "invoice_charges": {
      "freight": "5.00",
      "taxes": "0.00"
    }
  }
]
Is there a way to do this in Python? Thanks in advance.
Converting a Pandas DataFrame to partially nested JSON
When working with a Pandas DataFrame, we sometimes need to convert the data to a partially nested JSON format. Here is an example DataFrame:
import pandas as pd

df = pd.DataFrame({
    'subsidary': ['company name', 'company name'],
    'purchase_order_number': ['PO Num', 'PO Num'],
    'invoice_date': ['2018-10-15', '2018-10-15'],
    'vendor_invoice_number': ['777', '777'],
    'vendor_sku': ['SKU888', 'SKU888'],
    'quantity': ['10', '20'],
    'rate': ['12.00', '11.00'],
    'amount': ['120.00', '220.00'],
    'freight': ['5.00', '5.00'],
    'taxes': ['0.00', '0.00'],
})
We want to convert this DataFrame to JSON in the following format:
[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "freight": "5.00",
    "taxes": "0.00",
    "invoice_charges": [
      {
        "freight": "5.00",
        "taxes": "0.00"
      }
    ],
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]
To achieve this, we need to store the result of each nesting step before moving on to the next one. Here is the code for the solution:
import json

group_cols = ['subsidary', 'purchase_order_number', 'invoice_date',
              'vendor_invoice_number', 'freight', 'taxes']

# Original processing step: collect the line items of each invoice.
j = (df.groupby(group_cols)
       .apply(lambda x: x[['vendor_sku', 'quantity', 'rate', 'amount']].to_dict('records'))
       .reset_index()
       .rename(columns={0: 'item_charges'}))

# Store item_charges and continue processing.
item_charges = j['item_charges']

# Second grouping: collect the invoice-level charges.
# Note: this selects grouping columns inside apply, so it relies on a pandas
# version that still passes the grouping columns to the applied function.
j = (j.groupby(group_cols)
      .apply(lambda x: x[['freight', 'taxes']].to_dict('records'))
      .reset_index()
      .rename(columns={0: 'invoice_charges'}))

# Add back the stored item_charges (both results share the same group order).
j['item_charges'] = item_charges

# Convert to JSON.
j = j.to_json(orient='records')
print(json.dumps(json.loads(j), indent=2, sort_keys=False))
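This code groups the DataFrame by the specified columns and converts the selected columns of each group into a list of dictionaries. It then stores item_charges, repeats the grouping to build invoice_charges, and finally adds the stored item_charges back and converts the result to JSON.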
This approach may not be the most efficient, but it achieves the goal of converting a Pandas DataFrame to partially nested JSON.
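The output above still carries freight and taxes at the top level and wraps invoice_charges in a list, whereas the question asked for a single invoice_charges object. The following is a minimal sketch of an alternative, not the method used above: it builds the records with a plain loop over the groups. The names header_cols, item_cols, invoice_cols and records are illustrative, and the sketch assumes freight and taxes are constant within each invoice, so the first row of each group is used.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "subsidary": ["company name", "company name"],
    "purchase_order_number": ["PO Num", "PO Num"],
    "invoice_date": ["2018-10-15", "2018-10-15"],
    "vendor_invoice_number": ["777", "777"],
    "vendor_sku": ["SKU888", "SKU888"],
    "quantity": ["10", "20"],
    "rate": ["12.00", "11.00"],
    "amount": ["120.00", "220.00"],
    "freight": ["5.00", "5.00"],
    "taxes": ["0.00", "0.00"],
})

# Illustrative column groupings (hypothetical names, not from the original answer).
header_cols = ["subsidary", "purchase_order_number", "invoice_date", "vendor_invoice_number"]
item_cols = ["vendor_sku", "quantity", "rate", "amount"]
invoice_cols = ["freight", "taxes"]

records = []
for keys, group in df.groupby(header_cols, sort=False):
    # keys is a tuple holding one value per header column.
    record = dict(zip(header_cols, keys))
    # One dict per line item in this invoice.
    record["item_charges"] = group[item_cols].to_dict("records")
    # Assumption: invoice-level charges repeat on every row, so take the first row.
    record["invoice_charges"] = group[invoice_cols].iloc[0].to_dict()
    records.append(record)

print(json.dumps(records, indent=2))
```

Because the nesting is assembled as ordinary Python dictionaries, adding another nested section only requires one more line inside the loop, and no intermediate result has to be stored between groupings.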