如何使用python读取和映射CSV的多行标题行

前端之家收集整理的这篇文章主要介绍了如何使用python读取和映射CSV的多行标题行 前端之家小编觉得挺不错的,现在分享给大家,也给大家做个参考。

我有一个CSV file,它是从数据库下载的(因为它是CSV格式),现在我必须解析为JSON Schema..不要担心此链接,只是github gist

enter image description here

我面临的问题是其多行标题检查CSV File Here

如果您注意到文件中的内容

>在CSV的第一行上具有第一行标题,然后在下一行具有
这些标头的所有值.
>在CSV文件的第三行上,它具有标题的第二行,然后是下一行
具有这些标头的所有值.
>在CSV文件的第5行,它具有标题的第3行,然后是下一行
具有这些标头的所有值.

您也可以在这里注意到模式,

>第一行标题没有任何标签
>第二行标题只有一个标签
>页眉的第三行有两个选项卡

这适用于所有记录.

现在的第一个问题是标题的多行.
第二个问题是如何像我一样将其解析为嵌套的json.
我尝试过Create nested JSON from CSV解决方案之一,并注意到我的csv第一个问题.

我的样子是这样我只在尝试解析架构的初始字段的地方

import csv
import json


def csvParse(csvfile):
    # Open the CSV
    f = open(csvfile,'r')
    # Change each fieldname to the appropriate field name.
    reader = csv.DictReader(f,fieldnames=("Order Ref","Order 
Status","Affiliate","Source","Agent","Customer Name","Email 
Address","Telephone","Mobile","Address 1","Address 2","City","County/State","Postal Code","Country","Voucher Code"," Voucher Amount","Order Date","Item ID","Type","Supplier Code","Supplier Name","Booking Ref","Supplier Price","Currency","Selling Price","Depart","Arrive","Origin","Destination","Carrier","Flight No","Class","Pax Type","Title","Firstname","Surname","DOB","Gender","FOID Type"))

customer = []
data = []
# data frame names in a list
for row in reader:
    frame = {"orderRef": row["Order Ref"],"orderStatus": row["Order Status"],"affiliate": row["Affiliate"],"source": row["Source"],"customers": []}

    data.append(frame)
最佳答案
这不是csv解析器友好的,因为该文件包含多个csv,并且至少一个包含2个具有相同名称的列,这会阻止使用DictReader.

我将首先构建一个包装程序,该包装程序可以解析每个csv片段,并在到达另一个片段时引发一次停止迭代.我将使用re模块来查找不同的标头.

以下是包装程序的代码

class multi_csv:
    levels = [re.compile('Order Ref,Order Status,Affiliate,Source,Agent,'
                       '.*,Country,Voucher Code,Voucher Amount,Order Date'),re.compile('\s*,Item ID,Type,Supplier Code,Supplier Name,'
                      '.*,Arrive,Origin,Destination,Carrier,Flight No,Class,'),\s*,Pax Type,Title,Firstname,Surname,DOB,Gender,'
                        'FOID Type,*')
              ]
    def __init__(self,fd):
        self.fd = fd
        self.level = 0
        self.end = False
    def __iter__(self):
        return self
    def __next__(self):
        try:
            line = next(self.fd)
        except StopIteration:
            self.end = True
            raise
        for lev,rx in enumerate(self.levels):
            if rx.match(line):
                self.level = lev
                raise StopIteration('New level')
        return line

然后可以根据您的Json模式将其用于构建Python对象:

mc = multi_csv(open(csvfile,'r')
orders = []
while not mc.end:
    rd = csv.reader(mc)
    for row in rd:
        if mc.level == 0:
            items = []
            orders.append({
                'orderRef': int(row[0]),'orderStatus': row[1],'affiliate': row[2],'source': row[3],'agent': row[4],'customer': {
                    'name': row[5],'email': row[6],'telephone': row[7],'mobile': row[8],'address': {
                        'address1': row[9],'address2': row[10],'city': row[11],'county': row[12],'postCode': row[13],'country': row[14],},'voucherCode': row[15],'voucherAmount': int(row[16]),'orderDate': row[17],'items': items,})
        elif mc.level == 1:
            if len(row[1].strip()) != 0:
                legs = []
                passengers = []
                items.append({
                    'itemId': int(row[1]),'type': row[2],'supplierCode': row[3],'supplierName': row[4],'bookingReference': row[5],'supplierPrice': row[6],'supplierPriceCurrency': row[7],'sellingPrice': row[8],'sellingPriceCurrency': row[9],'legs': legs,'passengers': passengers,})
            legs.append({
                'departureDate': row[10],'arrivalDate': row[11],'origin': row[12],'destination': row[13],'carrier': row[14],'flightNumber': row[15],'class': row[16],})
        else:    # mc.level == 2
            passengers.append({
                'passengerType': row[2],'title': row[3],'firstName': row[4],'surName': row[5],'dob': row[6],'gender': row[7],'foidType': row[8],})

使用您的文件,可以得到预期的结果:

pprint.pprint(orders)

[{'affiliate': ' ','agent': 'akjsd@ad.com','customer': {'address': {'address1': ' ','address2': ' ','city': ' ','country': ' ','county': ' ','postCode': ' '},'email': 'asd@asd.com','mobile': ' ','name': 'Mr Kas Iasd','telephone': '3342926655'},'items': [{'bookingReference': 'ABC123','itemId': 125,'legs': [{'arrivalDate': 'ONEWAY','carrier': 'PK','class': 'Economy','departureDate': '12/01/2018 13:15','destination': 'LHE','flightNumber': '354','origin': 'KHI'}],'passengers': [{'dob': '19/09/1995','firstName': 'Aasdsa','foidType': 'None','gender': 'Male','passengerType': 'Adult','surName': 'Mas','title': 'Mr'},{'dob': '07/12/1995','firstName': 'Asdad','surName': 'Dasd','title': 'Mr'}],'sellingPrice': '5002','sellingPriceCurrency': 'PKR','supplierCode': 'SB','supplierName': 'Sabre','supplierPrice': '5002','supplierPriceCurrency': 'PKR','type': 'Flight'}],'orderDate': '11/01/2018 18:51','orderRef': 1234,'orderStatus': 'PayOfflineConfirmedManual','source': ' ','voucherAmount': 0,'voucherCode': ' '},{'affiliate': ' ','agent': 'asdss@asda.com','email': 'ads@ads.com','mobile': '3332784342','name': 'Mr Asdsd Asdsd','telephone': '3332324252'},'items': [{'bookingReference': 'Failed','itemId': 123,'legs': [{'arrivalDate': '18/01/2018 14:25','carrier': 'PA','departureDate': '18/01/2018 11:40','destination': 'DXB','flightNumber': '210','origin': 'KHI'},{'arrivalDate': '25/01/2018 10:40','departureDate': '25/01/2018 6:25','flightNumber': '211','origin': 'DXB'}],'passengers': [{'dob': '11/08/1991','firstName': 'Asd','surName': 'Azam',{'dob': '01/07/1974','firstName': 'Aziz','surName': 'Asdsd',{'dob': '28/05/1995','firstName': 'mureed','surName': 'ahmed',{'dob': '14/04/2012','firstName': 'abdullah','gender': 'Female','passengerType': 'Child','surName': 'Cdsd',{'dob': '17/12/1999','firstName': 'Asdsd','surName': 'Ahmed','sellingPrice': '154340','supplierCode': 'PITCH','supplierName': 'Kicker','supplierPrice': '154340','orderDate': '11/01/2018 17:06','orderRef': 1235,'orderStatus': 'PayOfflinePendingManualProcessing','voucherAmount': 100,'voucherCode': 'ABC123'},'agent': 'asda@asdad.com','email': 'asd@asdsd.com','mobile': '3067869234','name': 'Mr Asds Sdsd','telephone': '3067869234'},'items': [{'bookingReference': ' ','itemId': 124,'departureDate': '23/01/2018 2:00','destination': 'SHJ','flightNumber': '812','passengers': [{'dob': '01/12/1994','firstName': 'Asds','foidType': 'Passport','surName': 'raza','sellingPrice': '20134','supplierPrice': '20134','orderDate': '11/01/2018 16:23','orderRef': 1236,'voucherCode': ' '}]
原文链接:https://www.f2er.com/python/533112.html

猜你在找的Python相关文章