Q: Python: a file with one JSON object per line

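Stack Overflow user

Asked on 2017-01-30 15:59:19

Answers: 2, Views: 70, Followers: 0, Votes: 0

Greetings from Tokyo.

Let me explain what I am trying to achieve with Python 2.7:

I have a file with one JSON object on each line. Here is an excerpt: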

1 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"237000080507750613","siteId":1551642,"startTime":1483217576324,"endTime":1483217696000,"clientIPs":["69.61.12.70"],"country"    :["United States"],"countryCode":["US"],"clientType":"Vulnerability Scanner","clientApplication":"Grabber","clientApplicationId":780,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozi    lla/5.0 CommonCrawler Node 3AEHGF7VNEKJUWOPKJJIJ7ODKPM4XXVZQUTHNWS5B2O5AEAGHIG4HVC42LLEUSO.CQYXO3ZFD.GB5RZ5EG2SRWW335PUSOSIVLZUXPCTJUGV2MDJGQJDJPE5UH.cdn0.common.crawl.zone","os":"","osVersion":"","suppor    tsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","servedVia":["Ashburn,VA"],"securitySummary": {"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false,"responseTime":0,"thinkTime":0,"incidentId":"237000080507750613-304992946328    764549","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction    ":"api.rule_action_type.rule_action_block"}]}]}, ...

2 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"520000110618442601","siteId":1551642,"startTime":1482666233524,"endTime":1482666353000,"clientIPs":["93.175.201.18"],"countr    y":["Ukraine"],"countryCode":["UA"],"clientType":"Spam Bot","clientApplication":"DTS Agent","clientApplicationId":99,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozilla/4.0 (compati    ble; MSIE 5.0; Windows NT; DigExt; DTS Agent","os":"","osVersion":"","supportsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","served    Via":["Warsaw, Poland"],"securitySummary":{"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false,"responseTime":2,"thinkTime":1,"incidentId":"520000110618442601-1233371267206742195","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction":"api.rule_action_type.rule_action_block"}]}]}, ...

3 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"520000110602830007","siteId":1551642,"startTime":1482429957001,"endTime":1482430077000,"clientIPs":["93.175.201.18"],"countr    y":["Ukraine"],"countryCode":["UA"],"clientType":"Spam Bot","clientApplication":"DTS Agent","clientApplicationId":99,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozilla/4.0 (compati    ble; MSIE 5.0; Windows NT; DigExt; DTS Agent","os":"","osVersion":"","supportsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","served    Via":["Warsaw, Poland"],"securitySummary":{"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false","responseTime":4,"thinkTime":4,"incidentId":"520000110602830007-3073954101470953658","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction":"api.rule_action_type.rule_action_block"}]}]}, ...

I tried to process the whole file with json.loads(), but it did not work.

Here is my code:

import json

# rwdname (the destination name) is assumed to be defined earlier in the script.
g = open('monthlyLogShort.txt', 'w')
with open("page.txt") as f:
    data = f.read()
    parse = json.loads(data)        # load the JSON dict (works only if the file holds a single object)
    field_list = parse["visits"]
    for fields in field_list:       # extract the following fields
        print >> g, "visit_id=", fields["id"], ",", "src_country=", fields["country"], ",", "event_timestamp=", fields["startTime"], ",", "src_ip=", fields["clientIPs"], ",", "dest_name=", rwdname, ",", "dest_id=", fields["siteId"], ",", "signature=", fields["securitySummary"]
g.close()

As you can imagine, this code can only parse a single line. What is the best (Pythonic) way to process the whole file?

Thank you for reading.

parsing

dictionary

python

json

2 Answers

Stack Overflow user

Accepted answer

Posted on 2017-01-31 04:57:19

Since the number of lines is always the same, I came up with this solution:

import json

# dname (the destination name) is assumed to be defined earlier in the script.
with open('page.txt', 'r') as f, open('monthlyLogShort.txt', 'w') as g:
    data = f.readlines()
    for page in range(10):              # the line count is fixed: 10 lines here
        parse = json.loads(data[page])  # load the JSON dict on this line
        field_list = parse["visits"]
        for fields in field_list:       # extract the following fields
            print >> g, "visit_id=", fields["id"], ",", "src_country=", fields["country"], ",", "event_timestamp=", fields["startTime"], ",", "src_ip=", fields["clientIPs"], ",", "dest_name=", dname, ",", "dest_id=", fields["siteId"], ",", "signature=", fields["securitySummary"]

Works like a charm.
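If the number of lines is not known in advance, the same idea can be written without a hard-coded count. The following is only a minimal sketch under stated assumptions: it targets Python 3 rather than the 2.7 used in this thread, the names iter_visits, format_visit and convert are hypothetical, and the compact key=value layout (no spaces around the values) is an assumption rather than the exact output of the print statement.

```python
import json


def iter_visits(lines):
    """Yield every visit dict from an iterable of lines, one JSON object per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        for visit in json.loads(line)["visits"]:
            yield visit


def format_visit(fields, dest_name):
    """Build one comma-separated key=value record from a visit dict."""
    pairs = [
        ("visit_id", fields["id"]),
        ("src_country", fields["country"]),
        ("event_timestamp", fields["startTime"]),
        ("src_ip", fields["clientIPs"]),
        ("dest_name", dest_name),  # hypothetical stand-in for dname / rwdname
        ("dest_id", fields["siteId"]),
        ("signature", fields["securitySummary"]),
    ]
    return ",".join("{}={}".format(key, value) for key, value in pairs)


def convert(in_path, out_path, dest_name):
    """Read in_path line by line and write one record per visit to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for visit in iter_visits(src):
            dst.write(format_visit(visit, dest_name) + "\n")
```

Because a file object is itself an iterable of lines, iter_visits reads one line at a time and never needs to know how many lines the file has. A call such as convert("page.txt", "monthlyLogShort.txt", "my-site") would then process the whole file, where "my-site" is a placeholder value.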

Votes: 0

Stack Overflow user

Posted on 2017-01-30 16:06:39

The file as a whole is not valid JSON, but you can parse it line by line:

import json

with open("page.txt") as f:
    for line in f:
        # drop the leading line number shown in the excerpt, then parse the rest
        obj = json.loads(line.split(" ", 1)[1])
        print(obj["visits"])
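To make the reasoning concrete, here is a small sketch of why parsing the whole file fails while parsing each line succeeds. It rests on assumptions: Python 3 is used, the helper names whole_file_error and parse_line are hypothetical, and it is not known whether the real file actually contains the leading line numbers shown in the excerpt, so parse_line simply ignores anything before the first "{".

```python
import json


def whole_file_error(text):
    """Return the error json.loads raises on the whole text, or None if it parses."""
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return exc
    return None


def parse_line(line):
    """Parse one line, ignoring any prefix (such as a line number) before the first '{'."""
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON object on this line: %r" % line)
    return json.loads(line[start:])
```

json.loads expects exactly one JSON document, so a second object on the next line counts as extra data and raises an error. Slicing from the first "{" behaves like the split-based approach when a numeric prefix is present and also works when it is absent.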

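Votes: 0

The original content of this page is provided by Stack Overflow. Translation support is provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: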

https://stackoverflow.com/questions/41940228
