Python: file with one JSON object per line

Stack Overflow user
Asked 2017-01-30 15:59:19
2 answers, 70 views, 0 followers, 0 votes

Greetings from Tokyo.

Let me explain what I am trying to achieve with Python 2.7:

I have a file with one JSON object on each line. Here is an excerpt:

1 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"237000080507750613","siteId":1551642,"startTime":1483217576324,"endTime":1483217696000,"clientIPs":["69.61.12.70"],"country"    :["United States"],"countryCode":["US"],"clientType":"Vulnerability Scanner","clientApplication":"Grabber","clientApplicationId":780,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozi    lla/5.0 CommonCrawler Node 3AEHGF7VNEKJUWOPKJJIJ7ODKPM4XXVZQUTHNWS5B2O5AEAGHIG4HVC42LLEUSO.CQYXO3ZFD.GB5RZ5EG2SRWW335PUSOSIVLZUXPCTJUGV2MDJGQJDJPE5UH.cdn0.common.crawl.zone","os":"","osVersion":"","suppor    tsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","servedVia":["Ashburn,VA"],"securitySummary": {"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false,"responseTime":0,"thinkTime":0,"incidentId":"237000080507750613-304992946328    764549","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction    ":"api.rule_action_type.rule_action_block"}]}]}, ...

2 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"520000110618442601","siteId":1551642,"startTime":1482666233524,"endTime":1482666353000,"clientIPs":["93.175.201.18"],"countr    y":["Ukraine"],"countryCode":["UA"],"clientType":"Spam Bot","clientApplication":"DTS Agent","clientApplicationId":99,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozilla/4.0 (compati    ble; MSIE 5.0; Windows NT; DigExt; DTS Agent","os":"","osVersion":"","supportsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","served    Via":["Warsaw, Poland"],"securitySummary":{"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false,"responseTime":2,"thinkTime":1,"incidentId":"520000110618442601-1233371267206742195","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction":"api.rule_action_type.rule_action_block"}]}]}, ...

3 {"res":0, "res_message":"OK", "debug_info":{"id-info":"9089"}, "visits":[{"id":"520000110602830007","siteId":1551642,"startTime":1482429957001,"endTime":1482430077000,"clientIPs":["93.175.201.18"],"countr    y":["Ukraine"],"countryCode":["UA"],"clientType":"Spam Bot","clientApplication":"DTS Agent","clientApplicationId":99,"httpVersion":"1.1","clientApplicationVersion":"null","userAgent":"Mozilla/4.0 (compati    ble; MSIE 5.0; Windows NT; DigExt; DTS Agent","os":"","osVersion":"","supportsCookies":false,"supportsJavaScript":false,"hits":1,"pageViews":0,"entryReferer":"","served    Via":["Warsaw, Poland"],"securitySummary":{"api.threats.bot_access_control":1},"actions":[{"postData":"","requestResult":"api.request_result.req_blocked_security","isSecured":false","responseTime":4,"thinkTime":4,"incidentId":"520000110602830007-3073954101470953658","threats":[{"securityRule":"api.threats.bot_access_control","alertLocation":"api.alert_location.alert_location_path","attackCodes":["200.0"],"securityRuleAction":"api.rule_action_type.rule_action_block"}]}]}, ...

I tried to process the whole file with json.loads(), but it did not work.

Here is my code:

import json

# rwdname (the destination name) is assumed to be defined earlier in the script
g = open('monthlyLogShort.txt', 'w')
with open("page.txt") as f:
    data = f.read()
    parse = json.loads(data)        # <- load the JSON dict
    field_list = parse["visits"]
    for fields in field_list:       # <- extract the following fields
        print >> g , "visit_id=",(fields["id"]),",","src_country=",(fields["country"]),",", "event_timestamp=",(fields["startTime"]),",","src_ip=",(fields["clientIPs"]),",","dest_name=", rwdname,",","dest_id=",(fields["siteId"]),",","signature=",(fields["securitySummary"])
g.close()

As you can imagine, this code can only parse a single line. What is the best (Pythonic) way to process the whole file?

Thank you for reading my post.

2 Answers

Stack Overflow user

Accepted answer

Posted 2017-01-31 04:57:19

Since the number of lines is always the same, I came up with this solution:

import json

# dname (the destination name) is assumed to be defined earlier in the script
g = open('monthlyLogShort.txt', 'w')
with open('page.txt', 'r') as f:
    data = f.readlines()
    countp = 0
    page = 0
    while countp < 10:                  # processes the first 10 lines only
        parse = json.loads(data[page])  # load the JSON dict on this line
        field_list = parse["visits"]
        for fields in field_list:       # extract the following fields
            print >> g , "visit_id=",(fields["id"]),",","src_country=",(fields["country"]),",", "event_timestamp=",(fields["startTime"]),",","src_ip=",(fields["clientIPs"]),",","dest_name=", dname,",","dest_id=",(fields["siteId"]),",","signature=",(fields['securitySummary'])
        countp = countp + 1
        page = page + 1
    else:
        g.close()

Works like a charm.
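If the number of lines is not fixed, the same idea can be written without a hard-coded count. The following is only a sketch, not the answerer's code: it assumes Python 3 (the thread targets 2.7, where the `io.StringIO` demo would need unicode strings), and the helper names `iter_visits`, `format_visit`, and `convert`, the sample values, and the exact output layout are all made up for illustration.

```python
import io
import json


def iter_visits(lines):
    """Yield every visit dict from an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:              # skip blank separator lines
            continue
        record = json.loads(line)  # one JSON object per line
        for visit in record.get("visits", []):
            yield visit


def format_visit(visit, dest_name):
    """Render one visit as a comma-separated key=value string."""
    pairs = [
        ("visit_id", visit["id"]),
        ("src_country", visit["country"]),
        ("event_timestamp", visit["startTime"]),
        ("src_ip", visit["clientIPs"]),
        ("dest_name", dest_name),
        ("dest_id", visit["siteId"]),
        ("signature", visit["securitySummary"]),
    ]
    return ", ".join("%s=%s" % (key, value) for key, value in pairs)


def convert(src, dst, dest_name):
    """Write one formatted record per visit; return the number written."""
    count = 0
    for visit in iter_visits(src):
        dst.write(format_visit(visit, dest_name) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for page.txt (hypothetical values).
    sample = io.StringIO(
        '{"visits": [{"id": "1", "country": ["US"], "startTime": 10, '
        '"clientIPs": ["192.0.2.1"], "siteId": 7, "securitySummary": {}}]}\n'
        '\n'
        '{"visits": []}\n'
    )
    out = io.StringIO()
    print(convert(sample, out, "example-site"))  # number of visits written
```

With real files, `convert` could be called with two handles opened in a single `with` statement, so the output file is closed even if a line fails to parse.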

Votes: 0

Stack Overflow user

Posted 2017-01-30 16:06:39

The file as a whole is not valid JSON, but you can parse it line by line:

import json

with open("page.txt") as f:
    for line in f:
        # drop the leading line number ("1 ", "2 ", ...) before parsing
        obj = json.loads(line.split(" ", 1)[1])
        print(obj["visits"])
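To see why the whole-file approach fails and the per-line approach works, here is a small self-contained sketch. It is an illustration rather than part of the answer: the helper `parse_numbered_line` and the two sample lines are hypothetical, and the helper assumes a line-number prefix, when present, is a run of digits followed by one space.

```python
import json


def parse_numbered_line(line):
    """Parse one line, dropping a leading "N " line-number prefix if present."""
    text = line.strip()
    head, sep, rest = text.partition(" ")
    if sep and head.isdigit():   # e.g. "2 {...}" -> "{...}"
        text = rest
    return json.loads(text)


two_lines = '1 {"res": 0, "visits": []}\n2 {"res": 0, "visits": [{"id": "x"}]}\n'

# Parsing the whole text at once fails: a JSON document holds a single value,
# so anything after the first value is rejected as extra data.
try:
    json.loads(two_lines)
    whole_file_ok = True
except ValueError:               # json.JSONDecodeError subclasses ValueError
    whole_file_ok = False

# Parsing each non-empty line on its own succeeds.
objects = [parse_numbered_line(l) for l in two_lines.splitlines() if l.strip()]
print(whole_file_ok, len(objects))
```

Because the prefix check is optional, the same helper also accepts lines that contain only the JSON object, which is the case if the numbers in the excerpt came from an editor display rather than from the file itself.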
Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/41940228
