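Use Python's pymongo library to connect to MongoDB and read a collection (the equivalent of a table). Each document read comes back as a Python dict, i.e. a JSON-like object. The example below converts the query results into a pandas DataFrame, with the top-level attributes as column names (sub-objects can also be expanded into additional columns). If an attribute value is a dict or a list, i.e. a nested JSON object, it can be serialized to a JSON string as follows: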
```python
import json

import pandas as pd
import pymongo as pm

mongo_addr = 'mongodb://user_name:password@mongo_server_ip:27017'
dbn = 'db_name'          # placeholder: replace with your database name
coll_name = 'coll_name'  # placeholder: replace with your collection name

client = pm.MongoClient(mongo_addr)  # MongoDB client object
db = client.get_database(dbn)
coll = db[coll_name]

cond = {'status': {'$gte': 1}}  # query filter: status >= 1
cursor = coll.find(cond)

batch_list = []
# Each item is a dict
for item in cursor:
    for key, value in item.items():
        # Check each top-level value; serialize dicts and lists to JSON strings
        if isinstance(value, (dict, list)):
            item[key] = json.dumps(value, ensure_ascii=False)
    batch_list.append(item)

data_frame = pd.DataFrame(batch_list)
cursor.close()
client.close()
```
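
The text mentions that sub-objects can instead be expanded into more columns. A minimal sketch of that alternative follows. The helper `flatten_doc`, the dotted column naming, and the sample documents are all hypothetical and not part of the original example. Passing `default=str` to `json.dumps` is also an assumption, added so that values `json` cannot serialize natively (such as `ObjectId` or `datetime`) are stringified rather than raising an error.

```python
import json


def flatten_doc(doc, parent_key='', sep='.'):
    """Flatten nested dicts into a single-level dict with dotted keys.

    Lists are kept as JSON strings, as in the example above.
    """
    flat = {}
    for key, value in doc.items():
        col = f'{parent_key}{sep}{key}' if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse so each nested attribute becomes its own column
            flat.update(flatten_doc(value, col, sep))
        elif isinstance(value, list):
            # default=str (an assumption) stringifies values that json
            # cannot serialize natively, e.g. ObjectId or datetime
            flat[col] = json.dumps(value, ensure_ascii=False, default=str)
        else:
            flat[col] = value
    return flat


# Hypothetical sample documents standing in for items read from the cursor
docs = [
    {'_id': 1, 'status': 1,
     'user': {'name': 'Li', 'geo': {'city': 'Beijing'}},
     'tags': ['a', 'b']},
    {'_id': 2, 'status': 2, 'user': {'name': 'Wang'}, 'tags': []},
]
rows = [flatten_doc(d) for d in docs]
```

Passing `rows` to `pd.DataFrame(rows)` would then give columns such as `user.name` and `user.geo.city`, with missing keys filled as NaN. Recent pandas versions also provide `pd.json_normalize`, which performs a similar flattening of nested dicts and may be preferable to a hand-written helper.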