Parent-child relationship queries with simple_salesforce in Python: extracting values from nested ordered dicts

Problem description

I'm trying to query information from Salesforce using the simple_salesforce package in Python.

The problem is that fields belonging to a parent-child relationship come back nested: an ordered dict inside an ordered dict.

From the Opportunity object, I want the Id and the account Id associated with that record.

The SOQL query might look like this:

query = "select id, account.id from opportunity where closedate = last_n_days:5"

In SOQL (Salesforce Object Query Language), a dot denotes a parent-child relationship in the database (here, traversing from the child Opportunity to its parent Account). So I'm trying to get the Id from the Opportunity object, and then the related Id from the Account object on that record.

For some reason the Id comes in fine, but account.id is nested in an ordered dict within an ordered dict:

q = sf.query_all(query)

This pulls back an ordered dict:

OrderedDict([('totalSize', 455),
             ('done', True),
             ('records',
              [OrderedDict([('attributes',
                             OrderedDict([('type', 'Opportunity'),
                                          ('url',

I then pull the 'records' piece of the ordered dict to create a DataFrame:

df = pd.DataFrame(q['records'])

This gives me 3 columns: an ordered dict called 'attributes', Id, and another ordered dict called 'Account'. I'm looking for a way to extract the ('BillingCountry', 'United States') piece out of the nested ordered dict 'Account':

[OrderedDict([('attributes',
               OrderedDict([('type', 'Opportunity'),
                            ('url',
                             '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])),
              ('Id', '0061B003451RhZgiHHF'),
              ('Account',
               OrderedDict([('attributes',
                             OrderedDict([('type', 'Account'),
                                          ('url',
                                           '/services/data/v34.0/sobjects/Account/001304300MviPPF3Z')])),
                            ('BillingCountry', 'United States')]))])

To clarify what I'm looking for: I want to end up with a DataFrame that has one column for each of the queried fields.

When I put the 'records' piece into a DataFrame using df = pd.DataFrame(sf.query_all(query)['records']), it gives me:

attributes  Id  Account
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])    0061B003451RhZgiHHF OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0013000000MvkRQQAZ')])), ('BillingCountry', 'United States')])
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001Pa52QQAR')]) 0061B00001Pa52QQAR  OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001vQPxqAAG')])), ('BillingCountry', 'United States')])
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001TRu5mQAD')]) 0061B00001TRu5mQAD  OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001rfRTrAAE')])), ('BillingCountry', 'United States')])

After removing the 'attributes' column, I want the output to be:

Id BillingCountry
0061B003451RhZgiHHF 'United States'
0061B00001Pa52QQAR 'United States'
0061B00001TRu5mQAD 'United States'

Recommended answer

Pandas is an amazing tool for tabular data. But while a DataFrame can hold arbitrary Python objects, that is not its sweet spot. I suggest you extract your data from the query result before inserting it into a pandas.DataFrame.

Extracting the desired fields as a list of dictionaries is as easy as:

records = [dict(id=rec['Id'], country=rec['Account']['BillingCountry'])
           for rec in data['records']]
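The comprehension above assumes every record has a related Account. If the lookup can be empty, the relationship most likely arrives as None, and rec['Account']['BillingCountry'] would then raise a TypeError. Below is a minimal sketch of a more defensive variant. Note that get_path is a hypothetical helper (it is not part of simple_salesforce), and the second sample record with an empty Account is made up for illustration.

```python
from collections import OrderedDict


def get_path(record, path, default=None):
    """Follow a dotted path such as 'Account.BillingCountry' through nested dicts.

    Hypothetical helper: returns `default` as soon as a level is missing
    or is not a dict (for example, when the parent relationship is None).
    """
    current = record
    for key in path.split('.'):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up sample data shaped like the 'records' list in the question;
# the second record has no related Account.
sample_records = [
    OrderedDict([('Id', '0061B003451RhZgiHHF'),
                 ('Account', OrderedDict([('BillingCountry', 'United States')]))]),
    OrderedDict([('Id', '0061B00001Pa52QQAR'),
                 ('Account', None)]),
]

safe_records = [dict(id=get_path(rec, 'Id'),
                     country=get_path(rec, 'Account.BillingCountry'))
                for rec in sample_records]

for r in safe_records:
    print(r)
```

Records without a parent end up with country set to None, which pandas will show as a missing value once the list is passed to pd.DataFrame.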

Insert records into a dataframe:

With a list of dicts, creating a dataframe is as easy as:

df = pd.DataFrame(records)
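If you would rather not name every field by hand, one possible generalization is to flatten each record recursively before building the dataframe. This is only a sketch under a few assumptions: flatten_record is a hypothetical helper, the 'attributes' metadata is assumed to be unwanted at every level, and nested keys are joined with a dot (so the column is named 'Account.BillingCountry' rather than 'BillingCountry').

```python
from collections import OrderedDict


def flatten_record(record, prefix=''):
    """Flatten one nested record into a single-level dict.

    Hypothetical helper: skips every 'attributes' entry and joins
    nested keys with a dot, e.g. 'Account.BillingCountry'.
    """
    flat = {}
    for key, value in record.items():
        if key == 'attributes':
            continue  # metadata added by the API, not a queried field
        name = prefix + key
        if isinstance(value, dict):
            # OrderedDict is a dict subclass, so nested parents land here
            flat.update(flatten_record(value, prefix=name + '.'))
        else:
            flat[name] = value
    return flat


# One record shaped like the sample in the question
sample = OrderedDict([
    ('attributes', OrderedDict([('type', 'Opportunity')])),
    ('Id', '0061B003451RhZgiHHF'),
    ('Account', OrderedDict([
        ('attributes', OrderedDict([('type', 'Account')])),
        ('BillingCountry', 'United States'),
    ])),
])

flat = flatten_record(sample)
print(flat)
# {'Id': '0061B003451RhZgiHHF', 'Account.BillingCountry': 'United States'}
```

The flattened dicts can then be passed to pd.DataFrame as a list, in the same way as the records list above, for example pd.DataFrame([flatten_record(rec) for rec in data['records']]).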

Test code:

import pandas as pd
from collections import OrderedDict

data = OrderedDict([
    ('totalSize', 455),
    ('done', True),
    ('records', [
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])),
            ('Id', '0061B003451RhZgiHHF'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0013000000MvkRQQAZ')])),
                                     ('BillingCountry', 'United States')])),
        ]),
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001Pa52QQAR')])),
            ('Id', '0061B00001Pa52QQAR'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001vQPxqAAG')])),
                                     ('BillingCountry', 'United States')])),
        ]),
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001TRu5mQAD')])),
            ('Id', '0061B00001TRu5mQAD'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001rfRTrAAE')])),
                                     ('BillingCountry', 'United States')])),
        ]),
    ])
])

records = [dict(id=rec['Id'], country=rec['Account']['BillingCountry'])
           for rec in data['records']]
for r in records:
    print(r)

print(pd.DataFrame(records))

Test results:

{'country': 'United States', 'id': '0061B003451RhZgiHHF'}
{'country': 'United States', 'id': '0061B00001Pa52QQAR'}
{'country': 'United States', 'id': '0061B00001TRu5mQAD'}

         country                   id
0  United States  0061B003451RhZgiHHF
1  United States   0061B00001Pa52QQAR
2  United States   0061B00001TRu5mQAD

