Querying over 1,000,000 records with the Salesforce Java API: looking for the best approach

Question

I am developing a Java application that will query tables which may hold over 1,000,000 records. I have tried everything I can to make it as efficient as possible, but I only achieve about 5,000 records per minute on average, with a peak of 10,000 at one point. I have tried reverse engineering the Data Loader, and my code seems very similar, but still no luck.

Is threading a viable solution here? I have tried it, but with very minimal results.

I have been reading and, it seems, have applied everything possible (compressing requests/responses, threads, etc.), but I cannot achieve Data Loader-like speeds.

Note that the queryMore method seems to be the bottleneck.

Does anyone have any code samples or experiences they can share to steer me in the right direction?

Thanks

Recommended answer

An approach I've used in the past is to query just for the IDs of the records you want (which makes the queries significantly faster). You can then parallelize the retrieve() calls across several threads.

It looks like this:

[query thread] -> BlockingQueue -> [thread pool doing retrieve()] -> BlockingQueue

The first thread calls query() and queryMore() as fast as it can, writing all the IDs it gets into the BlockingQueue. As far as I know, queryMore() isn't something you should call concurrently, so there's no way to parallelize this step. If lock contention becomes an issue, you may wish to package the IDs into bundles of a few hundred. A thread pool can then make concurrent retrieve() calls on the IDs to get all the fields for the SObjects and put them in a second queue for the rest of your app to deal with.
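To make the pipeline concrete, here is a minimal sketch of that shape using only java.util.concurrent. It is not taken from the Salesforce client or from the library mentioned below: IdPipeline, IdPager and Retriever are hypothetical names, where IdPager stands in for the query()/queryMore() loop (one page of IDs per call, an empty page when done) and Retriever stands in for a retrieve() call on a bundle of IDs. The ID queue is left unbounded for simplicity, which assumes the full ID list fits in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

final class IdPipeline {
    /** Hypothetical stand-in for query()/queryMore(): next page of IDs, empty when done. */
    interface IdPager { List<String> nextPage() throws Exception; }

    /** Hypothetical stand-in for retrieve(): fetches full records for one bundle of IDs. */
    interface Retriever<R> { List<R> retrieve(List<String> ids) throws Exception; }

    /** End-of-stream marker, recognised by identity rather than by content. */
    private static final List<String> DONE = new ArrayList<>();

    static <R> BlockingQueue<R> run(IdPager pager, Retriever<R> retriever, int workers)
            throws Exception {
        BlockingQueue<List<String>> idBatches = new LinkedBlockingQueue<>();
        BlockingQueue<R> records = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Workers: take a bundle of IDs, retrieve the records, publish them.
        List<Future<Void>> futures = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Callable<Void> worker = () -> {
                for (List<String> b = idBatches.take(); b != DONE; b = idBatches.take()) {
                    records.addAll(retriever.retrieve(b));
                }
                return null;
            };
            futures.add(pool.submit(worker));
        }

        // Query thread (the caller): pages are fetched sequentially, never concurrently.
        try {
            for (List<String> p = pager.nextPage(); !p.isEmpty(); p = pager.nextPage()) {
                idBatches.put(p); // each page is enqueued as one bundle
            }
        } finally {
            for (int i = 0; i < workers; i++) {
                idBatches.put(DONE); // one marker per worker so every worker stops
            }
            pool.shutdown();
        }
        for (Future<Void> f : futures) {
            f.get(); // wait for completion and surface any worker failure
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        // Fake data in place of real API calls: three pages of IDs, then an empty page.
        Iterator<List<String>> pages = List.of(
                List.of("001A", "001B"), List.of("001C", "001D"),
                List.of("001E", "001F"), List.<String>of()).iterator();
        Retriever<String> fakeRetrieve = ids -> {
            List<String> out = new ArrayList<>();
            for (String id : ids) {
                out.add("record:" + id);
            }
            return out;
        };
        BlockingQueue<String> records = IdPipeline.<String>run(pages::next, fakeRetrieve, 4);
        System.out.println(records.size() + " records retrieved");
    }
}
```

Enqueuing a whole page as one bundle is what keeps lock contention low: each worker touches the ID queue once per bundle rather than once per ID. In a real client the number of workers would have to respect whatever concurrency and per-call limits the API imposes.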

I wrote a Java library for using the SF API that may be useful. http://blog.teamlazerbeez.com/2011/03/03/a-new-java-salesforce-api-library/
