本文介绍了map和unordered_map在c ++中的性能差异的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个简单的要求,我需要一个类型的地图。但我需要最快的理论上可能的检索时间。

I have a simple requirement, i need a map of type . however i need fastest theoretically possible retrieval time.

我使用了两个地图和新提出的unordered_map从tr1
i发现,至少在解析文件和创建地图,通过在时间插入元素。

i used both map and the new proposed unordered_map from tr1i found that at least while parsing a file and creating the map, by inserting an element at at time.

地图只需要2分钟,而unordered_map需要5分钟。

map took only 2 minutes while unordered_map took 5 mins.

要在Hadoop集群上执行的代码,并且将包含〜1亿个条目,我需要最小的可能检索时间。

As i it is going to be part of a code to be executed on Hadoop cluster and will contain ~100 million entries, i need smallest possible retrieval time.

另一个有用的信息:
当前数据(键)是从1,2,...到〜10万的整数范围。

Also another helpful information:currently the data (keys) which is being inserted is range of integers from 1,2,... to ~10 million.

我也可以强制用户指定最大值和使用顺序如上,会显着影响我的实现吗? (我听说地图是基于rb树和插入按增加的顺序导致更好的性能(或最坏的?))

I can also impose user to specify max value and to use order as above, will that significantly effect my implementation? (i heard map is based on rb trees and inserting in increasing order leads to better performance (or worst?) )

这里是代码

map<int,int> Label // this is being changed to unordered_map  
fstream LabelFile("Labels.txt");  


// Creating the map from the Label.txt  
if (LabelFile.is_open())  
{  
    while (! LabelFile.eof() )  
    {             
        getline (LabelFile,inputLine);  
        try  
        {  
            curnode=inputLine.substr(0,inputLine.find_first_of("\t"));  
            nodelabel=inputLine.substr(inputLine.find_first_of("\t")+1,inputLine.size()-1);  
            Label[atoi(curnode.c_str())]=atoi(nodelabel.c_str());  
        }  
        catch(char* strerr)  
        {  
            failed=true;  
            break;  
        }  
    }  
    LabelFile.close(); 
}



暂时解决方案:在评论和回答之后,我相信动态C ++数组将是最好的选择,因为实现将使用密钥。感谢

Tentative Solution: After review of comments and answers, i believe a Dynamic C++ array would be the best option, since the implementation will use dense keys. Thanks

推荐答案

unordered_map的插入应为 O(1) reterival应该大致 O(1),(它本质上是一个哈希表)。

Insertion for unordered_map should be O(1) and reterival should be roughly O(1), (its essentially a hash-table).

您的时间结果是关闭,或者您的实施或使用unordered_map有错误

Your timings as a result are way OFF, or there is something WRONG with your implementation or usage of unordered_map.

您需要提供一些更多信息,以及您可能如何使用该容器。

You need to provide some more information, and possibly how you are using the container.

6.3的n1836插入/恢复的复杂性给出:

As per section 6.3 of n1836 the complexities for insertion/retreival are given:



  • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf

您应该考虑的一个问题是您的实现可能需要不断地 重新散列 >结构,因为你说你有 100mil +项目。在这种情况下,当实例化容器时,如果你有一个粗略的想法,将有多少unique元素将被插入到容器中,你可以将它作为参数传递给构造函数,容器将相应地用适当大小的桶表实例化。

One issue you should consider is that your implementation may need to continually be rehashing the structure, as you say you have 100mil+ items. In that case when instantiating the container, if you have a rough idea about how many "unique" elements will be inserted into the container, you can pass that in as a parameter to the constructor and the container will be instantiated accordingly with a bucket-table of appropriate size.

这篇关于map和unordered_map在c ++中的性能差异的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-27 11:25