A good C++ array class for dealing with large arrays of data in a fast and memory efficient way?

Question

Following on from a previous question relating to heap usage restrictions, I'm looking for a good standard C++ class for dealing with big arrays of data in a way that is both memory efficient and speed efficient. I had been allocating the array using a single malloc/HeapAlloc but, after multiple tries using various calls, I keep falling foul of heap fragmentation. So the conclusion I've come to, other than porting to 64 bit, is to use a mechanism that allows me to have a large array spanning multiple smaller memory fragments. I don't want an alloc per element, as that is very memory inefficient, so the plan is to write a class that overloads operator[] and selects the appropriate element based on the index. Is there already a decent class out there to do this, or am I better off rolling my own?

From my understanding, and some googling, a 32 bit Windows process should theoretically be able to address up to 2GB. Now assuming I have 2GB installed, and various other processes and services are hogging about 400MB, how much usable memory do you think my program can reasonably expect to get from the heap?

I'm currently using various flavours of Visual C++.

Edit: As per Poita's post, I've tried a std::deque, using the following test on VS2008:

#include <deque>
using namespace std;
struct V    
{
    double  data[11];
};

struct T
{
    long    data[8];    
};


void    dequeTest()
{
    deque<V> VQ;
    deque<T> TQ;

    V defV;
    T defT;

    VQ.resize(4000000,defV);
    TQ.resize(8000000,defT);
}

The total memory for the above data comes to 608MB, which takes under a second to allocate with a straight malloc or HeapAlloc. The deque resizes initially took 950MB, and then memory use slowly started dropping back. 15 minutes later dequeTest() finished, with the process showing just 6MB of memory in use, which probably had more to do with the run-times. I also tried populating the deque using various push options, but performance was so bad that I had to break out early. I could possibly provide a better allocator than the default to get a much better response, but on the face of it deque is not the class for this job. Note this could also relate to the MS VS2008 implementation of deque, as there seems to be a lot in this class that is very implementation dependent when it comes to performance.
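As a quick check of the 608MB figure, the sketch below recomputes the payload size of the two containers. It assumes 8-byte doubles and 4-byte longs, as on Visual C++; the struct T32 and the function payload_bytes are illustrative names, and std::int32_t stands in for long so that the result is the same on platforms where long is 8 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Same layout as V in the test above: 11 doubles, 88 bytes with 8-byte doubles.
struct V {
    double data[11];
};

// Stand-in for T in the test above (hypothetical name). std::int32_t is used
// instead of long because long is 4 bytes on Visual C++ but 8 on many others.
struct T32 {
    std::int32_t data[8];
};

// Raw payload in bytes for 4,000,000 V elements plus 8,000,000 T32 elements,
// ignoring any per-block or per-container overhead.
unsigned long long payload_bytes() {
    const unsigned long long v_count = 4000000ULL;
    const unsigned long long t_count = 8000000ULL;
    return v_count * sizeof(V) + t_count * sizeof(T32);
}
```

Under those assumptions the result is 352,000,000 + 256,000,000 = 608,000,000 bytes, so the 608MB quoted above is in decimal megabytes (roughly 580MiB).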

Time to write my own big array class, I reckon.

Second edit: Allocating smaller amounts yielded 1.875GB immediately, using the following:

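#define TenMB (1024*1024*10)    // parenthesised so the macro expands safely inside any expression

// Relies on MFC (CString, AfxMessageBox, LPVOID). The blocks are never freed
// here; the test only probes how much memory can be obtained in 10MB pieces.
void    SmallerAllocs()
{
    size_t Total = 0;
    LPVOID  p[200];
    for (int i = 0; i < 200; i++)
    {
        p[i] = malloc(TenMB);
        if (!p[i])
            break;              // stop at the first failed allocation
        Total += TenMB;
    }
    CString Msg;
    Msg.Format("Allocated %0.3lfGB",Total/(1024.0*1024.0*1024.0));
    AfxMessageBox(Msg,MB_OK);
}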

Final edit: I have decided to accept Poita's post and the various comments following it, not because I'll be using the deque class directly, but more for the notion of the array as a deck of cards that came up in the comments that followed. This should be straightforward to implement with O(1) random element access, based on a fixed number of elements per block, which is what I need. Thanks to all for the feedback!
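The fixed-elements-per-block idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the class the poster ended up writing: the name ChunkedArray, the default block size of 65536 elements, and the use of one std::vector per block are all hypothetical choices, and the code assumes a C++11 compiler (newer than VS2008).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a fixed-size array stored as many equally sized blocks,
// so that no single allocation needs one huge contiguous address range.
template <typename T, std::size_t BlockElems = 65536>
class ChunkedArray {
public:
    explicit ChunkedArray(std::size_t count, const T& init = T())
        : size_(count) {
        const std::size_t blocks = (count + BlockElems - 1) / BlockElems;
        blocks_.reserve(blocks);
        for (std::size_t b = 0; b < blocks; ++b) {
            // Every block holds BlockElems elements except possibly the last.
            const std::size_t n =
                (b + 1 == blocks) ? count - b * BlockElems : BlockElems;
            blocks_.emplace_back(n, init);  // one separate heap allocation per block
        }
    }

    // O(1) access: one division selects the block, one remainder the slot.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return blocks_[i / BlockElems][i % BlockElems];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return blocks_[i / BlockElems][i % BlockElems];
    }

    std::size_t size() const { return size_; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t size_;
    std::vector<std::vector<T> > blocks_;
};
```

Something like ChunkedArray<V> values(4000000); would then request about 62 blocks of roughly 5.5MB each rather than one 352MB block. Choosing a power of two for BlockElems lets the compiler turn the division and remainder into a shift and a mask.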

Recommended answer

Have you tried using a std::deque? Unlike a std::vector, which uses one huge heap allocation, a deque usually allocates in small chunks, but still provides constant-time indexing via operator[].
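A minimal sketch of what that looks like in use, assuming a record type similar to the one in the question; Sample and make_samples are illustrative names, not part of any library. How the storage is actually chunked is left to the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical record type, similar in size to V in the question.
struct Sample {
    double values[11];
};

// Builds a deque of `count` value-initialised samples and writes each
// element's index into its first field, using operator[] throughout.
std::deque<Sample> make_samples(std::size_t count) {
    std::deque<Sample> samples(count);  // elements start zeroed
    for (std::size_t i = 0; i < samples.size(); ++i) {
        samples[i].values[0] = static_cast<double>(i);
    }
    return samples;
}
```

Indexing reads exactly as it would for a std::vector, so switching container types mostly means changing the declaration.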


10-24 16:41