在一个阵列/最常见的元素寻找相对多数，确定性在O（n）时间及O（1）空间？

本文介绍了在一个阵列/最常见的元素寻找相对多数，确定性在O（n）时间及O（1）空间？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

因此，例如，对于在阵列的答案：

So for example, the answer for the array:

1，11，3，95，23，8，1

1, 11, 3, 95, 23, 8, 1

是1，因为所有其他元素，而1出现了两次才发生一次。

would be 1, since all the other elements only occur once while 1 occurs twice.

很多类似这样的问题，这些问题是我见过的计算器要求找绝对多数或（答案出现至少n / 2的长度为n的数组），或使用排序回答这个问题一个哈希表。前者是不是我问的，而后者是要么太慢（为O（n log n）的进行排序）或（对于一个哈希表为O（n））使用了太多的内存。

A lot of the questions similar to this question that I've seen on stackoverflow ask to find the absolute majority (the answer occurs at least n/2 in an array of length n), or answer the question using sorting or a hash table. The former is not what I'm asking, and the latter is either too slow ( O(n log n) for sorting ) or uses too much memory ( O(n) for a hash table ).

有没有这样的算法存在吗？如果没有，是否有证据说明为什么这是不可能的？包括源就好了。

Does such an algorithm exist? If not, is there a proof showing why it's impossible? Including a source would be nice.

推荐答案

这是不是一个完整的答案，但它应该有助于解释为什么这个问题是很难的一些情况。

This is not a complete answer, but it should help shed some light on why this problem is difficult.

考虑，我们要设计一个算法，但这一次扫描在阵列（在某些顺序）来找到最常见的元素。在我们的算法的运行，它被允许保留一些数据结构取值。让我们来看看有多少信息必须有在取值，因此，如果我们可以包含它 O（1）内存。

Consider we want to design an algorithm, that does one sweep over the array (in some order) to find the most common element. During the run of our algorithm, it is allowed to keep some data structure S. Let's see how much information there has to be in S, and thus if we can contain it in O(1) memory.

说，我们的算法处理的第一个 K 数组的元素。现在取值可以告诉我们在系列中最常见的元素 A [0..k] 。不过，说我们知道 K + 1 ST元素，那么我们也应该知道，在系列中最常见的元素 A [0..k +1] 。如果不能，我们的算法是行不通的，如果 N 是 K + 1 。更一般地，元素赋予知识 A [k..m] 和取值，我们知道中最常见的元素 A [0..m] 。

Say our algorithm has processed the first k elements of the array. Now S can tell us the most common element in the range a[0..k]. However, say we knew the k+1'st element, then we would also know the most common element in the range a[0..k+1]. If it couldn't, our algorithm wouldn't work if n was k+1. More generally, given knowledge of elements a[k..m] and S, we know the most common element in a[0..m].

我们可以用上面的参数提取取值信息。比方说，我们正在与整数范围内的 [0，U] （必须有一定的范围，如果原始数组了空间 O（N） ）。如果原始最常见的元素是 5 ，然后我们添加 0 的，直到最常见的元素变化。如果把 C 零， A [0..k] 必须包含 C 更多 5 比 0 的。重复这样的说法，我们得到了很多的线性方程组，我们可以解决，告诉到底有多少次，每个元素 [0，U] 分别为present在 A [0..k] 。

We can use the above argument to extract information from S. Say we are working with integers in the range [0,u] (there has to be some range if the original array took space O(n)). If the original most common element is 5, then we add 0's until the most common element changes. If that took c zeroes, a[0..k] must have contained c more 5's than 0's. Repeating this argument we get a lot of linear equations which we can solve to tell exactly how many times each of the elements [0,u] were present in a[0..k].

这告诉我们，任何数据结构做了扫描，还不如存储所有可见的元素（在某些COM pressed方式）的数量。如果你有兴趣在数学，存储看到 N 号码后的log（n + U-1选择N）这是日志的方式来划分 N 区分项目数为 U 区分箱。这比日志（U ^ N / N！）＆GT; = nlogu-nlogn

This tells us that any data structure that does a sweep, might as well store the counts of all the seen elements (in some compressed way). If you're interested in the maths, the stored after seeing n numbers is log(n+u-1 choose n) which is the log of the number of ways to partition n indistinguishable items into u distinguishable bins. That's more than log(u^n/n!) >= nlogu-nlogn.

结论：任何算法，做只有一个传递数组都会有，因为它需要存储所有见过这么远的计数使用尽可能多的内存。如果 N 比 U 此相对应的存储小 N 字的内存。

Conclusion: Any algorithm that does only one pass of the array will have to use as much memory as it takes to store all the counts seen so far. If n is small compared to u this corresponds to storing n words of memory.

（嗯，而不是额外的内存，我们也可能会覆盖现有阵列）。

(Well, instead of extra memory we might also overwrite the existing array).

还有很多更在这里探讨。例如。如何多次通过影响上述论点。不过，我想我应该停止在这一点:)，但它似乎并不可能，我认为任何线性时间算法，具有较大的 U ，就能得到废除 O（1）额外的内存。

There's a lot more to explore here. E.g. how multiple passes affect the above arguments. However I think I should stop at this point :), but it doesn't seem likely to me that any linear time algorithm, with a large u, will be able to get away with O(1) extra memory.

这篇关于在一个阵列/最常见的元素寻找相对多数，确定性在O（n）时间及O（1）空间？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！