问题描述
我正在练习算法,我的任务之一是计算给定 0 所有最长递增子序列的数量.n 个数字.解决方案 O(n^2) 不是一个选项.
I'm practicing algorithms and one of my tasks is to count the number of all longest increasing sub-sequences for given 0 < n <= 10^6 numbers. Solution O(n^2) is not an option.
我已经实现了查找 LIS 及其长度(LIS 算法),但是该算法将数字切换到尽可能低的值.因此,不可能确定具有先前数字(较大的数字)的子序列是否能够达到最长的长度,否则我猜我可以只计算那些开关.
I have already implemented finding a LIS and its length (LIS Algorithm), but this algorithm switches numbers to the lowest possible. Therefore, it's impossible to determine if sub-sequences with a previous number (the bigger one) would be able to achieve the longest length, otherwise I could just count those switches, I guess.
关于 O(nlogn) 的任何想法?我知道应该使用动态规划来解决.
Any ideas how to get this in about O(nlogn)? I know that it should be solved using dynamic-programming.
我实现了一个解决方案,效果很好,但它需要两个嵌套循环(i in 1..n) x (j in 1..i-1).
所以它是 O(n^2) 我认为,但它太慢了.
I implemented one solution and it works well, but it requires two nested loops (i in 1..n) x (j in 1..i-1).
So it's O(n^2) I think, nevertheless it's too slow.
我什至尝试将这些数字从数组移动到二叉树(因为在每次 i 迭代中,我都会查找所有较小的数字然后 number[i] - 经历元素i-1..1),但速度更慢.
I tried even to move those numbers from array to a binary tree (because in each i iteration I look for all smaller numbers then number[i] - going through elements i-1..1), but it was even slower.
示例测试:
1 3 2 2 4
result: 3 (1,3,4 | 1,2,4 | 1,2,4)
3 2 1
result: 3 (1 | 2 | 3)
16 5 8 6 1 10 5 2 15 3 2 4 1
result: 3 (5,8,10,15 | 5,6,10,15 | 1,2,3,4)
推荐答案
求所有最长递增子序列的个数
改进的LIS算法的完整Java代码,它不仅发现了最长递增子序列的长度,而且发现了这种长度的子序列的数量,如下所示.我更喜欢使用泛型,不仅允许整数,还允许任何可比较的类型.
Finding the number of all longest increasing subsequences
Full Java code of improved LIS algorithm, which discovers not only the length of longest increasing subsequence, but number of subsequences of such length, is below. I prefer to use generics to allow not only integers, but any comparable types.
@Test
public void testLisNumberAndLength() {
List<Integer> input = Arrays.asList(16, 5, 8, 6, 1, 10, 5, 2, 15, 3, 2, 4, 1);
int[] result = lisNumberAndlength(input);
System.out.println(String.format(
"This sequence has %s longest increasing subsequenses of length %s",
result[0], result[1]
));
}
/**
* Body of improved LIS algorithm
*/
public <T extends Comparable<T>> int[] lisNumberAndLength(List<T> input) {
if (input.size() == 0)
return new int[] {0, 0};
List<List<Sub<T>>> subs = new ArrayList<>();
List<Sub<T>> tails = new ArrayList<>();
for (T e : input) {
int pos = search(tails, new Sub<>(e, 0), false); // row for a new sub to be placed
int sum = 1;
if (pos > 0) {
List<Sub<T>> pRow = subs.get(pos - 1); // previous row
int index = search(pRow, new Sub<T>(e, 0), true); // index of most left element that <= e
if (pRow.get(index).value.compareTo(e) < 0) {
index--;
}
sum = pRow.get(pRow.size() - 1).sum; // sum of tail element in previous row
if (index >= 0) {
sum -= pRow.get(index).sum;
}
}
if (pos >= subs.size()) { // add a new row
List<Sub<T>> row = new ArrayList<>();
row.add(new Sub<>(e, sum));
subs.add(row);
tails.add(new Sub<>(e, 0));
} else { // add sub to existing row
List<Sub<T>> row = subs.get(pos);
Sub<T> tail = row.get(row.size() - 1);
if (tail.value.equals(e)) {
tail.sum += sum;
} else {
row.add(new Sub<>(e, tail.sum + sum));
tails.set(pos, new Sub<>(e, 0));
}
}
}
List<Sub<T>> lastRow = subs.get(subs.size() - 1);
Sub<T> last = lastRow.get(lastRow.size() - 1);
return new int[]{last.sum, subs.size()};
}
/**
* Implementation of binary search in a sorted list
*/
public <T> int search(List<? extends Comparable<T>> a, T v, boolean reversed) {
if (a.size() == 0)
return 0;
int sign = reversed ? -1 : 1;
int right = a.size() - 1;
Comparable<T> vRight = a.get(right);
if (vRight.compareTo(v) * sign < 0)
return right + 1;
int left = 0;
int pos = 0;
Comparable<T> vPos;
Comparable<T> vLeft = a.get(left);
for(;;) {
if (right - left <= 1) {
if (vRight.compareTo(v) * sign >= 0 && vLeft.compareTo(v) * sign < 0)
return right;
else
return left;
}
pos = (left + right) >>> 1;
vPos = a.get(pos);
if (vPos.equals(v)) {
return pos;
} else if (vPos.compareTo(v) * sign > 0) {
right = pos;
vRight = vPos;
} else {
left = pos;
vLeft = vPos;
}
}
}
/**
* Class for 'sub' pairs
*/
public static class Sub<T extends Comparable<T>> implements Comparable<Sub<T>> {
T value;
int sum;
public Sub(T value, int sum) {
this.value = value;
this.sum = sum;
}
@Override public String toString() {
return String.format("(%s, %s)", value, sum);
}
@Override public int compareTo(Sub<T> another) {
return this.value.compareTo(another.value);
}
}
说明
由于我的解释似乎很长,我将初始序列称为seq"及其任何子序列sub".所以任务是计算可以从 seq 中获得的最长递增子的计数.
Explanation
As my explanation seems to be long, I will call initial sequence "seq" and any its subsequence "sub". So the task is to calculate count of longest increasing subs that can be obtained from the seq.
正如我之前提到的,我们的想法是记录在前面的步骤中获得的所有可能最长的子项.因此,让我们创建一个编号的行列表,其中每行的编号等于存储在该行中的 subs 的长度.让我们将 subs 存储为数字对 (v, c),其中v"是 结束元素的值",c"是 结束时给定长度的 subs 的count"通过v".例如:
As I mentioned before, idea is to keep counts of all possible longest subs obtained on previous steps. So let's create a numbered list of rows, where number of each line equals the length of subs stored in this row. And let's store subs as pairs of numbers (v, c), where "v" is "value" of ending element, "c" is "count" of subs of given length that end by "v". For example:
1: (16, 1) // that means that so far we have 1 sub of length 1 which ends by 16.
我们将逐步构建这样的列表,按顺序从初始序列中获取元素.在每一步中,我们都会尝试将此元素添加到可以添加的最长子元素中并记录更改.
We will build such list step by step, taking elements from initial sequence by their order. On every step we will try to add this element to the longest sub that it can be added to and record changes.
让我们使用示例中的序列构建列表,因为它具有所有可能的选项:
Let's build the list using sequence from your example, since it has all possible options:
16 5 8 6 1 10 5 2 15 3 2 4 1
首先,取元素 16.到目前为止,我们的列表是空的,所以我们只放了一对:
First, take element 16. Our list is empty so far, so we just put one pair in it:
1: (16, 1) <= one sub that ends by 16
接下来是5.它不能添加到以 16 结尾的 sub,因此它将创建长度为 1 的新 sub.我们创建一对 (5, 1) 并将其放入第 1 行:
Next is 5. It cannot be added to a sub that ends by 16, so it will create new sub with length of 1. We create a pair (5, 1) and put it into line 1:
1: (16, 1)(5, 1)
元素 8 即将到来.它不能创建长度为 2 的子 [16, 8],但可以创建子 [5, 8].所以,这就是算法来的地方.首先,我们颠倒迭代列表行,查看最后一对的值".如果我们的元素大于所有行中所有最后一个元素的值,那么我们可以将其添加到现有的 sub(s) 中,将其长度增加一.所以值 8 将创建列表的新行,因为它大于迄今为止列表中存在的所有最后一个元素的值(即 > 5):
Element 8 is coming next. It cannot create the sub [16, 8] of length 2, but can create the sub [5, 8]. So, this is where algorithm is coming. First, we iterate the list rows upside down, looking at the "values" of last pair. If our element is greater than values of all last elements in all rows, then we can add it to existing sub(s), increasing its length by one. So value 8 will create new row of the list, because it is greater than values all last elements existing in the list so far (i. e. > 5):
1: (16, 1)(5, 1)
2: (8, ?) <=== need to resolve how many longest subs ending by 8 can be obtained
元素8可以继续5,但不能继续16.所以我们需要搜索上一行,从它的末尾开始,计算value"小于8的counts"的总和:
Element 8 can continue 5, but cannot continue 16. So we need to search through previous row, starting from its end, calculating the sum of "counts" in pairs which "value" is less than 8:
(16, 1)(5, 1)^ // sum = 0
(16, 1)^(5, 1) // sum = 1
^(16, 1)(5, 1) // value 16 >= 8: stop. count = sum = 1, so write 1 in pair next to 8
1: (16, 1)(5, 1)
2: (8, 1) <=== so far we have 1 sub of length 2 which ends by 8.
为什么我们不将值 8 存储到长度为 1(第一行)的 subs 中?因为我们需要最大可能长度的 subs,并且 8 可以继续一些以前的 subs.因此,每个大于 8 的下一个数字也将继续这样的 sub,并且没有必要将 8 作为 sub 的长度小于它可以的长度.
Why don't we store value 8 into subs of length 1 (first line)? Because we need subs of maximum possible length, and 8 can continue some previous subs. So every next number greater than 8 will also continue such sub and there is no need to keep 8 as sub of length less that it can be.
接下来.6.按行中的最后一个值"倒置搜索:
Next. 6. Searching upside down by last "values" in rows:
1: (16, 1)(5, 1) <=== 5 < 6, go next
2: (8, 1)
1: (16, 1)(5, 1)
2: (8, 1 ) <=== 8 >= 6, so 6 should be put here
找到6个房间,需要计算个数:
Found the room for 6, need to calculate a count:
take previous line
(16, 1)(5, 1)^ // sum = 0
(16, 1)^(5, 1) // 5 < 6: sum = 1
^(16, 1)(5, 1) // 16 >= 6: stop, write count = sum = 1
1: (16, 1)(5, 1)
2: (8, 1)(6, 1)
处理后1:
1: (16, 1)(5, 1)(1, 1) <===
2: (8, 1)(6, 1)
处理后10:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)
3: (10, 2) <=== count is 2 because both "values" 8 and 6 from previous row are less than 10, so we summarized their "counts": 1 + 1
处理后5:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1) <===
3: (10, 2)
处理后2:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1) <===
3: (10, 2)
处理后15:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1)
3: (10, 2)
4: (15, 2) <===
处理后3:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 1)
3: (10, 2)(3, 1) <===
4: (15, 2)
处理后2:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 2) <===
3: (10, 2)(3, 1)
4: (15, 2)
如果在按最后一个元素搜索行时,我们找到相等的元素,我们会根据前一行再次计算其count",并添加到现有的count"中.
If when searching rows by last element we find equal element, we calculate its "count" again based on previous row, and add to existing "count".
处理后4:
1: (16, 1)(5, 1)(1, 1)
2: (8, 1)(6, 1)(5, 1)(2, 2)
3: (10, 2)(3, 1)
4: (15, 2)(4, 1) <===
处理后1:
1: (16, 1)(5, 1)(1, 2) <===
2: (8, 1)(6, 1)(5, 1)(2, 2)
3: (10, 2)(3, 1)
4: (15, 2)(4, 1)
那么在处理完所有初始序列之后我们有什么?查看最后一行,我们看到我们有 3 个最长的子元素,每个子元素由 4 个元素组成:2 个以 15 结尾,1 个以 4 结尾.
So what do we have after processing all initial sequence? Looking at the last row, we see that we have 3 longest subs, each consist of 4 elements: 2 end by 15 and 1 ends by 4.
在每次迭代中,当从初始序列中获取下一个元素时,我们会进行 2 次循环:第一次是迭代行以查找下一个元素的空间,第二次是汇总前一行的计数.因此,对于每个元素,我们最多进行 n 次迭代(最坏的情况:如果初始 seq 由按升序排列的元素组成,我们将得到一个 n 行列表,每行有 1 对;如果 seq 已排序按降序排列,我们将获得包含 n 个元素的 1 行列表).顺便说一句,O(n) 复杂度不是我们想要的.
On every iteration, when taking next element from initial sequence, we make 2 loops: first when iterating rows to find room for next element, and second when summarizing counts in previous row. So for every element we make maximum to n iterations (worst cases: if initial seq consists of elements in increasing order, we will get a list of n rows with 1 pair in every row; if seq is sorted in descending order, we will obtain list of 1 row with n elements). By the way, O(n) complexity is not what we want.
首先,这很明显,在每个中间状态中,行都按其最后一个值"的升序排序.所以可以用二分查找代替暴力循环,复杂度为O(log n).
First, this is obvious, that in every intermediate state rows are sorted by increasing order of their last "value". So instead of brute loop, binary searching can be performed, which complexity is O(log n).
其次,我们不需要通过每次循环遍历行元素来汇总 subs 的计数".当新的对被添加到行中时,我们可以在过程中总结它们,例如:
Second, we don't need to summarize "counts" of subs by looping through row elements every time. We can summarize them in process, when new pair is added to the row, like:
1: (16, 1)(5, 2) <=== instead of 1, put 1 + "count" of previous element in the row
所以第二个数字将不显示 count 可以在最后以给定值获得的最长子,而是 所有结束的最长子的 汇总计数通过任何大于或等于该对中值"的元素.
So second number will show not count of longest subs that can be obtained with given value at the end, but summary count of all longest subs that end by any element that is greater or equal to "value" from the pair.
因此,计数"将替换为总和".而不是迭代前一行中的元素,我们只执行二进制搜索(这是可能的,因为任何行中的对总是按它们的值"排序)并将新对的总和"作为前一行中最后一个元素的总和"减去从左边元素到前一行找到位置的sum"加上当前行中前一个元素的sum".
Thus, "counts" will be replaced by "sums". And instead of iterating elements in previous row, we just perform binary search (it is possible because pairs in any row are always ordered by their "values") and take "sum" for new pair as "sum" of last element in previous row minus "sum" from element left to found position in previous row plus "sum" of previous element in the current row.
所以在处理4时:
1: (16, 1)(5, 2)(1, 3)
2: (8, 1)(6, 2)(5, 3)(2, 5)
3: (10, 2)(3, 3)
4: (15, 2) <=== room for (4, ?)
search in row 3 by "values" < 4:
3: (10, 2)^(3, 3)
4 将与 (3-2+2) 配对:(上一行最后一对的sum")-(左上一行的sum"到上一行中找到的位置)+(上一行的sum"对当前行):
4 will be paired with (3-2+2): ("sum" from the last pair of previous row) - ("sum" from pair left to found position in previous row) + ("sum" from previous pair in current row):
4: (15, 2)(4, 3)
在这种情况下,所有最长子项的最终计数是列表最后一行的最后一对的总和",即.e.3,而不是 3 + 2.
In this case, final count of all longest subs is "sum" from the last pair of the last row of the list, i. e. 3, not 3 + 2.
因此,对行搜索和求和搜索执行二分搜索,我们将具有 O(n*log n) 复杂度.
So, performing binary search to both row search and sum search, we will come with O(n*log n) complexity.
内存消耗怎么样,在处理完所有数组后,我们得到最大 n 对,所以动态数组的内存消耗将是 O(n).此外,当使用动态数组或集合时,需要一些额外的时间来分配和调整它们的大小,但大多数操作都是在 O(1) 时间内完成的,因为我们在处理过程中没有进行任何排序和重新排列.所以复杂度估计似乎是最终的.
What about memory consumed, after processing all array we obtain maximum n pairs, so memory consumption in case of dynamic arrays will be O(n). Besides, when using dynamic arrays or collections, some additional time is needed to allocate and resize them, but most operations are made in O(1) time because we don't make any kind of sorting and rearrangement during process. So complexity estimation seems to be final.
这篇关于所有最长递增子序列的数量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!