斯特恩双原子序列的首次出现

本文介绍了斯特恩双原子序列的首次出现的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

您得到一个整数n，并且需要在斯特恩的《双原子序列》中找到其首次出现的索引。

You get an integer n and you need to find the index of its first appearance in Stern's Diatomic Sequence.

序列的定义如下：

a[0]     = 0
a[1]     = 1
a[2*i]   = a[i]
a[2*i+1] = a[i] + a[i+1]

请参见。

因为n可以高达400000，所以强行使用它不是一个好主意，特别是因为时间限制为4000 ms。

Because n can be up to 400000, it's not a good idea to brute-force it, especially since the time limit is 4000 ms.

序列很奇怪：第一次出现8是21，但是第一次出现6是33。

The sequence is pretty odd: first occurrence of 8 is 21, but first occurrence of 6 is 33.

任何想法如何解决这个问题？

Any ideas how to solve this?

也许这可能有所帮助：

Maybe this might help: OEIS

推荐答案

我们可以轻松地解决第一个出现在400000以下的数字秒：

We can easily solve for the first occurrence of a number in the range of 400000 in under four seconds:

Prelude Diatomic> firstDiatomic 400000
363490989
(0.03 secs, 26265328 bytes)
Prelude Diatomic> map firstDiatomic [400000 .. 400100]
[363490989,323659475,580472163,362981813,349334091,355685483,346478235,355707595
,291165867,346344083,347155797,316314293,576398643,315265835,313171245,355183267
,315444051,315970205,575509833,311741035,340569429,313223987,565355925,296441165
,361911645,312104147,557145429,317106853,323637939,324425077,610613547,311579309
,316037811,311744107,342436533,348992869,313382235,325406123,355818699,312128723
,347230875,324752171,313178421,312841811,313215645,321754459,576114987,325793195
,313148763,558545581,355294101,359224397,345462093,307583675,355677549,312120731
,341404245,316298389,581506779,345401947,312109779,316315061,315987123,313447771
,361540179,313878107,304788843,325765547,316036275,313731751,355635795,312035947
,346756533,313873883,349358379,357393763,559244877,313317739,325364139,312128107
,580201947,358182323,314944173,357403987,584291115,312158827,347448723,363246413
,315935571,349386085,315929427,312137323,357247725,313207657,320121429,356954923
,557139285,296392013,576042123,311726765,296408397]
(2.45 secs, 3201358192 bytes)

其关键是Calkin-Wilf树。

The key to it is the Calkin-Wilf tree.

从分数<$开始c $ c> 1/1 ，它是根据以下规则构建的：对于分数 a / b 的节点，其左子节点携带分数 a /（a + b）及其右子元素分数（a + b）/ b 。

Starting from the fraction 1/1, it is built by the rule that for a node with the fraction a/b, its left child carries the fraction a/(a+b), and its right child the fraction (a+b)/b.

                        1/1
                       /   \
                      /     \
                     /       \
                  1/2         2/1
                 /   \       /   \
               1/3   3/2   2/3   3/1

等。双原子序列（从索引1开始）是Calkin-Wilf树中各部分的分子序列，当逐级遍历时，从左到右。

etc. The diatomic sequence (starting at index 1) is the sequence of numerators of the fractions in the Calkin-Wilf tree, when that is traversed level by level, each level from left to right.

如果我们看一下索引树

                         1
                        / \
                       /   \
                      /     \
                     2       3
                    / \     / \
                   4   5   6   7
                  / \ 
                 8   9 ...

我们可以轻松地验证索引为 k 的节点在Calkin-Wilf树中，归纳法包含分数 a [k] / a [k + 1] 。

we can easily verify that the node at index k in the Calkin-Wilf tree carries the fraction a[k]/a[k+1] by induction.

对于 k = 1 （ a [1] = a [2] = 1 ）显然是正确的，并且从那时起，

That is obviously true for k = 1 (a[1] = a[2] = 1), and from then on,

对于 k = 2 * j 索引为 j 的节点的左子节点，因此分数为 a [j] /（a [j] + a [j + 1]）和 a [k] = a [j] 和 a [k + 1] = a [j] + a [j + 1] 是序列的定义方程。

for k = 2*j we have the left child of the node with index j, so the fraction is a[j]/(a[j]+a[j+1]) and a[k] = a[j] and a[k+1] = a[j] + a[j+1] are the defining equations of the sequence.

对于 k = 2 * j + 1 ，我们有索引为 j的节点的右子节点，因此分数为（a [j] + a [j + 1]）/ a [j + 1] ，即<$再次通过定义方程式c $ c> a [k] / a [k + 1] 。

for k = 2*j+1 we have the right child of the node with index j, so the fraction is (a[j]+a[j+1])/a[j+1] and that is a[k]/a[k+1] again by the defining equations.

所有正的减少分数在Calkin-Wilf树中出现一次（供读者练习），因此所有正整数都出现在双原子序列中。

All positive reduced fractions occur exactly once in the Calkin-Wilf tree (left as an exercise for the reader), hence all positive integers occur in the diatomic sequence.

通过按照索引的二进制表示从最高有效位到最低位，我们可以从索引中找到Calkin-Wilf树中的节点，对于1位，我们转到正确的子节点，对于0位，位向左。（为此，最好用一个节点 0/1 扩展其Calkin-Wilf树，该节点的右子节点为 1/1 节点，因此我们需要为索引的最高有效位迈出一步。）

We can find the node in the Calkin-Wilf tree from the index by following the binary representation of the index, from the most significant bit to the least, for a 1-bit we go to the right child and for a 0-bit to the left. (For that, it is nice to augment the Calkin-Wilf tree with a node 0/1 whose right child is the 1/1 node, so that we need have a step for the most significant set bit of the index.)

现在，这对解决问题。

但是，让我们首先解决一个相关问题：对于减少的分数 p / q

But, let us first solve a related problem: For a reduced fraction p/q, determine its index.

假设 p> q 。然后我们知道 p / q 是一个正确的孩子，其父是（p-q）/ q 。如果 p-q> q ，我们又有了一个正确的孩子，其父母是（p-2 * q）/ q 。继续，如果

Suppose that p > q. Then we know that p/q is a right child, and its parent is (p-q)/q. If also p-q > q, we have again a right child, whose parent is (p - 2*q)/q. Continuing, if

p = a*q + b, 1 <= b < q

然后我们达到 p / q 从 b / q 节点转到正确的子节点 a 次。

then we reach the p/q node from the b/q node by going to the right child a times.

现在，我们需要找到一个分子小于分母的节点。那当然是其父母的左子女。那么， b / q 的父级就是 b /（q-b）。如果

Now we need to find a node whose numerator is smaller than its denominator. That is of course the left child of its parent. The parent of b/q is b/(q-b) then. If

q = c*b + d, 1 <= d < b

我们必须去左边的孩子 c 从节点 b / d 到达 b / q 的时间。

we have to go to the left child c times from the node b/d to reach b/q.

依此类推。

我们可以从根本上找到方法（ 1/1 ）扩展到 p / q 的 p / q 节点（这里我只考虑简单的连续分数） >。让 p> q 和

We can find the way from the root (1/1) to the p/q node using the continued fraction (I consider only simple continued fractions here) expansion of p/q. Let p > q and

p/q = [a_0, a_1, ..., a_r,1]

p / q 的连续分数扩展以 1 结尾。

the continued fraction expansion of p/q ending in 1.

如果 r 是偶数，然后转到右边的子 a_r 次，然后转到左边的 a_（r-1）次，然后是右边的孩子...然后 a_1 次是左边孩子，最后 a_0 次

如果 r 是奇数，则首先转到左侧的孩子 a_r 次，然后是向右的 a_（r-1）次...然后是 a_1 次左边的孩子，最后右边的 a_0 次。

If r is even, then go to the right child a_r times, then to the left a_(r-1) times, then to the right child ... then a_1 times to the left child, and finally a_0 times to the right.
If r is odd, then first go to the left child a_r times, then a_(r-1) times to the right ... then a_1 times to the left child, and finally a_0 times to the right.

对于 p< q ，我们必须结束向左移动，因此对于偶数 r 开始向左移动，对于奇数<$ c开始向右移动$ c> r 。

For p < q, we must end going to the left, hence start going to the left for even r and start going to the right for odd r.

因此，我们发现索引的二进制表示与所载分数的连续分数扩展之间的密切联系。

We have thus found a close connection between the binary representation of the index and the continued fraction expansion of the fraction carried by the node via the path from the root to the node.

让索引的运行长度编码 k be

Let the run-length-encoding of the index k be

[c_1, c_2, ..., c_j]           (all c_i > 0)

ie k 的二进制表示形式以 c_1 开头，后跟 c_2 零，然后为 c_3 等，并以 c_j

i.e. the binary representation of k starts with c_1 ones, followed by c_2 zeros, then c_3 ones etc., and ending with c_j

一个，如果 k 是奇数-因此 j 也是奇数;

归零，如果 k 是偶数-因此 j 也是偶数。

ones, if k is odd - hence j is also odd;
zeros, if k is even - hence j is also even.

然后 [c_j，c_（j-1），...，c_2，c_1] 是 a [k] / a [k + 1] 的连续小数展开，其长度与 k相同（每个有理数都具有两个连续的分数展开，一个具有奇数长度，另一个具有偶数长度）。

Then [c_j, c_(j-1), ..., c_2, c_1] is the continued fraction expansion of a[k]/a[k+1] whose length has the same parity as k (every rational has exactly two continued fraction expansions, one with odd length, the other with even length).

RLE给出了从 1/1 上方的 0/1 节点到 a [k] / a [k +1] 。路径的长度是

The RLE gives the path from the 0/1 node above 1/1 to a[k]/a[k+1]. The length of the path is

表示 k 所需的位数，

连续分数扩展中部分商的总和。

the number of bits necessary to represent k, and
the sum of the partial quotients in the continued fraction expansion.

现在，到找到 n>第一次出现的索引。 0 在双原子序列中，我们首先观察到最小索引必须为奇数，因为 a [k] = a [k / 2] 甚至 k 。令最小索引为 k = 2 * j + 1 。然后

Now, to find the index of the first occurrence of n > 0 in the diatomic sequence, we first observe that the smallest index must necessarily be odd, since a[k] = a[k/2] for even k. Let the smallest index be k = 2*j+1. Then

RLE的长度 k 是奇数的，

索引为 k 的节点上的分数为 a [2 * j + 1] / a [2 * j + 2] =（a [j] + a [j + 1]）/ a [j + 1] ，因此它是一个正确的孩子。

the length of the RLE of k is odd,
the fraction at the node with index k is a[2*j+1]/a[2*j+2] = (a[j] + a[j+1])/a[j+1], hence it is a right child.

因此具有 a [k] = n 的最小索引 k 到分子为 n 的节点的所有最短路径的最左端。

So the smallest index k with a[k] = n corresponds to the left-most ending of all the shortest paths to a node with numerator n.

最短路径对应到 n / m 的连续分数扩展，其中 0< m< = n 是 n 的互质数（必须减少小数），且部分商之和最小。

The shortest paths correspond to the continued fraction expansions of n/m, where 0 < m <= n is coprime to n [the fraction must be reduced] with the smallest sum of the partial quotients.

我们需要期望什么样的长度？给定连续分数 p / q = [a_0，a_1，...，a_r] 与 a_0> 0 和总和

What kind of length do we need to expect? Given a continued fraction p/q = [a_0, a_1, ..., a_r] with a_0 > 0 and sum

s = a_0 + ... + a_r

分子 p 受 F（s +1）和分母 q 除以 F code>，其中 F（j）是第 j 个斐波那契数。边界是尖锐的，对于 a_0 = a_1 = ... = a_r = 1 分数是 F（s + 1）/ F（s）。

the numerator p is bounded by F(s+1) and the denominator q by F(s), where F(j) is the j-th Fibonacci number. The bounds are sharp, for a_0 = a_1 = ... = a_r = 1 the fraction is F(s+1)/F(s).

因此，如果 F（t）< n< = F（t + 1），连续分数扩展（两个中的任意一个）的部分商之和是> = t 。通常会有一个 m ，使得 n / m 的连续分数扩展的部分商之和正好 t ，但不总是这样：

So if F(t) < n <= F(t+1), the sum of the partial quotients of the continued fraction expansion (either of the two) is >= t. Often there is an m such that the sum of the partial quotients of the continued fraction expansion of n/m is exactly t, but not always:

F(5) = 5 < 6 <= F(6) = 8

且两个缩减分数的连续分数扩展 6 / m 和 0< m< = 6 是

and the continued fraction expansions of the two reduced fractions 6/m with 0 < m <= 6 are

6/1 = [6]          (alternatively [5,1])
6/5 = [1,4,1]      (alternatively [1,5])

具有部分商的总和6。但是，部分商的最小总和永远不会大得多（我知道的最大商数是 t + 2 ）。

with sum of the partial quotients 6. However, the smallest possible sum of partial quotients is never much larger (the largest I'm aware of is t+2).

n / m 和 n /（nm）的连续分数扩展密切相关。假设 m< n / 2 ，然后让

The continued fraction expansions of n/m and n/(n-m) are closely related. Let's assume that m < n/2, and let

n/m = [a_0, a_1, ..., a_r]

然后 a_0> = 2 ，

(n-m)/m = [a_0 - 1, a_1, ..., a_r]

and since

n/(n-m) = 1 + m/(n-m) = 1 + 1/((n-m)/m)

n /（nm）的连续分数扩展是

n/(n-m) = [1, a_0 - 1, a_1, ..., a_r]

特别是，这两者的部分商之和是相同的。

In particular, the sum of the partial quotients is the same for both.

不幸的是，我不知道找到的方法m 具有最小部分商且没有蛮力，因此算法为（我假设 n> 2

Unfortunately, I'm not aware of a way to find the m with the smallest sum of partial quotients without brute force, so the algorithm is (I assume n > 2

 
  对于 0< m< n / 2 互质素为 n ，找到 n / m 的连续小数展开式，收集部分商之和最小的值（即u sual算法产生扩展，其最后部分商为> 1 ，我们假设是）。

for 0 < m < n/2 coprime to n, find the continued fraction expansion of n/m, collecting the ones with the smallest sum of the partial quotients (the usual algorithm produces expansions whose last partial quotient is > 1, we assume that).

按以下方式调整找到的连续分数扩展[数量不大] ：

Adjust the found continued fraction expansions [those are not large in number] it the following way:

 
 如果CF  [a_0，a_1，...，a_r] 具有偶数长度，请将其转换为 [a_0，a_1，...，a_（r-1），a_r-1，1]  
 
 否则，请使用 [1，a_0-1，a_1，...，a_（r-1），a_r-1，1]

if the CF [a_0, a_1, ..., a_r] has even length, convert it to [a_0, a_1, ..., a_(r-1), a_r - 1, 1]
otherwise, use [1, a_0 - 1, a_1, ..., a_(r-1), a_r - 1, 1]

（选择 n / m 和 n /（ nm）导致较小的索引）

(that chooses the one between n/m and n/(n-m) leading to the smaller index)

反转连续的分数以获得相应索引的游程长度编码

reverse the continued fractions to obtain the run-length-encodings of the corresponding indices

选择其中最小的一个。

在第1步中，使用到目前为止找到的最小的总和可以用作捷径。

In step 1, it is useful to use the smallest sum found so far to short-cut.

代码（Haskell，因为这是最简单的）：

Code (Haskell, since that's easiest):

module Diatomic (diatomic, firstDiatomic, fuscs) where

import Data.List

strip :: Int -> Int -> Int
strip p = go
  where
    go n = case n `quotRem` p of
             (q,r) | r == 0    -> go q
                   | otherwise -> n

primeFactors :: Int -> [Int]
primeFactors n
    | n < 1             = error "primeFactors: non-positive argument"
    | n == 1            = []
    | n `rem` 2 == 0    = 2 : go (strip 2 (n `quot` 2)) 3
    | otherwise         = go n 3
      where
        go 1 _ = []
        go m p
            | m < p*p   = [m]
            | r == 0    = p : go (strip p q) (p+2)
            | otherwise = go m (p+2)
              where
                (q,r) = m `quotRem` p

contFracLim :: Int -> Int -> Int -> Maybe [Int]
contFracLim = go []
  where
    go acc lim n d = case n `quotRem` d of
                       (a,b) | lim < a -> Nothing
                             | b == 0  -> Just (a:acc)
                             | otherwise -> go (a:acc) (lim - a) d b

fixUpCF :: [Int] -> [Int]
fixUpCF [a]
    | a < 3     = [a]
    | otherwise = [1,a-2,1]
fixUpCF xs
    | even (length xs) = case xs of
                           (1:_) -> fixEnd xs
                           (a:bs) -> 1 : (a-1) : bs
    | otherwise        = case xs of
                           (1:_) -> xs
                           (a:bs) -> 1 : fixEnd ((a-1):bs)

fixEnd :: [Int] -> [Int]
fixEnd [a,1] = [a+1]
fixEnd [a] = [a-1,1]
fixEnd (a:bs) = a : fixEnd bs
fixEnd _ = error "Shouldn't have called fixEnd with an empty list"

cfCompare :: [Int] -> [Int] -> Ordering
cfCompare (a:bs) (c:ds) = case compare a c of
                            EQ -> cfCompare ds bs
                            cp -> cp

fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

toNumber :: [Int] -> Integer
toNumber = foldl' ((+) . (*2)) 0 . concat . (flip (zipWith replicate) $ cycle [1,0])

fuscs :: Integer -> (Integer, Integer)
fuscs 0 = (0,1)
fuscs 1 = (1,1)
fuscs n = case n `quotRem` 2 of
            (q,r) -> let (a,b) = fuscs q
                     in if r == 0
                          then (a,a+b)
                          else (a+b,b)
diatomic :: Integer -> Integer
diatomic = fst . fuscs

firstDiatomic :: Int -> Integer
firstDiatomic n
    | n < 0     = error "Diatomic sequence has no negative terms"
    | n < 2     = fromIntegral n
    | n == 2    = 3
    | otherwise = toNumber $ bestCF n

bestCF :: Int -> [Int]
bestCF n = check [] estimate start
  where
    pfs = primeFactors n
    (step,ops) = case pfs of
                   (2:xs) -> (2,xs)
                   _      -> (1,pfs)
    start0 = (n-1) `quot` 2
    start | even n && even start0 = start0 - 1
          | otherwise             = start0
    eligible k = all ((/= 0) . (k `rem`)) ops
    estimate = length (takeWhile (<= fromIntegral n) fibs) + 2
    check candidates lim k
        | k < 1 || n `quot` k >= lim = if null candidates
                                         then check [] (2*lim) start
                                         else minimumBy cfCompare candidates
        | eligible k = case contFracLim lim n k of
                         Nothing -> check candidates lim (k-step)
                         Just cf -> let s = sum cf
                                    in if s < lim
                                         then check [fixUpCF cf] s (k - step)
                                         else check (fixUpCF cf : candidates) lim (k-step)
        | otherwise = check candidates lim (k-step)

                        这篇关于斯特恩双原子序列的首次出现的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！