本文介绍了向量化或优化循环,其中每次迭代都取决于前一次迭代的状态的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个要在python中实现的算法.该算法可能已执行1.000.000次,因此我想尽可能地对其进行优化.该算法的基础是三个列表(energypointvalList)以及两个计数器pe.

两个列表energypoint包含0到1之间的数字,这些数字是我进行决策的依据. p是点计数器,而e是能量计数器.我可以用点数换取Enery,每种能量的成本在valList中定义(取决于时间).反之,我也可以交易.但是我必须一次全部交易.

算法概述:

  1. 获取一个布尔列表,其中energy中的元素高于某个阈值,而point中的元素低于另一个阈值.这是用能量换取积分的决定.获取相应的点列表,从而决定能源的交易点
  2. 在每个布尔列表中.删除所有接在另一个真实值之后的真实值(如果我将所有点都换成能源,则不允许我再重复一点)
  3. 对于两个布尔列表中的每个项对(pB,点布尔值和eB,能量布尔值):如果pB为true并且我有点,我想将我所有的点都换成enery.如果eB是真实的并且我有能量,我想将我所有的能量都兑换成点.

这是我提出的实现:

start = time.time()
import numpy as np

np.random.seed(2) #Seed for deterministic result, just for debugging

topLimit = 0.55
bottomLimit = 0.45

#Generate three random arrays, will not be random in the real world
res = np.random.rand(500,3) #Will probably not be much longer than 500
energy = res[:,0]        
point = res[:,1]
valList = res[:,2]

#Step 1:
#Generate two bools that (for ex. energy) is true when energy is above a threashold
#and point below another threshold). The opposite applies to point
energyListBool = ((energy > topLimit) & (point < bottomLimit))
pointListBool = ((point > topLimit) & (energy < bottomLimit))

#Step 2:
#Remove all 'true' that comes after another true since this is not valid
energyListBool[1:] &= energyListBool[1:] ^ energyListBool[:-1]
pointListBool[1:] &= pointListBool[1:] ^ pointListBool[:-1]

p = 100
e = 0

#Step 3:
#Loop through the lists, if point is true, I loose all p but gain p/valList[i] for e
#If energy is true I loose all e but gain valList[i]*e for p
for i in range(len(energyListBool)):
    if pointListBool[i] and e == 0:
        e = p/valList[i] #Trade all points to energy
        p = 0
    elif energyListBool[i] and p == 0:
        p = valList[i]*e #Trade all enery to points
        e = 0

print('p = {0} (correct for seed 2: 3.1108006690739174)'.format(p))
print('e = {0} (correct for seed 2: 0)'.format(e))

end = time.time()
print(end - start)

我正在努力的是如何(如果可以做到)对for循环进行矢量化处理,因此我可以使用它代替for循环,因为在我看来这可能会更快.

解决方案

在当前问题设置之内,这是不可能的,因为矢量化本质上要求您的第n个计算步骤不应依赖于先前的n-1步骤.但是,有时候,有可能找到重复的f(n) = F(f(n-1), f(n-2), ... f(n-k))的所谓闭合形式",即找到不依赖nf(n)的显式表达式,但这是一个单独的研究问题. /p>

此外,从算法的角度来看,这样的矢量化不会有太大的作用,因为算法的复杂度仍然是C*n = O(n).但是,由于复杂性常数" C实际上很重要,因此有不同的方法来降低它.例如,用C/C ++重写关键循环不是什么大问题.

I have an algorithm which I am implementing in python. The algorithm might be executed 1.000.000 times so I want to optimize it as much as possible. The base in the algorithm is three lists (energy, point and valList) and two counters p and e.

The two lists energy and point contains number between 0 and 1 on which I base decisions. p is a point-counter and e is an energy-counter. I can trade points for enery, and the cost of each energy is defined in valList (it is time dependent). I can also trade the otherway. But I have to trade all at once.

The outline of the algorithm:

  1. Get a boolean-list where the elements in energy is above a threshold and the elements in point is below another threshold. This is a decision to trade energy for points. Get a corresponding list for point, which gives decision to trade points for energy
  2. In each of the boolean-lists. Remove all true-values that comes after another true value (if i have trade all points for energy, i am not allowed to do that again point after)
  3. For each item-pair (pB, point bool and eB, energy bool) from the two boolean lists: If pB is true and i have points, i want to trade all my points for enery. If eB is true and i have energy, i want to trade all my energy to points.

This is the implementation i have come up with:

start = time.time()
import numpy as np

np.random.seed(2) #Seed for deterministic result, just for debugging

topLimit = 0.55
bottomLimit = 0.45

#Generate three random arrays, will not be random in the real world
res = np.random.rand(500,3) #Will probably not be much longer than 500
energy = res[:,0]        
point = res[:,1]
valList = res[:,2]

#Step 1:
#Generate two bools that (for ex. energy) is true when energy is above a threashold
#and point below another threshold). The opposite applies to point
energyListBool = ((energy > topLimit) & (point < bottomLimit))
pointListBool = ((point > topLimit) & (energy < bottomLimit))

#Step 2:
#Remove all 'true' that comes after another true since this is not valid
energyListBool[1:] &= energyListBool[1:] ^ energyListBool[:-1]
pointListBool[1:] &= pointListBool[1:] ^ pointListBool[:-1]

p = 100
e = 0

#Step 3:
#Loop through the lists, if point is true, I loose all p but gain p/valList[i] for e
#If energy is true I loose all e but gain valList[i]*e for p
for i in range(len(energyListBool)):
    if pointListBool[i] and e == 0:
        e = p/valList[i] #Trade all points to energy
        p = 0
    elif energyListBool[i] and p == 0:
        p = valList[i]*e #Trade all enery to points
        e = 0

print('p = {0} (correct for seed 2: 3.1108006690739174)'.format(p))
print('e = {0} (correct for seed 2: 0)'.format(e))

end = time.time()
print(end - start)

What I am struggeling with is how (if it can be done) to vectorize the for-loop, so i can use that instead of the for-loop which in my mind probably would be faster.

解决方案

Within the current problem setting that's not possible since vectorization essentially requires that your n-th computation step shouldn't depend on previous n-1 steps. Sometimes, however, it's possible to find so-called "closed form" of a recurrence f(n) = F(f(n-1), f(n-2), ... f(n-k)), i.e. to find an explicit expression for f(n) that doesn't depend on n, but it's a separate research problem.

Moreover, from algorithmic point of view such a vectorization wouldn't give a lot, since complexity of your algorithm would still be C*n = O(n). However, since "complexity constant" C does matter in practice, there are different ways to reduce it. For example, it shouldn't be a big problem to rewrite your critical loop in C/C++.

这篇关于向量化或优化循环,其中每次迭代都取决于前一次迭代的状态的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-30 08:58