Problem description
I have two dataframes: one is a simple time series of product prices, and the other is a MultiIndex dataframe holding the yields of three products from two different machinery configurations, for three different types of inputs.
What I want to generate is the red dataframe, where the rows are a time series and the columns are a MultiIndex, with the configuration as the upper level and the three products as the lower level. The value of each cell is the dot product of the price and the yield.
I have a toy example:
import pandas as pd
import numpy as np

# Yields per input type, indexed by (config, product)
yield_data = {"red_delicious": [0.4, 0.4, 0.2, 0.4, 0.45, 0.05],
              "macintosh": [0.6, 0.2, 0.2, 0.61, 0.3, 0.05],
              "fuji": [0.3, 0.3, 0.4, 0.3, 0.35, 0.35],
              "config": ["a"]*3 + ["b"]*3,
              "product": ["juice", "candy", "pulp"]*2}
toy_yield = pd.DataFrame.from_dict(yield_data).set_index(["config", "product"])

# Random daily prices for each product
index = pd.date_range(start="20191201", end="20191210", freq="D")
price_data = {"juice": np.random.randint(6000, 7000, size=len(index)) / 100,
              "candy": np.random.randint(6000, 7000, size=len(index)) / 100,
              "pulp": np.random.randint(6000, 7000, size=len(index)) / 100}
toy_price = pd.DataFrame(data=price_data, index=index)
I would like to do the dot-product operation in a single vectorized step, but I don't know how. So far I have kludged these sorts of operations with awful .apply() calls or looping-type procedures that definitely aren't ideal.
Recommended answer
I believe you just need:
toy_price @ toy_yield.unstack('config')
Output:
            red_delicious           macintosh             fuji
config                  a        b          a        b       a        b
2019-12-01         67.428  60.9110     66.698  64.3121  67.221  67.3640
2019-12-02         62.368  55.8040     61.850  59.0437  62.971  62.8850
2019-12-03         68.226  61.0995     68.702  65.5989  68.602  68.4485
2019-12-04         68.488  61.7440     68.102  65.5111  68.401  68.4710
2019-12-05         65.734  60.0965     65.220  63.6584  64.393  64.7925
2019-12-06         65.638  58.9445     64.476  61.8116  66.061  66.1005
2019-12-07         65.328  58.0940     67.152  63.6116  66.056  65.6460
2019-12-08         67.496  61.0005     66.654  64.3062  67.267  67.4295
2019-12-09         67.940  61.1280     68.708  65.9028  67.820  67.7540
2019-12-10         67.436  60.8665     67.468  64.9579  67.162  67.2265
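The output above has config as the lower column level. If the configuration should be the upper level, as the question describes, the levels can be swapped afterwards. The following is a minimal sketch of that, not part of the original answer: the seeded generator `rng` and the names `wide_yield`, `result` and `by_config` are assumptions introduced here for illustration. It also cross-checks one cell against an explicit dot product.

```python
import numpy as np
import pandas as pd

# Rebuild the toy data from the question.
# Assumption: a seeded generator is used here so that runs are repeatable.
rng = np.random.default_rng(0)
products = ["juice", "candy", "pulp"]

yield_data = {
    "red_delicious": [0.4, 0.4, 0.2, 0.4, 0.45, 0.05],
    "macintosh": [0.6, 0.2, 0.2, 0.61, 0.3, 0.05],
    "fuji": [0.3, 0.3, 0.4, 0.3, 0.35, 0.35],
    "config": ["a"] * 3 + ["b"] * 3,
    "product": products * 2,
}
toy_yield = pd.DataFrame(yield_data).set_index(["config", "product"])

index = pd.date_range(start="20191201", end="20191210", freq="D")
toy_price = pd.DataFrame(
    {p: rng.integers(6000, 7000, size=len(index)) / 100 for p in products},
    index=index,
)

# unstack('config') leaves 'product' as the row index, so its labels match
# the columns of toy_price; @ aligns on those labels before multiplying.
wide_yield = toy_yield.unstack("config")
result = toy_price @ wide_yield

# Move 'config' to the upper column level and group the columns by it.
by_config = result.swaplevel(axis=1).sort_index(axis=1)

# Cross-check one cell against an explicit sum of price * yield.
day = index[0]
expected = sum(
    toy_price.loc[day, p] * toy_yield.loc[("a", p), "fuji"] for p in products
)
assert np.isclose(by_config.loc[day, ("a", "fuji")], expected)
```

Because the matrix product aligns on labels, the column order of toy_price does not have to match the row order of the unstacked yields; only the set of product names has to agree.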