A vectorized solution for applying a dot product to members of a MultiIndex pandas dataframe

Problem description

I have two dataframes: one is a simple time series of product prices, and the other is a MultiIndex dataframe holding the yields of three products from two different machinery configurations, for three different types of input.

What I want to generate is a dataframe whose rows are a time series and whose columns are a MultiIndex, with the configuration as the upper level and the three products as the lower level. The value of each cell is the dot product of the price and the yield.

I have a toy example:

import numpy as np
import pandas as pd

# Yields per (config, product) pair; each column is one type of input.
yield_data = {"red_delicious": [0.4, 0.4, 0.2, 0.4, 0.45, 0.05],
              "macintosh": [0.6, 0.2, 0.2, 0.61, 0.3, 0.05],
              "fuji": [0.3, 0.3, 0.4, 0.3, 0.35, 0.35],
              "config": ["a"] * 3 + ["b"] * 3,
              "product": ["juice", "candy", "pulp"] * 2}

toy_yield = pd.DataFrame(yield_data).set_index(["config", "product"])

# Random daily prices between 60.00 and 69.99 for each product.
index = pd.date_range(start="20191201", end="20191210", freq="D")
price_data = {"juice": np.random.randint(6000, 7000, size=len(index)) / 100,
              "candy": np.random.randint(6000, 7000, size=len(index)) / 100,
              "pulp": np.random.randint(6000, 7000, size=len(index)) / 100}
toy_price = pd.DataFrame(data=price_data, index=index)

I would like to do the dot-product operation in a single vectorized step, but I don't know how. So far I have kludged this sort of operation with awful .apply() calls or looping-type procedures, which definitely aren't ideal.

Recommended answer

I believe you only need:

toy_price @ toy_yield.unstack('config')

Output:

           red_delicious          macintosh             fuji         
config                 a        b         a        b       a        b
2019-12-01        67.428  60.9110    66.698  64.3121  67.221  67.3640
2019-12-02        62.368  55.8040    61.850  59.0437  62.971  62.8850
2019-12-03        68.226  61.0995    68.702  65.5989  68.602  68.4485
2019-12-04        68.488  61.7440    68.102  65.5111  68.401  68.4710
2019-12-05        65.734  60.0965    65.220  63.6584  64.393  64.7925
2019-12-06        65.638  58.9445    64.476  61.8116  66.061  66.1005
2019-12-07        65.328  58.0940    67.152  63.6116  66.056  65.6460
2019-12-08        67.496  61.0005    66.654  64.3062  67.267  67.4295
2019-12-09        67.940  61.1280    68.708  65.9028  67.820  67.7540
2019-12-10        67.436  60.8665    67.468  64.9579  67.162  67.2265
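
As a rough sketch of why the one-liner works: unstack('config') leaves product as the row index of the yields, so @ can align it by label with the product columns of toy_price before multiplying. The result has the input type on the upper column level; swapping the levels afterwards puts the configuration on top instead. The fixed seed, the names wide_yield and by_config, and the final cross-check are additions for illustration and are not part of the original answer, so the numbers will differ from the output shown above.

```python
import numpy as np
import pandas as pd

# Toy yields: rows are (config, product), columns are the input types.
yield_data = {
    "red_delicious": [0.4, 0.4, 0.2, 0.4, 0.45, 0.05],
    "macintosh": [0.6, 0.2, 0.2, 0.61, 0.3, 0.05],
    "fuji": [0.3, 0.3, 0.4, 0.3, 0.35, 0.35],
    "config": ["a"] * 3 + ["b"] * 3,
    "product": ["juice", "candy", "pulp"] * 2,
}
toy_yield = pd.DataFrame(yield_data).set_index(["config", "product"])

# Toy prices: seeded here (an assumption) so reruns give the same numbers.
rng = np.random.default_rng(0)
index = pd.date_range(start="20191201", end="20191210", freq="D")
toy_price = pd.DataFrame(
    rng.integers(6000, 7000, size=(len(index), 3)) / 100,
    index=index,
    columns=["juice", "candy", "pulp"],
)

# unstack moves 'config' into the columns, leaving 'product' as the row index.
wide_yield = toy_yield.unstack("config")

# @ aligns toy_price's columns with wide_yield's index by label, then multiplies.
result = toy_price @ wide_yield

# Optional: put 'config' on the top column level, as the question asked.
by_config = result.swaplevel(axis=1).sort_index(axis=1)

# Cross-check one cell against an explicit sum over the three products.
day = index[0]
expected = sum(
    toy_price.loc[day, p] * toy_yield.loc[("a", p), "fuji"]
    for p in ["juice", "candy", "pulp"]
)
assert np.isclose(by_config.loc[day, ("a", "fuji")], expected)
```

Because the alignment is done by label, the column order of toy_price does not need to match the row order of the unstacked yields; only the product names have to agree.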

