How to subtract a Series from a DataFrame with a matching index

Problem description

I have a DataFrame with a number of columns, and a Series. Both have the same DatetimeIndex.

I would like to subtract each row's value in the Series from all the values in each row of the DataFrame.

Here is my sample data:

import numpy
import pandas

dates   = pandas.date_range('20180101', periods=10)
stocks  = ['AAPL', 'GOOG', 'MSFT', 'AMZN', 'FB']
data    = numpy.random.randn(10,5)               # random stand-in for prices
prices  = pandas.DataFrame(index=dates, columns=stocks, data=data)
returns = prices.pct_change(1)                   # first row is NaN

This gives me a DataFrame similar to the following

I then create my Series, which is the return of the basket of stocks

basket = returns.mean(axis=1)

This gives me a Series similar to the following

Now I want to subtract the basket returns from each stock's returns:

excess_ret = returns - basket

I receive the following warning:

RuntimeWarning: Cannot compare type 'Timestamp' with type 'str', sort order is
undefined for incomparable objects
  return this.join(other, how=how, return_indexers=return_indexers)

This is the resulting DataFrame:

This used to work in pandas-0.16.2, but I am now using pandas-0.22.0, and it seems I am unable to subtract a Series from a DataFrame with matching Indexes now?

Questions:

  • What is happening in this subtraction operation I am currently performing?
  • How can I subtract each row's value in the Series from all the values in each row in the DataFrame?

Recommended answer

I think you need sub with the parameter axis=0 to match the index of the DataFrame against the index of the Series:

As the documentation of the axis parameter puts it: "For Series input, axis to match Series index on."


excess_ret = returns.sub(basket, axis=0)
print (excess_ret)
                AAPL      GOOG      MSFT      AMZN        FB
2018-01-01       NaN       NaN       NaN       NaN       NaN
2018-01-02 -1.833226 -0.110935  0.455586 -0.173553  1.662127
2018-01-03 -0.662713  1.737714 -1.295243  1.381853 -1.161611
2018-01-04  3.269817 -0.824819  0.377973 -0.788368 -2.034604
2018-01-05 -0.082528  1.814466  2.295359 -3.543489 -0.483808
2018-01-06  0.295950  2.978380  1.000856  1.346977 -5.622164
2018-01-07  1.988864 -2.316191  0.633370  1.043901 -1.349943
2018-01-08 -2.640122 -0.861669 -1.472634 -1.559951  6.534376
2018-01-09  8.062484 -1.712583 -2.497513 -0.807566 -3.044822
2018-01-10 -1.823915  0.370618 -0.883559  0.888679  1.448177
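
Because the sample data above is random, the numbers are hard to check by eye. Here is a minimal sketch of the same axis=0 subtraction on a tiny hand-made frame. The toy values and the name alt are my own assumptions for illustration, not part of the question's data.

```python
import pandas as pd

# Hypothetical toy data (not the question's random prices): 3 dates, 2 tickers.
dates = pd.date_range("2018-01-01", periods=3)
returns = pd.DataFrame(
    {"AAPL": [1.0, 2.0, 3.0], "GOOG": [3.0, 6.0, 9.0]},
    index=dates,
)

# Row-wise mean, indexed by the same dates as the DataFrame: 2.0, 4.0, 6.0
basket = returns.mean(axis=1)

# axis=0 matches the Series index against the DataFrame's row index,
# so every column has the basket value of its own date subtracted.
excess_ret = returns.sub(basket, axis=0)
print(excess_ret)

# An equivalent route: transpose so the dates become columns,
# subtract with the default alignment, then transpose back.
alt = (returns.T - basket).T
print(alt)
```

Both results should hold -1, -2, -3 in the AAPL column and 1, 2, 3 in the GOOG column. The sub call is usually the clearer choice, since it states the alignment axis explicitly.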

If you want to match by columns:

a = returns.mean(axis=0)
print (a)
AAPL    0.088224
GOOG   -1.301244
MSFT   -2.436290
AMZN   -1.009339
FB     -0.102484
dtype: float64

excess_ret = returns.sub(a, axis=1)
print (excess_ret)
                AAPL      GOOG       MSFT      AMZN        FB
2018-01-01       NaN       NaN        NaN       NaN       NaN
2018-01-02 -1.353102  1.441870   5.759181  0.421661 -0.608508
2018-01-03 -0.434575 -0.969659   0.665239  0.823154  4.917633
2018-01-04  8.771575 -2.722012   0.409977 -2.113780 -1.164615
2018-01-05 -0.220083  0.213942   1.329937 -0.372537  0.037217
2018-01-06 -0.633686  6.371478 -14.157027 -0.831583  1.226992
2018-01-07 -2.363521  0.130848   1.743317 -1.381718 -1.929583
2018-01-08 -3.062185 -6.431137   0.438800  0.956752 -1.641623
2018-01-09 -0.450300  2.093572   2.965726 -0.617335  1.042234
2018-01-10 -0.254123 -0.128903   0.844849  3.115386 -1.879747
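
As for what is happening in the plain returns - basket: as far as I understand pandas' alignment rules, the - operator between a DataFrame and a Series matches the Series index against the DataFrame's columns, which is the same as sub with axis=1. With a basket indexed by dates, no label matches a ticker name, so the result is the union of both label sets filled with NaN. The warning in the question is presumably pandas trying to sort that mixed set of strings and Timestamps. A small sketch, again on hypothetical toy data of my own:

```python
import warnings

import pandas as pd

# Hypothetical toy data (not the question's random prices): 3 dates, 2 tickers.
dates = pd.date_range("2018-01-01", periods=3)
returns = pd.DataFrame(
    {"AAPL": [1.0, 2.0, 3.0], "GOOG": [3.0, 6.0, 9.0]},
    index=dates,
)
basket = returns.mean(axis=1)     # indexed by date
col_means = returns.mean(axis=0)  # indexed by ticker

with warnings.catch_warnings():
    # Mixing string and Timestamp labels may emit a sort-order warning;
    # it is silenced here so the demo output stays clean.
    warnings.simplefilter("ignore")
    # The operator aligns the Series index with the columns (axis=1).
    misaligned = returns - basket

print(misaligned.shape)               # tickers and dates both end up as columns
print(misaligned.isna().all().all())  # no label matched, so nothing was subtracted

# With a Series indexed by ticker, the operator and sub(axis=1) agree.
by_operator = returns - col_means
by_method = returns.sub(col_means, axis=1)
print(by_operator.equals(by_method))
```

So the operator form is fine for a Series indexed by column labels, while a Series indexed by row labels needs sub with axis=0.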
