Logical OR vs. bitwise OR in a pandas DataFrame

Question

I am trying to use a Boolean mask to get matching rows from two different DataFrames.

Using the logical OR operator:

x = df[(df['A'].isin(df2['B']))
      or df['A'].isin(df2['C'])]

Output:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

However, with the bitwise OR operator the results are returned successfully.

x = df[(df['A'].isin(df2['B']))
      | df['A'].isin(df2['C'])]

Output: x

Is there a difference between the two, and is bitwise OR the best option here? Why doesn't the logical OR work?

Recommended answer

As far as I have come to understand this issue (coming from a C++ background and currently learning Python for data science), I came across several posts suggesting that the bitwise operators (&, |) can be overloaded in classes, just as in C++.
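To make the overloading idea concrete, here is a minimal sketch in plain Python. The class name BoolColumn is hypothetical and exists only for illustration; it is not how pandas is implemented, it merely shows that a class can decide what | and & mean.

```python
class BoolColumn:
    """Hypothetical toy class: a list of booleans with element-wise | and &."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __or__(self, other):
        # Python calls this method for `self | other`.
        return BoolColumn(a or b for a, b in zip(self.values, other.values))

    def __and__(self, other):
        # Python calls this method for `self & other`.
        return BoolColumn(a and b for a, b in zip(self.values, other.values))

    def __repr__(self):
        return f"BoolColumn({self.values})"


left = BoolColumn([True, False, False])
right = BoolColumn([False, False, True])

either = left | right   # dispatches to BoolColumn.__or__
both = left & right     # dispatches to BoolColumn.__and__

print(either)  # BoolColumn([True, False, True])
print(both)    # BoolColumn([False, False, False])
```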

So basically, when you use such bitwise operators on numbers, they compare the bits and give you the result. So, for instance, if you have the following:

What Python will actually do is compare the bits of these numbers:

The result will be:

00000011 (because 0 | 0 is False, ergo 0; and 0 | 1 is True, ergo 1)

As an integer: 3
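The bit-by-bit behaviour can be checked with a short sketch. The operands 1 and 2 are an assumption chosen because they produce the result 00000011 quoted above; any pair of integers works the same way.

```python
# Assumed operands, chosen so that the result matches 00000011 (3).
a = 1
b = 2

result = a | b  # bitwise OR, applied to each bit position independently

print(format(a, "08b"))       # 00000001
print(format(b, "08b"))       # 00000010
print(format(result, "08b"))  # 00000011
print(result)                 # 3
```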

It compares each bit of the numbers and spits out the result of these eight consecutive operations. This is the normal behaviour of these operators.

Enter pandas. Since you can overload these operators, pandas has made use of this. So what the bitwise operators do when it comes to pandas DataFrames is the following:

In this case, pandas will first create a Series of True and False values from the result of the == and != operations. Be careful: you have to put parentheses around each comparison, because in Python the bitwise operators & and | bind more tightly than comparison operators such as == and !=, so without parentheses the bitwise operator is resolved first. Each value in the column is compared against the expression, and the output is either True or False.

Then you have two Series of True and False values of the same length. What it then does is take these two Series and combine them element by element with either "and" (&) or "or" (|), and finally produce one single Series that says, for each row, whether the combined condition is fulfilled.
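The two steps described here, building one Boolean Series per comparison and then combining them, can be sketched as follows. The DataFrame and the column names "a" and "b" are made up for illustration and are not taken from the original answer.

```python
import pandas as pd

# Hypothetical data; the column names "a" and "b" are illustrative only.
df = pd.DataFrame({"a": [1, 1, 2, 3], "b": [2, 5, 2, 7]})

# Step 1: each comparison yields a Boolean Series of the same length as df.
mask_a = df["a"] == 1   # True, True, False, False
mask_b = df["b"] != 2   # False, True, False, True

# Step 2: & and | combine the two Series element by element.
both = mask_a & mask_b     # False, True, False, False
either = mask_a | mask_b   # True, True, False, True

# Written inline, each comparison needs its own parentheses,
# because & and | bind more tightly than == and !=.
rows_both = df[(df["a"] == 1) & (df["b"] != 2)]
rows_either = df[(df["a"] == 1) | (df["b"] != 2)]

print(rows_both)
print(rows_either)
```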

To go even further, what I think is happening under the hood is that the & operator actually calls a method that pandas defines on the Series (Python's __and__ hook) and hands it both previously evaluated operations, that is, the two Series to the left and right of the operator. pandas then compares the values pairwise, returning True or False for each pair according to its internal logic.

This is basically the same principle they have used for all the other operators as well (>, <, >=, <=, ==, !=).
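A small check of this idea, as far as it can be observed from the outside: the operator form and the explicit method call give the same result. The sample values are invented, and this only illustrates the dispatch, not the internals of pandas.

```python
import pandas as pd

left = pd.Series([True, False, True])
right = pd.Series([True, True, False])

# `left & right` is dispatched to the Series' __and__ method.
via_operator = left & right
via_method = left.__and__(right)
same_and = via_operator.equals(via_method)

# Comparison operators follow the same pattern; Series.eq and Series.gt
# are the named method equivalents of == and >.
numbers = pd.Series([1, 2, 3])
same_eq = (numbers == 2).equals(numbers.eq(2))
same_gt = (numbers > 1).equals(numbers.gt(1))

print(same_and, same_eq, same_gt)  # True True True
```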

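Why go through the struggle of using a different & expression when you have the nice and neat "and"? Well, that seems to be because "and" and "or" are hard-coded in the language and cannot be overloaded. To evaluate x or y, Python first asks x for a single truth value, and a Series with several elements refuses to give one, which is exactly the ValueError shown in the question.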
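Applied to the code from the question, the difference can be reproduced with a short sketch. The contents of df and df2 are invented here, since the question does not show its data.

```python
import pandas as pd

# Invented sample data; only the column names A, B and C come from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4]})
df2 = pd.DataFrame({"B": [1, 9], "C": [4, 9]})

in_b = df["A"].isin(df2["B"])  # True, False, False, False
in_c = df["A"].isin(df2["C"])  # False, False, False, True

# `or` needs one truth value for the whole left-hand Series, so it raises.
try:
    x = df[in_b or in_c]
    or_error = None
except ValueError as exc:
    or_error = str(exc)

print(or_error)

# `|` is overloaded by pandas and works element by element.
x = df[in_b | in_c]
print(x)
```

For row-wise conditions on a DataFrame, | and & are therefore the operators to use, and the logical or and and should be kept for single Boolean values.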

Hope that helps!

