Problem description
I'm porting some code from 32-bit to 64-bit and making sure the answers are the same. In doing so, I noticed that atan2f gives different results between the two.
I created this minimal repro:
#include <stdio.h>
#include <math.h>

void testAtan2fIssue(float A, float B)
{
    float atan2fResult = atan2f(A, B);
    printf("atan2f: %.15f\n", atan2fResult);

    float atan2Result = atan2(A, B);
    printf("atan2: %.15f\n", atan2Result);
}

int main()
{
    float A = 16.323556900024414;
    float B = -5.843180656433105;

    testAtan2fIssue(A, B);
}
When built with:
gcc compilerTest.c -m32 -o 32bit.out -lm
it gives:
atan2f: 1.914544820785522
atan2: 1.914544820785522
When built with:
gcc compilerTest.c -o 64bit.out -lm
it gives:
atan2f: 1.914544701576233
atan2: 1.914544820785522
Note that atan2 gives the same result in both cases, but atan2f does not.
Things I have tried:
Building the 32 bit version with -ffloat-store
Building the 32 bit version with -msse2 -mfpmath=sse
Building the 64 bit version with -mfpmath=387
None changed the results for me.
(All of these were based on the hypothesis that it has something to do with the way floating point operations happen on 32 bit vs 64 bit architectures.)
Question:
What are my options for getting them to give the same result? (Is there a compiler flag I could use?) And also, what is happening here?
I'm running on an i7 machine, if that is helpful.
This is easier to see in hex notation.
#include <stdio.h>
#include <math.h>

void testAtan2fIssue(float A, float B) {
    double d = atan2(A, B);
    printf("        atan2 : %.13a %.15f\n", d, d);
    float f = atan2f(A, B);
    printf("        atan2f: %.13a %.15f\n", f, f);
    printf("(float) atan2 : %.13a %.15f\n", (float) d, (float) d);
    float f2 = nextafterf(f, 0);
    printf("problem value : %.13a %.15f\n", f2, f2);
}

// _ added for clarity
        atan2 : 0x1.ea1f9_b9d85de4p+0 1.914544_797857041
        atan2f: 0x1.ea1f9_c0000000p+0 1.914544_820785522
(float) atan2 : 0x1.ea1f9_c0000000p+0 1.914544_820785522
problem value : 0x1.ea1f9_a0000000p+0 1.914544_701576233
What is happening here?
The conversion from double to float can be expected to be optimal (correctly rounded), yet arctangent functions may be off by a few ULP on various platforms. The value 1.914544701576233 is the next smaller float and reflects a slightly inferior arctangent calculation.
What are my options for getting them to give the same result?
Few. You could roll your own my_atan2() from an established code base, yet even that may have subtle implementation differences (@stark).
Instead, consider making the code that checks the results tolerant of such minute variations.