Problem description
I ran into a problem with a GeForce GTX 690 while trying to track down memory usage. A simple test program:
BOOST_AUTO_TEST_CASE(cudaMemoryTest) {
    size_t mem_tot_0 = 0;
    size_t mem_free_0 = 0;
    size_t mem_tot_1 = 0;
    size_t mem_free_1 = 0;
    unsigned int mem_size = 100*1000000;
    float* h_P = new float[mem_size];
    for(size_t i = 0; i < mem_size; i++) {
        h_P[i] = 0.f;
    }
    cudaSetDevice(0);
    cudaDeviceReset();
    cudaMemGetInfo(&mem_free_0, &mem_tot_0);
    std::cout<<"Free memory before copy dev 0: "<<mem_free_0<<std::endl;
    cudaSetDevice(1);
    cudaDeviceReset();
    cudaMemGetInfo(&mem_free_1, &mem_tot_1);
    std::cout<<"Free memory before copy dev 1: "<<mem_free_1<<std::endl;
    cudaSetDevice(0);
    float* P;
    cudaMalloc((void**)&P, mem_size*sizeof(float));
    cudaMemcpy((void*)P, h_P, mem_size*sizeof(float), cudaMemcpyHostToDevice);
    cudaSetDevice(0);
    cudaMemGetInfo(&mem_free_0, &mem_tot_0);
    std::cout<<"Free memory after copy dev 0: "<<mem_free_0<<std::endl;
    cudaSetDevice(1);
    cudaMemGetInfo(&mem_free_1, &mem_tot_1);
    std::cout<<"Free memory after copy dev 1: "<<mem_free_1<<std::endl;
    BOOST_CHECK(mem_free_0 != mem_free_1);
    cudaError_t err;
    err = cudaGetLastError();
    if(err != cudaSuccess)
        std::cout<<"an error occurred"<<std::endl;
    cudaSetDevice(0);
    destroyMem(P);
    delete [] h_P;
}
The test prints:
1> Free memory before copy dev 0: 1733173248
1> Free memory before copy dev 1: 1688424448
1> Free memory after copy dev 0: 1289940992
1> Free memory after copy dev 1: 1289940992
CudaUtilsTest.cpp(47): error in "cudaMemoryTest": check mem_free_0 != mem_free_1 failed
The problem is that after the allocation the amount of free memory reported for device 1 is exactly the same as for device 0, which should not be the case, so the problem must be in cudaMemGetInfo and/or cudaSetDevice. Has anyone run into the same problem, or is there something else fundamentally wrong in the test that someone can point out?
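To make the symptom concrete, the drop in reported free memory can be worked out from the first test output above and compared with the size of the allocation. The following is only a small arithmetic sketch in plain C++; the helper name freeMemoryDrop and the constant kRequestedBytes are made up for illustration and are not part of the test or of the CUDA API.

```cpp
#include <cassert>
#include <cstdint>

// Bytes requested by the test: 100*1000000 floats.
// Assumes a 4-byte float, as on the platforms CUDA targets.
const std::uint64_t kRequestedBytes = 100ULL * 1000000ULL * sizeof(float);

// Hypothetical helper: how far the reported free memory fell between
// the "before copy" and "after copy" readings for one device.
inline std::uint64_t freeMemoryDrop(std::uint64_t before, std::uint64_t after) {
    assert(before >= after);  // readings are unsigned, so guard against wrap-around
    return before - after;
}

// Readings copied from the first test output.
const std::uint64_t kDev0Drop = freeMemoryDrop(1733173248ULL, 1289940992ULL);
const std::uint64_t kDev1Drop = freeMemoryDrop(1688424448ULL, 1289940992ULL);
```

With these readings, the reported free memory falls by 443,232,256 bytes for device 0 and by 398,483,456 bytes for device 1, against a request of 400,000,000 bytes. Nothing was allocated on device 1, so any drop there is the unexpected part.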
I am running the code on Windows 7 with Visual Studio 2010 and CUDA SDK 5.0, compiling with code generation compute_30,sm_30.
EDIT 22.4.2013
I continued experimenting with this issue, and it seems that cudaSetDevice works fine, as can be verified from the results of the cudaGetDevice calls. I added a reset of device 0 after the memory allocation test, and the free memory size returned by cudaMemGetInfo is again the same for both devices. I have checked all cudaError_t return values in my own code, and all function calls return cudaSuccess. Has anyone run into similar problems with the GTX 690 in the setup described above?
The latest test code:
BOOST_AUTO_TEST_CASE(cudaMemoryTest) {
    size_t mem_tot_0 = 0;
    size_t mem_free_0 = 0;
    size_t mem_tot_1 = 0;
    size_t mem_free_1 = 0;
    int device_num = 0;
    unsigned int mem_size = 100*1000000;
    float* h_P = new float[mem_size];
    for(size_t i = 0; i < mem_size; i++) {
        h_P[i] = 0.f;
    }
    cudaSetDevice(0);
    cudaGetDevice(&device_num);
    cudaDeviceReset();
    cudaMemGetInfo(&mem_free_0, &mem_tot_0);
    std::cout<<"Free memory before copy dev 0: "<<mem_free_0<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    cudaSetDevice(1);
    cudaGetDevice(&device_num);
    cudaDeviceReset();
    cudaMemGetInfo(&mem_free_1, &mem_tot_1);
    std::cout<<"Free memory before copy dev 1: "<<mem_free_1<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    cudaSetDevice(0);
    cudaGetDevice(&device_num);
    float* P;
    cudaMalloc((void**)&P, mem_size*sizeof(float));
    cudaMemcpy((void*)P, h_P, mem_size*sizeof(float), cudaMemcpyHostToDevice);
    cudaMemGetInfo(&mem_free_0, &mem_tot_0);
    std::cout<<"Free memory after copy dev 0: "<<mem_free_0<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    cudaSetDevice(1);
    cudaGetDevice(&device_num);
    cudaMemGetInfo(&mem_free_1, &mem_tot_1);
    std::cout<<"Free memory after copy dev 1: "<<mem_free_1<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    BOOST_CHECK(mem_free_0 != mem_free_1);
    cudaError_t err;
    err = cudaGetLastError();
    if(err != cudaSuccess)
        std::cout<<"an error occurred"<<std::endl;
    // Reset only device 0 and check both
    cudaSetDevice(0);
    cudaGetDevice(&device_num);
    cudaDeviceReset();
    cudaMemGetInfo(&mem_free_0, &mem_tot_0);
    std::cout<<"Free memory after second reset of device 0, dev 0: "<<mem_free_0<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    cudaSetDevice(1);
    cudaGetDevice(&device_num);
    cudaMemGetInfo(&mem_free_1, &mem_tot_1);
    std::cout<<"Free memory after second device reset of device 0, dev 1: "<<mem_free_1<<" Device: "<<device_num<<std::endl;
    cudaDeviceSynchronize();
    delete [] h_P;
}
Test output:
1> Free memory before copy dev 0: 1794379776 Device: 0
1> Free memory before copy dev 1: 1751728128 Device: 1
1> Free memory after copy dev 0: 1351696384 Device: 0
1> Free memory after copy dev 1: 1351696384 Device: 1
1> CudaUtilsTest.cpp(353): error in "cudaMemoryTest": check mem_free_0 != mem_free_1 failed
1> Free memory after second reset of device 0, dev 0: 1751728128 Device: 0
1> Free memory after second device reset of device 0, dev 1: 1751728128 Device: 1
Recommended answer
This was solved by changing the WDDM driver settings as follows:
[This answer was added from the comments as a community wiki entry to get the question off the unanswered queue for the CUDA tag]