
Question

I am working with CUDA on the Windows platform, where we have access to both Parallel Nsight and the Visual Profiler. Both are pretty good, but they have almost the same features for profiling and tracing. Can someone tell me how they differ and which one is better on Windows? I basically need a tool for profiling.

Solution

Nsight Visual Studio Edition 2.2 offers the following advantages over the Visual Profiler:

OVERALL

  1. Integration into Visual Studio 2008 SP1 and 2010 (requires Professional Edition as VS Express Edition does not support integration packages).

  2. Local and remote analysis sessions. Remote sessions can also be configured to copy the application and resources to the remote system.

  3. Collect information from a target application or from a process tree.

  4. Report views support more advanced grouping and filtering. Data tables can be exported to Excel.

TRACE ACTIVITY

  1. Trace OS activity including process, thread, and module lifetime, thread context switching, thread wait reasons, CPU utilization, process CPU utilization, and thread utilization.

  2. Collect API and GPU work trace for CUDA, OpenGL 2.x-3.x, DirectX 9-11, and OpenCL 1.1 and show all information on the timeline.

  3. Collect call stack traces for all traced API calls, or only when a traced API call returns an error.

  4. CUDA software counters to show allocated memory per context.

  5. Additional control over what information is traced. This is critical as tracing too much information can cause the application to become CPU bound.

  6. Timeline and tree display for user annotations from NVIDIA Tools Extensions Library and D3D Performance Markers.

CUDA PROFILING ACTIVITY

  1. The CUDA profiler captures your kernel and replays it many times, transparently to your application. This allows profiling data to be collected for non-deterministic applications, and with only one launch of the application. The Visual Profiler (versions <= 5) requires the application to be deterministic so that it can relaunch the application many times.

  2. Supports collection of many useful metrics not yet supported by the Visual Profiler. These include warps eligible, which is the most critical metric for understanding whether you have sufficient occupancy, and warp stall reasons, which help you understand what is limiting the performance of the application.
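As a rough illustration of the occupancy idea mentioned above, here is a minimal sketch of the theoretical occupancy arithmetic: active warps per multiprocessor divided by the hardware maximum. It is not how either tool computes its metrics, and it says nothing about how many warps are actually eligible at run time, which is what the profiler measures. The function names and the `resident_blocks_per_sm` input are assumptions for illustration; in practice that number depends on register and shared memory usage. The warp limits used in the comments (48 on Fermi, 64 on Kepler) are the per-multiprocessor maximums for those architectures.

```cpp
#include <cassert>

// A warp is a group of 32 threads.
const int kWarpSize = 32;

// Number of warps needed for one thread block (rounded up).
inline int warps_per_block(int threads_per_block) {
    assert(threads_per_block > 0);
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Theoretical occupancy as a fraction in [0, 1].
// resident_blocks_per_sm: blocks that fit on one multiprocessor at once
//                         (hypothetical input; limited by registers,
//                         shared memory, and block-count limits).
// max_warps_per_sm:       hardware limit, e.g. 48 on Fermi, 64 on Kepler.
inline double theoretical_occupancy(int threads_per_block,
                                    int resident_blocks_per_sm,
                                    int max_warps_per_sm) {
    assert(resident_blocks_per_sm >= 0 && max_warps_per_sm > 0);
    int active_warps = warps_per_block(threads_per_block) * resident_blocks_per_sm;
    if (active_warps > max_warps_per_sm) {
        active_warps = max_warps_per_sm;  // cannot exceed the hardware limit
    }
    return static_cast<double>(active_warps) / max_warps_per_sm;
}
```

For example, 256 threads per block is 8 warps. With 3 such blocks resident on a Fermi multiprocessor, that is 24 of 48 warps, so `theoretical_occupancy(256, 3, 48)` gives 0.5. A kernel can have high theoretical occupancy and still have few eligible warps if most of them are stalled, which is why the measured metrics are the ones to look at.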

The Visual Profiler has the following advantages:

  1. Cross platform.

  2. Provides an expert system to review the collected information.

  3. Links in the results to the CUDA Best Practices Guide.

  4. Timeline can show correlation between CPU and GPU events when you click on an event.

  5. CUDA 5.0 supports the new command-line profiler (nvprof).

  6. CUDA 5.0 supports source correlation for branch divergence and memory access with bad access patterns.

  7. CUDA 5.0 profiler is integrated into Nsight Eclipse Edition.

  8. Better support for Tesla PM counters.

The Visual Profiler in CUDA 5.0 adds a number of the features available in Nsight 1.5 and 2.x, including:

  • NVIDIA Tools Extension Library for annotating your application with ranges and markers that can be displayed in the timeline.

  • Concurrent kernel trace on Fermi and Kepler GPUs.
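To give an idea of what annotating an application with ranges and markers looks like, here is a minimal sketch using the NVIDIA Tools Extension C API (`nvtxRangePushA`, `nvtxRangePop`, `nvtxMarkA` from `nvToolsExt.h`). The `USE_NVTX` build flag, the `ScopedRange` helper, and the `sum_of_squares` workload are assumptions made up for this example, not part of the library. Without `USE_NVTX`, the calls fall back to no-op stubs so the code still builds on a machine without the CUDA toolkit.

```cpp
#include <cassert>
#include <vector>

// USE_NVTX is a hypothetical build flag for this sketch. Define it and link
// against the NVTX library to emit real annotations; otherwise the stubs
// below turn every annotation into a no-op.
#ifdef USE_NVTX
#include <nvToolsExt.h>
#else
inline int nvtxRangePushA(const char*) { return 0; }
inline int nvtxRangePop() { return 0; }
inline void nvtxMarkA(const char*) {}
#endif

// RAII guard (illustrative helper): opens a named range on construction
// and closes it when the enclosing scope exits.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) {
        assert(name != nullptr);
        nvtxRangePushA(name);
    }
    ~ScopedRange() { nvtxRangePop(); }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;
};

// Hypothetical workload standing in for real host code or kernel launches.
inline long long sum_of_squares(const std::vector<int>& values) {
    ScopedRange range("sum_of_squares");  // appears as a range on the timeline
    long long total = 0;
    for (int v : values) {
        total += static_cast<long long>(v) * v;
    }
    nvtxMarkA("sum_of_squares done");     // appears as an instantaneous marker
    return total;
}
```

Wrapping the push/pop pair in a scope guard keeps ranges balanced even when a function returns early. Calling `sum_of_squares({1, 2, 3})` returns 14 either way. The annotations only change what shows up in the profiler timeline.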

Both tools will provide you with very helpful information for analyzing your application. I recommend that you use the latest version of each tool.

The upcoming version of Nsight VSE will have many new features for investigating the execution of your CUDA kernels. For more information, see http://developer.download.nvidia.com/GTC/PDF/GTC2012/PresentationPDF/S0430-GTC2012-Developing-CUDA-Nsight.pdf.


09-13 14:33