Profiling Java applications using IBM Rational Application Developer

Overview

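Continual advances in processing power and storage have brought many interesting new technologies. These technologies trade raw application performance for secondary concerns such as programmer productivity or system flexibility. Examples include garbage-collected, just-in-time compiled languages such as Java™, and the widespread adoption of whole-system virtualization.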

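With computer processing power and speed growing rapidly, and the cost per unit of processing power falling, the need for efficiency in individual applications might seem to be diminishing. However, even the smallest applications can suffer performance degradation when they are used by enough users. Likewise, the largest applications can fall prey to nasty performance bottlenecks and memory leaks that hurt availability, reduce usability because of slow page loads, and may prompt expensive hardware upgrades, even though such problems can only be resolved by fixing the code.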

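The profiling tools in IBM® Rational® Application Developer and IBM® Rational® Software Architect give developers sophisticated means to identify and mitigate these performance problems. Both products ship the profiling tools described in this tutorial. Although the functionality is available in both, this tutorial focuses on Rational Application Developer. The profiling functionality is based on the Java™ Virtual Machine Tool Interface (JVMTI) profiling agent of the open source Eclipse Test and Performance Tools Platform (TPTP) project; see the Related topics section for more information.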

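The Rational Application Developer profiling platform provides three different analyses of application behavior:

Memory usage analysis
Method-level execution analysis
Thread analysis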

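Built-in integration with the existing Rational Application Developer launch types makes profiling an application as simple as selecting the stopwatch Profile icon and then choosing an existing Run/Debug launch configuration from the launch list. However, once you have launched the profiled application and started collecting data, familiarity with the terms and concepts of application profiling will help you get the most out of the profiling functionality.

This tutorial is a guide to profiling Java applications with Rational Application Developer. It begins with relevant background on how the performance tools work.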

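The JVMTI interface and agent architecture

The Rational Application Developer Java Profiler consists of a set of native agents that are implemented on top of the JVMTI interface of the Java virtual machine. JVMTI is a standard interface that allows native libraries (called agents) to control a running virtual machine (VM) and obtain information about the Java application it is executing. A JVMTI agent subscribes to JVM events (and, where necessary, instruments class bytecode) to ensure that it is notified of all profiling events while the Java application executes on that VM.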

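The profiler registers its native functions through the JVMTI interface, and the JVM calls these functions when a registered event occurs. With the Rational Application Developer Java profiling agents, profiling data is generated in real time. That is, an agent does not wait until the application terminates to deliver and present its data. The product supports agents that collect information on the three areas listed previously.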

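An important note: the Rational Application Developer JVMTI Java Profiler is not a sample-based profiler. All JVM events that are not excluded by the profiling filters are therefore transmitted back to the workbench. This approach yields high accuracy, but at the cost of higher profiling overhead than sample-based profiling.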

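Profiling agents

When a JVM is run with the JVMTI-specific VM arguments, the profiling agents execute alongside it, inside the JVM process. While running, they collect data from the JVM in the form of execution, heap, or thread events. In Rational Application Developer profiling, these agents are called the Execution Analysis, Heap Analysis, and Thread Analysis agents.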

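The Execution Analysis agent collects execution statistics, such as the time spent in each method, the number of method calls, and the overall call graph.
The Heap Analysis agent generates memory usage statistics. It collects object allocation details, such as live instances and their live size in memory.
The Thread Analysis agent obtains details about the threads spawned by the Java application, and tracks the use and contention of object monitors throughout the target application.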

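To facilitate agent launching and client-agent communication, the profiling tools use a second component, the agent controller, which allows agents to communicate with the workbench. This component comes pre-bundled with Rational Application Developer.

With the agent controller, the target Java application can reside on a different machine from the developer's workbench. The agent controller acts as an intermediary, relaying commands and data between the workbench and the profiling agents. Figure 1 shows a remote profiling scenario. Communication between the developer's workbench and the agent controller takes place over sockets, while communication between the agent controller and the agents takes place over named pipes and shared memory.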

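Figure 1. Remote profiling

In addition to agent support and communication, the agent controller provides other services, such as process launching, termination, and monitoring, as well as file transfer. An agent controller process must always be running on the target machine in order to use the profiling functionality on that machine. Fortunately, there are several ways to start this process.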

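When you profile a locally launched application directly from the local workbench, the Integrated Agent Controller (IAC) starts automatically, without user intervention, as soon as any functionality that requires it is activated. The IAC keeps running for as long as the workbench is open and shuts down when the workbench is closed. You can adjust the default IAC settings in the configuration page under Preferences > Agent Controller.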

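Besides local launches through the IAC, the agent controller can also be used to launch applications on any supported remote host. The agent controller daemon process is available as a standalone download, packaged as IBM® Rational® Agent Controller, for 32-bit and 64-bit Linux®, Microsoft® Windows®, IBM® AIX®, Solaris, IBM® z/OS®, and Linux on IBM® System z® systems. This allows applications to be executed directly on the remote host, while the data generated by the remote agents is transmitted back to the local workbench over a socket connection to the remote machine.

For the Rational Agent Controller download location, see the Related topics section at the end of this tutorial.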

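What you need to complete the examples in this tutorial

All examples in this tutorial were profiled with Rational Application Developer Version 7.5.4. The profiling tools are available in Rational Application Developer, in Rational Software Architect, or in any supported Eclipse-based product with the Eclipse TPTP plug-ins installed. See the Related topics section for more information.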

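Introduction to the Profile dialog

To begin, switch to a Java-based perspective and look for the Profile Configurations icon (a green play button). The Profile Configurations dialog is the central launch point for all profiling functionality in Rational Application Developer. In most perspectives, you can also open it by selecting Run > Profile Configurations from the menu.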

The Rational Application Developer profiling functionality supports almost all standard launch types:

Eclipse Application
Java Applet
Java Application
Test
JUnit
IBM® WebSphere® Application Server Version 6.0, 6.1, and 7.0 application clients
And others

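In addition, support is extended through two profiler-specific launch types, Attach to Agent and External Java Application:

Attach to Agent allows any Java virtual machine to be profiled, regardless of what application uses that JVM, provided the appropriate JVMTI profiling agent VM arguments are used. Because this entry works with any application, it can be used for application configurations that are not otherwise supported, whether local or remote, and whether they run on an application server or as full-fledged standalone applications. It requires you to configure the class path, environment variables, and application arguments on the command line yourself, along with the required JVMTI profiling agent arguments (described later in this tutorial).

External Java Application, like Attach to Agent, supports profiling any JVM regardless of its use. Unlike Attach to Agent, External Java Application requires you to specify the class path, environment variables, application arguments, and profile type in the workbench (in the launcher) rather than on the command line. One important point to note: all required class files, Java archive (JAR) files, and other dependencies must already be present on the target host, because the agent controller does not support remote transfer of the classes or files that the application requires.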

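Gathering method-level execution statistics

Of the three profiling agents discussed in this tutorial, the most commonly used is probably the method-level execution analysis agent, which provides a variety of statistics on method execution. The following example gives a simple illustration of the available execution statistics functionality.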

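First, select a Java application to profile.
Right-click the Java project and select Profile As.
If this is the first time you are profiling this project, the Profile dialog is displayed. Otherwise, the launch reuses the last saved profile configuration for the selected project.
To modify an earlier profile configuration, use the Profile Configurations dialog described previously.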

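The profiling options include Execution Time Analysis, Memory Analysis, and Thread Analysis, as shown in Figure 2.

Figure 2. Options in the Edit configuration and launch dialog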

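The Profile dialog shows several profiling options. The three options just mentioned, listed under Java Profiling, are the subject of this tutorial. The fourth entry, Probe Insertion, covers additional functionality for instrumenting an application with custom probes using the Probekit tool, and is outside the scope of this tutorial.

From here, click the Edit Options button.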

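Execution Time Analysis has three options:

Execution Flow provides a greater number of profiling data views, but increases profiling overhead, profiling data volume, and workbench memory usage. It may not be suitable for applications with large profiling data sets.
Execution Statistics has significantly lower overhead and workbench memory usage, but at the cost of some profiling functionality. Only the Execution Statistics and Method Invocation Details views are available.
Collect method CPU time information: in either of the two modes above, the profiler can also collect the CPU time spent executing the profiled methods. This differs from the elapsed times above in that time spent on I/O and waiting is not included in CPU time.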
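The difference between elapsed time and CPU time can be reproduced outside the profiler. The following is a minimal sketch, not the profiler's own mechanism: it uses the standard java.lang.management API to time a task that only waits. The class name CpuVersusWallTime and the 200 ms sleep that stands in for an I/O wait are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVersusWallTime {
    /** Returns {elapsedNanos, cpuNanos} for a task run on the current thread. */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : 0L;
        long wallStart = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - wallStart;
        // -1 signals that this JVM cannot report per-thread CPU time.
        long cpu = cpuSupported ? threads.getCurrentThreadCpuTime() - cpuStart : -1L;
        return new long[] { elapsed, cpu };
    }

    /** Stands in for an I/O wait: the thread is idle, so it uses almost no CPU. */
    static void simulatedWait() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] t = measure(CpuVersusWallTime::simulatedWait);
        System.out.printf("elapsed: %d ms, CPU: %d ms%n", t[0] / 1_000_000, t[1] / 1_000_000);
    }
}
```

A method dominated by waiting shows a large elapsed time and a CPU time near zero, which is why the two columns can disagree sharply for I/O-bound code.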

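A good practice is to start with Execution Flow, and then switch to Execution Statistics (or tighten the filter set) if the data volume becomes too large.

For this example, select Execution Flow, because it demonstrates all of the available profiling views.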

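Filters

Before you click the Profile button, it is important to put appropriate filters in place. Profiling generates a great deal of data, because every single method call, object allocation, or thread event requires an event to be generated, transmitted, and processed.

The sheer volume of profiling data can at times overwhelm both the application being profiled and the workbench itself. However, only a fraction of that data is useful for analysis. Fortunately, the profiling tools provide a way to filter out irrelevant information, so that you can reduce the volume of data that the agents generate for the profiled application.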

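For example, when profiling a Java application, you probably care only about the methods of your own application, and not about execution time in standard Java packages such as java.*, sun.*, and so on. It is important to use filters that specifically target your application's classes, in order to minimize the profiling overhead from outside classes.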

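To set the filters, double-click Java Profiling - JRE 1.5 or newer. A window similar to the one in Figure 3 is displayed. Filters specify which packages and classes are profiled; the filters themselves are contained in configurable filter sets.

Figure 3. Selecting a filter set and its contents

In this dialog, you can define new filters or modify existing ones. As Figure 3 shows, several filters that are useful in the most common profiling scenarios are provided. Filter sets are listed in the top pane, and the filters of the selected set in the bottom pane. Filters are applied from top to bottom, which means that a filter higher in the list overrides any conflicting filter lower in the list. Use the buttons on the right side of the dialog to add and remove filters and filter sets.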
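The top-to-bottom priority rule can be modeled as "first matching filter wins". The sketch below illustrates that rule only; it is not the TPTP filter implementation. The class FilterSetSketch, its trailing-wildcard matching, and the default of profiling classes that match no filter are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class FilterSetSketch {
    private static final class Rule {
        final String pattern;   // for example "com.example.chat.*" or "*"
        final boolean include;

        Rule(String pattern, boolean include) {
            this.pattern = pattern;
            this.include = include;
        }

        boolean matches(String className) {
            if (pattern.endsWith("*")) {
                // Trailing wildcard: compare the prefix before the '*'.
                return className.startsWith(pattern.substring(0, pattern.length() - 1));
            }
            return className.equals(pattern);
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    FilterSetSketch include(String pattern) {
        rules.add(new Rule(pattern, true));
        return this;
    }

    FilterSetSketch exclude(String pattern) {
        rules.add(new Rule(pattern, false));
        return this;
    }

    /** The first rule that matches decides; unmatched classes are profiled (assumed default). */
    boolean isProfiled(String className) {
        for (Rule rule : rules) {
            if (rule.matches(className)) {
                return rule.include;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        FilterSetSketch filters = new FilterSetSketch()
                .include("com.example.chat.*")   // hypothetical application package
                .exclude("*");
        System.out.println(filters.isProfiled("com.example.chat.ChatlineMessage"));
        System.out.println(filters.isProfiled("java.lang.String"));
    }
}
```

Because order decides the outcome, an INCLUDE entry for a single class must sit above a broader EXCLUDE entry that would otherwise hide it.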

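For this example, the default filters are sufficient, so select Finish to return to the Profile dialog.
From here, select Execution Time Analysis, and then click the Profile button. The workbench asks you to confirm the switch to the Profiling and Logging perspective, and the profiled application starts running and appears in the Profiling Monitor view.
Click Execution Time Analysis to open the Execution Statistics view, as shown in Figure 4.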

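Figure 4. A small data set of profiled classes in the Execution Statistics view

The Session Summary tab gives an overview of the ten methods with the highest base time. However, it is the Execution Statistics tab that gives full access to all of the data. Here you will find details on method invocations and their statistics: number of calls, average time spent, cumulative time, and so on.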

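A few key definitions help to clarify the columns in this view:

Base Time: the time taken to execute the contents of the method itself, excluding calls to other methods. (In the table, the Base Time field sums over all calls of the method.)
Average Base Time: the average time that a particular method took to complete, excluding time in the methods it calls. (In the table, this is the base time divided by the number of calls.)
Cumulative Time: the time taken to execute the contents of the method itself, including calls to other methods.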

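These statistics can help you locate the performance bottlenecks in your program.

Based on these definitions, there are three important points to note:

Average Base Time: this is the time that a single invocation of the method takes to complete, on average. As noted above, it excludes time spent in the child methods that this method calls or, more precisely, in child methods that are not filtered out.
Base Time: this is the total time that the method took to complete, aggregated over all of its invocations (excluding calls to other unfiltered methods).
Cumulative CPU Time: this represents the CPU time spent executing the given method. However, the granularity of the data that the JVM provides here is coarser than might be desired. CPU time may therefore be reported as zero if it is less than a single platform-specific unit as reported by the JVM. In addition, CPU time does not account for other kinds of performance bottlenecks, such as those involving communication and I/O access time. As a result, base time is usually favored as the metric for reducing performance bottlenecks.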

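At first glance, average base time might seem to be the key data point for determining which methods slow the system down. However, although average base time identifies methods that take a long time to execute, it does not take into account how many times a method is called. Ask yourself which is worse: a method that runs once and takes 5.0 seconds, or a method that runs 1000 times and takes 0.5 seconds each time? The first has an average base time of 5.0 seconds, and the second an average base time of 0.5 seconds.

However, the base time of the first method is 5.0 seconds, while the base time of the second is 1000 × 0.5 = 500 seconds. Cutting 500 seconds from the application's run time has a far greater impact than cutting only 5. Because of this difference in base time, the second method is clearly of greater concern than the first. Base time is therefore the primary consideration when reducing performance bottlenecks in an application: because it aggregates over the entire application run, reducing base time generally equates to reducing run time.

Base time represents only the time spent executing the method itself (excluding calls to other methods), whereas cumulative time represents the total time spent executing the method, including child calls. So, for example, in a single-threaded application, the cumulative time of the main(…) method equals the sum of the base times of all methods in the application (its own included), because main(…) is the starting point of the application and therefore the root of all method calls. It thus equals the total run time of the application.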
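The relationship between base time, average base time, and cumulative time can be checked with a small model. This sketch is only an illustration of the definitions above: the ProfiledMethod type, the method names loadConfig and formatMessage, and the 0.25 second base time assigned to main are all hypothetical, while the 5.0 second and 500 second figures come from the comparison above.

```java
import java.util.ArrayList;
import java.util.List;

class CallTimes {
    /** One method in a hypothetical, single-threaded call tree. */
    static final class ProfiledMethod {
        final String name;
        final double baseSeconds;   // self time, summed over all calls
        final int callCount;
        final List<ProfiledMethod> callees = new ArrayList<>();

        ProfiledMethod(String name, double baseSeconds, int callCount) {
            this.name = name;
            this.baseSeconds = baseSeconds;
            this.callCount = callCount;
        }

        ProfiledMethod invokes(ProfiledMethod callee) {
            callees.add(callee);
            return this;
        }

        /** Base time divided by the number of calls. */
        double averageBaseSeconds() {
            return baseSeconds / callCount;
        }

        /** Own base time plus the cumulative time of everything this method calls. */
        double cumulativeSeconds() {
            double total = baseSeconds;
            for (ProfiledMethod callee : callees) {
                total += callee.cumulativeSeconds();
            }
            return total;
        }
    }

    public static void main(String[] args) {
        ProfiledMethod runsOnce = new ProfiledMethod("loadConfig", 5.0, 1);
        ProfiledMethod runsOften = new ProfiledMethod("formatMessage", 500.0, 1000);
        ProfiledMethod entryPoint = new ProfiledMethod("main", 0.25, 1)
                .invokes(runsOnce)
                .invokes(runsOften);

        for (ProfiledMethod m : new ProfiledMethod[] { runsOnce, runsOften, entryPoint }) {
            System.out.printf("%-14s base=%7.2fs  avg base=%5.2fs  cumulative=%7.2fs%n",
                    m.name, m.baseSeconds, m.averageBaseSeconds(), m.cumulativeSeconds());
        }
    }
}
```

In this model the method with the smaller average base time (0.5 seconds) still dominates the run, and the cumulative time of the entry point is the sum of every base time in the tree.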

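With the details of the execution statistics understood, the next step is to drill further into a performance problem. To get a better idea of which methods a particular method calls, and where it is called from, double-click the method in the Execution Statistics tab. This opens the Method Invocation Details tab, as shown in Figure 5.

Figure 5. Drilling down into a specific method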

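The Method Invocation Details tab is a fairly straightforward presentation of the execution statistics that relate directly to the selected method. The second table in the tab, Selected method is invoked by, lists all of the methods that called the selected method during the application run.

The third table, Selected method invokes, lists all of the methods that the selected method calls. The same statistics as in the Execution Statistics tab are reproduced here, and selecting a method in either of these tables updates the selected method at the top of the view.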

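The next tab of note is the Call Tree tab, which is available only when profiling with the Execution Flow option. The Call Tree tab breaks down method calls by the thread on which they were made. The first-level items in the table are all of the threads spawned during the application run. Below each thread is an aggregation of its method calls, in which the methods at each level are called by the method at the level above.

Figure 6. The Call Tree tab presents method calls from a more thread-centric perspective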

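At the topmost level, the Cumulative Time of each thread represents the total time that the thread spent in the application. Threads with high cumulative times are candidates for analysis and optimization.

The Percent Per Thread field represents the total time spent executing a method, as a percentage of the total time spent executing its thread. The thread total is computed as the sum of the cumulative times of that thread's topmost method calls. Other statistics are provided as well, such as the maximum, minimum, and average time taken by executions of the method on the thread, and the total number of calls.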

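Below the call tree table is the method invocation stack table, which shows the contents of the stack for each invocation of the item currently selected in the call tree table. The number of available stacks equals the number of calls listed. This is useful for analyzing specific method invocation instances.

Finally, from any of these views you can right-click a method entry and select Open Source. If the relevant Java file is present in the workspace, the workbench opens it and locates the method definition. Once you have identified a performance bottleneck, you can modify the code and then profile again to see the difference.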

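Heap analysis: finding memory leaks

Application developers who want to understand memory usage generally have two goals:

Analyzing the contents of the heap while the application is running (which allows class-by-class statistics to be generated)
Identifying memory leaks, by analyzing the heap on the fly or after a user-requested garbage collection (to determine whether objects are in fact being garbage collected)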

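To begin gathering heap information from an application, launch it from the Profile dialog using the Memory Analysis profiling type. The only profiler option available for memory analysis is Track object allocation sites. An allocation site is a location in the code where an object is instantiated (implicitly or explicitly). With this option selected, you can select a class in the Object Allocations view and identify the methods from which its objects were created.

The only drawback of the Track object allocation sites option is that it greatly increases the volume of profiling data generated. If profiling performance or workbench responsiveness suffers, consider clearing this option or reducing the data volume with a tighter filter set.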

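In addition to the allocation-site option, make sure that the correct filters are set for your application. Heap profiling has a greater overall instrumentation overhead than execution or thread profiling, which themselves already impose a significant performance penalty on the application. One final point on choosing filters: if you want to see the space taken up by Java types such as String or Integer, you need to add them to the filter set. By default, they are filtered out by the java* * EXCLUDE filter.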

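The Object Allocations view

Once data has been received from the profiled application, it is displayed in the Object Allocations view. This is the main view for all of the information collected by the heap profiling agent.

Object Allocations columns:

Live Instances: the number of objects of the given class that are currently in use on the heap (that is, not yet garbage collected).
Total Instances: the total number of objects of the class created on the heap during the lifetime of the JVM (including those that have since been garbage collected).
Live Size (bytes): the total size of all instances of the class currently in use by the JVM (that is, not yet garbage collected). Note that object size depends on the JVM implementation.
Total Size (bytes): the total size of all instances of the class, including those garbage collected earlier in the application's lifetime.
Average Age: the average age of an object before it is garbage collected, measured as the number of garbage collections the object has survived. A large number of objects that the application no longer needs but that are never garbage collected is considered a memory leak.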
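The live and total columns are related by simple bookkeeping: each allocation raises both, and each collection lowers only the live figures. The sketch below models that bookkeeping under assumptions; it is not how the heap profiling agent stores its data. The AllocationStatsSketch class, its event methods, and the 48-byte object size are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AllocationStatsSketch {
    /** Per-class counters from which the view's columns can be derived. */
    static final class ClassStats {
        long totalInstances;
        long collectedInstances;
        long totalBytes;
        long collectedBytes;

        long liveInstances() { return totalInstances - collectedInstances; }
        long liveBytes() { return totalBytes - collectedBytes; }
    }

    private final Map<String, ClassStats> byClass = new LinkedHashMap<>();

    ClassStats statsFor(String className) {
        return byClass.computeIfAbsent(className, k -> new ClassStats());
    }

    /** Models an object allocation event. */
    void onAllocate(String className, long bytes) {
        ClassStats s = statsFor(className);
        s.totalInstances++;
        s.totalBytes += bytes;
    }

    /** Models an object being garbage collected. */
    void onCollect(String className, long bytes) {
        ClassStats s = statsFor(className);
        s.collectedInstances++;
        s.collectedBytes += bytes;
    }

    public static void main(String[] args) {
        AllocationStatsSketch heap = new AllocationStatsSketch();
        for (int i = 0; i < 3; i++) {
            heap.onAllocate("ChatlineMessage", 48);   // 48 bytes is an arbitrary size
        }
        heap.onCollect("ChatlineMessage", 48);
        heap.onCollect("ChatlineMessage", 48);

        ClassStats s = heap.statsFor("ChatlineMessage");
        System.out.printf("live=%d total=%d liveBytes=%d totalBytes=%d%n",
                s.liveInstances(), s.totalInstances, s.liveBytes(), s.totalBytes);
    }
}
```

A class whose live counters track its total counters closely over time is retaining nearly everything it allocates, which is the pattern to look for when hunting a leak.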

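Using the view toolbar, you can switch the data in the Memory Statistics table between package level and class level, which is especially useful when you are working with a large number of classes. The data can also be shown as percentages, or as deltas relative to the data at the last refresh. Figure 7 shows, from left to right, the icons for report generation, filtering, the package/class view options, and the percentage and delta options.

Figure 7. Icons on the Object Allocations view toolbar

If the Track object allocation sites profiling option was selected, double-clicking any entry in the Memory Statistics table switches to the Allocation Details tab. This tab shows a table of all locations in the program where objects of that type were allocated. When you have determined that a particular type is over-represented in the heap, this data is particularly useful for finding where those objects are allocated, so that excessive allocations can be identified and eliminated.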

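Identifying heap problems with this functionality

Once the workbench begins to collect profiling data from the target application, you can consult the Object Allocations view at any time to determine the current contents of the heap. The table reflects all object allocation and deallocation events in real time. This gives a moment-by-moment view of the heap contents, expressed either as a percentage of the total or in absolute values (bytes).

By sorting the table and selecting the classes with the largest live size, developers can target the problem areas that need improvement. The allocation details for those classes then provide a list of object creation sites, which you can rule out or investigate one by one as the cause of the heap size problem.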

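The example here is a simple chat room Web application that allows users to log in and communicate with each other. Chat room messages are transmitted to all participants, and all conversations are then written to a log file on the server. Using the heap analysis functionality of Rational Application Developer, however, you have identified a serious memory leak, one that appears to involve a particular class.

In the Memory Statistics view, with percentages of the total displayed, you can see that the ChatlineMessage class accounts for nearly 98% of the objects currently allocated on the heap, and makes up 61% of the total heap contents by size. To an application developer, this should be a serious warning sign that one or more classes are over-represented in the heap and may be causing a memory leak in the application.

Figure 8. Viewing by package with the delta option selected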

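The Profiling Monitor view also allows you to request a garbage collection from the JVM. Used together with the delta columns, this is particularly useful for determining how much of the heap is taken up by uncollected heap objects, another form of memory leak.

Before requesting a garbage collection, select the Show Delta Columns icon on the Object Allocations toolbar. This introduces four new delta columns, which are delta versions of the existing columns and reflect the change in those values since you last selected the Refresh button (see Figure 9). With the delta columns displayed, click the Run Garbage Collection icon (a green play arrow with a trash can) on the Profiling Monitor toolbar to trigger a JVM garbage collection. After you select the Refresh button, the results of the garbage collection are reflected immediately in the Memory Statistics table, as shown in Figure 9.

The classes that contributed most to the garbage-collected heap contents can be identified by sorting on the Delta: Live Size column. These classes lost the most size during the collection, and contributed the most to the total pool of unused objects.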

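When the JVM performs a garbage collection, it looks for orphaned objects in the heap, that is, objects that can no longer be reached through any live reference in the application. Continuing with the chat room example discussed previously, you have now instructed the JVM, through the workbench, to perform a garbage collection. You can see that the initial result is a large reduction in both the number of allocated ChatlineMessage objects and the total size of the heap.

Figure 9. The delta columns provide moment-by-moment statistics on object collection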

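This screen capture shows the contents of the Memory Statistics view after the garbage collection request. The heap contents have been greatly reduced: live instances decreased by 22,079, with a corresponding drop in the percentage. Within about five or six seconds, the Live Instances percentage drops to nearly zero. In this application, you have discovered that ChatlineMessage objects are allocated, used briefly, and then discarded.

Using heap analysis, the delta columns, and garbage collection, you can identify objects that are being allocated but not garbage collected, as well as objects that are no longer referenced by any other object during the application's lifetime and that are contributing to a large pool of orphaned objects. Application developers can thereby analyze and reduce the overall footprint of their applications, potentially reducing system swapping penalties, improving response times, and lowering the application's system requirements and costs.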

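Thread analysis: tracking thread behavior

This section is for readers who want to identify and correct thread-related problems in their applications. For specific threads of interest, you can use the profiling tools to examine how often those threads are blocked (and by whom), and to visualize the thread characteristics of the application.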

For global concerns, such as performance problems in which thread behavior is a potential suspect, the goal is to identify general thread issues that might impact application performance. A primary objective of an application developer who profiles thread behavior is to find threads that would run sooner, or more quickly, if the resource and thread characteristics of the application were altered.

To begin to gather thread information from the target application, launch the application from the Profiling dialog using the Thread Analysis profiling type. The Thread Analysis view of Rational Application Developer is the central view for all of the information gathered by the thread profiling agent.

The Thread Analysis view is split into three tabs:

Thread Statistics : A table of statistics for every thread launched by the application, both past and present. Listed information includes thread state; total running, waiting, and blocked times; as well as the number of blocks and deadlocks per thread.
Monitor Statistics : Provides detailed information on monitor class statistics, including block and wait statistics for individual monitor classes.
Threads Visualizer : Provides a visual representation of all threads profiled in the target application, by status.

Threads in all of the views are organized by thread group. One important point: the tables and graph in the Thread Analysis view are updated only when a thread-related event is received from the profiler in the target application. If the data does not appear to be updating, no new thread events have been received, and the data is still in its previous state.

The thread name listed is the name passed into the Thread(String name) constructor, or set with the Thread.setName method. It may be beneficial to name threads explicitly in the target application, in order to allow for easier thread identification. Thread statistics are gathered on all Java threads, which include VM threads, and may also include threads used by the application container or application framework. Fortunately, you can filter uninteresting threads out of the view by selecting the thread filter icon (three arrows, with the middle yellow one passing through a vertical green line) and clearing the unwanted threads.
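Naming threads in the target application takes one extra argument or one extra call. The following sketch shows both routes with the standard Thread API; the class NamedThreads and the thread names are made up for illustration.

```java
class NamedThreads {
    /** Starts a worker whose name is fixed at construction time. */
    static Thread startWorker(String name, Runnable work) {
        Thread worker = new Thread(work, name);
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        // Route 1: pass the name to the constructor.
        Thread logWriter = startWorker("chat-log-writer", () -> { });

        // Route 2: rename an existing thread before it starts.
        Thread cleanup = new Thread(() -> { });
        cleanup.setName("chat-session-cleanup");
        cleanup.start();

        logWriter.join();
        cleanup.join();
        System.out.println(logWriter.getName() + ", " + cleanup.getName());
    }
}
```

Without an explicit name, the JVM assigns generic names such as Thread-0, which are hard to tell apart in a thread table.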

The seven thread states are:

Running
Sleeping: a thread that has explicitly called the sleep method
Waiting: a thread that has explicitly called the wait method and is waiting for a notify or notifyAll call on its monitor object
Blocked: a thread that is blocked on an object monitor that is in use by another thread (for instance, a monitor held by a thread inside a synchronized statement)
Deadlocked: a deadlock is considered to have occurred when two or more threads hold resources for which the resource dependency graph contains a cycle (that is, every deadlocked thread requires a resource that it cannot obtain until another deadlocked thread releases it). Deadlock statistics are gathered on a per-thread level, for each thread in the target application.
Stopped
Unknown

In Java, deadlocks can occur in a variety of situations, for example:

When two threads each execute a synchronized method of a different object, and each method tries to call a synchronized method of the other object.
When one thread synchronizes on resource A and attempts to synchronize on a second resource B, while a second thread synchronizes on resource B and attempts to synchronize on resource A.
Any other situation where it is not possible for two or more deadlocked threads to unblock and complete.
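The second situation in the list above (opposite lock ordering) is easy to reproduce. The sketch below deliberately provokes it and then asks the JVM which threads are deadlocked, using the standard ThreadMXBean API rather than the profiler. The class name, thread names, and the latch that forces both threads to hold their first lock are choices made for this illustration. The threads are daemon threads so that the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderDeadlock {
    private static void lockInOrder(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();   // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds 'second' and wants 'first'.
                }
            }
        }, name);
        t.setDaemon(true);   // lets the JVM exit although the threads stay deadlocked
        t.start();
    }

    /** Provokes an A-then-B versus B-then-A deadlock; returns the number of deadlocked threads found. */
    static int provokeAndCount() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        lockInOrder("locks-A-then-B", a, b, bothHoldFirst);
        lockInOrder("locks-B-then-A", b, a, bothHoldFirst);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + provokeAndCount());
    }
}
```

The usual remedy is to make every thread acquire the two monitors in the same order, which removes the cycle from the resource dependency graph.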

Thread statistics

This tab lists all of the threads that are currently running, or have run, throughout the lifetime of the application. Threads remain in the table even after termination. This view lists running time, waiting time, and blocked time. Running time is defined as the total running time of the thread minus the time that it was waiting or blocked. Waiting time is defined as the amount of time that the thread spent waiting for a monitor, and Blocked time is the amount of time that the thread spent blocked by the ownership of monitors by other threads. In addition, there are several recorded counts: Block count and Deadlock count are the number of times a thread has blocked or deadlocked, respectively, throughout the life of the thread.

In this example, you are profiling an Eclipse plug-in. Figure 10 shows the present states of all running threads, their running time, waiting time, blocked time, and deadlocked time, and the blocked count. Application developers may utilize this view to either view a general picture of the entire thread landscape of their application, or to drill down to the moment-by-moment statistics of particular threads.

Figure 10. List of threads that are running in a profiled Eclipse plug-in

The waiting time, blocked time, and deadlocked time are important statistics to consider in the context of application performance. These values should be closely scrutinized to ensure that they are appropriate, especially for time-dependent threads.

Monitor statistics

The second tab in the Thread Analysis view is Monitor Statistics, shown in Figure 11. Every object in Java has a corresponding monitor, which is the basis for all concurrency operations in Java. Monitors come into play when a thread enters a synchronized block, or when the wait or notify methods are called to wait for a dependency or to signal availability, respectively. This tab provides monitor statistics on a thread-by-thread basis.

Figure 11. The Monitor Statistics tab of the Thread Analysis view

In Figure 11, the Thread Statistics table contains a list of threads in the profiled application, along with various statistics for each thread from the first tab. Select one of the threads in the Thread Statistics table to display the monitors referenced by that thread, including various statistics for those monitors. You can then select a monitor to open up information about its class, including block and wait statistics for callers of the monitor, as well as timing and object information. This allows you to identify the particular objects that are in contention, which threads contend for them, and how often.
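The kind of information shown for a contended monitor (which thread is blocked, on what, and who owns it) can also be observed from inside a JVM. The following sketch uses the standard ThreadInfo API, not the profiler, and creates the contention on purpose; the class MonitorContention and the thread name blocked-worker are assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

class MonitorContention {
    /** Returns a snapshot of a worker taken while it is blocked on a monitor held by the caller. */
    static ThreadInfo observeBlockedWorker() {
        final Object monitor = new Object();
        Thread worker = new Thread(() -> {
            synchronized (monitor) {
                // Entered only after the caller has released the monitor.
            }
        }, "blocked-worker");
        worker.setDaemon(true);

        ThreadInfo snapshot;
        synchronized (monitor) {
            worker.start();
            // Spin until the worker is parked at the synchronized block.
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.yield();
            }
            snapshot = ManagementFactory.getThreadMXBean().getThreadInfo(worker.getId());
        }
        return snapshot;
    }

    public static void main(String[] args) {
        ThreadInfo info = observeBlockedWorker();
        System.out.println("thread:      " + info.getThreadName());
        System.out.println("state:       " + info.getThreadState());
        System.out.println("block count: " + info.getBlockedCount());
        System.out.println("blocked on:  " + info.getLockName());
        System.out.println("held by:     " + info.getLockOwnerName());
    }
}
```

The block count here corresponds in spirit to the Block count column, and the lock owner answers the "by whom" question for a single moment in time.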

Threads visualizer

In the Threads Visualizer tab shown in Figure 12, each of the seven thread states is denoted by bars of varying backgrounds and line patterns. Threads are sorted by thread group and thread name. The x-axis of the graph represents time, the range of which can be adjusted using the zoom-in and zoom-out icons.

Each row of the table contains a bar representing thread execution. Inside each bar is a continuous list of events, which represent changes in the thread state. You can double-click an event to display its call stack in the Call Stack view, and you can move from event to event using the Select Next Event and Select Previous Event buttons in the top right-hand corner of the Threads Visualizer tab.

In the graph, Waiting and Blocked states are denoted by dotted lines, and Deadlocked and Stopped states are denoted by a solid line. Of most importance to the application developers looking to identify performance issues are Deadlocked (red), Waiting (orange), and Blocked (yellow).

One important UI note: Perhaps contrary to what might be expected behavior, when a thread has terminated it will continue to maintain a dark grey representation on the chart (rather than disappearing from the chart entirely).

Figure 12. The Threads Visualizer represents the thread status of all threads in a graph, plotted against the application timeline

The buttons on the thread analysis toolbar are used to move or change focus to or from particular threads. From left to right, as shown in Figure 13, the buttons are: Legend, Show Call Stack, Reset Timescale, Zoom In/Out, Select Next/Previous Event, Select Next/Previous Thread, Group Threads, Filter Threads, and Visualize Thread Interactions. Most of these are self-explanatory: for example, the Next/Previous Thread button changes the currently selected thread, and the Select Next/Previous Event button moves the event cursor to the next or previous event of the currently selected thread. Additionally, you can group and filter threads as required.

Figure 13. Interact with the Threads Visualizer graph using the buttons at the top of the view

When you use the Rational Application Developer thread profiling functionality, you should take into account a number of considerations:

For threads that are on a wake-sleep cycle, how long does it take to complete the wake phase of the cycle, and how long does the thread sleep?
For threads that are dependent on external resources becoming available, how long are they blocked waiting for a resource to become available?
In input-processing-output oriented applications, such as a Web application, how long do the various threads take to respond to user input, process the data, and produce the corresponding output?
Some threads periodically wake to check a condition or perform a function, then return to sleep. Thread analysis allows you to observe these relationships using the Threads Visualizer.
You can use the profiling functionality to monitor producer-consumer and reader-writer relationships.

In aggregate, the thread profiling functionality provides a variety of views to analyze application thread performance and behavior. These views allow you to gather information and analyze various aspects of program execution, in order to gain insight into potential bottlenecks or failure conditions.

Profiling applications on a remote machine

So far, this tutorial has discussed profiling Java applications as they are running on a local machine. Rational Application Developer profiling also provides the capacity to launch and profile applications that are running on a machine separate from the workbench. To enable this functionality, you can download the Rational Agent Controller component and install it separately on Windows, Linux x86/x64, Linux for System Z, IBM AIX, IBM z/OS, Solaris SPARC, and Solaris x86.

Instructions to install and start the Rational Agent Controller are available on the IBM download site; consult the Related topics section for more details. When it is installed, run the SetConfig setup script, and then start the agent controller using ACServer.exe (Windows) or ACStart.sh (UNIX®) on the remote machine. An example Linux on System Z configuration is shown in Figure 14.

Figure 14. Starting the Rational Agent Controller process from the command line and setting up the Java Profiler environment variables

The profiler requires you to set additional environment variables. On Linux, for instance, agent-specific additions to the LD_LIBRARY_PATH and PATH variables are required. The other variables shown in Figure 14 are set only for convenience, so that their values can be reused without retyping long paths. You can add these variables to your global environment, specify them in the terminal session, or add them to the launch script of your Java application. Additionally, some platforms allow you to profile without setting these environment variables (by specifying the path on the command line instead). Consult the Getting Started document of your Rational Agent Controller installation for more information.

In addition to setting these environment variables, you'll also need to determine the type of profiling data that is required, and set the JVM arguments to reflect this type when launching your application.

JVM arguments on Windows:

-agentlib:JPIBootLoader=JPIAgent:server=<agent-behaviour>;<profile-option>

JVM arguments on all UNIX varieties (including Linux):

'-agentlib:JPIBootLoader=JPIAgent:server=<agent-behaviour>;<profile-option>'

(Note the single quotes around the entire string; these are required so that the shell does not interpret the semicolon as a command separator.)

Note the use of CGProf in the command-line JVM arguments of the example in Figure 14, in the position of <profile-option> in the generic command line. It is one of the options that correspond to the data collection profiling types:

CGProf : This is equivalent to Execution Time Analysis in the workbench Profiling UI. As mentioned previously, this option is used to identify performance bottlenecks, by breaking down execution time on a method-by-method basis.
HeapProf : This is equivalent to Memory Analysis in the workbench Profiling UI. As mentioned, this option tracks the contents of the heap by tracing object allocation and deallocation, as well as garbage collection events.
ThreadProf : This is equivalent to Thread Analysis in the workbench Profiling UI. This option traces thread and monitor usage during application execution.

You need to select one of the data collection types, and place that value in the profile-option JVM argument value above. You may only specify one at a time.

In addition, you need to select agent behavior (for instance, the example in Figure 14 uses controlled ):

controlled : This agent behavior prevents the JVM from initializing until the agent is attached to (from the workbench) and given instructions to start monitoring. As soon as the agent connection is established, the JVM will start. Because the JVM waits until the workbench has connected, the profiling agent will generate data for the entire lifecycle of the application.
enabled : With this agent behavior, the profiling agent is launched at JVM startup. However, the JVM is initialized immediately, and begins running without waiting for the workbench to connect. The profiling agent does not begin to generate data until the workbench has connected to the agent and started monitoring, so any application execution that takes place before then is not recorded.

An additional agent behavior is standalone , which is outside the purview of this tutorial. It allows profiling without an agent controller by writing data to a trace file on the local file system, which can then be directly imported into Rational Application Developer. Similarly, additional command line options are available to fine tune profiler data. For more information, consult the Rational Agent Controller Getting Started document.

Example 1 :

-agentlib:JPIBootLoader=JPIAgent:server=enabled;HeapProf

(Heap profiling on Windows, in enabled mode)

Example 2 :

'-agentlib:JPIBootLoader=JPIAgent:server=controlled;CGProf'

(Execution time profiling on Linux, in controlled mode; note the single quotes required by the shell)
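If you launch profiled JVMs from your own tooling, it can help to build the argument from named constants instead of typing it by hand. This is a small sketch under that assumption: the class ProfilerAgentArgument and its enum names are invented here, while the argument format and the option tokens are the ones listed above.

```java
class ProfilerAgentArgument {
    /** Agent behaviors described in this tutorial. */
    enum Behavior {
        CONTROLLED("controlled"), ENABLED("enabled"), STANDALONE("standalone");

        final String token;

        Behavior(String token) {
            this.token = token;
        }
    }

    /** Data collection types; only one can be given per JVM argument. */
    enum DataCollection {
        EXECUTION_TIME("CGProf"), MEMORY("HeapProf"), THREAD("ThreadProf");

        final String token;

        DataCollection(String token) {
            this.token = token;
        }
    }

    /** Builds the unquoted argument, as used on Windows. */
    static String jvmArgument(Behavior behavior, DataCollection collection) {
        return "-agentlib:JPIBootLoader=JPIAgent:server=" + behavior.token + ";" + collection.token;
    }

    /** Wraps the argument in single quotes so that a UNIX shell leaves the semicolon alone. */
    static String quotedForUnixShell(String argument) {
        return "'" + argument + "'";
    }

    public static void main(String[] args) {
        String heap = jvmArgument(Behavior.ENABLED, DataCollection.MEMORY);
        String execution = jvmArgument(Behavior.CONTROLLED, DataCollection.EXECUTION_TIME);
        System.out.println(heap);
        System.out.println(quotedForUnixShell(execution));
    }
}
```

The quoting is needed only when the argument passes through a shell; a launcher that hands arguments directly to the process (for example, java.lang.ProcessBuilder) should receive the unquoted form.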

When the target application JVM has been run with appropriate JVM arguments, you are ready to connect from the workbench. To connect from the workbench, bring up the Profile Configurations dialog, as shown in Figure 15.

Figure 15. Selecting the Profiling Configurations dialog from the workbench UI

A dialog box that shows the available profiling options is displayed, as shown in Figure 16.

Figure 16. Profile launch configuration options

Create a new Attach to Agent launch configuration (described previously) by double-clicking that option. You can also customize the new configuration by adding the remote machine as a new host. The agent controller on the remote machine is available at port 10002 (the default port number), as shown in Figure 17.

Figure 17. Add host dialog

After it is added as a host, the agent running on the remote machine (in this example the execution statistics agent) should be available on the Agents tab. If it is not, verify the Agent Controller setup and status on the remote machine. When you are ready, select the agent and click Profile . The workbench will attach to the agent and switch to the profiling dialog. You can now collect and analyze the data as required.
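One quick check when the agent does not appear is whether anything is listening on the agent controller port at all. The sketch below attempts a plain TCP connection with a timeout. It assumes the default port 10002 mentioned above, and the class name AgentControllerPortCheck is made up. A successful connection shows only that some process accepts connections on that port, not that it is a correctly configured agent controller.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class AgentControllerPortCheck {
    static final int DEFAULT_PORT = 10002;

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        boolean reachable = isReachable(host, DEFAULT_PORT, 2000);
        System.out.println(host + ":" + DEFAULT_PORT + (reachable ? " is reachable" : " is not reachable"));
    }
}
```

If the port is not reachable, likely causes include the agent controller not running, a different configured port, or a firewall between the workbench and the remote host.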

One other way of profiling an application on a remote machine is to use the previously described External Java Application option (as shown in Figure 18) instead of Attach to Agent .

Figure 18. Specifying class name and class path under External Java Application

In a new configuration of External Java Application, you specify the location and name of the Java main class on the remote machine, as shown in Figure 18. The Monitor tab helps specify the kind of profiling agent to use (Execution profiler, Memory profiler, or Thread profiler). When you click the Profile button, the application executes on the remote host, but the input and output are directed to the console window in the local workbench.

Profiling Eclipse Rich Client Platform plug-ins

Rational Application Developer supports profiling Eclipse Rich Client Platform (RCP) plug-ins. You can perform this profiling through the Eclipse Application option in the Profiling Configurations dialog box. There is an option to profile a new Eclipse instance using the plug-ins that are under development in the workbench. When you profile Eclipse plug-ins, it is especially wise to use a filter set that limits profile data directly to the packages that relate to your particular plug-in. As with other launch configurations, the profiling launch UI is built on the existing launch UI, which means that the workbench maintains a consistent profiling UI across varied application types. Profiling a plug-in is as easy as profiling a local Java application or other launch type.

Eclipse Web Tools Platform integration and profiling with WebSphere

Additionally, Rational Application Developer supports profiling servers like WebSphere Application Server or Tomcat, either running on the local machine, or connected to a remote machine. Rational Application Developer's profiling functionality closely integrates with existing server configurations. When you develop Web applications or Web services that run on WebSphere Application Server, or other supported application servers, you can launch a profiled application by selecting the server to profile in the Server view and then selecting the Profile icon. From here, the Profile on Server dialog box is displayed, and you can select the profiling type, as well as additional choices such as filters and profiling options.

A note on server profiling: the JVMTI profiling agent collects data at the JVM level rather than collecting data on a per-application basis. This means that all Java code that runs on the JVM will generate event data (including the server itself). You must ensure that you have correct filters set up to correctly target your application.

What you have learned

This tutorial explored the multi-faceted profiling functionality provided by Rational Application Developer. Rational Application Developer provides a user-friendly and intuitive interface to examine those details that are helpful for tuning a Java application, all the while seamlessly integrating with existing application configurations. The profiler is available for a wide variety of platforms, and supports any and all JVM configurations quickly and easily. With the proper application of profiling tools, and careful analysis of application performance and characteristics, you can discover and deal with performance issues before they become a problem, and before more costly solutions are required.

Translated from: https://www.ibm.com/developerworks/rational/tutorials/profilingjavaapplicationsusingrad/index.html