Best practice for JVM Tuning[转]

原文地址：https://backstage.forgerock.com/knowledge/kb/article/a35746010

JVM tuning considerations

Before you begin JVM tuning, you should consider the following points:

Costs - depending on your environments, it may be more cost effective to add more hardware to improve performance rather than spending the time to tune.
Desired outcome - tuning for stability is more important in the long run than tuning for performance, although there is an overlap.
Ongoing problems - tuning may only delay or mask a problem temporarily if there is an underlying issue you need to resolve.
Memory leaks - these will always cause Garbage Collection (GC) issues regardless of tuning.

What is Java® heap and how do I know it needs tuning?

The Java heap is the amount of memory allocated to applications running in the JVM. Objects in heap memory can be shared between threads. Garbage collection (GC) refers to the automatic process of managing the runtime memory. The JVM is subject to GC pauses that vary in frequency and duration; if these pauses become more frequent and/or last longer, you will see issues (such as application pauses) which indicate that you may need to tune your Java heap. The heap is initially created when the JVM starts and is split into different spaces or generations, the key ones being Young (Eden or Nursery) and Tenured (Old):

The Young generation is used for new objects. A GC process (ParNew) automatically runs when it becomes full which removes unused objects and moves all objects that have lived long enough to the Tenured generation, thereby freeing up space in the Young generation for more new objects. Typically objects in the Young generation are short lived and temporary. The Young generation is small but active and GC happens frequently but with limited impact on performance.
The Tenured generation is used for the longer lived objects. Another GC process (CMS) runs when it becomes full to remove any unused objects. The Tenured generation is larger and less active than the Young generation but GC tends to have more impact on performance.

The following diagram illustrates the heap and also the corresponding GC options that can be set in (Java 7 and earlier):

Key symptoms that indicate you need to tune your Java heap are:

High CPU usage. When GC threads consume a lot of CPU, you will see spikes in overall CPU usage.
Application hangs. When your application appears to hang and stops responding, there are gaps in your log files or you experience general slowness when performing routine tasks, poorly tuned GC can be the cause.

The first thing to check if you experience these symptoms is your GC logs; this are easy to check and may help you identify your issue quickly.

GC Logging

There are various different GCs available; ForgeRock recommends the use of CMS (ConcurrentMMarkSweep), which is a throughput collector and tends to give the best performance as it runs concurrently with the application, meaning less pauses. You should specify this collector by setting the following JVM option:

-XX:+UseConcMarkSweepGC

You can enable GC logging as described in the following articles:

How do I enable Garbage Collector (GC) Logging for AM/OpenAM (All versions)?
How do I enable Garbage Collector (GC) Logging for DS/OpenDJ (All versions)?
How do I enable Garbage Collector (GC) Logging for IDM/OpenIDM (All versions)?
How do I enable Garbage Collector (GC) Logging for IG/OpenIG (All versions)?

Note

Typically you can leave GC logging on without impacting your system, however, you can just enable it when you are experiencing issues.

Understanding GC Logs

These example log files come from a CMS collector, where ParNew refers to the Young generation and CMS refers to the Tenured generation. These log file examples also show typical GC logs for a system where tuning is appropriate.

ParNew section

An example GC log output for the ParNew section looks like this:

24.245: [GC24.245: [ParNew: 545344K->39439K(613440K), 0.0424360 secs] 594671K->88767K(4101632K), 0.0426850 secs] [Times: user=0.17 sys=0.00, real=0.04 secs]
29.747: [GC29.747: [ParNew: 584783K->27488K(613440K), 0.0889130 secs] 634111K->104723K(4101632K), 0.0892180 secs] [Times: user=0.24 sys=0.01, real=0.09 secs]

Each line in this log file represents one GC and shows the following types of information for ParNew:

Timestamp information - if you did not set the PrintGCTimeStamps and PrintGCDateStamps options, this just shows as the number of seconds from JVM startup:
```
24.245: [GC24.245:
29.747: [GC29.747:
```
This information is useful as it shows the frequency of GCs. You should be aiming for GCs to occur once every 1 to 5 seconds; if they are occurring more than once a second, you need to tune your JVM and it is likely that you will be seeing high CPU usage.
Young generation information:
```
[ParNew: 545344K->39439K(613440K), 0.0424360 secs]
[ParNew: 584783K->27488K(613440K), 0.0889130 secs]
```
This information shows the initial size of the Young space before doing the GC, the size after doing the GC, the total size available and the time it took to do the GC. If the time taken to do the GC is large, that is more than 0.1 seconds, then it is possible that your Young generation heap size is too big. The total size available will not grow if you set NewSize and MaxNewSize to the same value and this is what you want to see in your logs.
Overall heap information:
```
594671K->88767K(4101632K), 0.0426850 secs]
634111K->104723K(4101632K), 0.0892180 secs]
```
This information shows initial size of the overall heap before doing the GC, the size after doing the GC and the total size available. With the second lot of figures (size after doing the GC), you can see how this incrementally grows up to the point when the GC is done and then reduces back down to a baseline figure. If you have a memory leak, you will see the baseline figure growing incrementally as it becomes increasingly full even after doing GC.
System information:
```
[Times: user=0.17 sys=0.00, real=0.04 secs]
[Times: user=0.24 sys=0.01, real=0.09 secs]
```
This information shows how much time was spent in user space, how much time spent in kernel or system space and what was the real impact to the user. If the user time is high, it indicates there isn't enough CPU for the number of user threads. If the system time is high, it indicates your system is swapping memory to disk which means that there isn't enough physical memory for your heap size.

CMS section

The CMS section of the log file is different to the ParNew section as it shows multiple lines for the same GC. An example GC log output for the CMS section looks like this:

40.146: [GC [1 CMS-initial-mark: 26386K(786432K)] 26404K(1048384K), 0.0074495 secs]
40.154: [CMS-concurrent-mark-start]
40.683: [CMS-concurrent-mark: 0.521/0.529 secs]
40.683: [CMS-concurrent-preclean-start]
40.701: [CMS-concurrent-preclean: 0.017/0.018 secs]
40.704: [GC40.704: [Rescan (parallel) , 0.1790103 secs]40.883: [weak refs processing, 0.0100966 secs] [1 CMS-remark: 26386K(786432K)] 52644K(1048384K), 0.1897792 secs]
40.894: [CMS-concurrent-sweep-start]
41.020: [CMS-concurrent-sweep: 0.126/0.126 secs]
41.020: [CMS-concurrent-reset-start]
41.147: [CMS-concurrent-reset: 0.127/0.127 secs]

This log shows the different phases the GC goes through. All the ones marked concurrent are happening while your application is running and therefore have little impact. The ones that don't say concurrent are stopping your application from running temporarily and therefore can have an impact. The CMS-initial-mark and CMS-remark show heap sizing details.

Note

Typically, you will see these sections intermingled, for example, some ParNews then a CMS, some more ParNews, another CMS and so on as the GC processes automatically occur to manage the heap.

Common issues shown in the GC logs and solutions

This section identifies common issues you might find in your GC logs and how you can resolve them. You can also use the Universal GC Log Analyzer to analyze your GC log. This is a third-party website that we suggest can be used for analysis but is not supported by ForgeRock.

Young generation - too small

The following log snippet shows an issue where the Young generation heap size is too small:

1.813: [GC1.813: [ParNew: 1152K>128K(1152K), 0.0008100 secs] 16620K->15756K(26936K), 0.0008350 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
1.816: [GC1.816: [ParNew: 1152K>128K(1152K), 0.0006430 secs] 16780K->15913K(26936K), 0.0006640 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
1.819: [GC1.819: [ParNew: 1152K>128K(1152K), 0.0005370 secs] 16937K->16038K(26936K), 0.0005570 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

As you can see, there are multiple GCs occurring within a second. This means you need to increase the size of the Young generation (NewSize and MaxNewSize) and may also need to increase total heap size to compensate for this change (Xms and Xmx). Alternatively you could change the NewRatio option (this is set to 2 by default, which means the Tenured generation heap is twice as big as the Young generation heap or that the Young generation heap is a 1/3 of the size of the Tenured generation heap).

Young generation - too large / Tenured generation - too small

The following log snippet shows an issue where the Young generation heap size is too large and consequently the Tenured generation heap size is too small:

276.716: [GC (CMS Initial Mark) [1 CMS-initial-mark: 104176K(135168K)] 2758985K(6741248K), 0.8762860 secs] [Times: user=0.88 sys=0.00, real=0.88 secs]
277.592: [CMS-concurrent-mark-start]
277.657: [CMS-concurrent-mark: 0.063/0.065 secs] [Times: user=0.12 sys=0.00, real=0.06 secs]
277.657: [CMS-concurrent-preclean-start]
277.658: [CMS-concurrent-preclean: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
277.658: [CMS-concurrent-abortable-preclean-start]
277.658: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
277.658: [GC (CMS Final Remark)[YG occupancy: 2657419 K (6606080 K)]277.658: [Rescan (parallel) , 0.9815460 secs]278.639: [weak refs processing, 0.0000320 secs]278.640: [scrub string table, 0.0011700 secs] [1 CMS-remark: 104176K(135168K)] 2761595K(6741248K), 0.9828250 secs] [Times: user=7.18 sys=0.09, real=0.99 secs]
278.641: [CMS-concurrent-sweep-start]
278.668: [CMS-concurrent-sweep: 0.026/0.027 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
278.668: [CMS-concurrent-reset-start]
278.668: [CMS-concurrent-reset: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
280.673: [GC (CMS Initial Mark) [1 CMS-initial-mark: 104079K(135168K)] 2774091K(6741248K), 0.9033730 secs] [Times: user=0.90 sys=0.00, real=0.90 secs]
281.577: [CMS-concurrent-mark-start]
281.640: [CMS-concurrent-mark: 0.063/0.063 secs] [Times: user=0.13 sys=0.00, real=0.07 secs]
281.640: [CMS-concurrent-preclean-start]
281.641: [CMS-concurrent-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
281.641: [CMS-concurrent-abortable-preclean-start]
281.641: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
281.641: [GC (CMS Final Remark)[YG occupancy: 2670011 K (6606080 K)]281.641: [Rescan (parallel) , 0.9914290 secs]282.633: [weak refs processing, 0.0000110 secs]282.633: [scrub string table, 0.0008100 secs] [1 CMS-remark: 104079K(135168K)] 2774091K(6741248K), 0.9923100 secs] [Times: user=7.14 sys=0.11, real=0.99 secs]
282.634: [CMS-concurrent-sweep-start]
282.659: [CMS-concurrent-sweep: 0.024/0.025 secs] [Times: user=0.06 sys=0.01, real=0.02 secs]
282.659: [CMS-concurrent-reset-start]
282.659: [CMS-concurrent-reset: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

As you can see, the CMS collections are occurring too frequently, without any ParNew collections in between. This is a result of the Young generation heap size being too large so that each time a ParNew collection happens, the Tenured heap size instantly becomes full and a CMS collection runs. This could happen if you increased your Young generation heap size without increasing your total heap size. This log snippet could also happen if you have memory leaks.

CMS failure

The following log snippet shows a CMS failure (concurrent mode failure):

(concurrent mode failure): 1551948K->1548050K(1572864K), 15.1154540 secs] 1986510K->1548050K(2044736K), [CMS Perm : 78597K->78588K(524288K)], 15.4406700 secs] [Times: user=15.52 sys=0.00, real=15.44 secs]
703240.983: [GC [1 CMS-initial-mark: 1548050K(1572864K)] 1561545K(2044736K), 0.0059690 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]

This error means that the CMS collector cannot keep up with the amount of garbage being generated (the Tenured generation heap space is about 90% full) and GCs are taking longer meaning your application stops for longer periods of time. This example shows the application stopping for over 15 seconds, which will have a significant impact on performance.

You can add the following GC options to try to avoid the CMS failure:

-XX:CMSInitiatingOccupancyFraction=<percentage-of-tenured>
-XX:+UseCMSInitiatingOccupancyOnly

You should set -XX:CMSInitiatingOccupancyFraction to 70 or 80% (instead of default ~90%) which may enable the CMS collector to keep up with the amount of garbage being generated.

Full GC

The following log snippet shows a Full GC:

25.466: [Full GC )25.466: [CMS: 68756K>75361K(1987392K), 0.2548820 secs] 96571K->75361K(2064064K), [CMS Perm : 61069K>60722K(61376K)], 0.2551850 secs] [Times: user=0.25 sys=0.01, real=0.25 secs]
26.185: [Full GC (System.gc())26.185: [CMS: 75361K>73283K(1987392K), 0.2398450 secs] 76136K->73283K(2064064K), [CMS Perm : 60723K>60723K(101204K)], 0.2400120 secs] [Times: user=0.24 sys=0.00, real=0.24 secs]
27.293: [Full GC (System.gc())27.293: [CMS: 73283K>73291K(1987392K), 0.2111960 secs] 74096K>73291K(2064064K), [CMS Perm : 60723K>60723K(101396K)], 0.2112730 secs] [Times: user=0.21 sys=0.00, real=0.22 secs]
68.081: [Full GC (Heap Inspection Initiated GC)68.081: [CMS: 73291K->73187K(1987392K), 0.2172670 secs] 77634K>73187K(2064064K), [CMS Perm : 60742K>60742K(101396K)], 0.2174520 secs] [Times: user=0.22 sys=0.00, real=0.21 secs]

These log snippets refer to Full GCs occurring, meaning that the entire heap is being cleared (Young and Tenured); this often happens due to system processes such as RMI which performs a Full GC every hour on the hour. The PrintGCCause option is very useful to help you find out why the GC is happening (the last three lines in the log above show the additional information output by this option, making it clear which ones are system related). You can also set the DisableExplicitGC option to prevent system related Full GCs from happening.

Permanent generation

Note

The following information does not apply to Java 8 since the Permanent generation space has been removed.

The following log snippet shows a Full GC as a result of the Permanent generation space growing (the Permanent generation space contains metadata and other data that should be permanently stored and does not typically grow):

1.241: [Full GC (Ergonomics) [PSYoungGen: 6125K>0K(45056K)] [ParOldGen: 17659K->20350K(20480K)] 23785K>20350K(65536K) [PSPermGen: 13550K->13546K(27136K)], 0.0531680 secs] [Times: user=0.25 sys=0.00, real=0.05 secs]
1.513: [Full GC (Ergonomics) [PSYoungGen: 38912K>2783K(45056K)] [ParOldGen: 20350K>20424K(20480K)] 59262K>23208K(65536K) [PSPermGen: 15188K->15188K(31744K)], 0.0992090 secs] [Times: user=0.65 sys=0.02, real=0.10 secs]

These log snippets show the Permanent generation space growing (figure in brackets after PSPermGen). You can increase the MaxPermSize option to resolve this issue.

Best practice advice on tuning

Initially, you should configure your JVM as per the recommended settings for your product (if available):

AM/OpenAM - Setup and Maintenance Guide › Maintaining an Instance › Tuning Java Virtual Machine Settings
DS/OpenDJ - Administration Guide › Tuning Servers For Performance › Java Settings

You can then tune your JVM as and when required.

You should always follow this basic process when tuning:

Establish a baseline using a valid load testing process. This step is essential as it is the only way you will know if your tuning is having the desired effect.
Change one thing at a time and document each change. This allows you to determine the impact of each change and revert if necessary.

Heap tuning guidelines

Although we cannot give recommended heap settings because they vary considerably between environments and applications, you should follow these guidelines when tuning:

You should always use the latest supported software versions.
You should use the CMS GC as recommended by ForgeRock.
Your total heap size must not exceed the amount of physical RAM available.
You must leave some physical RAM for other applications to run on your machine.
You should set Xms and Xmx to the same value for best performance. These options determine your total heap size.
You should set NewSize and MaxNewSize to the same value. These options determine how the heap is allocated between generations.
You should set the DisableExplicitGC option to prevent system related full GCs from running.
You must not add lots of options at once or add options you don't understand.
You must not copy and paste options from other applications.
You should use GC logs to help identify issues with your tuning.

Getting your total heap size and allocation between generations right is key; once you have these set correctly, you likely won't need to do anymore tuning.

Summary

In summary, the most important GC knobs to get right are:

The collector, which should be CMS.
The total heap size, where Xmx = Xms.
The split between generations, using NewSize or NewRatio, where NewSize = MaxNewSize.

Related Training

ForgeRock Directory Services Core Concepts

Related Issue Tracker IDs

OPENAM-4414 (Apache Policy Agent does not complete cleanup / logout)

OPENIDM-1563 (Task scanner creates a new thread pool for each execution resulting in a thread leak.)

转载于:https://www.cnblogs.com/davidwang456/articles/11286391.html