
Maximizing Java Performance on AIX: Part 5 - References and Conclusion

developerWorks
Level: Intermediate

Amit Mathur (amitmat@us.ibm.com), Senior Technical Consultant and Solutions Enablement Manager, IBM
Sumit Chawla (sumitc@us.ibm.com), IBM Certified IT Architect and Technical Lead, Java Enablement, IBM

17 May 2004

This is the conclusion of the 5-part series providing tips and techniques that are commonly used for tuning Java™ applications for optimum performance on AIX®. We touch upon other interesting areas of Java performance tuning for AIX, look at a few case studies, and then end the series with a list of useful references.

Introduction

This is the conclusion of the 5-part series on Java performance. The first article in the series laid the foundation for performance tuning, and Parts 2, 3, and 4 looked at various bottlenecks that can affect a system's scalability and throughput. This article covers two important topics that were not covered previously, and provides case studies and references.

A frequently asked question is how to translate Sun-specific command-line switches to their IBM-specific equivalents. In addition, no serious performance tuning exercise, such as a benchmark, can ignore system-wide tuning. We touch upon these topics briefly in the next section.

This is followed by a few case studies that attempt to illustrate how the tools and tips described in the series are applied to solve problems in the field. The emphasis is on understanding and learning how to use the tools and techniques mentioned in the series.

The article, and the series, concludes with a recap of useful references.



Other Important Considerations

This section covers translating a Sun Java configuration to an IBM Java configuration, and system-wide tuning for AIX applications. Both topics are vast, so we touch on them only briefly.

Translating Sun Java Switches

If you have an application that has been tuned for Sun Java, and you are attempting to migrate your application to AIX (or, for that matter, any platform running IBM Java), you may already have done the hard work. Understanding the application characteristics is half the battle. You can use the characteristics-based tuning tips explained in Part 2 and Part 3 based on the understanding obtained by the tuning exercise with Sun Java.

However, we frequently receive queries about how to translate specific Sun Java command-line switches to equivalent IBM Java command-line switches. These switches almost always relate to Garbage Collection, as a well-tuned GC is essential to the performance of any Java-based application. A mapping between Sun and IBM switches is difficult because of the difference in JVM architecture. IBM Java does not contain a generational garbage collector, and does not understand any command-line switches that start with -XX. The IBM Java "Sovereign" architecture is not based on the Sun HotSpot architecture either. The easiest, and in most cases the quickest, way is to throw away all Sun-specific settings when running your application on IBM platforms, and then carry out fine-tuning as needed. But if you are curious about how some Sun switches map to IBM switches, read on.

The table below attempts to translate Sun Java GC command-line switches to equivalent IBM switches. This mapping is based on the functionality of Sun switches as described in the Sun-specific article "Tuning Garbage Collection with the 1.4.2 Java Virtual Machine". You should attempt to use this table only for the very specific purpose of locating an equivalent (or close) switch for IBM Java. This is not meant to replace the tuning exercise, as even heap size requirements can be quite different. For general GC tuning tips with IBM Java, as well as for information on these and other GC switches for IBM Java, please refer to Fine-tuning Java garbage collection performance. The creation of this table did not involve any performance-related testing, but was based entirely on the documented use of the Sun switches in the reference quoted above.

| Sun switch | Equivalent IBM switch | Notes |
| --- | --- | --- |
| -Xms, -Xmx | -Xms, -Xmx | These parameters, and their meaning, remain unchanged. You may still need to do heap sizing. |
| -XX:SurvivorRatio, -XX:NewSize, -XX:MaxNewSize, -XX:NewRatio | None | These switches can simply be removed, as they apply to generational GC, which IBM Java does not use. |
| -XX:MinHeapFreeRatio, -XX:MaxHeapFreeRatio | -Xminf, -Xmaxf | Heap expansion and shrinkage are controlled by other factors, not just these switches. |
| -verbose:gc, -XX:+PrintGCDetails | -verbose:gc | The IBM Java verbosegc trace format is quite different from the Sun format. More detailed tracing can be enabled as needed, but in most cases the default verbosegc traces are sufficient. |
| -XX:+UseParallelGC, -Xincgc, -XX:+AggressiveHeap | None | These select various types of garbage collectors supported by Sun. They do not apply to IBM Java. |
| -XX:+UseConcMarkSweepGC -XX:+UseParNewGC | -Xgcpolicy:optavgpause | The concurrent low-pause collector is close to IBM Concurrent Mark in intent (but not necessarily in design). |
| -XX:+CMSParallelRemarkEnabled | None | Not applicable to IBM Java. |
| -XX:ParallelGCThreads | -Xgcthreads | Changing this setting is not advisable, at least for IBM Java. |
| -Dsun.rmi.dgc.client.gcInterval, -Dsun.rmi.dgc.server.gcInterval | -Dsun.rmi.dgc.client.gcInterval, -Dsun.rmi.dgc.server.gcInterval | See NIO003 in Part 4. |

As the above table demonstrates, translating the switches in most cases will involve simply discarding the Sun switches for IBM platforms. This makes the GC tuning exercise for IBM Java quite painless, while still providing superior performance.
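The mapping can be sketched as a small helper that rewrites a Sun-style option list. This is only an illustration of the table, not an IBM-supplied tool: the class and method names are hypothetical, the sketch uses the standard -verbose:gc spelling, and it assumes that the Sun free-ratio switches take a percentage while -Xminf and -Xmaxf take a fraction between 0 and 1. The resulting values still need to be re-tuned on IBM Java.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: applies the Sun-to-IBM switch table to an option list. */
final class SwitchTranslator {

    private SwitchTranslator() {}

    /** Returns the IBM-side options suggested by the table for the given Sun-side options. */
    static List<String> translate(List<String> sunOptions) {
        List<String> ibm = new ArrayList<>();
        for (String opt : sunOptions) {
            if (opt.startsWith("-Xms") || opt.startsWith("-Xmx")
                    || opt.equals("-verbose:gc") || opt.startsWith("-Dsun.rmi.dgc.")) {
                addOnce(ibm, opt);                          // unchanged on IBM Java
            } else if (opt.startsWith("-XX:MinHeapFreeRatio=")) {
                addOnce(ibm, "-Xminf" + percentToFraction(valueOf(opt)));
            } else if (opt.startsWith("-XX:MaxHeapFreeRatio=")) {
                addOnce(ibm, "-Xmaxf" + percentToFraction(valueOf(opt)));
            } else if (opt.startsWith("-XX:ParallelGCThreads=")) {
                addOnce(ibm, "-Xgcthreads" + valueOf(opt)); // the table advises against changing this
            } else if (opt.equals("-XX:+PrintGCDetails")) {
                addOnce(ibm, "-verbose:gc");
            } else if (opt.equals("-XX:+UseConcMarkSweepGC")) {
                addOnce(ibm, "-Xgcpolicy:optavgpause");
            } else if (opt.startsWith("-XX:") || opt.equals("-Xincgc")) {
                // No IBM equivalent in the table: drop the switch.
            } else {
                addOnce(ibm, opt);                          // not a GC switch covered by the table
            }
        }
        return ibm;
    }

    private static void addOnce(List<String> list, String option) {
        if (!list.contains(option)) {
            list.add(option);
        }
    }

    /** Extracts the text after '=' in a name=value switch. */
    private static String valueOf(String option) {
        return option.substring(option.indexOf('=') + 1);
    }

    /** Converts a percentage such as "40" into a fraction such as "0.4". */
    private static String percentToFraction(String percent) {
        return new BigDecimal(percent).movePointLeft(2).stripTrailingZeros().toPlainString();
    }
}
```

For example, passing -Xms256m, -XX:NewRatio=2, and -XX:+UseConcMarkSweepGC yields -Xms256m and -Xgcpolicy:optavgpause, with the generational switch dropped.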

System-wide tuning

Using AIX tools like schedo and vmo has a system-wide effect, so a thorough coverage of these tools is beyond the scope of the current series. See the Resources section for more information on these tools.

But for most multi-tiered applications, and especially for benchmarks, system-wide tuning is unavoidable. There are several excellent resources available for AIX performance tuning that you can consider. To get an idea of the kind of tuning normally required, you can look at actual published benchmarks. For example, a recent SPECjbb2000 result for IBM Java on AIX, http://www.spec.org/osg/jbb2000/results/res2003q3/jbb2000-20030624-00194.html, lists the following OS tunings:
Operating system tunings
  • SPINLOOPTIME=2000
  • vmo -r -o lgpg_regions=256 -o lgpg_size=16777216
  • setsched -S rr -P 40 -p $$
  • schedtune -t 400 -F 1
  • vmtune -S 1

Warning: These settings must not be applied without carefully understanding the consequences, as an improper use of these settings can actually worsen the system performance.

So what do the above settings do? Referring to AIX documentation, you can quickly get a better understanding of what each of these settings is doing. Let us examine each of these in turn.

SPINLOOPTIME is an environment variable that controls the number of times the system retries a busy lock before yielding to another process. The default is 40, so the higher value used here tells the system to keep trying a little longer for the lock to be freed. On multiprocessor systems this can improve performance, since retrying a busy lock is cheaper than a process context switch.

The vmo line sets up the size and number of large pages. If you look at the Java command-line switches used in the benchmark, you will see that -Xlp is specified. This enables large page support in Java. If you have a memory-intensive application, you can experiment with large pages to see if they help. More information on this topic is available in the SDK Guide accompanying Java.
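To see how much memory the vmo line above dedicates to large pages, multiply the two values. The following is a minimal arithmetic sketch; the class and method names are hypothetical, and only the two numbers come from the benchmark disclosure.

```java
/** Hypothetical helper: computes the size of the large page pool from the vmo values. */
final class LargePagePool {

    private LargePagePool() {}

    /** Total bytes covered by lgpg_regions pages of lgpg_size bytes each. */
    static long poolBytes(long lgpgRegions, long lgpgSizeBytes) {
        return Math.multiplyExact(lgpgRegions, lgpgSizeBytes);
    }

    public static void main(String[] args) {
        long regions = 256L;            // lgpg_regions from the benchmark
        long pageSize = 16_777_216L;    // lgpg_size from the benchmark (16 MB)
        long total = poolBytes(regions, pageSize);
        System.out.println(regions + " pages x " + (pageSize >> 20) + " MB = "
                + (total >> 30) + " GB of large pages");
    }
}
```

With the benchmark values this comes to 256 pages of 16 MB each, or 4 GB. A machine with less memory, or a smaller Java heap, would need correspondingly smaller values.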

The setsched line is actually a script, not an AIX command. It calls the thread_setsched kernel service to select fixed-priority round-robin scheduling, with a fixed priority of 40 for the Java process.

From AIX 5.2 onwards, the schedtune command is also a script, which maps the passed parameters to the new schedo command. The above line changes the time slice for fixed-priority threads to 400 ticks. It also forces the fixed-priority threads to reside in the global run queue.

Finally, the vmtune command translates the call to an equivalent vmo call, and the above line enables pinning of shared memory segments.

So you can see that the system was switched to large pages and a fixed-priority scheduler to get record numbers with the SPECjbb2000 benchmark. Can you use these same settings for your application? Probably not, but you are now in a position to examine these commands and the scheduling policies, and to experiment to suit your application characteristics. This is the next step in performance tuning.



Case Studies

In this section we look at a few examples, taken from actual issues handled by the Java service team. These examples should give you a good idea of how to approach performance tuning, and how to use various tools to gather information for a tuning exercise. Note that the cases in this section were not chosen based on how frequently the problem is encountered in the field. The emphasis is on understanding how to use the various tools and techniques discussed in the series to locate and correct performance issues.

Case 1: Bad application response time

The reported issue was that the Java-based application's response time was unacceptable. Using topas, and then vmstat, it was seen that Java was the application consuming the most CPU. Using tprof, the functions that showed up had GC-related terms in them (for example, localMark, which is used in the Mark phase), which indicated a possible issue with Java heap sizing.

Looking at GC logs confirmed that the heap was expanding very often. This, combined with multiple allocation failures in quick succession and a very full heap, resulted in the Java application spending a lot of time just trying to locate a free chunk, not finding it, expanding the heap, and then satisfying only the current allocation request.

This was fixed by specifying a larger value for -Xmine, forcing the heap to grow faster (see Tip MEM003 in Part 3). The result was that a single expansion avoided multiple potential allocation failures.

The first step, using AIX tools, confirmed this to be a Java-related issue. The second step, guided by the fact that AIX tools indicated a problem related to GC, could concentrate on GC logs directly. The third step used the available tuning parameters to break the unusual cycle that the application was getting into, allowing the excessive CPU time to be recovered.
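A rough way to see why a larger minimum expansion helps is to count how many expansions the heap needs to cover a given amount of growth. The sketch below is a deliberately simplified model, not the JVM's actual expansion algorithm (which also depends on other settings and on how much free space each collection recovers); the class name, method name, and sizes are hypothetical.

```java
/** Hypothetical model: counts fixed-size expansions needed to reach a target heap size. */
final class ExpansionModel {

    private ExpansionModel() {}

    /**
     * Number of expansions needed to grow from currentBytes to at least targetBytes
     * when every expansion adds exactly stepBytes.
     */
    static long expansionsNeeded(long currentBytes, long targetBytes, long stepBytes) {
        if (stepBytes <= 0) {
            throw new IllegalArgumentException("stepBytes must be positive");
        }
        if (targetBytes <= currentBytes) {
            return 0;
        }
        long gap = targetBytes - currentBytes;
        return (gap + stepBytes - 1) / stepBytes;   // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Growing from 256 MB to 512 MB in 1 MB steps versus 32 MB steps.
        System.out.println(expansionsNeeded(256 * mb, 512 * mb, 1 * mb));
        System.out.println(expansionsNeeded(256 * mb, 512 * mb, 32 * mb));
    }
}
```

In this model, growing the heap by 256 MB takes 256 expansions at 1 MB per step but only 8 at 32 MB per step. Each avoided expansion is also a GC cycle and allocation failure that the application does not have to sit through.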

Case 2: JVMPI to the rescue

Another interesting scenario was raised as a performance issue, with CPU being busy most of the time. Looking at verbosegc, we could see that a GC cycle was being called a bit too frequently, resulting in the application spending most of its time doing just GC.

The verbosegc traces showed that most of the GC activity was caused by multiple allocations of very large objects, roughly 10 MB or more in size. These were fragmenting the heap, making the GC cycles longer and thus affecting performance. But from the verbosegc output alone, the customer could not say what these objects were.

The easiest way to locate the culprit would have been to analyze a heapdump using the HeapRoots tool. But there was another twist to the situation: the large objects were not surviving the GC cycle, so the heapdump did not show any objects of such size.

This is a classic example of how profiling can be a very useful ally for locating and correcting problems in application sources. The Java Virtual Machine Profiler Interface (JVMPI) makes this problem trivial. For this particular example, we used a variation of the method described in "Using JVMPI to Identify Large Memory Allocations", and were able to quickly identify the code that was doing this allocation.

Case 3: Unbounded growth

As the final case study for this article, we discuss a scenario that initially looked like a simple sizing problem. An attempt was being made to scale the application to 1000 users, and the application would run out of Java heap. After calculating the heap requirements based on the number of users, the heap size was increased from 1 GB to 1.5 GB.

But this triggered out-of-memory (OOM) errors that did not originate in the Java heap. The Java heap would show enough free space, but the application logs would show that an OOM had occurred. Using svmon, it was seen that only around 4 segments, or 1 GB, were being used for the native heap, and the fourth segment seemed to be almost empty (see "Balancing Memory" in "Getting more memory in AIX for your Java applications").

To dig further, the command-line switch -verbose:jni was added. The extra messages being printed as a result of this switch revealed that the global JNI reference pool was getting exhausted, which is a very rare thing to happen. The global JNI reference pool is large enough to ensure that most normal applications never run even close to exhausting it.

For some time we tried to work around the problem by increasing the number of JNI references (if you specify a higher -Xoss value, it increases the limit in a proportional manner). But this only delayed the inevitable, and the severe Java heap fragmentation caused by the large number of pinned JNI references did not help either.

A closer look at the application design revealed the true cause: an unbounded number of threads being created by the application. As the tests proceeded, the threads would wait for finalizers, and since finalizers are not predictable, there would be a large number of these threads waiting to release their JNI references. The only feasible solution in this case was to change the application code to correct these two problems. Once the application replaced the unbounded threads with a thread pool, and replaced finalizers wherever possible, the sizing work completed with flying colors.
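The two code changes described above can be sketched as follows. This is a minimal illustration rather than the customer's actual fix: the class names are hypothetical, NativeHandle merely stands in for an object that holds JNI references, and the sketch uses java.util.concurrent and try-with-resources, which come from later Java releases than the one discussed in this series.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: bounded worker threads with deterministic resource release. */
final class BoundedWorkers {

    private BoundedWorkers() {}

    /** Stand-in for an object holding a native resource; released in close(), not in a finalizer. */
    static final class NativeHandle implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        NativeHandle() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Runs the requests on at most poolSize threads and returns how many completed. */
    static int runRequests(int requests, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                // The handle is released as soon as the request finishes.
                try (NativeHandle handle = new NativeHandle()) {
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return completed.get();
    }
}
```

However many requests arrive, the number of threads never exceeds the pool size, and no handle waits on the finalizer thread before it is released.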

This case shows how you can sometimes end up trying to hit a moving target. An issue reported as Java heap exhaustion eventually turned out to be a design problem. Balancing the Java and Native heaps is usually a critical part of performance tuning, but in this case it was not sufficient. Having so many tools and techniques at your disposal gives you a much broader picture, allowing you to make informed decisions about what to tune.



Conclusion

This article concludes the series. We hope you find this series a valuable guide in maximizing the performance of your Java applications on AIX.

The authors thank Ashok Ambati, Rajesh Jeyapaul, Sharad Ballal, Roger Leuckie and Mark Bluemel for their input and advice on these articles. A special note of thanks goes to John Tesch, whose collection of information on AIX Java performance was a major source of inspiration to the series.



Resources

Java on AIX
  • The IBM developer kits for AIX, Java technology edition, page at http://www-106.ibm.com/developerworks/java/jdk/aix/service.html contains links to all available Java releases on AIX. The SDK Guide for each version contains information relevant to the release and should be reviewed before you use that particular version. Another useful link is the "Fix Info" link that, among other things, gives you a list of APARs you will need to apply to update all the filesets to the latest level.

  • The IBM developer kits - diagnosis documentation, at http://www-106.ibm.com/developerworks/java/jdk/diagnosis/, is a comprehensive document that provides thorough coverage of the diagnostic capabilities of IBM Java. Best of all, it is platform-independent, so you can use it for all platforms that IBM Java runs on. A recent addition to this page is "IBM Garbage Collection and Storage Allocation Techniques", a detailed look at how IBM Java implements Garbage Collection. It is a must-read for anyone interested in finding out more about how IBM Java works.

  • Implementing Java on AIX (developerWorks, March 2004), at http://www-106.ibm.com/developerworks/eserver/library/es-JavaOnAix_install.html, contains information you need to start using Java on AIX.

  • Since GC tuning is usually the most important part of performance tweaking, see Fine-tuning Java garbage collection performance (developerWorks, January 2003), at http://www-106.ibm.com/developerworks/library/i-gctroub/, for a discussion of tweaking IBM Java heap characteristics. Note that this article is not AIX-specific. Parts 2 and 3 of the current series also look at some tips useful for GC tuning.

  • Getting more memory in AIX for your Java applications (developerWorks, September 2003), at http://www-106.ibm.com/developerworks/eserver/articles/aix4java1.html, provides information for heaps larger than 1 GB.

AIX
  • AIX 5L Performance Tools Handbook at http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/SG246039.html, covers each of the performance tools described in the current series (plus many more that are not). You will find practical advice on how to use a particular tool, examples of tool usage, and hints on how to interpret and use the information obtained using these tools.

  • Understanding IBM eServer pSeries Performance and Sizing at http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/SG244810.html, gives an overview of how IBM AIX-based systems are designed. If you are planning to do a benchmark exercise, this book is invaluable.

  • A series of three articles covers the changes to performance tools in AIX 5.2: Part 1 (developerWorks, July 2003), Part 2 (developerWorks, November 2003), and Part 3 (developerWorks, January 2004). AIX 5.2 did a major overhaul of some of the familiar performance tools, and this series covers the changes very well.

  • The AIX Performance PMR Data Collection Tools, described in Part 1, can be downloaded from ftp://ftp.software.ibm.com/aix/tools/perftools/perfpmr/

  • Checklist for isolating Java performance problems on AIX servers at http://www-106.ibm.com/developerworks/java/library/j-perf-checklist/


About the authors

Amit Mathur works in the IBM Solutions Development group, working primarily with IBM ISVs on the enablement and performance of their applications on IBM eServer platforms, and helping ISVs and customers become self-sufficient through education and articles on developerWorks. Amit has more than fourteen years' experience leading software support and development in C/C++, Java, and databases on UNIX and Linux platforms. He holds a Bachelor of Engineering degree in Electronics and Telecommunication from India. You can reach Amit at amitmat@us.ibm.com.


Sumit Chawla leads the Java Enablement initiative for IBM eServer (for AIX, Windows, and Linux platforms), assisting Independent Software Vendors for IBM Servers. Sumit has a Master of Science degree in Computer Science, with almost 10 years of experience in the IT industry, and is certified by IBM as an Application Architect. He is a frequent contributor to the developerWorks eServer zone. You can contact him at sumitc@us.ibm.com.

posted on 2005-12-29 14:04 by jacky, category: java
