A middleware performance characterization of Linux using IBM WebSphere Application Server
Linux ** has come a long way since it was first introduced in 1991. It is undoubtedly one of the most successful open-source programs in the market today. Two of its strong points are its cost-effectiveness and its availability on many hardware platforms, including powerful workstations and mainframes. At least four factors brought Linux to its current status. First, its open nature encouraged talented people from all over the world to collaborate in its development and maintenance with the goal of continually improving the software. Second, the emergence of vendors and distributors such as Red Hat, Caldera, Debian **, SUSE, and others provided the software support customers look for in a product. Third, Linux gained the confidence of big software companies such as IBM, Hewlett-Packard, and Sun Microsystems, which in turn promoted the operating system and encouraged its adoption. Finally, as more and more customers invested in Linux, software development companies and independent software vendors became more engaged in porting their products to this platform.
Linux now faces even greater challenges, as businesses begin to look at their Linux machines as the next generation of enterprise servers. This creates high expectations for Linux to perform well, especially in the context of Internet-based services such as Web servers and application servers. Most companies today deploy networks of symmetric multiprocessor (SMP) systems for their IT infrastructure. Linux was originally written for single-processor systems; modern Linux kernels support SMP systems, but have traditionally had scalability problems with them.
LINUX AND IBM MIDDLEWARE
IBM has a strong presence in the middleware market. For example, it holds the biggest market share for application servers today. As a Linux advocate, IBM views its middleware performance on Linux throughout all of its eServer* platforms as critical to the success of its Linux strategy. For this reason, a special work group within IBM was formed to specifically characterize the performance of Linux using IBM WebSphere * Application Server Version 5. The mission of the work group was to investigate the special characteristics of WebSphere Application Server and how the new enhancements to the Linux kernel could help improve its performance and scalability on SMP systems. Throughput and response times are the key metrics for performance.
WebSphere Application Server is a Java ** 2 Enterprise Edition (J2EE **) server, and IBM's primary platform for e-business. Many other IBM products run on it, such as IBM WebSphere Commerce Suite, IBM WebSphere Portal Server, and IBM Content Manager. Because it is implemented almost entirely in Java, it runs on many different hardware and operating-system platforms, including various Linux distributions.
This paper is not a formal performance report. Instead, our focus is on describing the effects of applying some of the major enhancements in the Linux kernel to the overall performance and scalability of WebSphere Application Server Version 5. We describe relative improvements from a known baseline in terms of percentages. When a negative percentage is obtained, we describe how we resolved the problem to gain a positive improvement. Although we performed the tests on several platforms, our discussions are based on the 32-bit Intel Architecture ** (IA32 **) unless mentioned otherwise.
In the following sections, we first introduce the work group that performed the study and provide information about the benchmark applications used. After that, we discuss the Linux enhancements that we applied. We also cover some issues and improvements that are not Linux-specific but helped boost performance, and which required collaboration with other IBM product teams. To get a sense of how Linux was performing relative to other operating systems, we ran the same tests on exactly the same machine with a different operating system. In the case of IA32, we used Microsoft Windows ** 2000 and 2003 servers.
This work represents the view of the authors and does not necessarily represent the view of IBM.
PERFORMANCE EVALUATION WORK GROUP
Our work group was composed of IBM teams from various organizations. Its primary goals were to make sure that we uncovered issues with Linux on WebSphere Application Server and to understand how we could achieve the best performance by applying enhancements and fixes to the Linux kernel and by fine tuning parameters ("knobs") in the software stack. This necessitated collaborating with various groups, especially with the Linux open-source community, when problems and issues were discovered. We performed our benchmarking on the IA32, PowerPC *, and S/390 * platforms.
We believe that the best test for Linux performance is an end-to-end "macro benchmarking" using applications that run on middleware, such as an application server. This is because the real performance landscape is only seen by customers when their applications are actually running on top of the underlying infrastructure.
The members of this work group included the IBM teams from WebSphere Application Server Performance, the Linux Technology Center (LTC), the Java Technology Center (JTC), DB2* Performance, and the various performance teams from the pSeries *, iSeries *, and zSeries * platforms. (1) LTC was our main liaison to the Linux open-source community. The work group remains active to this date and is continuing its studies on newer versions of WebSphere Application Server and Linux kernels.
THE BENCHMARK APPLICATIONS
We chose two benchmark applications to test WebSphere Application Server Version 5 on different kernel levels of Linux. Each application stresses different parts of the application server.
The Trade benchmark application was written by the WebSphere Application Server performance team for its own performance work. The application models an electronic stock brokerage firm that provides Web-based online securities trading. There are several versions of Trade, depending on the J2EE version supported by the application server. In our study, we used Trade Version 2.7 and Version 3, which are based on J2EE 1.2 and J2EE 1.3, respectively. For more information about J2EE, see Reference 2. More information about Trade is available at Reference 3.
Figure 1 is a simple diagram of the three-tier Trade configuration adopted by the work group. The client machine sends HTTP (Hypertext Transfer Protocol) requests directly to WebSphere Application Server through port 9080. The minimum requirement for the benchmark is to use Trade's EJB ** (Enterprise JavaBeans **) runtime mode, where all access to the database uses the EJB technology, thereby exercising the container-managed persistence component of WebSphere Application Server more heavily. The order-processing mode is set to synchronous, which means that all buy and sell orders are completed immediately when the request is issued, removing the need for queuing messages. The access mode we used is standard, in which all communications between servers and EJBs are performed using the Java Remote Method Invocation (RMI) protocol. The scenario workload mix, which provides an equal distribution of Trade operations such as login, register, quotes, and buy, is also standard. For the Web interface, simple JSPs ** (JavaServer Pages **) are used.
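Trade's synchronous order-processing mode can be illustrated with a minimal sketch: the buy or sell order is settled before the call returns, so no message queuing is involved. The class and method names below are our own illustration, not Trade's actual EJB interfaces.

```java
/**
 * Hypothetical sketch of synchronous order processing: the order is
 * completed in-line with the request, before the call returns.
 * Names are illustrative only, not Trade's real EJB interfaces.
 */
public class SyncOrderSketch {

    public static final String OPEN = "open";
    public static final String COMPLETED = "completed";

    public static class Order {
        public final String symbol;
        public final int quantity;
        public String status = OPEN;

        Order(String symbol, int quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    /** Synchronous mode: the order is processed before the method returns. */
    public static Order buySync(String symbol, int quantity) {
        Order order = new Order(symbol, quantity);
        completeOrder(order); // settled immediately; no message queue needed
        return order;
    }

    /** Stand-in for the database/EJB work that settles the order. */
    private static void completeOrder(Order order) {
        order.status = COMPLETED;
    }

    public static void main(String[] args) {
        Order o = buySync("s:100", 25);
        System.out.println(o.symbol + " -> " + o.status);
    }
}
```

In an asynchronous mode, by contrast, `buySync` would instead enqueue the order and return while it was still open; the synchronous setting used in the benchmark avoids that queuing path entirely.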
The Trade database on the DB2 server was populated initially with 5000 users and 1000 quotes. As the benchmark executed, the database was modified by updating and inserting records. In order to maintain consistency between runs, we kept a master copy of the original populated database and restored it for every new run. For every test run, the application server was restarted. The database was restored and the desired Trade configurations were reset. A warm-up run consisted of the following workload (expressed in terms of number of concurrent users and total number of requests submitted to the system) and executed in this sequence: one user, 1000 total requests; two users, 1000 total requests; five users, 1000 total requests; ten users, 1000 total requests; 25 users, 5000 total requests; 50 users, 5000 total requests; and 100 users, 5000 total requests. In a real environment, users spend some amount of "pause" time after requesting a page; for example, reading the contents or making decisions. In performance terminology, this is called "think time." In our experiments, there is no think time, which means that after a requested page is received, the next request is immediately sent. This is also equivalent to simulating more users than the actual number of users in the system. In effect, the application server is stressed much more than in a real environment, given the same number of users.
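The warm-up sequence above, and the closed-loop (zero think time) behavior of the driver, can be sketched as follows. The class name and structure are our own illustration, not the actual load driver used in the study.

```java
/**
 * Sketch of the warm-up workload schedule. Each stage raises the number
 * of concurrent simulated users; with zero think time, a user issues its
 * next request immediately after receiving the previous response.
 */
public class WarmupSchedule {

    /** Warm-up stages as {concurrentUsers, totalRequests} pairs, in execution order. */
    public static final int[][] STAGES = {
        {1, 1000}, {2, 1000}, {5, 1000}, {10, 1000},
        {25, 5000}, {50, 5000}, {100, 5000}
    };

    /** Total requests issued across the whole warm-up sequence. */
    public static int totalRequests() {
        int sum = 0;
        for (int[] stage : STAGES) {
            sum += stage[1];
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int[] stage : STAGES) {
            // Zero think time: no sleep is inserted between a response
            // and the next request, unlike a real user's "pause" time.
            System.out.printf("%d users, %d total requests%n", stage[0], stage[1]);
        }
        System.out.println("warm-up total = " + totalRequests());
    }
}
```

Because no think time is simulated, each concurrent user in this schedule generates far more load than a real user would, which is why the same user count stresses the server more than a production environment does.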