-<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">\r
-<HTML>\r
- <HEAD>\r
- <TITLE>\r
- Understanding FastCGI Application Performance\r
- </TITLE>\r
-<STYLE TYPE="text/css">\r
- body {\r
- background-color: #FFFFFF;\r
- color: #000000;\r
- }\r
- :link { color: #cc0000 }\r
- :visited { color: #555555 }\r
- :active { color: #000011 }\r
- div.c3 {margin-left: 2em}\r
- h5.c2 {text-align: center}\r
- div.c1 {text-align: center}\r
-</STYLE>\r
- </HEAD>\r
- <BODY>\r
- <DIV CLASS="c1">\r
- <A HREF="http://fastcgi.com"><IMG BORDER="0" SRC="../images/fcgi-hd.gif" ALT="[[FastCGI]]"></A>\r
- </DIV>\r
- <BR CLEAR="all">\r
- <DIV CLASS="c1">\r
- <H3>\r
- Understanding FastCGI Application Performance\r
- </H3>\r
- </DIV>\r
- <!--Copyright (c) 1996 Open Market, Inc. -->\r
- <!--See the file "LICENSE.TERMS" for information on usage and redistribution-->\r
- <!--of this file, and for a DISCLAIMER OF ALL WARRANTIES. -->\r
- <DIV CLASS="c1">\r
- Mark R. Brown<BR>\r
- Open Market, Inc.<BR>\r
- <P>\r
- 10 June 1996<BR>\r
- </P>\r
- </DIV>\r
- <P>\r
- </P>\r
- <H5 CLASS="c2">\r
- Copyright © 1996 Open Market, Inc. 245 First Street, Cambridge, MA 02142 U.S.A.<BR>\r
- Tel: 617-621-9500 Fax: 617-621-1703 URL: <A HREF=\r
- "http://www.openmarket.com/">http://www.openmarket.com/</A><BR>\r
- $Id: fcgi-perf.htm,v 1.3 2001/11/27 01:03:47 robs Exp $<BR>\r
- </H5>\r
- <HR>\r
- <UL TYPE="square">\r
- <LI>\r
- <A HREF="#S1">1. Introduction</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S2">2. Performance Basics</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S3">3. Caching</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S4">4. Database Access</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S5">5. A Performance Test</A> \r
- <UL TYPE="square">\r
- <LI>\r
- <A HREF="#S5.1">5.1 Application Scenario</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S5.2">5.2 Application Design</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S5.3">5.3 Test Conditions</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S5.4">5.4 Test Results and Discussion</A>\r
- </LI>\r
- </UL>\r
- </LI>\r
- <LI>\r
- <A HREF="#S6">6. Multi-threaded APIs</A>\r
- </LI>\r
- <LI>\r
- <A HREF="#S7">7. Conclusion</A>\r
- </LI>\r
- </UL>\r
- <P>\r
- </P>\r
- <HR>\r
- <H3>\r
- <A NAME="S1">1. Introduction</A>\r
- </H3>\r
- <P>\r
- Just how fast is FastCGI? How does the performance of a FastCGI application compare with the performance of\r
- the same application implemented using a Web server API?\r
- </P>\r
- <P>\r
- Of course, the answer is that it depends upon the application. A more complete answer is that FastCGI often\r
- wins by a significant margin, and seldom loses by very much.\r
- </P>\r
- <P>\r
- Papers on computer system performance can be laden with complex graphs showing how this varies with that.\r
- Seldom do the graphs shed much light on <I>why</I> one system is faster than another. Advertising copy is\r
- often even less informative. An ad from one large Web server vendor says that its server "executes web\r
- applications up to five times faster than all other servers," but the ad gives little clue where the\r
- number "five" came from.\r
- </P>\r
- <P>\r
- This paper is meant to convey an understanding of the primary factors that influence the performance of Web\r
- server applications and to show that architectural differences between FastCGI and server APIs often give an\r
- "unfair" performance advantage to FastCGI applications. We run a test that shows a FastCGI\r
- application running three times faster than the corresponding Web server API application. Under different\r
- conditions this factor might be larger or smaller. We show you what you'd need to measure to figure that\r
- out for the situation you face, rather than just saying "we're three times faster" and moving\r
- on.\r
- </P>\r
- <P>\r
- This paper makes no attempt to prove that FastCGI is better than Web server APIs for every application. Web\r
- server APIs enable lightweight protocol extensions, such as Open Market's SecureLink extension, to be\r
- added to Web servers, as well as allowing other forms of server customization. But APIs are not well matched\r
- to mainstream applications such as personalized content or access to corporate databases, because of API\r
- drawbacks including high complexity, low security, and limited scalability. FastCGI shines when used for the\r
- vast majority of Web applications.\r
- </P>\r
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S2">2. Performance Basics</A>\r
- </H3>\r
- <P>\r
- Since this paper is about performance we need to be clear on what "performance" is.\r
- </P>\r
- <P>\r
- The standard way to measure performance in a request-response system like the Web is to measure peak request\r
- throughput subject to a response time constraint. For instance, a Web server application might be capable of
- performing 20 requests per second while responding to 90% of the requests in less than 2 seconds.\r
- </P>\r
- <P>\r
- Response time is a thorny thing to measure on the Web because client communications links to the Internet have\r
- widely varying bandwidth. If the client is slow to read the server's response, response time at both the\r
- client and the server will go up, and there's nothing the server can do about it. For the purposes of\r
- making repeatable measurements the client should have a high-bandwidth communications link to the server.\r
- </P>\r
- <P>\r
- [Footnote: When designing a Web server application that will be accessed over slow (e.g. 14.4 or even 28.8\r
- kilobit/second modem) channels, pay attention to the simultaneous connections bottleneck. Some servers are\r
- limited by design to only 100 or 200 simultaneous connections. If your application sends 50 kilobytes of data\r
- to a typical client that can read 2 kilobytes per second, then a request takes 25 seconds to complete. If your\r
- server is limited to 100 simultaneous connections, throughput is limited to just 4 requests per second.]\r
- </P>\r
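      <P>
        The footnote's arithmetic can be checked directly. This small calculation (an illustration added here,
        not part of the original measurements) uses only the footnote's own numbers:
      </P>

```python
# Simultaneous-connections bottleneck from the footnote above: a slow
# client holds a connection open for the whole response, so the
# connection limit caps throughput.
response_bytes = 50_000        # 50 kilobytes sent to the client
client_bytes_per_sec = 2_000   # what a slow modem client can read
max_connections = 100          # server's simultaneous-connection limit

seconds_per_request = response_bytes / client_bytes_per_sec
requests_per_second = max_connections / seconds_per_request

print(seconds_per_request)     # 25.0 seconds per request
print(requests_per_second)     # 4.0 requests per second
```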
- <P>\r
- Response time is seldom an issue when load is light, but response times rise quickly as the system approaches\r
- a bottleneck on some limited resource. The three resources that typical systems run out of are network I/O,\r
- disk I/O, and processor time. If short response time is a goal, it is a good idea to stay at or below 50% load\r
- on each of these resources. For instance, if your disk subsystem is capable of delivering 200 I/Os per second,\r
- then try to run your application at 100 I/Os per second to avoid having the disk subsystem contribute to slow\r
- response times. Through careful management it is possible to succeed in running closer to the edge, but\r
- careful management is both difficult and expensive, so few systems achieve it.
- </P>\r
- <P>\r
- If a Web server application is local to the Web server machine, then its internal design has no impact on\r
- network I/O. Application design can have a big impact on usage of disk I/O and processor time.\r
- </P>\r
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S3">3. Caching</A>\r
- </H3>\r
- <P>\r
- It is a rare Web server application that doesn't run fast when all the information it needs is available\r
- in its memory. And if the application doesn't run fast under those conditions, the possible solutions are\r
- evident: Tune the processor-hungry parts of the application, install a faster processor, or change the\r
- application's functional specification so it doesn't need to do so much work.\r
- </P>\r
- <P>\r
- The way to make information available in memory is by caching. A cache is an in-memory data structure that\r
- contains information that's been read from its permanent home on disk. When the application needs\r
- information, it consults the cache, and uses the information if it is there. Otherwise it reads the
- information from disk and places a copy in the cache. If the cache is full, the application discards some old\r
- information before adding the new. When the application needs to change cached information, it changes both\r
- the cache entry and the information on disk. That way, if the application crashes, no information is lost; the\r
- application just runs more slowly for a while after restarting, because the cache doesn't improve
- performance when it is empty.\r
- </P>\r
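      <P>
        The cache policy just described can be sketched in a few lines. This is an illustrative Python sketch
        (the paper prescribes no particular implementation); a dictionary stands in for the on-disk store:
      </P>

```python
from collections import OrderedDict

class WriteThroughCache:
    """Write-through cache as described above: reads fill the cache,
    writes update both the cache entry and the disk copy, and the
    oldest entry is discarded when the cache is full."""

    def __init__(self, backing, capacity=128):
        self.backing = backing        # stands in for the on-disk store
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered oldest-first

    def read(self, key):
        if key in self.entries:       # cache hit
            self.entries.move_to_end(key)
            return self.entries[key]
        value = self.backing[key]     # cache miss: read from "disk"
        self._add(key, value)
        return value

    def write(self, key, value):
        self.backing[key] = value     # write through to "disk" first...
        self._add(key, value)         # ...then update the cache entry

    def _add(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # discard oldest entry
```

      <P>
        Because every write goes to disk before the cache is updated, a crash loses nothing; the restarted
        process merely starts with an empty cache, exactly as the paragraph above says.
      </P>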
- <P>\r
- Caching can reduce both disk I/O and processor time, because reading information from disk uses more processor\r
- time than reading it from the cache. Because caching addresses both of the potential bottlenecks, it is the\r
- focal point of high-performance Web server application design. CGI applications couldn't perform in-memory\r
- caching, because they exited after processing just one request. Web server APIs promised to solve this\r
- problem. But how effective is the solution?\r
- </P>\r
- <P>\r
- Today's most widely deployed Web server APIs are based on a pool-of-processes server model. The Web server\r
- consists of a parent process and a pool of child processes. Processes do not share memory. An incoming request\r
- is assigned to an idle child at random. The child runs the request to completion before accepting a new\r
- request. A typical server has 32 child processes; a large server has 100 or 200.
- </P>\r
- <P>\r
- In-memory caching works very poorly in this server model because processes do not share memory and incoming\r
- requests are assigned to processes at random. For instance, to keep a frequently-used file available in memory\r
- the server must keep a file copy per child, which wastes memory. When the file is modified all the children\r
- need to be notified, which is complex (the APIs don't provide a way to do it).\r
- </P>\r
- <P>\r
- FastCGI is designed to allow effective in-memory caching. Requests are routed from any child process to a\r
- FastCGI application server. The FastCGI application process maintains an in-memory cache.\r
- </P>\r
- <P>\r
- In some cases a single FastCGI application server won't provide enough performance. FastCGI provides two\r
- solutions: session affinity and multi-threading.\r
- </P>\r
- <P>\r
- With session affinity you run a pool of application processes and the Web server routes requests to individual\r
- processes based on any information contained in the request. For instance, the server can route according to\r
- the area of content that's been requested, or according to the user. The user might be identified by an\r
- application-specific session identifier, by the user ID contained in an Open Market Secure Link ticket, by the\r
- Basic Authentication user name, or whatever. Each process maintains its own cache, and session affinity\r
- ensures that each incoming request has access to the cache that will speed up processing the most.\r
- </P>\r
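      <P>
        Routing by user, as described above, can be as simple as hashing the user identifier onto the pool.
        This sketch is illustrative Python; the hash scheme is an assumption for the example, not something
        FastCGI specifies:
      </P>

```python
import hashlib

def route_request(user_id: str, n_processes: int) -> int:
    """Session affinity keyed on the user: map a request's user id to
    one process in the pool, so repeated requests from the same user
    always reach the same process (and therefore the same cache)."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_processes
```

      <P>
        Because the same user always maps to the same process, that process's cache accumulates exactly the
        entries the user's later requests need.
      </P>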
- <P>\r
- With multi-threading you run an application process that is designed to handle several requests at the same\r
- time. The threads handling concurrent requests share process memory, so they all have access to the same\r
- cache. Multi-threaded programming is complex -- concurrency makes programs difficult to test and debug -- but\r
- with FastCGI you can write single threaded <I>or</I> multithreaded applications.\r
- </P>\r
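      <P>
        The shared-cache point can be seen in a tiny illustration (Python threads standing in for the
        application's worker threads; the cache contents are made up for the example):
      </P>

```python
import threading

# Threads in one process share memory, so a single cache (guarded by a
# lock) serves every concurrent request.
shared_cache = {}
cache_lock = threading.Lock()

def handle(user_id):
    # Each worker thread consults and fills the same cache.
    with cache_lock:
        if user_id not in shared_cache:
            shared_cache[user_id] = f"profile-{user_id}"  # simulated disk read
        return shared_cache[user_id]

threads = [threading.Thread(target=handle, args=(u,))
           for u in ("alice", "bob", "alice")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

      <P>
        The second request for "alice" is served from the cache filled by the first, something the
        pool-of-processes model cannot do.
      </P>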
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S4">4. Database Access</A>\r
- </H3>\r
- <P>\r
- Many Web server applications perform database access. Existing databases contain a lot of valuable\r
- information; Web server applications allow companies to give wider access to the information.\r
- </P>\r
- <P>\r
- Access to database management systems, even within a single machine, is via connection-oriented protocols. An\r
- application "logs in" to a database, creating a connection, then performs one or more accesses.\r
- Frequently, the cost of creating the database connection is several times the cost of accessing data over an\r
- established connection.\r
- </P>\r
- <P>\r
- To a first approximation database connections are just another type of state to be cached in memory by an\r
- application, so the discussion of caching above applies to caching database connections.\r
- </P>\r
- <P>\r
- But database connections are special in one respect: They are often the basis for database licensing. You pay\r
- the database vendor according to the number of concurrent connections the database system can sustain. A\r
- 100-connection license costs much more than a 5-connection license. It follows that caching a database\r
- connection per Web server child process is not just wasteful of the system's hardware resources, it could
- break your software budget.\r
- </P>\r
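      <P>
        A toy cost model makes the comparison concrete. The relative costs here are made up for illustration
        (the paper gives no numbers); only the ratio matters:
      </P>

```python
class Connection:
    """Stand-in for a database connection: creating one costs several
    times as much as an access over an established connection."""
    OPEN_COST = 10    # made-up relative cost of logging in
    ACCESS_COST = 1   # made-up relative cost of one access

    def __init__(self, ledger):
        self.ledger = ledger
        ledger["cost"] += self.OPEN_COST

    def query(self, q):
        self.ledger["cost"] += self.ACCESS_COST
        return ("row", q)

def open_per_request(ledger, n_requests):
    # API-style: log in and disconnect on every request.
    for q in range(n_requests):
        Connection(ledger).query(q)

def cached_connection(ledger, n_requests):
    # FastCGI-style: log in once at process startup, then reuse.
    conn = Connection(ledger)
    for q in range(n_requests):
        conn.query(q)
```

      <P>
        With these costs, five requests cost 55 units when a connection is opened per request but only 15 when
        one connection is cached, and the cached design holds a single license slot instead of churning one per
        request.
      </P>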
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S5">5. A Performance Test</A>\r
- </H3>\r
- <P>\r
- We designed a test application to illustrate performance issues. The application represents a class of\r
- applications that deliver personalized content. The test application is quite a bit simpler than any real\r
- application would be, but still illustrates the main performance issues. We implemented the application using\r
- both FastCGI and a current Web server API, and measured the performance of each.\r
- </P>\r
- <P>\r
- </P>\r
- <H4>\r
- <A NAME="S5.1">5.1 Application Scenario</A>\r
- </H4>\r
- <P>\r
- The application is based on a user database and a set of content files. When a user requests a content file,\r
- the application performs substitutions in the file using information from the user database. The application\r
- then returns the modified content to the user.\r
- </P>\r
- <P>\r
- Each request accomplishes the following:\r
- </P>\r
- <P>\r
- </P>\r
- <OL>\r
- <LI>\r
- authentication check: The user id is used to retrieve and check the password.\r
- <P>\r
- </P>\r
- </LI>\r
- <LI>\r
- attribute retrieval: The user id is used to retrieve all of the user's attribute values.\r
- <P>\r
- </P>\r
- </LI>\r
- <LI>\r
- file retrieval and filtering: The request identifies a content file. This file is read and all occurrences\r
- of variable names are replaced with the user's corresponding attribute values. The modified HTML is\r
- returned to the user.<BR>\r
- <BR>\r
- </LI>\r
- </OL>\r
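      <P>
        The three steps above can be sketched as follows. This is illustrative Python; the data stores and the
        $name substitution syntax are assumptions made for the sketch, not details taken from the test
        application:
      </P>

```python
import re

def handle_request(users, files, user_id, password, filename):
    """One request: authentication check, attribute retrieval, then
    file retrieval and filtering, as in the numbered steps above."""
    record = users[user_id]
    if record["password"] != password:           # 1. authentication check
        return "403 Forbidden"
    attrs = record["attributes"]                 # 2. attribute retrieval
    content = files[filename]                    # 3. file retrieval...
    # ...and filtering: replace each $variable with the user's value,
    # leaving unknown variables untouched
    return re.sub(r"\$(\w+)",
                  lambda m: attrs.get(m.group(1), m.group(0)),
                  content)
```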
- <P>\r
- Of course, it is fair game to perform caching to shortcut any of these steps.\r
- </P>\r
- <P>\r
- Each user's database record (including password and attribute values) is approximately 100 bytes long.\r
- Each content file is 3,000 bytes long. Both database and content files are stored on disks attached to the\r
- server platform.\r
- </P>\r
- <P>\r
- A typical user makes 10 file accesses with realistic think times (30-60 seconds) between accesses, then\r
- disappears for a long time.\r
- </P>\r
- <P>\r
- </P>\r
- <H4>\r
- <A NAME="S5.2">5.2 Application Design</A>\r
- </H4>\r
- <P>\r
- The FastCGI application maintains a cache of recently-accessed attribute values from the database. When the\r
- cache misses the application reads from the database. Because only a small number of FastCGI application\r
- processes are needed, each process opens a database connection on startup and keeps it open.\r
- </P>\r
- <P>\r
- The FastCGI application is configured as multiple application processes. This is desirable in order to get\r
- concurrent application processing during database reads and file reads. Requests are routed to these\r
- application processes using FastCGI session affinity keyed on the user id. This way, all of a user's
- requests after the first one hit in the application's cache.
- </P>\r
- <P>\r
- The API application does not maintain a cache; the API application has no way to share the cache among its\r
- processes, so the cache hit rate would be too low to make caching pay. The API application opens and closes a\r
- database connection on every request; keeping database connections open between requests would result in an\r
- unrealistically large number of database connections open at the same time, and very low utilization of each\r
- connection.\r
- </P>\r
- <P>\r
- </P>\r
- <H4>\r
- <A NAME="S5.3">5.3 Test Conditions</A>\r
- </H4>\r
- <P>\r
- The test load is generated by 10 HTTP client processes. The processes represent disjoint sets of users. A\r
- process makes a request for a user, then a request for a different user, and so on until it is time for the\r
- first user to make another request.\r
- </P>\r
- <P>\r
- For simplicity the 10 client processes run on the same machine as the Web server. This avoids the possibility\r
- that a network bottleneck will obscure the test results. The database system also runs on this machine, as\r
- specified in the application scenario.\r
- </P>\r
- <P>\r
- Response time is not an issue under the test conditions. We just measure throughput.\r
- </P>\r
- <P>\r
- The API Web server in these tests is Netscape 1.1.
- </P>\r
- <P>\r
- </P>\r
- <H4>\r
- <A NAME="S5.4">5.4 Test Results and Discussion</A>\r
- </H4>\r
- <P>\r
- Here are the test results:\r
- </P>\r
- <P>\r
- </P>\r
- <DIV CLASS="c3">\r
-<PRE>\r
- FastCGI 12.0 msec per request = 83 requests per second\r
- API 36.6 msec per request = 27 requests per second\r
-</PRE>\r
- </DIV>\r
- <P>\r
- Given the big architectural advantage that the FastCGI application enjoys over the API application, it is not\r
- surprising that the FastCGI application runs a lot faster. To gain a deeper understanding of these results we\r
- measured two more conditions:\r
- </P>\r
- <P>\r
- </P>\r
- <UL>\r
- <LI>\r
- API with sustained database connections. If you could afford the extra licensing cost, how much faster\r
- would your API application run?\r
- <P>\r
- </P>\r
-<PRE>\r
- API 16.0 msec per request = 61 requests per second\r
-</PRE>\r
- Answer: Still not as fast as the FastCGI application.\r
- <P>\r
- </P>\r
- </LI>\r
- <LI>\r
- FastCGI with cache disabled. How much benefit does the FastCGI application get from its cache?\r
- <P>\r
- </P>\r
-<PRE>\r
- FastCGI 20.1 msec per request = 50 requests per second\r
-</PRE>\r
- Answer: A very substantial benefit, even though the database access is quite simple.<BR>\r
- <BR>\r
- </LI>\r
- </UL>\r
- <P>\r
- What these two extra experiments show is that if the API and FastCGI applications are implemented in exactly\r
- the same way -- caching database connections but not caching user profile data -- the API application is\r
- slightly faster. This is what you'd expect, since the FastCGI application has to pay the cost of\r
- inter-process communication not present in the API application.\r
- </P>\r
- <P>\r
- In the real world the two applications would not be implemented in the same way. FastCGI's architectural\r
- advantage results in much higher performance -- a factor of 3 in this test. With a remote database or more\r
- expensive database access the factor would be higher. With more substantial processing of the content files\r
- the factor would be smaller.\r
- </P>\r
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S6">6. Multi-threaded APIs</A>\r
- </H3>\r
- <P>\r
- Web servers with a multi-threaded internal structure (and APIs to match) are now starting to become more\r
- common. These servers don't have all of the disadvantages described in Section 3. Does this mean that\r
- FastCGI's performance advantages will disappear?\r
- </P>\r
- <P>\r
- A superficial analysis says yes. An API-based application in a single-process, multi-threaded server can\r
- maintain caches and database connections the same way a FastCGI application can. The API-based application\r
- does not pay for inter-process communication, so the API-based application will be slightly faster than the\r
- FastCGI application.\r
- </P>\r
- <P>\r
- A deeper analysis says no. Multi-threaded programming is complex, because concurrency makes programs much more\r
- difficult to test and debug. In the case of multi-threaded programming to Web server APIs, the normal problems\r
- with multi-threading are compounded by the lack of isolation between different applications and between the\r
- applications and the Web server. With FastCGI you can write programs in the familiar single-threaded style,\r
- get all the reliability and maintainability of process isolation, and still get very high performance. If you\r
- truly need multi-threading, you can write multi-threaded FastCGI and still isolate your multi-threaded\r
- application from other applications and from the server. In short, multi-threading makes Web server APIs\r
- unusable for practically all applications, reducing the choice to FastCGI versus CGI. The performance winner in
- that contest is obviously FastCGI.\r
- </P>\r
- <P>\r
- </P>\r
- <H3>\r
- <A NAME="S7">7. Conclusion</A>\r
- </H3>\r
- <P>\r
- Just how fast is FastCGI? The answer: very fast indeed. Not because it has some specially-greased path through\r
- the operating system, but because its design is well matched to the needs of most applications. We invite you\r
- to make FastCGI the fast, open foundation for your Web server applications.\r
- </P>\r
- <P>\r
- </P>\r
- <HR>\r
- <A HREF="http://www.openmarket.com/"><IMG SRC="omi-logo.gif" ALT="OMI Home Page"></A> \r
- <ADDRESS>\r
- © 1995, Open Market, Inc. / mbrown@openmarket.com\r
- </ADDRESS>\r
- </BODY>\r
-</HTML>\r
-\r
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
+<HTML>
+ <HEAD>
+ <TITLE>
+ Understanding FastCGI Application Performance
+ </TITLE>
+<STYLE TYPE="text/css">
+ body {
+ background-color: #FFFFFF;
+ color: #000000;
+ }
+ :link { color: #cc0000 }
+ :visited { color: #555555 }
+ :active { color: #000011 }
+ div.c3 {margin-left: 2em}
+ h5.c2 {text-align: center}
+ div.c1 {text-align: center}
+</STYLE>
+ </HEAD>
+ <BODY>
+ <DIV CLASS="c1">
+ <A HREF="http://fastcgi.com"><IMG BORDER="0" SRC="../images/fcgi-hd.gif" ALT="[[FastCGI]]"></A>
+ </DIV>
+ <BR CLEAR="all">
+ <DIV CLASS="c1">
+ <H3>
+ Understanding FastCGI Application Performance
+ </H3>
+ </DIV>
+ <!--Copyright (c) 1996 Open Market, Inc. -->
+ <!--See the file "LICENSE.TERMS" for information on usage and redistribution-->
+ <!--of this file, and for a DISCLAIMER OF ALL WARRANTIES. -->
+ <DIV CLASS="c1">
+ Mark R. Brown<BR>
+ Open Market, Inc.<BR>
+ <P>
+ 10 June 1996<BR>
+ </P>
+ </DIV>
+ <P>
+ </P>
+ <H5 CLASS="c2">
+ Copyright © 1996 Open Market, Inc. 245 First Street, Cambridge, MA 02142 U.S.A.<BR>
+ Tel: 617-621-9500 Fax: 617-621-1703 URL: <A HREF=
+ "http://www.openmarket.com/">http://www.openmarket.com/</A><BR>
+ $Id: fcgi-perf.htm,v 1.4 2002/02/25 00:42:59 robs Exp $<BR>
+ </H5>
+ <HR>
+ <UL TYPE="square">
+ <LI>
+ <A HREF="#S1">1. Introduction</A>
+ </LI>
+ <LI>
+ <A HREF="#S2">2. Performance Basics</A>
+ </LI>
+ <LI>
+ <A HREF="#S3">3. Caching</A>
+ </LI>
+ <LI>
+ <A HREF="#S4">4. Database Access</A>
+ </LI>
+ <LI>
+ <A HREF="#S5">5. A Performance Test</A>
+ <UL TYPE="square">
+ <LI>
+ <A HREF="#S5.1">5.1 Application Scenario</A>
+ </LI>
+ <LI>
+ <A HREF="#S5.2">5.2 Application Design</A>
+ </LI>
+ <LI>
+ <A HREF="#S5.3">5.3 Test Conditions</A>
+ </LI>
+ <LI>
+ <A HREF="#S5.4">5.4 Test Results and Discussion</A>
+ </LI>
+ </UL>
+ </LI>
+ <LI>
+ <A HREF="#S6">6. Multi-threaded APIs</A>
+ </LI>
+ <LI>
+ <A HREF="#S7">7. Conclusion</A>
+ </LI>
+ </UL>
+ <P>
+ </P>
+ <HR>
+ <H3>
+ <A NAME="S1">1. Introduction</A>
+ </H3>
+ <P>
+ Just how fast is FastCGI? How does the performance of a FastCGI application compare with the performance of
+ the same application implemented using a Web server API?
+ </P>
+ <P>
+ Of course, the answer is that it depends upon the application. A more complete answer is that FastCGI often
+ wins by a significant margin, and seldom loses by very much.
+ </P>
+ <P>
+ Papers on computer system performance can be laden with complex graphs showing how this varies with that.
+ Seldom do the graphs shed much light on <I>why</I> one system is faster than another. Advertising copy is
+ often even less informative. An ad from one large Web server vendor says that its server "executes web
+ applications up to five times faster than all other servers," but the ad gives little clue where the
+ number "five" came from.
+ </P>
+ <P>
+ This paper is meant to convey an understanding of the primary factors that influence the performance of Web
+ server applications and to show that architectural differences between FastCGI and server APIs often give an
+ "unfair" performance advantage to FastCGI applications. We run a test that shows a FastCGI
+ application running three times faster than the corresponding Web server API application. Under different
+ conditions this factor might be larger or smaller. We show you what you'd need to measure to figure that
+ out for the situation you face, rather than just saying "we're three times faster" and moving
+ on.
+ </P>
+ <P>
+ This paper makes no attempt to prove that FastCGI is better than Web server APIs for every application. Web
+ server APIs enable lightweight protocol extensions, such as Open Market's SecureLink extension, to be
+ added to Web servers, as well as allowing other forms of server customization. But APIs are not well matched
+ to mainstream applications such as personalized content or access to corporate databases, because of API
+ drawbacks including high complexity, low security, and limited scalability. FastCGI shines when used for the
+ vast majority of Web applications.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S2">2. Performance Basics</A>
+ </H3>
+ <P>
+ Since this paper is about performance we need to be clear on what "performance" is.
+ </P>
+ <P>
+ The standard way to measure performance in a request-response system like the Web is to measure peak request
+ throughput subject to a response time constriaint. For instance, a Web server application might be capable of
+ performing 20 requests per second while responding to 90% of the requests in less than 2 seconds.
+ </P>
+ <P>
+ Response time is a thorny thing to measure on the Web because client communications links to the Internet have
+ widely varying bandwidth. If the client is slow to read the server's response, response time at both the
+ client and the server will go up, and there's nothing the server can do about it. For the purposes of
+ making repeatable measurements the client should have a high-bandwidth communications link to the server.
+ </P>
+ <P>
+ [Footnote: When designing a Web server application that will be accessed over slow (e.g. 14.4 or even 28.8
+ kilobit/second modem) channels, pay attention to the simultaneous connections bottleneck. Some servers are
+ limited by design to only 100 or 200 simultaneous connections. If your application sends 50 kilobytes of data
+ to a typical client that can read 2 kilobytes per second, then a request takes 25 seconds to complete. If your
+ server is limited to 100 simultaneous connections, throughput is limited to just 4 requests per second.]
+ </P>
+ <P>
+ Response time is seldom an issue when load is light, but response times rise quickly as the system approaches
+ a bottleneck on some limited resource. The three resources that typical systems run out of are network I/O,
+ disk I/O, and processor time. If short response time is a goal, it is a good idea to stay at or below 50% load
+ on each of these resources. For instance, if your disk subsystem is capable of delivering 200 I/Os per second,
+ then try to run your application at 100 I/Os per second to avoid having the disk subsystem contribute to slow
+ response times. Through careful management it is possible to succeed in running closer to the edge, but
+ careful management is both difficult and expensive so few systems get it.
+ </P>
+ <P>
+ If a Web server application is local to the Web server machine, then its internal design has no impact on
+ network I/O. Application design can have a big impact on usage of disk I/O and processor time.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S3">3. Caching</A>
+ </H3>
+ <P>
+ It is a rare Web server application that doesn't run fast when all the information it needs is available
+ in its memory. And if the application doesn't run fast under those conditions, the possible solutions are
+ evident: Tune the processor-hungry parts of the application, install a faster processor, or change the
+ application's functional specification so it doesn't need to do so much work.
+ </P>
+ <P>
+ The way to make information available in memory is by caching. A cache is an in-memory data structure that
+ contains information that's been read from its permanent home on disk. When the application needs
+ information, it consults the cache, and uses the information if it is there. Otherwise is reads the
+ information from disk and places a copy in the cache. If the cache is full, the application discards some old
+ information before adding the new. When the application needs to change cached information, it changes both
+ the cache entry and the information on disk. That way, if the application crashes, no information is lost; the
+ application just runs more slowly for awhile after restarting, because the cache doesn't improve
+ performance when it is empty.
+ </P>
+ <P>
+ Caching can reduce both disk I/O and processor time, because reading information from disk uses more processor
+ time than reading it from the cache. Because caching addresses both of the potential bottlenecks, it is the
+ focal point of high-performance Web server application design. CGI applications couldn't perform in-memory
+ caching, because they exited after processing just one request. Web server APIs promised to solve this
+ problem. But how effective is the solution?
+ </P>
+ <P>
+ Today's most widely deployed Web server APIs are based on a pool-of-processes server model. The Web server
+ consists of a parent process and a pool of child processes. Processes do not share memory. An incoming request
+ is assigned to an idle child at random. The child runs the request to completion before accepting a new
+ request. A typical server has 32 child processes, a large server has 100 or 200.
+ </P>
+ <P>
+ In-memory caching works very poorly in this server model because processes do not share memory and incoming
+ requests are assigned to processes at random. For instance, to keep a frequently-used file available in memory
+ the server must keep a file copy per child, which wastes memory. When the file is modified all the children
+ need to be notified, which is complex (the APIs don't provide a way to do it).
+ </P>
+ <P>
+ FastCGI is designed to allow effective in-memory caching. Requests are routed from any child process to a
+ FastCGI application server. The FastCGI application process maintains an in-memory cache.
+ </P>
+ <P>
+ In some cases a single FastCGI application server won't provide enough performance. FastCGI provides two
+ solutions: session affinity and multi-threading.
+ </P>
+ <P>
+ With session affinity you run a pool of application processes and the Web server routes requests to individual
+ processes based on any information contained in the request. For instance, the server can route according to
+ the area of content that's been requested, or according to the user. The user might be identified by an
+ application-specific session identifier, by the user ID contained in an Open Market Secure Link ticket, by the
+ Basic Authentication user name, or whatever. Each process maintains its own cache, and session affinity
+ ensures that each incoming request has access to the cache that will speed up processing the most.
+ </P>
+ <P>
+ With multi-threading you run an application process that is designed to handle several requests at the same
+ time. The threads handling concurrent requests share process memory, so they all have access to the same
+ cache. Multi-threaded programming is complex -- concurrency makes programs difficult to test and debug -- but
+ with FastCGI you can write single threaded <I>or</I> multithreaded applications.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S4">4. Database Access</A>
+ </H3>
+ <P>
+ Many Web server applications perform database access. Existing databases contain a lot of valuable
+ information; Web server applications allow companies to give wider access to the information.
+ </P>
+ <P>
+ Access to database management systems, even within a single machine, is via connection-oriented protocols. An
+ application "logs in" to a database, creating a connection, then performs one or more accesses.
+ Frequently, the cost of creating the database connection is several times the cost of accessing data over an
+ established connection.
+ </P>
+ <P>
+ To a first approximation database connections are just another type of state to be cached in memory by an
+ application, so the discussion of caching above applies to caching database connections.
+ </P>
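+ <P>
+ The two styles of connection management can be sketched side by side. We use Python's
+ sqlite3 module as a stand-in for a real database server; the function names are ours:
+ </P>

```python
# Contrast per-request connections with a connection held open for the
# life of the process.  sqlite3 stands in for a real database server.
import sqlite3

# Per-request style: connect, query, disconnect -- paying the connection
# setup cost on every single request.
def query_per_request(sql):
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchone()
    finally:
        conn.close()

# Cached-connection style: one connection opened at process startup and
# reused by every request the process handles.
_conn = sqlite3.connect(":memory:")

def query_cached(sql):
    return _conn.execute(sql).fetchone()
```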
+ <P>
+ But database connections are special in one respect: They are often the basis for database licensing. You pay
+ the database vendor according to the number of concurrent connections the database system can sustain. A
+ 100-connection license costs much more than a 5-connection license. It follows that caching a database
+ connection per Web server child process is not just wasteful of the system's hardware resources; it could also
+ break your software budget.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S5">5. A Performance Test</A>
+ </H3>
+ <P>
+ We designed a test application to illustrate performance issues. The application represents a class of
+ applications that deliver personalized content. The test application is quite a bit simpler than any real
+ application would be, but still illustrates the main performance issues. We implemented the application using
+ both FastCGI and a current Web server API, and measured the performance of each.
+ </P>
+ <P>
+ </P>
+ <H4>
+ <A NAME="S5.1">5.1 Application Scenario</A>
+ </H4>
+ <P>
+ The application is based on a user database and a set of content files. When a user requests a content file,
+ the application performs substitutions in the file using information from the user database. The application
+ then returns the modified content to the user.
+ </P>
+ <P>
+ Each request accomplishes the following:
+ </P>
+ <P>
+ </P>
+ <OL>
+ <LI>
+ authentication check: The user id is used to retrieve and check the password.
+ <P>
+ </P>
+ </LI>
+ <LI>
+ attribute retrieval: The user id is used to retrieve all of the user's attribute values.
+ <P>
+ </P>
+ </LI>
+ <LI>
+ file retrieval and filtering: The request identifies a content file. This file is read and all occurrences
+ of variable names are replaced with the user's corresponding attribute values. The modified HTML is
+ returned to the user.<BR>
+ <BR>
+ </LI>
+ </OL>
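+ <P>
+ The three steps above can be sketched with in-memory stand-ins for the user database and
+ content files. All data, names, and the <CODE>$VAR</CODE> substitution syntax below are
+ illustrative choices of ours:
+ </P>

```python
# The three per-request steps: authentication check, attribute retrieval,
# and file retrieval/filtering.  All data and names are illustrative.
import string

USERS = {"u1": {"password": "secret", "NAME": "Alice", "CITY": "Boston"}}
FILES = {"welcome": "Hello $NAME, welcome back to $CITY."}

def handle(user_id, password, filename):
    record = USERS[user_id]
    # 1. authentication check
    if record["password"] != password:
        raise PermissionError("bad password")
    # 2. attribute retrieval (all attributes except the password)
    attrs = {k: v for k, v in record.items() if k != "password"}
    # 3. file retrieval and filtering: replace $VAR with attribute values
    return string.Template(FILES[filename]).substitute(attrs)
```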
+ <P>
+ Of course, it is fair game to perform caching to shortcut any of these steps.
+ </P>
+ <P>
+ Each user's database record (including password and attribute values) is approximately 100 bytes long.
+ Each content file is 3,000 bytes long. Both database and content files are stored on disks attached to the
+ server platform.
+ </P>
+ <P>
+ A typical user makes 10 file accesses with realistic think times (30-60 seconds) between accesses, then
+ disappears for a long time.
+ </P>
+ <P>
+ </P>
+ <H4>
+ <A NAME="S5.2">5.2 Application Design</A>
+ </H4>
+ <P>
+ The FastCGI application maintains a cache of recently-accessed attribute values from the database. When the
+ cache misses the application reads from the database. Because only a small number of FastCGI application
+ processes are needed, each process opens a database connection on startup and keeps it open.
+ </P>
+ <P>
+ The FastCGI application is configured as multiple application processes. This is desirable in order to get
+ concurrent application processing during database reads and file reads. Requests are routed to these
+ application processes using FastCGI session affinity keyed on the user id. This way, all of a user's
+ requests after the first one hit in the application's cache.
+ </P>
+ <P>
+ The API application does not maintain a cache; the API application has no way to share the cache among its
+ processes, so the cache hit rate would be too low to make caching pay. The API application opens and closes a
+ database connection on every request; keeping database connections open between requests would result in an
+ unrealistically large number of database connections open at the same time, and very low utilization of each
+ connection.
+ </P>
+ <P>
+ </P>
+ <H4>
+ <A NAME="S5.3">5.3 Test Conditions</A>
+ </H4>
+ <P>
+ The test load is generated by 10 HTTP client processes. The processes represent disjoint sets of users. A
+ process makes a request for a user, then a request for a different user, and so on until it is time for the
+ first user to make another request.
+ </P>
+ <P>
+ For simplicity the 10 client processes run on the same machine as the Web server. This avoids the possibility
+ that a network bottleneck will obscure the test results. The database system also runs on this machine, as
+ specified in the application scenario.
+ </P>
+ <P>
+ Response time is not an issue under the test conditions. We just measure throughput.
+ </P>
+ <P>
+ The API Web server in these tests is Netscape 1.1.
+ </P>
+ <P>
+ </P>
+ <H4>
+ <A NAME="S5.4">5.4 Test Results and Discussion</A>
+ </H4>
+ <P>
+ Here are the test results:
+ </P>
+ <P>
+ </P>
+ <DIV CLASS="c3">
+<PRE>
+ FastCGI 12.0 msec per request = 83 requests per second
+ API 36.6 msec per request = 27 requests per second
+</PRE>
+ </DIV>
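+ <P>
+ The conversion between the two columns is simply 1000 msec divided by the per-request
+ latency (the published figures are rounded):
+ </P>

```python
# Per-request latency to throughput: requests/sec = 1000 / msec-per-request.
def requests_per_second(msec_per_request):
    return 1000.0 / msec_per_request

fastcgi = requests_per_second(12.0)   # ~83.3 requests per second
api = requests_per_second(36.6)       # ~27.3 requests per second
```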
+ <P>
+ Given the big architectural advantage that the FastCGI application enjoys over the API application, it is not
+ surprising that the FastCGI application runs a lot faster. To gain a deeper understanding of these results we
+ measured two more conditions:
+ </P>
+ <P>
+ </P>
+ <UL>
+ <LI>
+ API with sustained database connections. If you could afford the extra licensing cost, how much faster
+ would your API application run?
+ <P>
+ </P>
+<PRE>
+ API 16.0 msec per request = 61 requests per second
+</PRE>
+ Answer: Still not as fast as the FastCGI application.
+ <P>
+ </P>
+ </LI>
+ <LI>
+ FastCGI with cache disabled. How much benefit does the FastCGI application get from its cache?
+ <P>
+ </P>
+<PRE>
+ FastCGI 20.1 msec per request = 50 requests per second
+</PRE>
+ Answer: A very substantial benefit, even though the database access is quite simple.<BR>
+ <BR>
+ </LI>
+ </UL>
+ <P>
+ What these two extra experiments show is that if the API and FastCGI applications are implemented in exactly
+ the same way -- caching database connections but not caching user profile data -- the API application is
+ slightly faster. This is what you'd expect, since the FastCGI application has to pay the cost of
+ inter-process communication not present in the API application.
+ </P>
+ <P>
+ In the real world the two applications would not be implemented in the same way. FastCGI's architectural
+ advantage results in much higher performance -- a factor of 3 in this test. With a remote database or more
+ expensive database access the factor would be higher. With more substantial processing of the content files
+ the factor would be smaller.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S6">6. Multi-threaded APIs</A>
+ </H3>
+ <P>
+ Web servers with a multi-threaded internal structure (and APIs to match) are now starting to become more
+ common. These servers don't have all of the disadvantages described in Section 3. Does this mean that
+ FastCGI's performance advantages will disappear?
+ </P>
+ <P>
+ A superficial analysis says yes. An API-based application in a single-process, multi-threaded server can
+ maintain caches and database connections the same way a FastCGI application can. The API-based application
+ does not pay for inter-process communication, so the API-based application will be slightly faster than the
+ FastCGI application.
+ </P>
+ <P>
+ A deeper analysis says no. Multi-threaded programming is complex, because concurrency makes programs much more
+ difficult to test and debug. In the case of multi-threaded programming to Web server APIs, the normal problems
+ with multi-threading are compounded by the lack of isolation between different applications and between the
+ applications and the Web server. With FastCGI you can write programs in the familiar single-threaded style,
+ get all the reliability and maintainability of process isolation, and still get very high performance. If you
+ truly need multi-threading, you can write multi-threaded FastCGI and still isolate your multi-threaded
+ application from other applications and from the server. In short, multi-threading makes Web server APIs
+ unusable for practically all applications, reducing the choice to FastCGI versus CGI. The performance winner in
+ that contest is obviously FastCGI.
+ </P>
+ <P>
+ </P>
+ <H3>
+ <A NAME="S7">7. Conclusion</A>
+ </H3>
+ <P>
+ Just how fast is FastCGI? The answer: very fast indeed. Not because it has some specially-greased path through
+ the operating system, but because its design is well matched to the needs of most applications. We invite you
+ to make FastCGI the fast, open foundation for your Web server applications.
+ </P>
+ <P>
+ </P>
+ <HR>
+ <A HREF="http://www.openmarket.com/"><IMG SRC="omi-logo.gif" ALT="OMI Home Page"></A>
+ <ADDRESS>
+ © 1995, Open Market, Inc. / mbrown@openmarket.com
+ </ADDRESS>
+ </BODY>
+</HTML>
+