Posts Tagged ‘Acrobat’

Compressing Web Output Using mod_deflate and Apache 2.0.x

October 3rd, 2006 by smp | Comments | Filed in Web Performance, WebPerformance.Org

In a previous paper, the use of mod_gzip to dynamically compress the output from an Apache server. With the growing use of the Apache 2.0.x family of Web servers, the question arises of how to perform a similar GZIP-encoding function within this server. The developers of the Apache 2.0.x servers have included a module in the codebase for the server to perform just this task.

mod_deflate is included in the Apache 2.0.x source package, and compiling it in is a simple matter of adding it to the configure command.

	./configure --enable-modules=all --enable-mods-shared=all --enable-deflate

When the server is made and installed, the GZIP-encoding of documents can be enabled in one of two ways: explicit exclusion of files by extension; or by explcit inclusion of files by MIME type. These methods are specified in the httpd.conf file.


Explicit Exclusion

SetOutputFilter DEFLATE
DeflateFilterNote ratio
SetEnvIfNoCase Request_URI .(?:gif|jpe?g|png)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI .(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI .pdf$ no-gzip dont-vary

Explicit Inclusion

DeflateFilterNote ratio
AddOutputFilterByType DEFLATE text/*
AddOutputFilterByType DEFLATE application/ms* application/vnd* application/postscript

Both methods enable the automatic GZIP-encoding of all MIME-types, except image and PDF files, as they leave the server. Image files and PDF files are excluded as they are already in a highly compressed format. In fact, PDFs become unreadable by Adobe’s Acrobat Reader if they are further compressed by mod_deflate or mod_gzip.On the server used for testing mod_deflate for this article, no Windows executables or compressed files are served to visitors. However, for safety’s sake, please ensure that compressed files and binaries are not GZIP-encoded by your Web server application.For the file-types indicated in the exclude statements, the server is told explicitly not to send the Vary header. The Vary header indicates to any proxy or cache server which particular condition(s) will cause this response to Vary from other responses to the same request.

If a client sends a request which does not include the Accept-Encoding: gzip header, then the item which is stored in the cache cannot be returned to the requesting client if the Accept-Encoding headers do not match. The request must then be passed directly to the origin server to obtain a non-encoded version. In effect, proxy servers may store 2 or more copies of the same file, depending on the client request conditions which cause the server response to Vary.

Removing the Vary response requirement for objects not handled means that if the objects do not vary due to any other directives on the server (browser type, for example), then the cached object can be served up without any additional requests until the Time-To-Live (TTL) of the cached object has expired.

In examining the performance of mod_deflate against mod_gzip, the one item that distinguished mod_deflate from mod_gzip in versions of Apache prior to 2.0.45 was the amount of compression that occurred. The examples below demonstrate that the compression algorithm for mod_gzip produces between 4-6% more compression than mod_deflate for the same file.[1]

Table 1 — /compress/homepage2.html

Compression Size Compression %
No compression 56380 bytes n/a
Apache 1.3.x/mod_gzip 16333 bytes 29% of original
Apache 2.0.x/mod_deflate 19898 bytes 35% of original

Table 2 — /documents/spierzchala-resume.ps

Compression Size Compression %
No Compression 63451 bytes n/a
Apache 1.3.x/mod_gzip 19758 bytes 31% of original
Apache 2.0.x/mod_deflate 23407 bytes 37% of original

Attempts to increase the compression ratio of mod_deflate in Apache 2.044 and lower using the directives provided for this module produced no further decrease in transferred file size. A comment from one of the authors of the mod_deflate module stated that the module was written specifically to ensure that server performance was not degraded by using this compression method. The module was, by default, performing the fastest compression possible, rather than a mid-range compromise between speed and final file size.

Starting with Apache 2.0.45, the compression level of mod_deflate is configurable using the DeflateCompressionLevel directive. This directive accepts values between 1 (fastest compression speed; lowest compression ratio) and 9 (slowest compression speed; highest compression ratio), with the default value being 6. This simple change makes the compression in mod_deflate comparable to mod_gzip out of the box.

Using mod_deflate for Apache 2.0.x is a quick and effective way to decrease the size of the files that are sent to clients. Anything that can produce between 50% and 80% in bandwidth savings with so little effort should definitely be considered for any and all Apache 2.0.x deployments wishing to use the default Apache codebase.


[1] A note on the compression in mod_deflate for Apache 2.044 and lower: The level of compression can be modified by changing the ZLIB compression setting in mod_deflate.c from Z_BEST_SPEED (equivalent to “gzip -1″) to Z_BEST_COMPRESSION (equivalent to “gzip -9″). These defaults can also be replaced with a numeric value between 1 and 9.

More info on hacking mod_deflate for Apache 2.0.44 and lower can be found here.

Tags: , , , , , , , , , , , , , , , , ,

Compressing Web Output Using mod_gzip for Apache 1.3.x and 2.0.x

October 3rd, 2006 by smp | Comments | Filed in Web Performance, WebPerformance.Org

Web page compression is not a new technology, but it has just recently gained higher recognition in the minds of IT administrators and managers because of the rapid ROI it generates. Compression extensions exist for most of the major Web server platforms, but in this article I will focus on the Apache and mod_gzip solution.

The idea behind GZIP-encoding documents is very straightforward. Take a file that is to be transmitted to a Web client, and send a compressed version of the data, rather than the raw file as it exists on the filesystem. Depending on the size of the file, the compressed version can run anywhere from 50% to 20% of the original file size.

In Apache, this can be achieved using a couple of different methods. Content Negotiation, which requires that two separate sets of HTML files be generated — one for clients that can handle GZIP-encoding, and one for those who can’t — is one method. The problem with this solution should be readily apparent: there is no provision in this methodology for GZIP-encoding dynamically-generated pages.

The more graceful solution for administrators who want to add GZIP-encoding to Apache is the use of mod_gzip. I consider it one of the overlooked gems for designing a high-performance Web server. Using this module, configured file types — based on file extension or MIME type — will be compressed using GZIP-encoding after they have been processed by all of Apache’s other modules, and before they are sent to the client. The compressed data that is generated reduces the number of bytes transferred to the client, without any loss in the structure or content of the original, uncompressed document.

mod_gzip can be compiled into Apache as either a static or dynamic module; I have chosen to compile it as a dynamic module in my own server. The advantage of using mod_gzip is that this method requires that nothing be done on the client side to make it work. All current browsers — Mozilla, Opera, and even Internet Explorer — understand and can process GZIP-encoded text content.

On the server side, all the server or site administrator has to do is compile the module, edit the appropriate configuration directives that were added to the httpd.conf file, enable the module in the httpd.conf file, and restart the server. In less than 10 minutes, you can be serving static and dynamic content using GZIP-encoding without the need to maintain multiple codebases for clients that can or cannot accept GZIP-encoded documents.

When a request is received from a client, Apache determines if mod_gzip should be invoked by noting if the “Accept-Encoding: gzip” HTTP request header has been sent by the client. If the client sends the header, mod_gzip will automatically compress the output of all configured file types when sending them to the client.

This client header announces to Apache that the client will understand files that have been GZIP-encoded. mod_gzip then processes the outgoing content and includes the following server response headers.

	Content-Type: text/html
	Content-Encoding: gzip

These server response headers announce that the content returned from the server is GZIP-encoded, but that when the content is expanded by the client application, it should be treated as a standard HTML file. Not only is this successful for static HTML files, but this can be applied to pages that contain dynamic elements, such as those produced by Server-Side Includes (SSI), PHP, and other dynamic page generation methods. You can also use it to compress your Cascading Stylesheets (CSS) and plain text files. As well, a whole range of application file types can be compressed and sent to clients. My httpd.conf file sets the following configuration for the file types handled by mod_gzip:

	mod_gzip_item_include mime ^text/.*
	mod_gzip_item_include mime ^application/postscript$
	mod_gzip_item_include mime ^application/ms.*$
	mod_gzip_item_include mime ^application/vnd.*$
	mod_gzip_item_exclude mime ^application/x-javascript$
	mod_gzip_item_exclude mime ^image/.*$

This allows Microsoft Office and Postscript files to be GZIP-encoded, while not affecting PDF files. PDF files should not be GZIP-encoded, as they are already compressed in their native format, and compressing them leads to issues when attempting to display the files in Adobe Acrobat Reader.[1] For the paranoid system administrator, you may want to explicitly exclude PDF files.

	mod_gzip_item_exclude mime ^application/pdf$

Another side-note is that nothing needs to be done to allow the GZIP-encoding of OpenOffice (and presumably, StarOffice) documents. Their MIME-type is already set to text-plain, allowing them to be covered by one of the default rules.

How beneficial is sending GZIP-encoded content? In some simple tests I ran on my Web server using WGET, GZIP-encoded documents showed that even on a small Web server, there is the potential to produce a substantial savings in bandwidth usage.

http://www.pierzchala.com/bio.html Uncompressed File Size: 3122 bytes
http://www.pierzchala.com/bio.html Compressed File Size: 1578 bytes
http://www.pierzchala.com/compress/homepage2.html Uncompressed File Size: 56279 bytes
http://www.pierzchala.com/compress/homepage2.html Compressed File Size: 16286 bytes

Server administrators may be concerned that mod_gzip will place a heavy burden on their systems as files are compressed on the fly. I argue against that, pointing out that this does not seem to concern the administrators of Slashdot, one of the busiest Web servers on the Internet, who use mod_gzip in their very high-traffic environment.

The mod_gzip project page for Apache 1.3.x is located at SourceForge. The Apache 2.0.x version is available from here.


[1] From http://www.15seconds.com/issue/020314.htm:
“Both Internet Explorer 5.5 and Internet Explorer 6.0 have a bug with decompression that affects some users. This bug is documented in: the Microsoft knowledge Base articles, Q312496 is for IE 6.0 … , the Q313712 is for IE 5.5. Basically Internet Explorer doesn’t decompress the response before it sends it to plug-ins like Adobe Photoshop.”

Tags: , , , , , , , , , , , , , , , , , , , , ,

Compressing Web Output Using mod_gzip for Apache 1.3.x and 2.0.x

April 30th, 2005 by smp | Comments | Filed in smp

Web page compression is not a new technology, but it has just recently gained higher recognition in the minds of IT administrators and managers because of the rapid ROI it generates. Compression extensions exist for most of the major Web server platforms, but in this article I will focus on the Apache and mod_gzip solution.

The idea behind GZIP-encoding documents is very straightforward. Take a file that is to be transmitted to a Web client, and send a compressed version of the data, rather than the raw file as it exists on the filesystem. Depending on the size of the file, the compressed version can run anywhere from 50% to 20% of the original file size.

In Apache, this can be achieved using a couple of different methods. Content Negotiation, which requires that two separate sets of HTML files be generated — one for clients that can handle GZIP-encoding, and one for those who can’t — is one method. The problem with this solution should be readily apparent: there is no provision in this methodology for GZIP-encoding dynamically-generated pages.

The more graceful solution for administrators who want to add GZIP-encoding to Apache is the use of mod_gzip. I consider it one of the overlooked gems for designing a high-performance Web server. Using this module, configured file types — based on file extension or MIME type — will be compressed using GZIP-encoding after they have been processed by all of Apache’s other modules, and before they are sent to the client. The compressed data that is generated reduces the number of bytes transferred to the client, without any loss in the structure or content of the original, uncompressed document.

mod_gzip can be compiled into Apache as either a static or dynamic module; I have chosen to compile it as a dynamic module in my own server (more compile instructions here). The advantage of using mod_gzip is that this method requires that nothing be done on the client side to make it work. All current browsers — Mozilla, Opera, and even Internet Explorer — understand and can process GZIP-encoded text content.

On the server side, all the server or site administrator has to do is compile the module, edit the appropriate configuration directives that were added to the httpd.conf file, enable the module in the httpd.conf file, and restart the server. In less than 10 minutes, you can be serving static and dynamic content using GZIP-encoding without the need to maintain multiple codebases for clients that can or cannot accept GZIP-encoded documents.

When a request is received from a client, Apache determines if mod_gzip should be invoked by noting if the “Accept-Encoding: gzip” HTTP request header has been sent by the client. If the client sends the header, mod_gzip will automatically compress the output of all configured file types when sending them to the client.

This client header announces to Apache that the client will understand files that have been GZIP-encoded. mod_gzip then processes the outgoing content and includes the following server response headers.

		Content-Type: text/html
		Content-Encoding: gzip
		

These server response headers announce that the content returned from the server is GZIP-encoded, but that when the content is expanded by the client application, it should be treated as a standard HTML file. Not only is this successful for static HTML files, but this can be applied to pages that contain dynamic elements, such as those produced by Server-Side Includes (SSI), PHP, and other dynamic page generation methods. You can also use it to compress your Cascading Stylesheets (CSS) and plain text files. As well, a whole range of application file types can be compressed and sent to clients. My httpd.conf file sets the following configuration for the file types handled by mod_gzip:

		mod_gzip_item_include mime ^text/.*
		mod_gzip_item_include mime ^application/postscript$
		mod_gzip_item_include mime ^application/ms.*$
		mod_gzip_item_include mime ^application/vnd.*$
		mod_gzip_item_exclude mime ^application/x-javascript$
		mod_gzip_item_exclude mime ^image/.*$
		

This allows Microsoft Office and Postscript files to be GZIP-encoded, while not affecting PDF files. PDF files should not be GZIP-encoded, as they are already compressed in their native format, and compressing them leads to issues when attempting to display the files in Adobe Acrobat Reader.[1] For the paranoid system administrator, you may want to explicitly exclude PDF files.

		mod_gzip_item_exclude mime ^application/pdf$
		

Another side-note is that nothing needs to be done to allow the GZIP-encoding of OpenOffice (and presumably, StarOffice) documents. Their MIME-type is already set to text-plain, allowing them to be covered by one of the default rules.

How beneficial is sending GZIP-encoded content? In some simple tests I ran on my Web server using WGET, GZIP-encoded documents showed that even on a small Web server, there is the potential to produce a substantial savings in bandwidth usage.

http://www.pierzchala.com/bio.html Uncompressed File Size: 3122 bytes
http://www.pierzchala.com/bio.html Compressed File Size: 1578 bytes
http://www.pierzchala.com/compress/homepage2.html Uncompressed File Size: 56279 bytes
http://www.pierzchala.com/compress/homepage2.html Compressed File Size: 16286 bytes

Server administrators may be concerned that mod_gzip will place a heavy burden on their systems as files are compressed on the fly. I argue against that, pointing out that this does not seem to concern the administrators of Slashdot, one of the busiest Web servers on the Internet, who use mod_gzip in their very high-traffic environment.

The mod_gzip project page for Apache 1.3.x is located at SourceForge. The Apache 2.0.x version is available from here.


[1] From http://www.15seconds.com/issue/020314.htm

Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Adobe Buys Macromedia: Bullshit and Dinosaurs

April 18th, 2005 by smp | Comments | Filed in RANTING

Kottke has a great summary of the links for the Adobomedia/Macrodobe story here.


In my opinion, this quote sums up what is wrong with this merger.

The combination of Adobe and Macromedia strengthens our mission of helping people and organizations communicate better. Through the combination of our powerful development, authoring and collaboration tools – and the complementary functionality of PDF and Flash – we have the opportunity to drive an industry-defining technology platform that delivers compelling, rich content and applications across a wide range of devices and operating systems. [here]

The Adobe and Macromedia Marketing/Press Relations teams need the help of the Bullfighter software.

Flash makes web products that are great for online games…and useless for anything else. Adobe makes a PDF reader, which I can replace with any number of free readers.

I can feel gravity dragging this merger into the pit of despair.

I agree with Om Malik and Russell Beattie — I vote “-1″ on this merger.

More here and here and here and here

Richard Koman says this is a good deal…Flash on Mobile devices merged with a lighter version of Acrobat.

Sramana Mitra says that Apple should buy Adobe now. [here]

Strategize says Adobe everywhere, all the time. [here]

Roland Tanglao quotes Marc Canter, who is happy to see Macromedia disappear.

More from Roland here.

Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,