User Agent Sniffing at Google Libraries CDN

I recently took a closer look at Google Libraries, their content delivery network (CDN) for various Javascript libraries, and HTTP compression. I started with a simple test:

curl -O -v --compressed http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js

This downloads a minified version of jQuery 1.4.3, with --compressed, which means I’d like the response to be compressed. The HTTP request looked like:

> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: curl/7.16.4 (i386-apple-darwin9.0) libcurl/7.16.4 OpenSSL/0.9.7l zlib/1.2.3
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
> 

The response from Google was:

< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:27:16 GMT
< Expires: Sat, 29 Oct 2011 03:27:16 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Cache-Control: public, max-age=31536000
< Age: 145355
< Transfer-Encoding: chunked
< 

I was surprised that there was no Content-Encoding: gzip header in the response, meaning the response was NOT compressed. I wasn’t quite sure what to make of this at first. No way would Google forget to turn on HTTP compression, I must have missed something. I stared at the HTTP response for sometime, trying to figure out what I was missing. Nothing came to mind, so I ran another test.

This time I made a request for http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js in Firefox 3.6.12 on Mac OS X and used Firebug to inspect the HTTP transaction. The request:

GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
Host: ajax.googleapis.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

and the response:

HTTP/1.1 200 OK
Content-Type: text/javascript; charset=UTF-8
Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
Date: Fri, 29 Oct 2010 03:12:35 GMT
Expires: Sat, 29 Oct 2011 03:12:35 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Server: sffe
Content-Encoding: gzip
Cache-Control: public, max-age=31536000
Content-Length: 26769
Age: 147128

This time the content was compressed. There were several differences in the request headers between curl and Firefox, I decided to start with just one, the “User-Agent”. I modified my initial curl request to include the User-Agent string from Firefox:

curl -O -v --compressed --user-agent "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js

The request:

> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
> 

and the response:

< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:33:09 GMT
< Expires: Sat, 29 Oct 2011 03:33:09 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Content-Encoding: gzip
< Cache-Control: public, max-age=31536000
< Content-Length: 26769
< Age: 147018
< 

Sure enough, I got back a compressed response. Google was sniffing the User-Agent string to determine if a compressed response should be sent. It didn’t matter if the client asked for a compressed response ( Accept-Encoding: deflate, gzip) or not. What still wasn’t clear is if this was a black list approach (singling out curl) or a white list approach (Firefox is okay). So I tried a few other requests with various User-Agent strings. First up, no User-Agent set at all:

curl -O -v --compressed --user-agent "" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js

Not compressed. Next a made up string:

curl -O -v --compressed --user-agent "JosephScott/1.0 test/2.0" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js

Not compressed. At this point I think Google is using a white list approach, if you aren’t on the list of approved User-Agent strings for getting a compressed response then you won’t get one, no matter how nicely you ask.

I collected a few more browser samples as well, just to be sure:

  • Safari 5.0.2 on Mac OS X – compressed
  • IE 8 on Windows XP – compressed
  • Firefox 3.6.12 on Windows XP – compressed
  • Chrome 7.0.517.41 beta on Windows XP – compressed
  • Opera 10.63 on Windows XP – NOT compressed
  • Safari 5.0.2 on Windows XP – compressed

One more time, curl using the IE 8 User-Agent string:

curl -O -v --compressed --user-agent "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET4.0C;" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js

Compressed.

Since I can manipulate the response based on the User-Agent value I’m left to conclude that the Google Library CDN sniffs the User-Agent string to determine if it will respond with a compressed result. From what I’ve seen so far Google Library contains a white list of approved User-Agent patterns that it checks against to determine if it will honor the compression request.

If you are on a current version of one of the popular browsers you will get a compressed response. For those using anything else you’ll have to test to confirm if Google Library will honor your request for compressed content. Opera users are just plain out of luck, even the most recent version gets an uncompressed response.

9 Comments

  1. Depressing. One more reason to avoid the Google CDN.
    Billy Hoffman has listed other points against the CDN version in his article Should You Use JavaScript Library CDNs?

  2. We actually found out that many servers do similar things. We have the same pattern with the ‘Connection: Close’ header. Some servers will automatically close the connection based on the UA (browsers mostly), or will keep it alive (Bots).

  3. I’m not sure if it is reason enough to avoid using the Google Library CDN, but it does raise a few more questions.

    As for Hoffman’s post, he brings up good points, but tries to apply them way too broadly. To say that avoiding a CDN for your Javascript files is the right answer for every site on the web is over simplifying by a long (LONG) shot.

  4. Interesting, I’ll have to look closer at the Connection header behavior in more detail at some point.

  5. Great post, Joseph! Using a whitelist even in the presence of Accept-Encoding has merit (I discuss this in HPWS). However, Opera should be in the whitelist afaik. I’ll contact the Google Libraries API folks.

  6. Thanks. As for a whitelist approach, are there really that many new HTTP clients coming out that have broken compression support? If most of the broken ones are older then it seems like less work to simply black list known problem clients. As the Opera item highlights, there is often little motivation in updating a white list. A black list might cause more pain in the short term (when a new broken client shows up), but that pain would lead to motivation in making sure the list is up to date.

    Making the white list that Google Library is using to determine feature support public would be really helpful. Including test methodologies and contact info for sending in updates would allow others to help keep the list up to date.

  7. Reminds me of FaceBook: You get ‘X-Cnection: close’ when you try to call an unknown user, e.g.: http://www.facebook.com/people/hammernich/
    Oh, and an empty Location header …

  8. I’m not sure about the empty Location header, but the ‘X-Cnection’ appears to be a load balancer – http://www.nextthing.org/archives/2005/08/07/fun-with-http-headers

  9. good post!
    I’m using Firefox 3.6.13 and asking for
    http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js
    I receive an uncompressed file…
    …but using https
    https://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js
    the returned file is definitely compressed.
    There’s something wrong in their white list approach.
    I wonder why not trusting the request headers?
    Federico

Leave a Reply

Your email address will not be published.

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

© 2014 Joseph Scott

Theme by Anders NorenUp ↑