HTTP Basic Auth with httplib2

While working on pressfs I ran into an issue with httplib2 using HTTP Basic Authentication.

Here is some example code:

import httplib2

if __name__ == '__main__' :
    httplib2.debuglevel = 1

    h = httplib2.Http()
    h.add_credentials( 'username', 'password' )

    resp, content = h.request( 'http://www.google.com/', 'GET' )

If you run this you’ll notice that httplib2 doesn’t actually include the HTTP Basic Auth details in the request, even though the code specifically asks it to do so. By design it will always make one request with no authentication details and then check to see if it gets an HTTP 401 Unauthorized response back. If and only if it gets a 401 response back will it then make a second request that includes the authentication data.
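The two-step flow described above can be modeled without any network access. This is only a sketch of the challenge-response logic, not httplib2's actual code: `fake_server` and `request_with_challenge` are hypothetical names made up for illustration, and the "server" is just a function that returns 401 unless an Authorization header is present.

```python
import base64


def fake_server(headers, log):
    """Stand-in for a Basic-Auth-protected server; records every request it sees."""
    log.append(dict(headers))
    if "Authorization" in headers:
        return 200
    return 401


def request_with_challenge(username, password, log):
    """Model of the challenge-response flow: send bare first, retry with auth on 401."""
    # First request carries no credentials at all.
    status = fake_server({}, log)
    if status == 401:
        # Only after a 401 are the credentials encoded and sent.
        raw = (username + ":" + password).encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        status = fake_server({"Authorization": "Basic " + token}, log)
    return status


if __name__ == "__main__":
    log = []
    status = request_with_challenge("username", "password", log)
    # Final status, followed by the number of requests it took to get there.
    print(status, len(log))
```

The log ends up with two entries: one request with no headers, then one with the Authorization header.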

Bottom line, I didn’t want to make two HTTP requests when only one was needed (an extra round trip on every authenticated call is a huge performance hit). There is no option to force the authentication header to be sent on the first request, so you have to do it manually:

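import base64
import httplib2

if __name__ == '__main__' :
    httplib2.debuglevel = 1

    h = httplib2.Http()
    # b64encode (unlike encodestring) appends no trailing newline, which would
    # corrupt the header. It takes and returns bytes, so encode first, decode after.
    auth = base64.b64encode( 'username:password'.encode( 'utf-8' ) ).decode( 'ascii' )

    resp, content = h.request(
        'http://www.google.com/',
        'GET',
        headers = { 'Authorization' : 'Basic ' + auth }
    )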

Watching the output of this you’ll see the authentication header in the request.
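If you need this in more than one place, the header-building step can be pulled into a small helper. This is a minimal sketch: `basic_auth_header` is a hypothetical name, not part of httplib2, and it assumes the credentials can be encoded as UTF-8.

```python
import base64


def basic_auth_header(username, password):
    """Return a headers dict carrying a Basic Authorization value."""
    raw = (username + ":" + password).encode("utf-8")
    # b64encode returns bytes with no trailing newline; decode to get a text header value.
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": "Basic " + token}


if __name__ == "__main__":
    headers = basic_auth_header("username", "password")
    print(headers["Authorization"])
    # With httplib2 installed, the dict can be passed straight to request():
    #   h = httplib2.Http()
    #   resp, content = h.request('http://www.google.com/', 'GET', headers=headers)
```

Keeping the encoding in one function means the bytes/text handling only has to be right once.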

Someone else already opened an issue about this (Issue 130). Unfortunately, Joe Gregorio has indicated that he has no intention of ever fixing this :-(

On the up side, working around this deficiency only takes a little bit of extra code.

4 Comments

  1. Why are you using httplib2? There is a built-in HTTP library, urllib2, which will serve you well (and supports basic auth out of the box).

  2. I looked at urllib/urllib2 originally, but the HTTPPasswordMgr approach did not look appealing.

  3. httplib2 does handle connections differently than the built-in libraries and conveniently enables an on-disk cache (.cache by default). I chose it for a project that needed to manage connections more proactively than urllib, et al. A side effect of this is that the response object maintains a reference and is therefore persistent, allowing other code to be a bit more lazy. That can bite, but for me it allowed for some cleaner code.

    The auth bit you were at odds with is the _request method (__init__.py file). While I’ve not poked at it in earnest, the change appears to be trivial to implement cleanly. I wanted to see whether you could simply append the auth to the authorizations attribute, but the way the first request is made appears to preclude that from working.

    The one justification that may make a difference with the author is that some systems (some might say broken ones) do not play well without some auth from the start. I work with one vendor that requires basic-auth to prompt a 401 response for the next level. Like you, I simply send auth (both to the example) in the first headers and the full query URI for a single-step request and response.

    Regardless of all this, I used the same method as you via custom headers. It was trivial and easily maintained.

  4. I’ve also started looking at Requests as another option.

© 2014 Joseph Scott
