spider threads with basic auth

Kevin Hamilton kevin.hamilton at gmail.com
Sat Apr 5 05:18:44 EST 2008


I didn't look at it in any depth, so I'm not sure if this is a bug in Venus
or in httplib2, but I thought I'd start here.

If I try to fetch a feed that requires authentication, using a URL of the
form http://username:password@www.example.com, the fetch works when
spider_threads=0 in config.ini but fails when spider_threads=1 (or more).

Here's the output from the spider.py test program when spider_threads=1:
1207331661.923120 Socket timeout set to 20 seconds
1207331662.161323 Fetching
http://username:password@www.example.com/extrss.php?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1
via 0
1207331662.162117 Error processing
http://username:password@www.example.com/extrss.php?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1
1207331662.219738 InvalidURL: nonnumeric port: 'password@www.example.com'
1207331662.220118   File
"/homepages/41/d94174740/htdocs/home/v/planet/spider.py", line 312, in
httpThread
    (resp, content) = h.request(idna, 'GET', headers=headers)
1207331662.220417   File
"/homepages/41/d94174740/htdocs/home/v/planet/vendor/httplib2/__init__.py",
line 780, in request
    conn = self.connections[scheme+":"+authority] =
connection_type(authority)
1207331662.220724   File
"/kunden/homepages/41/d94174740/htdocs/home/lib/python2.4/httplib.py", line
586, in __init__
    self._set_hostport(host, port)
1207331662.221006   File
"/kunden/homepages/41/d94174740/htdocs/home/lib/python2.4/httplib.py", line
598, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
1207331662.262189 Error 500 while updating feed
http://username:password@www.example.com/extrss.php?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1
1207331662.283040 Finished threaded part of processing.
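
The traceback suggests that the whole authority string, credentials included,
reaches the connection constructor, which then treats everything after the
last colon as a port number. The sketch below imitates that host/port split
to show why the error message mentions the password. It is a simplified
stand-in, not the actual httplib source; parse_host_port is a hypothetical
name, and it raises ValueError where httplib raises InvalidURL.

```python
def parse_host_port(authority, default_port=80):
    """Split 'host[:port]' the way a naive host/port parser would.

    Hypothetical helper for illustration only; it assumes the caller has
    already stripped any 'user:password@' prefix from the authority.
    """
    i = authority.rfind(":")   # last colon is taken as the port separator
    j = authority.rfind("]")   # a colon inside an IPv6 literal does not count
    if i > j:
        try:
            port = int(authority[i + 1:])
        except ValueError:
            # Same message shape as the InvalidURL in the traceback.
            raise ValueError("nonnumeric port: '%s'" % authority[i + 1:])
        host = authority[:i]
    else:
        host, port = authority, default_port
    return host, port


if __name__ == "__main__":
    print(parse_host_port("www.example.com:8080"))
    try:
        parse_host_port("username:password@www.example.com")
    except ValueError as exc:
        print(exc)
```

With credentials left in the authority, the last colon is the one between the
username and the password, so the text after it cannot be parsed as a port.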

and here is the output when spider_threads=0:
1207331728.759372 Socket timeout set to 20 seconds
1207331728.760210 Building work queue
1207331731.451615 Updating feed
http://username:password@www.example.com/extrss.php?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1@
http://www.example.com/extrss.php?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1
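
One possible workaround, whichever side the bug is on, is to strip the
credentials out of the URL before the request is made and hand them over
separately. The sketch below uses only the Python 3 standard library;
split_credentials is a hypothetical helper, not part of Venus or httplib2.
httplib2 does provide Http.add_credentials(name, password), but how that
would best be wired into the threaded code path in spider.py is an
assumption that has not been checked here.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, username, password).

    Hypothetical helper: username and password are None when the URL
    carries no 'user:password@' prefix.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # Re-bracket IPv6 literals, which urlsplit returns without brackets.
        host = "[" + host + "]"
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    username = unquote(parts.username) if parts.username else None
    password = unquote(parts.password) if parts.password else None
    return clean, username, password


if __name__ == "__main__":
    feed = ("http://username:password@www.example.com/extrss.php"
            "?type=custom&forumids=16%2C39&lastpost=1&fulldesc=1")
    clean, user, pw = split_credentials(feed)
    print(clean)
    # With httplib2 installed, the credentials could then be supplied as:
    #     h = httplib2.Http()
    #     h.add_credentials(user, pw)
    #     resp, content = h.request(clean, 'GET')
```

The cleaned URL has the same shape as the second URL logged in the
spider_threads=0 run above.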