permanent looping on time.sleep() in spider.py
skvidal at fedoraproject.org
Sat Aug 14 06:21:15 EST 2010
On Fri, 2010-08-13 at 11:20 -0400, seth vidal wrote:
> I've got an instance of venus running at planet.fedoraproject.org and
> we have some smaller sub-planets running in other dirs. We recently set
> our threading count to a specific number (20 threads) and have started
> to notice that sometimes the process will hang repeatedly at:
>     while fetch_queue.qsize() or parse_queue.qsize() or threads:
>         while parse_queue.qsize() == 0 and threads:
> and never recover; it's waiting for something. I've checked to see
> if it was an http timeout of some kind, but the socket timeout is set to
> 20s, which is the default in the code.
> Has anyone else encountered this and have any suggestions?
Okay, so I hacked around this. Here's what's happening:
in one run of the loop all the threads stop simultaneously, which
means the threads still exist in the threads dict and there is
nothing else to process, but the loop can't tell that. So it continues to
sleep in time.sleep() forever.
The attached patch checks whether all of our threads are dead and, if so,
breaks out of the loop.
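
Since the attachment did not survive in the archive, here is a rough
sketch of the idea only, not the actual patch. It assumes threads is a
dict whose values are threading.Thread objects and that parse_queue is a
standard queue.Queue; the helper names all_threads_dead and drain, and
the poll_interval parameter, are made up for illustration and are not
from spider.py.

```python
import queue
import threading
import time


def all_threads_dead(threads):
    """Return True when no worker in the dict is still running."""
    return not any(t.is_alive() for t in threads.values())


def drain(parse_queue, threads, poll_interval=0.1):
    """Collect queued items, stopping once every worker has exited.

    Hypothetical stand-in for the wait loop: without the liveness check,
    finished threads left in the dict would keep the inner loop sleeping.
    """
    results = []
    while parse_queue.qsize() or threads:
        while parse_queue.qsize() == 0 and threads:
            if all_threads_dead(threads):
                # Nothing can add work any more, so stop waiting.
                threads.clear()
                break
            time.sleep(poll_interval)
        # Pick up anything that arrived, including items queued just
        # before the last worker exited.
        while parse_queue.qsize():
            results.append(parse_queue.get(block=False))
    return results
```

The dict is only cleared after every thread reports dead, so any item a
worker queued before exiting is still picked up by the final drain.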
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 1015 bytes
Desc: not available