permanent looping on time.sleep() in spider.py

seth vidal skvidal at fedoraproject.org
Sat Aug 14 06:21:15 EST 2010


On Fri, 2010-08-13 at 11:20 -0400, seth vidal wrote:
> Hi,
>  I've got an instance of venus running at planet.fedoraproject.org and
> we have some smaller sub-planets running in other dirs. We recently set
> our threading count to a specific number (20 threads) and have started
> to notice that sometimes the process will hang repeatedly at:
> 
>     while fetch_queue.qsize() or parse_queue.qsize() or threads:
>         while parse_queue.qsize() == 0 and threads:
>             time.sleep(0.1)
> 
> 
> without ever unhanging; it's waiting for something. I've checked to see
> if it was an HTTP timeout of some kind, but the socket timeout is set to
> 20s, which is the default in the code.
> 
> Has anyone else encountered this and have any suggestions?
> 
> Thanks,

Okay, so I hacked around this. Here's what's happening:

In one pass of the loop, all the threads stop simultaneously. The dead
threads are still in the threads dict and there is nothing left to
process, but the loop can't tell that, so it keeps waiting.

The attached patch checks whether all of our threads are dead and, if
so, breaks out of the loop.
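
The patch itself is not reproduced in this archive, so the following is
only a rough sketch of the kind of check described above, not the actual
change. It assumes the threads dict maps some key to a threading.Thread
object. The helper names all_threads_dead and wait_for_results are
hypothetical, and it uses the Python 3 spelling is_alive() (the 2010-era
code would have used isAlive()).

```python
import queue
import threading
import time


def all_threads_dead(threads):
    """Return True when no thread in the dict is still running (hypothetical helper)."""
    return not any(t.is_alive() for t in threads.values())


def wait_for_results(parse_queue, threads, poll_interval=0.1):
    """Sleep until a result arrives or every worker thread has exited.

    Returns True if parse_queue has something to process, False otherwise.
    """
    while parse_queue.qsize() == 0 and threads:
        if all_threads_dead(threads):
            # No live worker can enqueue anything more, so stop waiting.
            break
        time.sleep(poll_interval)
    # Re-check the queue after the loop: a worker may have queued a result
    # just before it exited.
    return parse_queue.qsize() > 0
```

With a check like this in the inner wait, a run where every worker exits
in the same pass returns instead of sleeping forever, because the wait no
longer depends on the dict having been emptied.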

-sv

-------------- next part --------------
A non-text attachment was scrubbed...
Name: venus-thread-death.patch
Type: text/x-patch
Size: 1015 bytes
Desc: not available
URL: </archives/devel/attachments/20100813/0102c955/attachment.bin>

