Planet-Venus 0~bzr116-1 on Debian Squid error
Sam Ruby
rubys at intertwingly.net
Sat Jun 19 03:32:12 EST 2010
On 06/18/2010 05:22 AM, Matteo Calorio wrote:
> Hello,
>
>
> I get the following list of errors when I try to get feeds from
> http://www.mymovies.it/cinema/xml/rss/:
This looks like a bug in BeautifulSoup (which is designed to handle
non-well-formed input). At one time the FeedParser used BeautifulSoup to
parse microformats. I believe the FeedParser has since disabled
microformat parsing, as that feature was never completely implemented.
In any case, I can verify that the feed you mention is parsed
successfully by the latest Venus (which, incidentally, is in git and
includes a later version of the feed parser).
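
If you want to check a single feed against whichever feed parser you have
installed, outside of a full Venus run, something along these lines may
help. It is only a rough sketch: try_parse is a hypothetical helper (not
part of Venus or the FeedParser), and it assumes the parser returns a
dict-like result with "entries" and "bozo" keys, as feedparser.parse does.

```python
import sys
import traceback


def try_parse(parse, source):
    """Call parse(source) and summarize the outcome.

    `parse` is any callable shaped like feedparser.parse. This helper is
    hypothetical and exists only for illustration.
    """
    try:
        result = parse(source)
    except Exception as exc:
        # The innermost frame names the function that actually raised,
        # e.g. a parser method deep inside a vendored library.
        frames = traceback.extract_tb(sys.exc_info()[2])
        return {
            "ok": False,
            "error": "%s: %s" % (type(exc).__name__, exc),
            "raised_in": frames[-1][2],
        }
    # Assumes a dict-like result, as returned by feedparser.parse.
    return {
        "ok": True,
        "entries": len(result.get("entries", [])),
        "bozo": result.get("bozo"),
    }
```

With the feed parser importable on its own, you could call
try_parse(feedparser.parse, "http://www.mymovies.it/cinema/xml/rss/") and
compare the result between the packaged version and the one in git.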
> ERROR:planet.runner:Error processing http://www.mymovies.it/cinema/xml/rss/
> ERROR:planet.runner:HTMLParseError: malformed start tag, at line 1, column 91
> ERROR:planet.runner: File "/usr/lib/pymodules/python2.5/planet/spider.py",
> line 441, in spiderPlanet
> data = feedparser.parse(feed, **options)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 3525, in
> parse
> feedparser.feed(data)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 1662, in feed
> sgmllib.SGMLParser.feed(self, data)
> ERROR:planet.runner: File "/usr/lib/python2.5/sgmllib.py", line 99, in feed
> self.goahead(0)
> ERROR:planet.runner: File "/usr/lib/python2.5/sgmllib.py", line 138, in
> goahead
> k = self.parse_endtag(i)
> ERROR:planet.runner: File "/usr/lib/python2.5/sgmllib.py", line 315, in
> parse_endtag
> self.finish_endtag(tag)
> ERROR:planet.runner: File "/usr/lib/python2.5/sgmllib.py", line 355, in
> finish_endtag
> self.unknown_endtag(tag)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 569, in
> unknown_endtag
> method()
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 1414, in
> _end_description
> value = self.popContent('description')
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 849, in
> popContent
> value = self.pop(tag)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 764, in pop
> mfresults = _parseMicroformats(output, self.baseuri, self.encoding)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 2218, in
> _parseMicroformats
> p = _MicroformatsParser(htmlSource, baseURI, encoding)
> ERROR:planet.runner: File
> "/usr/lib/pymodules/python2.5/planet/vendor/feedparser.py", line 1823, in
> __init__
> self.document = BeautifulSoup.BeautifulSoup(data)
> ERROR:planet.runner: File "/usr/lib/pymodules/python2.5/BeautifulSoup.py",
> line 1499, in __init__
> BeautifulStoneSoup.__init__(self, *args, **kwargs)
> ERROR:planet.runner: File "/usr/lib/pymodules/python2.5/BeautifulSoup.py",
> line 1230, in __init__
> self._feed(isHTML=isHTML)
> ERROR:planet.runner: File "/usr/lib/pymodules/python2.5/BeautifulSoup.py",
> line 1263, in _feed
> self.builder.feed(markup)
> ERROR:planet.runner: File "/usr/lib/python2.5/HTMLParser.py", line 108, in
> feed
> self.goahead(0)
> ERROR:planet.runner: File "/usr/lib/python2.5/HTMLParser.py", line 148, in
> goahead
> k = self.parse_starttag(i)
> ERROR:planet.runner: File "/usr/lib/python2.5/HTMLParser.py", line 226, in
> parse_starttag
> endpos = self.check_for_whole_start_tag(i)
> ERROR:planet.runner: File "/usr/lib/python2.5/HTMLParser.py", line 301, in
> check_for_whole_start_tag
> self.error("malformed start tag")
> ERROR:planet.runner: File "/usr/lib/python2.5/HTMLParser.py", line 115, in
> error
> raise HTMLParseError(message, self.getpos())
>
> Regards,
> Matteo
- Sam Ruby