Patch: more control over filtering

Chris Dolan chris at chrisdolan.net
Tue Jan 17 07:39:53 EST 2006


Currently, planet allows positive or negative regex matches on the
title and content fields via the "filter" and "exclude" directives.
Attached is a patch (against
jdub at perkypants.org--projects/planet--devel--1.0--patch-9) that
offers more control over the fields you can filter on.  For example,
I'd like to say:

filter = language =~ /^en/
          author =~ /Chris/
exclude = title =~ /foo/
           any =~ /spam/

The syntax allows line continuations to permit more than one filter.   
Note that the "=~" and the "/" delimiters trigger the new syntax.  If  
they aren't found, then the old syntax is assumed (that is, the whole  
value is a regexp that applies to content and title).  The possible  
keywords are any of the properties of the NewsItem class, namely: any  
(which means just title+content right now), title, link, summary,  
content, author, category, etc.

The complete syntax is:

    keyword := [a-z]+
    pattern := .*
    match   := keyword \s* =~ \s* /pattern/ | pattern
    filter  := match ( \n \s* match )*
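To make the grammar concrete, here is a rough Python sketch of how such a value could be parsed and applied.  It is not the code from the attached patch: the names parse_filter and item_matches are made up for illustration, a plain dict stands in for NewsItem, and treating multiple lines as "any rule may match" is an assumption, since the syntax above does not say how rules combine.

```python
import re

# Hypothetical sketch; none of these names come from the attached patch.
# New syntax: keyword =~ /pattern/
NEW_SYNTAX = re.compile(r"^([a-z]+)\s*=~\s*/(.*)/$")


def parse_filter(value):
    """Split a config value into a list of (keyword, compiled_regex) rules."""
    rules = []
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        m = NEW_SYNTAX.match(line)
        if m:
            keyword, pattern = m.group(1), m.group(2)
        else:
            # Old syntax: the whole line is a regexp over title+content.
            keyword, pattern = "any", line
        rules.append((keyword, re.compile(pattern)))
    return rules


def item_matches(item, rules):
    """Return True if at least one rule matches (assumed OR semantics).

    `item` is a plain dict standing in for a NewsItem.
    """
    for keyword, regex in rules:
        if keyword == "any":
            text = item.get("title", "") + "\n" + item.get("content", "")
        else:
            text = item.get(keyword, "")
        if regex.search(text):
            return True
    return False
```

A value without "=~" and "/" falls through to the old behaviour, so parse_filter("spam") yields a single "any" rule.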

I'm a newcomer to Python, so there are probably better ways to do  
some of this...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: filter.patch
Type: application/octet-stream
Size: 5030 bytes
Desc: not available
Url : http://lists.waugh.id.au/archives/devel/attachments/20060116/500ff533/filter.obj
-------------- next part --------------

Chris

P.S. the NewsItem class currently does not support the "language"  
property, which is used in the example above.  This appears to be an  
oversight.  If it existed, I would use it to filter multi-lingual  
blog feeds to show just the posts that I can read.  I'd be thrilled  
if someone else would improve NewsItem to support <dc:language> tags
in <item> entries, as shown in this example RSS:
http://glazman.org/weblog/dotclear/rss.php
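
For anyone curious what reading that tag involves, here is a small standalone sketch using only the Python standard library.  It does not touch NewsItem or the feed parser planet uses; the function name item_languages is made up, and it assumes an RSS 2.0 style feed where <item> itself has no namespace and dc: is bound to the usual Dublin Core namespace.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def item_languages(rss_text):
    """Return a (title, language) pair for every <item> in the feed.

    Hypothetical helper; language is None when the item carries no
    <dc:language> element.  Assumes un-namespaced <item> elements.
    """
    root = ET.fromstring(rss_text)
    pairs = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        language = item.findtext("{%s}language" % DC_NS)
        pairs.append((title, language))
    return pairs
```

With something like this exposed as a "language" property, a rule such as language =~ /^en/ would have a value to match against.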

--
Chris Dolan, Software Developer, http://www.chrisdolan.net/
Public key: http://www.chrisdolan.net/public.key
vCard: http://www.chrisdolan.net/ChrisDolan.vcf
Planet: http://www.chrisdolan.net/planet/
Arch repository: chris at chrisdolan.net--2004
Arch mirror: http://www.chrisdolan.net/arch/



