Steve Kemp's Blog Writings relating to Debian & Free Software

Putting the finishing touches to a nodejs library

Fri, 11 Apr 2014 12:32:38 GMT

For the past few years I've been running a simple service to block blog/comment-spam, which is (currently) implemented as a simple JSON API over HTTP, with a minimal core and all the logic in a series of plugins.

One obvious thing I wasn't doing until today was paying attention to the anchor-text used in hyperlinks, for example:

  <a href="http://fdsf.example.com/">buy viagra</a>

Blocking on the anchor-text is less prone to false positives than blocking on keywords in the comment/message bodies.

Unfortunately there seem to exist no simple nodejs modules for extracting all the links, and associated anchors, from a random Javascript string. So I had to write such a module, but .. given how small it is there seems little point in sharing it. So I guess this is one of the reasons why there often large gaps in the module ecosystem.

(Equally some modules are essentially applications; great that the authors shared, but virtually unusable, unless you 100% match their problem domain.)

I've written about this before when I had to construct, and publish, my own cidr-matching module.

Anyway expect an upload soon, currently I "parse" HTML and BBCode. Possibly markdown to follow, since I have an interest in markdown.

| No comments

 

 

Spiral Logo

Search

Recent Posts

Recent Tags

Links

RSS Feed

  • Subscribe to feed