Spring, and Return of the Botweed

Spring has come to Changing Way, with new content blooming. But weeds, in the form of spam comments, are also appearing, and Akismet isn’t a completely effective weedkiller. I’ll help it by manually marking the comments in question as spam.

I hope that the change of seasons is going well for you.

[Photo: Teahouse in Spring 3]

I took the photo about a year ago, at Brookside Gardens.

What Costumes Do Spam Villains Wear?

I just marked a couple of comments as spam. They were from “Mortgage Man” and “Credit Guy.” Each linked to the same web site. Which raised the questions:

  • What other spam villains lurk in the same lair? Bankruptcy Boy? Loan Lad? Hedge Fund Hombre? Refinance Girl?
  • Why didn’t Akismet mark this nonsense as spam? Because it’s not the most blatant example of spam, and not enough people had previously warned Akismet about that particular nest of spam villains?

Hello Spam

I blog (and I allow comments), therefore I get spammed. I have many blogs, some of which I set up for test purposes and don’t use much. The “typical” Andrew blog uses WordPress and the Akismet spamfighting service.

It seems as though posts with a title starting with “Hello” attract a disproportionate amount of spam. This of course includes the “Hello World” post that comes with every new WordPress blog.

I wonder if:

  • Hello posts are targeted by spambots?
  • They have characteristics targeted by spambots?
  • Spamfighting services are particularly suspicious of comments on Hello posts?

Comment Systems and the Spam in the Sandwich

What does a comment system for a blog or other web site actually do? Let’s think about what needs to happen when you read a blog post and leave a comment. The system needs to:

  1. Display existing comments (or some subset of them or information about them).
  2. Allow entry of a new comment.
  3. Validate the comment. For example, has the commenter provided an email address?
  4. Assess the spam-ness or otherwise of the comment. This may involve a captcha.
  5. Store the comment as appropriate, depending on whether it is spam, requires moderation, or immediately joins the ranks of approved comments.
  6. If necessary, notify the admin of the action taken.

The comment system actually needs to do more than this: provide the admin with access to the moderation queue, for example. But I want to focus on the six-layer sandwich described above, and regard the admin interface as chips (or crisps) served to the side of the sandwich.
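The six layers can be sketched in a few lines of Python. This is only an illustration, not how WordPress or any other system implements comments: the names CommentSystem, Status, and toy_spam_check are made up for the example, and the validation and spam rules are deliberately trivial.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Status(Enum):
    APPROVED = "approved"
    PENDING = "pending"      # waiting in the moderation queue
    SPAM = "spam"
    REJECTED = "rejected"    # failed validation, never stored


@dataclass
class Comment:
    author_email: str
    body: str
    status: Optional[Status] = None


@dataclass
class CommentSystem:
    # Step 4 is passed in, so the spam check can be built in or a separate service.
    spam_check: Callable[[Comment], Status]
    stored: List[Comment] = field(default_factory=list)
    notifications: List[str] = field(default_factory=list)

    def display(self) -> List[Comment]:
        """Step 1: show only the approved comments."""
        return [c for c in self.stored if c.status is Status.APPROVED]

    def submit(self, author_email: str, body: str) -> Status:
        comment = Comment(author_email, body)             # step 2: entry
        if "@" not in author_email or not body.strip():   # step 3: validation
            return Status.REJECTED
        comment.status = self.spam_check(comment)         # step 4: spam assessment
        self.stored.append(comment)                       # step 5: store with its status
        if comment.status is not Status.APPROVED:         # step 6: notify the admin
            self.notifications.append(f"{comment.status.value}: {author_email}")
        return comment.status


def toy_spam_check(comment: Comment) -> Status:
    """Hypothetical stand-in for a real spam filter."""
    text = comment.body.lower()
    if "mortgage" in text:
        return Status.SPAM
    if "http://" in text or "https://" in text:
        return Status.PENDING
    return Status.APPROVED
```

Because the spam check is handed to the comment system rather than written into it, swapping one filter for another leaves the other five layers untouched.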

Having asserted that a comment system has those six layers, I want to focus on four ways in which it can be implemented. The comment system can be part of a larger system; for example, WordPress Classic (WPC) includes all six layers, as well as a whole bunch of other stuff. In an attempt to be clear, I’ll note that WPC refers to self-hosted WordPress.

I’ll turn to a table to highlight the contrasts between the four cases, and I’ll continue to use concrete examples. I’ll stick with WordPress for the examples; that said, the points I want to make aren’t WordPress-specific.

| | Spam filter is not a separate plugin | Spam filter is a separate plugin |
|---|---|---|
| Comment system is not a plugin | WordPress Classic (WPC) unplugged (i.e. with no plugins) | WPC with Akismet plugin; WordPress.com, which uses Akismet |
| Comment system is a plugin | WPC with Disqus plugin | ? |

The other cell of the top row represents the use of a plugin to handle step 4 (assess the spam-ness). There are many such plugins. A previous post focused on four of them. One of them is Akismet, which handles spam at the hosted blogging service WordPress.com.

Moving to the second row reflects the replacement by a plugin, not just of step 4, but of all the comment system steps. Disqus provides such a plugin; in fact, I just started using it at my WordPress test blog.

I know of no example for the last cell of the table: hence the ? The cell would be of interest to a blog admin whose preferred spam plugin is Akismet, but who also wants Disqus features such as a cross-site discussion community.

The idea of combining a comment plugin with a spam plugin is a little tricky. It’s probably tricky in technical terms: if Disqus ever invokes Akismet, it will probably use the Akismet API rather than the plugin.

The business trickiness is about revenue sharing. If a comment service invokes a spam service, and each service wants to make money, how should the money be divided? I believe that these tricky issues will be addressed. Disqus may hold to its own spam fighting. But, if it does, it will present an opportunity to competitors willing to work with specialized spam services.

Two Four-Letter Words: Spam and Free

Spam is, for many of us, the worst aspect of Web 2.0. The threat of spam of course creates a need, and hence an opportunity, for spam-fighting services. Last week, I compared four of them: Akismet, Defensio, Mollom, and TypePad AntiSpam. The comparison was prompted by the launch of the last of these (the list, like the comparison table in the previous post, is in order of launch date).

TPAS is interesting, not just because it is the most recent, but because it has claims to be the most free. I use the plural claims because TPAS seems to make that claim with respect to each sense of the word free: free of charge (gratis) and free (libre, open source) software.

In this post, I’ll extend the comparison between the four services with respect to each sense of free. First, free of charge. The last two lines of the comparison table refer to this kind of free. The first of these lines shows that each of the four services is free for personal use.

The last line of the table asks whether each service is free for commercial use. It answers “Yes” for TPAS, and “No” for each of the other services. Following some email exchanges and some thinking, it seems that the pricing issue needs clarification.

Akismet has multiple levels of commercial API key. For example, a problogger key is $5/month. Given that a problogger is defined for this purpose as one who makes more than $500/month, the cost seems reasonable (but then, I’m not a problogger). That an enterprise key starts at $50/month also seems reasonable (but then, I’m not an enterprise).

Defensio is free for commercial use up to a limited amount of traffic. That’s a paraphrase of an email. Defensio.com is down at the moment. I don’t know whether that means that the service is down.

Mollom currently describes its future pricing model as follows.

The basic Mollom service will be free… but it will be limited in volume and features… Our goal is to make sure that the free version of Mollom goes well beyond meeting the needs of the average site…

For large and mission-critical business and enterprise websites, we will offer commercial subscriptions. We are currently working out our commercial pricing scheme for access to more advanced features, unlimited traffic, enhanced performance, reliability and support.

TPAS, per its FAQ, “is free, and will always be free, regardless of the number of comments your blog receives.” The FAQ also addresses how Six Apart will support the service; the firm “may choose to provide enterprise-class services on top of TypePad AntiSpam at some point in the future.”

TPAS is the outlier on this “free as in beer” issue, but I now think that it’s closer to the others than I first thought and implied. Like the other three, it seeks to make money from enterprise clients (and I don’t see anything wrong with that). The difference is that it doesn’t attach the price tag to AntiSpam itself.

TPAS is also the outlier on the free software, or “free as in freedom,” issue. As I remarked in the earlier post, “while the TPAS inference engine is open, the rules are hidden.”

I wouldn’t be at all surprised to see Akismet, Mollom, or both move to a similar model. I base this on the following assumptions.

  1. Spam-fighting software has the classic intelligent system split between inference engine and rules base. In particular, Akismet and Mollom already have this architecture.
  2. The action is in the rules, which are specific to the domain of spam-fighting.
  3. Following from the above, you don’t give much away to spammers or to competitors if you free/open-source your engine.
  4. The people behind Akismet and Mollom don’t want to cede the “free high ground” to TPAS.
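The engine/rules split in the first three assumptions can be illustrated with a toy example. This is a sketch of the general idea only, not the architecture of Akismet, Mollom, or TPAS: the score function, the integer weights, and the PRIVATE_RULES list are all invented for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# A rule pairs a weight (in points) with a test applied to the comment text.
Rule = Tuple[int, Callable[[str], bool]]


def score(text: str, rules: Iterable[Rule]) -> int:
    """Generic engine: add up the weights of every rule that fires.

    Nothing here is specific to spam; all the domain knowledge lives in `rules`.
    """
    return sum(weight for weight, matches in rules if matches(text))


# Hypothetical rules base: the part a service could keep hidden
# while publishing the engine above.
PRIVATE_RULES: List[Rule] = [
    (6, lambda t: "refinance" in t.lower()),
    (3, lambda t: t.count("http") > 2),
    (2, lambda t: t.isupper()),
]
```

Publishing score tells a spammer nothing about which words or patterns are penalized; that knowledge sits entirely in the rules.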

With respect to this aspect of free (libre), as with respect to the first aspect (gratis), I may have exaggerated TPAS’ outlier status. TPAS does have a legitimate claim to being more free than its competitors in each of the two senses of free. But the gap between TPAS and, say, Akismet, may not be as great or as durable as might at first appear.

That conclusion is, of course, my opinion. Comments (or email: andrew at changingway etc.) would be a good way of telling me that you draw a different conclusion or that my conclusion is based on faulty premises or reasoning. I’d welcome other relevant comments. For example, you might know of a spam-fighting service other than the four I’ve focused on.

AntiSpam: TypePad and the Trio

There’s a new spam fighting service in town: TypePad AntiSpam. To put it another way, the spam sheriff of TypePad town is now available to lay down the law elsewhere.

TPAS competes directly with Akismet. The table compares the two spamfighting services with each other, and with two other competitors. I’ve ordered the columns from earliest to most recent (so the alphabetical order is coincidental).

| | Akismet | Defensio | Mollom | TypePad AntiSpam |
|---|---|---|---|---|
| Previous post at Changing Way? | Yes | Yes | Yes | No |
| Service offered by | Automattic | Karabunga | Mollom: shares founder with Acquia | Six Apart |
| If in doubt, challenge with CAPTCHA? | No | No | Yes | No |
| Service has own API? | Yes | Yes | Yes | No, uses Akismet API |
| Open source engine? | No | No | No | Yes |
| Free of charge for personal use? | Yes | Yes | Yes | Yes |
| Free for commercial use?* | No | No | No | Yes |

Each of the four is the odd one out in at least one sense. Akismet was first out, and remains the service against which each rival positions itself.

Defensio is the one that doesn’t share developers or an organization with a prominent publishing or content management platform (Akismet/WordPress, Mollom/Drupal, TPAS/TypePad and Movable Type).

Mollom uses CAPTCHA when unsure whether a comment is ham (the good stuff) or spam, whereas each of the others queues the suspect comment for moderation. That’s something of an oversimplification about the others: for example, a TPAS client can use CAPTCHA when told about a suspect comment by the server.
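The difference between challenging and queueing comes down to what the client does with the middle band of uncertain comments. Here is a minimal sketch, assuming the service produces a spam probability between 0 and 1; the thresholds, the function names, and the action strings are illustrative and not taken from any of the four services.

```python
from enum import Enum


class Verdict(Enum):
    HAM = "ham"
    UNSURE = "unsure"
    SPAM = "spam"


def classify(spam_probability: float,
             ham_below: float = 0.2,
             spam_above: float = 0.8) -> Verdict:
    """Map a score in [0, 1] to a verdict; the thresholds are made up."""
    if spam_probability < ham_below:
        return Verdict.HAM
    if spam_probability > spam_above:
        return Verdict.SPAM
    return Verdict.UNSURE


def next_step(verdict: Verdict, challenge_when_unsure: bool) -> str:
    """Choose what happens to the comment once a verdict is in."""
    if verdict is Verdict.HAM:
        return "publish"
    if verdict is Verdict.SPAM:
        return "mark_as_spam"
    # Only the unsure band differs between the two approaches.
    return "show_captcha" if challenge_when_unsure else "hold_for_moderation"
```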

TPAS is open source (GPL V2). I found this particularly interesting, given that the other three are not. They explain that source code access would help spammers. I then realized that while the TPAS inference engine is open, the rules are hidden.

TechCrunch is currently using TPAS via the WordPress plugin that Six Apart provide. Mike Arrington reports that TPAS is doing well so far.

Anil Dash wrote the announcement post at the Six Apart blog. TPAS also has its own blog.

Missing from the table are two of the most interesting potential comparisons: performance and market share. I suspect that we will before long see data relevant to these comparisons, and challenges to the data, and…

Update, after a few hours sleep and some further research. I made a few changes to the above.

I’d like to add that I find the name TypePad AntiSpam interesting. Or rather, I find the choice of name interesting. The name may give the impression that it’s more specific to TypePad than it really is. My guess is that Six Apart think they have a winner on their hands here, and that the success of TPAS will raise awareness and reputation for TypePad.

* Final update to this post. I decided that the last line of the table, while close to the mark, needs clarification. Hence the followup post (see the first comment to the current post).

Automattic Making Money From Other Projects

By other, I mean other than WordPress. We are almost at the end of my series of posts on Automattic, and how the firm makes money. We’ll start by noting that the firm provides a handy summary of its projects. Some of them are covered in earlier posts in this series (e.g., WordPress.com).

There are three non-WordPress projects: Akismet, bbPress, and Gravatar. (Actually, to describe them as “non-WordPress” is to simplify since, as we will see, each has firm connections to WordPress.) I find the first of these the most interesting, and I know I’m not alone in that. Akismet is an ambitious project.

Automattic Kismet (Akismet for short) is a collaborative effort to make comment and trackback spam a non-issue and restore innocence to blogging, so you never have to worry about spam again.

Although Akismet is an Automattic project and is WordPress.com’s spam cop, it is not only for WordPress blogs. The Akismet API is published so that the server can be invoked from other applications.
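To give a sense of what invoking the server from another application involves, here is a sketch in Python that builds (but does not send) a comment-check request. It assumes the comment-check call as I understand the published API: a form-encoded POST carrying fields such as blog, user_ip, and comment_content, answered with the text true or false. The key-in-the-hostname URL form, the ExampleForum user agent, and the helper names are assumptions for illustration; check the current API documentation before relying on any of them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_comment_check(api_key: str, blog_url: str, user_ip: str,
                        user_agent: str, author: str, content: str) -> Request:
    """Build, but do not send, a comment-check request."""
    # Assumed URL form: the API key appears as a subdomain.
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    fields = {
        "blog": blog_url,            # the site the comment was left on
        "user_ip": user_ip,          # the commenter's IP address
        "user_agent": user_agent,    # the commenter's browser string
        "comment_type": "comment",
        "comment_author": author,
        "comment_content": content,
    }
    return Request(
        url,
        data=urlencode(fields).encode("utf-8"),
        headers={"User-Agent": "ExampleForum/0.1"},  # hypothetical client name
        method="POST",
    )


def is_spam(response_body: str) -> bool:
    """Interpret the reply, assuming the literal text "true" means spam."""
    return response_body.strip() == "true"
```

Nothing in the request is specific to WordPress: a forum or wiki would fill in the same fields from its own submission form.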

The Akismet server is unusual among Automattic projects in that it is closed source. This seems to be the norm for spam-fighting server code: it is also true of Akismet’s rivals Defensio and Mollom.

Automattic, as a privately-held firm, is under no obligation to provide details of how much money it makes from specific projects. But Duncan Riley at TechCrunch described Akismet as Automattic’s biggest money earner. Toni, Automattic’s CEO, was quick to counter what he described as “misconceptions,” stating that Akismet is not even close to being Automattic’s biggest earner.

Direct earnings from Akismet come from commercial licenses. Indirect earnings arise from the extent to which Akismet helps convince bloggers to choose WordPress.com.

Moving on to the other other projects, bbPress is forum software. It runs the various WordPress forums. To put it another way, bbPress is the name under which Automattic released the software on which the WordPress.org support forums have been running for years. Automattic intends to offer hosted forums under the name TalkPress (rather as it offers hosted blogging at WordPress.com).

Gravatar is notable among Automattic projects for having been acquired; I believe it to be Automattic’s only acquisition so far. At the time of the acquisition, Om Malik described Gravatar as a small project that gives WordPress users the ability to add avatars to their profiles. It is clear from the Gravatar about page that there are far loftier ambitions for the project. Today, an avatar. Tomorrow, Your Identity—Online.

I’ll stop there, rather than speculate about the future of online identity. I’ll add one more post to this series: a wrapup.

Blog Software Firms Spread Their Wings

Acquia was started up by Dries Buytaert, the lead developer of the Drupal CMS, in late 2007. At the time I remarked on the similarities between Acquia and Automattic.

Now that Dries has announced Mollom, there’s a new and significant similarity. Mollom, like Automattic’s Akismet, is a spam-fighting web service. Duncan at TechCrunch reports that Akismet is the current market leader.

Here are a couple of ways in which Mollom is following the leader. In each case, the server code is closed-source, even though it comes from a firm notable for its foundations in open source. In each case, the spam-fighting service can be invoked by any client using the API: Mollom isn’t just for Drupal, any more than Akismet is just for WordPress. One of the main differences is that Mollom uses captcha, albeit only when it’s unsure whether it’s just bitten on spam or ham.

Meanwhile, Six Apart has made an acquisition that expands its range beyond blogging, albeit into a closely related domain. Mike Arrington posted a “guess the acquired firm” contest on Friday. It now has almost 400 comments: that guy really knows how to get his audience going.

It turns out that Six Apart acquired Apperceptive. Here’s how Rafat Ali described the deal.

SixApart, the blogging software firm with products like MovableType, Typepad and Vox, is now moving up the value chain into offering advertising and consulting services, and has bought New York City-based social media creative agency, Apperceptive. The financial details were not disclosed.

In case you, like me, were wondering what “social media creative agency” means, it seems to be how they say “ad network” on the mean streets of New York.

Defensio Against Comment Spam

Defensio is a spam comment blocker. Defensio boasts a few features lacking in Akismet, most notably the ability to sort your blog’s blocked comments by “spamminess”, according to Mark Hendrickson at TechCrunch. That got me interested enough to try the Defensio plugin at my WordPress test blog, but then I realized that WordPlay doesn’t get enough spam comments to provide a test of Defensio.
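Sorting by spamminess is simple to picture once each blocked comment carries a score. The sketch below is not Defensio’s code or data: the comments, the scores, and the by_spamminess helper are invented, and the choice to list the least spammy comments first is an assumption about what an admin hunting for false positives would want.

```python
from typing import List, Tuple

# (comment text, spamminess score); both are made up for illustration.
blocked: List[Tuple[str, float]] = [
    ("Nice post, see my site", 0.55),
    ("CHEAP LOANS CLICK HERE", 0.99),
    ("I disagree with point two", 0.51),
]


def by_spamminess(comments: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return a new list ordered from least to most spammy."""
    return sorted(comments, key=lambda item: item[1])
```

With this ordering, the comments most likely to be ham wrongly blocked appear at the top of the list, where the admin looks first.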

Mark referred to “a few features lacking in Akismet,” and went on to imply that Defensio’s “open API that makes the blocker available for use with non-blogging applications” is one of them. But that feature is present in Akismet, and clearly described in the FAQ: The Akismet API can be adapted for almost any application with submitted content, including forums, wikis, and contact forms.