Counting Cats in Zanzibar

Spam

You reckon blogging is easy. You reckon it’s just being a pub gobshite with a laptop? Well, this morning I had to wade through the spam filter. This means stuff like this….

Everyday this unique post is actually totaly unrelated as to the I had been seeking google and yahoo designed for, nevertheless it really was in the past found over the initially website. I reckon that your accomplishing one thing suitable if perhaps The search engines wants you enough that will put yourself on page 1 of an no similar research.

For the record (and we have had complaints) this is how commenting here works. We have a spambot. This filters spam. Just that. It doesn’t search for “inappropriate content”. Nobody’s comment is ever held up or “censored” for that reason. Now if your IP is a first-time caller you also go in the queue. If it takes a while to get on the site then that’s how we work. It is purely to avoid the Viagra salesmen or people bizarrely offering me breast enlargement (my wife gets spam offering to enhance her penile size). It is nothing personal and more to the point we have other things to do. But we will get to you, eventually. Your comments make this site.

2 Comments

  1. Roue le Jour says:

    OK, so is that text computer translated or computer generated? I think translated because computers are smarter than that.

  2. David Gillies says:

    It’s purely machine generated. It simply contains keywords with a certain probabilistic envelope in an attempt to fool the search engines’ matching criteria. A good way to train a spam filter is to run garbage text like this against it. You use a thing called a Markov sequence generator, which generates a random string of words or phrases with a ‘memory’. It can be remarkably powerful at defeating filters. I even wrote a modified Markov generator while I was doing my PhD which used a list of buzz-phrases cribbed from a list someone had stuck up on a board at BAe. It generated a command-line configurable number of paragraphs and sections of LaTeX, formatted like a real research paper, all of it completely grammatical English, and completely content-free. For a laugh we submitted one and it got all the way to the head of department before anyone said, “hang on, this is a load of cobblers.” It was a bit like the Post-Modernism Generator. Wish I still had the source code.
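
For the curious, here is roughly what a Markov sequence generator of the kind described above looks like. This is a minimal sketch, not the commenter's lost code and certainly not what any real spammer runs: the function names `build_chain` and `generate`, the `order` parameter (how many words of 'memory' it keeps) and the `seed` argument are all made up for illustration. It works on whole words, and it assumes you feed it some ordinary text to imitate.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Walk the chain, picking each next word at random from the recorded followers."""
    if not chain:
        raise ValueError("chain is empty; the source text was too short")
    rng = random.Random(seed)
    # Start from a randomly chosen state (sorted so a fixed seed is repeatable).
    state = rng.choice(sorted(chain))
    order = len(state)
    output = list(state)
    while len(output) < length:
        # The 'memory': only the last `order` words decide what comes next.
        followers = chain.get(tuple(output[-order:]))
        if not followers:
            break  # dead end: this run of words only appeared at the end of the source
        output.append(rng.choice(followers))
    return " ".join(output[:length])


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat ran to the mat"
    print(generate(build_chain(sample, order=1), length=12))
```

Because every word is chosen only from words that really did follow the previous ones in the source, the output tends to look locally plausible while meaning nothing at all, which is much the flavour of the spam quoted in the post. A longer `order` gives more fluent gibberish but copies the source more closely.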
