The Inevitability of Spam
Blog spam (blam), search engine spam (spamdexing), fax spam, Wikipedia spam, Second Life spam, instant messenger spam (spim) - you name an electronic form of communication, and I’ll show you the spam that invades it.
Taking into account the following variables:
- Overhead: The up-front costs of electronic spamming, including bandwidth, developing or acquiring an email/wiki/blog spam tool, taking over or acquiring a host/zombie, etc.
- TransactionCost: The incremental cost of contacting each additional recipient once a method of spamming is constructed.
- Risks: The chance and severity of legal and/or public reactions, including damages and punitive damages, expressed per recipient.
- Damage: The impact on the community and/or communication channels being spammed (see Newsgroup spam).
- Benefit: The expected profit from each recipient who responds to the spam.
- ConversionRate: The chance that someone who is spammed responds and adds to your Benefit total.
- AudienceSize: The number of recipients the spam reaches.
Risks*AudienceSize + Overhead + TransactionCost*AudienceSize
(is greater or less than)
Benefit*ConversionRate*AudienceSize
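As a rough sketch, the comparison above can be written out in a few lines of Python. The function names and every number here are made up for illustration; only the variable names come from the list above.

```python
def spam_costs(overhead, transaction_cost, risks, audience_size):
    """Left-hand side: total expected cost of a spam run."""
    return risks * audience_size + overhead + transaction_cost * audience_size


def spam_revenue(benefit, conversion_rate, audience_size):
    """Right-hand side: total expected return of a spam run."""
    return benefit * conversion_rate * audience_size


def spam_pays_off(overhead, transaction_cost, risks,
                  benefit, conversion_rate, audience_size):
    """True when the expected return exceeds the expected cost."""
    return (spam_revenue(benefit, conversion_rate, audience_size)
            > spam_costs(overhead, transaction_cost, risks, audience_size))


# Hypothetical numbers, chosen only for illustration.
example = dict(overhead=500.0, transaction_cost=0.0, risks=0.0,
               benefit=20.0, conversion_rate=0.0001)
small = spam_pays_off(audience_size=10_000, **example)
large = spam_pays_off(audience_size=1_000_000, **example)
print(small, large)
```

With these invented figures the same spam tool loses money on the small audience and turns a profit on the large one, which is the whole point of the inequality.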
It gets interesting when, as in nearly every form of electronic communication, TransactionCost quickly approaches zero. Anonymity is also easy, so Risks are minimized as well. With both terms near zero, dividing through by AudienceSize gives:
Overhead/AudienceSize <> Benefit*ConversionRate
So, as soon as any electronic community gets large enough to outweigh the initial spam-tool overhead, it will invariably fall prey to spam. Slightly pessimistic of me, I know, but I don’t see any alternatives: machine techniques for spam filtering continue to fight an arms race with spam distributors, with no clear end on the horizon. The only hope (that I can see) is CAPTCHAs offsetting the formula by keeping TransactionCost above zero, imposing a “cost” of time/attention/thinking on every communication.
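To make the CAPTCHA argument concrete, here is a minimal sketch that rearranges the full inequality for AudienceSize and shows how a nonzero TransactionCost moves the break-even point. The helper name `break_even_audience` and all of the figures are assumptions for illustration, not measured data.

```python
import math


def break_even_audience(overhead, benefit, conversion_rate,
                        transaction_cost=0.0, risks=0.0):
    """Approximate smallest audience at which spam pays, or None if it never does.

    Rearranges
        Risks*A + Overhead + TransactionCost*A < Benefit*ConversionRate*A
    into
        A > Overhead / (Benefit*ConversionRate - TransactionCost - Risks).
    """
    margin = benefit * conversion_rate - transaction_cost - risks
    if margin <= 0:
        # Every extra recipient loses money, so no audience is big enough.
        return None
    return math.floor(overhead / margin) + 1


# Hypothetical figures, for illustration only.
free_channel = break_even_audience(overhead=500.0, benefit=20.0,
                                   conversion_rate=0.0001)
cheap_captcha = break_even_audience(overhead=500.0, benefit=20.0,
                                    conversion_rate=0.0001,
                                    transaction_cost=0.001)
costly_captcha = break_even_audience(overhead=500.0, benefit=20.0,
                                     conversion_rate=0.0001,
                                     transaction_cost=0.003)
print(free_channel, cheap_captcha, costly_captcha)
```

In this toy setup a small per-message cost roughly doubles the audience a spammer needs, and a per-message cost above Benefit*ConversionRate means no community is ever large enough to be worth spamming.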
Which brings us to the facetiously named
Hill’s Law
Any online community’s total value = the cost of injecting spam into the system * amount of spam in the system * ? (a constant I just made up)
Put another way: A new and potentially more accurate way to estimate a startup’s market worth as it vies for VC money is through the sum of spim, blam, spaSMS, spamdexing, bots, farmers, phishers and offshore traders.
The modestly named “Hill’s Law” came to mind during talks with internet companies of various sizes about their relative problems with spam, while observing how the problem’s magnitude (IMHO) tracked with the community’s market value. An example is Yahoo’s recent addition of CAPTCHA verification to its online chat service, a common way to reduce spam in the transaction-cost-free communication environment of “Web 2.0” communities. Combine this with the going rate for CAPTCHA-cracking-style, low-latency OCR work (human powered? Who cares!) through something similar to Amazon Mechanical Turk, and you’ve got a precise market metric for exactly how much it is worth to spammers to infiltrate a chat room and push unwanted ads to the room and a known number of viewers, a number I’m sure they’d be none too happy about revealing. I think this is a better metric than the CPM or CPA cost of advertising, because the spam has to go through the same hoops that each user does when communicating.
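One way to read that metric, sketched very loosely: if spammers keep paying a going rate per solved CAPTCHA in order to post, each delivered message must be worth at least that much to them, and dividing by the number of viewers gives a floor that can be set next to an ordinary CPM. The function names, the one-solve-per-message default, and the prices below are all hypothetical.

```python
def implied_value_per_message(price_per_solve, solves_per_message=1):
    """Lower bound on what one delivered spam message is worth to the spammer."""
    return price_per_solve * solves_per_message


def implied_cpm(price_per_solve, viewers_per_message, solves_per_message=1):
    """The same floor expressed per thousand viewers, for comparison with ad CPM."""
    if viewers_per_message <= 0:
        raise ValueError("viewers_per_message must be positive")
    per_viewer = (implied_value_per_message(price_per_solve, solves_per_message)
                  / viewers_per_message)
    return per_viewer * 1000


# Made-up inputs: a price per solved CAPTCHA and a chat room of 40 viewers.
floor_cpm = implied_cpm(price_per_solve=0.01, viewers_per_message=40)
print(floor_cpm)
```

This only gives a floor, since a spammer who pays the going rate is presumably earning more than that per message.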
For any given form of electronic (or low transaction cost) communication that provides the backbone of a Web 2.0 community:
- … is there a range of CAPTCHA difficulty/human-only barrier placed on the communication choke points where (economically speaking) it isn’t worth it for any spammers to outsource or manually crack through, but is “worth it” for the general users of the service to put up with?
- … to what extent does placing these added restrictions on the end users’ experience squeeze down on the users’ tolerance for the prevalent advertising based business model?
- … and finally, can this squeeze be offset by tricks like reCAPTCHA, actually putting those brain cycles to work and recouping value from the spammers?
Originally Posted: June 28th, 2007