When volume becomes high enough you start to observe patterns in SPAM pretty easily. I think that this is primarily because people like to see patterns, whether they are present or not.
The trick is determining whether they are real patterns or not, and then to a lesser extent whether they are useful patterns.
For example, I host mail for a business domain. That means that incoming messages come primarily from existing customers, and very rarely from potential new ones.
In practice that means email is expected to arrive between 9am and 6pm (+/- 2 hours). Email received at 2am? Either it is somebody working remotely, a foreign contact, or, much more likely, it is SPAM.
Now clearly you cannot dump all messages received at unusual times of the day, but it is a surprisingly robust SPAM indicator for that particular domain.
All heuristics are fallible, but some are useful regardless.
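As a rough illustration of the time-of-day check described above, here is a minimal Python sketch. The function name and the constants are hypothetical, the window is simply 9am to 6pm widened by two hours on each side, and it assumes the caller has already converted the receipt time to the domain's local timezone.

```python
from datetime import datetime, time

# Assumed window: 9am-6pm, widened by two hours on each side.
EXPECTED_START = time(7, 0)
EXPECTED_END = time(20, 0)


def arrived_at_unusual_hour(received: datetime) -> bool:
    """Return True if a message arrived outside the expected local window.

    `received` is assumed to already be in the domain's local time.
    """
    return not (EXPECTED_START <= received.time() <= EXPECTED_END)
```

A True result is only a hint, not a verdict: it marks a message as worth a closer look rather than something to discard outright.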
I'd love to know what people can learn from their SPAM. This week I'm handling approximately 80,000 messages a day, per MX, which isn't huge (i.e. 2-3 million a month).
ObQuote: Highlander
Long before I had my blog, I used my LiveJournal to write occasional blurbs about technology.
At one point, I lost almost all of my spam. That was freaky. Of course, it came back later. Sorry the graph isn't there on that post anymore, but that host is long gone.
I've not used spamassassin, but given its extensible nature I'm sure it'd be very easy to add a point based on the time of the day.
I'd expect you'd want to define a symbol such as OUTSIDE_BUSINESS_HOURS for mails sent after 7pm or before 8am, or something similar to that.
By itself that wouldn't be a useful thing to do. I guess you'd want it to be used on a per-domain basis, and only in combination with other tests.
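To make the point-based idea concrete, here is a small Python sketch. It is not SpamAssassin's rule syntax or API, just plain Python showing the logic. The domain name, weights, threshold, and the OTHER_TEST symbol are all made up for illustration; only OUTSIDE_BUSINESS_HOURS and the 8am/7pm boundaries come from the suggestion above.

```python
from datetime import datetime

# Hypothetical per-domain settings: only listed domains get the time test.
# Values are local business hours as (start_hour, end_hour).
BUSINESS_HOURS = {"example.com": (8, 19)}

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"OUTSIDE_BUSINESS_HOURS": 1.0, "OTHER_TEST": 4.5}
THRESHOLD = 5.0


def outside_business_hours(domain: str, received: datetime) -> bool:
    """True if the domain opts in and the mail arrived before 8am or from 7pm on."""
    hours = BUSINESS_HOURS.get(domain)
    if hours is None:
        return False  # the test is disabled for this domain
    start, end = hours
    return not (start <= received.hour < end)


def score(domain: str, received: datetime, other_symbols) -> float:
    """Sum the weights of every symbol that fired for this message."""
    symbols = set(other_symbols)
    if outside_business_hours(domain, received):
        symbols.add("OUTSIDE_BUSINESS_HOURS")
    return sum(WEIGHTS.get(symbol, 0.0) for symbol in symbols)
```

With these made-up numbers, a 2am message to example.com scores 1.0 on its own, well under the threshold, and only crosses it when another test fires as well.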
http://www.spam.com/legal/spam/
The "spam" vs "SPAM" battle is one that is lost.
I'll refer to email-based spam as SPAM, unless or until I'm forced to no longer do so.