slash2mail: a slashdot to email gateway
Download
slash2mail.pl
What:
slash2mail is a slashdot-to-email gateway that polls the
slashdot webpage every few hours, and
mails out new stories and (optionally) score5 comments, all nicely mail-threaded.
[Screenshot: a sample mail sent by slash2mail. This picture is worth 163 words, exactly.]
[Screenshot: the end of that mail.]
Why:
For a number of reasons:
- IMHO it is considerably nicer to have new stories trickle into your
mail folder than to hit the site every once in a while and plod through its
entire length to locate new stories and, more importantly, new score5
comments (which are usually worth reading). On the web you typically end up
missing quite a few comments.
- I've often wanted to save a gem of a slashdot posting or comment, but
cutting and pasting it into a file, and then remembering where I kept that
file, rapidly becomes a headache. Mail folders are so handy.
- I prefer mutt mail threading (set
sort=threads and set sort_aux=last-date-received) to slashdot comment
threading.
- Slashdot typically has the following (changeable) thresholds set:
  - There is a maximum limit on the number of stories that show on the
    cover page, which is quite small. So good comments on slightly older
    stories (which probably took their time to bubble up to score 5) often
    go unnoticed, because the story has moved off the front page.
  - Anything beyond the first paragraph of a story gets pushed out to
    "somany bytes in body", and extra clicks become necessary.
  - There is a painful limit set on the length of a comment, beyond which
    it truncates and continues in the Read More. With mail, a single longer
    mail for a long comment doesn't hurt, since it can be easily deleted.
  - Comments switch to index mode, which is desirable on the web and yet a
    hindrance; a mail gateway can afford to do away with it.
Sure, you can set them all to infinity, but then the webpage gets quite
painful to traverse. No such hassles exist in the mail world, so there is a
special "/2mail" account created on slashdot for this purpose with suitable
preferences.
Why not:
- The primary disadvantage is that external links are
cumbersome to click on, and slashdot often abounds with these. An external
url grabber and browser starter can alleviate this problem. The pine pager
lets you traverse the body of mails url-by-url and select one to launch in an
external browser, so it's not too bad.
- Posting isn't implemented yet, but replying to authors is handier this way.
Caveats:
- Note that this script doesn't bother fetching any ad gifs, so those
slashdot guys don't like us by default.
- To avoid overuse/misuse/abuse, avoid running this script more than once
every few hours - six should be optimal - else a bot-detector on slashdot
will happily ban your IP address, and eventually your ISP, for evermore.
This bit is official (from the Taco man himself), so heed.
- You CANNOT redistribute content fetched in this manner, since
slashdot material is copyright the respective authors. So if you run this
script, you can only have yourself in the $mailto; in particular, you cannot
use this to drive a mailing list.
Observed traffic:
Moderately high. About 15 stories in one day
(today), and about 60 score5 comments. I tell myself that I'd be reading
all these anyway, only more painfully via the web. And you can turn off
comments if you feel 75 mails a day is too much.
Mechanisms:
It runs via a cron job, set to launch every six hours. The common-case overhead
is very small: it only needs to fetch a barebones version of the main
slashdot page and perform one pass over it. If there are any new stories
or new comments (detected by changes to the m portion of the
"m of n comments" thing), it fetches the corresponding comments
page, extracts all the relevant stuff, and sends mail to the designated
recipients. This overhead is typically a few seconds. It generates
suitable Message-ID and In-Reply-To headers, so that comments thread as
expected. This has a side-effect: the script has to run sendmail on the
local machine; you cannot relay through an external mail server, as these
like to set their own Message-IDs on relayed mails.
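The threading trick can be sketched roughly as follows. This is a minimal illustration in Python, not the script's own Perl: the Message-ID format, the slash2mail.invalid domain, the notion of a parent comment id, and the function names are all assumptions made up for the example. The idea is that each Message-ID is derived from the story and comment ids, so a comment can name its parent in In-Reply-To without any lookup.

```python
from email.message import EmailMessage

# Hypothetical domain used only to make the Message-IDs well-formed.
DOMAIN = "slash2mail.invalid"


def message_id(story_id, comment_id=None):
    """Derive a deterministic Message-ID from the ids (assumed format)."""
    local = f"story.{story_id}"
    if comment_id is not None:
        local += f".comment.{comment_id}"
    return f"<{local}@{DOMAIN}>"


def build_mail(subject, body, story_id, comment_id=None, parent_comment_id=None):
    """Build one mail; comments point at their parent via In-Reply-To."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["Message-ID"] = message_id(story_id, comment_id)
    if comment_id is not None:
        # A top-level comment replies to the story itself;
        # a nested comment replies to its parent comment.
        msg["In-Reply-To"] = message_id(story_id, parent_comment_id)
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    print(build_mail("Re: Some story", "A comment.", 42, comment_id=7))
```

A threading mail reader such as mutt then hangs each comment under the mail whose Message-ID matches its In-Reply-To, which is why the ids must survive delivery unchanged.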
In case the site hangs, there is a 30-second timeout on
each slashdot wget, and a 3-second timeout
on each gethostbyname.
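One way to impose that kind of timeout on an external fetch, sketched in Python under some assumptions: the helper names are made up, and the real script may well enforce its limit differently (for instance through wget's own options). Here the parent process simply gives up on the child after the deadline.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its stdout, or None if it fails or overruns the timeout."""
    try:
        done = subprocess.run(cmd, capture_output=True, timeout=seconds, check=True)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # Timed out, exited non-zero, or the program is not installed.
        return None
    return done.stdout


def fetch(url, seconds=30):
    """Fetch a page with wget (-q: quiet, -O -: write the page to stdout)."""
    return run_with_timeout(["wget", "-q", "-O", "-", url], seconds)


if __name__ == "__main__":
    # A child that sleeps longer than the limit is abandoned.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, 0.5))
```

Returning None on any failure lets a cron-driven run skip this round quietly and try again at the next one.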
It pipes the body of stories and comments through
lynx -dump, so as to generate a `References' list of urls. The ">" at the
beginning of the Author and References lines is for the mutt pager:
skip-quoted while viewing the mail jumps over to the body, which I find
convenient.
Installation:
You'll need perl, wget and lynx. This program is
one 325-line perl script, hacked up in one afternoon. It uses the Mail::Sendmail
and MIME::Base64 CPAN modules; you can download them from www.cpan.org.
It uses a DBM file ~/.slashdot.db to store a history of read stories.
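To make the role of the history file concrete, here is a small Python sketch of the "has the m in m of n comments changed?" check. It is only an illustration: the regular expression, the key layout, and the function names are assumptions, and the real script's DBM records may look nothing like this.

```python
import re

# Matches text such as "3 of 120 comments"; the exact page wording is assumed.
COUNT_RE = re.compile(r"(\d+)\s+of\s+(\d+)\s+comments")


def shown_count(text):
    """Return m from "m of n comments", or None if the text has no such count."""
    match = COUNT_RE.search(text)
    return int(match.group(1)) if match else None


def needs_fetch(history, story_id, text):
    """True if the story is new or its m changed; records the new m."""
    m = shown_count(text)
    if m is None:
        return False
    key = story_id.encode()
    value = str(m).encode()
    if key in history and history[key] == value:
        return False  # nothing new since the last run
    history[key] = value
    return True


if __name__ == "__main__":
    history = {}  # a plain dict stands in for the DBM file here
    print(needs_fetch(history, "story1", "3 of 120 comments"))
    print(needs_fetch(history, "story1", "3 of 150 comments"))
```

Any mapping with bytes keys and values will do for `history`, so an object from Python's `dbm.open("history.db", "c")` (a hypothetical file name) could replace the dict to persist state between cron runs. Note that only m matters: a change in n alone means new low-scored comments, which are not mailed.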
Toss the script into /usr/local/bin or something, and edit it; you might
want to change some variables to your liking. In fact, you'll definitely
want to edit the recipients line, at least. Then add this line to your crontab
(you can edit user crontab files with the command crontab -e):
0 0,6,12,18 * * * /usr/local/bin/slash2mail.pl
That should do it. In the next few hours, you'll get an initial flurry of mail to
slurp up everything that's currently on slashdot, then it'll trickle down to
one new mail whenever a new story or comment shows up.
TODO:
- Expire ~/.slashdot.db entries after a week or something
- Posting comments (eventually)
- non-5 threshold, but I doubt anyone would want that much mail.
- many recipients, but divided into only-story and story-and-comments
Licence:
This script is distributed under the BSD license.
Since I don't like the license taking up more than half the script, I'll
choose to simply link to the general template, and leave the inclusion to
whoever cares :) fwiw OWNER="Sitaram Iyer", ORGANIZATION="", YEAR=2000.
FreeBSD note:
The default wget on FreeBSD has one problem: those smart guys decided that
the ampersand "&" character is unsafe for general consumption, so they
escape it out into "%26". This causes unbelievable problems with GET
requests that have CGI arguments. Fixing this involves rebuilding the port
without its patches:
$ cd /usr/ports/ftp/wget; mkdir x; mv patches/* x; make && make install
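To see why the escaping hurts, here is a short Python illustration. The URL and its parameter names are hypothetical, chosen only to have two CGI arguments. Once "&" becomes "%26", the server no longer sees two arguments but one argument with a mangled value.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL with two CGI arguments.
url = "http://example.invalid/comments.pl?sid=123&threshold=5"

# What an over-eager escaper would send instead.
escaped = url.replace("&", "%26")


def query_args(u):
    """Parse the query string the way a CGI script would see it."""
    return parse_qs(urlsplit(u).query)


if __name__ == "__main__":
    print(query_args(url))      # two separate arguments
    print(query_args(escaped))  # one argument swallowing the other
```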
Sitaram Iyer, 01jun2000