Cruft-Free URLs
The first question/answer from Ask Arvind...
Lea writes:
Can you provide tips on how to have cruft-free URLs (a la diveintomark.org, as well as others)? I already have this employed in my blog, but a lot of others don't.
The diveintomark tutorial in question can be found here.
First of all, what is cruft? Cruft here means the file extensions on web pages, for example .php or .html. In SEO circles they are considered bad practice. Here's a (hopefully easier) method for getting cruft-free URLs. For this to work you will need to install the Short-Titles plugin (thanks Amit!)
- Blank out "File Extension for Archive Files" under Weblog Config → Preferences.
- Change the Archive File Template for Individual archives (if you have it checked) in Weblog Config → Archive Files to:
<$MTArchiveDate format="%Y/%m"$>/<$MTEntryShortTitle dirify="1" trim_to="15"$>
- Delete all old individual archives from your web directory.
- Remove the extensions for your Main Index Template and Master Archive Index.
- In your weblog folder, edit the .htaccess file, creating it if it isn't already there. If your files were *.html, add the following to .htaccess:
DefaultType text/html
DirectoryIndex index
RedirectMatch permanent /archives/(.*)\.html$ http://www.mydomain.com/archives/$1
If your files were *.php then add the following to .htaccess:
DefaultType application/x-httpd-php
DirectoryIndex index
RedirectMatch permanent /archives/(.*)\.php$ http://www.mydomain.com/archives/$1
- Rebuild all files and test it out!
Let me explain what is going on here and what you can do to improve upon this. Your web server now treats every file without an extension in your weblog directory as an HTML or PHP file, depending on what you put in your .htaccess. The rest of the world will still be linking to your crufty URLs, which is why you put in the redirect.
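If the redirect line looks cryptic, here is a rough Python sketch of what it does to an incoming path. This is only an illustration, not how Apache works internally: the function name `redirect_target` and the combined html/php pattern are my own, and the domain is the same placeholder used in the .htaccess example. The sketch escapes the dot before the extension so that it only matches a literal dot.

```python
import re

# Same idea as the RedirectMatch lines: capture everything between
# /archives/ and the old extension, then rebuild the URL without it.
# (Handling html and php in one pattern is a shortcut for this sketch.)
OLD_URL = re.compile(r"/archives/(.*)\.(?:html|php)$")

# Placeholder domain, as in the .htaccess example.
NEW_BASE = "http://www.mydomain.com/archives/"


def redirect_target(path):
    """Return the cruft-free URL for an old path, or None if no redirect applies."""
    match = OLD_URL.search(path)
    if match is None:
        return None
    return NEW_BASE + match.group(1)


# An old, crufty link gets sent to the new address...
print(redirect_target("/archives/2004/08/cruft_free_urls.html"))
# ...while a link that is already cruft-free is left alone (prints None).
print(redirect_target("/archives/2004/08/cruft_free_urls"))
```

Because the pattern only fires when the old extension is present, the new extensionless URLs never redirect to themselves.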
Lea continues:
Now here's my REAL question: I mindlessly followed Mark's tutorial once upon a time, but I can't find out how to change it so that the words in the title are separated by a "-" (dash) instead of an "_" (underscore). I want to change it to a dash because Google properly separates words joined by dashes, whereas it thinks words joined by underscores are combined into one word (e.g. one_word = oneword, one-word = one word).
I'm not really sure how to combat the underscores, but you could try something called "slugs." A slug is a short keyword that replaces your very long entry title in the URL. To play around with slugs, simply add one to your entry's keywords, wrapped in square brackets:
[nocruft]
So if I added that to this entry, the permalink would be something similar to
http://www.movalog.com/archives/2004/08/nocruft
instead of
http://www.movalog.com/archives/2004/08/cruft_free_urls
as it would be if I just left it to entry titles. Is there any real benefit? Well, IMHO not really, but it's still a pretty fun thing to do.
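To make the underscore-versus-dash difference and the slug idea concrete, here is a small Python sketch. It is not Movable Type's dirify code or the Short-Titles plugin; `slugify` and `permalink_name` are hypothetical helpers, and the way `trim_to` cuts the name is my assumption about how a 15-character limit might behave.

```python
import re


def slugify(title, sep="_", trim_to=None):
    """Lowercase a title and join its words with sep (a simplified stand-in for dirify)."""
    # Keep only runs of letters and digits; everything else acts as a word break.
    words = re.findall(r"[a-z0-9]+", title.lower())
    slug = sep.join(words)
    if trim_to is not None:
        # Cut to the limit and drop a separator left dangling at the end.
        slug = slug[:trim_to].rstrip(sep)
    return slug


def permalink_name(title, keywords="", sep="_"):
    """Prefer a [slug] found in the keywords, otherwise fall back to the title."""
    match = re.search(r"\[([^\]]+)\]", keywords)
    if match:
        return match.group(1)
    return slugify(title, sep)


print(slugify("Cruft-Free URLs"))                      # underscores
print(slugify("Cruft-Free URLs", sep="-"))             # dashes
print(permalink_name("Cruft-Free URLs", "[nocruft]"))  # slug wins over the title
```

Switching the separator is the whole difference between the two styles, which is why a small plugin is enough to change it.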

abhi said:
on Aug 13, 2004 2:01 PM
i really don't get the point here though... wat's the use... i mean u still use up a lot of space in ur HDD. is there a way to put the posts in a DB, take it out dynamically n place it with the above mentioned type of url's???
Arvind Satyanarayan said:
on Aug 13, 2004 2:02 PM
Yep, that's called dynamically serving up the pages, as described here: http://www.movalog.com/archives/2004/08/dynamicvsstat.php. That's what WP does, and what MT 3.1 will!
Lea said:
on Aug 13, 2004 7:50 PM
Actually, my friend Cal made me a little "Dashify" plug-in for MT that will create dashes instead of underscores the other day. What an awesome guy.
Anyway, for anyone who cares, you can download the Dashify plug-in here: http://code.iamcal.com/pl/mt3/dashify/
I was also poking around the MT-Plugins.org page and found this, too: http://mt-plugins.org/archives/entry/dirifyplus.php
However, Dirify Plus is a lot more complicated work. If all you want are dashes, and no other customization, Cal's Dashify plug-in should work wonders. It's now working for my page. :-) (it works for MT 3.0 and under, since I am on MT 2.64)
Arvind Satyanarayan said:
on Aug 13, 2004 7:51 PM
Ah well that's cool, what fun you answered your own question ;)
Lea said:
on Aug 13, 2004 9:21 PM
Haha, yah, I'm an impatient one and it's nice to have a web programmer/developer as a friend. LOL.
Joost Schuur said:
on Aug 13, 2004 10:46 PM
Don't forget Már Örlygsson's tip on using the Regex plugin for removing index.html/php from archive links.
abhi said:
on Aug 14, 2004 2:46 AM
arvind, ur link gives me a 404... :(
Arvind Satyanarayan said:
on Aug 14, 2004 7:19 AM
Sorry, the correct link is http://www.movalog.com/archives/2004/08/dynamicvsstat.php
Michael Pate said:
on Aug 25, 2004 9:01 AM
When using DirifyPlus, this works really well for individual archives:
<$MTArchiveDate format="%Y/%m"$>/<$MTEntryTitle dirifyplus="pld"$>
David H. Sundwall said:
on Aug 30, 2004 11:39 PM
I apologize for this question b/c no one else obviously has a problem with this but I cannot seem to be able to rename my Master Archives index file. If I did it would have the same name as my archives directory. I am unable to remove the .html extension. What am I missing?
Sujay said:
on Aug 19, 2005 4:02 AM
I dashified all my dirified underscores (what a sentence), now is there any way to redirect permalinks that used underscores to the new permalinks?
Tom Keating said:
on Dec 19, 2006 8:07 PM
Why is this needed?
I noticed you trim the title to 15 characters. I use a similar plugin, but I trim it to 70 with a .asp extension, which is much longer.
Soon I'll have no extension, but was wondering if there is some standard that says 70 characters is too long for a URL with no extension. 15 seems so short to me if you want descriptive titles and better SEO.
I assume once I go cruftfree, I won't have a problem with a 70 character long URL. If some standards body says otherwise, let me know.