The first question/answer from Ask Arvind...

Lea writes:

Can you provide tips on how to have cruft-free URLs (a la diveintomark.org, as well as others)? I already have this employed in my blog, but a lot of others don't.

The diveintomark tutorial in question can be found here.

First of all, what is cruft? Cruft is basically the file extension at the end of a web page's URL, for example .php or .html. In SEO circles it is considered a bad thing. Here's a (hopefully easier) method to have cruft-free URLs. For this to work you will need to install the Short-Titles plugin (thanks Amit!).

  1. Blank out "File Extension for Archive Files" under Weblog Config → Preferences

  2. Change the Archive File Template for Individual archives - if you have it checked - to <$MTArchiveDate format="%Y/%m"$>/<$MTEntryShortTitle dirify="1" trim_to="15"$> in Weblog Config → Archive Files

  3. Delete all old individual archives from your webdirectory

  4. Remove the extensions for your Main Index Template and Master Archive Index.

  5. In your weblog folder, edit the .htaccess file, creating it if it isn't already there. If your files were *.html, add the following to .htaccess:

DefaultType text/html
DirectoryIndex index
RedirectMatch permanent /archives/(.*)\.html$ http://www.mydomain.com/archives/$1

If your files were *.php then add the following to .htaccess:

DefaultType application/x-httpd-php
DirectoryIndex index
RedirectMatch permanent /archives/(.*)\.php$ http://www.mydomain.com/archives/$1

  6. Rebuild all files and test it out!
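
To get a feel for what the Individual archive template from step 2 produces, here is a rough Python sketch. It is not how Movable Type or the Short-Titles plugin is implemented; archive_path is a made-up name, and I'm assuming that dirify lowercases the title and joins the words with underscores, and that trim_to is a plain character cut.

```python
from datetime import date
import re


def archive_path(entry_date, title, trim_to=15):
    """Build a year/month/short-title path (hypothetical helper)."""
    # Assumption: dirify lowercases, drops punctuation, joins words with "_".
    short = "_".join(re.findall(r"[a-z0-9]+", title.lower()))
    # Assumption: trim_to simply keeps the first N characters.
    short = short[:trim_to]
    return entry_date.strftime("%Y/%m") + "/" + short


print(archive_path(date(2004, 8, 13), "Cruft Free URLs"))
# 2004/08/cruft_free_urls
print(archive_path(date(2004, 8, 13), "Dynamic vs Static Publishing"))
# 2004/08/dynamic_vs_stat
```

Note that there is no extension anywhere in the result, which is the whole point of blanking out the archive file extension in step 1.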

Let me explain what is going on here and what you can do to improve upon this. Basically, your web server now treats every file without an extension in your weblog directory as an HTML or PHP file, depending on the DefaultType you put in your .htaccess. The rest of the world will still be linking to your crufty URLs, which is why you put in the redirect.
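
If the redirect line looks cryptic, this small Python sketch mimics what it does to an incoming request path. It is only an illustration: redirect_target and BASE are names I made up, the domain is the placeholder from the .htaccess snippets, and I've written the dot before the extension as an escaped literal period so that only real .html or .php endings match. Apache looks for the pattern anywhere in the URL path, which is why the sketch uses re.search rather than re.match.

```python
import re

# Placeholder domain taken from the .htaccess snippets above.
BASE = "http://www.mydomain.com/archives/"


def redirect_target(path, ext="html"):
    """Return the extension-free URL for an old archive path, or None."""
    pattern = r"/archives/(.*)\." + re.escape(ext) + r"$"
    match = re.search(pattern, path)
    if match is None:
        return None  # no redirect: the request is served as-is
    # match.group(1) plays the role of $1 in the RedirectMatch target.
    return BASE + match.group(1)


print(redirect_target("/archives/2004/08/cruft_free_urls.html"))
# http://www.mydomain.com/archives/2004/08/cruft_free_urls
print(redirect_target("/archives/2004/08/cruft_free_urls"))
# None
```

The second call returns None because a cruft-free path has nothing to strip, so old links get forwarded once and new links are left alone.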

Lea continues:

Now here's my REAL question: I mindlessly followed Mark's tutorial once upon a time, but I can't find out how to change it so that the words in the title are separated by a "-" (dash) instead of an "_" (underscore). I want to change it to a dash because Google properly separates words joined by dashes, whereas with underscores it will think all the words are combined into one word (e.g. one_word=oneword, one-word=one word).

I'm not really sure how to combat the underscores, but you could try something called "slugs." A slug is just another way of saying keyword: it replaces your very long entry title in the URL. To play around with slugs, simply add one to your keywords, wrapped in square brackets:

[nocruft]

So if I added that to this entry, the permalink would be something similar to

http://www.movalog.com/archives/2004/08/nocruft

instead of

http://www.movalog.com/archives/2004/08/cruft_free_urls

as it would be if I just left it to entry titles. Is there any real benefit? Well, IMHO not really, but it's still a pretty fun thing to do.
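
For the curious, here is a rough Python sketch of the idea: use the bracketed keyword as the slug if there is one, otherwise fall back to a dirified title. I don't know the plugin's internals, so treat every name here (slug_from_keywords, simple_dirify, permalink_name) as hypothetical, and simple_dirify as a crude approximation of dirify. The sep parameter is my own addition to show what a dash-separated variant would look like.

```python
import re


def slug_from_keywords(keywords):
    """Return the first [bracketed] keyword, or None if there is none."""
    match = re.search(r"\[([^\]]+)\]", keywords)
    return match.group(1) if match else None


def simple_dirify(title, sep="_"):
    """Crude dirify stand-in: lowercase, drop punctuation, join with sep."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return sep.join(words)


def permalink_name(title, keywords="", sep="_"):
    """Prefer the slug; fall back to the dirified entry title."""
    return slug_from_keywords(keywords) or simple_dirify(title, sep)


print(permalink_name("Cruft Free URLs"))                        # cruft_free_urls
print(permalink_name("Cruft Free URLs", sep="-"))               # cruft-free-urls
print(permalink_name("Cruft Free URLs", keywords="[nocruft]"))  # nocruft
```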

14 Comments

abhi said:
on Aug 13, 2004 2:01 PM | Reply

i really don't get the point here though... wat's the use... i mean u still use up a lot of space in ur HDD. is there a way to put the posts in a DB, take it out dynamically n place it with the above mentioned type of url's???

Arvind Satyanarayan said:
on Aug 13, 2004 2:02 PM | Reply

Yep, that's called dynamically serving up the pages, as described here: http://www.movalog.com/archives/2004/08/dynamicvsstat.php. That's what WP does, and what MT 3.1 will do!

Lea said:
on Aug 13, 2004 7:50 PM | Reply

Actually, my friend Cal made me a little "Dashify" plug-in for MT that will create dashes instead of underscores the other day. What an awesome guy.

Anyway, for anyone who cares, you can download the Dashify plug-in here: http://code.iamcal.com/pl/mt3/dashify/

I was also poking around the MT-Plugins.org page and found this, too: http://mt-plugins.org/archives/entry/dirifyplus.php

However, Dirify Plus is a lot more complicated work. If all you want are dashes, and no other customization, Cal's Dashify plug-in should work wonders. It's now working for my page. :-) (it works for MT 3.0 and under, since I am on MT 2.64)

Arvind Satyanarayan said:
on Aug 13, 2004 7:51 PM | Reply

Ah well that's cool, what fun you answered your own question ;)

Lea said:
on Aug 13, 2004 9:21 PM | Reply

Haha, yah, I'm an impatient one and it's nice to have a web programmer/developer as a friend. LOL.

Joost Schuur said:
on Aug 13, 2004 10:46 PM | Reply

Don't forget Már Örlygsson's tip on using the Regex plugins for removing index.html/php from archive links.

abhi said:
on Aug 14, 2004 2:46 AM | Reply

arvind, ur link gives me a 404... :(

Michael Pate said:
on Aug 25, 2004 9:01 AM | Reply

When using DirifyPlus, this works really well for individual archives.

<$MTArchiveDate format="%Y/%m"$>/<$MTEntryTitle dirifyplus="pld"$>

David H. Sundwall said:
on Aug 30, 2004 11:39 PM | Reply

I apologize for this question b/c no one else obviously has a problem with this but I cannot seem to be able to rename my Master Archives index file. If I did it would have the same name as my archives directory. I am unable to remove the .html extension. What am I missing?

Sujay said:
on Aug 19, 2005 4:02 AM | Reply

I dashified all my dirified underscores (what a sentence), now is there any way to redirect permalinks that used underscores to the new permalinks?

Tom Keating said:
on Dec 19, 2006 8:07 PM | Reply

For this to work you will need to install the Short-Titles plugin

Why is this needed?

I noticed you trim the title to 15 characters. I use a similar plugin, but I trim it to 70 with a .asp extension, which is much longer.

Soon I'll have no extension, but was wondering if there is some standard that says 70 characters is too long for a URL with no extension. 15 seems so short to me if you want descriptive titles and better SEO.

I assume once I go cruftfree, I won't have a problem with a 70 character long URL. If some standards body says otherwise, let me know.
