
Are SES URLs evil?

I was reading a recent article in .NET magazine about optimising Flash sites for Google. When I got to the box on Friendly URLs, I thought it ironic that they were perpetuating another web design myth. After all, their recent June issue had been dedicated to myth busting, and SES URLs are one of the biggest myths out there.

Checking back however, I realised that even .NET magazine are still taken in …

At this point, I should point out that I'm aware of the irony in this post. I'm well aware that I'm writing an article against SES URLs on a website that uses them. This blog is powered by Ray Camden's BlogCFC which has SES URLs by default. I've also implemented SES URLs on other projects.

The Myth

Every so often we get a customer complaining that their URLs aren't search engine friendly. Normally it's the result of some expert telling them that Google can't index dynamic URLs.

If you're not aware yet, dynamic URLs are often of the form http://www.mysite.com/product.cfm?productID=123. This format is typically used when the page is produced dynamically, often from a database. The productID=123 bit is referred to as a URL parameter.
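To make the mechanics concrete, here is a minimal sketch in Python showing how a server-side script would pull that parameter out of the query string (the URL is just the hypothetical example above):

```python
from urllib.parse import urlsplit, parse_qs

# A dynamic URL: everything after the "?" is the query string.
url = "http://www.mysite.com/product.cfm?productID=123"

# Parse the query string into a dict of parameter names to values.
params = parse_qs(urlsplit(url).query)

# The script would use this value to look up the record in the database.
product_id = params["productID"][0]  # "123"
```

The point is that the `?name=value` syntax is a standard, machine-readable convention, which is exactly what lets crawlers recognise parameters for what they are.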

Now many people mistakenly believe that search engines such as Google can't index pages with these URLs. Often they will re-write this URL along the lines of http://www.mysite.com/product/productID/123.
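The re-writing is usually done at the web server. A hypothetical Apache mod_rewrite rule for the example above (the paths and parameter name are illustrative, not from any real site) might look like:

```apache
# .htaccess sketch: serve the SES form /product/productID/123
# from the real dynamic template product.cfm.
RewriteEngine On
RewriteRule ^product/productID/([0-9]+)$ /product.cfm?productID=$1 [L,QSA]
```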

The truth is that search engines have been able to index dynamic URLs for years without problems. Google even debunk this myth on their own official blog. If that's not enough to convince clients, we ask them to do a Google search such as site:www.mysite.com. That search will list every page indexed by Google for the site.

The Problems with SES URLs

Now maybe evil was a bit too strong a word to use, but there are certainly disadvantages to using SES URLs.

  • You need to use a rewrite module such as Apache's mod_rewrite. As well as the extra time and knowledge setting this up requires, it makes your application less portable if you want it to work on IIS as well.
  • Relative URLs don't work - you need to link everything from the web root. There's a lot more work involved if you want to change your directory structure.
  • You need to take extra care not to duplicate content. While it's easy for Google to work out that www.mysite.com/product.cfm?productID=123&categoryID=456 is the same page as www.mysite.com/product.cfm?categoryID=456&productID=123, Google would consider www.mysite.com/productID/123/categoryID/456 and www.mysite.com/categoryID/456/productID/123 to be two separate pages with duplicated content. As soon as Google thinks you are duplicating content, your search ranking will plummet.
  • You also can't put session or other client parameters in the URL, or it will confuse the crawlers. For example, Google is clever enough to work out that the CFID parameter here is tied to a client's session: www.mysite.com/product.cfm?productID=123&CFID=12345678. It can therefore ignore it. www.mysite.com/product/productID/123/CFID/12345678 however will confuse Google.
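The parameter-ordering point can be sketched in a few lines of Python: with a real query string there is a standard way to normalise the order, while the path-based form has none (the URLs are the hypothetical examples above):

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

def canonical(url):
    # Sort the query parameters so the order they were written in
    # no longer matters; a crawler can normalise the same way.
    scheme, netloc, path, query, frag = urlsplit(url)
    query = urlencode(sorted(parse_qsl(query)))
    return urlunsplit((scheme, netloc, path, query, frag))

a = canonical("http://www.mysite.com/product.cfm?productID=123&categoryID=456")
b = canonical("http://www.mysite.com/product.cfm?categoryID=456&productID=123")
# a and b normalise to the same URL. The path-based forms
# /productID/123/categoryID/456 and /categoryID/456/productID/123
# are simply two different paths, with no standard way to normalise them.
```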

Are there any reasons left why I might use SES URLs?

Now to be fair to SES URLs, there may be some benefits:

  • Adding keywords in the URL might increase search engine ranking. Some SEO experts claim this, and Google don't specifically mention it in their blog article. However, if it did boost ranking, it would seem strange for Google to discourage URL re-writing.
  • From a usability point of view, it's fairly obvious that mysite.com/contact is easier to remember than mysite.com/page.cfm?pageID=789. However, there's nothing to stop you setting up /contact as a permanent redirect to /page.cfm?pageID=789, and then you have the best of both worlds. And Google likes this approach.
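That redirect approach is a one-liner in Apache configuration (a hypothetical sketch, using the example page ID above):

```apache
# /contact is the memorable, human-friendly alias. R=301 issues a
# permanent redirect, which browsers and search engines both follow.
RewriteEngine On
RewriteRule ^contact$ /page.cfm?pageID=789 [R=301,L]
```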

And Finally…

Use the correct tool for the job.

A few years back, Jeffrey Zeldman taught us to stop abusing table tags, and to use HTML for the purpose it was designed for. Everyone now sees the benefits.

URL parameters were designed for use by dynamic web pages, not complicated folder structures. Search engines understand this, and it gives them a better understanding of how your site works.

While the internet may have started off as a collection of static pages, it's now about online applications. As the likes of Google develop more intelligent crawlers, they will not only be able to read your static content, but interact with your application. Who knows what this may bring?

Comments
Robert Rawlins: Excellent little article this, the most useful thing with regards to this is that link to the Google blog post, kind of dispels the myth.

Thanks,

Rob
# Posted By Robert Rawlins | 06/08/09 12:10
Raymond Camden: While I do think the importance of SES URLs is overblown, I do have a few comments.

a) IIS supports SES URLs, and you can, for the most part, use almost exactly the same configuration at the server. I develop on Apache at home and have no problem moving to IIS in production. There are multiple IIS plugins that support SES URLs.

b) Relative URLs - to me, this isn't a huge deal. Whether I type ../foo.cfm or /foo, it isn't any harder or easier. I'd maybe argue that it's even better. If I'm writing a link from some resource, and that resource moves, then I'd have to update the link. (Of course, if foo moved I'm in the same boat.)
# Posted By Raymond Camden | 06/08/09 12:31
freelance web developer: I have to disagree with a few things here.

"You need to use a rewrite module such as Apache's mod_rewrite. As well as the extra time and knowledge setting this up requires, it makes your application less portable if you want it to work on IIS as well."
ISAPI_rewrite for IIS uses apache's .htaccess files and syntax. Quite portable. Besides that, when moving a site, should copying rewrite rules really be such a large chore?

"Relative URLs don't work - you need to link everything from the web root. There's a lot more work involved if you wanted to change your directory structure."
If you're rewriting URLs, you don't really have a directory structure. You just have a few templates that are used.

"You need to take extra care not to duplicate content."
Lazy or bad programming is not a case against SES URLs.

"You also can't put session or other client parameters in the URL, or it will confuse the crawlers."
You certainly can put url variables in SES urls. Following your example above:
www.mysite.com/product/productID/123/?CFID=1234567...

Finally...
"Now many people mistakenly believe that search engines such as Google can't index pages with these URLs. Often they will re-write this URL along the lines of http://www.mysite.com/product/productID/123."
The point of SES URLs is not to remove the ? and = characters from the url. It's to create human-pretty and machine-intelligent urls. Such as:
http://www.mysite.com/products/tdk-80mb-cdr/ (equivalent of http://www.mysite.com/product.cfm?proID=123)
# Posted By freelance web developer | 06/08/09 12:43
Justin Carter: For the most part, rewriting a URL like http://www.mysite.com/product.cfm?productID=123 into http://www.mysite.com/product/productID/123 really doesn't achieve much.

A better idea, for SEO and for human readability, is to rewrite to something like http://www.mysite.com/product/Xbox-360-Premium-60G... (you may or may not want to include the product ID in the URL, but my point is that the text is the important part).

Another thing is that it doesn't really matter if you can hit a page by multiple different URLs if you specify a canonical URL on your pages: http://googlewebmastercentral.blogspot.com/2009/02...
# Posted By Justin Carter | 06/08/09 13:41
Al Davidson: I'm with Justin on this one. When I was doing Fusebox all the time, I used to think that there was no point to 'friendly' / SES URLs. But then I saw Tom Coates' presentation at FOWA that really set me thinking.

The point for me is that your URL schema is an interface to your application, and needs designing just like any other interface. It should ideally be dictated by an information design approach, rather than a particular framework. It's not even for the sake of search engines, which have, as you say, got much more intelligent in the last few years, it's for the sake of your users. Of course, you have to weigh the cost in terms of dev effort against the benefits, but i believe the benefits are worth it.

When I'm typing into my browser URL bar, trying to remember what URL it was that I visited for 20s two days ago that I'm trying to refer a friend to, it's SO much easier if the auto-suggest is showing me URLs like blah.com/articles/why-x-is-the-new-y-and-z-is-dead, or gruntfuddlr.com/products/mang_wurdler_3_42 as opposed to dogbotherer.com/index.cfm?fa=articles.show&article_id=125222. It also helps increase your search ranking if the term someone is searching for is actually in the url of the article.

It's not even that much effort to put an apache rewrite together with a simple translation layer on top of a Fusebox or Mach-II app. On the last big Fusebox site I did, we did something roughly equivalent to Rails' routes.rb in a couple of days, and it made a huge difference to the usability of the site.

Give it a try and see!

Al
# Posted By Al Davidson | 06/08/09 14:17
Martin: Here's something interesting. The term "SES URLs" seems to be a very ColdFusion-centric thing: here are Google search results for "SES URLs" and all of them are from CF related sites and blogs (http://www.google.com/search?q=ses+urls).

Was this myth created by the ColdFusion community?

The point here, as most of the other commenters already said, is that "pretty URLs" have very little to do with search engines and more to do with building better, cleaner web apps.
# Posted By Martin | 06/08/09 16:18
Gareth: Thanks for the comments folks. I thought this might raise a few!

I realise that setting up SES URLs isn't a huge amount of work, and I already acknowledged that there may be a usability benefit.

My main argument though, was that the web already has a well defined way to pass URL parameters. Passing this information via folder structure is a hack.

Now Google can easily recognise normal URL parameters, but it can't work them out if passed via folder structure. At the moment, Google only uses this information to work out which parameters to ignore, but in the future it may use them positively too.

In future, Google and others might be able to work out what URL parameters do on your site automatically, and use them to build the next generation of screen scrapers. For example, it might work out that your website is a booking engine or quoting tool, and automatically interrogate it as if it were using a public API.
# Posted By Gareth | 06/08/09 16:34
Jules Gravinese: I've always thought that Google sees directory hierarchy as it does document hierarchy (h1, p, h2, p, p, etc). A well laid out directory structure gives you similar 'SEO credits' as would a clean document layout.
Certainly a whole folder dedicated to any subject would be worth more (have more information) than a single file. So why would google not give you credit as such? It's just another step higher. For example:
/buick/ is better than /buick.html is better than /cars.cfm?make=buick
# Posted By Jules Gravinese | 06/08/09 18:33
Ben Nadel: I think ultimately, it all comes down to page rank. Sure, readability is nice, and easy directory navigation is nice... but at the end of the day, I think it all comes down to page rank - does a more "relevant" URL get you a better page rank? Is there any definitive research on this? The stuff I see on the Google blog is more about "IF" a bot can read dynamic URLs, not whether it affects rank.
# Posted By Ben Nadel | 07/08/09 15:07
Gareth: @Jules

I've been on a few SEO courses, and it seems that a lot of the techniques people recommend are based purely on guesswork.
Google doesn't give away its algorithms in case people exploit them, so SEO experts rely on anecdotal evidence. I haven't heard of any scientific SEO tests that have been done, and I'm not sure they are possible in a real-world situation.

Ultimately, we only have Google's own guidelines to go on, and they don't mention folder structure as having any impact on SEO, hierarchical or otherwise.
If Google recommend NOT using SES URLs, then that's what I'm inclined to go for.
# Posted By Gareth | 12/08/09 15:37
Jules Gravinese: Ben,
I've seen sites with lower PageRank come up higher in results because of their content. PR is important, but it's more of a popularity number, and it has become less important than it used to be.

Gareth,
I don't have any control tests. But logically it's hard to argue against the idea that friendly URL names with a visible hierarchical structure help SEO. For instance, if someone is looking for a "compact florescent bulb", which of these is better:
a) /phillips-par20.html
b) /bulbs/florescent-compact/phillips-par20/
It's not just slashes for the sake of breaking up words. It's categorization of information.
You're right in that it's guesswork. But as long as it works, we can also call it witchcraft :)
# Posted By Jules Gravinese | 13/08/09 01:24