Sunday, November 06, 2005

Dynamic URLs still aren't as good as static ones

After a lot of deliberation and messing around, I have come to the conclusion that what are commonly referred to as dynamic URLs are not regarded as highly by search engines as static ones.

Firstly, let's be clear on what we are calling a dynamic URL (as opposed to a dynamic page).

Strictly speaking, a dynamic web page is any web page that is generated by the web server as it is requested. The data for the page can be drawn from a database, other files on the server or the server itself. Many servers are set up to use dynamic pages with .php or .asp file extensions and query strings, but then have the server configured to convert these URLs to simple ones ending in .htm. In this case the URLs themselves are static, but the pages are dynamic.

What I refer to as a dynamic URL is basically any URL which uses a query string. That means the address has a ? symbol followed by one or more name=value pairs separated by & symbols, for example:
http://www.example.com/showproduct.asp?id=3423 or
http://www.example.com/showproduct.asp?id=343&ref=4333
and so forth.
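
To make the definition concrete, here is a minimal sketch in Python of how you might test whether an address is "dynamic" in the sense used here. The function names is_dynamic_url and query_parameters are my own, made up for illustration; only urlsplit and parse_qs come from the standard library.

```python
from urllib.parse import urlsplit, parse_qs


def is_dynamic_url(url):
    """Return True if the URL carries a query string (the part after '?')."""
    return bool(urlsplit(url).query)


def query_parameters(url):
    """Return the query-string parameters as a dict of name -> list of values."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    for address in (
        "http://www.example.com/showproduct.asp?id=3423",
        "http://www.example.com/showproduct.asp?id=343&ref=4333",
        "http://www.example.com/products/3423.htm",
    ):
        print(address, is_dynamic_url(address), query_parameters(address))
```

Note that the check looks only at the query string and ignores the file extension, which matches the distinction drawn above.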

When it comes to indexing, search engines appear to be less concerned with the file extension (.asp, .php or .htm) than with the presence of these query strings. Pages with query strings are not crawled as quickly, and their links are not followed as extensively, as simple static files with .htm extensions. When they are finally crawled and indexed, they take longer to climb the SERPs, and they are outranked by static pages that are otherwise totally equal. This is especially worth considering if you are working in a competitive sector of the web.

There are many common-sense reasons why this would be the case. Many "out of the box" e-commerce and content management packages have very poor error handling. One ASP package I use was returning a 200 OK status code for every page-not-found error, and was storing query string values in session variables rather than in the URL. This meant that bots trying to follow these links were being presented with hundreds of identical error pages, all returning 200 OK. Luckily the package was open source, so I changed it.
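
The fix boils down to sending an honest status code. Here is a rough sketch of the idea in Python rather than the ASP I actually changed; the PRODUCTS dictionary and the respond function are hypothetical stand-ins for a real catalogue and request handler.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical catalogue standing in for a real product database.
PRODUCTS = {"3423": "Blue widget", "343": "Red widget"}


def respond(url):
    """Return (status_code, body) for a product-page request."""
    params = parse_qs(urlsplit(url).query)
    # Take the first "id" value, or an empty string if there is none.
    product_id = params.get("id", [""])[0]
    if product_id in PRODUCTS:
        return 200, "<h1>" + PRODUCTS[product_id] + "</h1>"
    # Unknown or missing id: report an error instead of 200 OK,
    # so a crawler does not index the error page as real content.
    return 404, "<h1>Page not found</h1>"
```

With this in place, a bot following a dead link gets a 404 and can drop the address, rather than collecting yet another copy of the same error page.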

On the other hand, it's not as if these pages don't get indexed, and search engines are getting better and better with them. If you are in a really competitive area, I would say go with generated static .htm pages, or use URL rewriting such as Apache's mod_rewrite. mod_rewrite is easy to get if you are on Apache/Linux hosting, but it is not available if you are on Windows and IIS. Either approach can be a pain to set up, though.
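
For anyone unsure what a rewrite actually does, here is the idea sketched in Python. This is not mod_rewrite syntax, just an illustration of the mapping; the /product/NNNN.htm pattern and the rewrite function are assumptions of mine, not something any particular package uses.

```python
import re

# Assumed static-looking address scheme: /product/<digits>.htm
STATIC_PATTERN = re.compile(r"/product/(\d+)\.htm")


def rewrite(path):
    """Map a static-looking path to the dynamic URL that really serves it.

    Paths that do not match the pattern are returned unchanged.
    """
    match = STATIC_PATTERN.fullmatch(path)
    if match is None:
        return path
    return "/showproduct.asp?id=" + match.group(1)
```

Visitors and bots only ever see /product/3423.htm, while the server quietly hands the request to showproduct.asp?id=3423 behind the scenes.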

This issue will probably be one of many that fade away over the next few years. The internet is now at the point where dynamic sites are simply a requirement for most webmasters, and search engines will find more reliable ways of reading and indexing them.
