Pop quiz: what's the difference between the following URLs:
* http://website.com
* http://www.website.com
* http://website.com/default.php
* http://www.website.com/default.php
Give up? If you're a user, then chances are you expect all of those URLs to lead you to the same page. Robots, however, are not as good at determining whether pages are the same, so they often store each URL separately. A big part of how search engines rank pages is based on how many external links those pages have. If other sites on the web link to the different versions of your home page, then search engines may calculate the value of each URL separately, based on the number of links to each version. This can effectively diminish the potential rank your page would have if it were found (and linked to) by only one URL.
The practice of consolidating all versions of a page under one URL is referred to as "canonicalization" (because you collapse all versions under the "canonical" or true version). The four examples listed above are the most common, but there are potentially many, many URLs that lead you to the same page. By adhering to several best practices, you should be able to address 90% of common site-wide canonicalization issues on your site and consequently improve how your site ranks.
Recommendation
The solution is to be explicit about the canonical form of your URLs. Following are four best practices to achieve this, with specific code and configuration examples.
1. Select WWW or Non-WWW, then redirect the other option to your preferred version.
The hard part is choosing whether you want your site to be "www.website.com" or simply "website.com". There is no right answer for every company, so you'll have to decide this for yourself (though removing the "www." saves your customers 4 keystrokes, which really add up on a mobile device, and it makes your brand the first thing your customers see).
Once you've selected, you then need to find a way to trap all requests to your application, check which form is being used, and if it is not the correct form, initiate a 301 Redirect to the correct form. For example, if the user types in wikipedia.org, they will automatically get redirected to www.wikipedia.org.
2. Remove the default filename from the end of your URLs.
All web servers allow you to select one or more default filenames to serve when the browser requests a directory. For example, this website is run on IIS, so when the user requests "http://janeandrobot.com" we really serve "http://janeandrobot.com/default.aspx".
In the same code you use to enforce www vs. non-www, you should also check and see if the default filename is at the end of the URL and then trim it off. So, "http://janeandrobot.com/default.aspx" would be converted to "http://janeandrobot.com".
3. Link internally to the canonical form of your URL.
Make sure you always link to the proper canonical form of your URLs from within your site. This practice helps encourage external sites to link to the site using the correct version as well (since those linking to you often cut and paste from your pages or RSS feed). Note that there are diminishing returns here, so you don't need to spend the whole weekend hunting down every last URL. Just make sure to review your site's primary navigation, top landing pages and blog.
4. Use Google Webmaster Tools to tell Google the correct form.
Implementing these best practices on your site is ideal, since they address the problem for all search engines and give your customers a consistent, properly branded navigation experience. But what can you do if you reviewed steps 1-3 and found that they would take six months to implement on your production site? There is something that you can do today: using Google's Webmaster Tools, you can navigate to the "Tools" section and select "Set preferred domain." Here you can specify whether you'd like Google to use "www.website.com" or "website.com" in its index and search results, as well as consolidate links to both versions. Note that while this will provide you a short-term benefit from Google, it does not help you in Yahoo! or Live Search.
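For steps 1 and 2, the request-trapping logic might look something like the following Python sketch. It is only an illustration under stated assumptions: the preferred host, the alternate host, and the list of default filenames are hypothetical settings, and `canonical_url` and `redirect_for` are names made up for this example rather than part of any web server or framework.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical settings for illustration; substitute your own site's values.
CANONICAL_HOST = "website.com"
ALTERNATE_HOSTS = {"www.website.com"}
DEFAULT_FILENAMES = {"default.php", "default.aspx", "index.html"}


def canonical_url(url):
    """Return the canonical form of `url` (preferred host, no default filename)."""
    parts = urlsplit(url)

    # Step 1: map the non-preferred host onto the preferred one.
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        host = CANONICAL_HOST
    netloc = host if parts.port is None else f"{host}:{parts.port}"

    # Step 2: trim a default filename off the end of the path.
    directory, _, filename = parts.path.rpartition("/")
    path = directory if filename.lower() in DEFAULT_FILENAMES else parts.path

    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


def redirect_for(url):
    """Return (301, target) if `url` is not canonical, otherwise None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


print(redirect_for("http://www.website.com/default.php"))
print(redirect_for("http://website.com"))
```

In a real application this check would run before any page handler, and a non-`None` result would be sent back as a permanent redirect with the target in the `Location` header.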
Checking Your Website
To check your website to see if you're handling domain canonicalization correctly, you can use the Live HTTP Headers add-on for Firefox.
Open the Live HTTP Headers tool, then try all the variations of the URL at several different levels to ensure they all redirect back to the appropriate canonical form. As you're checking each variation, look at the HTTP headers using the Firefox plug-in to ensure they are all 301 redirects (and not, for instance, 302 redirects).
Here's an example test case:
| Canonical URL Form | Test Case | Test Result |
| --- | --- | --- |
| http://janeandrobot.com | janeandrobot.com | Success |
| | janeandrobot.com/default.aspx | Success |
| | www.janeandrobot.com | Success |
| | www.janeandrobot.com/default.aspx | Success |
| http://janeandrobot.com/about.aspx | janeandrobot.com/about.aspx | Success |
| | www.janeandrobot.com/about.aspx | Success |
| http://janeandrobot.com/folder | janeandrobot.com/folder | Success |
| | janeandrobot.com/folder/default.aspx | Success |
| | www.janeandrobot.com/folder | Success |
| | www.janeandrobot.com/folder/default.aspx | Success |
| http://janeandrobot.com/folder/test.aspx | janeandrobot.com/folder/test.aspx | Success |
| | www.janeandrobot.com/folder/test.aspx | Success |
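If you would rather script this check than click through each variation by hand, a rough Python sketch follows. It assumes plain HTTP, a single default filename (`default.aspx`, as in the table), and that any last path segment containing a dot is a file. The function names `url_variations`, `fetch`, and `check_response` are invented for this example.

```python
import http.client
from urllib.parse import urlsplit

DEFAULT_FILENAME = "default.aspx"  # assumed default document, as in the table


def url_variations(canonical):
    """List the URL variations that should all end up at `canonical`."""
    parts = urlsplit(canonical)
    host = parts.hostname
    bare = host[4:] if host.startswith("www.") else host
    path = parts.path.rstrip("/")
    last_segment = path.rpartition("/")[2]
    # Assumption: a last segment without a dot names a directory, so the
    # default filename may also be appended to it.
    if "." not in last_segment:
        paths = [path, f"{path}/{DEFAULT_FILENAME}"]
    else:
        paths = [path]
    return [f"{parts.scheme}://{h}{p}" for h in (bare, "www." + bare) for p in paths]


def fetch(url):
    """Request `url` over plain HTTP without following redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def check_response(requested, status, location, canonical):
    """Classify one observed response as "Success" or a failure message."""
    if requested == canonical:
        return "Success" if status == 200 else f"Failure: expected 200, got {status}"
    if status != 301:
        return f"Failure: expected 301, got {status}"
    # Ignore a trailing slash when comparing the redirect target.
    if (location or "").rstrip("/") != canonical.rstrip("/"):
        return f"Failure: redirected to {location}"
    return "Success"
```

A full run would loop over `url_variations(canonical)`, call `fetch` on each, and pass the result to `check_response`; only `fetch` touches the network.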
Examples
Canonicalization issues are very common and, being a Microsoft employee, I don't have to go far to find an example. Check out the website for Microsoft's annual Mix conference for web developers.
I was able to generate the table below by plugging the common URL variations into Yahoo's Site Explorer to find a list of links to each variation.
| URL Variation | Number of Links from within website | Number of Links from outside websites |
| --- | --- | --- |
| http://visitmix.com | 17,663 | 59,498 |
| http://www.visitmix.com | 9,074 | 22,179 |
| http://visitmix.com/default.aspx | 0 | 22 |
| http://www.visitmix.com/default.aspx | 0 | 12 |
Looking through these numbers yields some interesting insights:
* Not redirecting between the "www" and "non-www" versions is definitely hurting their ranking. You can tell because both versions have a large number of inlinks. Ranking is done on a logarithmic scale, so every additional link is more valuable than the one before. If they redirected all versions to one canonical form, search engines would see their home page as having 81,711 external links, which would be a substantial boost.
* They are not good about using the same version of the URL within their site. If you're not cognizant of this on your site, others won't be either. Going by the internal link counts, they use visitmix.com about two-thirds of the time internally, and www.visitmix.com the other third.
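The totals behind these observations can be reproduced with a few lines of Python. The counts are copied from the table above; the variable names are just for this example.

```python
# Link counts copied from the table: (links from within, links from outside).
links = {
    "http://visitmix.com": (17663, 59498),
    "http://www.visitmix.com": (9074, 22179),
    "http://visitmix.com/default.aspx": (0, 22),
    "http://www.visitmix.com/default.aspx": (0, 12),
}

# External links the home page would have if every variation were consolidated.
total_external = sum(external for _, external in links.values())

# Share of internal links that point at the non-www home page.
total_internal = sum(internal for internal, _ in links.values())
non_www_share = links["http://visitmix.com"][0] / total_internal

print(total_external)           # 81711
print(f"{non_www_share:.0%}")   # 66%
```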
Domain Canonicalization
Monday, December 8, 2008 at 9:08 PM Posted by Vasu
Labels: Website Navigation