Tom Lauck’s Deseloper.org

RIA Myth Busting: Back Button, History, and SEO

author:

I have touched on the topic of SEO in the past, however, that article the focused on a broad range of ideas to improve organic search.  I wanted to focus in on realities surrounding RIAs and commonly requested features, specifically to the combination of back button support, bookmarking, and search engine optimization.  All of these features would be wonderful in a single page interface, and with current technology and methodology, is it possible to have our cake and eat it too?

It is important to first have an understanding of the difference in objectives between a web application and a web site.  (Technically speaking, most web sites are web applications, but we are using the term in a looser sense).  Typical examples of an application in this scope would be GMail, Google Reader, and the user section of Mint.com – sites with almost no need of searchable content.  Examples from the opposite end of the spectrum would be Bloomberg, A List Apart, Wired, and someone’s blog.  Therefore, we have two camps, one where the sole focus is on interaction with data and no search strategy is needed and the other dictates a solid search strategy.

A method that has been gaining steam in the RIA world is using a hash sign (#), or anchor, in the URL.  Many talented people have spent precious time creating solutions to history and back button support for AJAX and Flash applications.  This is fantastic for a web application, because it provides capability for standard user interactions supported in browsers that are typically unsupported in rich internet applications.

So now that there is history support, does that mean SEO has been fully considered?  An article I found on w3.org sheds some light on the subject using a CNN video player as a case study:

CNN uses links like the above for all the topical video segments that are published on its site. The URL in this case has the following components:

Component Value
Protocol http
Host www.cnn.com
Path video
Client Param #/video/tech/2008/02/19/vo.aus.sea.spider.ap

2.1.1 Things To Note

The browser is expected to do a GET of the URL leading up to the fragment, and the processing application, in this case, the JavaScript embedded in the HTML Response processes the portion of the URL following the #.
The fragment identifier has been intentionally identified as a client parameter.
Treating it as a regular fragment identifier in this usage would result in one incorrectly infering that the URL for the video resource being addressed is http://www.cnn.com/video.
This would result in all the video links on the CNN site getting the same URL.
Thus, the entire URL in this case is http://www.cnn.com/video/#/video/tech/2008/02/19/vo.aus.sea.spider.ap
A consumer of this URL who goes looking for an idwithin the Response that matches the #-suffix of this URL will fail.
The reported Content-Type for the resource is text/html. However the behavior of the #-suffix in this case is not defined by the HTML specification.
As used, the #-suffix is a first-class client parameter in that it gets consumed by a script that is served as part of the HTML document returned by the server upon receiving a GET request.
This embedded script examines the URL available to it as script variable content.location, strips off the # and uses the rest of the prefix as an argument to function that generates the actual URL.
Having constructed this content URL, the script then proceeds to instruct the browser to play the media at the newly constructed location.

Notice that “the browser is expected to do a GET of the URL leading up to the fragment…JavaScript embedded in the HTML Response processes the portion of the URL following the #.”  To paraphrase, Google does not look at client side interactions, the fragment is truncated from Google’s index.  From this several assumptions can be made:

  1. Any back link using http://example.com/#example is actually viewed as http://example.com
  2. Back links pointing to URL fragments will have no individual page rank.
  3. In content rich scenarios, the use of URL fragments in leu of separate pages effectively dilutes almost all search traction.

Reflecting on how Google treats URL fragments, it can be clearly seen that a single page interface is not an effective strategy in scenarios with rich content.  Another big myth around single page interfaces is in the use of Flash, SWFAddress and/or Flex’s history manager.  Google will disregard URL fragments, the very foundation of SWFAddress and Flex’s history manager.  To reiterate, Googlebot just disregarded the URLs you have just crafted with SWFAddress.  It should be stated that some individuals wholeheartedly believe that using URL fragments is a successful SEO strategy.  Yet, when Google is typically the number one returning visitor, do you really want to take a chance at questioning the very foundation Google uses to spider your site?

Take for instance a designer’s personal site with tabs for Home, Resume, Portfolio, and Contact.  Would the designer want to implement a single page interface?  The answer would likely be no.  The content would gain more traction if it is separated properly.  To rehash the example of Google Reader, a single page interface is a good choice, for Google Reader would not benefit from having a separate page for each feed a user is subscribed to.

The advances made in RIA with regard to history and back button support encapsulate the innovative spirit that the web has embraced.  However, web workers tend to jump on bandwagons and this filters down to individuals with the power to poorly implement a technology – remember Flash intro pages?  Much like a seasoned web worker becomes very business and client savvy after years in the field, we need to be Google savvy.  Threfore, next time a wirefrime for a single page interface lands your desk, does the content dictate a search strategy?  If so, do some research on the reality of the solutions and ultimateley be kind to Google and Google will be kind to you.

Oct 22 2008

Ten SEO Tips

author:

Lately, many have been asking about SEO. With the economy in a little slump of sorts, I am assuming people are for one feeling the pinch and realizing the selling value of the internet. The end of the age of bad flash sites and complex tables is in sight! For many years, big companies have been playing the pragmatic SEO game, but now the average joe is finally jumping on the bandwagon and willing to invest in SEO and SEM. In my opinion this is big, because small businesses are willing to give up valuable revenue to the relatively uncertain territory of SEO and SEM.To accompany my recent observation of those with small businesses I have compiled a brief list of essential SEO tips:

  1. If you use Flash, please give alternate content. Sure Google can “spider flash” now, but do you really trust the outcome? And how is any search ever going to get high quality content from this?
  2. Keep your markup clean.
  3. Use heading tags well. So don’t overuse/abuse h1′s and h2′s – I usually use each once on a page.
  4. Use markup tags for their intended purpose.
  5. Make sure page titles are used and content rich.
  6. Use keywords and descriptions when you have something important to say about that page. Just like your momma told you, if don’t have anything good to say don’t say it all.
  7. Use URL rewriting. If you are an SEO, in my opinion, URL rewriting should be your best friend. In strong accordance, be familiar with HTTP/1.1 status codes and know how to use them.
  8. Use and be familiar with robots.txt, the X-Robots-Tag, the “rel” attribute, and sitemap.xml files.
  9. Use webmaster tools from one or all of the major players (Google, Yahoo, and Microsoft).
  10. Use Google Analytics at the very least. For the most accurate metrics, combine Google Analytics with server analytics like AWStats, and other tools.

For the more advanced readers, please feel free to add to or suggest alternatives. And for the rest, this list is not meant to be a be all and end all, but – just like one’s wedding day – a mere beginning on a long journey.Now for the shameless plug…If you need some assistance implementing an SEO plan, drop Vovéo or myself a line, we will be glad to help :)

Feb 18 2008

Semantic Headers and Menus (CSS of Course)

author:

UPDATE: I’ve since rethought the implications of this post and I wanted to add that using h1 as the company name is the right choice is certain scenarios. I simply wished to provide an alternate method.

First it has been a very long time since my last post. My excuse is that I moved…and I’ve been a bit lazy. I felt the need to address a topic that seems to have been brought up to me several times over past couple weeks: CSS site navigation.

When structuring and styling menus, a definition list (dl) should be used for the site title and navigation.

The first roadblock some have with this method is that they feel the h1 tag must be used for the site title. However, using the h1 for the site title (or company name in many cases) is redundant. Your h1 (although many web builders view it improperly as a semantic whore and over use it), has great value in SEO. Because the h1 is suppossedly the most important content on a specific page, why would one waste it something like Widgets, Inc. when it could be used for maybe a content rich headline: Widgets Inc Provides Solutions for All Widgets Manufacturers. I thought the following experience was very valuable

As I observed a blind web user navigate through a few pages, he reported that hearing the h1 content on top of the page was boring and redundant for him. Because his screen reader read the content of the title element first, the title element served as the actual title of the document for him, and the h1—which merely repeated the content of the title element—was useless.

ALA – October 09, 2006 – Working with Others: Accessibility and User Research”

If a blind human and his screen reading software sees the h1 used as the site title as redundant, I wonder how it impacts how a search engines screen reader views your site’s content. Clearly, using the dt instead of the h1 for the site title is not a bad thing whatsoever.

Secondly, if you look at the meaning of a definition list you will see that it fits perfectly with the notion that you are defining your site and there direct relationship between the items (navigation). And to reiterate the advantage of not using h1 for your site name, the site navigation is now minimized along with the site name – since both are redundant and not content rich.

Lastly, stlyling your header is now a breeze. And your html structure won’t give you indigestion due to enclosing your site name, navigation, and possibly one other element (perhaps search) in a wrapping div.

Here is the markup for your reference (or view a mockup page). Using this method, you can easily create dropdown menus by nesting another unordered list (ul) inside of the list item (after the anchor tag).


<dl class="sitedefinition">
	<dt><a href="/">Company Name, LLC</a></dt>
	<dd>
		<ul>
			<li><a href="/company/">Company</a></li>
			<li><a href="/solutions/">Solutions</a></li>
			<li><a href="/strategy/">Strategy</a></li>
			<li><a href="/clients/">Clients</a></li>
			<li><a href="/partners/">Partners</a></li>
			<li><a href="/contact/">Contact</a></li>
			<li><a href="/">Home</a></li>
		</ul>
	</dd>
</dl>

Sep 27 2007