How to improve SEO for an SPA that uses a .NET back-end?

When Google and other search engines index websites, they don’t execute JavaScript. This seems to put Single Page Applications (SPAs), which rely heavily on JavaScript, at a tremendous disadvantage compared to traditional websites.

If you’re running an SPA with content that you’d like to appear in the search results of Google and other search engines, you have to make that content indexable. Historically, AJAX applications have been difficult for search engines to process because AJAX content is produced dynamically by the browser and is therefore invisible to crawlers, which cannot execute JavaScript. The browser can execute JavaScript and produce content on the fly; the search crawler cannot. To let the crawler see what a user sees, the server needs to give the crawler an HTML snapshot: the HTML that results from executing the JavaScript on your page. An HTML snapshot allows the web server to return to the crawler the HTML created from static content as well as the HTML produced by executing JavaScript for each of the application’s pages.

Solution:

Here is a diagram from Google depicting how a crawler indexes an application that supports the AJAX crawling scheme using HTML snapshots, and how this improves Search Engine Optimization (SEO).

Ajax Crawler Diagram (Graphic by Katharina Probst)

In your SPA, replace the hash fragments (e.g. #myForm) with hash bangs (e.g. #!myForm).

For example, replace

www.example.com/index.html#myForm

with

www.example.com/index.html#!myForm (which is available to both crawlers and users).

How do we create different hash bangs for different content under the same URL in an SPA?

If you are using KnockoutJS you can use SammyJS or PagerJS to support hash-based routing. See http://stackoverflow.com/a/9707671/798727 for an example.

If you are using AngularJS, the ngRoute module is available in the framework itself. See http://stackoverflow.com/a/16678065/798727 for how to use it.

When the crawler sees the hash bang (#!) it knows that the site supports the AJAX crawling scheme on its web server. You have to provide the crawler with an HTML snapshot of this URL so that the crawler sees the content. How will your server know when to return an HTML snapshot instead of a regular page? The answer is in the URL that the crawler requests: the crawler will modify each AJAX URL such as

www.example.com/index.html#!myForm

to

www.example.com/index.html?_escaped_fragment_=myForm

There are two very important reasons why the hash bang is necessary:

  • Hash fragments are never (by specification) sent to the server as part of an HTTP request. In other words, the crawler needs some way to let your server know that it wants the content for the URL http://www.example.com/index.html#!myForm (as opposed to simply http://www.example.com/index.html).
  • Your server, on the other hand, needs to know that it has to return an HTML snapshot, rather than the normal page sent to the browser. An HTML snapshot is all the content that appears on the page after the JavaScript has been executed. Your web server returns the HTML snapshot for http://www.example.com/index.html#!myForm (that is, the original URL!) to the crawler.

When the crawler sees the hash bang it replaces it with “_escaped_fragment_” before making the request to the web server to index that page. For example:

www.example.com/index.html?_escaped_fragment_=myForm.

When the web server sees “_escaped_fragment_” in the URL, it knows that the request is from a crawler. The web server can then pass the request to a headless browser, which produces the HTML snapshot that is served back from the server.
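
Here is a minimal sketch of that URL mapping on the server side. EscapedFragmentHelper and ToHashBangUrl are illustrative names (not part of any framework), and any other query-string parameters are ignored for brevity:

    using System;
    using System.Web;

    public static class EscapedFragmentHelper
    {
        // Maps the crawler's URL, e.g.
        //   http://www.example.com/index.html?_escaped_fragment_=myForm
        // back to the original hash-bang URL, e.g.
        //   http://www.example.com/index.html#!myForm
        // which the headless browser then loads just as a real user would.
        public static string ToHashBangUrl(Uri crawlerUrl)
        {
            var query = HttpUtility.ParseQueryString(crawlerUrl.Query);
            string state = query["_escaped_fragment_"] ?? string.Empty;

            var builder = new UriBuilder(crawlerUrl) { Query = string.Empty };
            return builder.Uri.ToString() + "#!" + state;
        }
    }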

How to create HTML snapshots on the web server?

If you are a .NET developer you could use ASP.NET MVC with PhantomJS. Create an [AjaxCrawlableAttribute] which redirects every request with “_escaped_fragment_” in the query string to an HtmlSnapshotController. The HtmlSnapshotController launches PhantomJS.exe to create the HTML snapshot. You can get PhantomJS.exe from the NuGet gallery. Please see this article for the detailed implementation steps: http://stackoverflow.com/a/18530259/798727.
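
Here is a rough sketch of what that attribute and controller might look like, assuming phantomjs.exe and a PhantomJS rendering script (createSnapshot.js is a placeholder name) sit in the application’s bin folder; see the Stack Overflow answer above for a complete implementation:

    using System.Diagnostics;
    using System.Web.Mvc;
    using System.Web.Routing;

    // If the crawler's "_escaped_fragment_" parameter is present, reroute the
    // request to the HtmlSnapshot controller instead of the normal page.
    public class AjaxCrawlableAttribute : ActionFilterAttribute
    {
        private const string Fragment = "_escaped_fragment_";

        public override void OnActionExecuting(ActionExecutingContext filterContext)
        {
            var request = filterContext.RequestContext.HttpContext.Request;
            if (request.QueryString[Fragment] != null)
            {
                filterContext.Result = new RedirectToRouteResult(new RouteValueDictionary
                {
                    { "controller", "HtmlSnapshot" },
                    { "action", "ReturnHtml" },
                    { "url", request.Url.ToString() }
                });
            }
        }
    }

    // Runs PhantomJS against the requested URL and returns the rendered HTML.
    public class HtmlSnapshotController : Controller
    {
        public ActionResult ReturnHtml(string url)
        {
            string appRoot = Server.MapPath("~/bin");

            var startInfo = new ProcessStartInfo
            {
                // createSnapshot.js is assumed to load the url, wait for the
                // SPA to finish rendering and print the resulting HTML.
                FileName = appRoot + @"\phantomjs.exe",
                Arguments = string.Format(@"{0}\createSnapshot.js {1}", appRoot, url),
                UseShellExecute = false,
                RedirectStandardOutput = true,
                CreateNoWindow = true
            };

            string html;
            using (var process = Process.Start(startInfo))
            {
                html = process.StandardOutput.ReadToEnd();
                process.WaitForExit();
            }

            return Content(html, "text/html");
        }
    }

Decorate your MVC actions with [AjaxCrawlable] (or register it as a global filter) so that every request carrying “_escaped_fragment_” takes the snapshot path.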

If you do not want the headache of creating, maintaining and scaling HTML snapshots on your own web server, check out the following online SaaS options (a sketch of forwarding crawler requests to such a service follows the list):

  1. BromBone is built on Node.js, PhantomJS, Amazon AWS SQS, AWS EC2 and AWS S3. BromBone supports sites that use HTML5 pushState URLs instead of hashbang URLs. They do not offer a free trial plan, but they do offer a no-questions-asked money-back guarantee. If you have any questions contact Chad DeShon (Founder of BromBone) at Chad@brombone.com. Check them out at http://www.brombone.com.
  2. AjaxSnapshots runs multiple snapshotting servers on Amazon AWS: a Java-based dispatcher sends requests on to one of the PhantomJS-based headless servers. They use Amazon AWS SQS, AWS EC2, AWS ELB for load balancing, and AWS S3. They offer a free trial plan. They also claim that the PhantomJS script they run benefits from many modifications they have made to deal with corner cases that trip up naive implementations. If you have any questions contact Robert Dunne (Founder of AjaxSnapshots) at support@ajaxsnapshots.com. Robert also wrote a nice summary of which search and social bots are snapshot-aware: http://blog.ajaxsnapshots.com/2013/11/googles-crawlable-ajax-specification.html. Check them out at https://ajaxsnapshots.com.
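
If you use one of these services, the only server-side change is to forward crawler requests to it. Below is a rough sketch using a purely hypothetical endpoint (https://snapshots.example.com/render?url=...); check your provider’s documentation for the real URL format and authentication:

    using System.Net;
    using System.Web.Mvc;

    // Instead of running PhantomJS locally, forward crawler requests to an
    // external snapshot service and return whatever HTML it produces.
    public class RemoteSnapshotAttribute : ActionFilterAttribute
    {
        private const string Fragment = "_escaped_fragment_";
        private const string ServiceEndpoint = "https://snapshots.example.com/render?url=";

        public override void OnActionExecuting(ActionExecutingContext filterContext)
        {
            var request = filterContext.RequestContext.HttpContext.Request;
            if (request.QueryString[Fragment] == null)
            {
                return; // a normal user: let the SPA render in the browser
            }

            using (var client = new WebClient())
            {
                // Ask the snapshot service to render the originally requested page.
                string html = client.DownloadString(
                    ServiceEndpoint + WebUtility.UrlEncode(request.Url.ToString()));

                filterContext.Result = new ContentResult
                {
                    Content = html,
                    ContentType = "text/html"
                };
            }
        }
    }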

How to test headless browser content?

It’s highly recommended that you try out your HTML snapshot mechanism. It’s important to make sure that the headless browser indeed renders the content of your application’s state correctly. Surely you’ll want to know what the crawler will see, right? To do this, you can write a small test application and inspect the output, or you can use a tool such as Fetch as Googlebot. A .NET developer could use NHtmlUnit. NHtmlUnit is a .NET wrapper of HtmlUnit, a “GUI-less browser for Java programs”. It allows you to write code to test web applications with a headless, automated browser.
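
One simple way to smoke-test the mechanism, sketched below, is to request the escaped-fragment URL directly and check that the JavaScript-rendered content is present. The URL and the expected text are placeholders for your own page and content; NHtmlUnit or Fetch as Googlebot can be used for a more thorough check.

    using System;
    using System.Net;

    // Request what the crawler would request and inspect the HTML snapshot.
    class SnapshotSmokeTest
    {
        static void Main()
        {
            // Placeholder URL and expected text: substitute your own page and
            // a string that only appears after the JavaScript has executed.
            const string snapshotUrl =
                "http://www.example.com/index.html?_escaped_fragment_=myForm";
            const string expectedText = "My Form";

            using (var client = new WebClient())
            {
                string html = client.DownloadString(snapshotUrl);

                Console.WriteLine(html.Contains(expectedText)
                    ? "OK: snapshot contains the JavaScript-rendered content."
                    : "FAIL: snapshot looks like the empty SPA shell.");
            }
        }
    }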

Google recommends the following steps to make your SPA crawlable:

  1. Indicate to the crawler that your site supports the AJAX crawling scheme.
  2. Set up your server to handle requests for URLs that contain _escaped_fragment_.
  3. Handle pages without hash fragments.
  4. Consider updating your Sitemap to list the new AJAX URLs.

For the detailed implementation of the above steps, see Google's Guide to AJAX crawling for webmasters and developers. You might find Making AJAX Applications Crawlable useful too.

Summary

In summary, starting with a stateful URL such as http://www.example.com/index.html#myForm, you make it available to both crawlers and users as http://www.example.com/index.html#!myForm, which the crawler requests as http://www.example.com/index.html?_escaped_fragment_=myForm. Using a modern headless browser, the web server can then return the fully rendered content for each such request by redirecting bots to the snapshot path.

References

In 2009 Google released the idea of escaped fragments.
http://www.singlepageapplicationseo.com
http://www.branded3.com/blogs/javascript-back-buttons-seo-dont-mix
http://diveintohtml5.info/examples/history/fer.html
http://googlewebmastercentral.blogspot.com.au/2009/10/proposal-for-making-ajax-crawlable.html
http://stackoverflow.com/questions/18530258/how-to-make-a-spa-seo-crawlable
https://developers.google.com/webmasters/ajax-crawling

Services

http://www.brombone.com/
https://ajaxsnapshots.com/configGuide#Tellingsearchenginesyouprovidesnapshots
https://github.com/prerender/prerender

Reviewers

A special thanks to Chad DeShon (Founder of Brombone) and Robert Dunne (Founder of AjaxSnapshots) for reviewing this blog.

Posted in Javascript, Single Page Application

Microsoft TechEd Australia 2013

TechEd is Microsoft’s ultimate technology geek fest for IT Professionals and Enterprise Developers looking to explore a broad set of Microsoft technologies, tools, platforms and services. This year TechEd Australia celebrated 20 years. I felt lucky to be able to attend TechEd and watch many top Australian and international speakers. This was my third TechEd and, as usual, I was very happy with the outcome.

This year the highlights for me were the following:

  1. There were some great talks where you learn a lot about the products first hand. Some of my favorite talks were:
  2. I was invited to lunch with Scott Guthrie (@scottgu), Corporate VP, Microsoft. Microsoft gave us red polo shirts with a ScottGu JSON print on them. We had to wear these polo shirts to enter the venue to meet Scott Guthrie. I spoke to Scott about the modern Windows Azure (Skynet in 10 years). After the lunch Scott gave two impressive talks: Building Real World Cloud Apps with Windows Azure Part 1 and Building Real World Cloud Apps with Windows Azure Part 2.
    Diganta Kumar and Scott Guthrie at TechEd2013

    Scott Guthrie lunch polo shirt with JSON print as ticket

  3. I met Mads Kristensen (@mkristensen), Program Manager, Visual Studio, Microsoft. I spoke to him about the new VS2013 features and how he developed the Browser Link feature and added AngularJS IntelliSense support to the IDE. Mads is also the author of Web Essentials, which is an open source project on GitHub. Mads did two talks: What’s New in Visual Studio for Web Developers and Extending Visual Studio.
  4. I met Brady Gaster (@bradygaster), Program Manager, Windows Azure SDK, Microsoft. Brady was the first person from Microsoft I met who liked to talk about BDD (Behaviour Driven Development). We got along very well and continued our talk later at the pub. If you are interested in BDD, Brady recommended SpecFor. Check it out! Brady did two talks: SignalR – Why Should Web Pages Have All the Real-time Fun and Windows Azure Web Sites and On-Premises.

    Brady Gaster and Diganta Kumar at TechEd 2013

  5. I also met Frank Arrigo (@frankarr), Principal Technical Evangelist, Microsoft. We spoke about the Windows 8 app I developed last year. We also spoke about how Windows 8 is not picking up within government agencies. Maybe the agencies will be keener on Windows 8.1 as it has a Start button now. Windows 8.1 is being released on 18 October 2013.

    Frank Arrigo and Diganta Kumar at TechEd 2013

  6. On Thursday, from 6:15pm to 9pm, Microsoft took all TechEd attendees to the Movie World theme park, which was a good break from the continuous hammering of my brain with all the new products and technologies.

Like every year, TechEd again provided a nice backpack and a special bottle which makes a dolphin sound when you drink water from it. All attendees also received a free copy of Office 365 Home Premium.

TechEd 2013 Bag, Bottle & Office365

Posted in Conference, Microsoft

Microsoft Windows 8 #AppFest Sydney 2013

At the Microsoft Windows 8 #AppFest Sydney 2013 I built an app for Legal Aid NSW. The video below shows the design steps and the app demo.

Twitter reviews of the presentation:

AppFest Sydney Tweet response

Posted in Windows 8