Google's Big Daddy



Google has several data centers housing its index, and anyone familiar with the Google dance will know what I am talking about. The dance is what occurs when one data center is not returning the same results as another. Someone searching for their keywords in LA will often get different results from someone searching for the same words in New York. The data is syncing, or "dancing," as SEO'ers have termed it.

A quote by PhilC at webworkshop sums it up nicely:

"Google has quite a few separate datacenters (DCs), each of which contain the entire index and the entire algorithms. To all intents and purposes, they are independent of each other. They don't all contain identical indexes, and they don't all contain identical algorithms (programs that do the rankings). It means that they often produce different results to each other.

When you do a search, you get the results from whatever datacenter Google chooses at that time. Unless you search a specific DC's IP address, Google chooses the DC to return the results from, and they choose it with every search you make, including when you click to get the next page of results. It's not uncommon for the next page of results to be provided by a different DC than the previous page of results."

The location of these DCs is important to any SEO'er, as they can often be used to determine PageRank (PR) scores and ranking changes during an update. Chasing these updates is what we do. Living in Thailand, I also use these servers to see search results as I would in North America, since the .co.th Google server is a bit slow at times in propagating updates.
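To make the idea of comparing data centers concrete, here is a minimal sketch in Python. It assumes you have already collected the ranked result URLs for one keyword from two DCs by hand; the function name and the sample lists are hypothetical and only illustrate the comparison.

```python
def rank_changes(results_a, results_b):
    """Return {url: (rank_in_a, rank_in_b)} for URLs whose position differs.

    Ranks are 1-based; None means the URL is missing from that list.
    Each list is assumed to contain every URL at most once.
    """
    rank_a = {url: i for i, url in enumerate(results_a, start=1)}
    rank_b = {url: i for i, url in enumerate(results_b, start=1)}
    changes = {}
    # Walk list A first, then any URLs that appear only in list B.
    for url in list(results_a) + [u for u in results_b if u not in rank_a]:
        pair = (rank_a.get(url), rank_b.get(url))
        if pair[0] != pair[1]:
            changes[url] = pair
    return changes


# Hypothetical results for the same keyword from two data centers.
dc_one = ["example.com/a", "example.com/b", "example.com/c"]
dc_two = ["example.com/b", "example.com/a", "example.com/d"]

for url, (rank_one, rank_two) in rank_changes(dc_one, dc_two).items():
    print(url, rank_one, "->", rank_two)
```

An empty result means both data centers agree for that keyword; anything else is the dance in action.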

Now for the news. As Matt Cutts pointed out on his blog, Google is readying a major change in the way it handles its data, appropriately dubbed 'Big Daddy'. (For those who don't know, Cutts is a software engineer at Google and an all-around cool guy who shares SEO tips on his blog.) The new BigDaddy data center contains new code for examining and sorting the Web and, once it has been fully tested, will become the default source for Web results, according to Matt. In a January 4 post on his blog, Cutts said that this might happen in early February or March of this year.

But what does Big Daddy mean for SEO? According to Rob Sullivan, a well-known organic search strategist at Enquiro: "If an algorithm update is like putting new tires on a car or installing a new stereo system, this BigDaddy is like putting in a whole new motor. They're totally revamping how Google works and resolving some long-standing issues with getting sites indexed properly." Among these long-standing issues are:

* Canonicalization. This is a fancy search-industry term describing how a search engine decides which of a series of related URLs is the proper one to insert into the Google index.
* Duplicate Content. See my article from yesterday: Duplicate Content Penalties.
* 302 redirects. This nefarious technique has long been used by black hats to hijack search rankings by providing a redirect while still maintaining an innocent-looking ranking description.
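Canonicalization is easier to picture with an example. The sketch below is not Google's method, which is not public; it simply shows, under assumed rules, how several variants of a home page URL could be collapsed into one form. The function name, the list of default document names, and the choice to treat www and non-www as the same site are all illustrative assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; real servers may use others.
DEFAULT_DOCS = ("index.html", "index.htm", "index.php", "default.asp")


def canonical_url(url):
    """Collapse common variants of a URL into one illustrative form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: www and non-www are the same site
    if parts.port:
        host = f"{host}:{parts.port}"  # keep an explicit port if present
    path = parts.path or "/"
    for doc in DEFAULT_DOCS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]  # drop the file name, keep the slash
            break
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


variants = [
    "http://www.example.com",
    "http://example.com/",
    "http://EXAMPLE.com/index.html",
    "http://www.example.com/index.htm",
]
for url in variants:
    print(url, "->", canonical_url(url))
```

All four variants map to the same address here, which is the decision a search engine has to make before it can pick one URL for its index.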

Now, how Google will tackle these issues is a closely guarded secret, but there's a twist. In the past, Google's data center IPs changed almost daily, prompting a server hunt feverishly carried out on many SEO forums. This time around Google has opened the floodgates, and Matt has publicly revealed a pair of server IPs on his blog for testing and feedback by the community: 66.249.93.104 and 64.233.179.104. Matt regularly discusses the future of search and has also detailed a new Google spider bot that is more flexible, quicker, and able to read JavaScript and Flash files. The bot is built on a Mozilla browser and promises to read all non-text content.
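For testers who want to run the same query against both of the IPs quoted above, a small helper can build the addresses to paste into a browser. This is only a sketch: the function name is hypothetical, and it assumes each IP answers on Google's usual /search path with a q parameter. It makes no network requests.

```python
from urllib.parse import urlencode

# The two test IPs quoted above from Matt's blog.
BIG_DADDY_IPS = ["66.249.93.104", "64.233.179.104"]


def search_urls(keywords, ips=BIG_DADDY_IPS):
    """Build one search URL per data center IP for the same query."""
    query = urlencode({"q": keywords})
    # Assumption: the IP serves results on the usual /search path.
    return [f"http://{ip}/search?{query}" for ip in ips]


for url in search_urls("thailand hotels"):
    print(url)
```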

"As Web technology develops and we get richer and more interactive Web sites, [the search engines] can't just stick with just indexing hyperlinks and text," Sullivan says. "They're going to have to do everything."

About the author:

Miles Evans provides in-depth reviews of every SEO/marketing tool or killer app he can get his paws on. His reviews, essays, and tools on SEO, OLM, reporting, and other equally fascinating subjects are normally published at ProfitPapers.com. Stop by and check out the free backlinks page.