
How to Videos & Articles:

Wednesday, August 23, 2006

Google Sitemaps: AKA Webmaster Tools Tutorial For the XML Challenged

Copyright © August 23, 2006 by Mike Banks Valentine

Google recently announced a change to their "Sitemaps" program. It went from a protocol meant for Python programmers and XML wizards to a much kinder, gentler (and friendlier to webmasters) program to help get all of your pages crawled and indexed. It's called "Google Webmaster Central." The tools can now be used and understood by most small business site owners.

Google explains everything and lists Sitemaps resources at:

To use Google sitemaps, you must first sign up for a Google Account. If you already use Google Adwords, Analytics, Gmail or other Google provided tools, you can use your existing account to submit a Google sitemap for your site. Get an account at the following URL if you don't already use Google services:


Many webmasters struggle to understand even the simplest HTML and meta tags. After visiting the Sitemaps program page when it was first announced in the summer of 2005, those small business site owners went away sadly shaking their heads and mumbling. They complained, "I can't even add PERL scripts to my own CGI bin and properly set permissions on page files - how am I going to install and debug a Python script on my server, run cron jobs and generate XML files?"

Apparently Google heard all that grumbling and came back with the newly released "Webmaster Central" to answer the concerns of excess complexity. They no longer require you to be a geek to get all your pages into their index. They've created tools to make the job of submitting all of your pages for inclusion in their index very much easier to handle.

The first listed "Site Status" tool lets you check indexing of your sites. If you enter an address into that search box and press the "Next" button, they'll return a page with a button labeled "Take me to Google Sitemaps" that encourages use of the sitemaps tools, regardless of whether you've already submitted a sitemap or not. They'll list some minor details about the site entered, such as:

Pages from your site are included in Google's index. Some of these pages are indexed without a title or description. Googlebot last successfully accessed your home page on Aug 18, 2006.

They list "Potential indexing problems" and then state:

More details about your site may be available. By using Google Sitemaps, you can learn details available only to site owners, such as:
  • errors Googlebot encountered while crawling your site
  • top search queries that return your site


Let's back up for a moment, though. Webmasters have been told for ten years now to build a sitemap into their web site that lists all of their pages (if it is a small web site with under a hundred pages) or at least lists the major sections of their site (if they have thousands or tens of thousands of pages). So what is the difference here?

Google sitemaps are actually XML documents (not public html pages) that hold much more information about your web pages to help Google determine several things. They list the "priority" or importance, "last modified" date, and "change frequency" of each page. But creating those documents required webmasters to install that Python script on their server. Available at:
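Whichever way it is generated, the resulting XML document looks roughly like this sketch - one `<url>` entry per page, with the priority, last-modified and change-frequency values the article describes. (The URL and values here are hypothetical, and the namespace shown is the one from the public sitemap protocol documentation; Google's own page shows the exact form to use.)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Hypothetical page; replace with your own URL and values -->
    <loc>http://www.example.com/articles.html</loc>
    <lastmod>2006-08-23</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
```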

Or webmasters had to use third party software to generate the required XML file. Google recommends a brief list of sources for third party software to help them programmatically create the XML sitemaps:
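As a rough illustration of what such generator software does under the hood - this is a minimal sketch, not any vendor's actual tool, and the URLs are hypothetical - a sitemap generator just walks a list of pages and emits the XML entries:

```python
# Minimal sketch of an XML sitemap generator.
# Assumption: you maintain the page list by hand; real tools crawl the site.
from xml.sax.saxutils import escape


def build_sitemap(pages):
    """Build an XML sitemap string from (url, lastmod, changefreq, priority) tuples."""
    entries = []
    for url, lastmod, changefreq, priority in pages:
        entries.append(
            "  <url>\n"
            f"    <loc>{escape(url)}</loc>\n"
            f"    <lastmod>{lastmod}</lastmod>\n"
            f"    <changefreq>{changefreq}</changefreq>\n"
            f"    <priority>{priority}</priority>\n"
            "  </url>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )


# Hypothetical page list for a small site
pages = [
    ("http://www.example.com/", "2006-08-23", "weekly", "1.0"),
    ("http://www.example.com/articles.html", "2006-08-01", "monthly", "0.5"),
]
print(build_sitemap(pages))
```

Saving that output as sitemap.xml and uploading it to the server root is essentially all the third party tools do, with crawling and scheduling layered on top.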

I've personally tried several of those third party tools and found two of the web-based sitemap generators lacking, and one of the downloaded software tools crashed my computer (and created havoc for me). So what is a small business owner without programming skills to do?

Those business owners who are non-programmer types and want to use Google Sitemaps complained that Google was favoring geeks over business owners. They wanted a simple way to submit all of their pages to Google without running cron jobs on their server and debugging Python scripts.


Google heard our grumbling and now allows simple lists of URLs in a plain text document. All you have to do is create that list of page files, save it as sitemap.txt and upload it to your server. Then you log in to your Google Webmaster Central (AKA Sitemaps) account and tell them the URL of your sitemap text document.
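For example, a sitemap.txt for a small site is nothing more than one full page URL per line (these URLs are hypothetical - list your own pages):

```text
http://www.example.com/
http://www.example.com/articles.html
http://www.example.com/services.html
http://www.example.com/contact.html
```

No tags, no dates, no priorities - which is exactly why it's the easy route for the XML challenged.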

Before you submit your first sitemap URL on a domain, Google requires you to put a "site verification" meta tag on your site's home page and click a "Verify" button to prove you own the site. Anyone with a Google account and access to your server can do this, and you can remove verification tags placed by anyone with server access who is no longer authorized to see this data.
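As an illustrative sketch, the verification tag is a single meta tag in the head of your home page. The token below is a made-up placeholder - Google's verification page generates the exact tag for your account, so copy that rather than this:

```html
<head>
  <title>Your Home Page</title>
  <!-- Hypothetical placeholder token; Google generates your actual value -->
  <meta name="verify-v1" content="EXAMPLE-TOKEN-FROM-GOOGLE=" />
</head>
```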


In the "Diagnostic" tab, there is a tool that validates your robots.txt file, tells you which pages are restricted by that file, and lists problem URLs and the reasons for the problems. It also lets you make changes to a local copy of your robots.txt file, which shows immediately how the changes would affect the next crawl by all Google bots, including the Adsense and PPC landing page quality crawlers! They warn on that page that local changes don't affect your own robots.txt file and remind you to make the changes to the file on your server.
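For reference, the file that tool validates is a plain text robots.txt like this hypothetical one - adjust the paths for your own site - which tells crawlers which paths they may not fetch:

```text
# Hypothetical robots.txt; replace the paths with your own

# Rules for all crawlers
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

# An extra rule just for Google's main crawler
User-agent: Googlebot
Disallow: /test-pages/
```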

Another useful "Diagnostic" tool lets you set your preference for canonical URLs to the www or non-www version of your site. (This last item shows how seriously the Google team takes the canonical issue.)


Under the "Statistics" tab in Webmaster Central are "Query stats, Crawl stats, Page analysis" links with more data on your pages. The Query statistics show the top 20 search queries that searchers have used to find your web site and your top 20 click-through queries. Those data tables provide some interesting and sometimes unexpected detail about how visitors find your site and allow you to further optimize for and funnel those visitors. The "Crawl stats" section promises to show PageRank and the distribution of PageRank throughout your site and in comparison to other sites.

The "Sitemaps" tab simply lists your submitted sitemaps for all your sites and shows the dates "Submitted, Last Downloaded, and Sitemap Status." The status tells you if there are errors and what they were (not allowed, external site links, 404 error page not found, etc.). I've just submitted a sitemap this week on a newly registered, created and posted site, and I will report back on how long index inclusion took, to record the effect of early sitemap submissions.


Finally, there is a "Tools" link in the upper right corner of the "Sitemaps" page which allows you to "Download Data for all sites", "Report Spam in Our Index" and a "Reinclusion Request" link to use if you've been banned for questionable techniques. Clearly, since you are doing all of this from within a Google account, you are openly providing Google with your information and making all spam reporting and reinclusion requests under your name from within a Google account. This suggests that you trust Google with all information they hold on your sites and any complaint made about search engine spam.

Currently there are ratings tools within the Webmaster Central site to let you tell Google whether you like the tools, with a smiley face, a neutral face and a frowny face. This may not last as the program comes out of beta, but it lets you tell them what is useful and what isn't.

Still need some help? Try joining, reading, searching and posting to the Sitemaps and Webmaster Central Google Groups. Posts from webmasters get responses from knowledgeable members. Watch for the little green "G" logo of Sitemaps team members for particularly definitive and useful recommendations.

Want ongoing official Google blog posts about Webmaster Central?

If you've had trouble getting all your pages indexed and want to use those informative and useful webmaster tools and reports - give Webmaster Central a try.

Mike Banks Valentine operates SEOptimism, offering SEO training for in-house content managers as well as contract SEO for advertising agencies, web development companies and marketing firms.

Content aggregation, article and press release optimization & distribution for linking campaigns.

Friday, August 11, 2006

LinkBaiting Advice From the Pros

Link Baiting: For Experienced Marketers Only (Don't Try This at Home!)

Copyright August 10, 2006 by Mike Banks Valentine

Link building sessions at Search Engine Strategies shows are always popular and draw some of the most substantial crowds from the ranks of webmasters and corporate SEOs attending the conferences. But only one extreme form of link building can reel in the big fish and attract industry-wide attention, or land whale sized national media for massive linking on a scale most web site owners can only fantasize about.

Rand Fishkin (an appropriate name for link baiting) was the first of a panel of experienced link bait fishermen to present on the topic at the Search Engine Strategies 2006 show. Fishkin recently landed a shark with excellent bait offered at his site, which he named the "Page Strength Tool." It helps webmasters determine an overall quality score and the ranking factors for their web site beyond simple PageRank. The tool magically reviews dozens of factors contributing to high ranking of sites and reports to webmasters where they're strong and what needs work.

Fishkin started by outlining the elements that lead to success. Step one of link baiting is researching a sector's link-worthiness by doing some discovery of the "big" players in your field. Check del.icio.us tags and Technorati tags on your topic. Can your web site content be tweaked to appeal to pundits in this field?

He claimed that online viral public relations is another name for linkbaiting, since that term seems to carry negative connotations. He recommends selecting a content focus to meld branding and viral elements, and doing keyword research to find popular phrases in your area. Fishkin recommends the "most popular" areas such sites offer as an undervalued resource for researching areas of online buzz and interest. Look for elements that encourage linking.

Fishkin was followed by Cameron Olthuis of Advantage Consulting Services with a presentation titled "Tracking your buzz - Because your reputation depends on it."

Olthuis emphasized how critical it is to monitor perceptions of buzz. Track your buzz using blog search engines and conversation tracking at popular message boards to follow public conversation about your site. Track the right terms to monitor how people perceive your buzz: subscribe to RSS feeds for your company name, company URL, competitors and industry related sites. He recommends that you figure out how best to leverage your buzz once it starts by watching what people say. If you have no knowledge of what is being said, the buzz you do gain could end up being nothing more than chumming for fish - you may attract them, but you'll never catch them without hooking them after they show up.

Join any conversations you do find by commenting on blogs, responding to emails, posting in forums and answering questions, because it keeps your buzz going and can lead to many more links. If buzz is negative, be sure to turn it to good any way you can; take the focus off negative buzz by creating a different controversy. ClaimID is a personal reputation management company he recommended using to track online buzz.

The Mentos and Diet Coke fountain video from YouTube was discussed by Olthuis as an example of taking control of buzz that starts spontaneously. For those unfamiliar with it, the video involves putting the candy into Diet Coke to create a fountain of soda (fizz, in this case, rather than buzz). Mentos took advantage and funneled the traffic to a newly created contest to create your own Mentos fountain. The most popular video, linked to by the home page of the Mentos web site, has been viewed over 6 million times.

Another video that went viral, causing extensive buzz (both positive and negative) in the advertising world, was the video pitch for the Subway sandwich national advertising account. As of this writing, the account review has not been finalized; the video was made by an agency attempting to win the ad account. The negative buzz is coming from within the advertising industry for the most part, largely from competitors. There is no word on whether the pitch worked and landed the Subway account.

Embrace your buzz, regardless of whether it's good, bad or ugly. Measure it in backlinks, brand image, trends and new customers by using Yahoo Site Explorer, blog search engines, Google Trends, Opinmind and Google Analytics. Learn, rinse, repeat.

The next presentation was by Jennifer Laycock, editor of SearchEngineGuide. Her advice is to "give them something to talk about." Why use link baiting and viral marketing? The cost is in the idea, not the marketing. Not just any idea will do; it must be something worth talking about. Once you have that idea, there is almost no cost involved. The technique creates brand evangelists and gives people a reason to talk about your product. Because it is driven by passion, it creates a better conversion response at a rapid rate.

She gave the example of the "Subservient Chicken" game on the web site from Burger King - did it sell any chicken? She emphatically claims that it is "not about selling chicken, it's about branding and awareness." She observed that it had resulted in hundreds of millions of visits, with an average time on the site of over 7 minutes - unheard of for most sites. Laycock says it was about "making a brand cool." She asserts that the campaign resulted in a new demographic of web savvy, mostly young visitors becoming suddenly interested in Burger King.

Lack of brand control is the issue she suggests is the downside of viral marketing. Laycock warns that there is no control over who gets your message or how it is sent - unbridled growth of a viral message and a complete lack of control over how quickly, or even where, you grow. The popularity is often hard to measure.

Laycock gave tips for creating the idea by suggesting that you ask yourself, "What sparks passion in my customers? What hasn't been done before? How will your idea benefit your users? Will your audience risk their own reputation on it?" Ideas spread because they are important to the spreader, not the originator. A good viral marketing idea is one that builds and works through relationships.

She emphasized that the point of linkbaiting is to "attract eyeballs." Successful link bait makes it easy to spread the word by providing tools or simple methods of sharing. This is one key to the success of videos that can be linked through Google video or YouTube.

Laycock recommends scalability be considered before launching and that you must be poised to act if things take off. She urges that it is critical to get beyond the idea itself to exploit motivators. She insists that "people want to be cool, so give them the chance" to do that with your link bait.

She gave the example of Gmail invites as one great method of allowing existing Gmail users to be cool, by having the ability to invite their friends to the service via a simple link in the mail interface. She proposes that linkbaiters "use existing networks and take advantage of other people's resources."

Laycock had her own success in linkbaiting through a fundraising site she began called "Lactivist," which was intended to raise money for a (mother's) "Milk Bank" by selling t-shirts and seeking donations to support breastfeeding over giving formula to babies.

She next used the site to emphasize that "People are Talking and Linking," so we need to pay attention to the impact of blogs. She recommends that everyone doing linkbaiting should "understand the impact of a good post" and, just as importantly, "understand the impact of a bad post." She suggests that the way to "get people to do what you want is by arousing their desires."

People wanted to participate in the series and spread the word about the site. She emphasized the importance of being interested in others, learning their names, visiting their sites and doing active link building. Did the linkbaiting for Lactivist work? Laycock claimed the site produced $2,500 in profit and $1,000 in donations for the milk bank. More than 1,000 incoming links make up 75% of the traffic to the site, with 36,500 total unique visitors. She says the project also produced a SearchEngineGuide ebook for promotional purposes.

The final presentation, titled "Gaining Visibility in the Golden Age of Links," was given by Chris Boggs of G3Group. He recommended what he called "linkbaiting in a search engine friendly fashion" by contacting bloggers specific to your industry. He said, "You can't just put the worm on the hook; you have to throw the hook out and do link building first," because link baiting has become the holy grail of search marketing. Many hooks are baited but cast in shallow waters, without big fish to notice or take the bait. You can't catch a marlin with worms cast on wimpy hooks bobbing in a back country pond.

He noted a bad example of reputation issues created by links gained from negative publicity, pointing to the Comcast customer service fuss caused by a blogger who had a bad experience waiting for a Comcast service rep to show up and then caught him sleeping on video during the visit. That bad PR had the complainer's video ranked at number 5 in a Google search for "Comcast customer service." Comcast has been doing some work to outrank the complainer's video, which has dropped to position #8 as of this writing.

Boggs spent much of his presentation giving a long list of link baiting site examples from the past few months. Notable among them were the "Church of the Flying Spaghetti Monster" and the Air Force One graffiti story (a faked creation that imitated the presidential plane with graffiti spray painted across the wing mounted engine, supposedly done at Andrews Air Force Base). The Air Force One graffiti video can be viewed at the creator's web site.

Each of the examples given produced extreme amounts of controversy or publicity and gained mainstream press coverage. The sites that successfully create link bait are those that produce positive buzz, massive linking and great PR. Each of the speakers gave great advice regarding link bait development that, if followed, may lead to landing the big fish with just a big idea and not necessarily a big budget.

Wednesday, August 09, 2006

Google Dance Photos SES 2006 San Jose

Click any of the photos below for a larger view of that photograph.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

Several thousand Search Engine Strategies show and conference attendees lined up in a long, looping line that wrapped around the huge entryway to the San Jose convention center, waiting to board chartered buses to the Google campus for Google Dance 2006.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

Google staffers (in the blue t-shirts) cheer and applaud arriving SES invited guests at the entrance to the Googleplex at the start of Google Dance 2006.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

One of the first areas Google Dance attendees see upon entry is a row of booths where Google "white coats" demonstrated each of the many Google tools and services. This booth was the demo for Google Desktop Search. The "Dt" on the sign stands for "Desktop," in the style of the Periodic Table of the Elements used for the Google Dance, alongside abbreviations like "Gd" for the Google Dance and "Co" for Google Co-op.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

This is the sign at the entrance to the Google Dance's "Club G," where it was too dark to take any worthwhile photographs. But everyone knows what a dance looks like, so I'll leave that to your imagination.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

Revelers filled the exterior courtyard of this area of the Googleplex, where beer flowed, barbecued food was plentiful, karaoke videos played (ya hadda be there - so no photos) and great desserts were served.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

An overview of the Googleplex courtyard, where several thousand party goers enjoyed the hospitality of the Big G during Google Dance 2006.

Google CEO Eric Schmidt on Privacy, Outhouses & Proprietary Algorithms

San Jose Search Engine Strategies 2006

Copyright August 9, 2006 by Mike Banks Valentine

The sensitivity of search string data is suddenly on everyone's mind due to news of the AOL data leak on a research site this week. Search Engine Watch editor Danny Sullivan was set to interview Google CEO Eric Schmidt in the premier event of the Search Engine Strategies show in San Jose. Guess what his first question was? Give us your thoughts on search data privacy (in so many words), asked while holding up a copy of the New York Times. Schmidt seemed not to flinch at the question he must have known was coming in this informal conversation format arranged by Sullivan.
Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

The leaked AOL data is actually from Google results, but AOL is clearly the bad guy here, since it leaked the search strings of 658,000 of its members over a three month period - some 20 million compromised searches. Sullivan pointed out that the New York Times had been able to trace the somewhat anonymized AOL member names through some very light detective work and found an AOL user whose data was part of the leak. That user was featured in a photograph on the front page and profiled. Sullivan made no connection between Google and that AOL blunder, but asked Schmidt to comment on search privacy and the sensitivity of the data.

Schmidt reminded Sullivan of the recent Department of Justice demand for all search records over a month's time and Google's refusal to give them up to the DOJ. He suggested that Google works very hard to protect user data, and they clearly deserve kudos, both for refusing to give up the search data to the DOJ and for their stance on protecting that type of sensitive information. They went to court with the DOJ to take a stand on privacy of search data, rather than meekly turning it over as the other search engines did. That move gained Google some credibility, which supports the famed "Don't be evil" mantra of the company.

Schmidt recalled several cases of what he called "crazy people" who have been using Google search to plan and carry out all manner of nefarious deeds, from murders to suicide to stalking. He emphasized repeatedly that there is no way to stop bad people from doing bad stuff using search, but that Google is doing all they can to make it a safe environment for users. Schmidt also pointed out how Google makes it easy for users to remove their phone numbers from the Google search results. (Story for another time, but not so easy, and why is it accessible at all?).

Click fraud was next on Sullivan's agenda, and he asked how Google was addressing it within the Adwords pay-per-click system. Schmidt steered clear of discussing the court case, calling it a "business negotiation done in the courtroom." He said they were reluctant to discuss both user and advertiser data and that, again, they are doing all they can to address click fraud.

Sullivan graciously backed off the sensitive topic and moved on to the next news item of the week, the just announced Google partnerships with both MTV and MySpace. The MTV deal will allow Google to sell video ads surrounding clips of hit shows on the network. Early take-up may come only from major advertisers, and it is not clear whether the video ads will be sold with the automated system used for Adwords, since the video ads will clearly need to be vetted for ratings and quality. The MySpace deal means that MySpace will show Adwords ads throughout communities on the social networking site, and Google becomes the sole provider of search to all Fox Interactive Media sites in the deal.

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine

Schmidt took the opportunity to emphasize the value of targeted ads being shown to users in all venues when he pointed to the radio deals, video deals and the standard text ads used by publishers of each of those sources. Sullivan had asked about the value of tracking and metrics for advertisers using Google Analytics, allowing them to do conversion studies and understand the value of their ad spending.

Sullivan used that Adwords conversion stats issue to segue into a more delicate topic when he asked Schmidt to comment on the value of Adsense to publishers, adding the caveat that, since Adsense is so valuable to publishers, it has led to huge amounts of junk on the web from junky publishers filling web site pages with garbage in order to place the contextual ads across loads of informational, but mostly junky, sites. He suggested those sites clog up search results and produce little value for users other than ads to click on. Schmidt replied simply, "These are the problems of success." He went on to explain that Adsense has far exceeded the expectations set for it.

Schmidt discussed the proprietary Google search algorithm and how carefully it is guarded within the company, claiming that it is so closely held that "I have chosen not to know it." That statement had many SEOs in the audience groaning with envy at Schmidt being in a position to know that tightly secured secret algorithm and choosing NOT to know it. A kind of stunned silence was palpable among many in the room, although the statement must have gone right over the heads of the mainstream press in attendance.

Sullivan asked about copyright issues within the "Book Search" and other uses of print and online text within Google searches. Schmidt commented that he had learned how fluid and open to interpretation copyright law was, but that their own legal staff had determined that they were completely within legal limits with book search and "Fair Use" rules within copyright laws. He stated flatly that Google "Must respect copyright" and emphasized their commitment to do so, while claiming that book search was within legal limits.

"Google's mission is to organize the world's information and make it universally accessible and useful." That statement comes from the Google corporate info page and is well known and often quoted. Sullivan moved on to a lighter note with a reference to the Visa slogan adapted to Google, saying, "Google, We're everywhere you want to be!" (Which coincidentally turns up a ZDNet forum post on Google, instead of Visa's catchphrase, as the number one search result at Google.) Sullivan noted that with all the Google products coming online, such as Gmail, search, spreadsheets, calendar, personalized home page, radio ads, video ads, text ads, shopping search, Google Maps, etc., "You might as well give me the Google implant and be done with it!"

Photo copyright 2006 by Mike Banks Valentine
Photo copyright 2006 by Mike Banks Valentine
While that got a good laugh from the audience and even from Schmidt, it did cause some reflection from audience members in my quick glance around to see heads tilting to the side in contemplation of the quip.

Sullivan wrapped it up by asking some lighter questions in rapid fire fashion for quick responses from Schmidt, but one question got a great response from the Google CEO: "How often do you personally search?" Schmidt quickly replied that he routinely searches between 50 and 100 times a day and that he uses searches to inform his business decisions. Schmidt then told the audience that he had once been asked, "Did you know that there are more outhouses than TiVos?" - a question on which Schmidt had some illuminating expanded comments. He later did some of his own searching to find that the claim had previously been true, but that the trend has reversed and there are now more TiVos. He even had percentages to back it up.

That showed that he very much believes in search and uses it himself to, as he said, inform business decisions. Whether that means he is investing in TiVo or getting rid of outhouses is up for debate, but it was an interesting tidbit that may get thrown around in the press coverage of the SES show (I'm sure Schmidt will hope it goes no further than that).

Sullivan asked another question, "What is your favorite Google tool?" to which Schmidt replied, "My favorite is the search tool, but beyond that, it is the 'Certified by Google' tool," referring to what commentators have called GBuy. That allowed a bit of expansion by Schmidt on the new tool, which was clearly something he wanted to get a plug in for as a new service. Schmidt is a likable guy, and everyone allowed him the plug, as Google could use a bit of positive press due to the bad PR it is currently getting over eBay's policy of not supporting "Certified by Google."

Tuesday, August 08, 2006

Google, Yahoo & MSN Research Laboratories: What Makes it from Idea to Product?

Copyright August 8, 2006 by Mike Banks Valentine

The Research Laboratories session at SES San Jose 2006 brought representatives from the top 3 engines to talk about how projects emerge from their labs to become actual search tools. Each offered a different perspective and each seemed to have a differing emphasis on moving from ideas to products.

First up was Peter Norvig, Research Director at Google, who began by asking, "What comes out of research?" He suggested that most of the tools emerging from Google labs are developed in a "Bottom up fashion ... We have a bunch of engineers trying things out and some of them bubble up to the top." He gave several examples and revealed that one of the most popular publisher tools, Adsense, came out of looking for a way to monetize Gmail, the free webmail product.

He showed an example of factual search: "What is the population of Japan?" A Google search on that query produces a direct answer as the first result on the page - 127,417,244 - followed by the source link and more possible sources displayed below. Clear fact based questions can be drawn from authoritative sources, continually updated and displayed as "One Box" results.

He discussed "statistical machine translation," which compares a model of English documents online against models of other languages, using sources such as news stories published in multiple languages for reliable statistical comparison. Norvig proudly displayed results of the National Institute of Standards and Technology (NIST) competition for this type of translation, which show Google coming out on top. They do it by looking at the same text in different languages, using online information, without anyone actually speaking the languages.

Moving to more challenging computational and algorithmic research projects, Norvig discussed work being done on image processing in an attempt at "face localization" - determining, from group photos, where a photo was taken. Identification of people on the web can't be done so easily; the best they've reliably achieved is to determine whether a face is that of a male or a female.

In what appeared to be an unintentional segue, Norvig had mentioned image processing in his presentation and was followed by Bradley Horowitz, VP of product strategy for Yahoo. Horowitz had studied computer vision and imaging before his involvement in search, and he claims that the science has progressed only incrementally over the years. He saw an improvement when he first viewed Yahoo's Flickr image tagging for determining photo content, a way "to avoid the heavy lifting of image processing algorithms." People plus algorithms are greater than algorithms alone, he said. This led to emphasizing the "authority of trust" of social search, relying more on users than on algorithms. He sees engines finding ways to reintroduce "content and metadata" as reliable sources of classification.

Horowitz emphasized his "areas of focus" on community at Yahoo, stressing "better search through people" and social media such as their social tagging site and social photo site, Flickr. He also mentioned the importance of the microeconomics of information navigation and search, with emphasis on the user experience, pointing out that there are "2.6 words on average in the search box." He then turned to Yahoo Answers, where ordinary people ask a question in natural language and ordinary people answer in natural language; turnaround time from question to answer is typically within a day, sometimes within hours.

One function he wished for aloud is probably one many people would love to see from search engines: the ability to ask where the most convenient Starbucks is on his route to the conference. Norvig (of Google) had to be biting his tongue, since Google currently shows exactly that on Google Maps pages linked from an address query.

Horowitz wrapped up by discussing the utility and value he sees in Yahoo Answers pages and suggested those would be factored into the algorithm soon. He also reminded the audience of the recent promotion in which celebrities ask questions for people to answer; Stephen Hawking asked, "Will the universe survive the next 100 years?" That is, of course, NOT the "normal person asking questions" described above by Horowitz, but a PR move by Yahoo bringing in celebrity questions.

James Colborn of MSN discussed AdCenter Labs and the paid search environment, and taking that data to the next level. MSN claims a higher conversion ratio, and tools in AdCenter help advertisers do better with it. He showed a tool estimating the probability that a query is commercial (Microsoft puts the overall figure at 33%; "buy digital camera" rates a 91% probability of commercial intent), a tool built for advertisers but available for anyone to use at any time. A keyword mutation tool shows misspellings, and another handles acronym resolution and expansion. "We're looking for feedback," he said. "If there are things that you don't like, please feel free to tell us as well."

Questions from the audience turned to how "20% time" ideas bubble up and how each lab determines where to focus its time. Ideas from the "20% time" allowed to Google engineers are first submitted to product management teams for review, where they are voted on before moving to the next level and into product development in the labs.

While little of any substance was revealed about possible new products from any of the engines' research labs, it was at least illuminating to hear the differing emphasis and mindset of each company's approach to research in the space. Now if they could only tell me where I left my keys in a "One Box" result, and get it out of beta before I'm late for the Google Dance tonight.

Can you Please Them All? Universal Search Engine Ranking Algorithms

SES San Jose, 2006

Copyright August 8, 2006 by Mike Banks Valentine

Search engine specialists used to spend inordinate amounts of time creating pages that ranked well at just one search engine, exploiting algorithmic weighting of known and very specific ranking factors. But with duplicate content penalties and a growing number of strongly emphasized factors converging, most SEO's are moving toward tweaks to important pages rather than what were once known as "doorway pages" (or alternately, "hallway pages") aimed at just one engine, for dozens of search phrases per engine.

Most SEO firms now realize that the vast majority of referred search traffic comes from Google, followed by Yahoo (often at less than one-third the referral traffic), then MSN at roughly half that again, with Ask trailing far behind at just fractional percentages of the referrals brought by the others. Therefore, most optimization effort is spent on making Google happy, and the others will mostly fall in behind, bestowing rankings at positions similar to those achieved at Google. Still, many are interested in improving positions at Yahoo, MSN and Ask once they have achieved their best rank at the big G.

First speaker Aaron Wall emphasized that algorithms are always in a state of evolution and offered a brief overview comparing observed ranking factors at each of the top search engines. Wall elicited a chuckle from the audience with the webmaster's quip, "A good search engine is one that ranks my sites well; a bad engine is one in which my site does badly." He suggested there is "no such thing as a perfect algorithm" and asserted that, of necessity, SEO techniques evolve with the algo's. Ranking well in one engine does not mean you'll do well in all.

Infrastructure or algorithmic changes may have unintended side effects. Wall mentioned the Google "Sandbox" effect and suggested it was really a side effect of an aging factor added to the algorithm, but that its effect on the index was positive overall, so it was kept. He moved on to the "Big Daddy" infrastructure changes, which for many webmasters meant large numbers of pages temporarily dropping from search results. That was widely discussed in webmaster forums when it produced wide swings in results until the index was able to readjust and settle over a few weeks. Many sites never regained the positions they held before Big Daddy because their optimization emphasized factors that were downgraded by that major overhaul of the Google infrastructure.

Wall mentioned that new publishing formats can create algorithmic "holes" and gave two examples - Wikipedia and blogs. This was an "advanced" session, according to the conference schedule, so terms were not defined and it was assumed that most in attendance understood how different those publishing formats are. He also suggested that many will always attempt to game the system as new formats emerge.

Yahoo was quite literal in its focus for years but recently changed to be more like Google; "nepotistic links" still work there, and its algo's are biased toward commercial sites. MSN is the newest to search and entered when spam was already heavily gaming the system. Google is biased toward information resources like .gov and .edu sites, is best at determining true link quality (bad links can hurt crawl depth), places a lot of weight on domain-level trust, and applies aggressive duplicate content filters. Google also looks much more at linguistic patterns than the others and filters out some hyper-focused pages, which some have called "over-optimized."

He mentioned that Ask is not studied as much as the others due to its small size, so less is actually known about its algo's.

Dave Davies of Beanstalk SEO then took the stand, emphasizing items such as site architecture and URL importance on his first slide, showing standard SEO factors such as key content appearing higher on the page, above the fold. Heads around the room nodded as attendees agreed with the basics as he reviewed each item on the standard SEO checklist.

As he moved to the "code to content ratio," he claimed it is a sizable weighting factor for his clients. While not revealing the names of those clients, he did show several slides of example sites with keyword phrases highlighted and circled on the pages. He claims that nested tables and complex table structure hurt many sites, and that when Beanstalk switched its own site from table structures to "table-less design," it saw immediate increases in ranking with absolutely no changes to content. He went on to discuss SE-friendly URL's and "flat filing" of dynamic content.
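Davies didn't define how his firm measures code-to-content ratio, but the idea is simple enough to sketch: strip the markup and compare how many bytes of visible text remain against the total page size. This is a minimal illustration, assuming one plausible definition of the metric (visible text over total bytes); the sample page and numbers are made up:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the visible text nodes from an HTML document."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def content_ratio(html):
    """Very rough content-to-code ratio: visible text bytes over total bytes."""
    parser = TextExtractor()
    parser.feed(html)
    text = "".join(parser.chunks).strip()
    return len(text) / max(len(html), 1)

# A nested-table page: the markup dwarfs the one line of real content.
page = ("<html><body><table><tr><td><table><tr><td>"
        "Blue widgets on sale today"
        "</td></tr></table></td></tr></table></body></html>")
print(f"{content_ratio(page):.0%}")
```

On a table-less version of the same page the denominator shrinks while the text stays the same, which is the improvement Davies described.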

Davies said that when optimizing for separate engines, MSN is by far the easiest, then Yahoo, then Google. But then he added, "Ranking on MSN is essentially useless though, so I'd rather be on page three of Google than on page one of MSN." He claimed that the Google ranking results in far more traffic, even though few searchers go past the first two pages of results: Google's referral numbers are so much higher that page three of Google results is worth more than page one of MSN. Determining which engine to target first for top ranking is therefore quite easy, and ranking on MSN search is still not as valuable as ranking well on Yahoo.

Other factors include the age of a page, content adjustments, freshness, keyword density, how a page fares in search results (clicks from result pages out to the site), backlinks, visitor stats and user analysis. Referrer analysis shows where visitors are coming from and which keyword phrases are actually converting. Path analysis is important to determine what users do on the site; comparing path analysis across the engines shows which engines refer users who take the most desired action. Do users referred from MSN result in sales or quote requests? He recommends doing that same analysis for each of the engines referring search traffic.
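The per-engine comparison Davies describes can be sketched as a small log pass: identify the engine from each referrer URL, then tally referrals and conversions per engine. The visit data, URLs, and conversion flags below are entirely hypothetical, made up to illustrate the bookkeeping:

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical visit log: (referrer URL, whether the visit converted).
VISITS = [
    ("http://www.google.com/search?q=blue+widgets", True),
    ("http://www.google.com/search?q=widget+reviews", False),
    ("http://search.yahoo.com/search?p=blue+widgets", False),
    ("http://search.msn.com/results.aspx?q=cheap+widgets", True),
]

ENGINES = ("google", "yahoo", "msn")

def engine_of(referrer):
    """Identify which search engine (if any) a referrer URL belongs to."""
    host = urlparse(referrer).netloc
    for engine in ENGINES:
        if engine in host:
            return engine
    return None

referrals = Counter()
conversions = Counter()
for url, converted in VISITS:
    engine = engine_of(url)
    if engine:
        referrals[engine] += 1
        conversions[engine] += int(converted)

for engine in ENGINES:
    if referrals[engine]:
        print(engine, f"{conversions[engine] / referrals[engine]:.0%}")
```

The same loop extends naturally to path analysis by recording which on-site action each referred visitor took, rather than a single converted flag.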

Next up on the panel of experts for the "Can You Please Them All" session was SEO Michael Murray, VP of Fathom SEO, who suggested that audience members make slow, subtle changes to their important pages and to the site overall, and recommended against major reworking. Don't do a complete overhaul; make some baby changes first to get rankings and go from there. He warned attendees, "You can't get all the rankings with one page. Think multiple pages to achieve results for different engines."

Murray provided a fun example of what it takes to woo a search engine when he said, "MSN is easy, it just takes a little kiss." The PowerPoint showed the MSN name and a photo of a Hershey's Kiss. The next slide showed the Yahoo name along with a heart-shaped box of chocolates and a couple talking; Yahoo, he said, is fickle: sometimes slow, but it comes around. Then a slide for Google showed a box of long-stemmed roses, a box of the best Godiva chocolates and a wedding ring; you've got to completely romance Google.

He emphasized important page structure issues and classified them in tiers (first, second and third) of on-page architecture: page titles, headings, and first-paragraph use of keyword phrases. Make good use of the pages you have: don't abandon longstanding pages that rank for lower-value terms; add more to those pages rather than swapping in new keyword phrases targeting a different term. He gave examples of sites switching content management systems, resulting in complete URL changes site-wide, then asking him, "What happened to our traffic?" It's a question many SEO's have heard, both from new clients coming for initial SEO and from long-time clients who neglected to tell their SEO they planned a site remake. Few understand the ramifications of site redesign on ranking.

Murray then listed keyword development and assessment tools such as Google 300, Wordtracker KEI, web analytics sales data, charting performance, and the influence of root words and derivatives of words. No corporate names or bylines in title tags: use of keywords is far more important than the corporate name or catch phrase. He described a site ranking in 3rd or 4th position on Google until the board asked for the company name in the title tag, and it dropped to #22. "Which actually made me happy!" He said he had told them rank would drop if the title tag was used for branding, and he reported the new position to them.

Get ranking for important keyword phrases first, then adjust tags to include any additional info. He recommended that sites not show "breadcrumbs" for sections of the site in title tags, as is often done; Home > Brand > Model > Product is far less important than the actual item description. He recommended what he called a "page freeze" as a backup against lost rankings: save the previous tags on well-ranked pages so they can be reverted if rankings fall following a change. The home page is your best bet for your best words. Test tactics, and go slowly.
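Murray didn't show how a "page freeze" might be implemented, but the idea is just snapshotting the ranking-sensitive tags before each edit. A minimal sketch, assuming the simple case of a title tag and meta tags extracted with regular expressions (the page content and archive structure are illustrative, not his method):

```python
import json
import re

def freeze_tags(html, archive):
    """Snapshot a page's title and meta tags into an archive dict, so a
    well-ranked version can be restored if rankings fall after an edit."""
    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    archive["title"] = title.group(1) if title else None
    archive["metas"] = re.findall(r"<meta[^>]*>", html, re.I)
    return archive

page = ('<head><title>Blue Widgets On Sale | Free Shipping</title>'
        '<meta name="description" content="Blue widgets, shipped free.">'
        '</head>')

archive = freeze_tags(page, {})
print(json.dumps(archive, indent=2))
```

In practice the archive would be written to disk (or version control) with a date stamp per page, giving exactly the revert path Murray describes.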

The session turned into a list of best practices and SEO basics by the time all were done, but it proved that it never hurts to pay close attention to what matters most. When top SEO's present that laundry list of issues, it emphasizes how important the basics are to top ranking. The title question, "Can You Please Them All," was essentially answered: "Yes, if you use best practices and target Google."

Monday, August 07, 2006

Leveraging Social Media, SES San Jose, 2006

Attending morning SES (Search Engine Strategies) sessions after a 4am wake-up, airport crowds, public transit shuffles and schlepping luggage all morning, I was a bit cranky by the time I finally arrived at the San Jose Convention Center. So please forgive my mood here, but the "Leveraging Social Media" session seemed a bit lightweight and without any solid suggestions for attendees.

First speaker Gary Stein, Director of Strategy at Ammo Marketing, mentioned the famous "Jib Jab" videos as viral social content and skimmed over stats from a Pew Internet study on why people say they blog. Many people do it for "creative expression," most do it to document their experiences, and the third most popular reason was to share knowledge and experience.

That third ranked reason seems the best reason for marketers, the easiest to exploit for commercial reasons and the most likely to lead to valuable, lasting and worthwhile (marketable) content. Stein's role is that of a "Word of Mouth" or "WOM" marketer, so one can see why Jib Jab and positive blogger endorsements are near and dear to his heart.

Reputation management and true Word of Mouth (WOM) can span a very wide chasm between start-ups and potential success for the lucky product marketer - but there are also bad boys out there badmouthing vulnerable companies and products to extort "protection money" from nervous early stage companies. Many SEO's have been approached by anxious company reps seeking a way to overcome bad (high ranking) blog and forum posts for their company name or product trademark.

But suggesting that businesses go out and gain all those positive blog comments and actually getting those endorsements from bloggers in your marketing space are two very different things. All marketers would dearly love some clear and direct methods of gaining those social kudos online, short of "astroturfing" (supposed grassroots marketing planted by WOM marketers).

Next up on the panel was Scott Meyer, President and CEO, who made a quick intro to his portion of the panel by offering four key (power) points.

  1. Success in social media equals engagement plus authenticity, times target audience reach.
  2. Look for the riches in the niches; social media takes many forms.
  3. Learn, but don't be intimidated.
  4. Cede only as much control as you are comfortable with (protect your brand).
While he suggested that those points were the critical take-aways of his presentation, he did expand on them. He classified the content as "mundane and not sexy" and emphasized that it is editorially controlled, and thus not true social content. (I'll agree to the mundane label and add "somewhat shallow" as my own editorial comment.) Meyer emphasized several advertiser tie-ins to content and pointed out the recent NBC Torino Winter Olympics 2006 event coverage by guide James Martin.

The social media label has been applied widely in this new space, and more forms of that amorphous category are emerging every day. One of those emerging is the new "Plum," where entrepreneur Hans Peter Brøndmo is doing something that might be called a variation on sites where people "collect" stuff and tag it. The site is not officially launched as yet, but descriptions on the "learn more" page suggest it will share aspects of both of those, plus a few more.

Brøndmo outlined social content creation with a reference to a variant of the famed 80/20 rule where 80% of content is created by 20% of users.

An aside here: I love that the top search result for the 80/20 rule or "Pareto Principle" is, since we just heard from the top man at, I classified it myself as "shallow" and it turned up while researching "Social Media" in a story on search engine strategies. Rich.
He suggests a variation on that, at 90:10:1, meaning 1% of people contribute content, 10% participate in the dialogue (comment or discuss), while 90% are consumers only, suggesting what he called "info-voyeurism": "We like to watch." Brøndmo suggests that "open source marketing" asks the question, "Can you control a mob?" and proposed an answer of sorts: you do it through "trust" in a community or system.

Wrapping up the presentations was Brian Monahan, VP of the ITG Emerging Media Lab and director of its "user generated content practice," with what he referred to as the self-expression of "Me" media. Monahan showed some free-form video clips solicited from several video bloggers, made in response to a questionnaire provided to them.

Those video clips contained several smart (and funny) remarks from the video bloggers in response to the questions presented. Monahan referred to those respondents as content "generators." He said the study suggests they were highly opinionated, craved recognition, were "class clowns," and were sarcastic and reactive rather than original, with not much input beyond "I like it or not."

The conclusion drawn by each of the speakers appears to be that user generated content and social media are powerful beyond belief and are changing marketing in ways we have yet to fully grasp. Attendees looking for ways to fulfill the session title (Leveraging Social Media) probably went away hoping they can find a way to exploit social media to their advantage, but were not given any real suggestions short of using the companies represented on the panel to advertise or market in one way or another. I'd say they have failed to leverage this SES reporter's blog, for example. ;-)