Thursday, December 15, 2005

AOL Reveals Top Searches for 2005


AOL Search announced the year's top searches based on the topics that received the highest volume of online queries on AOLSearch.com, the AOL.com portal and the AOL service during 2005.

"Millions of people search online through AOL Search for a wide spectrum of things, but there are those terms that are looked-up more frequently than others," said Jim Riesenbach, senior vice president of AOL Search and Directional Media. "From news and people that grab attention to popular products and common queries, the most searched for topics online during 2005 are a reflection of what was top of mind or what people wanted to find more information about."

"Lottery" emerged as the most searched word in 2005, followed by "horoscopes" (no. 2) and "tattoos" (no. 3). New search terms that emerged in 2005 include the addictive puzzle "Sudoku", the irregular minting of the new "Wisconsin Quarter" and the global music event "Live 8."

Here's the complete list:

Top Words: 1) Lottery; 2) Horoscopes; 3) Tattoos; 4) Lyrics; 5) Ringtones; 6) IRS; 7) Jokes; 8) American Idol; 9) Hairstyles; 10) NASCAR.

Top Celebrities: 1) Paris Hilton; 2) Oprah Winfrey; 3) Jessica Simpson; 4) Britney Spears; 5) Lindsay Lohan; 6) Pamela Anderson; 7) Angelina Jolie; 8) Jesse McCartney; 9) Hilary Duff; 10) 50 Cent.

Top Video Search Results: 1) 50 Cent; 2) Paris Hilton; 3) Tsunami; 4) Madonna; 5) Angelina Jolie; 6) Jessica Simpson; 7) Ciara; 8) Pamela Anderson; 9) American Idol; 10) Star Wars.

Top News Stories: 1) Natalee Holloway; 2) The tsunami; 3) Hurricanes; 4) Pope John Paul II; 5) Gas prices; 6) Live 8; 7) Terri Schiavo; 8) Earthquakes; 9) Rosa Parks; 10) The bird flu.

Top Celebrity Couples: 1) Brad Pitt and Jennifer Aniston; 2) Donald Trump and Melania Knauss; 3) Kenny Chesney and Renee Zellweger; 4) Brad Pitt and Angelina Jolie; 5) Jessica Simpson and Nick Lachey; 6) Bow Wow and Ciara; 7) Tom Cruise and Katie Holmes; 8) Pamela Anderson and Tommy Lee; 9) Prince Charles and Camilla; 10) Rob and Amber.

Top Athletes: 1) Danica Patrick; 2) Lance Armstrong; 3) Serena Williams; 4) Eddie Guerrero; 5) John Cena; 6) Maria Sharapova; 7) Derek Jeter; 8) Anna Kournikova; 9) Jackie Robinson; 10) Dale Earnhardt, Jr.

Top Bands: 1) Green Day; 2) My Chemical Romance; 3) Fall Out Boy; 4) The Killers; 5) Black-Eyed Peas; 6) Slipknot; 7) G-Unit; 8) The Wiggles; 9) Pussycat Dolls; 10) Good Charlotte.

Top Songs: 1) "We Belong Together" by Mariah Carey; 2) "Hollaback Girl" by Gwen Stefani; 3) "My Humps" by Black Eyed Peas; 4) "Candy Shop" by 50 Cent; 5) "Listen to Your Heart" by DHT; 6) "Laffy Taffy" by D4L; 7) "Gold Digger" by Kanye West; 8) "Disco Inferno" by 50 Cent; 9) "La Tortura" by Shakira; 10) "Obsession" by Frankie J.

Top TV Shows: 1) American Idol; 2) Big Brother; 3) Wheel of Fortune; 4) Survivor; 5) Oprah Winfrey Show; 6) Days of Our Lives; 7) Lost; 8) The Apprentice; 9) Today Show; 10) Good Morning America.

Top Movies: 1) Harry Potter; 2) Star Wars; 3) Napoleon Dynamite; 4) Dukes of Hazzard; 5) Fantastic Four; 6) War of the Worlds; 7) Lord of the Rings; 8) Batman Begins; 9) White Noise; 10) Constantine.

Top Cars: 1) Ford Mustang; 2) MINI Cooper; 3) Scion; 4) Chevrolet Corvette; 5) Ford GT; 6) Hummer; 7) Dodge Charger; 8) Porsche; 9) Chevrolet Camaro; 10) Honda Civic.

Top Toys: 1) Barbie; 2) Elmo; 3) Bratz; 4) Polly Pockets; 5) Dora the Explorer; 6) Thomas the Tank Engine; 7) Care Bears; 8) American Girl; 9) Legos; 10) Scooters.

Top Gadgets / Gadget Brands: 1) iPod; 2) Cell Phones; 3) Playstation 3; 4) Xbox 360; 5) mp3 Players; 6) XM Radio; 7) Laptops; 8) Palm Pilot; 9) Sirius Radio; 10) GPS.

Top Fashion Brands: 1) Louis Vuitton; 2) Coach; 3) Dooney & Bourke; 4) Baby Phat; 5) Nike; 6) Lacoste; 7) Uggs; 8) Steve Madden; 9) Bebe; 10) Juicy Couture.

Top Denim Brands: 1) Levi's; 2) Apple Bottoms Jeans; 3) Seven Jeans; 4) Lucky Jeans; 5) True Religion Jeans; 6) Wrangler Jeans; 7) Lee Jeans; 8) Pepe Jeans; 9) Evisu Jeans; 10) Diesel Jeans.

Tuesday, December 06, 2005

Page Hijack: The 302 Exploit, Redirects and Google


A page hijack is a technique exploiting the way search engines interpret certain commands that a web server can send to a visitor. In essence, it allows a hijacking website to replace pages belonging to target websites in the Search Engine Results Pages ("SERPs").

When a visitor searches for a term, a hijacking webmaster can replace the pages that appear for this search with pages that (s)he controls. The new pages that the hijacking webmaster inserts into the search engine are "virtual pages," meaning that they don't exist as real pages. Technically speaking they are "server-side scripts" and not pages, so the searcher is taken directly from the search engine listings to a script that the hijacker controls. The hijacked pages appear to the searcher as copies of the target pages, but with a different web address ("URL") from the target pages.

Once a hijack has taken place, a malicious hijacker can redirect any visitor that clicks on the target page listing to any other page the hijacker chooses to redirect to. If this redirect is hidden from the search engine spiders, the hijack can be sustained for an indefinite period of time.

Possible abuses include making "adult" pages appear as, say, CNN pages in the search engines, setting up false bank front ends, false storefronts and so on. All the "usual suspects," that is.

Step by step, the hijack works like this (a minimal sketch of the kind of redirect script involved follows the list):

1) Googlebot (the "web spider" that Google uses to harvest pages) visits a page with a redirect script. In this example it is a link that redirects to another page using a click-tracker script, but it need not be. That page is the "hijacking" page, or "offending" page.
2) The click-tracker script issues the server response code "302 Found" when the link is clicked. This response code is the important part; it does not need to be caused by a click-tracker script. Most webmaster tools use this response code by default, as it is standard in both ASP and PHP.
3) Googlebot indexes the content and makes a list of the links on the hijacker page (including one or more links that are really a redirect script).
4) All the links on the hijacker page are sent to a database for storage until another Googlebot is ready to spider them. At this point the connection between your site and the hijacker page is broken, so you (as webmaster) can do nothing about the following:
5) Some other Googlebot tries one of these links; this one happens to be the redirect script (Google has thousands of spiders, all called "Googlebot").
6) It receives a "302 Found" status code and goes "yummy, here's a nice new page for me."
7) It then receives a "Location: www.your-domain.tld" header and hurries to your page to get the content.
8) It heads straight to your page without telling your server on which page it found the link it used to get there (obviously, it doesn't know; another Googlebot fetched it).
9) It has the URL of the redirect script (which is the link it was given, not the page that link was on), so it indexes your content as belonging to that URL.
10) It deliberately keeps the redirect URL, as the redirect script has just told it that the new location (that is, the target URL, your web page) is only a temporary location for the content. That is what 302 means: a temporary location for content.
11) Bingo, a brand-new page is created (never mind that it does not exist in real life; to Googlebot it does).
12) Some other Googlebot finds your page at its proper URL and indexes it.
13) When both pages arrive at the reception of the "index," they are spotted by the "duplicate filter," which discovers that they are identical.
14) The "duplicate filter" doesn't know that one of these "pages" is not a page but just a link (to a script). It has two URLs and identical content, so this is a piece of cake: let the best page win. The other disappears.
15) Optional, for mischievous webmasters only: for any visitor other than "Googlebot," make the redirect script point to any other page of your choice.
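
For concreteness, here is a minimal sketch of a click-tracker redirect script of the kind described in step 2, written with Python's standard library rather than the ASP or PHP mentioned above; the "url" parameter name, the port and the default target are illustrative assumptions, not anything a real tracker necessarily uses:

    import http.server
    import urllib.parse

    class ClickTracker(http.server.BaseHTTPRequestHandler):
        """Answer every GET with a 302 redirect to the URL given in the query string."""
        def do_GET(self):
            query = urllib.parse.urlparse(self.path).query
            target = urllib.parse.parse_qs(query).get("url", ["http://www.example.com/"])[0]
            # The crucial part: "302 Found" tells the client (or a spider that takes
            # it literally) that the target is only a temporary home for content that
            # "belongs" to this tracker URL.
            self.send_response(302)
            self.send_header("Location", target)
            self.end_headers()

    if __name__ == "__main__":
        # A link such as /click?url=http://www.example.com/page now behaves like
        # the click-tracker link described in step 2 of the list above.
        http.server.HTTPServer(("", 8000), ClickTracker).serve_forever()

The whole problem hinges on that single 302/Location pair: a spider that trusts it records the content under the script's URL instead of yours.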


What Is The Google Sandbox?


Before we get too far into an explanation of what Google's sandbox is, it must be noted that not everyone even agrees that the sandbox exists. The sandbox is really nothing more than a theory developed to explain what many different SEO experts have witnessed with their listings. Whether or not the sandbox itself is real is largely beside the point, since its effects clearly are.

In an age of fair competition you may find it hard to believe that Google hinders the appearance of new websites, yet this is what many believe is currently happening. Google appears reluctant to rank newer websites until they have proven their viability for more than "x" months. The term "Sandbox Effect" thus refers to the idea that all new websites have their rankings placed in a holding tank until Google deems it appropriate for ranking to commence.

It is not so much the website itself that is hindered as the links reciprocated from other sites. Newly created links are placed on "probationary" status until they pick up weight from other, matured sites or are placed directly by an ad campaign. The idea behind the hindrance is to prevent a new website from ranking quickly. The usual holding period seems to be between 90 and 120 days before a site starts gaining rank from reciprocal or back linking.

Is There A Way to Get Out of the Sandbox?

The quick answer to this is yes, there is a way out of the sandbox, but you will not like the answer. The answer is to simply wait. The sandbox filter is not a permanent filter and is only intended to reduce search engine spam. It is not intended to hold people back from succeeding. So eventually, if you continue to build your site as it should be built, you will leave the sandbox and join the other established websites.

Again, if your website has been placed in the sandbox you should use this time to your advantage. It is a great opportunity to build your traffic sources outside of the search engines. If you have a website that does well in the search engines, you may be tempted to ignore other proven methods of traffic building such as building a community, or building strong inbound links through partnerships. However, if you establish traffic sources outside of search engines, when you finally leave the sandbox, you will see a welcome increase in your traffic levels.

How To Play In Google's Sandbox?

Pay Per Click
There is, of course, AdWords, Google's pay-per-click advertising program. If you have a new site and find yourself caught in that aging filter so that your site will not show well in the Google SERPs, why not put aside a budget for an AdWords program? With AdWords, you can instantly gain exposure on Google as well as on many search and contextual partner sites. This can bring traffic to your site as a direct result of people searching at Google or one of its search partners such as Ask Jeeves, Netscape and AOL, as well as others that display AdWords on their sites.

Sure, these will not be the free listings you may get from Google's organic results, but if you watch your bottom line and conversions, you might find that AdWords brings about a very good ROI. Later on, when you start to see your site showing well in the organic results, you can begin to back off your AdWords campaign. Of course, if AdWords is effective for you, you may well just continue both.

Other Search Engines
Don't discount traffic from other search engines such as Yahoo, Ask Jeeves and MSN. If you only focus on Google in your SEO strategy, you might miss valuable traffic that you can receive from these other sites, none of which seems to have any type of aging filter. Besides that, sites that have good "on the page" search engine optimization seem to do very well in these engines. Ask Jeeves is typically very slow to update its index, but Yahoo and MSN are lightning fast at finding new or updated content and including it in their indexes.

Therefore do not neglect optimizing the various elements of your site's pages that these engines factor into their algorithms - title tags, meta description tags and the actual HTML text on your pages. If you optimize these elements properly, you will most likely experience very good placement in these engines and, as such, gain a good quantity of visitors.

Take Advantage of Established Sites
One thing we have recently begun to test with new sites that we are providing marketing services for is to develop a profile page or pages that give a brief summary of the client and their product and/or service. These are also optimized to target some of their most important keywords. We then place this page or pages on an established site, such as a directory we own or a case study section on our site - somewhere it has the possibility of ranking well and sending the client some traffic. When they do finally begin to rank well in Google with their own site, the page or pages are no longer needed and can be removed.

A word of caution here - in doing this we are careful not to simply place duplicate content on another domain. I say that because I don't want people to think I am endorsing duplicate content or mirror sites. The page or pages that are created need to be unique and not just copies of the client's own content.

Patience Is A Virtue
All in all, be patient. Don't continue to tweak and adjust your site hoping that your changes will thrust you onto the first page. Don't pull all the hair out of your head, cursing Google because they won't allow your site to rank well. Simply accept the fact that if you have a new site, it will take quite a while before it ranks well in Google. This will allow you to be more at peace with your marketing efforts as well as have the foresight to look at other alternatives.

Submit your site before it is ready
As soon as you register your domain name, submit it to Google! Even if you have not built your site, written any copy, or even thought about your content, submit your domain name to Google. In fact, even if you have not fully articulated your business plan and marketing plan, submit your domain name to Google.

Sunday, December 04, 2005

Search Engine Robots and Robots.txt

Many search engines use programs called robots to gather web pages for indexing. These programs are not limited to a pre-defined list of web pages; they can follow links on pages they find, which makes them a form of intelligent agent. The process of following links is called spidering, wandering, or gathering.

Controlling Robot Indexing
Robot spiders cannot index unlinked files, so they will ignore all the miscellaneous files you may have in your web server directory. Webmasters can control which directories the robots should index by editing the robots.txt file, and web page creators can control robot indexing behavior using the Robots META tag.

Following Links
Local search robot spider indexers locate files to index by following links, just like webwide search engine spiders. You specify the starting page, and these indexers will request it from the server and receive it just like a browser. The indexer will store every word on the page and then follow each link on that page, indexing the linked pages and following each link from those pages.
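
As a rough illustration of that fetch-store-follow loop (and not how any particular indexer is actually implemented), here is a minimal Python sketch; the ten-page limit and the choice to store raw HTML instead of individual words are simplifications:

    import urllib.request
    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class LinkExtractor(HTMLParser):
        """Collect the href of every <a> tag on a page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def spider(start_url, max_pages=10):
        """Breadth-first crawl: fetch a page, store its content, follow its links."""
        queue, seen, index = [start_url], set(), {}
        while queue and len(seen) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
            except Exception:
                continue                     # skip pages that cannot be fetched
            index[url] = html                # a real indexer would store every word, not raw HTML
            parser = LinkExtractor()
            parser.feed(html)
            queue.extend(urljoin(url, link) for link in parser.links)
        return index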

Link Problems
They will miss pages which have been accidentally unlinked from any of your starting points. And spiders will have problems with JavaScript links, just like webwide search engine robots.

Dynamic Elements
Robot spider indexers will receive each page exactly as a browser would receive it, with all dynamic data from CGIs, SSI (server-side includes), ASP (Active Server Pages) and so on. This is vital for some sites, but other sites may find that the presence of these dynamic elements triggers the re-indexing process even though none of the actual text of the page has changed.

Most site search engines can handle dynamic URLs (including question marks and other punctuation). However, most webwide search engines will not index these pages; for help building plain URLs, see our page on Generating Simple URLs.

Server Load
Because they use HTTP, robot spider indexers can be slower than local file indexers, and can put more pressure on your web server, as they ask for each page.

Updating Indexes
To update the index, some robot spiders will query the web server about the status of each linked page by asking for the HTTP header using a "HEAD" request (the usual request for an HTML page is a "GET"). For HEAD requests, the server may be able to send the page header information from an internal cache, without opening and reading the entire file, so the interaction can be much more efficient. The indexer then compares the modified date from the header with its own date for the last time the index was updated. If the page has not changed, it doesn't have to update the index. If it has changed, or if it is new and has not yet been indexed, the robot spider will then send a GET request for the entire page and store every word. An alternative is for robot spiders to send an "If-Modified-Since" request: this HTTP header option allows the web server to send back a "304 Not Modified" code if the page has not changed, and the entire page if it has.
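
Both update strategies can be sketched with nothing but Python's standard library; the URL and the stored timestamp below are placeholders, and a real indexer would of course persist its dates rather than hard-code them:

    import urllib.error
    import urllib.request
    from email.utils import formatdate

    url = "http://www.example.com/page.html"        # placeholder URL
    last_indexed = 1133654400                        # placeholder: last index update, as Unix time

    # Strategy 1: a HEAD request, comparing the Last-Modified header ourselves.
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        print("Server reports Last-Modified:", resp.headers.get("Last-Modified"))

    # Strategy 2: a conditional GET with If-Modified-Since; the server answers
    # "304 Not Modified" if nothing changed, or sends the whole page if it did.
    conditional = urllib.request.Request(url)
    conditional.add_header("If-Modified-Since", formatdate(last_indexed, usegmt=True))
    try:
        with urllib.request.urlopen(conditional) as resp:
            page = resp.read()                       # changed or new: re-index every word
    except urllib.error.HTTPError as err:
        if err.code != 304:
            raise                                    # 304 means unchanged: keep the old entry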

Duplicate Files
Robots must contain special code to check for duplicate pages, due to server mirroring, alternate default page names, mistakes in relative file naming (./ instead of ../, for example), and so on. Some search indexers have powerful algorithms to identify these duplicates and only store and search one copy.
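
One simple way to catch exact duplicates, shown here only to illustrate the idea (real indexers use far more sophisticated near-duplicate detection), is to hash a normalized copy of each page's text and compare fingerprints:

    import hashlib
    import re

    seen = {}                                        # fingerprint -> first URL stored under it

    def fingerprint(html):
        """Strip tags, collapse whitespace and lowercase, then hash the result."""
        text = re.sub(r"<[^>]+>", " ", html)         # crude tag removal, for illustration only
        text = re.sub(r"\s+", " ", text).strip().lower()
        return hashlib.sha1(text.encode("utf-8")).hexdigest()

    def is_duplicate(url, html):
        """Return True if an identical page has already been stored under another URL."""
        key = fingerprint(html)
        if key in seen:
            return True
        seen[key] = url
        return False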

Search engine robots will check a special file in the root of each server called robots.txt, which is, as you may guess, a plain text file (not HTML). Robots.txt implements the Robots Exclusion Protocol, which allows the web site administrator to define what parts of the site are off-limits to specific robot user agent names. Web administrators can disallow access to cgi, private and temporary directories, for example, because they do not want pages in those areas indexed.

The syntax of this file is obscure to most of us: it tells robots not to look at pages which have certain paths in their URLs. Each section includes the name of the user agent (robot) and the paths it may not follow. There is no way to allow a specific directory, or to specify a kind of file. You should remember that robots may access any directory path in a URL which is not explicitly disallowed in this file: everything not forbidden is OK.

This is all documented in the Standard for Robot Exclusion, and all robots should recognize and honor the rules in the robots.txt file.

Here are some example entries and what they mean:

User-agent: *
Disallow:

The asterisk (*) in the User-agent field is shorthand for "all robots". Because nothing is disallowed, everything is allowed.

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/

In this example, all robots can visit every directory except the three mentioned.

User-agent: BadBot
Disallow: /

In this case, the BadBot robot is not allowed to see anything. The slash is shorthand for "all directories".

The User Agent can be any unique substring, and robots are not supposed to care about capitalization.

User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
The blank line indicates a new "record" - a new user agent command.

BadBot should just go away. All other robots can see everything except the "private" folder.

User-agent: WeirdBot
Disallow: /tmp/
Disallow: /private/
Disallow: /links/listing.html

User-agent: *
Disallow: /tmp/
Disallow: /private/

This keeps the WeirdBot from visiting the listing page in the links directory, the tmp directory and the private directory.

All other robots can see everything except the tmp and private directories.

If you think this is inefficient, you're right!

Bad Examples - Common Wrong Entries
Use one of the robots.txt checkers to see if your file is malformed.
User-agent: *
Disallow /
NO! This entry is missing the colon after the disallow.

User-agent: *
Disallow: *

NO! If you want to disallow everything, use a slash (indicating the root directory).

User-agent: sidewiner
Disallow: /tmp/

NO! Robots will ignore misspelled User Agent names. Check your server logs and the listings of User Agent names.

User-agent: *
Disallow: /tmp/

User-agent: Weirdbot
Disallow: /links/listing.html

Disallow: /tmp/

NO! Robots read from top to bottom and stop when they reach something that applies to them. So Weirdbot would stop at the first record, *, instead of seeing its special entry.

Thanks to Enrico Altavilla for pointing out this problem in my own robots.txt file!
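
If you would rather test a robots.txt file programmatically than read it by eye, Python's standard library ships a parser for the Robots Exclusion Protocol; here is a minimal sketch, with placeholder URLs:

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")  # placeholder site
    rp.read()                                        # fetch and parse the file

    # Ask whether a given robot may fetch a given path.
    print(rp.can_fetch("*", "http://www.example.com/private/data.html"))
    print(rp.can_fetch("BadBot", "http://www.example.com/"))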




How to choose your Search Engine expert?

There are a lot of websites touting themselves as search engine experts and assuring you of top rankings in the search engines. The answers to the following questions will help you find a genuine SEO expert.

1. What is the firm's area of expertise?

Do they optimize porn sites, affiliate sites and link farms, or actual businesses like yours? Request a list of clients they have done work for so you can verify this.

2. Are their optimization techniques a secret?
Search engine optimization is not a secret industry, though some would like you to believe it is. SEO consists of a lot of research, optimization of pages, increasing of link popularity, and knowing when, where and how to get your site indexed and which indexes are important for high rankings. Ask them for an overview of how they achieve their results. It is OK if you do not understand the technical jargon; the point is to see whether they will tell you. If they do not give you an explanation, chances are they are trying to hide the fact that they are using illegal techniques.

3. Is Pay Per Click (PPC) part of their guarantee?
A good search engine optimization strategy does not include PPC, where you end up having to pay for the clicks generated to your website.

4. Can the firm guarantee that your site will not be penalized with removal from the search engine or directory indexes, or with a possible Google PageRank penalty?
If not, chances are they are playing Russian roulette with their clients' websites by using illegal techniques that may produce immediate surface-level results but in the long run will get the site banned or penalized by the major search engines.

5. Do they optimize more than your homepage?
A good search engine optimization strategy will include all of the pages on your website. It is more work for them but will produce the greatest results for you.

6. There is a difference between a visitor and a targeted visitor. A targeted visitor will result in more sales and leads. Does the firm focus on more sales, or do they say things like "10,000,000 visitors in 1 month"?
You do not want "junk traffic" coming from porn sites or from non-English-speaking countries.

7. Do they create special new pages optimized for your key phrases that redirect to another page?
If the answer is yes, you are probably dealing with a firm that creates "doorway" or "bridge" pages (although most companies will call them by different names). This technique violates the terms of service of most major search engines. Legitimate search engine optimization providers do not need to create doorway pages to produce results.

8. Is their guarantee unrealistic?
SEO providers who say they can guarantee a specific position, like number 1 or 2 in a specific search engine such as Google or Yahoo, are not being honest. It is physically impossible to guarantee specific positions in a specific engine.

9. Are automated submission tools used, or are all submissions done by hand? Automated tools can often penalize you and result in exclusion from the search engines.

10. Does the plan include the cost of all submissions and directory fees, such as Yahoo, LookSmart and other directories, or does it just cover optimization?

11. If the plan includes link popularity building, are they using a link farm or contacting sites by hand and concentrating on link reputation more than link popularity?
Sooner or later, websites linked to from these link farms will be penalized and banned by the major search engines.

12. Does their technique involve showing a different page to the visitor and a different page to the search engine?
If the answer is yes, then you are probably dealing with a search engine optimization provider who uses "cloaking" techniques. Many search engines specifically warn against this technique in their terms of service. Google is particularly harsh on sites that use cloaking and is known to remove them entirely when it finds them.

13. Does the firm provide an ROI analysis and a goal of more revenues from your website, or are they simply focused only on rank?

14. Does their plan incorporate the following strategies?

  • Page optimization
  • Title optimization
  • Keyword optimization
  • Description optimization
  • Body optimization
  • Link optimization
  • Link reputation
  • Existing link analysis
  • Initial ranking report for your keywords
  • Monthly ranking report after optimization
  • Guaranteed minimum number of first page listings
  • Keyword research
  • Competitor analysis
15. Are the rankings guaranteed for more than 3 months?
A good guarantee should cover at least 6 months.

16. Which search engines do they work with?
Make certain that the positions the search engine optimization company has achieved are for the most popular search engines, not smaller engines for which they may have a knack.

17. What kind of search engine saturation can they achieve?
While it is easy to focus on one particularly impressive position on one popular engine, it is more important to focus on a broad range of positions achieved for one site. It is entirely possible for a site to have one great ranking and be sorely lacking in positions for all other keyphrases. Ask your potential search engine marketing company to show you a report for an individual client that demonstrates good positions on many popular engines for many popular keyphrases. An effective search engine optimization campaign will achieve maximum exposure across a broad range of keyphrases and engines, not one notable position on one engine.

18. Do they have direct relationships with any of the major search engines and search engine industry leaders?
Reputable search engine optimization firms will most likely have industry affiliations. Fly-by-night operations do not, either because they have not been around long enough or because they employ unscrupulous industry practices, so other reputable firms will not affiliate with them.

19. Can you reach a real person to ask them these questions before you must commit your hard-earned dollars?
Never do business with a firm you cannot talk to.

20. Do they eat their own dog food?
Ask them what key phrases they rank for and how effective their strategies have been for their own website.

DEEP WEB SEARCH

WHAT IT IS: Technology that boldly goes where no search engine has gone before.

WHY IT'S HOT: Google may have already indexed 8 billion webpages, but that's just the tip of the iceberg. Many more pages are hidden behind corporate firewalls or in databases waiting to be indexed. By some estimates, this so-called dark Web is 500 times bigger than the World Wide Web as we know it. Unlike the public Internet, however, it can't be retrieved by the usual Web crawlers. Instead, the information must be fed into search engines' mammoth databases using special retrieval techniques.

Before the advent of desktop search, our PCs were part of that invisible Web -- connected to the Internet but not indexed. File-sharing networks already search your PC for MP3s, but there are tricky privacy and security issues to resolve before your hard drive can join the visible Web. There are also millions of digitally transcribed books waiting to be connected. Ultimately, deep Web search could answer a direct question better than hundreds of links, because many of the most authoritative sources have yet to make it online.

KEY PLAYERS: Endeca, Glenbrook Networks, Google, IBM (IBM), Kozoru, and Yahoo.

Google will define the future of software

By George F. Colony, CEO, FORRESTER

Last year, I wrote that Google would fade. I said that search's lack of stickiness, combined with crushing competition from Microsoft and Yahoo! and a changing Internet, would mute the company's prospects.

I was wrong!

But not for the reasons that you think I'm wrong. You're reading in every business magazine on the planet right now about Google's lead in Internet advertising, the $100 billion valuation, all of the brilliant people joining the company, the company's strong financial performance, and how smart Sergey and Larry are.

All of that is true. But none of the conventional and obvious wisdom captures the real importance of Google. At the risk of sounding overly dramatic, I believe that Google will revolutionize the software business. It's a complex story, but I'll try to keep it simple . . .

Most of us use two types of computer software: 1) programs like Microsoft Word or Oracle Financials, and 2) Web files - like a corporate intranet or Amazon. You can perform many tasks with programs because they are executable; they contain millions of lines of computer code that enable you to underline words, calculate profit and loss, check spelling, etc. You pay for programs - ostensibly to defray the high cost of writing all of those lines of computer code.

In comparison, the Web files we use are like documents. When you click on a link for a site, the server at that site sends you pages of information. Web files are mostly static - they can't do much compared with programs. Their limited functions (buy buttons, search) don't execute on your computer - the server that you are connected to does most of the work.

So it's pretty simple: Programs can do things; Web pages are static documents. Here's where the plot thickens.

Forrester has predicted that Web pages will get replaced by programs - we call this executable Internet (X Internet). In the future, when you click on your bank's site, servers will download a program to your computer, not static pages. Once that program is installed, you will be able to "converse" with your bank, run financial models, analyze your net worth - do much more than you could have with old Web pages.

Google will be the company that leads this revolution. It is already writing programs like Google Toolbar and Google Desktop Search that run on your computer but blur the divide between your desktop and the Internet. And they are very powerful programs. Do a test. Search in Microsoft Outlook for an email from a friend - for me the search took 21 seconds. Then try the same search in Google Desktop Search: 5 seconds.

Google is also leading a pricing revolution. Google's programs are free, funded through advertising and syndication. This is a prescient move. I foresee a world in which even enterprise applications like financials, ERP, and supply chain software will be advertising-funded.

So here's Google's playbook: 1) have the best search; 2) have more of the world to search than anyone else through the digitization of university libraries, earth images, maps, etc.; 3) attract the most advertising and syndication; enabling the company to 4) give all of its software away for free; which enables it to 5) change the rules and economics of the software business and define the future through its pioneering work in X Internet.

What It Means No. 1: Large corporations should get Google executable Internet programs onto their corporate desktops. Google Desktop Search, Google Toolbar, and Google Maps will drive productivity. In addition, this move gets corporate IT ready for a world in which free executables will begin to proliferate. IT staffs will learn to incorporate Google's programs and application programming interfaces into corporate Web experiences.

What It Means No. 2: Google's stock price may not be insane. When you are restructuring an entire industry, your best years as a company typically lie ahead.

What It Means No. 3: Vista (formerly Longhorn) had better be fantastic, and Microsoft had better be able to re-spark its culture of derivative innovation. Bringing back stock options may be the first stop on that journey - as a way to re-attract the best and the brightest. I predict that Microsoft, under attack from advertising-funded software (and other factors like open source), will lose its monopoly-driven 25% net profits over the next several years, having to settle for 13%-15% nets (still astronomical compared with the average for most large corporations).

What It Means No. 4: The coming of executable Internet fundamentally changes the software and Internet landscape. Microsoft is an obvious loser. The closed, centralized architectures of Oracle and SAP will get a bunch of new salesforce.com-type challengers over the next five years. Amazon, AOL, eBay, and Yahoo! will be stuck with old Web-style experiences - not as easy, fast, and customizable as the executable Internet experience. That is why Google may be so dangerous for its Internet brethren - it knows programming and they don't.

In the past year, Google has proven to me that it is way more than just a great search company. It can jump into the program game - and play under a completely new set of rules: executable Internet and free. Unless Larry and Sergey lose focus and the company's charter devolves into esoteric pet projects, Google is going to change the world.

Google Advanced Search Operators

The following is an alphabetical list of the search operators. This list includes operators that are not in Google's online help. Each entry typically includes the syntax, the capabilities, and an example. Some of the search operators won't work as intended if you put a space between the ":" and the subsequent query word. If you don't care to check which search operators require no space after the colon, always place the keyword immediately next to the colon. Many search operators can appear anywhere in your query. In our examples, we placed the search operator as far to the right as possible. We did this because the Advanced Search form writes queries in this way, and because such a convention makes it clearer which operators are associated with which terms.

allinanchor:
If you start your query with allinanchor:, Google restricts results to pages containing all query terms you specify in the anchor text on links to the page. For example, [ allinanchor: best museums sydney ] will return only pages in which the anchor text on links to the pages contains the words "best," "museums," and "sydney."

Anchor text is the text on a page that is linked to another web page or a different place on the current page. When you click on anchor text, you will be taken to the page or place on the page to which it is linked. When using allinanchor: in your query, do not include any other search operators. The functionality of allinanchor: is also available through the Advanced Web Search page, under Occurrences.

allintext:
If you start your query with allintext:, Google restricts results to those containing all the query terms you specify in the text of the page. For example, [ allintext: travel packing list ] will return only pages in which the words "travel," "packing," and "list" appear in the text of the page. This functionality can also be obtained through the Advanced Web Search page, under Occurrences.

allintitle:
If you start your query with allintitle:, Google restricts results to those containing all the query terms you specify in the title. For example, [ allintitle: detect plagiarism ] will return only documents that contain the words "detect" and "plagiarism" in the title. This functionality can also be obtained through the Advanced Web Search page, under Occurrences.

In Image Search, the operator allintitle: will return images in files whose names contain the terms that you specify.

In Google News, the operator allintitle: will return articles whose titles include the terms you specify.

allinurl:
If you start your query with allinurl:, Google restricts results to those containing all the query terms you specify in the URL. For example, [ allinurl: google faq ] will return only documents that contain the words "google" and "faq" in the URL. This functionality can also be obtained through the Advanced Web Search page, under Occurrences.

In URLs, words are often run together. They need not be run together when you're using allinurl:.

In Google News, the operator allinurl: will return articles whose titles include the terms you specify.

author:
If you include author: in your query, Google will restrict your Google Groups results to include newsgroup articles by the author you specify. The author can be a full or partial name or email address. For example, [ children author:john author:doe ] or [ children author:doe@someaddress.com ] return articles that contain the word "children" written by John Doe or doe@someaddress.com.

Google will search for exactly what you specify. If your query contains [ author:"John Doe" ], Google won't find articles where the author is specified as "Doe, John."

bphonebook:
If you start your query with bphonebook:, Google shows business white page listings for the query terms you specify. For example, [ bphonebook: google mountain view ] will show the phonebook listing for Google in Mountain View.

cache:
The query cache:url will display Google's cached version of a web page, instead of the current version of the page. For example, [ cache:www.eff.org ] will show Google's cached version of the Electronic Frontier Foundation home page.

Note: Do not put a space between cache: and the URL (web address).

On the cached version of a page, Google will highlight terms in your query that appear after the cache: search operator. For example, [ cache:www.pandemonia.com/flying/ fly diary ] will show Google's cached version of Flight Diary, in which Hamish Reid documents what's involved in learning how to fly, with the terms "fly" and "diary" highlighted.

define:
If you start your query with define:, Google shows definitions from pages on the web for the term that follows. This advanced search operator is useful for finding definitions of words, phrases, and acronyms. For example, [ define: blog ] will show definitions for "Blog" (weB LOG).

ext:
This is an undocumented alias for filetype:.

filetype:
If you include filetype:suffix in your query, Google will restrict the results to pages whose names end in suffix. For example, [ web page evaluation checklist filetype:pdf ] will return Adobe Acrobat pdf files that match the terms "web," "page," "evaluation," and "checklist." You can restrict the results to pages whose names end with pdf and doc by using the OR operator, e.g. [ email security filetype:pdf OR filetype:doc ].

When you don't specify a file format in the Advanced Search form or with the filetype: operator, Google searches a variety of file formats; see the table in the File Type Conversion section.

group:
If you include group: in your query, Google will restrict your Google Groups results to newsgroup articles from certain groups or subareas. For example, [ sleep group:misc.kids.moderated ] will return articles in the group misc.kids.moderated that contain the word "sleep," and [ sleep group:misc.kids ] will return articles in the subarea misc.kids that contain the word "sleep."

id:
This is an undocumented alias for info:.

inanchor:
If you include inanchor: in your query, Google will restrict the results to pages containing the query terms you specify in the anchor text of links to the page. For example, [ restaurants inanchor:gourmet ] will return pages in which the anchor text on links to the pages contains the word "gourmet" and the page itself contains the word "restaurants."

info:
The query info:url will present some information about the corresponding web page. For instance, [ info:gothotel.com ] will show information about the national hotel directory GotHotel.com home page. Note: There must be no space between the info: and the web page url.

This functionality can also be obtained by typing the web page url directly into a Google search box.

insubject:
If you include insubject: in your query, Google will restrict articles in Google Groups to those that contain the terms you specify in the subject. For example, [ insubject:"falling asleep" ] will return Google Group articles that contain the phrase "falling asleep" in the subject.

Equivalent to intitle:.

intext:
The query intext:term restricts results to documents containing term in the text. For instance, [ Hamish Reid intext:pandemonia ] will return documents that mention the word "pandemonia" in the text, and mention the names "Hamish" and "Reid" anywhere in the document (text or not). Note: There must be no space between the intext: and the following word.

Putting intext: in front of every word in your query is equivalent to putting allintext: at the front of your query, e.g., [ intext:handsome intext:poets ] is the same as [ allintext: handsome poets ].

intitle:
The query intitle:term restricts results to documents containing term in the title. For instance, [ flu shot intitle:help ] will return documents that mention the word "help" in their titles, and mention the words "flu" and "shot" anywhere in the document (title or not). Note: There must be no space between the intitle: and the following word.

Putting intitle: in front of every word in your query is equivalent to putting allintitle: at the front of your query, e.g., [ intitle:google intitle:search ] is the same as [ allintitle: google search ].

inurl:
If you include inurl: in your query, Google will restrict the results to documents containing that word in the URL. For instance, [ inurl:print site:www.googleguide.com ] searches for pages on Google Guide in which the URL contains the word "print." It finds pdf files that are in the directory or folder named "print" on the Google Guide website. The query [ inurl:healthy eating ] will return documents that mention the word "healthy" in their URL and mention the word "eating" anywhere in the document (URL or not). Note: There must be no space between the inurl: and the following word.

Putting inurl: in front of every word in your query is equivalent to putting allinurl: at the front of your query, e.g., [ inurl:healthy inurl:eating ] is the same as [ allinurl: healthy eating ].

In URLs, words are often run together. They need not be run together when you're using inurl:.

link:
The query link:URL shows pages that point to that URL. For example, to find pages that point to Google Guide's home page, enter:

[ link:www.googleguide.com ]

To find links to the Google home page that are not on Google's own site, enter:

[ link:www.google.com -site:google.com ]

location:
If you include location: in your query on Google News, only articles from the location you specify will be returned. For example, [ queen location:uk ] will show articles that match the term "queen" from sites in the United Kingdom.

movie:
If you include movie: in your query, Google will find movie-related information. For examples, see Google's Blog.

msgid:
If you include msgid: in your query, Google will restrict your Google Groups results to the newsgroup article with the specified message ID; for example, [ msgid: ] followed immediately by a message ID will return the article carrying that ID.

phonebook:
If you start your query with phonebook:, Google shows all white page listings for the query terms you specify. For example, [ phonebook: Krispy Kreme Mountain View ] will show the phonebook listing of Krispy Kreme donut shops in Mountain View.

related:
The query related:URL will list web pages that are similar to the web page you specify. For instance, [ related:www.consumerreports.org ] will list web pages that are similar to the Consumer Reports home page. Note: Don't include a space between the related: and the web page url. You can also find similar pages from the Similar pages link on Google's main results page, and from the similar selector in the Page-Specific Search area of the Advanced Search page. If you expect to search frequently for similar pages, consider installing a GoogleScout browser button, which scouts for similar pages.

rphonebook:
If you start your query with rphonebook:, Google shows residential white page listings for the query terms you specify. For example, [ rphonebook: monty python Oakland ] will show the phonebook listing for Monty Python in Oakland.

safesearch:
If you include safesearch: in your query, Google will exclude adult-content. For example, [ safesearch:breasts ] will search for information on breasts without returning adult or pornographic sites.

site:
If you include site: in your query, Google will restrict your search results to the site or domain you specify. For example, [ admissions site:www.lse.ac.uk ] will show admissions information from London School of Economics' site and [ peace site:gov ] will find pages about peace within the .gov domain. You can specify a domain with or without a period, e.g., either as .gov or gov.

Note: Do not include a space between the "site:" and the domain.

You can use many of the search operators in conjunction with the basic search operators +, -, OR and " " (quotation marks). For example, to find information on Windows security from all sites except Microsoft.com, enter:

[ windows security -site:microsoft.com ]

You can also restrict your results to a site or domain through the domains selector on the Advanced Search page.

source:
If you include source: in your query, Google News will restrict your search to articles from the news source with the ID you specify. For example, [ election source:new_york_times ] will return articles containing the word "election" that appear in the New York Times.

To find a news source ID, enter a query that includes a term and the name of the publication you're seeking. You can also specify the publication name in the "news source" field in the Advanced News Search form. You'll find the news source ID in the query box, following the source: search operator. For example, if the search box contains [ peace source:ha_aretz ], then the news source ID is ha_aretz. This query will only return articles that include the word "peace" from the Israeli newspaper Ha'aretz.

stocks:
If you start your query with stocks:, Google will interpret the rest of the query terms as stock ticker symbols, and will link to a page showing stock information for the symbols you specify. For instance, [ stocks:brcm brcd ] will show information about Broadcom Corporation and Brocade Communications System. Note: Specify ticker symbols not company names. If you enter an invalid ticker symbol, you'll be told so and taken to a page where you can look up a valid ticker symbol. You can also obtain stock information by entering one or more NYSE, NASDAQ, AMEX, or mutual fund ticker symbols in Google's query box, e.g., [ brcm brcd ] and then clicking on the "Show stock quotes" link that appears near the top of the results page.

store:
If you include store: in your query, Froogle will restrict your search to the store ID you specify. For example, [ polo shirt store:llbean ] will return listings that match the terms "polo" and "shirt" from the store L. L. Bean.

To find a store ID, enter the name of the store and click on the link "See all results from store." You'll find the store ID in the query box, after the store: search operator.

weather
If you include weather in your query, Google will include weather for the location you specify. Since weather is not an advanced operator, there is no need to include a colon after the word. For example, [ weather Sunnyvale CA ] will return the weather for Sunnyvale, California and [ weather 94041 ] will return the weather for the city containing the zip code 94041, which is Mountain View, California.
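
If you ever want to combine these operators programmatically, the sketch below simply joins them into one query string and URL-encodes it; it assumes the standard /search?q= address and is meant only to show how operators compose (automated querying can violate a search engine's terms of service):

    import urllib.parse

    # Combine several operators into one query, then build the search URL.
    terms = ["web page evaluation checklist", "filetype:pdf", "site:example.edu"]  # placeholders
    query = " ".join(terms)
    print("http://www.google.com/search?" + urllib.parse.urlencode({"q": query}))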

Google Advanced Operators (Cheat Sheet)

Reciprocal link exchange software

Often you come across some new internet marketing software that claims to take all of the drudge work out of your daily routine. Well, this one is no different, but after using it solidly for the past couple of months it has earned a place in my indispensable tool set: the tools you just would not do without if you were stuck on a desert island with a laptop and a wireless connection, trying to make enough money for the boat trip home.

First up, let me say that Arelis DOES NOT provide an automated reciprocal link exchange. If that is what you are looking for, it is not here; I suggest you try Duncan Carver's Link Management Assistant, which is one of the most full-featured reciprocal link exchange directories out there. It also has a paid-for version so that you can remove the annoying footer.

Before you do that, though, I think you should understand the difference between the two beasts so that you know where you are headed.

Nope, Arelis is not a link exchange directory per se. It is more of a "link management tool" with some very nifty features that I am sure even the programmers did not realize would become so valuable when they started coding and selling it.

Here is your problem to be solved:

You need top search engine rankings, but you are struggling: no matter how much you "tweak" your "on page" HTML it seems to make no difference, and whilst you can use WebPosition Gold or WebCEO to compare your page to your competitors' pages, nothing seems to happen to your rankings for all of your effort.

You have heard that linking is the answer to your prayers, but does it really work? Is it worth the effort? Which link software should you use, and how do you go about it?

The answer to the first question is easy - yes, it works, and that is why most of the SEO guys are working on it.


The difference between an automated link exchange directory and Arelis.

An automated link exchange is usually a PHP or ASP script that uses a database to store the link results. The link exchange that powers this site is one of those. They can also boast additional features like reciprocal link checking (checking that your partner has placed a link before adding the link to your directory), import/export features, mailout features and more.

The claim to fame of automated link exchange software is that it removes the drudgery of administering links - and drudgery it is.

Arelis, however, allows you to build a link exchange directory from the ground up using many advanced search features, and then creates your link exchange directory from the resultant data. In this way you can build a fully populated link exchange in a couple of days. (OK - you can capture the data in a couple of hours, but sorting the data is a longer job.)


Now then, what is so great about that?

Well, in the first instance you have a link exchange that you need to populate. The problem is that because your site is not popular yet (it cannot be, otherwise you would not be reading this ;-)), you are only going to get a small number of link requests, and so whilst you may have an "automated exchange" it is not much use if no one is requesting to join it.

In the second instance, using Arelis you have built a database of, say, 200 potential link partners and published it in HTML format, and you can now start contacting your potential link partners with a request to exchange links.

The first method is a passive way of gaining links, whereas with Arelis you are put in the driver's seat and can aggressively build your link directory.

The key here is that by publishing your partners' links first, they have an incentive to link back to you. Many email requests that I get for link exchanges along the lines of "I will link to you if you link to me first" DO get binned immediately. With Arelis this would not happen, unless you forget to publish your links pages quickly enough!

Using Arelis you can pop them into your database and publish the new links in an instant. Get active with your link exchange campaign.

1. Import links from any HTML page. So you have a competitor with hundreds of links? No problem: simply point Arelis at the link pages and tell it to import all of the links. If a site is linking to your competitor then it should be linking to you, right? Quickly play catch-up in the link game.

2. Import a list of URLs. Gather up all of your "pending links" into a single list and import them. Then let Arelis go and hunt for the link text and description and do a whois lookup for the email address.

3. Import/export links. Define a project for each site that you want to gain links for. Want to build a new site? Export the links to a new project complete with all of the relevant detail. I use this technique to successfully gain links for new sites as I make them by emailing webmasters who have already linked to a previous site.

4. Template-driven look and feel. No need for your links directory to look "plain Jane". Choose from four different layouts including "Yahoo style", "single page with subcategories" etc. Blend the directory into your site using existing CSS styles and so on.

5. Find new link partners. Short of link partners? No need to be. Simply do a search by relevant keyword to find a bunch of potential suitors or find sites that link to your competitors and place them in the database.

6. Website preview feature. Split the screen in two and check out each potential link partner manually. Fantastic for seeing if your target has an "add url" page or automated script. Quickly adding the link and publishing your link pages is now a snap.

7. Email Templates. Your link request email can make the difference between a return of 15% (15/100 link exchange requests are successful) or 5%. These figures are based on my actual returns. A smart personalized link exchange email makes a big difference to your success rate.

8. Check links. If you are fanatical about checking your reciprocal links then this feature will check all of the sites that link to you for your physical link. No cheating! (A rough sketch of this kind of check appears below.)
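
The reciprocal link check boils down to fetching each partner's links page and looking for your own address in it. Here is a rough sketch of that idea in Python; it is not Arelis's or the Link Management Assistant's actual implementation, and the partner URLs and domain are placeholders:

    import urllib.request

    def links_back(partner_links_page, my_domain):
        """Fetch a partner's links page and check whether our domain appears in it."""
        try:
            html = urllib.request.urlopen(partner_links_page, timeout=15).read().decode("utf-8", "replace")
        except Exception:
            return False                             # unreachable: treat as "no link" for now
        return my_domain.lower() in html.lower()

    # Placeholder partner list; in practice this would come from your link database.
    partners = ["http://www.example.com/links.html", "http://www.example.org/resources.html"]
    for page in partners:
        print(page, "links back" if links_back(page, "www.my-site.tld") else "no link found")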

Link building is hard work. Do not let anyone tell you that it is easy. You have a myriad of problems to overcome: duff email addresses, a one-in-ten take-up of your link exchange requests, sites that will not link to you, sites that have no links pages.

I mean, you can build a database of 500 sites in half an hour if you wished using this tool, but if these sites have no links pages then what is the point? You still have to manually inspect each site to make sure it is worthwhile contacting.

Whilst Arelis makes some pretty advanced claims, this software in no way does ALL of the work for you. It does save you tons of time - and I mean tons - but you still have plenty to do, and that is why link building is skipped over by many search engine optimization specialists even though it is the most important jigsaw piece.

What Good is Duncan Carver's Link Management Assistant?

  • Your entire niche website directory is dynamically generated although will appear to be completely static to website visitors & search engines.
  • Create or import an unlimited number of categories and sub-categories. Add new categories or sub-categories at any time in any location. Shift an entire set of website listings from one category or sub-category to another with one click.
  • Manually add a new website listing with ease. Sort, manage & edit website listings by listing status (all, active, awaiting approval, suspended, premium, DMOZ imported) for all categories or a particular category or sub-category.
  • Set listing links to open in the same or a new window. Define how many listings to display per page. Display listings alphabetically or by date added. Manually give existing (or automatically give all new) listings "premium" status and those specific listings will always be placed in the top group of their category. This is an excellent way to give strategic link partners an additional benefit and incentive to link to you over other non-reciprocated listings in your directory.
  • Requiring a reciprocal link on directory submissions is entirely optional. Turn it off if you just want to manage a standard niche website directory, turn it on if you want to build your link popularity & boost your search engine rankings by making reciprocal linking a requirement.
  • Limit the title & description fields of directory listings to however many characters you like. Option to require email confirmation from all new submissions to ensure no-one is spamming your submission form. Directory listing submissions can be set to automatically approve themselves (totally automated reciprocal link / directory management) or require admin approval.
  • If using the reciprocal link feature the Link Management Assistant will automatically check your reciprocal linking partners websites (at the interval you define) to ensure the link back to your own site remains in place.
  • If no reciprocal link is found, you can set the listing to suspend automatically (removes the listing from the directory but retains their information in case they put the link back up), and/or automatically email the link partner about the situation inviting them to place the link back up.
  • If after a pre-defined interval the link is still not found, then you can set the Link Management Assistant to automatically delete the listing from the directory and/or email the link partner to let them know they have been removed.
  • You can also give all link checking schedules a "grace period" and define how many unsuccessful attempts are required before these actions occur. That way, if a partner's site is down for routine technical reasons, you won't pester them with automated link management emails when your link probably still exists on their website. (A minimal sketch of this check-and-grace-period logic follows this list.)
  • All automatic system emails are completely customizable and can be totally personalized with the listing submitter's details. You can bulk email all listing submitters, or specific listing status groups, at any time from the admin area.
  • Import an entire dataset into your directory in minutes using any flat file (text) database. This allows you to easily convert your existing website directory over to one powered by the Link Management Assistant.
  • Extremely Powerful - Open Directory Project (DMOZ.org) import feature allows you to import entire directory structures (including all categories & sub-categories and/or the listings contained within those categories) into your own website directory in just a matter of hours & often only minutes.
  • Using the above feature gives you the ability to create a tightly focused, "lived in" niche website directory containing hundreds (or thousands) of keyword-rich, high-quality content pages driving hundreds of additional targeted visitors to your site... at no cost.
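To make the suspend/delete behaviour concrete, here is a minimal Python sketch of that kind of check-and-grace-period schedule. It is my own illustration rather than the Link Management Assistant's actual code, and the names and thresholds are hypothetical.

# link_schedule.py - a sketch of the suspend/delete schedule described above:
# a listing is only suspended after several failed checks (the "grace
# period"), and only deleted after a further interval.
from dataclasses import dataclass

GRACE_ATTEMPTS = 3      # failed checks allowed before suspending
DELETE_AFTER = 6        # failed checks allowed before deleting outright

@dataclass
class Listing:
    url: str
    failed_checks: int = 0
    status: str = "active"   # active -> suspended -> deleted

def record_check(listing: Listing, link_found: bool) -> str:
    """Update a listing after one scheduled reciprocal-link check."""
    if link_found:
        listing.failed_checks = 0
        listing.status = "active"
        return "link ok"
    listing.failed_checks += 1
    if listing.failed_checks >= DELETE_AFTER:
        listing.status = "deleted"
        return "deleted - notify partner they were removed"
    if listing.failed_checks >= GRACE_ATTEMPTS:
        listing.status = "suspended"
        return "suspended - email partner asking them to restore the link"
    return "within grace period - no action yet"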

Best of all, this fully functional $197 application is yours to download FREE...


The toughest part of this software is that it does not support Windows... so sad.

W3C Compliance and SEO

From reading the title, many of you are probably wondering what W3C compliance has to do with SEO, and some may be wondering what W3C compliance is at all.

What Is W3C Compliance?

The W3C is the World Wide Web Consortium, and since 1994 the W3C has provided the guidelines by which websites and web pages should be structured and created. The rules it outlines are based on best practices, and while websites don't have to comply to be viewed correctly in Internet Explorer and other popular browsers that cater to incorrect design practices, there are a number of compelling reasons to ensure that you or your designer follow the W3C guidelines and bring your site into compliance.

Some non-SEO reasons to take on this important step in the lifecycle of your site are:

  • Compliance helps ensure accessibility for the disabled.
  • Compliance helps ensure that your website is accessible from a number of devices, from different browsers to the growing number of surfers using PDAs and cellular phones.
  • Compliance will also help ensure that, regardless of browser, resolution, device, etc., your website will look and function in the same or at least a very similar fashion.

At this point you may be saying, "Well, that's all well and good, but what does this have to do with SEO?"

Proper use of standards and current best practices ensures that the copy is marked up in a semantic fashion which search engines can interpret and weigh without confusion. It also skews the content-to-code ratio in the right direction and forces all of the information in the page to be made accessible, thus favoring the content. Reduce the amount of code on your page and the content (you know, the place where your keywords are) takes a higher priority. Additionally, compliance will, by necessity, make your site easily spidered and give you greater control over which portions of your content are given more weight by the search engines.
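If you want a rough number for your own content-to-code ratio, you can approximate it by stripping the tags and comparing the remaining text with the full page source. A minimal Python sketch (my own illustration; the URL is a hypothetical example):

# ratio.py - a rough sketch for measuring the content-to-code ratio:
# compare the visible text of a page with the size of its full HTML source.
from html.parser import HTMLParser
from urllib.request import urlopen

class TextExtractor(HTMLParser):
    """Collects the text content of a page, ignoring scripts and styles."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def content_to_code_ratio(url: str) -> float:
    """Return visible-text length divided by total HTML length."""
    html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    parser = TextExtractor()
    parser.feed(html)
    text = "".join(parser.chunks).strip()
    return len(text) / max(len(html), 1)

if __name__ == "__main__":
    print("content-to-code ratio: {:.2%}".format(content_to_code_ratio("http://www.example.com/")))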

To be sure, this is easier said than done. Obviously the ideal solution is to have your site designed in compliance to begin with. If you already have a website, you have two options:

1. Hire a designer familiar with W3C standards and have your site redone, or
2. Prepare yourself for a big learning curve and a bit of frustration (though well worth both).

Resources

Assuming that you've decided to do the work yourself, there are a number of great resources out there. By far the best that I've found in my travels is the Web Developer extension for FireFox. You'll have to install the FireFox browser first and then install the extension. Among other great tools for SEO, this extension provides a one-click check for compliance and lists where your errors are, what's causing them, and links to solutions right from the W3C. The extension provides testing for HTML, XHTML, CSS and Accessibility compliance.

Where Do I Get Started?

The first place to start would be to download FireFox and install the Web Developer extension. This will give you easy access to testing tools.

Once you've done these, you'd do well to run the tests on your own site while keeping open an example site that already complies, so you can look at its code if need be.

To give you a less frustrating start, I would recommend beginning with your CSS validation. Generally CSS validation is easier and faster than the other forms. In my humble opinion, it's always best to start with something you'll be able to accomplish quickly, to reinforce that you can in fact do it.

After CSS you'll need to move on to HTML or XHTML validation. Be prepared to set aside a couple of hours if you're a novice with a standard site, and more if you have a large site, of course.
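Full validation is the W3C validator's job, but as a quick first pass you can check that an XHTML page is at least well-formed XML, which catches many of the same unclosed-tag and nesting errors. A small Python sketch, assuming a hypothetical index.xhtml file:

# wellformed.py - a small sketch (not a full W3C validator) that checks
# whether an XHTML file is at least well-formed XML.
import sys
import xml.etree.ElementTree as ET

def check_wellformed(path: str) -> None:
    try:
        ET.parse(path)
        print(path + ": well-formed")
    except ET.ParseError as err:
        # ParseError includes the line and column of the first problem
        print(path + ": NOT well-formed - " + str(err))

if __name__ == "__main__":
    check_wellformed(sys.argv[1] if len(sys.argv) > 1 else "index.xhtml")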

Once you have your CSS and HTML/XHTML validated, it's time to comply with accessibility standards. What you will be doing is cleaning up a ton of your code and moving a lot of it into CSS, which means you'll be further adding to your style sheet. If you're not comfortable with CSS, you'll want to revisit the resources above. CSS is not a big mystery, though it can be challenging in the beginning. As a pleasant by-product, you are sure to find a number of interesting effects and formats that are possible with CSS that you didn't even know were so easily added to your site.

But What Do I Get From All This?

Once you're done, you'll be left with a compliant site that not only will be available on a much larger number of browsers (increasingly important as browsers such as FireFox gain more and more users) but will also have far less code and will rank higher on the search engines because of it.

To be sure, W3C validation is not the "magic bullet" to top rankings. In the current SEO world there is no one thing that is. However, as more and more websites are born and the competition for top positioning gets fiercer, it's important to take every advantage you can, not only to get to the first page but to hold your position against those who want to take it from you as you took it from someone else.

Saturday, December 03, 2005

Give Spiders a Tasty Treat!

Mechanical spiders have to eat. In fact, they usually have bigger
appetites than the real-life spiders you squish under your shoe. What
spiders am I talking about? The automated programs sent out by search
engines to review and index websites. These "spiders" (sometimes
called "bots") are looking for a reason to list your site within the
database of their particular search engine. It's hard work roaming
around the 'Net nonstop, and these little guys need some nourishment
from time to time. In fact, when spiders find some hearty "spider
food" (a.k.a. a site map with some meat to it) they sit down to stay a
while. That's a good thing!

You've probably seen many site maps. The standard ones look like the
example below with each phrase being linked to the page of the same
(or similar) name.

=====================
Home
About Us
Shipping Rates
Products
>> Small Appliances
------- Microwave Ovens
------- Can Openers
>> Dinnerware
------- Platters
------- Serving Bowls
Contact Us
Privacy Policy
=====================

Site maps are deemed "spider food" because they provide the perfect
starting point from which to crawl your site. Because a site map has
links to every page of your site (and those link names or page
descriptions often include keywords), it is extremely easy for the
search engine spider to access each publicly accessible area with no
obstacles, and relate it to a given subject matter. (For example, a
page labeled "microwave ovens" is most likely about microwave ovens.)

Some site owners think that's enough. They think a page with
keyword-rich titles and links is plenty for a hungry little spider to
munch on. Not hardly! That's not a meal... it's just a light snack.

Give Spiders A Tasty Treat

If you really want to fill the spiders' bellies, you'll want to
provide them with a "descriptive site map" (as I like to call them).
Descriptive site maps go beyond a simple list of links to your pages.
These special versions of the traditional maps also include a short,
keyword-rich description of each page. The text only needs to be a
sentence or two in length. An example is below. (The links would
remain the same as in the previous example.)

=====================
HOME - Home page for XYZ Depot, a home accessories outlet.
ABOUT US - Account of how XYZ Depot became the world's largest home
accessories outlet.
SHIPPING RATES - Shipping rates and delivery times.
PRODUCTS - Complete listing of home accessories offered.
>> SMALL APPLIANCES - Exciting selection of small appliances to save
you time in the kitchen.
------- MICROWAVE OVENS - Top-of-the-line microwave ovens from brands
you trust.
=====================
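If your page titles, URLs and descriptions already live in a spreadsheet or database, a descriptive map like the one above is easy to generate rather than hand-write. A minimal Python sketch using hypothetical page data:

# descriptive_map.py - a sketch of generating a descriptive site map
# like the example above from a list of pages. The page data is a
# hypothetical example; the output is a plain HTML list.
PAGES = [
    ("Home", "/index.html", "Home page for XYZ Depot, a home accessories outlet."),
    ("Shipping Rates", "/shipping.html", "Shipping rates and delivery times."),
    ("Microwave Ovens", "/products/microwaves.html",
     "Top-of-the-line microwave ovens from brands you trust."),
]

def build_descriptive_map(pages) -> str:
    """Return an HTML list of linked page titles with short descriptions."""
    items = [
        '<li><a href="{0}">{1}</a> - {2}</li>'.format(url, title, description)
        for title, url, description in pages
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

if __name__ == "__main__":
    print(build_descriptive_map(PAGES))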

Descriptive site maps work well in attracting and satisfying spiders
because they include naturally occurring keywords. They also place
keywords in the vicinity of a link that points to the associated page.
Add these advantages to those that already exist including:

* having links in the body copy of the page
* overcoming complex navigation such as DHTML or Java
* lending quick access to pages located several layers deep within the
site
* assisting with usability for visitors (especially disabled visitors)
* and others

and you have prepared a huge feast for the search engine spiders that
is almost guaranteed to entice those hungry little critters to crawl
through every available page of your site.

Does every site need a site map? It certainly wouldn't hurt. Sites
with less than 20 pages or sites where most or all the pages have
links directly from the home page generally don't "need" a site map,
per se. However, practically every site can reap the benefits of a
site map.

If you're creating a site map for your site, don't stop with just the
basics. With a little added effort, you'll have a four-course meal to
serve the spiders that will keep them happy and satisfied, and
ultimately help provide you with exceptional rankings.

All About Sitemaps

What is a sitemap?

A sitemap is a collection of hyperlinks that outlines and defines the website's overall structure. It provides both the end user and the search engine spider / crawler with the opportunity to navigate or index your web site more efficiently.

Why is a sitemap important for a web site?

A sitemap plays an important role in terms of search engine optimization (SEO), as it enables the spider to crawl your site more efficiently and provides it with a clear overview of the contents of your web site. Google has endorsed the need for including a sitemap on your web site and has introduced an option called Google Sitemaps, "an easy way for you to help improve your coverage in the Google index".

Please note: before creating a sitemap, make sure your page titles and meta descriptions clearly reflect the contents of each individual page, as this will help both your visitors and the bots understand what each document on your web site is about and index and rank it accordingly.
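A quick way to audit those titles and descriptions before building the sitemap is to pull them out of each page programmatically. A small Python sketch (my own illustration; the URLs are hypothetical examples):

# audit_meta.py - a sketch that reports the <title> and meta description
# of each page so you can spot missing or duplicate ones.
from html.parser import HTMLParser
from urllib.request import urlopen

class MetaAudit(HTMLParser):
    """Collects the page title and the content of <meta name="description">."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit(url: str) -> None:
    html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    parser = MetaAudit()
    parser.feed(html)
    print(url)
    print("  title: " + (parser.title.strip() or "(missing)"))
    print("  description: " + (parser.description.strip() or "(missing)"))

if __name__ == "__main__":
    for page in ["http://www.example.com/", "http://www.example.com/about.html"]:
        audit(page)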

Where can I download a sitemap component?

A number of very good sitemap components have been developed which you can install. The best free one I have found is available at http://www.sitemapdoc.com/

Google Sitemap Generator.

The Google sitemap generator will automatically generate an XML sitemap file of your web site, which you can then submit through the Google Sitemaps web site.
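If you would rather build the file yourself than use a hosted generator, the XML format is simple enough to write with a short script. A minimal Python sketch, assuming the standard sitemaps.org <urlset> format and a hypothetical URL list:

# make_sitemap.py - a sketch of writing an XML sitemap by hand.
from datetime import date
from xml.sax.saxutils import escape

URLS = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/products.html",
]

def build_sitemap(urls) -> str:
    """Return a sitemaps.org-style <urlset> document for the given URLs."""
    today = date.today().isoformat()
    entries = "\n".join(
        "  <url>\n    <loc>{0}</loc>\n    <lastmod>{1}</lastmod>\n  </url>".format(escape(u), today)
        for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries + "\n</urlset>\n"
    )

if __name__ == "__main__":
    with open("sitemap.xml", "w", encoding="utf-8") as f:
        f.write(build_sitemap(URLS))
    print("wrote sitemap.xml")

Upload the resulting sitemap.xml to your web root and submit its address through your Google Sitemaps account.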