The Unfair Advantage Book on Winning The Search Engine War
- Chapter One - Titles & Keywords
- Chapter Two - Meta Tags
- Chapter Three - Tricks & Illusions
- Trick #1 - Keyword spamming and stuffing a.k.a. spamdexing
- Trick #2 - Invisible or Semi-visible text
- Trick #3 - Pointer Pages
- Trick #4 - Redirect Page
- Trick #5 - Make your keywords work double duty
- Trick #6 - Program your site to be found even when people are looking for something else.
- Trick #7 - Using the <!...comment line>
- Trick #8 - ASCII, Numerical, and Alphabetical text order
- Trick #9 - Turn images from liabilities to ASSETS with the SE's
- Trick #10 - The secret of the Phantom Pixel
- Trick #11 - The ole Bait & Switch Technique
- Trick #12 - The Food Technique
- Trick #14 - The Hidden Form Tag
- Trick #15 - Hidden Links
- Chapter Four - How to analyze your competition's pages before you design your own
- Chapter Five - How to set up several entrances to your Internet Storefront
- Chapter Six - The In's & Out's of the TOP SEVEN Major Search Engines and Directories.
- Lycos - www.lycos.com
- Alta Vista - www.altavista.com
- Inktomi - NBCi.com - MSN.com and Yahoo
- HotBot - www.hotbot.com
- Direct Hit www.directhit.com
- Google www.google.com
- Northern Light - www.northernlight.com
- Excite - www.excite.com
- WebCrawler - www.webcrawler.com
- Yahoo - www.yahoo.com
- Open Directory Project - www.dmoz.org
- Chapter Seven - High Tech pages and Search Engines
The examples we use in this book are real... they are not theory. Most, in fact, have been taken from our own experience with very successful web sites and web pages that we have developed for ourselves and our clients -- as well as other examples of actual circumstances that we have found on the world wide web.
Several researchers and web designers within our company have spent thousands of hours accumulating this information over the past three years, and they update it on the first of every month. It is written from a "we" point of view in an effort to reflect the collective effort that has gone into the original text as well as the continuous updates that keep this book on the cutting edge of today's competitive search engine positioning war. We mention this so that you do not underestimate the degree of time and ongoing research that continues in order to produce for you this valuable, effective, and up-to-date source of information.
Please realize there is no such thing as a magic silver bullet that will vault you to the top of the search engines... if there were, everyone would know it, they would do it, and it would cease to work because of saturation. Instead, success on the Internet comes from doing many tiny little things exactly right... and the process is an ever-changing science that this book and our Newsletter updates continuously reveal. If you wish to be successful on the Internet, expect to spend some time learning the secrets contained in this book. The rewards will be worth it... as you will become armed with an arsenal of tools that will leave your competition in the dust.
Be sure to read the book completely before you start constructing or making changes to your web site and be aware that you may already know some of these techniques... especially in the beginning because we start with the most basic and then move on to the more complex techniques. If you already know the basics, your knowledge will give you an added advantage because you'll already have that experience to build on. This will enable you to comprehend the value of the more subtle, yet most valuable refinements contained in this book - as well as give you a foundation on which to build this additional knowledge and expertise. Rest assured, you will learn many tricks and techniques that you did not know... By keeping an open and intuitive mind you will be more likely to find those subtle changes that will make a HUGE difference in your web site's positioning on the Search Engines.
On the other hand, if everything in this book is new to you... that's ok too. We are starting from the beginning with the basics. Assuming you are comfortable with your computer and have some idea of what the Internet is about... and aren't afraid to experiment with HTML documents (web pages) you will succeed... And... even if you are an Internet Beginner this stuff will make a tremendous degree of sense to you once you begin to familiarize yourself with the Internet. In fact, you'll even have a leg up on most of the so-called pros.
This book, in actuality, is divided into three sections.
Section One - Chapters One through Five - general section. In Chapters One through Five you will learn all of the known tricks and strategies that have been used with search engines over the past three years. This is important because all of them, in one form or another, still appear on pages listed with virtually every SE. It is also important because some of them can cause your pages to be penalized or even banned from certain SE's.
Section Two - Chapter 6 - we get into the specifics of what is currently working on each of the 8 major search engines.
Section Three - Chapters 7 and 8 - covers the best search engine software tools and gives you tips on web design aspects such as frame-style web pages, JavaScript, and CGI-generated pages -- as well as some parting information that we feel is important enough to include in this book.
Chapter One - Titles & Keywords
Although this is the most basic information, it is impossible to exaggerate the importance of your web page's <TITLE> tag content in designing your web site. The first rule to remember is:
The Title is the most important aspect of web page design with respect to scoring well on most search engines.
Reason #1: Most SE's look first for keywords that are contained in the Title tags of your web page.
Reason #2: Remember that you are also attempting to appeal to a person who is seeking your information, product or service in addition to attempting to appeal to the search engine (SE) indexes.
When a search is requested through any search engine, the SE looks for the search word(s) contained within the <TITLE> tags first -- and usually gives preference to web pages that have the specific word(s) within the <TITLE></TITLE> tags. Therefore, it is usually imperative that you insert the search words (AKA keywords) that your potential customers are likely to use when looking for your service between the <TITLE> your title here </TITLE> tags in your HTML code.
For example, if you own a Bed & Breakfast in Hanalei Bay, Hawaii, on the Island of Kauai, and it is called Kiluhana Inn, do not use that name as your title. If you do, your business will be handicapped in a search and buried by knowledgeable competition.
A better title would be:
<TITLE> Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii</TITLE>
Reason #1. The words Hawaii, Beach, Bed, Breakfast, Hanalei Bay, Kauai are all keywords in your <TITLE> that people are likely to look for when searching for this type of service. In addition, the words Hawaii, Beach, Hanalei, Kauai are all words that may cause people who need your service to find you even when they are not looking for you. For instance, if someone did a keyword search for "hanalei kauai", your service has a very good chance of showing up toward the top of the search results.
Reason #2. Some search engines and many directories give slight priority to pages whose titles sort early in the alphabet. Although this is a secondary consideration, whenever possible, try to start your <TITLE> with a letter that comes early in the alphabet... and, of course, with an A whenever you can.
To summarize: Make sure every word in your title is one that is likely to be used by a person when doing a keyword search for your business or service. AND... unless the name of your business is prominently recognized -- something like KODAK Film, it does NOT belong in the Title tags.
Secret: On some engines, it may be worthwhile to use multiple <title>'s in your page to try to increase the relevancy. There are also many examples of people using very long titles to achieve the same thing. Here is the format for that trick, but be aware, you should only use this technique if you find it working for other pages that you are competing against for top positioning.
<title>Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii</title><title>Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii</title><title>Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii</title>
If you use this, put everything all on ONE line if your text editor will allow it.
TIP: Some HTML programs such as Microsoft's FrontPage will put your <title> tags AFTER your <meta> tags. You don't want this to happen because your results with the engines will be much better if your <title> tag appears directly after the <head> tag on your page. For example: <html><head><title>Put your title here</title>. FrontPage 97 and earlier versions will attempt to put the tags back in the old order the next time you edit the document, so you may want to keep an eye on this. FrontPage 98, however, has corrected this problem.
Warning - Extremely long titles may cause some browsers to crash and will be difficult for your visitors to use in their bookmarks. Also, most of the engines will quit reading after a set number of characters is reached.
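If you would rather not check the tag order by eye each time your editor saves the page, a small script can do it. The following is only a minimal sketch in Python: the function name title_follows_head is our own invention, and it simply tests whether the first tag after <head> is <title>. It makes no claim about how any search engine reads the page.

```python
import re

def title_follows_head(html: str) -> bool:
    """Return True if the first tag after <head> is <title>."""
    # (?:\s[^>]*)? allows attributes on <head> without matching <header>.
    match = re.search(
        r"<head(?:\s[^>]*)?>\s*<([a-zA-Z][a-zA-Z0-9]*)", html, re.IGNORECASE
    )
    return bool(match) and match.group(1).lower() == "title"

# Hypothetical pages for illustration only.
good = "<html><head><title>Put your title here</title><meta name='keywords' content='x'></head></html>"
bad = "<html><head><meta name='keywords' content='x'><title>Put your title here</title></head></html>"

print(title_follows_head(good))
print(title_follows_head(bad))
```

Run it against the saved file after every edit session if your editor is known to reorder the tags.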
Next step: BEFORE you design your web page (site), make a list of every possible search word (keyword) and phrase (keyphrase) that your potential customers might use when looking for your information, product or service. Don't waste your time starting a web site until you do this. It is critical. Don't forget to include the common misspellings of your keywords and synonyms and pay particular attention to noun phrases.
Next, use these search words and phrases to find your competition on the top eight search engines. Once you find who you will be competing with for the Top Ten positions in your keyword search, scour THEIR web pages for more search words that you may have overlooked ...by the way, these are called keywords and from here on, we will refer to them as such.
Tip: If you want to be in the Top Ten of a search engine, find out who is in the Top Ten of each of these SE's and then build a better page. Later we will show you how to analyze your competition and then make your page go one better.
Simply put, you must work these keywords into your body text according to the specifications of each search engine.
For example, if your service is a location sensitive offering, then be sure to mention the location in the text at every opportunity... For instance, if your motel is in the city of Port Angeles... a normal sentence might read:
The Hill Haus Motel boasts an unlimited panoramic view.
A better sentence would be:
The Hill Haus Motel in Port Angeles boasts an unlimited panoramic view.
Use the second version even if the reader already knows the motel is in Port Angeles.
Here's an example of some keywords that we put into a client's page that vaulted him to the top of many Hawaii specific B&B searches:
Example A:
Hawaii Bed & Breakfast on Oahu's Waimanalo Beach. This Hawaiian Bed & Breakfast is tucked away on the quiet side of Oahu Island away from the tourist side of Honolulu across from the famous hawaiian waimanalo beach where James Michener wrote his novel, Hawaii.
Example B:
<H6>hawaii bed breakfast oahu waimanalo beach Hawaii Bed Breakfast Oahu Waimanalo Beach hawaii bed breakfast oahu waimanalo beach Hawaii Bed Breakfast Oahu Waimanalo Beach hawaii bed breakfast oahu waimanalo beach Hawaii Bed Breakfast Oahu Waimanalo Beach hawaii bed breakfast oahu waimanalo beach</H6>
Secret #1 -- Some SE's will favorably place repeating keyword text and some SE's will ignore it or, worse, penalize you for it. In Example A we used a relatively normal statement-type sentence and jammed it with keywords that a human SE quality control inspector would not be too likely to object to (as in the case of YAHOO!), nor would a programmed SE computer ignore it (as in the case of AltaVista).
In Example B we simply repeated the main keyword phrases. Our research has shown that some SE's are programmed to ignore, or penalize, keyword text that is repeated within a tagged section on your page. (Notice that in this case, these keywords are enclosed in <H6></H6> tags.) In Internet terms, this technique is called spamdexing or keyword stuffing. More about the ethics of this later... suffice it to say for now that there are acceptable and unacceptable ways to do this... and remember that your business survival on the Internet may depend on your using this technique in an acceptable fashion with certain SE's.
Secret #2 -- Here's a trick that you need to know about because, sooner or later, you will see it -- and unless you know about it you may become confused about what is going on. The trick is referred to as invisible text and is called such because the actual text is invisible to the visitor to your site... even though the SE sees it and indexes the invisible words.
A typical example of invisible text is often found at the top or bottom of a web page. The source code can look something like this...
...where the font color code, FFFFFF (white), matches the background color of the page, which is also white -- and therefore the text is invisible. In this instance the <h6> tag makes the text an extremely tiny HEADLINE. Many search engines now penalize pages that contain invisible text and some SE's will reject them. However, a few will allow for it and index pages that contain this tactic favorably. Please remember that it's often better to put the keywords within a regular sentence structure when possible. If for design reasons you are having problems getting enough text into your pages, this or other similar methods may be an option. In Chapter 6 we will tell you how each SE treats invisible text.
A word about Keyword Placement
It is extremely important that you have keyword text on your web page before your images... because search engines don't give a hoot about actual images. They are looking for TEXT. Although images may look nice, even expected, at the top of your page, they do nothing to help your site's findability on the SE's. Therefore, by placing meaningful small font keyword text at the top of your page - before your image - you can still place your logo near the top of your web site while favorably appealing to the SE's that are looking for relevant keywords at the top of your page.
Therefore, be sure to Headline your page with text that the search engines can recognize as relevant to what you are offering... AND do it early in your page, before the images, whenever possible.
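To see what text a page offers before its first image, you can list it with a short script. This is a rough sketch in Python using the standard html.parser module; the class and function names are our own, and the sample pages are hypothetical. It only reports the words found in the body ahead of the first <img> tag, which is a stand-in for "text at the top of the page", not a model of any SE.

```python
from html.parser import HTMLParser

class TextBeforeImage(HTMLParser):
    """Collect body words that appear before the first <img> tag."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.seen_image = False
        self.words = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BODY> and <IMG> match too.
        if tag == "body":
            self.in_body = True
        elif tag == "img":
            self.seen_image = True

    def handle_data(self, data):
        if self.in_body and not self.seen_image:
            self.words.extend(data.split())

def words_before_first_image(html):
    parser = TextBeforeImage()
    parser.feed(html)
    parser.close()
    return parser.words

# Hypothetical pages for illustration only.
text_first = ("<html><head><title>T</title></head><body>"
              "<h1>Port Angeles Motel</h1><img src='logo.gif'>"
              "<p>More text</p></body></html>")
image_first = ("<html><body><img src='logo.gif'>"
               "<h1>Port Angeles Motel</h1></body></html>")

print(words_before_first_image(text_first))
print(words_before_first_image(image_first))
```

An empty result means the page opens with an image and gives the SE no text to read first.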
Remember, it is these keywords (search words) and key phrases through which people will find you on the Internet.
We will refer to these <TITLE> & text keywords frequently throughout this book. However, for now, suffice it to say that you should apply the correct keyword density within your web page as well as in your <TITLE> if you expect to be found in the top ten list of any given SE search.
One potential area of confusion is Keyword Relevancy. Simply put, your keywords must be relevant to your TITLE as well as to the contents of your web page. Search Engines DO reject some pages and it can take up to 8 weeks (or more) in some cases before they update their index. That's a long time to wait to find out if your submission was accepted... so you had better make your submission count! SE's are getting smarter by the day and some of them are overseen by humans who will actually check your site for keyword relevancy... therefore you must make sure that your keywords match the overall substance of your site.
Since this is SO critical, be sure to spend the crucial time needed to select your keywords carefully. Pick words that you feel others would use if they were looking for your service or product. Get online and search (using your keywords) the SE's for the sites that come up in the Top Ten; ...and be sure to keep careful notes as to the results. Later we will show you how to use your competition's web pages as tools to build yourself a better page.
Remember that when you submit your web page(s) to the SE's, you will need to put the best keywords first, the second best next and then down the line with the rest. Also remember, your title must be relevant to these keywords and your web site has to match both the content between your <TITLE> tags as well as the keywords that you're using in your text.
Another concept for you to grasp is Keyword Density. This is the number of times your keyword(s) appear in relation to the other words on your web page. For instance, if your page had only one word of text, say "Chicago", the keyword density would be 100%. If, on the other hand, the only text on your page was "Eat at Chicago's finest seafood restaurant", then the keyword density of Chicago would be 20%. SE's ignore common words such as the, at, of, etc., which leaves five counted words, each representing 1/5th of the text.
Theoretically, the page that said only "Chicago" would be given a higher rating than the "Eat at Chicago's..." page due to its higher keyword density. Although this is a simplistic example, you get the idea, we're sure. Here's the point. When designing your web site, be aware that keyword density can be a MAJOR factor in how well (or badly) your page is represented by the SE's. It can be to your advantage to develop entry pages to your web site that are ethically and appropriately loaded with the correct density of carefully selected and intelligently assembled keywords... then use these entry pages as a portal to your other pages containing the rest of the information you want your site visitors to access.
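The Chicago arithmetic above can be reproduced with a few lines of Python. Treat this as a sketch under our own assumptions: the stop word set is a tiny illustrative sample (no SE publishes the list it actually uses), a possessive such as "Chicago's" is counted as "Chicago", and the names tokenize and keyword_density are ours.

```python
import re

# Illustrative sample only; real engines use their own, unpublished lists.
STOP_WORDS = {"the", "at", "of", "a", "an", "and", "to", "is", "that"}

def tokenize(text):
    """Lowercase the text, drop possessive 's, and remove stop words."""
    cleaned = []
    for token in re.findall(r"[a-z0-9']+", text.lower()):
        token = token.strip("'")
        if token.endswith("'s"):
            token = token[:-2]
        if token and token not in STOP_WORDS:
            cleaned.append(token)
    return cleaned

def keyword_density(text, keyword):
    """Percentage of counted words on the page that equal the keyword."""
    words = tokenize(text)
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)

print(keyword_density("Chicago", "Chicago"))
print(keyword_density("Eat at Chicago's finest seafood restaurant", "Chicago"))
```

The first call gives 100.0 and the second gives 20.0, matching the two cases worked through above.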
In Chapter 6 we specify to what degree each specific SE is weighting keyword density and the range of keyword density mix that is optimum for each SE that is factoring in this ingredient. In Chapter 7 we will point you toward the best software program we have found for calculating keyword density -- a must for any serious webmaster.
Caution: Do not use stop words or dead weight words in Titles. A stop word is a keyword or key phrase that has become so common on the Internet that search engines either ignore it or return hardly relevant results when it is used.
Some examples include Homepage, Home Page on the WWW, web, webpage, and in some cases, even the word sex. For instance, if you are in the business of webpage development the TITLE -- Webpage Design and Development -- will more likely land your pages in a design or development search than in one for webpage services.
Your efforts would be far better served if you find a niche group of businesses that you wish to market to and then design your web services page to land into their keyword search! ...because, simply put, it is almost pointless to attempt to score high in a stop word keyword category.
Here are some obvious stop words: the, of, that, is, to, etc.
In addition to titles, you should also be aware that SE's tend to ignore these certain frequent and/or recurring words that commonly appear on web page text in order to save storage space or to speed up searches. If your business happens to be in a stop word category, we suggest you wrap your keywords in quotation marks and/or use capitalization or uppercase letters as such words are more readily findable through the SE's.
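A quick way to catch stop words before they reach your <TITLE> is to compare the title against a list. The sketch below is in Python and uses only the examples named in this chapter as its list; it is not the list used by any particular SE, and the function name stop_words_in_title is our own.

```python
import re

# Examples taken from this chapter; not any engine's real list.
STOP_WORDS = {"the", "of", "that", "is", "to", "at",
              "homepage", "web", "webpage"}

def stop_words_in_title(title):
    """Return the stop words found in a title, in the order they appear."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return [w for w in words if w in STOP_WORDS]

print(stop_words_in_title("Webpage Design and Development"))
print(stop_words_in_title("Bed & Breakfast Kauai - Hanalei Bay & Beach - Hawaii"))
```

Extend the set with whatever dead weight words you find being ignored in your own keyword category.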
Chapter Two - Meta Tags
For the novice we must explain that Meta Tags are undisplayed text written into your HTML document, intended to describe your page to the SE for the purpose of cataloging the content or description of your page.
There is some debate as to whether or not they should be included in your HTML document since some SE's openly state that they ignore <META> tags. However, since several Search Engines claim to, and do, support them, our view is that they should be included in your HTML document... however, you must be careful how you use them. The trick is to make them useful for the SE's that support them while avoiding being penalized by the SE's that do not support them.
For instance, AltaVista and HotBot use <META> tags when they retrieve their search results. In some cases, the SE will use your <META> description as the summary for your web page. Regardless, we know of no SE that will penalize you for using <META> tags as long as you use them properly. However, be careful that you do not repeat your keywords too many times. In some cases, one to three times is the limit. (For specifics, see Part Two, Chapter Six, which covers each individual SE, because there are exceptions.) There is no clear standard for this rule; the best we can tell you is to study the individual SE recommendations that we outline in Part Two, Chapter Six.
Here is an example of where your <META> tags should be placed in your HTML document and an example of how they might look:
<TITLE>Absolutely Awesome Hawaiian Beachfront Vacation Rental Properties & Villas - Oahu Kauai Maui Hawaii</TITLE>
<META name="description" content="Absolutely Awesome Hawaiian beachfront oceanfront & golf course vacation rental properties & villas accommodations. Hawaii lodging for the luxury traveler. Located on Hawaii Oahu Maui Kauai Big Island Kona Kohala Kailua Lanikai Kehei Hana & Kaanapali">
Note: 248 characters in Meta description.
<META name="keywords" content="Hawaii, Villas, Vacation Rental Properties, Beachfront, Oceanfront, Golf Course, Oahu, Maui, Kauai, The Big Island, Kona, Kohala, Kailua, Lanikai, Kaanapali, Kehei, Hana, hawaii, villas, vacation rental properties, beachfront, oceanfront, golf course, oahu, maui, kauai, the big island, kona, kohala, kailua, lanikai, kaanapali, kehei, hana, Hawaii, Villas, Vacation Rental Properties, Beachfront, Oceanfront, Golf Course, Oahu, Maui, Kauai, The Big Island, Kona, Kohala, Kailua, Lanikai, Kaanapali, Kehei, Hana, hawaii, villas, vacation rental properties, beachfront, oceanfront, golf course, oahu, maui, kauai, the big island, kona, kohala, kailua, lanikai, kaanapali, kehei, hana, Hawaii, Villas, Vacation Rental Properties, Beachfront, Oceanfront, Golf Course, Oahu, Maui, Kauai, The Big Island, Kona, Kohala, Kailua, Lanikai, Kaanapali, Kehei, Hana, hawaii, villas, vacation rental properties, beachfront, oceanfront, golf course, oahu, maui, kauai, the big island, kona, kohala, kailua, lanikai, kaanapali, kehei, hana, Hawaii, Villas, Vacation Rental Properties, Beachfront, Oceanfront, Golf Course, Oahu, Maui, Kauai, The Big Island, Kona, Kohala, Kailua, Lanikai, Kaanapali, Kehei, Hana ">
Note: 1196 characters in Meta keywords
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#7F7F7F" ALINK="#0000FF">
In this example you see the <META> tags we used for a client to describe his web site dealing with precisely what the contents suggest. In the <META> description we wrote as many keywords in Plain English sentences as possible and limited the content to just less than 250 characters (counting all spaces and periods). We did this because this description is used by some SE's as the summary for this web site. That means that people will see it... and use it to decide whether or not to click on this link. We limited it to less than 250 characters because SE's seldom, if ever, display more than 250 characters as a summary... and sometimes they display fewer characters.
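Counting characters by hand is tedious, so here is a minimal Python sketch that pulls the <META> description out of a page and reports its length. The 250-character limit is taken from the rule of thumb above and passed in as a parameter; the class and function names are our own, and the sample page is hypothetical.

```python
from html.parser import HTMLParser

class MetaDescription(HTMLParser):
    """Remember the content of the last <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if tag == "meta" and name == "description":
            self.description = attributes.get("content") or ""

def description_length(html, limit=250):
    """Return (length, fits_within_limit), or None if no description exists."""
    parser = MetaDescription()
    parser.feed(html)
    parser.close()
    if parser.description is None:
        return None
    length = len(parser.description)
    return length, length <= limit

# Hypothetical page for illustration only.
page = ('<HTML><HEAD><TITLE>Villas</TITLE>'
        '<META name="description" content="Hawaii lodging for the luxury traveler.">'
        '</HEAD></HTML>')

print(description_length(page))
print(description_length("<html><head><title>No meta</title></head></html>"))
```

A result of None tells you the page has no description at all, which is the case the next paragraphs warn about.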
The reason we offer the <META> tag description as a summary is that some SE's will otherwise pull a random paragraph from your text as a summary if there is no <META> description. In some cases that means that the person who is searching could get a nonsense description of a web page... for instance, here is an actual summary example taken from Alta Vista from a page that did not use a <META> description...
URL Link: scuba diving maui hawaii photos
Summary: click to go home.
Now, we are sure that this company did not want "click to go home" as its summary, but that is what they got because they had no <META> description.
I'm sure you can see the point. Would you click on a URL with that description?
On the other hand, here is an example of a page using a <META> description that was pulled from Alta Vista using the exact same search:
URL Link: Scuba Dive Hawaii - Sun Seeker Charters!
Summary: Sun Seeker Scuba Diving Charters! Welcome aboard for Scuba Diving or Whale Watching on the Sun Seeker Yacht in Hawaii on the Big Island. Come to Kona for
In this case, Alta Vista used the first 150 characters of the <META> description and truncated the description... however, this is by far superior to having a summary selected from the page at random. For your information, here is the complete <META> description below.
<META name="description" content="Sun Seeker Scuba Diving Charters! Welcome aboard for Scuba Diving or Whale Watching on the Sun Seeker Yacht in Hawaii on the Big Island. Come to Kona for overnight excursions and scuba diving with Paul Warren on his Dive Boat">
As you might accurately conclude, you should put the most important part of your description first because the SE may not use all of it, as was the case in the example above.
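You can preview which of your keywords would survive such a cut before you submit the page. The Python sketch below assumes a simple cut after a fixed number of characters (150 here, following the Alta Vista example); real engines may cut elsewhere or at a word boundary, and the function name keywords_in_preview is our own.

```python
def keywords_in_preview(description, keywords, shown=150):
    """Map each keyword to True if it appears in the first `shown` characters."""
    preview = description[:shown].lower()
    return {keyword: keyword.lower() in preview for keyword in keywords}

description = ("Sun Seeker Scuba Diving Charters! Welcome aboard for Scuba Diving "
               "or Whale Watching on the Sun Seeker Yacht in Hawaii on the Big Island. "
               "Come to Kona for overnight excursions and scuba diving with "
               "Paul Warren on his Dive Boat")

result = keywords_in_preview(description, ["Scuba Diving", "Hawaii", "Dive Boat"])
for keyword, visible in result.items():
    print(keyword, visible)
```

Any keyword reported as False sits too far back in the description and should be moved forward if it matters to you.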
On the other hand, the <META> keywords are seen only by the SE's. Getting back to our previous example above... In this case we were careful to use each word no more than the currently acceptable number of times and we alternated between capitalized and non-capitalized words in order to hedge against going unnoticed if the SE happens to show a preference based upon how the person enters the keyword into the search engine. In addition, the meta keyword tag area is a good place to put misspellings of your keywords that are not found in the body of the page.
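Because capitalized and non-capitalized versions are easy to lose track of, it can help to count how many times each term really appears in your keywords tag. This Python sketch treats upper and lower case as the same term and flags anything above a limit you choose. The default of 3 comes from the "one to three times" rule of thumb given earlier, not from any SE's published rule, and the function names are our own.

```python
from collections import Counter

def keyword_repeats(content):
    """Count comma-separated terms in a keywords tag, ignoring case."""
    terms = [term.strip().lower() for term in content.split(",")]
    return Counter(term for term in terms if term)

def over_limit(content, limit=3):
    """Return the terms that appear more than `limit` times, sorted."""
    counts = keyword_repeats(content)
    return sorted(term for term, count in counts.items() if count > limit)

# Hypothetical keywords content for illustration only.
content = "Hawaii, Villas, hawaii, villas, Hawaii, Oahu, hawaii"

print(keyword_repeats(content))
print(over_limit(content))
```

Set the limit per engine once you know what each one currently tolerates.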
Contrary to what some people think, simply installing <META> tags is not usually enough to ensure top positioning for your web page(s). However, the <META> description is useful when the SE uses it as a summary for your page. Using a <META> description gives you some degree of control over how the search engine presents your link to the viewing public.
Remember that <META> tags and the keywords in them are used like mini indexes of your web site. Plan to put effort into finding and using words and phrases for your <META> tags that are simple. Be sure to use words and combinations of words that the AVERAGE person would use.
Warning: Do not try to fool the search engines with words & phrases that are unrelated to the content of your web site with <META> tags. It is not only unethical, it can work against you because some search engines may refuse to even list your site if you use unrelated keywords or repeat the same keywords within the meta tags too many times.
You must be aware that some companies have sued because they found their trademarked words hidden within other sites' <META> keyword tags, or found other means being used to deceive their customers. At this time the lawsuits are not settled, but many people believe this is highly unethical, and at some point in the future it may even be illegal.
Also, keep in mind that "too many times" is defined by the SE's, not by this book. In other words, do your homework... check on your competition and see what the successful page designers are doing -- then follow their examples so that you do not get squeezed if (when) the SE's change the rules.
Later, we will give you specific <META> tag guidelines for the Top Eight SE's. For now, you have simple, yet precise, info on how to use them effectively for the SE's that support <META> tags -- while reducing the chance of being penalized by the SE's that do not support their use. Keep in mind that the rules can change. Therefore, it will be in your best interest to constantly monitor the top ten pages of several keyword specific searches in the various SE's to see if the pages that are coming to the top are using <META> tags. There may come a time when word frequency and repeating keywords more than once could be penalized. Watch for it!
...and if you would like a second opinion as to whether or not your <META> Tag keywords and descriptions are properly written, here is a free utility that will check them for you... go to: http://www.northernwebs.com/set/setsimjr.html and enter your URL's before you register your web pages with the SE's.
Chapter Three - Tricks & Illusions
In this Chapter, we will present what we call the Tricks & Illusions of web page design -- techniques that are being used -- or have been used in the past -- to improve positioning on the SE's.
Before we continue, allow us to make something perfectly clear. We may not agree with many of these so-called tricks. However, if the keyword category that your company is competing in is very competitive, you may need to use some of these techniques if they are working and being used by your competition. Therefore, we feel it is only fair to alert you to every known trick that has ever been used to improve positioning on the search engines.
Frankly, in today's search engine climate, many of these techniques can hurt you more than help you... it depends entirely upon which SE you are submitting your web pages to. At the very least, you must know what has been used, and what is being used, on the Internet to improve positioning on the SE's -- and unless someone points these techniques out to you, you will find yourself wondering how certain pages have climbed into the top ten lists... and in other cases, you may wonder why your page was penalized. Later on, in Chapter 6, we explain on a case by case basis when to, and when not to, use the various techniques that we outline in this section.
This is a VERY important section, so read carefully. Enough said... here we go behind the doors of Internet Web Page Design...
Trick #1 - Keyword spamming and stuffing a.k.a. spamdexing
You probably have already seen this, or at least figured it out, but this technique consists of repeating keyword(s) over and over in text -- usually at the top of the page and/or at the bottom of the page in very small letters, e.g. <font size=1> or headline <H6>. In addition, spamdexing can also be found in some <META> tags and even <TITLE> tags -- in order to find it, however, you will need to check the Source code. In Netscape you can do this by clicking View, then Document Source. In Microsoft Internet Explorer you can access the Source code by clicking View and then Source. By the way, these are the only two web browsers that you should be using. You should definitely use the latest versions or else you will be missing what the web really looks like -- and you will not be able to determine what most Internet users are seeing when they view your web pages.
An example of spamdexing could look like this:
<H6>hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii hawaii</H6>
In the past, some people have gone overboard with it and included totally irrelevant keywords in an effort to appear in every search regardless of the topic. This is stupid! ...and it will no longer work. In fact, chances are that using irrelevant words will sooner or later get your domain banned from most search engines.
In addition, repeating keywords over and over will cause your pages to be penalized or rejected by most search engines.
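When you analyze a competitor's source code, or your own, the plainest sign of this kind of stuffing is the same word repeated back to back. The Python sketch below finds the longest such run in a piece of text. It is only our illustration of what the pattern looks like; we do not know what test any SE applies, and the function name longest_repeat_run is our own.

```python
import re

def longest_repeat_run(text):
    """Return (word, run_length) for the longest run of one repeated word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    best_word, best_len = None, 0
    run_word, run_len = None, 0
    for word in words:
        if word == run_word:
            run_len += 1
        else:
            run_word, run_len = word, 1
        if run_len > best_len:
            best_word, best_len = run_word, run_len
    return best_word, best_len

print(longest_repeat_run("hawaii hawaii hawaii beach"))
print(longest_repeat_run("The Hill Haus Motel in Port Angeles boasts an unlimited panoramic view."))
```

A normal sentence gives a run length of 1, while a stuffed block like the one shown above gives a run as long as the block itself.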
A more acceptable way to stuff keywords today involves the technique of working them into regular sentences as much as possible up to the point that your keywords and phrases blend to create the correct keyword density mix.
Note: On some search engines you may find pages that are ranked high and DO include repeating keywords. As is the case with Alta Vista, these are older pages -- maybe even two years old. The pages have not yet been removed by the search engine, but if they were submitted today they would likely be scored low or rejected altogether, or they might get your site banned. Be careful not to emulate very old pages. When given a choice, select the most recently indexed pages to emulate when analyzing your competition for the correct winning formulas.
Note: On some pages, if you check carefully, you will see literally hundreds of keywords & phrases printed in visible or even invisible text across the bottom of the web page -- this is called Tail Tagging. It is another form of keyword stuffing (spamdexing) that increases the keyword density of the web page. It is sometimes effective, but generally not at the present time. Again, it depends on the search engine.
We have already touched upon this one, however, we will now expand on our explanation and show you how this is sometimes done effectively.
If your goal is to add keywords, phrases, or sentences that you want the SE's to see yet be invisible to the actual page viewer -- invisible or semi-visible text will accomplish this for you. Simply put, by setting the paragraph font <font color=#??????> the same or almost the same color, as the background color <BGCOLOR= #??????>, the text will blend into the background and appear invisible.
For instance, if you want white, replace the #?????? with #FFFFFF. If you want a light yellow, use #FFFF66, and so on.
To get semi-visible text, you can offset one of your colors a little bit making it invisible to the viewer without letting on to the SE that the text is not visible. For example, set the white background color to <BGCOLOR="#FFFFFF"> then, to make your text slightly off white, use <font color="#FEFEFE" >. For some page designs, you may want to use <background="image.gif"> instead. This trick will work there too. Just keep in mind that if the background image is too big in file size, your text may be visible while the image loads. The two techniques can be combined to prevent this, just remember to keep the value in the <BGCOLOR> tag slightly different than the one in the <font color> tags.
Warning: You must be CAREFUL using hidden text. Some engines, for example, may automatically reject any page they find that has the same background color as the font color. Google (and in some cases AltaVista) will also reject pages, and possibly ban sites, that have ANY hidden text IF one of their human Spam Task Force members views them manually, regardless of how close the text color is to the background color!
For a complete color chart check the help contents of your HTML editing program or go to: http://tanega.com/java/color3.html and there you will find a very handy and easy to use HTML color chart complete with HEX codes.
If you want to see if one of your competitors is using this technique, hold down the left mouse button and drag it along the top or bottom of the web page. If invisible text is present, dragging the mouse across it with the left button held down will render it visible. Another way to make sure you have viewed everything on the page is to click your browser's edit/select all function and all of the text will be selected.
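The manual check above can also be approximated in code once you have read the text color and background color out of a page's source. The following is only a rough sketch: the function names and the threshold of 16 are our own assumptions for illustration, not a rule that any search engine is known to apply.

```python
def hex_to_rgb(color):
    """Convert an HTML hex color such as '#FEFEFE' to an (r, g, b) tuple."""
    value = color.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def looks_hidden(text_color, background_color, threshold=16):
    """Return True when no color channel differs by more than `threshold`.

    The threshold is an arbitrary illustrative value (an assumption).
    """
    text_rgb = hex_to_rgb(text_color)
    back_rgb = hex_to_rgb(background_color)
    return all(abs(t - b) <= threshold for t, b in zip(text_rgb, back_rgb))


if __name__ == "__main__":
    # Off-white text on a white background, as in the example above.
    print(looks_hidden("#FEFEFE", "#FFFFFF"))
    # Dark blue text on a white background.
    print(looks_hidden("#000088", "#FFFFFF"))
```

A check like this only covers colors declared in the page itself; text hidden against a background image would still need the mouse-drag test.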
A pointer page is a page that uses keyword density or other techniques to score high in a given search and has one or more Links that all point to your main page. Here is an example...
<TITLE>Sailing Kauai, Sail Hawaii</TITLE>
<body text="#000088" link="#3366ff" vlink="#663399" BGCOLOR="#FFFFFF"><font size=2>
<CENTER><a href="index.html"><IMG SRC="logo.gif" border=0></a></CENTER><BR><BR>
<CENTER><a href="index.html"><IMG SRC="boat4.jpg"></a><BR></CENTER>
<CENTER><H1><a href="index.html">Sailing Kauai, Sail Hawaii</a></H1></CENTER><BR>
In this page there are only four viewable words: Sailing Kauai, Sail Hawaii. The page links to the client's main page. Each word has a density rating of 25%, which normally is extremely high... in fact, sail appears in two of the four words (Sailing and Sail), so it actually has a density rating of 50%. You will also notice that the <TITLE> matches the text exactly, which further increases the chance that this page will score high in a search for sail hawaii or sail kauai.
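The density arithmetic above can be reproduced with a few lines of code. This is a minimal sketch, assuming density simply means matching words divided by total viewable words; the function names are ours, and real engines may count differently.

```python
import re


def viewable_words(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_density(text, word):
    """Percentage of viewable words that are exactly `word`."""
    words = viewable_words(text)
    if not words:
        return 0.0
    return 100.0 * sum(w == word.lower() for w in words) / len(words)


def stem_density(text, stem):
    """Percentage of viewable words that contain `stem` anywhere in them."""
    words = viewable_words(text)
    if not words:
        return 0.0
    return 100.0 * sum(stem.lower() in w for w in words) / len(words)


if __name__ == "__main__":
    page_text = "Sailing Kauai, Sail Hawaii"
    print(word_density(page_text, "kauai"))  # one word out of four
    print(stem_density(page_text, "sail"))   # matches Sailing and Sail
```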
The idea behind a pointer page is to optimize different techniques for a specific search engine or a specific keyword that you are working on. You will likely need to make multiple pointer pages to optimize your results. For example, http://www.mysite.com/page1.html, http://www.mysite.com/page2.html and so on. Please refer to Chapter 6 for details on each specific engine.
NOTE: if you put images into a pointer page, you must make them tiny (file byte size wise, that is). In this case the images that we included in this page totaled less than 6K. If you put images that are any larger, the page will take too long to load and you will likely alienate your customer (by making them wait) before they even see your main page.
Here's a trick that most search engines hate and we do not like to use. However, it has been successfully used in the past and you might encounter it. We will therefore explain what it is.
If you ever click on a link and notice a page loading that automatically then loads another page (without any action from you) you have encountered a redirect page.
The redirect page may score high on a search because, again, it takes advantage of the correct mixture of keyword density.
In any case, the HTML code contains the line...
<META HTTP-EQUIV="refresh" content="1;URL=index.html">
and an actual HTML document that we pulled from the web looks like this;
<HEAD><META HTTP-EQUIV="refresh" content="1;URL=hawaii-index.html">
<TITLE>HAWAII | Hawaii Lodging Guide | Hawaii</TITLE>
<META Name="description" Content="Hawaii Lodging Guide">
<META Name="keywords" Content="Hawaii">
<BODY BGCOLOR="#ffffff" TEXT="#ffffff" LINK="#FF0000" VLINK="#FF0000" ALINK="#00ff00">
<font size=+3> <a href="hawaii-index.html"><B>
Since this page has only one viewable text word Hawaii, the keyword density is 100% Hawaii.
Hopefully you will not need to resort to tactics such as this. However, if any particular SE allows it -- and your competitor uses it -- you may need to design a redirect page out of self defense. You can do so by copying the HTML code above and simply replace the info on the page with your own info -- then register your redirect page(s) with the appropriate Search Engine that is allowing it.
Most search engines will not allow you to submit a page with a meta refresh redirect, so be sure to read the details on each engine in Chapter 6.
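If you save a competitor's source code, you can check it for a meta refresh line without reading it by eye. The sketch below uses Python's standard html.parser module; the class and function names are our own, and it only looks for the META refresh form shown above.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect (delay, target) pairs from META HTTP-EQUIV="refresh" tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        delay, _, rest = attributes.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:]
        self.targets.append((delay.strip(), rest))


def find_meta_refresh(html):
    """Return a list of (delay, target) pairs found in the HTML text."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets


if __name__ == "__main__":
    sample = '<HEAD><META HTTP-EQUIV="refresh" content="1;URL=hawaii-index.html">'
    print(find_meta_refresh(sample))
```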
Here is a standard technique that you can use to increase the effectiveness of some of your keywords. Whenever appropriate, add an s to the end of the word. For example, if you are in the book business, and one of your key phrases is book sales, then add an s to the end of book. By doing so, your page will be found for both book and books - since, obviously, some people will use the plural when searching while others will use the singular. In any case, your page will qualify to be found by the SE. The same goes for words such as beach / beaches etc.
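The reasoning behind the plural trick is easiest to see if you assume the engine matches a search term anywhere inside a word on the page. That is an assumption on our part; an engine that matches whole words only would behave like the second function in this sketch.

```python
def substring_match(query, page_text):
    """True if the query appears anywhere in the page text, even inside a word."""
    return query.lower() in page_text.lower()


def whole_word_match(query, page_text):
    """True only if the query equals one of the page's whitespace-separated words."""
    return query.lower() in page_text.lower().split()


if __name__ == "__main__":
    # A page using the plural is found for both forms under substring matching.
    print(substring_match("book", "books sales"))
    print(substring_match("books", "books sales"))
    # A page using only the singular misses the plural search.
    print(substring_match("books", "book sales"))
```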
OK, here's an area where you need to use some good judgment.
Let's suppose that you own a Motel in Port Angeles, Wash... Nearby is the Olympic National Park, the Sol Duc Hot Springs, the ferryboat to Victoria, hiking, fishing, etc. You should program your page to come up high on a search for the following keyword searches.
Port Angeles, Port Angeles Motel, Olympic National Park, Victoria ferry, Port Angeles salmon fishing, Sol Duc Hot Springs, etc.
Here's why. People who are looking for information on these activities are likely to need lodging close by. If your Motel happens to be the gateway to other services, then most people will not object if they find you while looking for info on the other guys. Again, this is a real life example -- this is precisely what we did for a client recently and guess what. The Hill Haus Motel comes up in the top 10 on many of the above searches on many engines.
Now that you know how to do this... Please, please, please, do not attempt to put unrelated, irrelevant keywords into your pages in an attempt to better your position and be careful about using other companies' trademarks. Here's a STORY as to why -- Not too long ago, a very aggressive company in Florida sold snow crabs. They quickly became the Kings of spamdexing by repeating every city, state, county, national park & monument, football teams and colleges -- as well as many commonly used words like, sex, nude, pictures, adult, women, software, erotica, gay, naked, etc. -- into their pages and repeated each word literally hundreds of times. The result -- they showed up on almost every search on several SE's. To make matters worse, they had at least 50 different URL's... so all of their pages monopolized the first 30 or so positions on any given search. Did it work? Sort of -- it got them to the top of the search engines but since people were not looking for that kind of info, it created a backlash toward the company.
Eventually, they disappeared from the SE's... they were either kicked out or their technique did not produce significant sales.
The point is... be appropriate when you program your HTML documents (web pages). After all, we are giving you The Keys to the Vault and you will find it unnecessary to flagrantly cheat in order to get noticed when you appropriately apply the information you are learning here. OK?
You may or may not know this, but anything that you put into a tag that starts with an " ! " (exclamation point) is invisible to the viewer. This is called a comment tag and it looks like this <!-- your comment here -->
This is an ideal place to put additional hidden keywords into your document to increase the page's keyword density. Here's an example:
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
<!-- Here's the Search Engine Tool you'll need in order to design your pages to be found in the Top Ten keyword search of every major search engine. -->
This trick has worked in the past. However, it does not currently work on Excite, AltaVista, HotBot, WebCrawler or Lycos. Some of the other, less popular engines may still make use of it, and we mention it so that you will recognize the technique should you happen across it.
The top Search Engines no longer consider ASCII, numerical and alphabetical listings with the same importance as in the past -- especially since Yahoo quit listing alphabetically as their default method of choosing which pages received top billing -- however some of the minor SE's may still give preference to this type of order.
If ever you should find this to be the case, choose a web page <TITLE> that starts with an A or better.
Better than an A you say? Yes... as a matter of fact there are many ASCII characters and numbers that are much better than an A. When alphanumerical order is used, ASCII characters and numbers appear before alphabetical letters.
Here's the normally accepted order...
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c... etc.
Therefore something like...... !!#1-A -- should be listed before anything that starts with an A . However, remember that you will want a person to click on your link... therefore you will have to decide what works best from a computer vs. person compromise perspective.
Something like...... "A-1" Hot Tubs could be very effective, considering that " (quotation mark) is second only to ! (exclamation point) on the ASCII chart.
Keep in mind that search engines are getting smarter and site reviewers are aware of the alphabetical tricks. If you use a <TITLE> that is alphabetically privileged, then you had better make sure that your web page justifies such a <TITLE>.
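You can test how a set of titles would fall under plain ASCII ordering before you commit to one. Python's built-in sorted() compares strings by character code, which matches the ASCII chart for these characters. The sample titles other than the two taken from the text above are made up for illustration.

```python
def ascii_order(titles):
    """Return the titles sorted by character code (ASCII order for these characters)."""
    return sorted(titles)


if __name__ == "__main__":
    sample_titles = [
        "Aardvark Spas",      # hypothetical title
        "A-1 Hot Tubs",       # hypothetical title
        '"A-1" Hot Tubs',
        "acme spas",          # hypothetical title
        "1st Choice Spas",    # hypothetical title
        "!!#1-A Hot Tubs",
    ]
    for title in ascii_order(sample_titles):
        print(title)
```

Note that uppercase letters sort ahead of lowercase ones, so a title in all lowercase falls behind every capitalized title.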
The problem with images in your web page is that the SE's do not index them at all. Therefore, your Logo may say what you are, who you are, and even state a benefit but the SE's don't show that.
In fact, if your image loads higher on your page than your text... your page is automatically handicapped in a BIG way!!!!! This is a mistake. Do not load images higher on your page than keyword text for reasons that we have stated previously in this book.
However, when you do put images in your page, make them work for you, not against you. Here's how. Always include an ALT attribute (ALT="here is a list of my keywords") in your <IMG SRC="image"> tags. Here's an example of one that we have used.
<IMG SRC="logo.jpg" Alt="Absolutely Awesome Beachfront Hawaii Vacation Rentals - lodging - villas - homes - accommodations - Oahu - Maui - Kona - Hawaii - Kauai - Big Island" height=116 width=537>
The reason for doing this is that some SE's (at this time only Lycos, AltaVista and Google) index the ALT text -- by using the ALT attribute of the <IMG> tag to contain your keywords, you will give your web page additional keyword / phrase help. In many cases it will improve the position of your web page(s) on the SE's -- and it is a simple thing to do.
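To see what ALT text a page is offering to the engines, you can pull it out of saved source code. This is a small sketch using Python's standard html.parser module; the class and function names are our own, and it says nothing about how any particular engine weighs the text.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the non-empty ALT text of every IMG tag."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lowercase.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "alt" and value:
                self.alt_texts.append(value)


def collect_alt_text(html):
    """Return a list of ALT strings found in the HTML text."""
    collector = AltTextCollector()
    collector.feed(html)
    return collector.alt_texts


if __name__ == "__main__":
    sample = '<IMG SRC="logo.jpg" Alt="Hawaii Vacation Rentals - lodging" height=116 width=537>'
    print(collect_alt_text(sample))
```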
Now that you know how to use the <IMG ALT=> to your advantage, we'll tell you about one of the sneakiest tricks we have found on a web page... the phantom pixel.
Suppose you had an image that was so small it could not be seen. Imagine an image that is only one pixel square... tiny -- and that single pixel is WHITE against a WHITE background.
What's the point you ask?
It would be invisible, you say? That's exactly correct. It would be invisible to the web page viewer... and since the SE's don't see images, it wouldn't matter, ...right?
Wrong. Now, let me ask you another question. What if you loaded the <IMG ALT=> with TONS & tons of keywords and phrases?
That's right. You would have an image that loads quickly, takes up no physical space and is invisible to the web viewer -- but gives a mountain of keyword information to the SE because the SE sees the <Img ALT="keywords"> - and therefore indexes your page accordingly.
This technique has been very popular. If you have ever seen a page that is in the Top 10 but you can find no reason whatsoever in the source code* for the page to have earned a Top 10 position, then you may have found a page that is using a bait & switch approach to scoring high in that particular search engine.
Here is how it is done. A page is submitted to the search engine that is designed specifically to appeal to that particular search engine. In some cases, it is even a nonsense page that is laden with all of the components that a search engine robot looks for in a specific keyword/keyphrase category. Then, once the page is indexed and linked by the search engine, the page is switched to the REAL page that the company wants you to see.
We have heard of instances where competitors have helped out their competition by resubmitting the competition's switched page -- and in such cases the page will fall like a stone in the ratings. Remember, we are not advocating such tactics... we are only reporting what is taking place in the competition for SE positioning.
Here is what we feel is one of the sneakiest and most complicated tricks of all. Basically, the Food Technique is a sophisticated variation of the bait & switch... with one very important difference.
Here is how this works. When a SE robot comes visiting to index a website they first identify themselves with a calling card and an IP number. Some companies have gone to the trouble of identifying each of the IP numbers of certain SE robots and when a certain robot, like AltaVista, comes calling they feed the robot the page that they want that robot to see (by using a server side CGI script). However, when a normal visitor clicks on a link to visit the page, the CGI script dishes up the consumer version of the page... the real page.
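To make the mechanism concrete, here is a sketch of the decision such a script has to make. It is written in Python rather than as a CGI script, and the IP address, robot name, and file names are all hypothetical placeholders; we are showing the logic so that you can recognize it, not a tested implementation.

```python
# Hypothetical table of known robot addresses (a documentation-only IP range).
ROBOT_ADDRESSES = {
    "192.0.2.10": "example-robot",
}


def choose_page(remote_address, robot_addresses=ROBOT_ADDRESSES):
    """Return the file to serve, based only on the visitor's IP address."""
    if remote_address in robot_addresses:
        # A recognized robot gets the page written for that engine.
        return "robot-page.html"
    # Everyone else gets the consumer version, the real page.
    return "visitor-page.html"


if __name__ == "__main__":
    print(choose_page("192.0.2.10"))
    print(choose_page("198.51.100.7"))
```

Because the choice is made on the server, nothing in the source code a normal visitor sees reveals that a second page exists.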
When you encounter this technique you will see nothing in the source code that indicates why the page is ranked high on the search. In addition, if you were to help the company by resubmitting their page it would be unlikely that your efforts would have any significant effect on the page's positioning... in other words, the page would most likely remain somewhere high in the rating.
The bad news is that unless you have the server side resources and available expertise to write such a CGI script, you are unlikely to be able to match this particular sleight of hand.
The good news is that this technique is mostly being used in only the extremely competitive categories. In addition, you can still be successful provided that you apply the successful formulas in this book to your real pages up front.
Also, please keep in mind that one reason we explain this technique is so that you will not think yourself crazy in the event that you encounter a website that is highly rated with no visible means of having earned such a rating in the viewable source code.
location.replace("http://www.domain.com");
Note: Search engines really do not like this technique because of abuse in the past and many will view it as spamming. We do not know of any of them that automatically check for it, but because of the location.replace text inside, they could easily ignore pages that have it. If a human editor sees this technique, they will likely remove the page from the index.
The <input> Hidden type tag is part of a form that is generally used for pages that use Forms to collect information. On occasion you will see some pages using this tag to try and increase the number of hidden keywords in a page. This doesn't work with all SE's. If you want to try this one, attempt to format your keywords in an English sentence to avoid any penalties for spamdexing.
The HTML code is:
<INPUT TYPE="HIDDEN" NAME="hidden" VALUE="put your keywords here">
Note : Make sure you place it outside a real form or it may interfere with your real hidden tags.
There is one form tag that we have been testing that will get indexed. It isn't actually hidden text, but it does get indexed on some engines like Excite and Google. The part of the form that will get indexed is the text following the <option> tag. This is the text that you see inside some of those drop-down forms and menus. Here is the HTML code for an example:
<FORM ACTION="http://www.server.com/cgi-bin/redirect.pl" METHOD="GET">
<select name="newlocation" size="1">
<option value="http://www.server.com/page1.html" SELECTED> keyword1
<option value="http://www.server.com/page2.html"> keyword2
<option value="http://www.server.com"> keyword3
</SELECT><INPUT TYPE="SUBMIT" NAME="SUBMIT" VALUE="Go!"></FORM>
This trick closely resembles Trick #10, the Phantom Pixel but the sole purpose is to get more of your pages indexed into some of the difficult search engines, such as Excite and AltaVista. Several of the deep search engines have begun limiting how many pages you can directly submit to their add url feature or they have limited how many pages you can submit per day. To help explain how this will work to your advantage, first we must discuss how a search engine spider crawls or travels through the web.
When a search engine is out crawling the web, it will often start with pages within its own index and check to see if there have been any updates made to those pages. When the engine's spider visits a page and it discovers new links, the spider will attempt to index those new link's urls too, and add them to the engine. To make this work to your advantage, it is a good idea to put links to ALL of the pages you want indexed on EVERY page of your site. That way, no matter how the spider finds your site, it is only one link away from any page.
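The "one link away" idea can be checked on a small model of your site. The sketch below treats the site as a dictionary of page names and the links on each page, and does a breadth-first walk the way a spider following links might. The page names are made up, and real spiders apply limits that this model ignores.

```python
from collections import deque


def click_depths(links, start):
    """Return how many links a spider must follow from `start` to reach each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


if __name__ == "__main__":
    # Pages linked in a chain: the last page is two links from the first.
    chain = {"home": ["products"], "products": ["order"], "order": []}
    print(click_depths(chain, "home"))

    # Every page links to every other page: everything is one link away.
    full = {
        "home": ["products", "order"],
        "products": ["home", "order"],
        "order": ["home", "products"],
    }
    print(click_depths(full, "order"))
```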
The first thing you might say if you have a large site is, I don't want to fill all my pages with links! That's where the hidden pixel (Trick #10) comes into play. By using the small hidden pixels on your pages, you can add many links without affecting the look of your web pages, and still reap the benefits of getting more pages indexed.
Here is an example on how to put in several hidden links:
<a href="http://www.server.com/pagetoindex.html"><img src="pixel.gif" border="0"></a>
<a href="http://www.server.com/secondpagetoindex.html"><img src="pixel.gif" border="0"></a>
<a href="http://www.server.com/thirdpagetoindex.html"><img src="pixel.gif" border="0"></a>
<a href="http://www.server.com/fourthpagetoindex.html"><img src="pixel.gif" border="0"></a>
The above example would only take up 4 pixels of viewable space on your web page. Don't forget, you want to use an image that is 1 pixel wide by 1 pixel tall, and it needs to be either the same color as the background or transparent. Be sure to set the border to 0 so that the blue outline doesn't show up. It's also important not to have any spaces between the >< parts of these links, because they could show up on the page as a blue _. We recommend NOT using the height and width attributes for the image tag (width=1) because a search engine could use them to ignore links like this. If you do wish to use the height and width attributes for the img tag, make them something other than 1.
An added benefit is some engines that measure link popularity will score your pages higher! Link popularity is a term search engines use to measure how popular a web site is based on how many other web pages or sites have links to them.
In case your site has a large number of pages, then the above method may not be totally practical. In that case, build a single table of contents or site map file that has links to all of the pages you want indexed on your web site. Then on each of your web pages, put a hidden link, like the example above, to that table of content file. This helps tremendously in getting more pages into engines like Excite and AltaVista!
Earlier we mentioned that you must use either the latest versions of Netscape or Microsoft Explorer to browse the World Wide Web -- Forget the reasons, just believe us. If you are serious about making money on the Internet, then you must use professional tools -- like it or not, everything else in the realm of Web Browsers is second rate.
That said, we will tell you how to analyze your competition so that you can build yourself a web page (and web site) that goes one better and will significantly improve your chances of making the Top Ten Hit List in your selected search fields.
First, -- be systematic -- start with a single search engine...
We suggest that you start with Google, Lycos, AltaVista, HotBot, and Northern Light as these are the simplest engines to submit to. Then move on to Excite and WebCrawler, and save Yahoo and then ODP for the last. Yahoo and ODP have totally different criteria that we will tell you about in their respective sections found in Chapter 6. In fact, we will save you a HUGE amount of frustration dealing with these engines/directories that will be worth far more than the price you paid for this book - and should get you the best possible results in the process.
Then enter, one at a time, each of your keywords and keyword phrases into that SE's search function in order to see what is currently the Top Ten hit list in each particular category.
Next... click on the first URL at the top of the list... go to that website. Once their page has loaded try to determine what caused it to land on top.
Look for keywords, invisible text, etc... Use the information that you have learned so far to analyze their page.
Viewing the Source Code
Next... you need to look at the code behind their page. Click View, then click Source. Here you will see if the page contains Meta descriptions, Meta keywords, <IMG ALT=> tags, spamdexing, and so on.
If the page is particularly well done (format wise), save the source code to a file. You may want to use it later as a template to build your own page(s). In fact, every time you find a page that works well with a Search Engine, save it in a separate file.
Be sure to keep track of your research as you go. Perhaps the best way to do this is to open the source code file that you save and write your notes at the top of the HTML document. Keep track of which SE it appeared on top with and what you think caused it to make the top of the list.
After you have finished the top ten in one keyword search, go to another keyword search with the same SE and repeat the process. After a while you will notice that some pages keep popping up in a variety of searches. Those are the ones that are well done. They are the ones you should use as a template. You've found the pages to beat and you've captured the code for the ones that are working! ...and you even know why they are working..... Congratulations!
In highly competitive areas, what you see isn't always what the search engine indexed. Some businesses are using the Bait and Switch (Trick #11) or the Food Technique (Trick #12), and you won't be looking at the correct source code. In those cases, you might find a page that doesn't seem to use the same number of keywords or the same format as the other pages. If so, disregard that page and analyze the pages that appear to be similar to one another and score closely. Here are some clues that often are associated with pages of this type:
- The title from the search engine listing doesn't match the text within the <title> area of the page.
- The summary or description from the search engine listing doesn't match the text within the <meta> description on the page.
- The file size may be different.
- The file extension isn't .htm or .html.
If one of these four clues applies to the page you're studying, it's best that you skip that one and analyze a different page. It's important to note that these indications don't mean that those sites are using special tricks, it just means they MIGHT be.
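If you are keeping notes on many pages, three of the four clues can be checked mechanically once you have typed in what the listing showed and what the page source contains. This is a rough sketch with names of our own choosing. It leaves out the file size clue, since that needs a size reported by the engine, and an engine that shortens its summaries will trigger the description clue even on an innocent page.

```python
from urllib.parse import urlparse


def switch_clues(listing_title, page_title, listing_summary, page_description, url):
    """Return a list of clues suggesting the indexed page may differ from the live one."""
    clues = []
    if listing_title.strip().lower() != page_title.strip().lower():
        clues.append("title mismatch")
    if listing_summary.strip().lower() != page_description.strip().lower():
        clues.append("description mismatch")
    # Look only at the file name portion of the URL path.
    file_name = urlparse(url).path.rsplit("/", 1)[-1]
    if "." in file_name and not file_name.lower().endswith((".htm", ".html")):
        clues.append("unusual file extension")
    return clues


if __name__ == "__main__":
    print(switch_clues(
        "Sail Hawaii", "Welcome to our site",
        "Boat tours", "Boat tours",
        "http://www.server.com/cgi-bin/page.cgi",
    ))
```

As with the manual check, an empty result does not prove a page is clean, and a clue does not prove a trick is in use.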
Now remember, we are being systematic. Completely analyze one SE before you move on to the other... and then, repeat the process until you have finished all of them. Very soon you will become your own expert on what works and what does not work... because you now know what you are looking at -- as well as what you are looking for!!!
NOTE: here's a good tip. While you are doing your research, browse the web with your images turned off. That way the pages will load much faster! ...and if at any time you want to see the images on the page you simply click the images button.
With Netscape, you can turn the images off by clicking OPTIONS then make sure the Auto Load Images is un checked (click it to check/uncheck it). Doing this will considerably speed up doing your research. When you want to see the images, simply click the Images button on the top bar of the main screen. This is how we frequently view the Internet and we recommend it.
With Microsoft Explorer, click VIEW then OPTIONS then un check show pictures. If you decide that you want to see the pictures, follow the same path and re check show pictures then go back to the main screen and click refresh ....and, yes, Netscape does work easier.
Once you start analyzing your competition, you will find that the true Internet professionals have several ways for you to find them. In other words, they have many web addresses for the same page or similar pages... In essence, they have multiple side doors into their main site.
This is easier to do than you might think. Even if you only have one actual page, you can create the illusion that you have several and get them all listed separately. Here's how.
Suppose your URL address is http://www.theirserver.com/yourdomain/
Ok, that is one address. But here are some more that should also get you to the same exact page....
http://www.theirserver.com/yourdomain/index.html [with index.html added]
http://theirserver.com/yourdomain/ [without the www]
http://theirserver.com/yourdomain/index.html [again, without the www]
Ok, so now you have four addresses that all go to the same exact page. This will work with some search engines. However, be forewarned that this technique is not something that the search engines like. It does happen quite often by accident when a search engine spider finds a link and follows it, similar to the above example.
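The four addresses come from two choices: with or without the www, and with or without the index file name. A short sketch can generate them for any directory address. It assumes the address you pass in points to a directory and that the server's default file is index.html; both are assumptions, and whether every variant actually resolves depends on how the server is set up.

```python
from urllib.parse import urlsplit, urlunsplit


def address_variants(url, index_name="index.html"):
    """Return the www / non-www and bare / index-file variants of a directory URL."""
    parts = urlsplit(url)
    host = parts.netloc
    bare_host = host[4:] if host.startswith("www.") else host
    hosts = ["www." + bare_host, bare_host]
    # Treat the path as a directory, so make sure it ends with a slash.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    paths = [path, path + index_name]
    return [urlunsplit((parts.scheme, h, p, "", "")) for h in hosts for p in paths]


if __name__ == "__main__":
    for address in address_variants("http://www.theirserver.com/yourdomain/"):
        print(address)
```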
By the way, this will not work with YAHOO!, so don't even try it. OK, here's a way to get as many URL listings as you can handle.
By having your own domain, you can have sub-pages. Design every sub-page as a potential entry page with links to all of your other pages. Just make sure you give each page a different unique <TITLE> AND you need to vary the content. Then register all of the sub-pages along with your main page.
Note: Avoid naming your pages in what appears to be a consecutive order (like the ones above with the 2,3,4,5), this makes it appear obvious that the pages are copies. Instead name your pointer or entry pages with unique files names like products.html, sample-products.html, items.html etc. OR you can have an UPPER level domain such as:
Here the sky is the limit (well, almost). You can have as many listings as you have URL's. It could literally be hundreds if you want and... this way you can design unique pages to fit each individual SE criteria. This can make you extremely successful on the Internet. This technique worked wonders in the past, but now in 2000 the engines are starting to watch for duplicate pages. It's important that you don't use exact duplicates of your pages. The titles, meta descriptions and the page content need to be a little bit different for each one. (Note: AltaVista does seem to be limiting the number of pages you can have listed to around 400 or so per directory, with total site page numbers somewhere around 5000, subject to pending changes.)
Yes, it is a fair amount of work but if you do only a dozen pages you are likely to bury your competition in most cases.
Note: The trend we have been seeing over the last couple months is that the search engines are giving higher scores for the main home page (http://www.domain.com). As of April 1st 2000, AltaVista, HotBot, and Excite appear to be doing this. For these search engines you may need to get numerous domain names. Preferably with your keywords in the domain name (www.keyword-keyword.com) and if possible each should have unique IP addresses to score highly in the more competitive categories.
Again, when you research your competition, you will find that some of the professionals are already doing this... and your ability to compete on the Internet will depend on your willingness to level the playing field and go one better than your competition.
Remember, each page must have its own name, title, URL address and keywords to work... and, believe me, it will work like MAGIC!!!!!
Note: This section uses many examples of pages that are currently listed at the top of the various Search Engines. Please note that these examples may change from week to week even when the techniques that are being used remain the same. This is due to the fact that the SE's are continually updating their indexes and because new pages will frequently outdo the previous position holders -- and, to a lesser degree, because the Search Engines can change their relevancy sorting without notice.
Ok... now that you know the secrets of the very top professionals in this business, let's discuss which search engines, what they are, how they work, and the best way to effectively register your web page(s) with them.
First of all, you need to know that over 95% of all people on the World Wide Web (WWW) use only eight search engines. These are the only SE's that you should focus on. However, if you decide you want to register with others, you may... but, we are telling you that your results will probably not be worth the effort it takes to analyze these engines ...because hardly anybody uses them. However, if you decide that we are wrong -- you now have the info you'll need to analyze them and make intelligent choices regarding how to design your page(s) to fit with just about any search index or catalog out there.
Suffice it to say, if you do a good job with just these eight, your site will be successful... provided that people are interested in the topic, product, or service that you are offering. For example, our company's very first web site still draws over 5000 visitors per day and we are only doing SOME of the things right -- and we have only registered with these eight SE's.
Tip: As you begin the registration process of your web pages / site(s), it will be helpful if you open a text editor and list your URL's. This way you can accurately copy & paste these URL's into the registration forms -- decreasing the chance of typos and mistakes.
In addition, as you develop descriptions (some catalogs require them), write them in your text editor first and when they are finished you can copy & paste them into the registration forms.
Likewise, as you register your sites... keep a record of dates with each SE, as well as vital info in your text editor. That way you will be able to recall what you registered, with whom, and when. This log will become valuable later when you draw conclusions about how long it took to be indexed by each SE as well as what worked and what did not work with each web page on any particular SE.
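A log like this can also be kept as a simple CSV file instead of free-form notes. The Python sketch below is only one way to do it. The column names and the log_submission and days_until_indexed helpers are our own assumptions for illustration; they are not part of any search engine's tools.

```python
import csv
from datetime import date

# Hypothetical column layout for the submission log.
FIELDS = ["date", "engine", "url", "notes"]

def log_submission(log_file, engine, url, notes="", when=None):
    """Append one submission record to an open, writable text file."""
    writer = csv.DictWriter(log_file, fieldnames=FIELDS)
    if log_file.tell() == 0:
        # Empty log: write the header row first.
        writer.writeheader()
    writer.writerow({
        "date": (when or date.today()).isoformat(),
        "engine": engine,
        "url": url,
        "notes": notes,
    })

def days_until_indexed(submitted, indexed):
    """Number of days between submitting a page and seeing it indexed."""
    return (indexed - submitted).days
```

To use it with a real file, open the log in append mode, for example open("submissions.csv", "a", newline=""), and call log_submission once for each URL you register.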
As you may have gathered already... here is what we consider to be the TOP EIGHT search engines and Internet catalogs:
Google, Excite, AltaVista, Lycos, WebCrawler, Inktomi-powered engines (HotBot - Yahoo - Microsoft - NBCi), Northern Light, and Yahoo.
Now, we will talk about the specifics of each one, where to find them and how to register with them... as well as many shortcuts and tips.
Lycos - www.lycos.com
Directory Size: 1,120,808+ sites (source: Open Directory, http://www.dmoz.org)
Indexed Pages: Approx. 575 Million (Fast Search Engine)
Frame Support: Lycos DOES index <noframes> content
Meta Tag Support: No
Accepts multiple <TITLE>s: YES -- but no evidence that it improves position
Database Refresh: Extremely slow but improving
Submission Time: 1 - 5 weeks (Much better in recent months)
Last Search Engine Update: 2/13/01 (alltheweb.com)
To register with the Lycos Open Source Directory, go to the directory you would like to be found in and click on the Add Web Site link at the bottom of the page, or submit directly at the Open Source Directory -- http://www.dmoz.org. To register with the OLD Lycos Search Engine (T-Rex), which is used for most overseas versions of Lycos, go to -- http://www.lycos.com/addasite.html. To register with the current search engine at Lycos, submit to alltheweb.com at http://www.ussc.alltheweb.com/add_url.php3
The old Lycos (T-Rex) has one of the most friendly submit pages on the Internet. Be sure to read the information on this page.
You will also find that Lycos has a very user friendly way to find out if your URL is in their catalog... it is on the same page as the submit page and easy to use.
New Subscribers: Pay close attention to which search engine you're optimizing your page for at Lycos. Most of the information below is specific to the Lycos T-Rex search engine, which at this time (5/31/00) only powers the overseas versions of Lycos. They are now using Fast (alltheweb.com) to power the search results for the US version of Lycos.
No major changes at Lycos during February. Fast, the engine that powers Lycos, is doing a reasonable job of updating often; a new site usually appears in the index within 15-30 days.
Lycos joins Goto affiliate program
Lycos joined the Goto program in November. It is now displaying Goto paid listings at the top of its search results and sometimes shows them mixed into the search results found lower on the page. The top 3 listings from Goto are listed in a section called Featured Listings; however, no apparent effort is being made to identify these listings as advertisements or paid listings.
The Lycos search results page is now so busy that a search newbie would have a very hard time figuring out what is going on. The mixture of search results from so many different sources gives Lycos the appearance of a confused metacrawler. The only non-paid search results, in most cases, now appear below the fold (i.e., you'll have to scroll down to see them with most screen settings). Lycos deserves some credit however -- at least they give the Goto search results a slightly different appearance than the non-paid search results.
Fast claims world's largest index
Fast announced Oct. 12 that they are now the world's largest search engine, claiming 575 million pages in 32 languages. They claim they reached that number by spidering through 1.5 billion urls.
Google, however, could likely dispute their claim since they're very close to, and possibly larger than, Fast. And since Google updates more often than Fast they may also have the freshest index. http://www.fast.no/
Fast - more pages give better results
Quite a large number of our test pages were indexed this past month at Fast providing us with some valuable insight into how Fast is working. We noticed that, once the spider fully indexed the sub-pages of a site, the rankings for the home page went up significantly -- in one case from #11 to #1. Apparently, those internal links back to the home page help reinforce the Theme concept of the site resulting in a gain in relevancy score.
In the case of the example above, besides getting the #1 listing, we also got #3, #4, #7 and #30 (it was an honest accident). In our analysis we found that the most significant factor was that each of these pages happened to be heavily crosslinked. This indicates to us that the number of links to your main pages, even from your own site, does in fact make quite a substantial difference.
By the way, we noticed a similar effect at AltaVista this month as well. The more pages you have indexed, the better the site appears to rank.
Fast powers Lycos results
Lycos search results for the US engine are being powered by Fast - http://www.alltheweb.com. It's important to understand that Lycos uses what they call search buckets. That means that when you do a search on Lycos the results come from a variety of sources, as follows:
- In the Popular section, results come from sites listed in Direct Hit or listings that Lycos is promoting, like their own internal pages.
- In the Web Sites section, results come from:
  - Matching categories in ODP
  - Any Lycos network sites that have the keyword
  - Web sites in ODP that have the keywords
  - Fast or Inktomi search engine results (we've observed primarily alltheweb.com results recently)
The more generic a keyword that you're working with, the more important it is to be listed within the Direct Hit Popular Results or within the ODP Directory. If your keywords or keyphrases are more unique, then it's also important to be listed in the Fast engine.
What does Fast index?
1. Text in the title (as much as 1129 characters)
2. Text in the <noframes> tag section
3. Text in the <option> section of a form
4. Body text
Fast doesn't index -
1. img alt text
2. Meta descriptions
3. Meta keywords
4. Meta http-equiv keywords
5. Text within a <style> tag
Fast doesn't support the meta description tag. Instead, it will display approximately the first 255 characters that it finds at the top of your document in the body text.
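Since the description comes from the top of the body text, it helps to preview what those first characters will be before you submit. The Python sketch below is a rough approximation under our own assumptions: the BodyTextExtractor class and preview_description function are hypothetical names, the page is assumed to have an explicit <body> tag, and we do not know exactly how Fast trims or cleans the text.

```python
from html.parser import HTMLParser

class BodyTextExtractor(HTMLParser):
    """Collects visible text inside <body>, skipping script, style and title."""
    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            self.chunks.append(data)

def preview_description(html, limit=255):
    """Approximate the summary: the first `limit` characters of body text."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the count reflects readable text.
    text = " ".join(" ".join(parser.chunks).split())
    return text[:limit]
```

The limit parameter can be changed to preview the shorter summaries that other engines build from the top of the page.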
Fast also appears to be using link popularity as a portion of its algorithm. We'll be conducting more tests at Fast during June to learn more of the exact details on this engine.
To submit to the Fast search engine, go to http://www.ussc.alltheweb.com/add_url.php3
One method of building a page for T-Rex engine powering overseas versions of Lycos.
Lycos completely surprised us on the 18th of March. We had changed a few of our doorway pages on several sites in February while doing some maintenance work. To our pleasant surprise that led the Lycos spider to discover some of our VERY old pages that we built back in July of 1998 and, guess what -- they indexed them!
Taken alone, that doesn't sound like much of a big deal. However, what was interesting was that those pages were designed with a very high keyword density, which worked pretty well back in 1998. Guess what?... the old pages are back in the money again. In fact we are scoring number 1 through 6 for some of our competitive single keywords.
Even though this page worked great, your best chance for getting into the Lycos engine is through submitting to http://www.dmoz.org (Open Directory Project). Be sure to read the submission information found on the page at http://www.dmoz.org/add.html.
Keywords in URL'S
The actual Lycos search engine (T-Rex) does show a preference for keywords in the URL. When doing some testing we couldn't help but notice that for long search phrases, URLs with the keyword somewhere in the domain name, directory, or file name scored higher than those without. Just make sure to separate the words with dashes, slashes (directories), or underscores. For example, www.domain.com/new/car/reports-reviews.html would help with the phrase "new car reports" or "new car reviews".
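To check which words a URL actually exposes when it is split on dashes, slashes, underscores and dots, a short script can help. This Python sketch only illustrates the separator idea; the url_keywords and phrase_in_url helpers are hypothetical names of our own, and the way Lycos really tokenizes URLs is an assumption on our part.

```python
import re
from urllib.parse import urlparse

def url_keywords(url):
    """Split the host and path of a URL into lowercase words."""
    # Add '//' when no scheme is given so urlparse can find the host name.
    parsed = urlparse(url if "//" in url else "//" + url)
    host = parsed.netloc.lower()
    # Drop a trailing file extension such as .html before splitting.
    path = re.sub(r"\.[a-z0-9]+$", "", parsed.path.lower())
    tokens = re.split(r"[./_\-]+", host + "/" + path)
    return [t for t in tokens if t]

def phrase_in_url(phrase, url):
    """True if every word of the search phrase appears as a word in the URL."""
    words = set(url_keywords(url))
    return all(w in words for w in phrase.lower().split())
```

For the example URL above, phrase_in_url returns True for both "new car reports" and "new car reviews".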
The 1% difference
Here's a new tip - Your page will score up to 1% higher if the keyword is not the first word of the title on Lycos.
If you go to Lycos UK -- http://www.lycos.co.uk -- you can see the scoring percentages. We did some testing with a single unique keyword only appearing once on the page in the title. The page that had the unique keyword at the first of the title consistently scored 1% lower than the one that had it as the third word. Assuming all other factors to be equal, if the keyword is not in the first position your page will likely score 1% higher.
Note: This only works on the Web Sites part of Lycos search. The directory and popular result listings don't appear to favor this strategy.
Note: New readers - The Lycos search engine that we are talking about is their original search engine called T-Rex. Lycos currently uses the ODP directory database and also has listings from Direct Hit in its search results. Search results from T-Rex show up after the ODP and Direct Hit listings in the overseas versions of Lycos and are not as important unless you're working on a unique keyphrase. This means if you want to get into Lycos with your listings, right now the best way is to submit to ODP at http://www.dmoz.org
Pictures Added to Search Results
Lycos is now including pictures in their search results, right after the ODP directory results in the News and Media section for the more popular search terms. The only time these pictures don't appear is if the search phrase is very unique.
If you're interested in promoting your site on Lycos, you had better start by submitting to the appropriate category at the Open Source Directory ( http://www.dmoz.org ) and only worry about working on search terms for the Lycos engine if they are not found in the ODP Categories.
Lycos' Top 10 Web Page Search is powered by Direct Hit
Lycos uses Direct Hit in its search results at the top of the page. In most cases, the results very closely match those at HotBot, which is also powered by Direct Hit.
After three months of testing we have learned that it is indeed beneficial to submit your site directly to Direct Hit by using the add URL page at http://www.directhit.com/util/addurl.html. This may actually be a faster way of getting your site listed at Lycos than submitting to the Lycos search engine because Direct Hit updates much faster.
The pages we've submitted have only gained position in the top 10 while none of them have slipped lower. Submitting directly to Direct Hit may become even more important as they launch their own private label search engine service similar to Inktomi's.
At present, Direct Hit is staffing up in what appears to be preparation for an IPO in the not too distant future. Look for them to become an increasingly important player in the search engine game.
Lycos converted their search engine into a hybrid search engine/directory on April 16th, 1999. They took advantage of the free Open Source Directory ( http://www.dmoz.org) and merged the data with their search engine. Because of the size of the Open Source Directory, your primary objective to score highly on Lycos searches will be best served if you concentrate on submissions to the Open Source directory vs the Lycos Search Engine.
Since the search engine results only show up after all of the Open Source Directory links, we don't recommend spending much time optimizing your pages for the Lycos Search Engine. That is... unless the keyphrase or keyword that you are working on does not show up in the Open Source Directory - only then should you spend the time on optimizing for the search engine itself. This is similar to the strategy that we recommend for Yahoo - which is the model on which Lycos has based their last change.
To identify whether you're looking at the Directory or the Search Engine results, they are separately identified using the Yahoo terminology. Directory results are labeled in the top of page headings as Web Sites and search engine results are labeled as Web Pages.
Note that at this point in time (3/30/00), the overseas versions of Lycos, like Lycos UK, are still using the Lycos search engine without the Open Source Directory results. If you're working in one of these markets, then by all means work on optimizing your pages for the search engine.
ODP listings at Lycos
Lycos had a major site change on the 16th of April. Along with a page redesign they replaced the top search results with the directory listings from the Open Source Directory, previously known as Newhoo. Lycos now defaults to the Open Source Directory and categorizes the listings as Web Sites similar to Yahoo's terminology.
Pages from the Lycos search engine are now listed below the directory listings, or on subsequent pages under the heading of Web Pages. The actual search engine listings still appear to be old, even though Lycos was spidering heavily during the month of April. Wired news Article on the change - http://www.wired.com/news/news/business/story/19164.html
So far, this change to the Open Source Directory has not been implemented on the other versions of Lycos - such as http://www.lycos.co.uk
If the search phrase matches one of the top level categories in the Open Source Directory, Lycos will display first the top level categories, then the Web Site listings, and at the very last the search engine listings. Our recommendation is to concentrate on the Open Source Directory for your primary keywords and keyphrases. Unless the search term is unique and not found in the Open Source Directory, you'll not likely get much traffic from even a top 10 listing in the search engine itself.
To add your site to the Open Source Directory, do a search for your best keywords. Lycos has a link called Add WebSite at the bottom of the directory results. This is a worthwhile effort as the Open Directory is now used by Netscape Netcenter, HotBot and the Dogpile Meta Search Engine.
Keep in mind that the Open Source Directory Editors are volunteers, and will vary in their opinions on what is to be added to their categories. In some cases they may even be your competitors! But, that's ok -- you can also be an editor. Sign up at http://www.dmoz.org/about.html
Note: Do not submit your pages to more than one site that uses ODP. We suggest submitting directly to ODP in the appropriate category.
Lycos T-Rex (UK) Search Engine Only Recommendations
Tip (only works now on Lycos UK version): Your page will score higher if your domain name begins with an a rather than a z. So, if you have your choice of domains, choose one that starts with a character that sorts earlier in ASCII order when possible. For example, http://www.appledomain.com will likely be listed higher than http://www.bakedapple.com if all other factors remain equal.
Also you may wish to experiment with different machine names than www for optimum ranking. In case you are wondering, here is an example of a machine name
...where the word apple (in this example) replaces the www.
Many top 10 pages on Lycos are using the following techniques. They utilize very simple leader pages as gateways to the site's main page. They have the keyword in the <title>, once in the <h1> tag at the very top of the page, and very little other text on the page, with perhaps one more occurrence of the keyword within the body text. These pages are only about 1-2k in size. Keeping the html file size small with this method is important. More recently, we believe it became more important to have one of the keywords or keyphrases towards the bottom of the page within a hyperlink.
Lycos doesn't use the meta description tag. Instead, Lycos uses the first 250 characters of your page to create one. We suggest that you create pages that have several occurrences of the keyword spread throughout the text as well as a combination of the hidden text tricks we have discussed... and/or you could also create short pages with only a few powerful keywords like our example above.
Here's another technique that is being utilized successfully. Some pages are listed TWICE under slightly different URL's. Lycos accepts http://www.yourserver.com and http://yourserver.com as TWO different sites. Please understand, these are not copies of the same page... they are exactly the same page; just different versions of the URL address.
This is proof that submitting multiple URL address versions of a Lycos friendly web page can earn you multiple positions with virtually no extra effort.
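To list both versions of an address without retyping them, a short script can generate the pair. This Python sketch is only an illustration; the host_variants helper is a hypothetical name of our own, and it assumes your server really does answer on both host names.

```python
from urllib.parse import urlsplit, urlunsplit

def host_variants(url):
    """Return the www and non-www versions of the same URL."""
    parts = urlsplit(url)
    host = parts.netloc
    # Strip a leading 'www.' if present to get the bare host name.
    bare = host[4:] if host.lower().startswith("www.") else host
    return [urlunsplit(parts._replace(netloc=h)) for h in ("www." + bare, bare)]
```

The two URLs can then be pasted into your text editor list alongside the rest of your submissions.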
The specific keyword used in a search did not necessarily need to be in the Title tag. Lycos is obviously recognizing synonyms. In other words, the word hawaii might pull up something with kauai in the title and the word accommodations would pull pages with condo in the title.
Technique Summary for Lycos T-Rex
Meta Description Tags: Not used. Lycos uses the first 141 characters of body text for your page summary.
Keyword Meta Tags: Lycos doesn't use meta keyword tags
Submission Speed: Very slow lately, but it appears to vary depending on the size of the site.
Multiple <Titles>: YES -- but no evidence that it improves position
<title>: Keywords are not required to be in the Title tags, but they are important. Synonyms are taken into account. Some pages have very large titles, so there doesn't appear to be penalties for this.
<!comment> tags: are not indexed.
alt text: Is indexed and is very useful to generate a good description for your page if placed early in the document.
Body Content: Put the most important keywords in the top of document and once close to the bottom or last paragraph. Start paragraphs with the keyphrase occasionally or have them occur close to the front of the paragraphs. Remember that the description will be made from the first part of the document.
Tail Tagging: Not useful.
Keywords: use one in the title, and one or two in the body text, make sure that you have one towards the very top of the page, preferably in the <h1> tag.
Invisible Text: very little hidden text spamdexing was observed in most cases.
Page Summary: Summary is taken from the first 141 characters of body text found at the top of the webpage. It will use the text from the img alt attribute to provide a description for your site if placed first on the page.
Domain Name: Alphabetically sensitive
And, as with most of the other SE's, one of the best pieces of advice that we can offer is to carefully analyze the search topic [keyword(s) / phrase(s)] that you want to appear in and closely study the HTML codes of the top pages in that particular search. Next, use the page code for a template and, by inserting your own information, you are likely to be competitive in a Lycos search in that particular category.
Alta Vista - www.altavista.com
Indexed Pages: Approx. 150 Million +
Frame Support: Alta Vista does support the <noframes> tag
Meta Tag Support: YES -- and it is important that you use a Meta Description tag for Alta Vista to accurately summarize your page
Database Refresh: Full refreshes of the index happen approximately every three months
Average Submission Time: 8 to 12 days
To submit a page to Alta Vista, go to: http://www.altavista.com/av/content/addurl.htm Note: The submit form is at the bottom of the page; be careful not to be fooled by the advertising banners that sometimes appear to be a submit form.
AltaVista appears to be in the middle of an update as we go to press (February 28th). We're noticing that pages are moving up and down in the listings at various times during the day. Early indications are that we may see a slight change in the algorithm, although we don't expect that it will be a major change.
AltaVista adds GoTo sponsored links
AltaVista finally incorporated GoTo sponsored links in January. Three GoTo Sponsored Listings complete with descriptions are now appearing towards the bottom of search results for AltaVista but not at RagingSearch.com.
LookSmart Listings now affecting AltaVista Search
AltaVista is now using the title and description from LookSmart's directory for the same URL in the AltaVista index. Search results appear to be heavily biased towards the LookSmart title and very little on description or the actual page content. We do believe AltaVista is still indexing the page content of the listings from LookSmart based on some tests we conducted looking for unique words. They just aren't ranking the page content as high as a non-LookSmart page.
Since LookSmart is now allowing up to 5 URL's from the same domain (at $199 each), it may be worth getting multiple pages listed at LookSmart for each of your primary keywords. Based on how those listings are now affecting rankings at AltaVista and other LookSmart partners like MSN, if LookSmart will actually put your keywords in the title it might be very worthwhile. For this amount of money, we would suggest calling the submission in at (877) 512-5665 to make sure the editor understands what keywords you are looking to get into those titles. For more than 5 URL's, LookSmart has a special Subsite Listings program which might offer some discounts for more URL's.
Basic Submission of one URL is now $99 (8 week lead time) and Express Submit runs $199 (48 hour lead time). Note that LookSmart frequently runs discounts on their Express submit package, so get on their email list and then watch for email notification of their specials.
Before you submit your site to LookSmart, spend some time thinking about how the LookSmart title is going to affect your search results at different engines such as AltaVista, MSN, Iwon etc. Because you don't have complete control over what LookSmart selects as your title, it's possible that a LookSmart listing could hurt you more than help you. If LookSmart fails to include your keywords in the title, your home page may actually rank worse in these other engines than it did without the LookSmart listing. You would be well advised to consider using an alternate domain name especially for LookSmart to avoid this possibility.
AltaVista Pushing LookSmart Directory
AltaVista has added a new "See reviewed sites in" section at the top of search results, just below the Related Searches table. The new section attempts to match up the search query to relevant LookSmart categories. This is another reason to consider getting multiple LookSmart listings whenever possible.
AltaVista focusing on least popular of two word phrases
We've been exploring the term vector theories on AltaVista's engine lately and have noticed something interesting in regards to multiple word searches. When presented with a search phrase that contains one popular and one less popular word, AltaVista focuses on the less popular word/s and will even ignore the popular words entirely. This is a change from the way the service used to work, where it practically required both words to be in a document for it to show up.
For example, do a search for space elephants. You'll see the top page is www.fourelephants.com , which is also the top page for the single search phrase elephants. Our assumption is that when you see AltaVista ignoring a word in a phrase like this, there probably isn't a term vector for it and the engine is going into some kind of default mode.
On the most basic level, term vector means groups of links about specific topics. In other words, to create a term vector for space elephants you would need to create some web sites about that topic and crosslink them. Next, you'd have to get some of the sites listed in Yahoo, ODP and the other directories. At some point AltaVista will discover them and learn about the topics they focus on and create a term vector for that phrase. Then when trying a space elephants search, those pages should show up -- theoretically replacing the page that's currently ranking tops on that particular search.
Subpages getting dropped
Lately we've been hearing reports that many larger sites are losing a large number of subpages. In most of the examples we've examined these subpages were similar to database created, template style pages that are similar in layout and file size to each other. An example would be something like what you would find at a site listing apartments for rent in each state and city. To clarify, we're not talking about doorway pages -- instead, these pages are parts of the root domains and are all unique.
As we've mentioned in earlier issues, AltaVista has a dupe detector which we believe may be catching these pages in its claws -- even though they're not actually duplicate pages. The indications are that subpages must now display a degree of randomness in file size and content in order to remain solidly in AV's main index.
Another possibility is that AV is dropping the pages because they appear to not be focused on a specific topic or theme. Keep in mind this is only a theory -- but also remember that focused topic specific pages tend to do better on search engines in general. In any case, designing your pages with this in mind will tend to help you and certainly won't hurt you regardless of why AV is dropping pages.
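If you suspect your own template-style subpages look too much alike, you can measure how similar their text is before a dupe detector does it for you. The Python sketch below uses the standard difflib module to compare pages word by word. The similarity and near_duplicates helpers and the 0.7 threshold are our own assumptions for illustration; we do not know what method or cutoff AltaVista actually uses.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(text_a, text_b):
    """Word-level similarity between two page texts, from 0.0 to 1.0."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()

def near_duplicates(pages, threshold=0.7):
    """Return (name_a, name_b, score) for page pairs at or above the threshold.

    `pages` maps a page name to its visible text. The threshold is a guess,
    not a known search engine setting.
    """
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(pages.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged
```

Pages that get flagged are good candidates for more unique copy or a more varied layout.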
AltaVista Launches Raging Search
AltaVista has launched Raging Search at http://www.ragingsearch.com. The big difference is that this version is free of ads and has none of the typical portal clutter found on most engines these days. While this may look like AltaVista's response to Google's recent popularity, it's more likely intended to promote sales of their search engine software technology, Search Engine 3.0, and related services. We think Raging Search will serve as a test bed for new technologies yet to be implemented. It's logical to assume their search engine technology and services have major revenue stream possibilities that they will be taking advantage of in the future.
AltaVista's tight spam policy
We heard about the problem from a number of readers, and it turns out that this submission problem at AltaVista is actually a HUGE issue afflicting many thousands of web sites. After reviewing our discussions with many subscribers and other SE placement pros, as well as taking into account our own first hand experience, here is what we believe is going on. Recently, AltaVista has removed the error message "too many urls" and now just appears to take the page even if you're banned from the engine. You will not know if you are going to get indexed until the spider visits or does not visit your site, so watch your web logs.
AltaVista has implemented a very tight spam filter in their add url system. This system frequently (improperly) labels many pages or sites as spam. We've been attempting to reverse engineer their filter in an effort to figure out exactly what they are classifying as spam. We believe we are in the ballpark now and hopefully these tips will help you out and even squelch some of the urban myths that are spreading around.
This submission filter DOES NOT reject sites for spamming based on any one of these individual items alone:
- Does not - Reject sites just because they have a long domain name or a dash in the name. The only time AltaVista rejects a long domain name is when it exceeds 63 characters. If the domain will work with Netscape's browser, it will work at AltaVista. If you're close to this number of characters, try dropping the www. from the domain name and see if that will work.
- Does not - Reject cloaked pages based on IP Delivery or user agent name systems as long as the content is relevant and on-topic with respect to what both the user and the search engine see.
- Does not - Reject pages or flag you for review because you use WebPosition for reporting functions on the same IP address that you're submitting from.
- Does not - Reject pages that have been resubmitted IF their content has changed substantially. What is substantial? We don't know for sure, but if you're only changing a title or a meta keyword -- something small -- it's probably a good idea not to resubmit. Instead, let the spider find the changes on its own.
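The 63 character limit mentioned above is easy to check before you submit. This Python sketch assumes the limit applies to the whole host name including the www. prefix, which is our reading of the behavior described above rather than anything AltaVista has documented; the helper names are hypothetical.

```python
# Limit observed at AltaVista's add url page (assumed to cover the full host name).
MAX_DOMAIN_CHARS = 63

def fits_domain_limit(host, limit=MAX_DOMAIN_CHARS):
    """True if the host name is within the character limit."""
    return len(host) <= limit

def shortest_submittable(host):
    """Return a host name that fits the limit, dropping 'www.' if needed.

    Returns None when even the bare domain is too long.
    """
    if fits_domain_limit(host):
        return host
    if host.lower().startswith("www."):
        bare = host[4:]
        if fits_domain_limit(bare):
            return bare
    return None
```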
This submission filter will very likely trigger a human review of the page or automatically give you a too many urls message if:
- You've been flagged as a spammer and identified by either a cookie residing on your machine or by the IP address you have submitted from (and spammed from) in the past.
- You submit more than 5 urls per day for a single domain from any one IP address and/or identifiable cookie. This is with or without an automated submission program. We have long advocated submitting only one to two urls per day at AV, dumping your cookies, and switching the IP address you submit from.
- Your page that is being submitted contains hidden text, repetitive keywords (keyword keyword keyword) or the meta refresh tag.
- The page you're submitting appears to closely match other pages on the site. This is one factor they look at to identify Doorway pages -- something which AltaVista classifies as spam. Most search engines define a doorway page as "a page that has very little if any content, primarily consisting of links that the surfer is supposed to click on." They also classify as spam pages that are created by template systems when such pages are almost the same as other pages found on the same site but differ only in a few select keywords. Note that you can usually have one page that fits this description. This allows the use of splash pages, but if they manually or automatically detect that you're putting up copies of pages that have only switched-out keywords, etc., they will ban you. We've heard many people complain that WebPosition Gold's doorway page generator got them in trouble this way because the pages it created were so similar. If you are using that program or similar programs, be sure the pages you are submitting have the look of any other normal page on your site. Take care that each has unique content within it. Do not submit pages to AV that are just Click Here links. Likewise you should refrain from submitting a list of links without content, such as a page of link descriptions. True, this technique is, and has been, useful on other engines (like Lycos), but it may backfire on you at AltaVista with the new spam filter in place.
- You are resubmitting a page that is already indexed and the page content has not changed substantially. Note there is a fine line on what substantial is and we don't know exactly how to define it. Our best advice? ...don't push your luck. AltaVista claims to have mechanisms in place to prevent a competitor from resubmitting your pages and getting you banned -- but don't bet on them working all the time. If this happens to you, you'll have to contact AltaVista via their feedback page and ask them why your site was banned -- and ask what you need to do to lift it. Suggestion: Be nice! ...it's a safe bet they are tired of irate complaints.
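To stay well under the per-day limit described above, you can spread your URLs over several days before you start submitting. The Python sketch below groups URLs so that no domain gets more than a chosen number per day. The schedule_submissions helper is a hypothetical name of our own, and the default of two per day simply follows our own recommendation rather than any published AltaVista rule.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def schedule_submissions(urls, per_day=2):
    """Group URLs into daily batches with at most `per_day` URLs per domain."""
    seen_per_domain = defaultdict(int)
    schedule = defaultdict(list)
    for url in urls:
        domain = urlsplit(url).netloc.lower()
        # The Nth URL for a domain goes to day N // per_day (day 0 is the first day).
        day = seen_per_domain[domain] // per_day
        schedule[day].append(url)
        seen_per_domain[domain] += 1
    return [schedule[day] for day in sorted(schedule)]
```

Each inner list is one day's worth of submissions; record the date you actually submit each batch in your log.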
AltaVista's new filtering system is quite aggressive. And, as with all SE circumstances on the net, it's subject to change without notice. If you exercise patience and focus on building highly optimized pages with relevant content -- while avoiding the don'ts we mention above -- you are unlikely to run into trouble with AV's spam police.
AltaVista overcomes rejection of question mark in URL's
Early in March we ran a test to determine if AltaVista would index a page with session attributes -- those special characters like the ? and the = sign. In the past, AltaVista, like many other search engines, refused to index URLs that contained these special characters. URLs like... http://www.domain.com/page.cgi?id=top ...are now getting indexed, IF submitted manually (i.e. by hand, not by automated software or through a submission service). We don't recommend submitting more than one page a day like this.
So far, we don't know if AltaVista will spider a link containing such characters on its own. We believe they are accepting them only when they are submitted manually.
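To sort a list of your own URLs into those with these special characters and those without, a short script is enough. This is a sketch using Python's standard urllib.parse module; the function name describe_url is our own, and it says nothing about what any engine will actually accept.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Split a URL and report whether it carries a query string."""
    parts = urlsplit(url)
    return {
        "dynamic": bool(parts.query),    # True when there is anything after the ?
        "path": parts.path,
        "params": parse_qs(parts.query), # e.g. {'id': ['top']}
    }

print(describe_url("http://www.domain.com/page.cgi?id=top"))
print(describe_url("http://www.domain.com/page.html"))
```

URLs flagged as dynamic are the ones worth submitting by hand, one per day, as described above.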
AltaVista's Help Page Reveals Interesting Information
We've got to give AltaVista credit for the great job they've done adding helpful information for webmasters to their site. While not highly detailed for the expert search engine professional, the info will help most people understand the engine better. Certainly it's 1000% better information than the other major SE's offer. Imagine that! ...a webmaster friendly search engine. Whatta concept ;-)
The new help pages that you should review are at... http://doc.altavista.com/adv_search/ast_haw_index.shtml
Here are some highlights...
In a typical day Scooter and its cousins visit over 10 million pages. But this is a random game with hundreds of millions of Web pages. Pages with many links to them may be found frequently by the crawler. Pages with few links might be found in a week, a month, six months, or even longer. Pages with no links at all to them will never be found.
This means the more links to your site the better your chances that Scooter, AltaVista's crawler, will find it on its own -- without you submitting it. It doesn't mean that you have to have a popular site to get indexed, just that a large number of links to your site increases the statistical probability of Scooter finding your pages on its own.
On page http://doc.altavista.com/adv_search/ast_haw_titles.shtml we found the following statement... In the ranking rules that determine which pages will appear near the top of a list of matches, the HTML title is the most important element of the page.
Even though we've already told you how important titles are we have to admit it's darn nice to see AltaVista confirm it in writing -- thereby eliminating any doubt.
And finally, we actually have a search engine that defines exactly what they see as Spam
Some barriers to being indexed are due to the misbehavior of a handful of webmasters who have tried to fool search engines into ranking their pages high on lists of matches and including them as matches to queries they aren't appropriate for. This kind of behavior is known as spamming. Spamming degrades the value of the index and is a nuisance for all.
Irrelevant content (spam) is the number one problem that all search engines are faced with. If you push that button you will have problems. The logic that leads people to try such tricks is rather bizarre. I figure everybody searches for the word sex. I don't have any sex at my site, but I want people to stumble across my site. So I'm going to put the word sex, three thousand times as comments. And any time that anybody searches for sex, my pages will show up first.
People have actually tried that. They have tried doing the same kind of thing in the backgrounds of their Web pages. They have also created page after page of text that is in the same color as the background color so visitors won't see the words, but search engine crawlers will. They have tried everything imaginable to fool search engines.
If being found via search engines is important to your business, be very careful about where you have your pages hosted. If the hosting service also hosts spammers and pornographers, you could wind up being penalized or excluded simply because the underlying IP address for that service is the same for all the virtual domains it includes. (The information above was found at http://doc.altavista.com/adv_search/ast_haw_spam.shtml)
That last very informative paragraph tells you why you must have your own unique IP address for each of your domains. You should never settle for a virtual domain that shares IP addresses. Doing so will most certainly squelch your ability to succeed at getting top listings on the search engines.
Additionally, you should make sure the IP numbers you are using weren't banned by an SE before you started using them. Although unlikely, it is possible that someone who previously had a domain on them played games with the engines, got banned, and then released the tainted IP number(s) back into the pool. Admittedly it may be difficult to find out if you have a bad set of IP numbers. Regardless, you should be aware that the previous use of the IP could be a problem. If you find yourself being ignored by the engines you can send a very polite message stating your suspicion and explaining your situation. Chances are such a request will be unique enough that it will stimulate a response -- maybe even open up a dialogue and help get you listed.
The following page may have the most important clue of all -
There are hundreds of millions of Web pages, so almost any query is likely to have a huge number of matches. For search results to be useful, search engines must rank more highly pages that are most likely to have relevant information. AltaVista's formula for doing that is a closely kept secret (like the formula for Coca-Cola), and is subject to continuous fine-tuning. But an understanding of the main ingredients can help you build pages that will be valued by search engines and hence found by people who use them.
Content counts; content near the top of a page counts more than content at the end. (Note: we've found that this statement isn't always true at AltaVista, depending on what algorithm they are using at the time, the text at the bottom of the page or the middle of the page may be the most important.)
In particular, the HTML title and the first few lines of text are the most important part of your pages. If the words and phrases that match a query happen to appear in the HTML title or first lines of text of one of your pages, chances are very good that that page will appear high in the list of search results.
Say you want to put your resume on the Web. Keep this rule in mind: Don't put your name first. You aren't trying to be found by people who already know you. You want to be found by people who have never heard of you. So don't waste any letter in the HTML title on your own name. The first word should be resume. After that, list your main qualifications and the kinds of jobs that you are looking for. Put the same kinds of things in the first lines of text. That's what will come up by default as the description in the match list, and it's also an important position for ranking.
And the most important thing they wrote... Above all, remember that AltaVista does not reward web pages that practice useless repetition; AltaVista only counts each unique word twice. Source: http://doc.altavista.com/adv_search/ast_haw_query.shtml
This is a very, very important statement that, for us, brings up even more questions. We've stated for some time that short pages seem to work best at AltaVista and to only repeat the words a few times, once in the title, once in the <h1> tag and if possible a link with the keyword in it. This statement suggests that AltaVista may be ignoring one of those words in the body.
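To see what a count-each-word-twice rule would mean for one of your own pages, you can tally words with a cap. This is only our reading of the statement above, sketched in code; AltaVista has not published how the rule is applied, and the function names and the cap parameter are hypothetical.

```python
import re
from collections import Counter

def capped_counts(text, cap=2):
    """Count each word, but never credit more than `cap` occurrences."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word: min(n, cap) for word, n in Counter(words).items()}

def wasted_repeats(text, cap=2):
    """Occurrences beyond the cap, which (under this reading) earn nothing."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word: n - cap for word, n in Counter(words).items() if n > cap}

page = "beer beer beer beer mug"
print(capped_counts(page))   # beer is credited twice, mug once
print(wasted_repeats(page))  # the two extra beers are wasted
```

If the keyword shows up in wasted_repeats, some of its placements (title, meta tags, h1, body) are probably competing with each other for the two counted slots.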
The final page in the How AltaVista Works is very insightful and basically tells you exactly how to score on this engine... AltaVista bases its ranking on both static factors (a computation of the value of a page independent of any particular query) and query-dependent factors. It values:
- Long pages rich in meaningful text (not randomly generated letters and words). Note: The above statement we don't really agree with. There are millions of pages in AltaVista that are very small, short on text like we've described, in the top 10. However, this is the type of page that they would like to see in their index - it's just not necessarily what will score highest.
- Pages that serve as good hubs, with lots of links to pages that have related content (topic similarity, rather than random meaningless links such as those generated by link exchange programs intended to generate a false impression of popularity). In other terms, a link to your page from an about.com topic specific area is a very good example of what they are looking for. This is also why you may find that your home page scores higher than your pages that you built for a phrase. Those pages are boosting the home page score because they are pointing to it.
- The connectivity of pages, including not just how many links there are to a page but where the links come from: the number of distinct domains and the quality ranking of those particular sites. This is calculated for the site and also for individual pages. A site or a page is good if many pages at many different sites point to it and especially if many good sites point to it. Link Popularity, just like at Google. The only difference is that it doesn't appear to affect the scoring of the page quite as much as at those engines.
- The level of the directory in which the page is found. Higher is considered more important. If a page is buried too deep, the crawler simply won't go that far and will never find it. This is absolutely true. The home page (www.domain.com) of a site is the URL that has the best chance of scoring a Top 10 ranking.
These static factors are recomputed about once a week, and new good pages slowly percolate upward in the rankings. Note that there are advantages to having a simple address and sticking to it so others can build links to it, and so you know that it's in the index.
Query-dependent factors include:
1. The HTML title.
2. The first lines of text.
3. Query words and phrases that appear early in a page rather than late.
4. METAtags, which are treated as ordinary words in the text that appear early in the page (unless the METAtags are patently unrelated to the content on the page itself, in which case the page will be penalized).
5. Words mentioned in the anchor text associated with hyperlinks to your pages (e.g., if lots of good sites link to your site with anchor text breast cancer and the query is breast cancer, chances are good that you will appear high in the list of matches).
Keep in mind that in any query, rare words count more than common words. If someone searches for fruit and pomegranates, pages with the word pomegranates will appear at the top of the list (a technique known as inverse document frequency). Hence you should use specific terms on your pages, in your anchors, and in your METAtags, not general ones that won't give you any advantage. Be specific whenever you can.
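Inverse document frequency is easy to illustrate with a toy collection. The sketch below uses a common smoothed form of the formula, log((1 + N) / (1 + df)), where N is the number of documents and df the number that contain the term. The exact formula AltaVista uses is not public, so treat both the formula and the sample documents as assumptions.

```python
import math

def idf(term, documents):
    """Smoothed inverse document frequency of a single word."""
    term = term.lower()
    containing = sum(1 for doc in documents if term in doc.lower().split())
    return math.log((1 + len(documents)) / (1 + containing))

docs = [
    "fruit salad recipes",
    "fresh fruit delivery",
    "fruit and pomegranates",
    "fruit juice",
]
print(idf("fruit", docs))         # common word: weight of 0.0 here
print(idf("pomegranates", docs))  # rare word: larger weight
```

Because fruit appears in every document it carries no weight at all in this toy index, while pomegranates, found in only one, carries the most. That is the arithmetic behind the advice to use specific terms.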
What we got out of the above information is that you do want to put the keyword in the title because you get a boost, but you don't want to put the keywords in your meta tags. Why? Because if you do, and if the engine only counts up to two keywords, then it's only going to count the first word in the title, the second one in the meta tags and ignore the one in the <h1> tag at the top of your page. If it misses the keyword in the <h1> tag at the top of your page then you don't get that little boost from the <h1> tag. That's why you see most of the top 10 pages without any meta tags. The third keyword that we suggest in the anchor text is good for boosting the score on the page that is being pointed to. (So don't point it at your competitor ;-)
AltaVista Going Deeper On Title Data
We've discovered that AltaVista is now indexing the title tag far deeper than before. We've actually proved that it will index text in the title up to 417 characters now but it will only display 78 characters in the search engine listing.
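The two limits we measured (417 characters indexed, 78 displayed) can be turned into a quick title check. This is a sketch built on our test figures only; the limits may change without notice, and the function name title_report is our own.

```python
INDEXED_CHARS = 417    # characters of the title we found to be indexed
DISPLAYED_CHARS = 78   # characters shown in the search engine listing

def title_report(title):
    """Show which parts of a title would be displayed, indexed, or lost."""
    return {
        "length": len(title),
        "displayed": title[:DISPLAYED_CHARS],
        "indexed_but_hidden": title[DISPLAYED_CHARS:INDEXED_CHARS],
        "not_indexed": title[INDEXED_CHARS:],
    }

report = title_report("Resume - Search Engine Positioning Specialist")
print(report["displayed"])
print(len(report["not_indexed"]))
```

Keep the words you want searchers to read inside the displayed portion, and treat anything in not_indexed as wasted.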
Testing results with the <noframes> tag
We completed some testing this month regarding the <noframes> tag so we could answer a few questions we've had. We've let you know in the past that AltaVista, like most of the major search engines, supports the <noframes> tag and frames - but we haven't been very exact about what it will and will not do, so here we go.
What will AltaVista index if I submit an html page that uses frameset tags?
AltaVista will index the title for the page and any regular text content that is within the page. It will use the meta description from the page for the search results but it will not index it. It will not index comment tags.
The spider will index the contents within the <noframes> tags just like it would on a non-frames page. In fact, you can leave the title and meta description off of the top of the page, and move them down to the <noframes> section, and AltaVista will still use them in the same way.
AltaVista is unlikely to follow the links that are within the frame source.
Can I use the <noframes> tag on a page that doesn't use frames to hide text?
Absolutely! AltaVista's search engine doesn't care if the framesource tag isn't there, but under human review they may not like it if you put repetitive text or other spam-related content there. It will index the page, same as normal... plus any content within the <noframes> tag set. This is a great place to put your links to other pages you want indexed, additional text, etc. -- just keep in mind that text within this area counts in your overall keyword density.
Does AltaVista care where on the page I put the <noframes> tag?
No - it can be used anywhere on the page that you want. Just avoid nesting the tags within non-indexable tags, like the meta keyword tag. We've got the best results by placing it towards the bottom of the page. Just be sure to test the page with your creative html code in several different browser versions to make sure it's compatible.
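To review what text sits inside the <noframes> section of a page (and therefore counts toward its keyword density), you can pull it out with a couple of regular expressions. This is a rough sketch for checking your own pages, not a model of AltaVista's spider; regular expressions are not a full HTML parser, and the sample page and function name are invented for illustration.

```python
import re

NOFRAMES = re.compile(r"<noframes\b[^>]*>(.*?)</noframes\s*>",
                      re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")

def noframes_text(html):
    """Return the visible text found inside all <noframes> blocks."""
    blocks = NOFRAMES.findall(html)
    text = " ".join(TAG.sub(" ", block) for block in blocks)
    return " ".join(text.split())  # collapse runs of whitespace

page = (
    '<html><head><title>Widgets</title></head>'
    '<frameset cols="20%,80%"><frame src="nav.html"><frame src="main.html">'
    '<noframes><h1>Blue Widgets</h1><p>Hand-made blue widgets.</p>'
    '<a href="catalog.html">Widget catalog</a></noframes>'
    '</frameset></html>'
)
print(noframes_text(page))
```

The extracted text can then be fed to whatever keyword counting you already do for the rest of the page.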
Title with One Keyword
In the extremely competitive arena of adult sites, it appears that a new page design is working very well. VERY short pages with only one keyword in the title, no keywords in the body and with VERY little or no body text are scoring high. Some of the pages we found were as small as 198 bytes. If you are working in a competitive category where there are many pages indexed for your keywords, this is worth a try.
The Real Names link on AltaVista
Real Names is a proposed alternative naming system to the current Internic URL domain names, only accessible at this point through links like the new one on AltaVista. Companies can purchase their Real Name based on trademarks, slogans or company names for $40 per year, per name (this price has since gone up to $100 per year). For example, if you go to AltaVista and search for Explorer you will find a Real Name link at the top of the search results page; clicking on that link will take you directly to Ford Motor Company's page. Ford registered Explorer with Real Names as a company trademark. (It's hard to say what Bill Gates thinks of this.)
Real Names will not allow you to register words that aren't directly related to your company or that are too generic. If Ford wanted to register the word Trucks then they would likely be turned down, but they could likely register Ford Trucks as a phrase and get away with it. If an exact match isn't found in the Real Names registry when someone clicks on one of these links, then the user is sent to a Real Names search page at http://www.realnames.com where they can choose from the directory. Not all of the names in the directory are actually registered; the index was originally seeded using Yahoo's directory. So if you find your site in the Real Names directory and you haven't registered, that is where they found your page.
Real Names is trying to get Netscape and Microsoft to include Real Names technology into their browsers, so in the future you may be able to type in Ford Trucks and be automatically sent to a Ford Truck page without knowing their domain name or url. Recent business partnerships with companies like Network Solutions may prove interesting.
On AltaVista... The <TITLE> tags are everything !!!! -- If the keyword that people use to search for you is not in your <TITLE> tag then you will simply not be found anywhere near the top of the list! This is the most important strategy for AltaVista. If you do nothing else, you must do this or you have NO chance whatsoever to succeed with this search engine.
Title Tags: Your most important keywords MUST be included in your Title tags. Do not repeat your keywords.
Meta Description: Put important keywords towards the front of the description
Meta Keyword Tags: will index approximately 1024 characters. Meta Keyword Tags are seldom found in the high scoring pages.
<noframes> Tags: are supported, AltaVista will index content inside of the <noframes> tags just like it would on a normal page. Location optional.
Image .alt text: is indexed, however not likely to be an effective technique and seldom were pages using this technique found in the top positions.
<! -- Comments -- >: are not indexed.
Spam Penalty: Heavy spam penalties now in effect, repetition of keywords (example: beer, beer, beer, beer) may cause your page to drop significantly.
Tail Tagging: Not used.
Keywords: Use one or two occurrences of each keyword in the document for best results. Use the keyword as a link description, for example: <a href="http://www.yourserver.com">keyword</a>
Vary Keyword Density: A smart strategy is to have multiple pages listed with varying keyword density.
Page Submission: We've had the best success getting pages indexed if we only submit one page from each domain per day. When we submitted 5-10 pages per day most of them were ignored except the first one or two submissions.
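Keyword density, as used in the summary above, is simply the share of a page's words taken up by the keyword. Here is a minimal sketch for single-word keywords; the function name and sample text are our own, and each engine may well count words differently than this does.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (single words only)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

sample = "Beer brewing kits and beer recipes"
print(f"{keyword_density(sample, 'beer'):.0%}")
```

Running this across several versions of a page is one way to confirm that the pages you submit really do vary in density, as suggested under Vary Keyword Density.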
Inktomi raises prices, adds partners
The price for the Premier Page Inclusion Program has recently increased by $10 for the first URL. Inktomi has also added Network Solutions as a new Channel partner.
There is a slight difference in the pricing between Network Solutions and PositionTech. Both charge $30 for the first URL, but prices differ for each additional URL: PositionTech charges $10 for each additional URL where Network Solutions charges $15.
However PositionTech charges $30 for the first URL for each domain, and an additional $30 for each additional domain. Network Solutions doesn't appear to charge for domains after you've paid the first $30.
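Which partner is cheaper depends on how your URLs are spread across domains. The sketch below works through the arithmetic using the prices quoted above. It assumes PositionTech restarts at $30 for the first URL on every domain, and that Network Solutions charges $30 once and $15 for every other URL regardless of domain; the second assumption is our interpretation of what we observed, so confirm current pricing before you budget.

```python
def positiontech_cost(urls_per_domain):
    """$30 for the first URL on each domain, $10 for each additional URL."""
    return sum(30 + 10 * (n - 1) for n in urls_per_domain if n > 0)

def network_solutions_cost(urls_per_domain):
    """$30 for the very first URL, $15 for every other URL (assumed)."""
    total = sum(n for n in urls_per_domain if n > 0)
    return 30 + 15 * (total - 1) if total else 0

# Five URLs on one domain versus one URL on each of three domains.
for plan in ([5], [1, 1, 1]):
    print(plan, positiontech_cost(plan), network_solutions_cost(plan))
```

Under these assumptions, five URLs on one domain cost $70 at PositionTech against $90 at Network Solutions, while one URL on each of three domains costs $90 against $60.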
Paid inclusion program hiccups
In mid-February Inktomi's paid inclusion program incorrectly listed many sites with missing descriptions and titles. Inktomi has fixed the problem and as far as we know all the missing titles and descriptions have been corrected.
Inktomi is now ranking LookSmart reviewed sites higher than they were previously. However there is a new wrinkle in the fabric -- Inktomi is also replacing the web page's HTML title and Meta description with the LookSmart directory editor's rewritten title and description. If the editor did a poor job (like they often do) you may not like the title and/or description as it will now appear in the Inktomi index.
Keep this fact in mind when you submit pages into Inktomi's paid inclusion program if those are the same pages that are already listed in LookSmart. In such a case, the LookSmart description will override your page's actual description that is hardwired into the HTML source code.
In other words, you may want to consider this fact carefully when deciding which pages to submit to each index. You may find it unnecessary to submit the same exact pages into both indexes. Although LookSmart reviewed sites do tend to stick well in the Inktomi index they are not reindexed as often.
http://www.positiontech.com (POC recommended for excellent service & support)
Inktomi Search/Submit Service Launches
HotBot, AOL, and MSN all use the Inktomi Search Index and now the surest way in is to pay the fee! And, beware of free submissions -- they could actually hurt your rankings.
The Inktomi Search/Submit service launched November 15th with their first Channel Partner, Position Technologies, Inc. of St. Charles, Illinois. Network Solutions is expected to be launching their version of the service in January.
The new service offers a solution for webmasters who are experiencing problems getting their pages to stick in the Inktomi database. It also ensures that pages will be routinely spidered -- something that is likely to help webmasters who are constantly adjusting pages in order to achieve optimum search engine ranking. The primary advertised features are:
- URL's that are submitted through the service get indexed into the IFD database every 48 hours.
- The (agreement) service period is for 12 months
- Dynamic URL's with special characters are acceptable (example: http://www.domain.com/index.asp?id=pagename&variable=true)
- URL's are added to the main database that is shared with Inktomi's 125 search partners
- Position Technologies provides online reporting to show you the last date that the pages were actually spidered by Inktomi.
We tried out the new service early this month on some test pages to let you know what to expect. The best thing is that Position Technologies does exactly what they say they will do, there are no hidden surprises here. We also learned...
- The service does allow, and will spider, complex dynamic URL's which contain special characters -- like the question mark. This is a great help to many database driven sites.
- We also learned that the spider reindexes the submitted pages once per day. That means that changes you make early on day one will likely show up in the IFD index on day two. This will allow you to make adjustments to factors like keyword density on a daily basis until you've achieved optimum results.
- Position Technologies pre-spiders your newly submitted URL's and will give you a rough estimate on how the page will do on Inktomi. For instance, if the URL can't be spidered because you used a bad link, you won't be charged for it, the system will show an error. This pre-spider only takes place once, when you submit the new URL to them. After that the only spiders visiting your site will be the ones from Inktomi.
- A URL may be substituted once during the subscription period. Substitute URL's must be from within the same domain as the original URL. (Exception: URL's incorrectly entered while subscribing may be corrected without being counted as a substitution). So you can make some limited changes to the URL's.
- There is no human review of pages submitted during this process at Position Technologies, however Inktomi may review the pages submitted manually, just like the free addurl service.
- We received an 8 position gain for a URL that was already indexed in Inktomi prior to using the Search/Submit service. This page was previously submitted to Inktomi around July 2000. Under no circumstances did we see a drop on pages submitted through the service.
- Cloaked pages work fine with the new service
After Inktomi spiders the submitted URL's, they will start to show up in their search partner's results within 48 hours. In some cases, you may see two entries for the same page. That's because Inktomi has two databases, one known as the Long-Term database and the new one that is called the IFD database - we're guessing that IFD stands for Inktomi Fresh Database?
The URL's that are submitted via the Position Technologies service go into the IFD database within 48 hours. They are not added to the Long Term Database (LTD) until Inktomi has a full index -- which is approximately every 30 days.
Some search partners, like HotBot, only show the LTD results if two duplicate results for the same URL are found. So the changes that are made to pages enrolled in the service, AND already in the LTD won't show up on HotBot until the LTD is updated. If the pages that are enrolled through the service are not already in the LTD, then you should see them at HotBot in 48 hours.
You can, however, force HotBot to show you both database results if you search for a keyword on your site and click on the see results from this site only link. That will also allow you to see the date the pages were spidered. The older date for a given URL is the entry from the LTD and the newer date is from the IFD database. Some search partners, like MSN, will show both the Long-Term and the IFD database information, resulting in two listings in the search results.
Position Technologies has several other tools in regards to submission services that tie into the Inktomi Search/Submit service. Position Pro is their general submission service that covers all the other major engines and also includes position reporting services.
To play the game we'd advise you to consider utilizing the Search/Submit services at Position Technologies. It does give you definite advantages that are otherwise unavailable to non-subscribers. We'll be using it for most of our own important pages.
Note these Important Changes on the Horizon
Based on information revealed to us at the Dallas Internet.com seminar by sources that (understandably) wish to remain anonymous, we can expect some major changes coming from Inktomi. You can treat the following information as rumors, as hints, or whatever... but we're already acting on them and treating them as known facts.
Pages submitted to Free addURL forms at search partner sites will be penalized in ranking.
Yes, that means that if you submit a brand new URL to Inktomi, that page will be ranked lower -- that is until Inktomi's spiders find links from other sites to it. Once the spider finds it, the penalty will be lifted. The logic is that any page with existing links to it is probably more popular (and therefore more relevant) than a page that is simply submitted. Pages that are discovered by their spider after following a link on another site will not receive this penalty.
Pages that are already indexed in their Long Term Database (LTD) and get resubmitted are likely to be slapped with the penalty factor until the engine rediscovers links to that URL from other sites. And, (naturally) pages added via the new (paid) Search/Submit service will not be subject to this penalty.
Based on this information, we will NOT be resubmitting ANY URL's from our sites that are already indexed at Inktomi. It should also be noted that Inktomi doesn't actively deep crawl sites. So, in order to avoid the penalty factor you'll need to obtain a link from a page on another domain that is already in the index. Then you'll need to wait and let the spider find that new link on its own. Do NOT resubmit the page containing the new link.
Inktomi will likely be removing the Free addurl feature at their search partner sites in the future. Incoming Link-Popularity is a determining factor on how important your pages are. Pages with low or no importance will be penalized.
Inktomi is dropping new pages that are submitted to it via the Free addurl services if they fail to receive some click throughs. We don't have enough data on this yet to tell you how many clicks are enough to get the pages to stick. However, it is important that your URL's receive traffic in order to keep them in the (free) index.
Inktomi may substitute the description and title from LookSmart listings for a URL if its spider finds the page in the LookSmart listings. Our experience has been that pages we have entered in the Search/Submit service appear to be overriding the LookSmart listing at this time. Pages not in the Search/Submit program will likely have their title and description altered to match LookSmart's listing for that page.
There are no guarantees how long MSN, HotBot or others will continue to use the Inktomi database. But at this point in time there is no doubt that a high ranking listing in Inktomi is a valuable marketing asset not to be overlooked. The Inktomi Search/Submit service is fast becoming a necessity for both high rankings and for stability for your pages in the Inktomi database. We suspect you might want to plan this into your 2001 marketing budget.
Inktomi Keyword Density Experiments
We've been experimenting recently with pages submitted through the new 48-hour indexing service at Positiontech.com.
Anyway, we're now able to make changes to pages and see the results of our tests within 48 hours. That's fantastic. In the world of search engine optimization there's nothing better than to get the results of your tests QUICKLY. It's simply the best way to find out if you're making the correct optimization changes. Not having to wait months to get reindexed is a tremendous asset.
A few people have noticed that on Inktomi based engines like HotBot, Canada.com and others the title and description in the search results aren't matching what's on their pages. After closer inspection, it is clear that Inktomi is replacing the page's title and description with the one from LookSmart. Since Inktomi powers LookSmart's search in the Web Sites section, and also spiders their directory, your LookSmart listing is going to affect your Inktomi listings.
On the plus side, it means that you're going to be less susceptible to the shuffle that occurs every few weeks at Inktomi because you'll be in the main cluster with the other LookSmart listed sites. It might also mean that you won't have to pay the submission to Inktomi when that program is announced since LookSmart is already paying to get the sites listed in their directory spidered.
On the negative side, it's very likely that your title and meta description won't be what you want since they're using the ones created by the LookSmart editors instead of those actually written into your pages.
Our opinion is that LookSmart might actually increase their express submission sales if they left webmasters' titles and descriptions alone. At the very minimum, we think they should leave the titles as is provided that they are relevant to the page content. We know some people who are flaming mad -- customers with a bad taste in their mouth from having had their title changed after paying for a review. To us it seems like an unnecessary antagonization of a paying customer. Sure, if the title is inappropriate or not relevant (people still do that? ...why?), then LookSmart can have them re-title and resubmit. Doesn't that sound reasonable?
Inktomi is an unbranded search engine that is private labeled by search engine sites that wish to outsource their technology. Each site that licenses the Inktomi search engines utilizes different options that affect how their own engine responds to search queries.
For instance, they can access different portions of the Inktomi database as well as respond differently to the scoring algorithms than other sites using the same Inktomi engine. In fact, one site, goto.com even sells the top positions listed on its site.
Inktomi now powers the following private labeled sites:
HotBot - http://www.hotbot.com
MSN - http://search.msn.com
GoTo - http://www.goto.com
Anzwers - http://www.anzwers.com.au
Goo - http://www.goo.ne.jp
Canada.com - http://www.canada.com
RadarUOL - http://www.radaruol.com.br
Note: Submissions to free Add URL links will not rank as well as those that use the paid URL feature at http://www.positiontech.com until the Inktomi spider finds links to your pages. Then they should rank just as highly, but won't be spidered every 48 hours like they would with the paid submission features.
If you are listed in any one Inktomi powered site, then chances are your site can also be found in the other engines that are also powered by Inktomi. In this discussion we are only covering the major players -- MSN, HotBot, Yahoo, and NBCi (Snap). We don't believe the other engines generate enough traffic to be relevant to your marketing efforts.
One possible exception -- Goto.com receives decent traffic due to their recent advertising campaigns. However, keep in mind that to be listed well on Goto.com you must pay for a listing in competitive categories. Otherwise, there is no other way to outrank a paid listing at Goto.com.
Search Engine: Hot Bot - http://www.hotbot.com
Indexed Pages: Approx. 110 million
Frame Support: Hot Bot does NOT support Frame style pages
Meta Tag Support: YES
Average Submission Time: Quick when submitted through canada.com (we recommend using the positiontech.com paid submission service to avoid submission penalties).
To submit a page to HotBot, go to: http://www.hotbot.com/addurl.asp
To submit a page to Direct Hit, go to: http://www.directhit.com/util/addurl.html
An Important Difference
Please note that HotBot is basically three engines, since Direct Hit, Inktomi and the Open Directory Project each serve its Top 10 search results depending on the search phrase that is used. If Direct Hit has the keywords in its database, then it will be the engine that provides the Top 10 results; if not, Inktomi handles the request. If there is a category match in the Open Directory Project, you may see the category results towards the top of the search results page. In most cases Inktomi handles all searches past the top 10 results. To determine which engine you are dealing with for your search keywords and keyphrases, it's important to monitor which results you are looking at.
Clear indications that the top 10 results are from Direct Hit are as follows:
- The "sorted by Direct Hit" image appears at the top of the search results page
- The "Powered by Inktomi" image is missing from the page
- At the top of the page, right above the search results, it will say "Web Matches: Top 10"
Your search results are going to be dramatically different depending on which service is providing the top 10 results for HotBot.
In case you're unfamiliar with Direct Hit, it's a relatively new search engine service that scores search results based on the traffic that a site receives. It does this by basically counting the clicks that a site's listing gets in the search engine, similar to banner ads.
Theoretically, to influence Direct Hit to increase your score, a large number of different IP addresses (visitors) need to click on your link at HotBot and spend some time before returning to the search engine. These visits should probably be random in nature, as opposed to scheduled daily visits. There is probably a way to automate this, but to date we haven't heard of anyone with that capability. If someone were to build such a system it would also need to have the ability to present different cookies to the search engine to properly represent a human visitor.
Direct Hit does have a submit page on their site, which may be useful for getting your site into their engine quicker. http://www.directhit.com/util/addurl.html
Direct Hit's client list contains HotBot, AOL, ICQ, ZDNET, Apple, Lycos, LookSmart, and Microsoft / LinkExchange. Expect some of these companies to use similar technology in the future.
HotBot and Direct Hit have been tracking traffic to individual sites since September 1998, and most recently they have also been monitoring traffic from the search results at Lycos. This information is compiled and used for some of the Direct Hit algorithm. Look for this new user information to be tied to their banner server at some point in the future as well. It's a very good source of information for them to compile user preferences. In fact, new features are going to include gender and age profiling. This means that sometime soon the top 10 search results at HotBot and other Direct Hit powered sites will be served up using a formula that takes into account the site visitor's age and gender. This information will be stored in a cookie on the search engine user's browser. More than likely this will also be tied into other advertising methods that will generate different ads and search results for each user's search profile, age, gender and location.
Inktomi is indexing the meta description tag now and perhaps the meta keyword tag. It doesn't appear to be giving extra weight to this tag, but you should include this tag when calculating keyword density.
Home pages, http://www.domain.com, appear to get a small boost as compared to individual pages like http://www.domain.com/page.html , but the grouping features that HotBot uses may be influencing this observation.
A key component of the Direct Hit technology appears to be that the new search results are not direct links to the web sites listed, but pass through a logging system that HotBot uses to track click-throughs to those sites. This feature appears to be unique to HotBot; we haven't observed it at other Inktomi powered engines.
This means that HotBot is likely now tracking how many hits it is delivering to each website, how many unique and return visitors the site gets and the length of time a visitor spends at each site. This information is likely compiled and used for some of the Direct Hit algorithm. Look for this new user information to be tied to their banner server at some point in the future as well. It's a very good source of information for them to compile user preferences.
We gave Direct Hit a call to discuss their new technology and spoke to Gary Cullis, Chairman and co-founder of Direct Hit. According to his executive bio, Gary's been writing code since age 12, and he is likely the driving force behind this new technology.
We also asked if repetitively clicking on a site's listing would increase its popularity ...or if each click had to be from a unique visitor to make a difference. Gary replied they had factored that into their system. He said that repeatedly clicking on a link wouldn't make a difference in the popularity. In our discussion we also learned that it does matter how long a visitor stays at the site before returning to HotBot. If people just visit your site then immediately hit the back button, it could have a negative effect on your Direct Hit popularity.
We have been studying HotBot quite closely over the past few weeks trying to see what makes a difference in HotBot once you have reached the 99% level. As you've probably noticed, the number of pages all ranked with the same score can be quite large. Often we see hundreds of pages all scoring exactly the same. Therefore we've been closely looking at what makes the difference between a page that scores 99% on page 1 vs. a page that scores 99% on page 10.
Our studies have shown that regardless of the same score, these pages vary wildly in the keyword densities and designs. We have seen in some cases the keyword density vary from .67% to 17%, all on the first page! And yes, we were very careful to make sure we were looking at real pages...
We did notice some trends, however, that you might use to your benefit...
- Top pages often have the keyword in the URL, as a domain name, or the file name.
- High ranking pages most often have a low keyword density, from around .67% to 1.17%. This was calculated on body text, href tag descriptions and words inside href tags.
- The greatest majority of high scoring pages have the keyword in the href descriptions (<a href="filename">keyword</a>) two to three times.
- Often high scoring pages have the keyword in one of the filenames of these three hrefs... example -- <a href="keyword">keyword</a>
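The density figures above can be reproduced with a short script. The following is only a sketch of one way to do the count, assuming that "body text, href descriptions and words inside href tags" means every visible word outside the title, script and style sections, plus the words that make up each href value. The names DensityParser and keyword_density are our own, and the engine's actual formula is not known to us.

```python
from html.parser import HTMLParser
import re

WORD = re.compile(r"[a-z0-9]+")


class DensityParser(HTMLParser):
    """Collects visible words plus the words found inside href values."""

    SKIPPED = ("script", "style", "title")  # assumption: not part of body text

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # "flowers.html" contributes the words "flowers" and "html"
                    self.words += WORD.findall(value.lower())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += WORD.findall(data.lower())


def keyword_density(html, keyword):
    """Return the keyword's share of all counted words, as a percentage."""
    parser = DensityParser()
    parser.feed(html)
    parser.close()
    if not parser.words:
        return 0.0
    hits = parser.words.count(keyword.lower())
    return 100.0 * hits / len(parser.words)
```

Run it against a saved copy of a competitor's page and against your own page with the same keyword, so that the two percentages are counted the same way. It handles single-word keywords only; a phrase would need a different counting rule.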
Inktomi, the universal search engine...
The technology that actually drives HotBot's search engine is maintained by the Inktomi Corporation -- and the Inktomi Index. HotBot simply private labels Inktomi's master index by putting their own look on the engine. When your site is submitted and indexed by HotBot, it is actually added to the Inktomi master index.
Currently, the primary submission avenue to the Inktomi search engine index is through HotBot. Some of the sites outside the US, like Canada.com also have a submission form that we believe is also tied to the Inktomi index. By submitting to HotBot you avail your site to the extended search results for the new Yahoo, and NBCi. This is clearly an extra advantage. We suggest submitting your site to any of the Inktomi engines that have their own submit URL form.
Therefore, even if your pages do not score well on HotBot, you should submit them anyway in order to increase your chances of being found by the engines and indexes that are using the Inktomi Index.
HotBot has been scoring pages highly that contain the keyword in the URL, like http://www.keyword.com or http://www.domain.com/keyword/keyword.html. It appears that having the keyword in the domain name carries slightly more weight than a subdirectory or filename. This technique is also effective on Excite.
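When reviewing a list of top-ranked URLs, it helps to note where the keyword sits in each one. The sketch below is one way to do that with Python's standard urllib.parse module. The function name keyword_positions and the three labels it returns are our own, and it only reports where the keyword appears. It does not attempt to model how any engine weights each position.

```python
from urllib.parse import urlparse


def keyword_positions(url, keyword):
    """Return which parts of the URL contain the keyword."""
    parts = urlparse(url)
    keyword = keyword.lower()
    # Split the path into everything before the last slash and the file name.
    directory, _, filename = parts.path.lower().rpartition("/")
    found = []
    if keyword in (parts.hostname or ""):
        found.append("domain")
    if keyword in directory:
        found.append("directory")
    if keyword in filename:
        found.append("filename")
    return found
```

For example, keyword_positions("http://www.domain.com/keyword/keyword.html", "keyword") reports the directory and the filename but not the domain.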
Since Hot Bot forces their users to be very specific in order to turn up meaningful search results -- and for you to have a fighting chance to come up on top of their search list -- you will need to pay close attention to adding very specific keyword phrases when you design your pages for HotBot. In addition, HotBot is case sensitive so you may want to include alternative capitalization in your keywords or body text.
<META> keywords can affect your positioning on Hot Bot. You should use them; however, there is no evidence that repeating them will improve your page's position -- except for lower case / upper case repeats (e.g., "Bed Breakfast" and "bed breakfast"). Most importantly, make sure that you include all of your pertinent keywords in your <META> keywords tag just once.
<META> descriptions are used by Hot Bot for the summary description on the search results. More importantly, at this point in time it is the hottest place to put your keywords. Make sure you put your keyword/s in this tag for the best score on the Inktomi engine. Be sure to put your summary in the meta description to force Hot Bot to meaningfully describe your site. Otherwise, it will grab the first words on your page to use as the summary. Unlike other SE's that only use Meta descriptions found after the <TITLE> tag, Hot Bot uses them regardless of whether they are placed before or after the <TITLE> tags. However, in order to maintain consistency with your page designs, we recommend that you place your meta's AFTER your <title> tags... this way they will work for all SE's that use <meta>'s.
Remember: It is important to use both the meta keyword and the meta description tag together. HotBot appears to ignore meta keyword tags if used without the meta description.
Here's an interesting trick somewhat unique to Hot Bot. If you have registered the same page twice with a different URL, Hot Bot may list the second registered URL as an alternate and it will take up a position on the search page that is returned. This means that if your page takes, say, the #9 position, your alternate will take the #10 position. This is important if you are in the top ten on a search because it means that you can bump your competition down on the list by registering the same page two different ways. Example: http://yourdomain.com and http://yourdomain.com/index.html are actually the same URL address returning the same page... but by registering it two different ways, you can take an extra position on the search engine -- and, theoretically, you have two more opportunities to register this same page by using the www in each of the above URL's. That would expand this page's presence & positions to four spaces in the search.
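The four forms of a home page address described above can be listed mechanically. This is a minimal sketch, assuming your server returns the same page for all four forms and that the default file is named index.html (both depend on your own hosting setup). The function name url_variants is ours.

```python
from urllib.parse import urlparse


def url_variants(url, index_file="index.html"):
    """List the with/without-www and with/without-index-file forms of a home page URL."""
    parts = urlparse(url)
    host = parts.hostname or ""
    bare = host[4:] if host.startswith("www.") else host
    variants = []
    for h in (bare, "www." + bare):
        variants.append(f"{parts.scheme}://{h}")
        variants.append(f"{parts.scheme}://{h}/{index_file}")
    return variants
```

Calling url_variants("http://yourdomain.com") returns the two forms without www followed by the two forms with it. Check each one in a browser before submitting it.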
In summary... META tags are extremely important to Hot Bot... be sure to use the description as well as the keyword meta's in order to give your page(s) a chance to be returned high on a HotBot search.
Keyword density is also important to Hot Bot. Many top pages are short on words and laden with image links. By making your images the links to the other pages on your site, you maintain (increase) keyword density while refraining from diluting your page with words that are not supportive to the central theme of your web page.
Search Engine: Google - http://www.google.com
Indexed Pages: Approx. 575 million, links to 1 billion plus
Frame Support: Indexes <noframes> section
Meta Tag Support: NO
Accepts multiple <TITLE>s: Not Effective
Database Refresh: 1 month
Case Sensitive: Yes
Word Stemming: No (a search for tool and tools will give different results)
Average Submission Time: 30-120 days
Traffic - 3 to 4 million visits daily
Interesting Fact: Google has the largest installation of Linux powered servers in the world, currently around 4,000. Plans are to increase the number to around 6,000 servers in 2000.
To submit your site, go to : http://www.google.com/addurl.html
Google now indexing PDF files
Google has confirmed that they can now index Adobe .pdf files. We talked to David Krane at Google about the new feature, which to our knowledge is a first for a public search engine. This opens up a whole new level of documents to their search engine such as technical documents, instruction manuals, etc.
Users at Google will soon have the ability to click on a Text Version link that will display a text conversion of the files, very similar to what their Cached link does with HTML files. David explained that the formatting of the pages is still being worked on and that it would be enhanced as they release new versions.
Another interesting note is that the engine will also crawl links within .pdf files to other .pdf files as well as to other normal HTML URLs. At this time, Google has only released the feature on two of their servers and they are in the process of installing it on the rest of them. We asked how the text in the title would be determined and David replied that it's something they are still working on. He also said that the Title of the page in the search result will likely vary as they enhance the software to get the best results.
This feature also opens up new possibilities for Google for providing services to corporate intranets and other outsourced search services.
SEGuru/Google Webcast Notes
On January 30th, Darin Babin of Searchenginematrix.com and Matt Cutts, a software engineer at Google, conducted a live webcast for members of the adult webmaster community. A number of interesting things were discussed and, of course, we made notes.
- Matt noted the problems that WebPosition Gold and other automated bots cause at Google. They ask that if you do use these tools, limit the number of queries and run them during the off-peak hours -- which means late night and very early morning hours. Google does have the ability to see the WebPosition queries quite easily. IF they see what they believe to be excessive queries they MAY blacklist BOTH the IP of the machine that is making the queries and, depending upon their observations, the domains themselves from the engine. This applies to most automated tools. He also announced that they really don't like automated submission tools and requested that if you do use them that you restrict the number to 5-10 submissions per day -- again during off-peak hours whenever possible.
- Google employees do not like getting spam email and Matt commented that if you send email spam to Google you can be assured that your domain will get blacklisted.
- Google has no plans to require a paid submission system.
- firstname.lastname@example.org is the email address to use for correspondence regarding your websites, and questions that you might have.
- Broken Images do not affect ranking.
- There is no relationship between Yahoo's ranking of their directory results and the Google search engine, at least on Google's part.
- The meta no archive tag does not affect your ranking, but of course Google may, at some time in the future, review pages that have this feature to make sure they are not hiding irrelevant content.
- If one of Google's search engineers bans a site, when you request getting unbanned, the original person at Google that banned the site will normally be contacted before the site is placed back into the index.
- Thanks to Darin Babin at http://www.searchenginematrix.com for getting Google on their webcast. If you would like to hear the full webcast go to http://www.searchenginematrix.com/livechat/.
*Note: Search Engine Matrix is an Adult Webmaster Community. If you are easily offended by Adult content, be aware that this site deals with sexually related websites and adult material.
Flashback in time to 1999. Google was the new kid on the search engine block while the powerhouse engines were Excite, AltaVista, and Infoseek. Lycos and HotBot were major players. Even WebCrawler and Northern Light were considered engines worth promoting your site on. All of these companies were substantial, had been around for a while and most were flush with cash as the Internet boom was still going strong.
When Google's cofounder, Sergey Brin, was asked at the Search Engine Strategies conference, "How does Google feel about webmasters that spam their engine?", he charmingly replied, "We kill them." Of course, everybody laughed. Then Sergey smiled and casually stated that Google does not worry about spam. He further explained that Google never bans a site and never removes a page from their index.
Sergey's demeanor, charm, and sense of wit in the face of this emotionally charged issue won the hearts of many professional marketers and webmasters that day. Over the next two years, Google's wise and far-sighted tolerance attracted the world-wide support of Internet professionals who, in turn, became a driving force behind Google's phenomenal growth in popularity. This grassroots ground swell of professional opinion that Google represented the good-guys facilitated their meteoric rise from ground-zero to become the world's largest index of web pages -- over one billion strong -- in a period of less than two years.
Ok, flash forward to now, present day, March 2001. Google is now considered one of the big winners -- perhaps even the most likely survivor -- in the search engine wars. They're now one of the most important engines for webmasters to focus on in their quest for high rankings. Our research is showing that, generally speaking, Google delivers more traffic from a number one ranking than any other engine.
The sad chapter in this story is, however, that Google, being so impressed with its own success, has forgotten those who've contributed so much toward making them successful. Alas, webmasters (with a commercial motive) no longer seem important to Google. And, the rules have changed!
Consider this paragraph from Google's Terms of Service page:
The Google Search Services are made available for your personal, non-commercial use only. You may not use the Google Search Services to sell a product or service, or to increase traffic to your Web site for commercial reasons, such as advertising sales. You may not take the results from a Google search and reformat and display them, or mirror the Google home page or results pages on your Web site, or send automated queries to Google's system without express permission from Google. If you want to make commercial use of the Google Search Services you must enter into an agreement with Google to do so. Please contact email@example.com for more information.
According to the above paragraph and the broad statements within it, all commercial web sites seem to be in violation of their terms of service. Another interesting but obviously out of date statement on their sites says:
The sites displayed as search results or linked to by Google Search Services are developed by people over whom Google exercises no control. The search results that appear from Google's indices are indexed by Google's automated machinery and computers, and Google cannot and does not screen the sites before including them in the indices from which such automated search results are gathered. A search using Google Search Services may produce search results and links to sites that some people find objectionable, inappropriate, or offensive. We cannot guarantee that a Google search will not locate unintended or objectionable content and assume no responsibility for the content of any site included in any search results or otherwise linked to by the Google Search Services.
The highlighted statement above is actually not true. They are, in fact, now banning sites from their index and preventing them from being found. Google now exercises editorial control over what they believe to be non-appropriate pages for their index -- which, of course, they have the right to do. However we also believe they should publicly state in their written materials that the results are edited and some web sites are not included in their index.
The point we feel you should realize is that the people that operate Google do not care about your web site, or your sales, or how you rank, or how the changes they make may affect you. Stated another way, they no longer need you, the webmaster, to supply them with their content. Ok, fine. The question is, are they unaware that without you, the webmaster, their support will decline? ...do they not see what happened to InfoSeek when they finally told all of the webmasters that made it so immensely popular to piss off? ...do they not understand that their popularity is very much the result of being the safe harbor for webmasters in a search engine storm? Of course, only time will tell. However, recent executive and technical decisions at Google do not paint a rosy picture with respect to the quality of the relationship between engine and webmasters.
Now that Google has lost its innocence and discovered that their engine can be manipulated and their algorithms are not totally spam free, they feel that Internet marketers are taking advantage of them. Behind the scenes they are fighting back while at the same time trying to maintain a friendly public face to webmasters.
In any case, over the past few months several distinct patterns have emerged. And, as a web site promoter you may not like them but you should remember this is Google's ball game and they get to make the rules. Your choice to play, or not to play, by these rules is, well, your choice. Our job is to bring you the information. Part of that information is this...
Google is now showing a penchant for selectively passing the death sentence over sites without trial and without explanation. In other words, if you commit any one of many actions they no longer like they may permanently ban your site from their index.
According to Google, anything that you do to increase your web site's ranking is not allowed. If, for some reason they review your web site or you get their attention for some other reason, the following penalties could apply (and, your mileage will probably vary -- greatly!)
Things that might get YOU banned at Google, but not necessarily your competitor...
The following information is based on emails and discussions we have had with numerous webmasters, Google's search engineers, and drawn from our experience. As you read over these guidelines, keep in mind they are only selectively enforced.
By that we mean that some people get by with using these strategies without any problem whatsoever while others get permanently banned. We very strongly suspect the difference in treatment is the result of how these strategies are applied and, to a lesser degree, the personal mood of the search engineer that happens to be reviewing the site that's making use of them.
- Link-Farms -- When link popularity became one of the factors that search engines used to score page relevancy, a number of services sprang up to help webmasters increase the number of links to their sites. As you might suspect, some were similar to directories and actually added deserved relevancy. Others were nothing more than massive collections of distributed links. The number of sites that link to you and the text within those links does affect how Google ranks your pages. Now, however, Google views these systems as attempts to manipulate their index. Reality? ...Google has passed the death penalty over many of these so called link farm sites. In some cases, they've banned the domains these link farms are pointing to even when they don't have a link pointing back to the farm. Warning : If your site has ever been involved in one of these services, make sure they are no longer linking to you. In some cases, Google has unbanned sites after they've removed their link-farm connections. (Oh, oh... I just flashed on people adding their competitors' pages to link farms and reporting them to Google. What should we do now, Google? ...oh, I forgot -- not your problem, right?)
- Site networks -- Some companies have networks of sites. For example, 4anything.com runs topic specific domains like 4Careers.com. Networks like this have hundreds, if not thousands, of topic specific domains. Since each topic specific site also links back to the main site, the network has the effect of building (i.e., increasing) link popularity to the main site. In some cases Google has banned entire networks of domains such as this. We're not saying that our example 4careers.com is banned or that it should be banned (in fact we believe it shouldn't), but we can't seem to find it in Google. Perhaps such sites are viewed as competitors to Google, or maybe they believe that the design of the site network is giving them an unfair boost in the Google search results. Reality? ...some sites that have a large network of domains under their control are being excluded at Google, some are not. There seems to be no set rule that determines when a network is considered inappropriate and therefore should be banned -- at least that we're aware of. Our best advice is to be certain you have unique content on each domain and that each site is useful and relevant in regards to your keyword topics. Refrain from hiding any links back to your mother site -- a practice they now view as spam.
- Hidden links -- Hidden .gif links, like the 1x1 pixel method, the empty href method -- i.e., <a href="yourdomain.com"></a>, style sheets containing hidden layers with links, and generally speaking, any link that you show the spider but is invisible to a human are sometimes stated as the reason a site has been banned. Unfortunately, we've been unable to determine by what specific rule this death penalty is meted out. Reality? ...we can't help but notice that some sites are using these strategies very successfully while other sites experience only penalties for using them. Our advice? ... it may be a good idea to remove any hidden links on your pages -- or, at least, make them visible in some fashion.
- The <noframes> tag -- The no frames tag is a standard and perfectly legitimate strategy for providing search engine spiders with relevant content to index on an otherwise un-indexible frames based site. However in some instances, Google has inexplicably dropped some pages that contain noframes content from their index while retaining other sites that continue to use this standard strategy. We are certain that the noframes tag content is still being indexed and it is not being ignored by their spider. However, we do not know by what criteria Google is deciding that some pages stay and some pages go. So, for now, be wary of using anything within the noframes tag that could be remotely considered spam.
- Keyword Repetition -- Sites that use text content that is obviously geared toward increasing the keyword density of the page without adding value to the text itself are potentially in the path of Google's spam police. If the page content doesn't make sense to a human, Google may ban the site for spamming. Therefore, we suggest that your site read well -- especially if you are attempting to adjust the keyword density to optimum levels. Reality? ...stay away from the kind of senseless babble that some software programs tend to generate when they create artificial doorway pages with pre-determined keyword density formulas. If a search engineer sees such a page, it's highly likely they will ban your site.
- Cloaking, dynamic sites, special scripts -- Pages that are generated by using tools of this category are now publicly referred to as spam by Google. In practice, however, it clearly depends on how you are using them -- but, of course, Google cannot tell you that any more than they will tell you why such inconsistencies exist in the categories we've outlined elsewhere in this article. It's the content itself, not so much the tools, that they are focused on. If you choose to employ strategies or activities like the ones we've listed above, Google will likely react even more strongly if you use sophisticated tools to achieve these objectives. For instance, if you are participating in a Link Farm, or using hidden links, or repetitive text AND using a cloaking system to hide those facts, they will ban you -- and even maybe tell you they did so because you cloaked. If you ask them directly, "Is cloaking allowed?" ...they will officially answer, "No, we consider it spam." Reality? ...in practice, Google doesn't ban sites that are cloaking unless those sites are also using it to conceal other strategies or techniques that are deemed unsavory by Google. Most importantly, you should not expect to use cloaking as a method to successfully spam Google. You will not get away with it. Remember that if you show Google's indexing robot different content than what a human sees, your pages will be scrutinized with a higher professional standard in mind. Google will expect you to know exactly what you are doing. If they see anything else at all on your site they take exception to, you will likely be banned and they may very well tell you it's because you cloaked. Therefore, if you employ this strategy, you'd be well-advised to use it with care, discretion, and primarily to protect your optimized content from pagejackers and source code thieves.
Other uses that may be seen as legitimate by Google (no guarantees) involve addressing very specific spidering & indexing issues pertaining to flash pages, graphics intensive sites, etc. Under NO circumstances should you use any script to conceal what Google considers to be spam. Doing so will likely earn your site the death penalty without reprieve. Bottom line: Such scripts represent very powerful tools that, if used, should be used cautiously with any and every search engine. And, there exist no special programming scripts that will enable you to conceal unacceptable tricks at Google.
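Returning to the hidden links item in the list above: if you decide to remove such links, you first have to find them. The sketch below flags two of the patterns named there, the empty href and the 1x1 pixel image link. It is only an illustration under our own assumptions: the names HiddenLinkAudit and find_hidden_links are ours, it treats an image as invisible only when both width and height are written as "1" in the tag, and it does not look at style sheets or hidden layers at all. It makes no claim about how Google itself detects these links.

```python
from html.parser import HTMLParser


class HiddenLinkAudit(HTMLParser):
    """Flags links that show no text and no image larger than 1x1."""

    def __init__(self):
        super().__init__()
        self.flagged = []      # hrefs of links with nothing visible inside
        self._href = None      # href of the <a> currently open, if any
        self._visible = False  # has the open link shown anything yet?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._visible = False
        elif tag == "img" and self._href is not None:
            tiny = attrs.get("width") == "1" and attrs.get("height") == "1"
            if not tiny:
                self._visible = True

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self._visible = True

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if not self._visible:
                self.flagged.append(self._href)
            self._href = None


def find_hidden_links(html):
    """Return the href of every link that appears to have no visible content."""
    audit = HiddenLinkAudit()
    audit.feed(html)
    audit.close()
    return audit.flagged
```

Treat the result as a list of links to look at by hand, not as a verdict. A link flagged here may still be visible through a background image or a style rule the script knows nothing about.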
Thus far, these are the strategies and optimization techniques we've learned may get you into hot water with Google. We'll be attempting, over the next few weeks, to obtain further clarification from Google on exactly what their policy is toward commercial sites -- particularly those that use the various promotion techniques. Google does worry about spam. The good news is, they haven't yet started killing the webmasters.
Google Search Tips
Google's default search uses Boolean AND logic. If you search for "hawaii travel news", all three words must be in the page in order for it to be found in the search results. Google also supports Boolean OR logic. Using our same example above, we could extend the search to cover Hawaii travel news or information like this: Hawaii travel news OR information. Hawaii and travel are required, but the pages could contain either news or information, or both terms.
Most engines have what they call stop words. These are words and phrases that are used so often in web pages that the quality of search results would suffer if they were included in the search. For example, a search for "Where can I find Hawaii" results in a search only for the word Hawaii. If you use a stop word or phrase at Google, you'll see a section of text at the top of the page which says: "The following words are very common and were not included in your search: Where can I find."
If you must have stop words in your search results, add them with a plus sign like this: +Where +can +I +find Hawaii. Pay close attention to this issue; our experience has been that many people needlessly waste time and effort promoting keyphrases that contain stop words. Our advice? ...don't.
To exclude certain words from your search, simply put a minus sign in front of the word that you wish to not find in the pages of your search results. For example if you're doing a search for Anderson but find that you're getting a lot of pages for Pamela Anderson, try searching for Anderson -pamela. You can use multiple exclusion words to narrow down your search even further.
To search for text in a specific order, enclose it within quotes, like "Hawaii Travel News".
To restrict searches to a specific web site, use the site: option. This allows you to search for information only on that site. By entering... keyword site:www.thedomain.com ...you can go directly to the pages on that site containing those keywords.
To find what sites are linking to a specific domain, use the link: option and Google will return all of the pages that are linking to the URL. You can use it to find links to the entire domain, like link:www.mydomain.com or even a specific page link:www.mydomain.com/product.html. Google will not allow you to use Boolean modifiers like -news with this option.
Search Page Titles Only
One of Google's cool features is the ability to search only the titles of web pages by using the allintitle: option. To find text only in titles, first enter the option then the keywords as in allintitle:Hawaii, for example. This type of search will return only pages with the word Hawaii in the title. To extend this function you can combine other search options, for example allintitle:Hawaii Travel News will further reduce the number of pages returned in the search. You can also use the minus, plus and quotes with this option.
Search URL's Only
We really like this one. The allinurl: option allows you to restrict the search to just the URL's of the pages. An interesting example: searching for allinurl:robots.txt will show you how many robots.txt pages Google has indexed <grin>.
By the way, just in case you're still wondering whether or not it pays to have your keyword in the URL? ...it sure does in this case! For fun, try a search for something like allinurl:cellphones -- quite possibly THE most accurate search you can do for something like cell phones.
We believe Google really likes the outgoing links with the keywords in them. If you're just starting to build pages for Google, the flowers page above is a good example of what it likes for popular single keywords. Keep in mind that you should also get listed in ODP and Yahoo in order to score well on Google. Additionally, it will help if you have incoming links from other relevant sites whenever possible.
Along with the announcement of the new larger index, some of the new pages they indexed happened to be our test pages. This information could be invaluable to you when building your pages specifically for the Google engine. And, it will soon be even more important once Yahoo completes their already announced switch to Google.
What does Google index?
- Text in the <title> tag (first position preferred, it will index up to 1129 characters in the title tag)
- Text within keyword links - <a href="file.html"> Keyword </a>
- Text within the <noframes> tag, even if the page doesn't use the framesrc tag sets.
- Text that occurs before the <html> tag
- Text that occurs before the <body> tag, if not inside other tag sets
- Text inside the <option> tag set
- Text inside the <img alt="keyword"> tag
- alt text in the <img src> tag
- Text in the URL. GoogleScout reads the file name from a linked page. This isn't the same as indexing the page. It's based on links: names within the URL, or names that occur near the URL, get referenced by GoogleScout.
- Text previous to the <html> tag - although this is not as good as text in the body or after the </head> tag.
- The first 83 characters of text in the title -- put your most important keywords at the front.
- Body text
What doesn't Google index?
- Text within the <style> sheet tag
- Text within the <meta> description tag
- Text within the <meta > keyword tag
- Text within the <meta http-equiv> tag
- Text within the <! -- Comments --> tag
- Text within the url of a link <a href="keyword.html">text</a>
- Dublin core type meta tags
- http equiv meta tags
Other important facts about Google...
- Google is case sensitive
- Google doesn't use the meta description tag for the page summary in the search results. It displays the text around the keyword from the document source and highlights the keyword.
- RealNames Internet Keywords links are incorporated at Google, but only for exact search matches
- Google's searches are exact (Boolean AND)... in other words, all of your search words have to be found in the document for it to show up.
- Google caches your page content on their server and makes it available through the Show Matches link in the search results. (more on this below)
- Google will list a URL even if it hasn't spidered the page. These are the ones that you see in search results without a regular title, summary or the Show Matches link.
- Google places a lot of importance on your domain being listed in Yahoo and ODP. If your site is missing from these directories it could hurt your rankings on this engine.
- How does Google handle IFrames? Google appears to ignore the framed page itself, indexing only the content within the IFrame tag. For example, do a search for You will not see this text if your browser supports IFRAME. If you're not familiar with how IFrames work, see Webmonkey's article at http://hotwired.lycos.com/webmonkey/96/37/index2a.html
- Google doesn't index the meta keyword tag or the meta description tag. Don't use that information when calculating keyword density on this engine. After extensive testing, we've arrived at the conclusion that Google attempts to index almost all content on the page (except the meta content). For example, we've observed situations where it has indexed the onmouseover text that is within a properly formatted <a href> tag, for example: <a href="http://www.domain.com" onmouseover="...">. What does this mean? Well, for one thing, when calculating keyword density on a page, HTML code should probably be counted along with visible text to get a more accurate picture of the density of a page.
- Another interesting note; when viewing a page from the Google Cache link, you may see a note that says These terms only appear in links pointing to this page: (terms). In fact, we've observed that sometimes when it says this, the terms ARE in the page, but they are hidden, for example in the image alt text area.
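To see how much that choice can matter, the two ways of counting can be compared in Python. This is our own sketch, not Google's method: the function names are hypothetical, a word is simply a run of letters and digits, "visible" text is approximated as everything outside the tags, and the sample link with its onmouseover text is made up for the demonstration.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text that appears outside of tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def _words(text):
    # Assumption: a "word" is any run of letters or digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def density(keyword, words):
    """Fraction of the words that equal the keyword."""
    return words.count(keyword.lower()) / len(words) if words else 0.0


def visible_and_source_density(keyword, html):
    """Return (density over text outside tags, density over the raw source)."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    visible = _words(" ".join(collector.chunks))
    source = _words(html)
    return density(keyword, visible), density(keyword, source)


sample = '<a href="http://www.domain.com" onmouseover="hawaii tours">Hawaii</a> travel news'
visible, source = visible_and_source_density("hawaii", sample)
print(f"visible text: {visible:.1%}  whole source: {source:.1%}")
```

On this tiny sample the keyword is one word in three of the visible text, but only two words in thirteen once the markup is counted, which is why the two figures can diverge so widely on a real page.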
Breaking out of the Google Cache
Google's search engine has a feature in their search results called cached which gives the user a copy of the page that Google has stored on their web server. In the event that a page has been removed from a site or is not available for whatever reason, Google can still provide it through this means of delivery.
Some web site owners have concerns regarding the use of the cached page feature. First on the list is the matter of up-to-date content. The cached results at Google are often old versions of your pages, at times dating back 30 to 90 days. If you are displaying content that is time or price sensitive, a visitor may be receiving and acting on out-of-date information. Conceivably, someone might purchase a product that is no longer available for sale on your web site.
Google's cache feature is not perfect. For instance, when caching a frames based page the cached result is a blank page. Your framed site will not show within the cached result. We don't know how many people make use of this feature at Google, but for a frame based site it could be costing you visitors.
There is also the issue of copyright. The fact that they display a version of your web page on their site without asking permission may make it hard to enforce other copyright violations. This aspect has not yet been addressed in any copyright cases that we're aware of, but it may be an issue if you allow Google to display a copy of all of your web pages while selectively enforcing other copyright violations elsewhere on the web.
On the other hand, if you have an unstable web server that frequently is unavailable, this feature might come in handy for people looking for information from your site.
To keep Google from showing a cached copy of a page, use what Google calls the no-archive tag. Simply add the following line to your HTML documents... <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
Place the tag in the <head> section of all web pages you don't want cached. This prevents the engine from listing the cached link on the search results page.
By the way, this tag has no effect on Google's ability to crawl or index your page, it merely tells their engine not to display the cached link in the search results. There is no penalty for use of this tag according to Google's Search Engineers and our own research verifies this.
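If you maintain a lot of pages, a quick check like the following can confirm which ones carry the tag. It is a sketch using Python's standard html.parser module; the function and class names are our own, and it only looks for a meta tag named GOOGLEBOT whose content includes NOARCHIVE, exactly as shown above.

```python
from html.parser import HTMLParser


class _NoArchiveFinder(HTMLParser):
    """Looks for <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"> in a page."""

    def __init__(self):
        super().__init__()
        self.noarchive = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # The parser lowercases tag and attribute names, but not values.
        attr = {name: (value or "") for name, value in attrs}
        name = attr.get("name", "").lower()
        content = attr.get("content", "").lower()
        if name == "googlebot" and "noarchive" in content:
            self.noarchive = True


def has_noarchive(html):
    """True if the page source contains the Google no-archive tag."""
    finder = _NoArchiveFinder()
    finder.feed(html)
    finder.close()
    return finder.noarchive


page = '<html><head><META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"><title>Prices</title></head><body></body></html>'
print(has_noarchive(page))
```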
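You can also break visitors out of the cached copy with a short script in the page. A minimal sketch, assuming the address of the cached copy contains the word cache; the address shown is only a placeholder for your own live page:
<script language="JavaScript"><!-- hide
if (location.href.indexOf("cache") != -1)
  location.href = "http://www.yourdomain.com/"; // placeholder: your live page address
// end hide --></script>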
If you're not concerned about users seeing older content or about protecting the copyrights of your pages, then don't bother with these tags. Adding them will neither help nor hurt your search rankings at Google.
For more information on Google's cache feature, go to... http://www.google.com/remove.html#uncache
Google.com was founded in 1998 by Larry Page and Sergey Brin, both Stanford University Computer Science Ph.D students. Originally located in a house with a hot tub in Menlo Park during its early days, Google moved to the new Google-Plex -- an office in downtown Palo Alto -- in early 1999.
Since the early days in 1998, Google has received excellent reviews from the media and generated the interest necessary to get the funding to make them a player. In June of 1999 Google was chosen as the fall-through search engine for Netscape search and is a premier search engine on the Netcenter Portal. This resulted in a gigantic increase in traffic for the new engine, and put them on our radar screen as a search engine with enough traffic to make it worthwhile to work with. Google now receives around 3-4 million searches per day. With that kind of traffic, it's a site you should try to rank highly on.
How does it work?
Google makes extensive use of a feature called PageRank. In simple terms the PageRank algorithm measures the sites that have links to the web page, how popular the sites are that link to the page, the text content and the outgoing links from the page to determine the ranking in the search results. This works very well for popular single and multiple keyword searches, but degrades somewhat for less popular search terms. Such degradation is generally due to the lack of popular referring sites to an obscure page that features the keywords. For example a search for linux accurately lists the major Linux sites, but a search for a small town name will score a page from a HotBot directory higher than the actual official town site because HotBot is more popular. In spite of this idiosyncrasy we believe you'll find Google to be, in general, a very accurate search engine.
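The link-popularity part of this idea can be illustrated with the simplified, textbook form of the PageRank calculation. This is a sketch only and certainly not Google's production system: it looks at links alone (no text content), the damping value of 0.85 is the commonly quoted default, and the page names in the example are made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: page_a is linked from a popular directory,
# page_b from a site that nobody links to.
links = {
    "big_directory": ["page_a"],
    "fan1": ["big_directory"],
    "fan2": ["big_directory"],
    "fan3": ["big_directory"],
    "small_site": ["page_b"],
    "page_a": [],
    "page_b": [],
}
ranks = pagerank(links)
print(ranks["page_a"] > ranks["page_b"])
```

Both pages have exactly one incoming link, yet page_a scores higher because its one link comes from a page that is itself well linked. That is the same effect that lets a page in a popular directory outscore the official town site in the example above.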
Google has a full text index of around 100 million web pages, but actually covers somewhere around 300 million pages due to the way it analyzes links pointing to other pages. You can occasionally see some of the pages that it knows about but hasn't indexed. Those pages are the ones that don't have a title or the Cached link available in the search results.
Page Scoring Details
Search results are based on a calculation of the following factors in order of importance:
1. PageRank: The number of pages from other sites that have links to the page and the popularity of those referring pages. For example, a single link from Yahoo is more important than a large number of links from lesser sites.
2. The keywords in the anchor text of the links pointing to the page. (<a href="http://www.domain.com/products.html">Keywords</a>)
3. The keyword density of the words in the document.
4. The proximity of words to each other in the document.
5. Words are weighted higher if they are found in bold text, use larger font sizes, and/or make use of header tags (<h1>).
To increase page ranking on Google it is essential that you have either a few links from important sites with your keywords in those links, or lots of links from less important pages, also with the keywords in those links. It is important for Google to know about every link there is to your site, so take the time to submit pages from important sites that are linking to your pages. Remember that Google is very slow at indexing pages, so the links pointing at your site must still be in place when Google finally decides to index your page.
By the way, Google doesn't index the meta description or the meta keywords on your pages. Instead it selects text from the document surrounding the search phrase if possible and highlights the keyword in the search results.
When Google does index a page it keeps a copy of what it indexed stored on their server. It's important to make certain that pages with sensitive information carry a disclaimer, such as Prices good till x date. Keep in mind that Google is very slow to update its cached pages. Therefore your old prices could still be showing up for a long time on their server.
Google can be a tough engine to increase your ranking on and will take time and patience to see any results. The more popular your search phrase the more difficult it can be to influence this engine. The primary factor to be aware of is that the links from other sites play a very important role in your scoring. Therefore it is in your best interest to do whatever it takes to get as many quality links to your site as possible.
Northern Light - www.northernlight.com
Indexed Pages: Approx. 110 Million
Frame Support: Northern Light does not appear to support Frame style pages
Meta Tag Support: NO
Accepts multiple <TITLE>s: None found in top searches
Database Refresh: Occasional
Case Sensitive: Yes
Word Stemming: No (a search for tool and tools will give different results)
Average Submission Time: 3 weeks. To submit your site, go to http://www.northernlight.com/docs/register.htm
We are now only providing minor updates to this information on Northern Light. This is due to the low traffic this engine is receiving and the fact that Northern Light seldom indexes new pages. The information below is useful and still up to date if you want to optimize your pages for this engine.
Northern Light shuns robots. Recently Northern Light banned Webposition software from accessing its site.
Northern Light will index:
- The first 12 words (approx. 83 characters) within the <title> tag
- Words outside of the <title> tags but within the <head> and </head> section of a page
- Words within the <body> text
- Words that describe a link: for example <a href="www.yourserver.com"> keyword </a>
Northern Light doesn't index:
- Words in the image alt tag
- Dublin core meta data (The Dublin core meta data is a new set of meta tags that have been gaining popularity in libraries and universities. This tag set is still under development and is seldom used at this time with the large Internet Search Engines.)
- Hidden input form data
- Text within a url link: for example - <a href="http://www.yourserver.com/keyword.html">
- Words in the <meta> http-equiv tag
- Words within <!-- comment --> tags
These tests were conducted using a seldom occurring keyword and placing just one occurrence of the word in different sections of similar pages.
The one page that consistently ranked highest (99%) was one that had the one keyword in the title and the lowest (82%) had the one keyword in the html link. The highest scoring page had the keyword as the 4th word in the title.
The next highest scoring page (92%) had the keyword in the very first position in the title. 3rd highest scoring position was the page with the keyword at the top of the document (73%) and the next one, which tied (73%) had the keyword in a link towards the top of the document.
Please note that these scores are for a search keyword that no other documents in the index were using except for these test pages. If we were testing with a more commonly used word, then the scores would have been MUCH lower.
Of the 20 documents that were indexed, each one also had another unique word in the same body text position. When we did a search for this keyword, all 20 documents ranked 99%, but they were listed in the same order that they were spidered (the page spidered earliest ranked highest). This may not sound very important, but if you are fighting it out for the very top position, that little tidbit may move you from #2 to #1.
Note: It appears that the more documents that a site has indexed in Northern Light the higher up it will rank in the Custom Search Folders on the left hand side of the search results.
Background on Northern Light
Northern Light opened its doors August 12, 1997 and has steadily been building the size of its index, now estimated at 110 million documents. Northern Light has a different business model than the other engines of this size. In addition to providing an index to the web, they also have available over 1,800 magazines, journals, books, newspapers, pamphlets, and newswires from content providers. The Special Collection feature has a small charge, but low enough cost to be of interest to many researchers.
We didn't find any data on how many visits Northern Light is receiving, but with the amount of press that it has been getting, it will only be a matter of time before it starts forging strategic alliances. It's also surprising that they haven't been selling banner ads on the site, which will likely change sometime in the future.
One unique feature that is new to the search engine field is their Custom Search folders, which are dynamically generated during a search. For instance, a search performed on hawaii accommodations generated Custom Folders on several topics such as Bed & Breakfasts, Personal Pages, and our personal favorite www.bestofhawaii.com. Clicking on our entry in the custom search folders, we were presented with a listing of all the pages that we have indexed there, and then it broke down even further, showing that it had another folder for Bed & Breakfast. In other searches we found that the custom folders would generate an index for just the commercial (.com) sites or just the .edu sites. It's an interesting feature that no one has matched yet. It appears to us that a site has to have a number of documents relating to the search term indexed to be able to appear in a Custom Search folder.
The Good News...
Northern Light likes a number of tricks that we have used in the past on other search engines, such as tail tagging and high keyword density. Here's a good example: http://www.hawaiian.net/~marjorie/index.html scores #2 in a search for vacations. This page is using about every trick to increase its keyword density. For instance, keyword spamdexing in the title:
<Title> ( many lines deleted) rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental vacation rental</title>
Note: We don't necessarily recommend loading your titles like this for Northern Light, but we wanted to show you this example to let you know it will not reject your pages for doing it. Northern Light says they don't assign any additional relevancy to meta keywords, but they may be indexing them. We didn't find a lot of pages using loaded titles like this.
Title Tags: Your most important keywords should be included in your Title tags
Meta Tags: Not used. Northern Light says that they make note of meta tags, but don't use them for descriptions or for increased relevancy of the page.
Spam Penalty: None observed
Tail Tagging: was found in several top rated pages.
Keywords: Place relevant keywords frequently throughout the body text of your document.
Hidden Text : Northern Light doesn't penalize for using the same color text as the background.
Excite - www.excite.com
Indexed Pages: Approx. 250 Million - Scheduled to extend to 800 million
Frame Support: Excite DOES support and index text in the <noframes> tag
Meta Tag Support: Meta Description Only
Accepts multiple <TITLE>s: YES
Database Refresh: Weekly
Average Submission Time: Very slow
Keyword Concentrations in <body text>: 3% to 7%
Keyword in URL Significant: Yes
Keyword Order : Not significant ("hawaii travel" and "travel hawaii" are the same)
To register, go to: http://www.excite.com Then, at the very bottom center, click add site - or you can go directly there by entering -- http://www.excite.com/Info/add_url.html Next, add each URL that you wish to register one at a time. You can also choose to enter the optional in the form directly below the web site submission fields.
Tip: after you register, use the back button on your browser to get back to the registration page. Then you can simply copy and paste (from your text editor) the next URL, and so on until you have completed registering all the URL's in your web site.
The latest update does show a marked preference for keywords in the domain name. It doesn't seem to matter whether it's a machine name (i.e., keyword.domain.com) or dash delimited (www.keyword-keyword.com).
In addition Excite continues to show a boost if the site is listed in the (LookSmart) directory. No major changes that we've noticed in the algorithm as most all of our own sites continue to be ranked about as they were before the update.
Link Popularity at Excite
The multitude of sites we are involved with has given us advantageous insight into Excite's ranking system. Because we've been working with sites that focus on very specific, non-popular keyphrases, it's been helping us piece together the factors that are important to Excite's ranking system. Our advantage is that we not only know how our sites are constructed, we also know what other sites are linked to them. This makes it easier to see what is going on.
Link popularity is very important. It's apparent to us that Excite is awarding bonus points in relevancy for sites that are listed in the major directories. Having your site listed in ODP and Yahoo is very important and it's clear that being listed in LookSmart is also very helpful. In fact, domains with a LookSmart listing (Excite Category Match) are often in the first page of the Excite search results. In our opinion, this alone justifies paying the fee for the LookSmart express submission.
In regards to outside pages pointed at your target page, it helps to have text within the links of these pointer pages (but don't call them that, call them information pages). Whenever possible, put your keyword phrase within the links pointing to your target site. Here's an example of an ideal link on outside sites pointed at your target site... <a href="http://www.domain.com"> Keyword phrase </a>.
Excite does still spider pages -- but they only appear to spider the home page. On some of our test sites we've achieved a top 10 result with a single home page that was also listed in the major directories.
Keyword density is much less a factor these days at Excite although it does still matter when all other things are equal, especially when the outside link popularity is similar for the competing sites.
Excite removes per day submission limit. Remember that they can change their policy at any time and without notice, but you should at least be aware that the old 25 page limit is probably no longer valid. Obviously, if you abuse this non-limit policy you risk repercussions. So, proceed cautiously at your own risk until further notice.
Excite adds LookSmart paid submission to add url page
LookSmart must have finally decided to stick with the Express Listing option. We say this because the option is now available on Excite's 'add url' page. Excite submitters are now given the option to submit their url via the Express Listing at LookSmart for $199. Doing so guarantees a review within 48 hours and can lead to a quick listing in an appropriate category.
Excite's categories, historically, didn't seem to be updated often with new site content from LookSmart. Now, with what appears to be some sort of cross promotion between LookSmart and Excite, the time between updates will likely be reduced. We called Excite and talked to Kelly Distefano about the new arrangement. She assured us that sites submitted via the Express Listing service would show up in the Excite category search in approximately one to two weeks after the listing shows up in the LookSmart directory.
- The Web site must NOT already appear in the LookSmart Directory.
- The Web site must be in the English language.
- The site must contain original content as determined by the LookSmart editors and shall not, in LookSmart's determination, consist of: pornography or adults-only sites, gratuitous or graphic violence, material that infringes on or violates someone's rights, material that promotes/disseminates illegal activities, or sites with pornographic advertising.
- The site must contain sufficient pages and recognizable content.
- The Web site must be up and running 7 days a week and 24 hours a day.
- The Web site must not be under construction, all links on the site must be operable and the pages must load quickly.
- For Web sites requiring a password or subscription ID, Applicant shall provide LookSmart a password or subscription ID (valid for at least 60 days) to enable site review.
Excite's categories have been completely re-configured. Searching the category section now returns many more listings from LookSmart. Scoring appears to favor sites that are in a category with the keyword, or a portion of a keyphrase, and that have the keyword in the title of the reviewed web site. Searching the directory for generic or popular single keywords returns category results first, then an alphabetically sorted directory listing.
What will Excite index if I submit an html page that uses frameset tags?
Excite will index the title for the page and any regular text content that is within the page. It will use the meta description from the page for the search results but it will not index it. It will not index img alt text, comment tags, or meta keywords. The spider will index the contents within the <noframes> tags just like it would on a non-frames page. In fact, you can leave the title and meta description off of the top of the page, and move them down to the <noframes> section, and Excite will still use them in the same way.
Excite will NOT index the content from the frames you link to on this page at the same time. However, it MAY come back later and attempt to read the content on the frameset pages -- but it is unlikely to actually list those pages. There is some evidence that Excite does pay attention to these pages in determining the rank of the home page. So if you're trying to score on hawaii then the subpages of your site need to have Hawaii in them, preferably within a <a href> keyword </a> link.
Can I use the <noframes> tag on a page that doesn't use frames to hide text?
Absolutely! Excite doesn't care if the framesource tag isn't there. It will index the page, same as normal... plus any content within the <noframes> tag set. This is a great place to put your links to other pages you want indexed, additional text, etc. just keep in mind that text within this area counts in your overall keyword density.
Does Excite care where on the page I put the <noframes> tag?
No - the tag can be placed anywhere on the page. Just avoid nesting the tags within non-indexable tags, like the meta keyword tag. We've gotten best results by placing it towards the bottom of the page. Just make sure to test the page with your creative html code in several different browser versions to make sure it's compatible.
Keyword Density Recommendations
Tests show that with a keyword density of 4.5% (10 keywords/222 total words), single keyword pages are scoring at about 65% relevancy on Excite. In our test page we put only one keyword in the title, but we didn't start the title with the keyword - we put it in as the third word in the sentence. We recommend putting the keywords anywhere in the title except at the front of the sentence but within the first 71 characters. We put another keyword in the link description (<a href="file.html">keyword</a>) . We also had one keyword in a <h2> tag, and one in a <h3> tag. The remaining 4 keywords were distributed in normal English sentences in the page <body>.
Our best two keyword phrases are scoring around 79%, which is about the best we can do without a directory listing for a domain in Excite at this time. A page ranking at this score needs to have about a 4.5% keyword density on the first word which is the same as the above page description (10 keywords/222 total words) and about a 2.25% (5 Keywords/222 Total Words) density for the second keyword. Both keywords need to be in the title, but not at the front of the sentence as stated before. If your title starts with the keyword it will cost you about 1% we have found.
Document word count = 222
Document character count = 1364
Title word count = 10
Title character count = 67
Meta Keywords word count = 0
Meta Keywords characters count = 0
Meta Description word count = 35
Meta Description character count = 287
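The arithmetic behind these figures is simple enough to check yourself. The Python sketch below reproduces the density numbers from our test page and encodes our title recommendation (keyword present, not the first word, and ending within the first 71 characters). The function names and the sample titles are our own, made up for illustration.

```python
import re


def keyword_density(keyword_count, total_words):
    """Keyword density as a percentage of all words in the document."""
    return 100.0 * keyword_count / total_words


def title_placement_ok(title, keyword, limit=71):
    """True if the keyword's first occurrence is not the first word of the
    title and ends within the first `limit` characters."""
    target = keyword.lower()
    words = [(m.group().lower(), m.end()) for m in re.finditer(r"\S+", title)]
    for index, (word, end) in enumerate(words):
        if word == target:
            return index > 0 and end <= limit
    return False  # keyword is not in the title at all


print(f"{keyword_density(10, 222):.2f}%")   # first keyword: 4.50%
print(f"{keyword_density(5, 222):.2f}%")    # second keyword: 2.25%
print(title_placement_ok("Plan your Hawaii vacation with our island guide", "hawaii"))
print(title_placement_ok("Hawaii vacation guide", "hawaii"))
```

The last two lines show a title that follows the recommendation (keyword as the third word) and one that breaks it by starting with the keyword.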
Excite likes domain names with keywords
During January we were quite successful with Excite, even though it changed its algorithms a couple of times. We've begun utilizing domains that have keywords inside of them as much as possible, and the results have been very encouraging. Example: http://www.keyword-keyword.com
We tried a new test to see if Excite would index text in the <style> sheet tags. We have our results now and we've proven it does. The <style> tag is used normally within the <head></head> tags.
Here's an example of the code for our test:
<head><title>Style sheet example</title>
<meta name=description content="Place your description here or Excite will likely make it from your text in the style tags">
<style> Place your keyword text here </style>
</head>
Note that text in this area won't show up in the browser window as viewable text, so this is even better than hiding text with the same color font as the background. Text placed within the style tag appears to be equivalent to having the same text in the body of the page. However, using these tags allows you to place your keywords up very close to the top of the document. Will this crash a browser? Not that we can tell; we've tried it in Netscape 3 and 4, and Internet Explorer 4 and 5 without problems -- but remember, this isn't proper html code. We would suggest testing your pages before you submit to ensure they look right to a site visitor.
By the way, this tag has also been confirmed to work on AltaVista and it may also work on other engines as well. Just keep in mind that only Excite and AltaVista have been proven to index text within this tag set so far.
In order to score highly and stay well positioned in Excite throughout the month, attempt to get your site listed in one of the reviewed directories (Excite, WebCrawler or Magellan). Getting your site listed in one of these review directories can save you a lot of time and trouble. If that doesn't work out then let's take a look at what else can be done.
Excite favors root directories (home pages with upper level domain URL's). Frequently it drops subdirectory pages from its index. You can work at getting those subpages listed, but you will have more luck by focusing on your root directory pages. So what do you do if you are working on multiple keywords? Get more upper level domains. That's the key lately to having your pages stick in the Excite index. Since Excite prefers to index only the root URL we've gone to a strategy that incorporates having several upper level domains for the same company, topic, service or product. Yes, it is more expensive - but the traffic at Excite, WebCrawler, AOL and Netscape is well worth it in most cases. However, before you go to this expense try getting into their reviewed directories.
Is Excite ignoring subdirectory page submissions?
Excite is only accepting upper level domain (root domain) submissions for indexing. We now have an official confirmation that this is indeed true. Excite not only mentions this on their How to Get Listed page -- http://www.excite.com/Info/listing.html, they recently sent out the following email to one of our subscribers...
Thank you for your interest in Excite. If you are trying to submit your site for inclusion in the Excite index of websites please only submit the top-level domain name, otherwise know as the root domain. That would be for example: www.yoursite.com
Unfortunately we will not accept subdirectory pages through the submission process. At this point we are trying to keep our index to a minimum so we can test new systems and searching techniques. If you submit your site and it meets our guidelines (please see: http://www.excite.com/Info/listing.html) then it should get indexed in approximately two to three weeks. At that point there is a possibility that another of our spiders may come across your site and index subsequent pages, but there is no guarantee of that.
If you would like to submit your site please go to our Add URL link on the Excite home page. On the following link please be sure to type in the full URL for your site (i.e. http://www.yoursite.com/ ). If you have any other questions please let me know. Thank you again for your interest in Excite!
Technical Support Engineer
Frankly, we are not surprised. In fact, we expect that other search engines may even follow this pattern in one form or another. So what can you do?
First, make sure that your website starts from your own upper level domain. The days when sub-domains (www.domain.com/~yourdomain) receive equal treatment with upper level domains (www.domain.com) are coming to a quick end.
Second, put links on your main page to all of the other pages within your site. When the Excite spider visits your main page it will find, and follow, the links to your other pages. These are your subdirectory pages that Excite will not accept when you submit them -- but it will most likely index them once the spider finds them.
At this time, we advise you to follow the suggestions listed in these articles and don't waste your time submitting subdirectory pages to Excite. And, if you don't yet have an upper level domain for your commercial web site, then now is the time to get one.
Excite apparently uses two types of spiders.
- One is devoted to indexing pages that are manually submitted. In most cases this is the upper level (root) domain page, as these are the only pages Excite is currently accepting for submission.
- The second is a spider that looks for and finds pages from links on other pages. It doesn't appear to have a quantity limitation although it may have an age limit. This second spider is the one that we recommend that you systematically place pages in its path in your attempts to get your subdirectory sites listed.
Our tests show that pages found by the second spider (the link finding spider) tend to stay in the index for only a set amount of time -- somewhere around two months. After that Excite appears to dump them out. This explains the disappearing pages that you may have noticed the past several months. It is also interesting to note that these disappearing pages have a tendency to show back up when the link spider finds them again. The cycle causes about a 15 to 30 day period when these found pages are gone from the index.
Therefore, your best strategy is to keep an eye on the timing of the spider's activities so you can schedule new pages to appear at the appropriate intervals. It can also greatly help to use something like WebPosition to identify dates when your pages are indexed. That way you can design an alternative set of pointer link pages that can be placed in the path of the crawling spider at about the same time the older ones are due to fall out. By using this kind of alternating approach, you should experience good success in keeping your pages indexed on Excite.
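The timing described above can be roughed out with a few lines of Python. This is only a sketch: the 60 day lifetime is our reading of the "around two months" figure, the 14 day lead time is an arbitrary assumption, and the function names are made up for illustration.

```python
from datetime import date, timedelta

# Assumptions for illustration only: pages found by the link spider
# live about 60 days, and we want replacements in place 14 days early.
LIFETIME_DAYS = 60
LEAD_DAYS = 14

def expected_drop(indexed_on, lifetime_days=LIFETIME_DAYS):
    """Estimate the date a spidered page is likely to fall out of the index."""
    return indexed_on + timedelta(days=lifetime_days)

def replacement_date(indexed_on, lifetime_days=LIFETIME_DAYS, lead_days=LEAD_DAYS):
    """Estimate when the alternate set of pointer pages should go live."""
    return expected_drop(indexed_on, lifetime_days) - timedelta(days=lead_days)

if __name__ == "__main__":
    indexed = date(2000, 1, 1)
    print("Likely to drop around:", expected_drop(indexed))
    print("Post alternates by:   ", replacement_date(indexed))
```

Feed it the index dates you collect from your own tracking and adjust the two constants as your observations of the spider's cycle improve.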
The Invisible Link Technique
Since Excite is no longer accepting subdirectory submissions, you might want to try putting invisible links on your root page -- linking to all of your other pages on your web site -- in other words, all of your subdirectory pages.
By invisible we mean, links that a visitor will not see but the Excite spider will see -- and follow ...so the additional pages can be added to the Excite index.
The trick is simple and consists of adding a transparent .gif that is 1 x 1 pixels in size on your page. Each image becomes a link to another page that you want the spider to visit and index -- for example: <a href="pagetoindex.html"><img src="images/cleardot.gif" width="1" height="1" border="0"></a>
You can put as many of these on your page as you want. They are invisible to the site visitor but the spider will see them. By doing so you significantly increase the chances that the rest of your web site will make it into the Excite index.
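If you have more than a handful of pages, typing these links by hand gets tedious. The following Python sketch builds the same kind of 1 x 1 pixel link for each page in a list. The function names and the sample page paths are hypothetical; the cleardot.gif path simply follows the example above.

```python
from html import escape

def invisible_link(page_url, dot_src="images/cleardot.gif"):
    """Return one anchor tag wrapping a 1 x 1 transparent image."""
    return (
        '<a href="' + escape(page_url, quote=True) + '">'
        '<img src="' + escape(dot_src, quote=True) + '" '
        'width="1" height="1" border="0"></a>'
    )

def invisible_links(page_urls):
    """Return one link per page, each on its own line."""
    return "\n".join(invisible_link(url) for url in page_urls)

if __name__ == "__main__":
    # Hypothetical subdirectory pages, used only as sample input.
    print(invisible_links(["products/widgets.html", "about/contact.html"]))
```

Paste the output into the body of your root page wherever the links will not disturb the layout.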
How to get your web pages to Stick in Excite
We have discovered that a two step approach helps to keep many more pages in the index, including our pages in subdirectories. The first step involves making sure that the pages you want to keep indexed have links pointing to them from a home page on another site.
We originally set up a series of links in an effort to get more pages indexed by the crawler. After a few months from when we implemented this, we discovered to our surprise that these pages stayed in the index, when others tended to drop out after one or two months.
What we did is put a hidden link to get Excite's crawler to find pages on its own, without us having to submit them and run the risk of violating the 25 page submission limit that Excite claims. This hidden link on the home page of a different web site led to a list of links pointing to our pages that we were attempting to get indexed by Excite. The pages were nothing more than a simple html page that had links with descriptions.
Note: There is a reported 25 page submission limit, but Excite may discover many more pages on its own and index them. This is not violating any of their rules.
The second thing we did is make sure that these pages would always be fresh when Excite crawled them. There are two ways you can do this, by updating the pages on a monthly basis (we would suggest around the first of the month) or by putting some dynamic content within them.
The way we did it was to get a CGI script that returns the server date and time every time the page is loaded. It's important that this text is indexable by the robot. It won't work with a JavaScript clock, or other client side methods of generating this changing information. Other methods that you could use would be a banner or image rotation script, or a random quote script. Regardless of what method you use, the pages should appear fresh each time the spider checks them.
We placed our clock text within the title tag of the web page so we could also tell on what date and time the engine indexed the page. If you're interested in adding the Perl time and date script to your pages, a free copy is available from Matt's script archive at http://worldwidemart.com/scripts/textclock.shtml -- but your web server has to support server side includes (SSI) to use this script.
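To show the idea of a server side clock, here is a minimal Python sketch. It is not the Perl script mentioned above, just a stand-in that produces the same kind of output: plain text generated on the server, so the spider sees it. The function names and the timestamp format are our own assumptions.

```python
from datetime import datetime

def clock_text(now=None):
    """Return the server date and time as plain, indexable text."""
    if now is None:
        now = datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S")

def stamped_title(page_title, now=None):
    """Build a title tag that carries the time the page was served."""
    return "<title>" + page_title + " - " + clock_text(now) + "</title>"

if __name__ == "__main__":
    # Run as a CGI-style program: a header, a blank line, then the text.
    print("Content-Type: text/html")
    print()
    print(clock_text())
```

Because the text is produced before the page leaves the server, the spider receives a different page on each visit, which is the whole point of the technique.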
Excite doesn't keep all pages
Excite has been limiting the size of its database. They appear to be listing few pages from each individual domain in its index. In fact, very large sites like web.mit.edu have only a handful of pages indexed, where AltaVista has thousands of pages indexed from the same domain. Excite itself only shows about 18 links to its own site, www.excite.com. These facts tell us some very important things about Excite.
- Excite treats each different machine name as a separate domain. For example, web.mit.edu and www.mit.edu can each have as many as 25 pages listed... BUT each machine needs a different IP address to do this.
- It's very important that you choose the pages to submit wisely. Don't make up a hundred pages, submit them all and expect some of them to rank well. Do your planning in advance and work with the same handful of pages.
- The maximum number of pages that we have personally been able to get indexed is 26 from the same domain name. All of these pages are focused on a very similar topic and utilize the hidden pixel link technique.
Excite appears to keep pages in its index better if they are updated often. Here's a tip, put something on the page, like a quote for the day, or a time and date feature so that the page appears new any time Excite spiders the page. This may help keep your pages in the index.
Submitting your site to a channel
Excite claims they don't accept submissions for their web directories. However, some companies have experienced success by sending a request through the form at http://www.excite.com/comments. It will undoubtedly help if your product or service fills a vacancy that Excite hasn't yet covered in the channels. It may also help if what you are offering can enhance Excite's image. Let's face it, a sloppy site won't cut it here; they are looking for best of the web types of sites.
We conducted a few simple tests to check the differences in scores for a single keyword placed in different places of the document. While this isn't rocket science, it may assist you in getting that extra 1% that you need to better your competition. Here's what we learned:
- The highest scoring position (62%) was with the single keyword in the title of the page, but NOT at the front of the title text, this word was the third word in the title! This fact also holds true for two keywords. If one is in the body and one in the title, not having a keyword at the start of the title text, but occurring as the third or fourth position does increase the score slightly.
- The next highest scoring position (61%) was with the single keyword in the title, at the front of the title text.
- The third highest scoring page (57%) had the keyword embedded at the very top of the document.
- The fourth document, which also scored at 57%, had the keyword in a link. For example: <a href="http://www.server.com/page.html"> KEYWORD </a>
During some testing we also learned that Excite only recognizes the first 71 characters in the title tag. However, we have also noticed that the multiple title tag technique does seem to be working very well in some cases. We have also confirmed that Excite doesn't index the alt text, meta tags, or the <!-- comments --> tag. It will index information that is outside of the regular <html> tags, for instance:
Keywords do get indexed here, before the html tags begin.
<title> Your page title</title>
The number of outside web sites that point to your domain determines Link Popularity. The more outside sites that link to your domain, the better. (Outside sites are sites that do not have the same root URL.) For this reason NASA's sites score incredibly high on Excite -- even with web pages that ordinarily would not score that high based on the design of the pages themselves. The fact is, the NASA web pages have thousands of sites that have links pointing in their direction. If you are having trouble figuring out why a certain page is scoring high on Excite, go to AltaVista and see how many other sites have links to that particular domain by entering: link:www.the-server-name-here.com Excite isn't influenced much by this, but it's going to help you on many other engines, some to a much greater extent, such as Google.
Keyword in the URL and/or domain name
Pages that have the keyword within their url score much higher than the same page without the keyword in the URL. Whenever possible, use your keywords as names for your subpages and/or directory. You also might keep in mind that words should be broken up to be recognized by Excite. The url http://www.yourserver.com/discounttickets.html would score higher if it were http://www.yourserver.com/discount/tickets.html because Excite will consider the slash (or a dash) as a delimiting character, see the separate keywords, and index them accordingly. This effect is even more pronounced when used in the domain name. The domain name http://www.discount-tickets.com likely will score even higher.
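To see which words a URL exposes when slashes and dashes act as delimiters, you can use a short Python sketch like the one below. We do not know Excite's actual parsing rules; treating the dot as a delimiter as well is purely our assumption, and the function name is made up.

```python
import re
from urllib.parse import urlparse

def url_words(url):
    """Split the host and path of a URL into lowercase words.

    Slash and dash are the delimiters discussed in the text; the dot
    is included as an assumption so file extensions separate cleanly.
    """
    parts = urlparse(url)
    combined = (parts.netloc + parts.path).lower()
    return [word for word in re.split(r"[/\-.]+", combined) if word]

if __name__ == "__main__":
    for url in (
        "http://www.yourserver.com/discounttickets.html",
        "http://www.yourserver.com/discount/tickets.html",
        "http://www.discount-tickets.com",
    ):
        print(url, "->", url_words(url))
```

In the first URL the two keywords run together as one word, while the second and third expose discount and tickets separately.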
We are currently studying two-word keyphrase search results on Excite and we will be publishing the results in future book revisions and the newsletter -- the timing depends upon how long it takes Excite to index our test pages. At this time it appears that Excite reacts differently for multiple keyword searches. We've been studying 40 pages that are in the index for the phrase custom software and have learned some things that can help get your pages ranked higher. Here are some tips...
- The two keywords do not have to be close together, like on other engines.
- Shorter pages, in general, do better -- you will very seldom see large 10Kb file size pages scoring high for two keyword searches on Excite. Most are around 1-3Kb of text.
- Don't use the exact same number of each keyword -- one keyword should appear in the text more than the other should.
- Combined keyword density should be between 7% and 10%. Calculate it as (total occurrences of keyword 1 + total occurrences of keyword 2) / total words in visible text x 100.
- Individual keyword density -- you should aim for around 6-7% keyword density. If you are working on two keywords, give one keyword a slightly higher density than the other; they shouldn't have the same density.
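The density formula in the tips above is easy to check with a few lines of Python. This is a sketch under our own assumptions: words are counted with a simple pattern of letters, digits and apostrophes, and you are expected to pass in only the visible text of the page. The function names are hypothetical.

```python
import re

def visible_words(text):
    """Split visible page text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_density(text, keyword):
    """Percentage of the words in the text that are the given keyword."""
    words = visible_words(text)
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)

def combined_density(text, keyword1, keyword2):
    """(count of keyword 1 + count of keyword 2) / total words x 100."""
    words = visible_words(text)
    if not words:
        return 0.0
    hits = words.count(keyword1.lower()) + words.count(keyword2.lower())
    return 100.0 * hits / len(words)

if __name__ == "__main__":
    # Artificial sample: 4 x custom, 3 x software, 93 filler words.
    sample = " ".join(["custom"] * 4 + ["software"] * 3 + ["filler"] * 93)
    print(keyword_density(sample, "custom"))
    print(keyword_density(sample, "software"))
    print(combined_density(sample, "custom", "software"))
```

Run it against the visible text of your own page and compare the numbers with the ranges given in the tips.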
Excite, unlike most of the other search engines, does not require your keyword(s) in your <title>. This makes it easier to score under multiple keywords.
Excite definitely gives preference to pages that have the keyword in the url. For example, if your keyword is hawaii, then you will get a bonus if you use it in your url like this http://www.hawaii.com/hawaii/hawaii-tours.html
We have also seen multiple <title> tags used on pages that had virtually no text in the page at all, and those pages did very well. Don't start your title text line with the keyword as the first word; we found it works better if it's the third or fourth word in the text.
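A quick way to check a title against both the third or fourth word tip and the 71 character limit reported earlier in this section is a small Python sketch like this one. The function name and the report layout are our own invention, and the word splitting is deliberately simple.

```python
import re

TITLE_LIMIT = 71  # characters Excite reportedly recognizes in a title

def title_report(title, keyword, limit=TITLE_LIMIT):
    """Report where the keyword falls within the recognized part of a title."""
    recognized = title[:limit]
    words = re.findall(r"[a-z0-9]+", recognized.lower())
    kw = keyword.lower()
    position = words.index(kw) + 1 if kw in words else None
    return {
        "recognized": recognized,         # text inside the limit
        "truncated": len(title) > limit,  # True if anything was cut off
        "position": position,             # 1-based word position, or None
    }

if __name__ == "__main__":
    print(title_report("Affordable island hawaii tours and travel", "hawaii"))
```

A position of 3 or 4 matches the tip above; a position of None means the keyword is either missing or falls beyond the limit.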
Excite does not index or use your web page's <META> keyword tag. Excite will simply ignore the Meta's keyword and index your page(s) according to the Excite criteria.
Most importantly, Excite is looking for keywords in the viewable portion of your web page! Put your page creation efforts into the following areas.
Keywords in Links
Keywords in Headlines
Keywords in Body text
Keywords in Title
Another Tip: Excite understands synonyms of your keywords, so make use of them to avoid spamdexing penalties.
Additional Insight into Excite
According to Excite, their SE attaches decreasing weight to each repetition of a word, so it doesn't do much good to repeat the keyword over and over anywhere in the body text... and, according to Alex Cunningham (Technical Support Lead -- Alex@excite.com). "We take the full text of the document (including hidden text, but not including META information) and we do an analysis of the content. For every case of word stuffing that we can find we decrease the relevancy for that word inversely. In other words, the more you stuff a word or concept, the least likely your page will come up high in the search results when searching for that. Although occasionally we miss some abnormal cases of word stuffing (which may always be true, but we are working to prevent), it does help to level the playing field."
When asked if stuffing meant repeating words (like real estate, real estate, real estate...) or simply reusing a word throughout a page, Alex responded, "...repetition like in the first example (repeating words). Of course we try to look for all kinds of repetitious abuse, but like we said before some will inevitably slip by. This is primarily due to the fact that we do not want to suppress good pages that we thought were spammed, when they truly were not. We sway to the side of the spammer since we would rather have a few stuffed pages in the top results than to suppress good results that were not. Being redundant (although possibly annoying to read) should be fine, where being repetitive is not."
According to Excite, you should design your page to be theme oriented -- and the search results that we researched indicate that Excite has found a way to enforce their wishes. Here is the general rule to follow with Excite. If the web site visitor cannot see the text, then Excite ignores it. This includes META's, comments, and alt text.
As we mentioned before, Excite is also scoring pages high that are mostly listings and links.
Indexed Pages: Approx. 2 Million
Frame Support: WebCrawler does index the text in the <noframes> tag
Meta Tag Support: None at this time
Database Refresh: Monthly
Average Submission Time: 6-12 weeks
To submit your site to WebCrawler directly, go to: http://www.webcrawler.com/info/add_url/
Note: This will have the same effect as registering your site with Excite on their addurl page (http://www.excite.com/Info/add_url.html). If you submit your site to Excite, you don't need to submit it to WebCrawler; they use the same database.
WebCrawler is owned by Excite and is obviously using very similar ranking criteria. On many searches we've found that the result listings for both single and multiple keyword searches are almost exactly the same as the results for Excite. The index, however, seems to be a smaller copy of the larger one at Excite; many pages that show up in the Excite index don't show up in the WebCrawler index. So, in general, you are dealing with almost the same ranking system as Excite but with a much smaller index of pages. We do know that WebCrawler is using Excite's spider to index the pages, or it is sharing the same database information.
Submitting to WebCrawler
Submit your main URL and those of the main sections of your site. WebCrawler strives to present users with the greatest possible breadth of search results listings, rather than trying to index every document of every Web site we know about. Unlike some others, our robot spider doesn't necessarily dig down to index every page on your site, although it does traverse links recursively to save them for future exploration.
We strongly encourage you to submit your main URL (also known as a site's root or index URL) to us, as well as your site's main subsidiary pages, and not every single document of your site, especially if it is a matter of hundreds or even of thousands of URLs. Please do not submit more than 25 URLs for any given server. Cases in which WebCrawler receives disproportionate numbers of submissions for the same Web site will be subject to automated and/or human review, and will be processed as WebCrawler sees fit. http://webcrawler.com/Help/GetListed/HelpAddURL.html
While we don't know where this magic number of 25 came from, we have been hearing rumors that Excite was using some kind of 25 per day rule. Since the change to WebCrawler was made earlier this year, this appears to apply to both Excite and WebCrawler. We have seen no penalties for submitting more than 25 per day, but to avoid human inspection you should avoid submitting more than that. You will likely find that WebCrawler is even more difficult than Excite when it comes to getting your pages indexed. WebCrawler prefers finding documents that are the root of the main site or the index.html page in a subdirectory, for example: http://www.server.com and http://www.server.com/keyword.
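If you do have more than 25 URLs to submit, one simple way to stay under the rumored limit is to split your list into daily batches. The Python sketch below does only that; it does not submit anything, the 25 per day figure is the unconfirmed number discussed above, and the sample URLs are placeholders.

```python
def daily_batches(urls, per_day=25):
    """Split a list of URLs into consecutive groups of at most per_day."""
    if per_day < 1:
        raise ValueError("per_day must be at least 1")
    return [urls[i:i + per_day] for i in range(0, len(urls), per_day)]

if __name__ == "__main__":
    # Placeholder URLs, used only to show the batching.
    pages = ["http://www.server.com/page%d.html" % n for n in range(1, 61)]
    for day, batch in enumerate(daily_batches(pages), start=1):
        print("Day", day, "-", len(batch), "URLs")
```

Put your root URL and main section pages in the first batch, since those are the pages WebCrawler says it prefers.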
Indexed Pages: Millions
Meta Tag Support: N/A
Accepts multiple <TITLE>s: N/A
Database Refresh: constant
Average Submission Time: 2 weeks/7 days or less when using business express service.
If not registered in 2 weeks after using the free registration, contact them at firstname.lastname@example.org (make sure and use the original email address used when submitting and list the date when you submitted your first request)
Yahoo released their Sponsored Sites program which is a paid listing program. Sites that are already in the Yahoo business directory can pay a flat fee that is somewhere between $25 and $3000 per month to be listed as a Sponsored Site at the top of the directory listings. Some categories are significantly higher than this. Up to 5 sites show within a category at a time, but we didn't see any limits on how many sites might be allowed into the program for each category. All of the sites are supposed to rotate, with no more than 5 to appear at a time.
Is this a good deal? Maybe. But we're not thrilled about the fact that you don't have any control over what the sponsored listing says. Right now, Yahoo is using the same title and description as the site's standard Yahoo listing. As you may already know, oftentimes Yahoo's selection for your title and description leaves a LOT to be desired. Paying big bucks yet having no editorial license over your own site's description seems unusually restrictive.
Anyway, if you have a deep pocket full of money to test it, then go ahead and give it a try (then be sure to call us and let us know how it worked out). BUT, be aware that it may be difficult to track the traffic coming from your sponsored listing. By that we mean that you may not be able to tell it apart from incoming traffic generated by your standard listing because Yahoo doesn't appear to offer any special tracking URLs or other options that would allow you to measure results.
They do, however, use a different redirection system for the sponsored sites, so you might be able to find different referrer information in your logs -- at least, that's what we'd recommend watching for.
Our opinion is that the Most Popular listings will get more traffic than the Sponsored Sites. It's a good bet this program will go bust unless they change it to allow advertisers control over how their listings are described and how they're presented on the pages. They also need to enable tracking. http://sponsoredsites.yahoo.com/ http://help.yahoo.com/help/us/sponsored/
Yahoo - No more freebies for commercial sites
Yahoo discontinued accepting free submissions for commercial sites. Based on the success (and public acceptance) of other directories' paid submission programs -- namely LookSmart and Goto -- as well as their own Business Express submission program, they obviously felt the time was right to forge ahead and entirely drop the free listings option.
Frankly, it's no big change. They were, in all practicality, ignoring free submits anyway. They will, however, continue to accept some listings in noncommercial categories of the directory.
Specifically, the two directory areas that will only accept paid submits now are:
Even if the category that you want to submit to isn't one of the above, we still suggest using the Business Express service to make sure your site gets listed while you're still alive.
In a related news story at MSNBC, Srinija Srinivasan, the vice-president in charge of the directory, explains that Yahoo is also considering paid ranking similar to the Goto model. Read more of her comments at http://www.msnbc.com/news/491082.asp
More details and FAQ's can be found at http://docs.yahoo.com/info/suggest/faq.html
Yahoo directory search results show popularity
Yahoo started tracking clicks on web sites during August to get data to drive their Popular Sites results. We suspect that the tracking system has been in place for some time during testing and wasn't fully implemented until recently. Your site logs will now start showing referrals from srd.yahoo.com, which is the click tracking domain, instead of www.yahoo.com. It now appears the primary method Yahoo is using to track popularity for both the categories and Popular Sites is srd.yahoo.com
Remember now that it's important to notice how Yahoo is sorting the sites for the directories. It shows you which category is pulling the most traffic for your search keywords on that site. You may also notice that Yahoo is currently having some problems with this new system -- most notably, they are frequently timing out during a search. We believe they're still having problems accommodating the major load of day-to-day traffic the site routinely receives.
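To see how much of your Yahoo traffic now arrives through the click tracking domain, you can tally the referrer hosts from your logs. The Python sketch below assumes you have already pulled the referrer URLs out of your log files into a list; the function name and the sample referrers are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

def yahoo_referrals(referrers):
    """Count referrals per Yahoo host (srd.yahoo.com, www.yahoo.com, ...)."""
    counts = Counter()
    for ref in referrers:
        host = urlparse(ref).hostname or ""
        if host == "yahoo.com" or host.endswith(".yahoo.com"):
            counts[host] += 1
    return counts

if __name__ == "__main__":
    # Invented sample referrers; real ones come from your own server logs.
    sample = [
        "http://srd.yahoo.com/example/one",
        "http://www.yahoo.com/Business_and_Economy/",
        "http://srd.yahoo.com/example/two",
        "http://www.altavista.com/cgi-bin/query?q=widgets",
        "-",
    ]
    for host, hits in yahoo_referrals(sample).most_common():
        print(host, hits)
```

Comparing the srd.yahoo.com count with the www.yahoo.com count over time shows how quickly the new tracking system is taking over your referrals.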
Business Express Opened up
The Yahoo Business Express submission service is now available in all categories and in the other country specific areas as well. Highly recommended.
Probably the best (and certainly the quickest) way to get into the Yahoo directory is to use Yahoo!'s Business Express Service, which offers guaranteed consideration of your commercial website within seven (7) business days. This program requires a $199.00 USD one-time, non-refundable processing fee per submission.
However, if you are submitting a site offering adult content and/or services, the fee for the Business Express Service is $600.00 USD. All adult sites must be submitted to the most appropriate category under Business and Economy/Shopping and Services/Sex
Recently, Yahoo also changed their Business Express paid submission feature to require customers to register for a free Yahoo account before they can submit their web sites. Looks like someone in the marketing department is salivating over the opt-in marketing potential of site submitters.
Regardless, we feel that Yahoo is pushing it by making you fill in information twice. After all, every additional form (a.k.a. roadblock) a customer has to hurdle increases the odds they will lose the sale. Ordinarily, the Business Express submission process, with its hoops to jump through, would be a good example of what not to do. Typically, requiring so much information will turn a hot-to-purchase customer off.
However, a Yahoo listing is a must for any company trying to establish an Internet presence. So, we guess they can attach unusual strings in the process. Our belief is that if they were trying to sell anything other than a Yahoo listing this process would be costing them. It still may, time will tell.
Business Express Service
Yahoo offers a Yahoo! Business Express in their small business section. The new service guarantees that in 7 days they will review your site and either accept or reject the submission to the Yahoo directory. They also guarantee they'll notify you by email when they've made their decision. And, if your site was declined, they will tell you why they didn't like it.
Their acceptance criteria, found on their agreement page, shed light on what kind of site makes it into the Yahoo! directory. We've reviewed the statement to highlight certain critical issues that likely affect your standard free submissions as well. From http://www.yahoo.com/info/suggest/terms.html
Yahoo recently removed the requirement for a web site to have real time credit card processing for the $199 Business Express Service. The new minimal requirements are as follows:
- The site must be a commercial web site based in the United States with the official business name visible to any visitor to that site.
Editor's note: Put your business name and contact data on the bottom of every page.
- This site must not already appear in the Directory.
- The site must support multiple browsers and capabilities (e.g. no Java only sites).
- The site must not contain any parts under construction and all links on the site must work as well as link to relevant content.
- The site must contain substantively unique content, as determined by Yahoo!'s editorial staff.
- The site must clearly define the purpose, products, and/or services of the business.
- The site must be up and running 24 hours a day, 7 days a week.
- The site must not contain any content, products, services or other information that, in Yahoo!'s reasonable determination, may be illegal to sell under any applicable law, statute, ordinance or regulation, that may infringe or violate anyone's rights, or that, Yahoo! believes, in its sole discretion, is inflammatory, offensive, or otherwise inconsistent with the spirit of Yahoo! Business Express.
These criteria are minimum requirements only, and Yahoo!, in its sole discretion, may consider other criteria before accepting or rejecting a site.
- Use an SSL (secure sockets layer) web server for the credit card forms and or shopping cart
- Provide logos for your site's (or your web hosting site's) Verisign or Thawte server certificates
- Use logos and links to sites like Public Eye (http://www.thepubliceye.com), The Better Business Bureau (http://www.bbb.com) and Truste (http://www.truste.org)
- Place links to your company's privacy policies, return policies, warranties, guarantees etc. in plain view
- Make sure your company's contact info (phone, email, mailing address, etc.) is available on the site.
- Test your site with Netscape 4.7 -- that's apparently what they're currently using to view your site.
- Check your HTML code and correct errors. One of the favorite tests that Yahoo uses is Weblint, available at http://www.cen.uiuc.edu/cgi-bin/weblint
- Be sure to check the links on your site. They must all be working. Yahoo hates bad links or broken images.
- Yahoo will likely click on your Purchase or Order links. They are probably looking to see if you are an affiliate of a company or the actual company that sells the product. In most cases Yahoo won't list affiliate sites, especially if that content is already in their index.
- Be aware that Yahoo will check the whois information for your domain name. The company name that is listed as the owner is what they look for and tend to use as the title of your listing -- regardless of what you request. From a scoring perspective, it's important to get your keywords in your title whenever possible so be sure that you have this information listed in a way that will not cause you disappointment.
Is it worth it?
That depends on how much traffic you are willing to lose. If your site makes money from listings on the other engines, YES it's worth it. In fact this may be one of the best Net advertising opportunities available! At the very least it will save you time waiting to get listed and wondering why they didn't add your site. If you follow our Yahoo guidelines and combine them with the $199 submission you can save months of time and receive traffic that will buy your products or services. A Yahoo listing doesn't normally drop in and out like your listings in search engines and it's a good source of quality customers.
Web Page Searches
Please see the Google section of the book for information regarding the extended search results on Yahoo. These are the results you get when Yahoo doesn't find a match in its directory or when you click on the link web pages.
Directory Registration Process
The registration process with Yahoo is complicated and time consuming. However, you MUST register with Yahoo if you intend to maintain a serious presence on the World Wide Web. You need to know that Yahoo will not accept your site registration if you initiate the submission from the wrong page on their site. You must first go to the category page that you wish to be listed in before you click the Suggest a site link found at the very bottom of each registerable category. Below we will show you how so pay VERY close attention and save yourself multiple Yahoo headaches.
First Step: You must understand how Yahoo operates
Yahoo is not a keyword driven search engine. Yahoo is a CATALOG!!! Yahoo does not list web sites with any regard to the keyword content of your web page. Fact is, Yahoo does not even index your page at all -- they only index the name of your company and the brief site description that you give them at the time that you submit.
Even so, getting listed well on Yahoo is actually darn easy... provided you know how to do it. Before you make any attempt to master the Yahoo positioning puzzle, you must throw out all of the search engine rules and learn exactly where Yahoo is unique and how it works. Ready? ...here goes.
The first thing you must realize is that, in a keyword or keyphrase search on Yahoo one of four things will happen...
- Your search will match one of Yahoo's categories and you will get Yahoo results.
- Your search will match a company (or companies) and/or their description(s). You will then get those companies' categorical results.
- Your search will match BOTH category AND company results (both #1 & #2 listed above)
- Your search will match NEITHER category NOR company results -- and your keyword search will default to the Google search engine... wherein if you are listed well for Google, then you will appear in the Yahoo keyword search.
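The four outcomes above can be summed up in a short Python sketch. The function name and the result labels are ours, made up for illustration; Yahoo publishes no such logic, and the two flags simply stand in for whatever matching Yahoo really performs.

```python
def yahoo_search_outcome(matches_category: bool, matches_company: bool) -> str:
    """Return which kind of results a Yahoo keyword search would show.

    Illustrative only: the flags represent a match against Yahoo's
    category names and against its company names/descriptions.
    """
    if matches_category and matches_company:
        return "category and company results"
    if matches_category:
        return "category results"
    if matches_company:
        return "company results"
    # Nothing in the directory matched, so the search falls through
    # to the partner search engine's web page index.
    return "fallback web page results"


# Walk through all four combinations.
for category in (True, False):
    for company in (True, False):
        print(category, company, "->", yahoo_search_outcome(category, company))
```

Only the last case depends on how well your pages rank in the fallback engine; the first three depend entirely on what is in the Yahoo catalog.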
It is important to grasp that consumers using the Yahoo catalog do not generally understand exactly how Yahoo really works. Many people do keyword searches in the Yahoo catalog without taking any particular notice of how Yahoo returns results. Frequently their search yields no Yahoo results -- and they find themselves in the Google index. Guess what... most of them do not even notice they are in the Google index -- they just go on with what the search delivered without a clue they are actually looking at Google instead of Yahoo.
Ok, so what strategy works best with Yahoo? Surprisingly the best working strategy with Yahoo is pretty sure fire PROVIDED THAT you understand how Yahoo works. Here are some more insights you must grasp before you submit any pages to Yahoo.
- Most web pages that are optimized for the search engines will NOT catalog well on Yahoo. In fact, Yahoo is likely to unceremoniously reject them -- and without notice. Our sources tell us that Yahoo rejects approximately 50% of all of the pages that are submitted.
- Yahoo does NOT index the contents of your page and they don't give a hoot about keywords. They only index the name of your company and the brief description you give them at the time you submit. If you add hype into your description they reject your submission!
- They practically demand the TITLE of your page to be your company name... again, without hype! If your page looks like you are trying to commercialize their index they will reject your submission. If your company name does not match the title of your page, they will substitute your company name for the title... that is, if they accept your submission at all.
- If your page has any tricks they will reject it. No spamdexing, keyword stuffing, redirects, phantom pixels, food scripts, multiple submissions, alphabetical tricks, nothing!!! Your page must look very straightforward, with the name of your company presented up front... and it must match your TITLE tag. It also helps considerably if the page (site) has what Yahoo calls valuable content. The Yahooligans are getting very snobbish about what goes into their wonderful catalog.
This, of course, means that no one will find you unless your company name happens to include one of the keywords/phrases that people might search for you under. This is bad news for most companies, unless of course, people are searching for you by company name. Only then are you very likely to appear on top. The problem is that very few people are going to look for PXY, Corp. if they happen to be searching for the tape management backup systems that you happen to sell.
Lastly, they prefer that you have your own upper level domain. If your company's site resides on a sub-domain of an ISP and/or web hosting service, it is difficult to get Yahoo to list you. If your company's site resides on a sub-domain of a large ISP -- like AOL, freeyellow, geocities, earthlink, etc. -- then it is nearly impossible to get a new submission accepted. They simply ignore you.
Here's what you do.
If this is your first attempt at getting a listing on Yahoo, follow the procedures below. If you already have a listing on Yahoo and want to get another one, follow the steps below AND take a look at the section further on down regarding Obtaining Multiple Listings on Yahoo.
Note: Do not attempt to get Yahoo to change an already cataloged web site listing. If Yahoo has ever changed a listing (we know of no listing they have ever changed in spite of numerous requests), it probably involved far more time and trouble than it was worth. In some cases, Yahoo just removes the listing when a change is requested! It is easier to just get a new one and let the old ones ride (it is also probably easier to stop El Nino).
Start by... creating a unique site and naming your company something that utilizes your keywords. Referring back to our example above, we could -- for web page purposes only -- name our company something like Tape Management Backup Systems, Ltd. with a logo that displays TMBS, Ltd. and has the words Tape Management Backup Systems, Ltd. incorporated somewhere within it. This will prove to the Yahooligan who is inspecting the site that we are indeed a company named TMBS, Ltd.
Now, how do we determine which category to register in? Here's how. Let's do a keyword search for tape management.
(At the time of this writing, the #1 ranked company is a bad link. This presents an opportunity to purchase this position from the ISP or Hosting service that controls that sub-domain. Note: whenever you find a top rated Yahoo page in YOUR keyword category that no longer exists you should contact the owner of that domain and negotiate your way into that listing spot. Then, put a redirect on it to your main web site before Yahoo removes it from their catalog. This is a quick way to get a good listing on Yahoo.)
Onward with our search results...
Examining the rest of the results in our tape management search we find that every listing contains the name of a company and the description they submitted... and there is no hype whatsoever in any of these descriptions. You will not find the words best, finest, special, coolest, magnificent, stupendous, etc. In fact, using anything that remotely resembles hype words in your submitted description -- not on your web page, mind you -- in your submitted description, is the kiss of death on Yahoo!
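Before you submit, it is easy to run your description past a quick hype check. Here is a minimal Python sketch of one. The word list contains only the six words named above, and the function name is our own; it is a sanity check, not a copy of whatever Yahoo's reviewers actually look for.

```python
import re
from typing import List

# The hype words named in the text; extend the set to suit yourself.
HYPE_WORDS = {"best", "finest", "special", "coolest", "magnificent", "stupendous"}


def find_hype_words(description: str) -> List[str]:
    """Return the hype words found in a description, in order of appearance."""
    words = re.findall(r"[a-z']+", description.lower())
    return [word for word in words if word in HYPE_WORDS]


# Two made-up descriptions: one with hype, one without.
print(find_hype_words("The BEST and finest tape backup software"))
print(find_hype_words("Tape management and backup software for networks"))
```

An empty list means the description passed this particular check. It does not mean a human reviewer will find it hype-free.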
You should also note that each of the keywords is contained (and highlighted) in the company name and/or the description of the site. This proves that a keyword search can work well for you, provided that Yahoo happens to list categories and companies that match your keyword search. Such is not the case for all keywords and you should check before you begin your efforts so as not to waste your time trying to score high on a search that will only default to the Google search engine anyway.
By examining these search results it becomes obvious that we want to be registered in the very first listed category... the one labeled: Business and Economy: Companies: Computers: Software: System Utilities: Backup ...simply because this is the category that comes up first for the keyword search tape management. Getting your company in the first category will almost ensure a front page listing for most businesses in the Yahoo catalog... especially if your keywords are in the name of your company.
OK... here's how you properly register from within that category. Click the category link... Business and Economy: Companies: Computers: Software: System Utilities: Backup
After the next page loads you will see an icon at the top that says Add URL (the middle orange button after the Yahoo logo). Click the Add URL icon. A page titled Suggest A Site comes up with instructions on how to suggest a site. Once you start to read through the gobbledygook they have on this page and the other pages they refer to, you'll wonder how anyone suggests a site to Yahoo without help! Regardless, go ahead and look at their recommendations. Once you've gone through their instruction nightmare you can skip it in the future.
Ok, read the stuff? Yes? Well alright then -- now you are ready to push the Proceed to Step One button. Go ahead, click it... it won't explode.
The next page comes up with the correct category already filled in. That's why you must start the Suggest a site process from the correct category page... the one you want to be listed in. From here it is easy to follow the directions on the page. When you are finished, click the Proceed to step two button. Go ahead, practice. Nothing happens unless you push the submit button at the end of Step 4.
Step two gives you a chance to suggest an additional category that your company may belong in. It is self-explanatory and, in our example, the answer will depend on the information below. Therefore let us proceed.
Backing up a little, we want to make something perfectly clear. Do NOT submit anything to Yahoo until you are crystal clear exactly what two categories you wish to be listed in. And you should use the Add URL from your first choice category only. Your second choice category will be a fill-in that should be copied & pasted once you've selected it -- and then, is only completed at the time you are submitting your web site. That means that you do NOT submit until you have examined all of your options and you have chosen your top two categories. Got it?
So, in an effort to choose the next category, let's do a keyword search from the Yahoo front page for tape backup systems. This time we get a category titled: Business and Economy: Companies: Computers: Hardware: Peripherals: Storage: Tape Drives ...and, if tape backup systems is what we sell then, obviously, that is the second category that we want to be listed in. Now is the time to copy and paste that second category choice into a notepad and save it for inclusion in the step two fill-in box. Since Yahoo will only list you in two categories, you can now initiate your suggestion process and, when finished, you are done with that newly created company's web site submissions.
The review process can take up to a month to complete. If you do not appear in the Yahoo index within two weeks, write them a nice letter of inquiry, sort of as a reminder and to find out if you did anything wrong. If you still do not appear in 30 days, then write them another very nice letter of inquiry. Be sure to state the specifics (time, date, company name/title, page description, etc.). Chances are you will get (nicely) indexed -- provided you did everything right and you show a little patience.
Now, here's a tip that will practically guarantee your submission will be accepted by Yahoo provided you do everything else right. Register and submit under an upper level domain name that reflects your keyword chosen company name (example: tapemanagement.com or tape-management.com). That is likely to convince Yahoo that your company's name is indeed Tape Management Backup Systems, Ltd. and they will list you with your keywords automatically in the listing.
Reality check: The domain registration will cost you $100 with InterNIC. The domain hosting fee will be less than $30 per month. How much do you think a good listing on Yahoo is worth? If your budget can't handle an average of $35 per month for a very sweet listing on Yahoo then maybe you should reconsider being on the Internet. That's about as good as it gets, advertising-wise!
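For anyone who wants to check the math, here is a small Python sketch of where a figure in the neighborhood of $35 per month can come from. We ASSUME the one-time registration fee is spread over 24 months and we use the $30 ceiling for hosting. Both are our assumptions for illustration, so plug in your own numbers.

```python
# Figures from the text: a $100 domain registration and hosting at
# under $30 per month (we use the $30 ceiling here).
REGISTRATION_FEE = 100.00
HOSTING_PER_MONTH = 30.00

# Assumption: the one-time registration fee is spread over 24 months.
AMORTIZATION_MONTHS = 24


def average_monthly_cost(registration: float, hosting: float, months: int) -> float:
    """Average monthly cost with the one-time fee spread over `months`."""
    return registration / months + hosting


monthly = average_monthly_cost(REGISTRATION_FEE, HOSTING_PER_MONTH, AMORTIZATION_MONTHS)
print(f"Average cost per month: ${monthly:.2f}")  # prints $34.17
```

Spread the fee over a shorter period and the monthly average goes up a few dollars, but it stays in the same ballpark.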
Be aware that Step Three includes contact info, company name, email, phone, fax, address, etc., so be SURE to maintain a record of exactly what you tell them and make certain that the company name you give them matches the company name/title of your submitted web page. It is important to keep a record of your exact submission details especially if you plan on following our instructions for multiple keyword listings on Yahoo and/or if you already have listings on Yahoo.
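One simple way to keep that record is a small structured log. The Python sketch below is only a suggestion: the field names, the description, the email address, and the date are hypothetical, and the check covers just the one rule stated above (company name must match the page title).

```python
from dataclasses import dataclass


@dataclass
class YahooSubmission:
    """One submission, recorded exactly as it was typed into the form."""
    company_name: str
    page_title: str
    description: str
    category: str
    contact_email: str
    submitted_on: str  # date of submission, e.g. "2000-08-01"


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def title_matches_company(record: YahooSubmission) -> bool:
    """True when the page TITLE and the submitted company name agree."""
    return normalize(record.page_title) == normalize(record.company_name)


# Sample record; every value below is made up for illustration.
record = YahooSubmission(
    company_name="Tape Management Backup Systems, Ltd.",
    page_title="Tape Management Backup Systems, Ltd.",
    description="Tape management and backup systems for networks.",
    category="Business and Economy: Companies: Computers: Software: System Utilities: Backup",
    contact_email="info@tapemanagement.com",
    submitted_on="2000-08-01",
)
print(title_matches_company(record))
```

Save one record per submission and you will have the exact details on hand if you ever need to write Yahoo an inquiry.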
Obtaining Multiple Listings on Yahoo
Take the case of a single company that sells hot tubs, swimming pools, all of the associated supplies and accessories, along with maintenance & cleaning and has locations throughout North America. The following categories all apply...
Business and Economy > Companies > Home and Garden > Pools, Spas and Saunas
Business and Economy > Companies > Home and Garden > Pools, Spas and Saunas > Accessories and Supplies
Business and Economy > Companies > Home and Garden > Bed and Bath > Bath > Whirlpool Bathtubs
Recreation > Home and Garden > Pools, Spas and Saunas
In addition, there are several regional categories that also apply.
Yahoo's policy? ...one listing -- and Yahoo gets to pick the category. That's if they list the company at all. Once you submit your site they reserve the right to change your title, description, category, or ignore you completely, all at the whim of some Yahoo Surfer who likely knows nothing whatsoever about your customers. Frankly, if the public knew how arbitrary and inconsistent Yahoo can be we doubt the directory would be so popular -- a popularity we attribute to arriving first on the scene and having a memorable name.
Yahoo submission strategies...
Yahoo likes professionally designed glossy web pages with fancy logos that are visually appealing.
Yahoo site reviewers are highly suspicious of company names that place at the top of the alphabetical listing.
Since Yahoo lists companies within categories alphabetically (ASCII order -- 1-9, A-Z) we strive to select a company name that places at the top of the alphabet. We rack our brains for a legitimate company name that will land the page alphabetically toward the top. For ideas, check the current Yahoo listings in categories outside your target category to gain a feel for what Yahoo has accepted in the past.
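To see what ASCII order does to a list of names, here is a short Python sketch. The company names are made up, and we are assuming that a plain character-code sort is a fair stand-in for how a directory orders its listings. Python's default string sort compares character codes, so for names like these it produces ASCII order.

```python
# Hypothetical company names, deliberately out of order.
names = [
    "Zenith Spas",
    "All Seasons Hot Tubs and Spas",
    "1st Choice Pools",
    "Blue Water Pools",
    "!AAA Spas",
]

# sorted() compares strings by character code, which for these
# names is plain ASCII order: punctuation, then digits, then letters.
for name in sorted(names):
    print(name)
```

The name starting with an exclamation point lands first, the one starting with a digit second, and the letters follow from A to Z. Names that sort suspiciously high are exactly what the site reviewers are watching for.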
Yahoo uses the company name and site description when returning results from a keyword search.
Whenever possible we incorporate our best keywords/phrase into the company name and certainly into the description that we suggest Yahoo use. An example of this for a hot tub business would be... All Seasons Hot Tubs and Spas - spa and hot tub sales, service, supplies and accessories.
Yahoo is looking for unique content that offers value to their site visitors.
We make certain our site is completely unique from others in the same or similar categories -- especially if this is an attempt to gain a second listing for a company we already have listed somewhere else on Yahoo. This means that we go to great lengths to ensure there is no trail whatsoever to any previous listings on Yahoo. Whenever possible we design the site to showcase products, services, offers, or benefits that are offered by few (if any) others in the target category. Remember, Yahoo wants valuable NEW stuff to list in their directory whenever possible.
Yahoo wants all of your contact info to match your site, InterNIC, and submission info.
We make sure all of our Yahoo submission info and Network Solutions registration info matches up with the contact info we put on our site. In other words, our TITLE tag and logo matches up with our company name which matches up with our Network Solutions (InterNIC) registration. The contact us address matches. The phone numbers match and, if possible, even the credit card billing address we give will match.
In addition, all of this information is NEW -- in other words, this is the first time any of these names or numbers have been submitted to Yahoo. Therefore, you MUST keep a log of email and physical addresses, phone numbers, credit cards, company names, technical & admin contacts that you give to Yahoo and never use any of them a second time! ...not even the credit card.
Some creative solutions include using a PO Box, a home address, or a relative's address. Use an email address that ends in the domain you are submitting. Use an alias as the contact person. That way if Joe Alias gets a phone call you can suspect it's someone from Yahoo and you can tailor your response accordingly.
We also use a different nameserver and web site technical contact information that does not match any of our related sites and we host the site in a different location from company related sites.
Remember, after the site is listed you can change anything on that site that you desire. You can move it if you want or change your InterNIC registration info as well. To date, we have no evidence whatsoever that Yahoo re-checks sites after acceptance. They are just too busy. In fact, it may take an act of Congress to even get them to take a second look. Besides, it's a fairly normal fact of life that websites change. You have the right to change your website whenever you please. So, after you get listed, change away!
Yahoo looks for links to pages that are already in its directory.
We make certain any new site does not link to pages on any other related site that is already listed on Yahoo. Any new (and secretly related) site has totally new product pages, a new look (logos, etc.), and even a different secure server whenever possible.
Submitting a Multiple Listing
When submitting a second (multiple) listing to Yahoo, follow these tips...
1. Use a different dial up account than any previously used for prior submissions. Doing so will show Yahoo a different IP address making it look as if you have not submitted before.
2. Fill out the title as the company name, and the description matching the new site.
3. Be certain the contact info matches the information contained on the site and does not match any other prior submission -- including the email address. We can't stress enough how important the email address is. Yahoo does track email addresses, and they are unlikely to accept a second site submission from an email address previously used to submit a site. We suggest using an email address that is located at the new domain being submitted like email@example.com.
4. If ever you attempt to make a change to your listing in the future be sure to use the same name and email address you used when you originally submitted the site. Again, log everything into a notes folder and put it where you can access it.
5. If using the business express option, use a different credit card account and billing address from any previous submissions.
Yes, this is a lot of work and something that we do ONLY for companies that stand to gain much from their Internet presence -- and from having their listing(s) in Yahoo. Overall, our experience has been that it's well worth the trouble because, unlike search engines, once you make it into the directory you are basically in forever. Good positioning can even become a salable asset. After all, Yahoo is the most visited site on the Internet.
More Yahoo Tips
Make sure that your submitted page(s) are offering goods, products, or services that are VERY consistent with the category you are attempting to list under. This is critical. Design your web site accordingly and do not change anything until after it is accepted... that is, if you change anything at all.
Updating & Resubmitting pages to Yahoo
Yahoo does not spider or revisit your site... and their change form -- supposedly there to alert Yahoo of pages that need to be categorically re-cataloged -- is an industry joke. The only time we've heard of Yahoo changing a listing involved the purchase of banner advertising. It seems that buying advertising on Yahoo can get you a little special attention in some circumstances.
Another Submission Option
If you are having problems getting your site listed in Yahoo's directory, try submitting to their regional directory. We've found that over the last couple of months Yahoo seems to accept sites that have been submitted from a regional directory more quickly than submissions to their large commercial directory. Some of their documentation suggests that you can only submit a commercial site to the Business and Economy: Companies directory, but you can submit business sites to the Regional: directories as well.
Yahoo makes it difficult to find an email address to contact if you are having problems getting submitted. The only official one that we have found is firstname.lastname@example.org. Wait at least two weeks before contacting them regarding submission failures. Be sure to include the date of your submission and, if possible, use the email address that you used during the original submission. The alternative would be to save the email address from a previous successful submission and use that one to contact them.
Note: It may take a MONTH or two for Yahoo to reply to this email address, but it does work; we have sent mail to it and confirmed that this is the right address.
We've also been told that Yahoo will add sites quickly if they have something to do with late breaking news stories. If your site is time sensitive, then tell them so in Step 4 and ask for a rush listing. It won't hurt and might help.
One final note regarding Yahoo
Frankly, we have never been impressed with Yahoo as a resource for indexing web pages. It only works best for finding companies by name. On a web where most consumers are keyword oriented they really don't know how to use the Yahoo catalog. However, Yahoo IS one of the most visited sites on the Internet and a significant number of people DO use the Yahoo catalog. For that reason we recommend that you spend the time necessary to get the listing that you want with Yahoo. Be sure to submit your pages with VERY careful attention to detail -- because it is likely that you will get only one opportunity to get it right with each of your submissions to Yahoo!
Open Directory Project - www.dmoz.org
Indexed Pages: Approx. 1.5 Million
Frame Support: ODP DOES support Frame style pages
Meta Tag Support: No - BUT some engines like AOL/Inktomi may index meta tags from pages listed in ODP
Average Submission Time: Varies - usually within 2 days
Sites Using ODP Data: 127
The Open Directory Project has led a short but colorful life over the past year or so. Founded in 1998 by Rich Skrenta, Bob Truel, Chris Tolles, Jeremy Wenokur and Bryn Dole, the directory originally started up as GnuHoo.com. The directory was designed to be somewhat similar to Yahoo!, but was based on volunteer editors instead of a paid staff. This open, community-built web directory got off to a rocky start...
For many, those first three characters in the GnuHoo domain name were very important, as they are the trademark of the Free Software Foundation or FSF for short. The GNU is associated with open code software like Linux, but GnuHoo wasn't giving away the software directory program code. After getting heat from the Open Source Community they changed their name to NewHoo in June 1998 and thereby avoided a trademark infringement issue.
The directory continued to grow to upwards of 100,000 listings and started drawing attention from several of the other portal sites. Finally in November '98 Netscape purchased NewHoo for an undisclosed amount, renamed it the Netscape Open Directory and placed it under the umbrella of the free software movement of Mozilla.org. At this time Netscape actually did make the directory open. They created a free use license that allowed individuals and organizations to make copies of the directory.
Things started happening pretty fast after that. Next AOL bought Netscape. Then AOL/Netscape dumped Excite as their search engine replacing it with a combination of the ODP and Inktomi. During the summer of 1999, amid a wave of acquisitions and mergers, other portals started liking the idea of not having to pay for, and administer, a directory. So, they too began incorporating the free ODP data into their sites as well.
Today, ODP boasts more than 1.5 million sites, 22,000 editors, and 200,000 categories.
The ODP database is used on more than 20 search engines and directories, including AltaVista, Netscape Open Directory, Netscape What's Related, Lycos, HotBot, Dogpile, and Thunderstone. It's also the default search for Netscape's browser.
We generally equate a good listing in ODP to be worth about 1/4 to 1/2 as much traffic as a site listed well in the same category at Yahoo. This varies quite a bit depending on the topic area, but with all the engines using this directory it's not one to be overlooked.
How is ODP Used at Different Portals?
It should be noted that each licensee of the ODP database has the ability and means to format the ODP data in different ways. This means that search results are likely to differ between the portal sites, and this fact needs to be taken into consideration when submitting your pages to ODP. A top listing on one search engine may not necessarily be a top listing on any other because of the different algorithms each licensee site uses.
Another reason that results may vary is because each site using ODP data can update their copy of the database whenever they want. In most cases you'll find they update on a monthly basis even though ODP itself often updates more frequently.
You have the option to submit your sites from many of the different licensees' sites but we recommend that you only submit from one to avoid duplicate submissions. We have not observed any advantage gained from submitting at any particular licensee site over another.
Here's a brief rundown on each of the individual ODP sites:
Netscape uses ODP extensively throughout their sites:
- Home of ODP http://www.dmoz.org
- The Start Page for Netscape's Browser http://home.netscape.com
- Netscape Search http://search.netscape.com (results are from Google first, then ODP).
- The Netscape Browser Internet Keywords feature.
AOL - The search engine that AOL users see is http://aolsearch.aol.com/index.adp which gives the visitor an option of browsing the ODP directory or searching. Search results are composed of the following sections, from top to bottom of the page:
Recommended Sites - Paid AOL listings
Matching Categories - Regular ODP Categories
Matching Sites - ODP sites that have been indexed by Inktomi with AOL pages inserted at the top as relevant pages.
Web Articles - Regular Inktomi Index search results
The second search site is the one used by most web based users and can be found at search.aol.com, which now appears to be identical to aolsearch.aol.com.
Although we haven't experimented with this one much, the use of Inktomi's search engine to index ODP pages appears to follow some of the same rules that we are used to at sites like HotBot. Keywords in the title, meta description, meta keywords, body text, and URL are all important. On just a few sample pages, keyword density appears to be running around an even 2-3% for two word phrases.
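If you want to measure keyword density on your own pages, a few lines of Python will do it. This is a rough sketch under one stated assumption: we define density as (occurrences of the phrase x words in the phrase) / total words. Other definitions exist, and the engines do not publish theirs, so treat the result as a relative yardstick only.

```python
import re


def phrase_density(text: str, phrase: str) -> float:
    """Percentage of the words in `text` taken up by occurrences of `phrase`.

    Assumption: density = (occurrences * words in phrase) / total words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    size = len(target)
    if not words or size == 0:
        return 0.0
    # Slide a window of `size` words across the text and count matches.
    hits = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == target
    )
    return 100.0 * hits * size / len(words)


# Made-up body text: 100 words, with the phrase appearing once.
sample = "hot tub " + "filler " * 98
print(f"{phrase_density(sample, 'hot tub'):.1f}%")  # prints 2.0%
```

Feed it the visible body text of a page (tags stripped out) and the two word phrase you are targeting.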
AOL appears to implement the new releases of the ODP data quicker than most of the other portals using ODP.
AltaVista has incorporated ODP as their primary directory on their home page at http://www.altavista.com. When conducting a search at AltaVista for a single keyword, they present some matching categories from ODP at the bottom of the search results page. They also have added a Categories button at the top of the page which also takes you to their version of ODP.
Lycos - In April of '99, Lycos replaced their main search engine with ODP categories and selected web pages at the front of their search results listings. The old Lycos search results (Web Pages) now show up only at the end of the ODP listings or when the search phrase isn't found within ODP. The Lycos directory on the home page is also powered by the database at ODP.
Google - In March 2000, Google.com augmented their search engine by indexing sites found within the ODP directory and using the directory structure to help make their search more relevant. Sites in the ODP directory have a definite advantage on this search engine.
HotBot incorporates ODP as their home page directory and includes results from the ODP categories at the top of the search result listings.
DogPile (www.dogpile.com) incorporates links from their home page to their version of ODP and also includes ODP listings in their search results. The search results at DogPile are found in the following order:
DogPile ODP Directory
Lycos Top 5%
Lycos Web Pages (old search engine)
ODP Submission Guidelines
Submitting to ODP is very similar to submitting to Yahoo. First you select the category that most accurately describes your site's content. You should begin by searching for your primary keywords and key phrases, then review the different categories that are returned. At the bottom of each search result is the category that page is listed under. Click on the most appropriate category and you'll find an add URL link in the upper right hand corner.
Before submitting, remember that ODP is powered by human volunteers who operate in most cases according to a somewhat loose set of guidelines. One of these is sometimes referred to as WWYD or What would Yahoo do? This means basically that you shouldn't try any tricks that wouldn't work at Yahoo. The editors at ODP are often well versed in search engine promotion techniques and will frown upon your trying to use AAA or !AAA in your title (ODP's listings are alphabetical) unless you can really make a case that it is the real name of your company or site. Even then be prepared for the editor to ignore you if you try that trick.
- Adult sites can only submit to the Adult categories
- No multiple submissions of the same page and content.
- You can submit more than one page from your site into different categories. However, you must make sure that they have appropriate, significant content for each category. Each page that you submit should have links to additional information on that content deeper within your site. Don't waste an editor's or a surfer's time submitting to places where you don't belong.
- Non-English sites should submit to the World section
- Sites with a regional focus should submit to the Regional section.
By using a frames page in a very specific way, it's possible to send the <noframes> section of the page you want indexed to the search engine and the frames content to the browser (humans). The <noframes> section would be your optimized source code for the engine. The <noframes> section DOES need to have logical text that is relevant to what is being displayed in the browser window -- just in case a human reviews the page manually. The <noframes> area of the page is basically invisible to users unless they use the view source option of their browser.
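To show how the pieces fit together, here is a small Python sketch that assembles a frames page with its fallback section (the tag is written <noframes> in HTML). The function name, the file names menu.html and main.html, the column split, and the sample text are all hypothetical. The point is only the layout: frame sources for the browser, readable text inside <noframes> for anything that ignores frames.

```python
from html import escape


def build_frameset_page(title, frame_sources, noframes_body, cols="25%,75%"):
    """Return HTML for a frameset page with a <noframes> fallback.

    frame_sources: URLs loaded into the frames that browsers display.
    noframes_body: HTML shown to anything that ignores frames, which
                   is what a spider without frame support reads.
    """
    frames = "\n".join(
        f'  <frame src="{escape(src, quote=True)}">' for src in frame_sources
    )
    return (
        "<html>\n"
        "<head>\n"
        f"<title>{escape(title)}</title>\n"
        "</head>\n"
        f'<frameset cols="{cols}">\n'
        f"{frames}\n"
        "  <noframes>\n"
        "  <body>\n"
        f"{noframes_body}\n"
        "  </body>\n"
        "  </noframes>\n"
        "</frameset>\n"
        "</html>\n"
    )


page = build_frameset_page(
    "All Seasons Hot Tubs and Spas",
    ["menu.html", "main.html"],
    "<p>Spa and hot tub sales, service, supplies and accessories.</p>",
)
print(page)
```

Keep the text you pass in as the fallback body relevant to what the frames actually display, for the reasons given above.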
Just remember to keep the content relevant to the page the humans will see. Refrain from making templates for the <noframes> area of the pages. AltaVista is trying to detect pages that are very similar -- the pages that only swap out a keyword or two. The content of your <noframes> sections needs to vary somewhat. Ideally each would be somewhat unique and focus on the topic that the page was built for.
By the way, if your customer base consists of a significant portion of WebTV users you should test your pages on a WebTV browser. WebTV tries to combine frames and this can sometimes make a mess out of the page. That's the largest single browser demographic that currently doesn't support frames well.
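You can get a rough idea of how alike two of your own fallback sections are before a search engine does. The Python sketch below uses the standard library's difflib to compare two texts word by word. This is NOT AltaVista's method, which we do not know; it is only a quick self-check, and the sample sentences are made up.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Ratio from 0.0 to 1.0 of how alike two texts are, word by word."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


# Two template-style texts that only swap out the keyword phrase...
template_a = "We sell hot tubs and offer hot tubs service in every state"
template_b = "We sell swimming pools and offer swimming pools service in every state"
# ...and one text written from scratch.
rewritten = "Our technicians repair and clean spa equipment across North America"

print(round(similarity(template_a, template_b), 2))
print(round(similarity(template_a, rewritten), 2))
```

The template pair scores far higher than the rewritten pair. If your own sections score close to 1.0 against each other, they are probably too similar.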
Also keep in mind that shopping cart pages on a secure server that are wrapped in frames from a non-secure server won't show the closed-lock icon on the browser window. You need to break out of the frame when leading people to the secure pages or they will understandably think the page is non-secure -- even though it is. This may cost some sales because that secure lock icon IS looked for by a lot of consumers.
Almost all search engines will reject dynamically generated pages if they have extended characters in the URL (except for Lycos and Inktomi). This is primarily because they are worried about getting into what they call robot traps, where there may be no end to the number of links that a script or program generates. If the URL contains a ?, % or other similar characters, they will probably not index your site, and most likely will not crawl these types of URLs with a spider. In some cases, such as at Inktomi and AltaVista, you might be able to get the engine to index a page with these characters when it is submitted manually, but the spider won't follow its links that have the characters in them. A workaround is to build Pointer Pages using regular static HTML with links to the target page. If you attempt to use the <meta> refresh tag within the pointer pages, be aware that some engines will try to index your targeted page, not the page that you submit. There are ways around this problem, but they are quite complex.
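A quick way to audit your own URLs for these characters is sketched below in Python. The text names ? and % specifically; we have added & and = to the list as an ASSUMPTION, since they usually travel with the question mark in script-generated URLs. The function names and the example.com URLs are hypothetical.

```python
# "?" and "%" come from the text; "&" and "=" are our own additions
# (an assumption) because they normally accompany "?" in dynamic URLs.
RISKY_CHARS = "?%&="


def risky_characters(url: str) -> list:
    """Return the risky characters found in a URL, in RISKY_CHARS order."""
    return [char for char in RISKY_CHARS if char in url]


def looks_spider_friendly(url: str) -> bool:
    """True when the URL contains none of the risky characters."""
    return not risky_characters(url)


# Made-up URLs: one dynamic, one static.
for url in (
    "http://www.example.com/catalog.cgi?item=42",
    "http://www.example.com/hot-tubs.html",
):
    print(url, "->", "ok" if looks_spider_friendly(url) else risky_characters(url))
```

Any URL that gets flagged is a candidate for a static pointer page of the kind described above.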