The Crawler Restriction option lets you control how the FusionBot crawler
handles links to content throughout your site. The design of your site determines which
option is best for indexing as many of its pages as possible. The three available options are described below,
along with how FusionBot will navigate your site and create your searchable content under each one:
Option 1 (Default) - Directory Level Indexing: Selecting this option will cause the FusionBot crawler
to follow only those links on your site that are contained directly within or below the DIRECTORY path provided in the Domain Path / URL
field. This lets you create an index that restricts your searchable content to a specific portion of your site.
For example, entering as your Domain Path:
http://www.yoursite.com/marketing
Will cause the FusionBot crawler to ignore any links on your site that point to directories not within the "/marketing" directory, such
as:
http://www.yoursite.com/finance
However, a link to:
http://www.yoursite.com/marketing/sales
Would be followed since the "/sales/" directory is contained within the "/marketing/" directory.
If you do not specify a directory as part of your Domain Path / URL:
http://www.yoursite.com
ALL links pointing to pages within http://www.yoursite.com will be included in your index.
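The directory rule above can be sketched as a simple path-prefix check. This is only an illustration of the behavior described, not FusionBot's actual code. The function name in_directory_scope is hypothetical, and the sketch assumes that a link must also share the hostname of your Domain Path / URL.

```python
from urllib.parse import urlsplit


def in_directory_scope(domain_path: str, link: str) -> bool:
    """Hypothetical helper: True if `link` lies within or below the
    directory given in `domain_path` (assumed to be on the same hostname)."""
    base = urlsplit(domain_path)
    target = urlsplit(link)
    if target.hostname != base.hostname:
        return False
    # Normalize so "/marketing" and "/marketing/" are treated the same,
    # and an empty path (no directory) becomes "/", which matches everything.
    base_dir = base.path.rstrip("/") + "/"
    # The trailing slash keeps "/marketing-old" from matching "/marketing".
    target_path = target.path.rstrip("/") + "/"
    return target_path.startswith(base_dir)


base = "http://www.yoursite.com/marketing"
print(in_directory_scope(base, "http://www.yoursite.com/finance"))
print(in_directory_scope(base, "http://www.yoursite.com/marketing/sales"))
```

With the Domain Path from the example, the "/finance" link is rejected and the "/marketing/sales" link is accepted. With http://www.yoursite.com as the Domain Path, every link on that hostname is accepted.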
Option 2 - Server Level Indexing: Selecting this option will cause the FusionBot crawler
to index ALL pages it finds on your site regardless of their location, whether or not your Domain Path / URL includes one or more directories. This is useful when you want to give FusionBot a starting point
that is several directories deep but still want your entire site to
be crawled. For example, entering as your Domain Path:
http://www.yoursite.com/marketing
And selecting this indexing option will cause the FusionBot crawler to begin indexing your site within your "/marketing" directory.
In addition, when it encounters links to pages outside of your "/marketing" directory:
http://www.yoursite.com/finance
These links will ALSO be followed and included in your index. Again, this is useful when you wish to provide an alternate starting point for
FusionBot to begin crawling your site, but still wish for ALL directories within your site to be included in your index.
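As a rough sketch, the server-level rule amounts to comparing hostnames and ignoring the directory portion entirely. The function name in_server_scope is hypothetical. The sketch also assumes that "server" means the exact hostname in your Domain Path / URL, which is inferred from the contrast with Option 3 rather than stated outright.

```python
from urllib.parse import urlsplit


def in_server_scope(domain_path: str, link: str) -> bool:
    """Hypothetical helper: True if `link` is on the same hostname as
    `domain_path`, whatever directory it points to."""
    base_host = urlsplit(domain_path).hostname
    link_host = urlsplit(link).hostname
    # The path of the Domain Path only sets the starting point of the crawl;
    # it plays no part in deciding which links are followed.
    return base_host is not None and base_host == link_host


base = "http://www.yoursite.com/marketing"
print(in_server_scope(base, "http://www.yoursite.com/finance"))
print(in_server_scope(base, "http://finance.yoursite.com"))
```

Under this assumption, the "/finance" link on www.yoursite.com is followed, while a link to a different hostname is not.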
Option 3 - Domain Level Indexing: Selecting this option will cause the FusionBot crawler
to index ALL pages it finds on your site regardless of the directory in which they are located AND regardless of the server / hostname on which these
files are hosted. This is the most liberal option, enabling FusionBot to follow as many links as possible on your site, as long as they match
the DOMAIN portion of your Domain Path / URL.
For example, if you have subscribed the following URL:
http://www.yoursite.com/marketing
With this option selected, the FusionBot crawler will not only include ALL pages it finds within this site regardless of the directory in which they
are located, but will also follow links to pages hosted under a different hostname and/or server altogether, as long as these pages reside
within the same DOMAIN. Therefore, if FusionBot were to encounter a link to:
http://finance.yoursite.com
This link would be followed and included in your index as well. Selecting this option ensures that as many pages as possible are included in your index,
as long as they are contained within your domain, i.e. http://*.yoursite.com.
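The domain-level rule can be sketched as a hostname suffix match. The function name in_domain_scope is hypothetical. The domain is passed in explicitly because how FusionBot derives the DOMAIN portion from your Domain Path / URL is not described here. The sketch also assumes that the bare domain (yoursite.com with no hostname prefix) counts as a match.

```python
from urllib.parse import urlsplit


def in_domain_scope(domain: str, link: str) -> bool:
    """Hypothetical helper: True if the hostname of `link` is `domain`
    itself or any hostname under it (for example finance.yoursite.com)."""
    host = (urlsplit(link).hostname or "").lower()
    domain = domain.lower().lstrip(".")
    # Require a dot before the domain so "notyoursite.com" does not match
    # "yoursite.com".
    return host == domain or host.endswith("." + domain)


print(in_domain_scope("yoursite.com", "http://finance.yoursite.com"))
print(in_domain_scope("yoursite.com", "http://www.yoursite.com/marketing"))
print(in_domain_scope("yoursite.com", "http://www.othersite.com"))
```

Here both the www and finance hostnames are accepted, and a link to an unrelated domain is rejected.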