Why is my index incomplete or completely failed, and why does my Index Log Report show that FusionBot attempted to retrieve a number of pages on my site using incorrect page names, often duplicating the directory structure within the URL?

This situation occurs when your site (https://www.somesite.com/directory1/directory2/) is actually just a sub-directory of another site (https://www.somesite.com), rather than its own distinct URL.

Because the URL you signed up with includes one or more directories in its path (which is correct), our spider establishes your site root as https://www.somesite.com/directory1/directory2.

Because this is your "root", meaning that no pages outside of this starting directory should be followed, our crawler appends every link found on your site to the end of this root.

Therefore, when FusionBot finds a link in your HTML such as:

<a href="/directory1/directory2/somepage.htm">

The indexer appends this HREF to your site root, resulting in the following URL call:

https://www.somesite.com/directory1/directory2/directory1/directory2/somepage.htm

Since this page does not exist, the index fails.
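
The difference between the two behaviors can be illustrated with a short Python sketch. This is not FusionBot's actual code; the function names `naive_append` and `standard_resolve` are hypothetical, and the sketch only assumes the append-to-root behavior described above. It contrasts that behavior with standard URL resolution, where a link beginning with "/" is resolved against the host rather than the starting directory.

```python
from urllib.parse import urljoin

# Example values taken from this FAQ.
SITE_ROOT = "https://www.somesite.com/directory1/directory2"
HREF = "/directory1/directory2/somepage.htm"


def naive_append(root: str, href: str) -> str:
    """Append the HREF to the end of the site root (the behavior described above)."""
    return root.rstrip("/") + "/" + href.lstrip("/")


def standard_resolve(root: str, href: str) -> str:
    """Resolve the HREF the way a browser would: a leading "/" means the host root."""
    return urljoin(root.rstrip("/") + "/", href)


appended = naive_append(SITE_ROOT, HREF)
resolved = standard_resolve(SITE_ROOT, HREF)

print(appended)  # directory structure is duplicated
print(resolved)  # the page that actually exists
```

The first URL repeats the directory structure and does not exist on the server; the second is the page the link was meant to reach.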

The problem stems from how your hosting provider configures your site: either as its own virtual server, or simply as a directory within the provider's own site. In the former case, FusionBot works correctly. In the latter case, however, your site is difficult to index without a manual override entered by our technical support staff.

If you are seeing this sort of behavior within your Index Log Report, please fill out our Customer Support Form and we will work to resolve the issue by sending a flag to the indexer indicating that your site is not independently hosted.
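
If you want to check a list of URLs for this pattern before contacting support, a small Python sketch along the following lines may help. It assumes you have copied the requested URLs out of your Index Log Report by hand (the report's actual format is not assumed here), and the names `ROOT_PATH` and `has_duplicated_root` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical: the directory portion of the site root you signed up with.
ROOT_PATH = "/directory1/directory2"


def has_duplicated_root(url: str, root_path: str = ROOT_PATH) -> bool:
    """Return True if the URL's path begins with the root path repeated twice."""
    root = root_path.rstrip("/")
    path = urlsplit(url).path
    return path.startswith(root + root + "/")


# Example URLs, pasted in manually for illustration.
urls = [
    "https://www.somesite.com/directory1/directory2/directory1/directory2/somepage.htm",
    "https://www.somesite.com/directory1/directory2/somepage.htm",
]

flagged = [u for u in urls if has_duplicated_root(u)]
for u in flagged:
    print("Duplicated directory structure:", u)
```

Any URL flagged this way matches the behavior described in this FAQ.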
