Can I prevent FusionBot from indexing my alternate file types?

If you have a paid FusionBot account and do not want your Adobe Acrobat PDF, RTF, or Microsoft Office documents indexed, you have two options. You can create a robots.txt file and place it in the root (default) directory of your website. Alternatively, log in to your FusionBot account, click the 'Spider' tab, and select the 'Exclude Pages & Directories' link to create an internal robots deny list using the Robots Exclusion Form feature.

The robots deny list contains special instructions for the FusionBot spider to use when visiting your site and creating an index of its searchable content. To instruct the FusionBot spider to exclude an alternate file type, add the following to your robots.txt file or enter it into your Exclusion Form:

User-Agent: fusionbot
Disallow: /*.pdf
Disallow: /*.doc
Disallow: /*.xls
Disallow: /*.ppt
Disallow: /*.rtf

This special syntax prevents the indexing of ALL files of the specified alternate file types within your site and/or mini-portal. Because this is not a standard implementation of the robots.txt specification, it will only work with your FusionBot account.
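
To get a feel for which URLs rules like these would cover, the following Python sketch may help. It is only an illustration, not FusionBot's actual matching logic, which is not documented here. It assumes that '*' matches any run of characters (including '/'), that the pattern must match the whole URL path, and that matching is case-sensitive. The names DENY_PATTERNS, pattern_to_regex, and is_excluded are hypothetical and exist only for this example.

```python
import re

# Deny patterns copied from the rules in this FAQ.
DENY_PATTERNS = ["/*.pdf", "/*.doc", "/*.xls", "/*.ppt", "/*.rtf"]


def pattern_to_regex(pattern):
    """Turn a wildcard deny pattern into a compiled regular expression.

    Assumption (not confirmed by FusionBot): '*' stands for any run of
    characters, and everything else in the pattern is literal text.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def is_excluded(path, patterns=DENY_PATTERNS):
    """Return True if the URL path fully matches any deny pattern."""
    return any(pattern_to_regex(p).fullmatch(path) for p in patterns)


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    for path in ["/docs/manual.pdf", "/index.html", "/reports/q1.xls"]:
        status = "excluded" if is_excluded(path) else "indexed"
        print(path, "->", status)
```

Under these assumptions, a path such as /report.docx would not be covered by the /*.doc rule, because the pattern has to match the path all the way to its end. If the spider instead treats patterns as prefixes, the result would differ, so it is worth checking your index after the next spider run.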

To prevent the contents of certain directories from being indexed, regardless of file type, or to prevent a specific page from being indexed, please reference the following FAQ.

After making your changes, you must click the 'Request Spider' link, available via the 'Spider' tab, to rebuild your index.
