We have a website hosted on SquareSpace, which uses its own custom CMS product.
They do not allow editing of the default robots.txt file; instead, they use a 301 redirect to the one we customize.
Unfortunately, it appears that ES is not picking up or honoring the robots.txt file. Other engines do appear to honor it: for example, we have a Google Custom Search, and the disallowed locations do not appear in Google’s results.
The URL is: http://www.henssler.com/robots.txt
That URL is redirected to the modified file. However, none of the Disallow rules in that file are being respected.
I did force a full crawl after making the change, but the results were the same.
FYI, the only difference between the default and the customized robots.txt is the last entry, /tag/.
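To help narrow this down, here is a minimal Python sketch of two checks that can be run outside the crawler: what the first response for the robots.txt URL looks like when the 301 is not followed, and which paths a given robots.txt body would block. The names first_response, blocked_paths and SAMPLE_ROBOTS are made up for illustration, and SAMPLE_ROBOTS is a placeholder, not the actual contents of our file. It uses only the Python standard library and says nothing about how ES itself handles redirects.

```python
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt body (an assumption, not the real file).
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /tag/\n"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def first_response(url):
    """Return (status code, Location header) for the first hop only."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def blocked_paths(robots_body, paths, user_agent="*"):
    """Return the subset of paths that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]
```

Calling first_response("http://www.henssler.com/robots.txt") would show the status code and redirect target a client sees before following the redirect, and passing the text of the customized file to blocked_paths shows which URLs a standards-following parser would treat as disallowed.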
Any ideas on what we can check, or why this CMS may behave differently?