Problems with Googlebot fetching my robots.txt file

This topic contains 2 replies, has 2 voices, and was last updated by  AITpro Admin 3 years, 4 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #21452

    Mike
    Participant

    First, I want to say that I searched these forums for Googlebot and robots.txt issues with BPS. While I found a few people over the past couple of years who had problems, none of them sounded like my issue. I also want to say that I don’t think this is a BPS issue, but maybe a recent issue with Googlebot.

    I’ve been running BPS for probably a year now. For the most part it’s been pretty smooth, but occasionally I run into a problem like this. Earlier today I received an e-mail from Google saying that it could not access the robots.txt file on one of my two websites. I had no problem getting to the file myself; it’s a static file that sits on my server and hasn’t changed in quite a while. I hadn’t even installed many plugin upgrades lately. I did the typical troubleshooting, and hours later I narrowed it down to my .htaccess file, specifically to some custom code I had entered a long time ago. I think I got that code from a post here on the forums or from the optional additional BPS security code. Again, from searching this forum, it sounds like a few people here use this code.

    Here is the code in question:

    # REQUEST METHODS FILTERED
    # This filter is for blocking junk bots and spam bots from making a HEAD request, but may also block some
    # HEAD requests from bots that you want to allow in certain cases. This is not a security filter and is just
    # a nuisance filter. This filter will not block any important bots like the Googlebot. If you want to allow
    # all bots to make a HEAD request, then remove HEAD from the Request Method filter.
    # The TRACE, DELETE, TRACK and DEBUG request methods should never be allowed against your website.
    RewriteEngine On
    RewriteCond %{REQUEST_METHOD} ^(TRACE|DELETE|TRACK|DEBUG) [NC]
    RewriteRule ^(.*)$ - [F,L]

    Like I said, I’ve probably been using this code for at least 6 months, if not closer to a year, and I never had a problem with Googlebot or my robots.txt file. I did remove the “HEAD” option, as the comments suggest, so that my website uptime tracker bot could hit the site without problems. As of today, 3/17/2015, this code apparently doesn’t allow Googlebot to fetch my robots.txt file. I fixed the problem on both of my websites by removing the “TRACE” and “DEBUG” options. After that, Googlebot once again had no problem accessing my robots.txt file. And yes, I did need to remove both; removing just one of them did not allow the file to be fetched. I also tested probably 10 to 20 different configurations, and the results were consistent about when it worked and when it didn’t, so I don’t think it was a random coincidence.
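
    A quick way to see which request methods the filter can match is to replay its pattern outside Apache. The sketch below is only an approximation: it assumes Python’s re module treats this simple pattern the same way Apache’s regex engine does, and the is_blocked helper is a made-up name for illustration. A plain GET, which is what a normal file fetch uses, should not match.

```python
import re

# Pattern copied from the RewriteCond line in the post.
# The [NC] flag in .htaccess means "no case", so we match case-insensitively.
FILTER = re.compile(r"^(TRACE|DELETE|TRACK|DEBUG)", re.IGNORECASE)


def is_blocked(method: str) -> bool:
    """Return True if the filter would answer this request method with 403.

    Hypothetical helper: it only mimics the RewriteCond match, not Apache itself.
    """
    return FILTER.search(method) is not None


if __name__ == "__main__":
    for method in ("GET", "HEAD", "POST", "TRACE", "DEBUG"):
        print(method, "blocked" if is_blocked(method) else "allowed")
```

    Running it prints "allowed" for GET, HEAD and POST and "blocked" for TRACE and DEBUG, so with HEAD removed from the list the filter should only reject the four methods it names.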

    At this point, my assumption is that Google recently changed something in Googlebot so that it now uses TRACE and DEBUG, though I don’t know how to prove this. Maybe it’s a defect in Googlebot that will quietly disappear before many people notice, or maybe it’s permanent; I don’t know. My only real concern is that allowing TRACE and DEBUG might expose my site to unnecessary bot traffic and slow it down more than it should.
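
    One way to check an assumption like this is to look at the server access log, which records the request method and the response status for every hit. The sketch below assumes the log uses Apache’s "combined" format and that the crawler identifies itself with "Googlebot" in the user agent; the function name and the sample lines are made up for illustration and are not real traffic.

```python
import re

# Apache "combined" format (assumed):
# host ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_robots_hits(lines):
    """Yield (method, status) for each Googlebot request to /robots.txt."""
    for line in lines:
        m = LINE.match(line)
        if m and m["path"] == "/robots.txt" and "Googlebot" in m["agent"]:
            yield m["method"], int(m["status"])


# Hypothetical sample lines, invented for this example only.
SAMPLE = [
    '192.0.2.10 - - [17/Mar/2015:10:00:00 +0000] "GET /robots.txt HTTP/1.1" '
    '403 199 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.5 - - [17/Mar/2015:10:00:05 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for method, status in googlebot_robots_hits(SAMPLE):
        print(method, status)
```

    On a real server, the lines would come from the access log file instead of SAMPLE. If every Googlebot hit on /robots.txt shows GET, then the TRACE and DEBUG entries in the filter are unlikely to be the cause.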

    Hopefully, if anyone else using BPS is having the same issue, this solution will help you as well.

    Mike

    #21453

    Mike
    Participant

    I’m sorry. It looks like it might be random after all. After testing back and forth with both websites, I still get the error. Even when I disable the .htaccess file altogether, the fetch seems to succeed the first time but fail on subsequent attempts. So I’m not even sure BPS has anything to do with it. It’s probably something with my host.

    Mike

    #21456

    AITpro Admin
    Keymaster

    No standard/default BPS .htaccess code affects Googlebot retrieving a robots.txt file or crawling and indexing a website. If you added custom .htaccess code to BPS Custom Code, then it is possible that the additional code is affecting Googlebot or other things.

    BPS Troubleshooting Steps:  http://forum.ait-pro.com/forums/topic/read-me-first-free/#bps-free-general-troubleshooting
