
Thread: Google 'unable to access robots.txt'

  1. #1
    Premium Member
    Join Date
    Jan 2012
    Location
    France. Between Limoges and Brive la Gaillarde.
    Posts
    630
    Blog Entries
    5
    Thanks
    1,766
    Thanked 210 Times in 146 Posts
    Rep Power
    11

    Google 'unable to access robots.txt'

    I do not have robots.txt files on any of my sites. However, over the past week, Webmaster Tools has reported that the gbots didn't crawl the site because they were unable to access robots.txt. Most times, they failed on all of several attempts, but once they succeeded in 1 of 7 tries. Has anyone seen this kind of thing? The e-commerce site discussed in an earlier thread is one of those most affected. Could there be a connection between the two anomalies?
    Last edited by Chabrenas; April 18th, 2012 at 10:33 AM. Reason: title typo

  2. #2
    Top Contributor
    Join Date
    Oct 2010
    Location
    Cotswolds
    Posts
    787
    Thanks
    175
    Thanked 739 Times in 373 Posts
    Rep Power
    23
    Stick a robots.txt on there?!
    Google, despite all the negative comments about them, is quite obedient about using sitemaps, robots.txt, etc. You can also use them to show priority and other information...
    Think of it as a guide telling Google how you want your site indexed.

    Alasdair

  3. The Following User Says Thank You to akirk For This Useful Post:

    Chabrenas (April 18th, 2012)

  4. #3
    Top Contributor
    Join Date
    Jan 2010
    Location
    Canada
    Posts
    1,277
    Thanks
    214
    Thanked 830 Times in 429 Posts
    Rep Power
    27
    I would make things as easy as possible for Gbot if I were you.

    Just add a generic robots.txt file that allows access to everything Google can get to, assuming you don't have folders and such that you're concerned about being indexed (which you shouldn't, seeing as you hadn't taken the time to exclude them via a robots.txt file before).

    So something like:

    User-agent: *
    Disallow:
    So basically: for all robots, nothing is disallowed.
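
For anyone who wants to confirm what those two lines mean before uploading them, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the `example.com` URL are placeholders made up for illustration. Python's parser is not Google's, so treat the result as a sanity check rather than proof of how Googlebot will behave.

```python
from urllib.robotparser import RobotFileParser

# The two-line "allow everything" file quoted in the post above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder host, not a site from this thread.
print(is_allowed(ALLOW_ALL, "Googlebot", "http://example.com/any/page.html"))  # prints True
```

Changing the second line to `Disallow: /` flips the answer to False, which is the opposite (block everything) rule.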

  5. The Following User Says Thank You to tke71709 For This Useful Post:

    Chabrenas (April 18th, 2012)

  6. #4
    Premium Member
    Join Date
    Jan 2012
    Location
    France. Between Limoges and Brive la Gaillarde.
    Posts
    630
    Blog Entries
    5
    Thanks
    1,766
    Thanked 210 Times in 146 Posts
    Rep Power
    11
    Thanks. I usually give Google sitemaps, but didn't think I needed to provide robots.txt.

    However, I've just uploaded robots.txt and checked that mysite.com/robots.txt exists and contains
    User-agent: *
    Disallow:
    Then I asked Webmaster Tools several times to 'Fetch as Googlebot'. Same result: 'unreachable robots.txt'. I checked both www and non-www URLs in my browser: the file is accessible.

    Since the crawl errors file showed that on one occasion the bot succeeded once in 7 tries, I wonder if it is actually a case of it failing to access the site at all. A few days ago, I couldn't access it for 18 hours, although people in other countries could.
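
One way to separate "the file is missing" from "the server did not answer" is to request robots.txt directly and look at the status code. The sketch below is only an illustration: `robots_status` and `describe` are made-up helper names, the labels are my own, and sending a Googlebot-style User-Agent from your own machine does not reproduce Google's network path. As I understand it, crawlers generally treat a 404 as "no restrictions", while timeouts and 5xx responses are what show up as an unreachable robots.txt.

```python
import urllib.error
import urllib.request

# User-Agent string in the style Googlebot sends. Using it here only changes
# the header; the request still comes from your own network.
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def robots_status(url, user_agent=BOT_UA, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with a 4xx or 5xx code
    except OSError:
        return None  # DNS failure, refused connection, or timeout


def describe(status):
    """Map a status code (or None) to a rough label. Labels are illustrative."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "fetched"
    if 400 <= status < 500:
        return "missing"
    if status >= 500:
        return "server error"
    return "other"

# Usage (performs a real network request, so it is left commented out):
# print(describe(robots_status("http://mysite.com/robots.txt")))
```

Running the check several times, and from more than one network if possible, would show whether the failures are intermittent or tied to location, which is what the 1-in-7 success rate and the 18-hour outage hint at.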

    I see that every time I ask Tools to do a fetch, it counts down the 500 'fetches remaining'. Is this a function of a free version of WMT?
    Last edited by Chabrenas; April 18th, 2012 at 11:43 AM.

  7. #5
    Administrator
    Join Date
    Jan 2010
    Location
    Essex, UK
    Posts
    7,292
    Blog Entries
    30
    Thanks
    3,909
    Thanked 2,653 Times in 1,503 Posts
    Rep Power
    101
    In what program did you create the robots.txt? Word and similar word processors are useless for this. You need a plain-text editor such as Notepad++ or even Notepad. Further, try running it through a validator (example) for other errors. Invisible characters in the file, the format in which it was uploaded, misspelling the name as robot.txt, etc., are common errors.
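
The invisible-character point can be checked mechanically. This is a rough sketch, assuming you have the uploaded file's raw bytes to hand. `find_problems` is a hypothetical helper, and the checks (byte order mark, non-ASCII bytes, stray control characters) are just the usual suspects, not a full validator.

```python
UTF8_BOM = b"\xef\xbb\xbf"
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return


def find_problems(raw):
    """Return a list of human-readable warnings about raw robots.txt bytes."""
    problems = []
    if raw.startswith(UTF8_BOM):
        problems.append("UTF-8 byte order mark at start of file")
        raw = raw[len(UTF8_BOM):]
    # Bytes above 0x7F usually mean smart quotes, non-breaking spaces, etc.
    non_ascii = sorted({byte for byte in raw if byte > 0x7F})
    if non_ascii:
        problems.append("non-ASCII bytes: " + " ".join(f"0x{b:02x}" for b in non_ascii))
    # Control characters other than tab and line endings should not be there.
    control = sorted({byte for byte in raw if byte < 0x20 and byte not in ALLOWED_CONTROL})
    if control:
        problems.append("control bytes: " + " ".join(f"0x{b:02x}" for b in control))
    return problems


print(find_problems(b"User-agent: *\nDisallow:\n"))  # prints []
```

To run it on a real file, read the file in binary mode, for example `find_problems(open("robots.txt", "rb").read())`, so that the editor's encoding is not silently hidden by text decoding.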

  8. #6
    Premium Member
    Join Date
    Jan 2012
    Location
    France. Between Limoges and Brive la Gaillarde.
    Posts
    630
    Blog Entries
    5
    Thanks
    1,766
    Thanked 210 Times in 146 Posts
    Rep Power
    11
    Good points, but I use Notepad++ all the time. G didn't say the robots.txt file had invalid content, it said it had failed to access it. Take a look at the file for yourself: http://www.ropeysoles.com/robots.txt

    I've just checked GWT again, in case it was working from cache (which it shouldn't do, of course). Still the same result.

    Tomorrow I'll check Awstats to see if the gbot visit is recorded or whether there were any 500 errors. I've had 2,000 of those in the past 18 days - is that about normal for around 600 pages/4,000 hits a day?
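
On the figures quoted, 2,000 errors against roughly 18 x 4,000 = 72,000 hits would be about 2.8% of requests. If the raw access log is available alongside Awstats, the robots.txt requests can be counted by status code directly. The sketch below assumes the Apache combined log format. The sample lines are invented for illustration, not taken from the site's logs, and `robots_statuses` is a made-up helper name.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it.
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

# Invented sample lines in Apache combined log format (illustration only).
SAMPLE = [
    '66.249.66.1 - - [18/Apr/2012:09:01:10 +0200] "GET /robots.txt HTTP/1.1" 500 312 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [18/Apr/2012:09:01:40 +0200] "GET /robots.txt HTTP/1.1" 500 312 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [18/Apr/2012:09:02:10 +0200] "GET /robots.txt HTTP/1.1" 200 24 "-" "Googlebot/2.1"',
    '203.0.113.7 - - [18/Apr/2012:09:02:15 +0200] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]


def robots_statuses(log_lines):
    """Count HTTP status codes for requests to /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("status")] += 1
    return counts


print(dict(robots_statuses(SAMPLE)))  # prints {'500': 2, '200': 1}
```

If the log shows no robots.txt requests at all around the times Webmaster Tools reports failures, that would suggest the requests never reached the web server, which points at DNS, the network, or a firewall rather than the file.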

  9. #7
    Established Member
    Join Date
    Mar 2012
    Posts
    106
    Thanks
    70
    Thanked 122 Times in 66 Posts
    Rep Power
    4
    This sounds like just a WMT anomaly, the kind that often sorts itself out given a few days, but one other thing to check is whether you have any bot/IP blocking rules in .htaccess, firewalls and so on.
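
A quick way to look for such rules is to scan .htaccess for the Apache directives most often used to block by IP or User-Agent. This is only a sketch: the pattern list is my own shortlist and will not catch everything, the sample file contents are hypothetical, and `suspect_rules` is a made-up helper name. Firewall rules live outside .htaccess and are not covered.

```python
import re

# Directives commonly used for IP or User-Agent blocking (not exhaustive).
SUSPECT = re.compile(
    r"^\s*(deny\s+from|require\s+not|setenvif(nocase)?\s+user-agent|"
    r"browsermatch(nocase)?|rewritecond\s+%\{(http_user_agent|remote_addr)\})",
    re.IGNORECASE,
)

# Hypothetical .htaccess contents, for illustration only.
SAMPLE = """\
# block some scrapers
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (badbot|scraper) [NC]
RewriteRule .* - [F]
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
"""


def suspect_rules(htaccess_text):
    """Return (line_number, line) pairs that look like blocking rules."""
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip comments
        if SUSPECT.search(line):
            hits.append((number, line.strip()))
    return hits


for number, line in suspect_rules(SAMPLE):
    print(number, line)
```

Any line it reports still needs reading by eye, since a rule that blocks one bad bot by name is harmless here while a broad IP range or a loose User-Agent pattern could catch Googlebot.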

  10. The Following 2 Users Say Thank You to monty For This Useful Post:

    Chabrenas (April 19th, 2012), Clinton (April 19th, 2012)

  11. #8
    Premium Member
    Join Date
    Jan 2012
    Location
    France. Between Limoges and Brive la Gaillarde.
    Posts
    630
    Blog Entries
    5
    Thanks
    1,766
    Thanked 210 Times in 146 Posts
    Rep Power
    11
    Thanks, Monty. No, I don't block anything in .htaccess, etc. Nothing has changed since the time when all was running smoothly.

  12. #9
    Premium Member
    Join Date
    Oct 2010
    Location
    East Yorkshire
    Posts
    1,687
    Blog Entries
    6
    Thanks
    286
    Thanked 1,472 Times in 759 Posts
    Rep Power
    46
    Them gargle bots "time out" too quick, no frickin patience, worse'n like a MSN download.

    Them ther things want all the world signed up on a plate waitin an ready for them. If they get served one in seven times, it show nuthin is broke except frikin Gargle. Rest of the time the dang things got no patience. Typical gargle behaviour - sod yr customers, we was here second and we don't wait.

    If we ever do a group project on E-P, we have to name it "sod off Gargle" - whatever the real subject. They is a pack of winkers, as grynge would siy ..
