
Sitemaps, Meta Data, and robots.txt Forum

    
Is my robots.txt file ok?
Bubzeebub



 
Msg#: 4613152 posted 12:02 pm on Sep 27, 2013 (gmt 0)

I know very little about robots.txt files, but while doing some analytics on my site I discovered that it has one. It contains the following:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://example.com/sitemap.xml.gz


Is this ok? I've been having major issues with Google AdSense suddenly not being able to show data, and I'm not sure if this is why...

[edited by: phranque at 1:40 am (utc) on Sep 28, 2013]
[edit reason] exemplified domain [/edit]

 

not2easy

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



 
Msg#: 4613152 posted 8:17 pm on Sep 27, 2013 (gmt 0)

Normally there is nothing in those folders worth sharing with Google, so there is no reason not to block them and no advantage to unblocking them. Those lines in your robots.txt are very unlikely to be involved in AdSense display issues. In your Google Webmaster Tools (GWT) account you can ask it to test your robots.txt, and you can see whether something in there is preventing Google from accessing URLs you want it to find. It may show hundreds of "Blocked" URLs, but those are usually URLs you don't want to share anyway.
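For a quick local check alongside the GWT tester, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are copied from the first post; the test URLs and the `is_allowed` helper are made up for illustration, and this parser is not guaranteed to interpret every rule exactly the way Google's crawlers do.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in the first post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://example.com/sitemap.xml.gz
"""


def is_allowed(url, user_agent="Googlebot", robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs, for illustration only.
for url in (
    "http://example.com/",
    "http://example.com/2013/09/some-post/",
    "http://example.com/wp-admin/options.php",
    "http://example.com/wp-includes/js/jquery/jquery.js",
):
    print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

With these rules, only the two URLs under /wp-admin/ and /wp-includes/ should come back as blocked.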

Have you tried to see if AdSense has any known issues that might apply to your setup?

Bubzeebub



 
Msg#: 4613152 posted 2:52 am on Sep 28, 2013 (gmt 0)

I think, but can't confirm, that they might have blocked me after I used a blended link unit scheme. However, I've been unable to get any response from Google to help with this. (Guess that's what they think of small-fry publishers.) I don't have a fancy setup, just a simple blog. Do you suggest deleting the robots.txt file?

not2easy

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



 
Msg#: 4613152 posted 4:09 am on Sep 28, 2013 (gmt 0)

Your robots.txt file is where you list the files and folders you do not want crawled, and also where you list the location of your sitemaps to tell robots which pages you do want them to crawl and index. It is your tool for telling robots what you do and don't want them to do on your site. Good robots will generally follow your rules, though they are known to forget them when they follow links. Bad robots ignore your rules and do as they please. Your site won't break if you don't have one, and there is little chance that robots.txt has much to do with getting you "blocked" (or unblocked). If you believe your site has been given a manual penalty, that information would be available to you in your Google Webmaster Tools account.
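To make the two jobs of the file concrete, here is a rough Python sketch that pulls the "don't crawl" paths and the sitemap locations out of the robots.txt from the first post. The `read_directives` helper is hypothetical and deliberately simplified: it ignores which User-agent group a Disallow line belongs to, which is only safe here because the file has a single `User-agent: *` group.

```python
# Rules copied from the robots.txt quoted in the first post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://example.com/sitemap.xml.gz
"""


def read_directives(robots_txt):
    """Return (disallowed_paths, sitemap_urls) found in a robots.txt body.

    Simplification: User-agent grouping is ignored.
    """
    disallowed, sitemaps = [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)  # split on the first colon only
        field, value = field.strip().lower(), value.strip()
        if field == "disallow" and value:
            disallowed.append(value)
        elif field == "sitemap":
            sitemaps.append(value)
    return disallowed, sitemaps


disallowed, sitemaps = read_directives(ROBOTS_TXT)
print("Asked not to crawl:", disallowed)
print("Sitemaps offered:", sitemaps)
```

Nothing in the file enforces anything; it is only a list of requests, which is why well-behaved robots follow it and bad ones can ignore it.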

lucy24

WebmasterWorld Senior Member Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4613152 posted 9:18 am on Sep 28, 2013 (gmt 0)

Is this ok?

Sure, if your purpose is to tell robots that yes indeedy, I have a directory called /wp-admin/ that contains good stuff.

Never rely on robots.txt to protect really critical areas. Don't mention the directory's existence at all, and issue a blanket 403 to anyone who isn't you.
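The "blanket 403" idea can be sketched in Python as a small WSGI wrapper. On a typical WordPress host this would more likely live in the web server configuration, so treat this purely as an illustration of the logic. The protected prefix, the allowed address (a documentation-range IP standing in for "you"), and all function names are assumptions.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical values, for illustration only.
PROTECTED_PREFIXES = ("/wp-admin/",)
ALLOWED_ADDRESSES = {"203.0.113.7"}  # stands in for the site owner's address


def forbid_outsiders(app):
    """Wrap a WSGI app so protected paths return 403 to unknown addresses."""
    def wrapper(environ, start_response):
        path = environ.get("PATH_INFO", "")
        client = environ.get("REMOTE_ADDR", "")
        if path.startswith(PROTECTED_PREFIXES) and client not in ALLOWED_ADDRESSES:
            body = b"403 Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return wrapper


def site(environ, start_response):
    """Stand-in for the real site: answers every request with 200."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def status_for(path, client):
    """Send one fake request through the wrapped app and return its status."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["REMOTE_ADDR"] = client
    seen = []
    forbid_outsiders(site)(environ, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("/wp-admin/index.php", "198.51.100.9"))  # a stranger
print(status_for("/wp-admin/index.php", "203.0.113.7"))   # the owner
```

Unlike a Disallow line, this refuses the request outright and does not advertise the directory to anyone reading robots.txt. Behind a proxy, `REMOTE_ADDR` would be the proxy's address, so a real setup would need more care.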

tangor

WebmasterWorld Senior Member Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4613152 posted 11:28 pm on Sep 28, 2013 (gmt 0)

Robots.txt is a mixed blessing. The good bots will follow. The bad bots will say, "Cool, here we strike next!"

Above is tongue-in-cheek to a certain extent... I do disallow many bots at the root level and (surprise!) most of them comply. Those that don't end up in .htaccess.

That said, on the two WordPress sites I manage, you won't find any mention of wp in the robots.txt... and any requests for those system files that do not originate from the site itself are refused (403).

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved