Forum Moderators: martinibuster

Is my robots.txt OK?

     
6:30 pm on Apr 7, 2018 (gmt 0)

New User

joined:Dec 18, 2017
posts: 27
votes: 0


I am using Blogger, and these lines appear in my robots.txt file. Is it OK?
---
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow:
Allow: /
---
Thanks
7:06 pm on Apr 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


What is it you are trying to accomplish by using that file?

Currently it is not really doing anything.
7:12 pm on Apr 7, 2018 (gmt 0)

New User

joined:Dec 18, 2017
posts: 27
votes: 0


Sorry, my English is not perfect.
I want to allow AdSense to crawl my whole site. I mean that I don't want any URL to be blocked for AdSense.
Is it OK?
7:33 pm on Apr 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


Yes, this robots.txt does not block Mediapartners-Google (the AdSense bot).
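
For anyone who wants to check this locally, here is a minimal sketch using Python's standard urllib.robotparser module. The example.blogspot.com URL and the "SomeOtherBot" name are placeholders, not taken from this thread, and the result only shows how Python's parser reads the file, which is not guaranteed to match every crawler.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the first post, verbatim.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow:
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may crawl url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL (an assumption); any path on the site gives the same answer here.
url = "https://example.blogspot.com/2018/04/some-post.html"
print(is_allowed(ROBOTS_TXT, "Mediapartners-Google", url))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

An empty "Disallow:" line means "nothing is disallowed", so both checks report that crawling is allowed.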
8:43 pm on Apr 7, 2018 (gmt 0)

New User

joined:Dec 18, 2017
posts: 27
votes: 0


Thanks
11:57 pm on Apr 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2007
posts: 961
votes: 170


Does anyone even need a robots.txt file if they're not blocking access to anything?
12:24 am on Apr 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


@azlinda - robots.txt does not block anything. You can request some files not to be indexed, but that is not blocking.

Only a few of the active bots even support robots.txt. The major search engines support robots.txt directives (Google, Bing, Yandex, DuckDuckGo, and a couple of others), but most bots do not even request this file. A few bots request it but disobey it. A few others use it to find out where you don't want them to go, then they go there.

The robots.txt file never did become a standard. It tried to be, but there are too many differences in interpretation. Some bots support wildcards (*) and others do not. Some support cross-domain, others do not, etc.
12:33 am on Apr 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10469
votes: 1099


robots.txt is valid for bots that respect it. Otherwise it is ignored by bad bots (who may or may not actually be "bad").

It is good form, as a webmaster, to have one, even if it is blank.
1:42 am on Apr 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2007
posts: 961
votes: 170


@Keyplyr - Sorry, I misspoke. I meant to say if they have nothing that they don't want indexed. Thanks for your response. It was helpful.
6:18 am on Apr 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15892
votes: 876


"if they have nothing that they don't want indexed"
It's important to remember that a robots.txt Disallow--which prevents compliant robots such as search engines from crawling a page--has nothing to do with Noindex. Pages that have never been crawled can still end up in the index, at least in theory, because the search engine has seen links to them. The only way for a search engine to see a "noindex" instruction (whether in a header or an on-page meta) is to let it crawl.
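
To make the two places a "noindex" can live more concrete, here is a rough Python sketch that looks for it in an X-Robots-Tag response header and in a meta robots tag. The function and class names are made up for illustration, and the substring matching is deliberately simplistic. The point is that a crawler can only run a check like this on a page it was allowed to fetch.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.append(attr.get("content", "").lower())


def has_noindex(headers, html):
    """Return True if the response headers or the HTML carry a noindex instruction."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in directive for directive in parser.directives)


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex({}, page))
print(has_noindex({"X-Robots-Tag": "noindex"}, "<p>hello</p>"))
print(has_noindex({}, "<p>hello</p>"))
```

If robots.txt disallows the URL, a compliant crawler never downloads the headers or the HTML, so neither form of the instruction is ever seen.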
11:53 am on Apr 8, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:2091
votes: 370


User-agent: Mediapartners-Google 
Disallow:

User-agent: *
Disallow:
Allow: /

Strictly speaking, this is pointless. You're telling the Mediapartners-Google bot that it's allowed to crawl the whole site, and then you're declaring that every bot is allowed to do so -- which obviously also includes Mediapartners-Google. So you might as well declare only this:

User-agent: *
Disallow:

(Note that the "Allow" directive is not part of the robots.txt standard, but some bots like Googlebot and bingbot do support it.)

Or this:
 

(i.e. an empty robots.txt file)

To be fair, your robots.txt works equally well, and I see Blogger adds in the Mediapartners-Google line by default, presumably because they don't want any of the restrictions that may apply to other bots in succeeding lines to restrict the movements of the Mediapartners-Google bot.
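
Here is a small sketch of why a separate Mediapartners-Google group can matter once the catch-all group does contain a restriction. The "Disallow: /search" rule, the example.blogspot.com URL and the "SomeOtherBot" name are assumptions for illustration, and the check uses Python's standard urllib.robotparser, so it shows that parser's reading of the file rather than any particular crawler's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: the catch-all group restricts /search,
# while the Mediapartners-Google group stays unrestricted.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.blogspot.com/search/label/news"  # placeholder URL

# A bot with its own named group uses that group instead of the "*" group.
adsense_ok = parser.can_fetch("Mediapartners-Google", url)
other_ok = parser.can_fetch("SomeOtherBot", url)
print(adsense_ok, other_ok)
```

With this file the named group wins for Mediapartners-Google, so it may fetch the /search URL while a bot that falls through to the "*" group may not.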
6:16 am on Apr 16, 2018 (gmt 0)

Full Member from US 

10+ Year Member

joined:Apr 11, 2006
posts:244
votes: 21


My robots.txt looks like robzilla's second example, but I also have my sitemap's URL on the third line. Does anyone else do this? I must have read this was a good idea somewhere along the way.
7:43 am on Apr 16, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:2091
votes: 370


It's one of the ways [sitemaps.org] you can tell search engines about your sitemap(s), so that's perfectly fine.
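
As a quick illustration, this sketch reads the Sitemap line back out of such a file with Python's standard urllib.robotparser. The www.example.com sitemap URL is a placeholder, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Allow-everything file with a sitemap URL on the third line (placeholder URL).
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: https://www.example.com/sitemap.xml
"""


def list_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the robots.txt text, or an empty list."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(list_sitemaps(ROBOTS_TXT))
```

The Sitemap line is independent of the User-agent groups, so it can sit anywhere in the file.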
10:05 pm on Apr 16, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15892
votes: 876


"I must have read this was a good idea somewhere along the way."
Google--to name but one search engine, selected wholly at random--recognizes the Sitemap: line.
4:06 am on Apr 21, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4523
votes: 350


The question about What Makes a Good robots.txt File was split off into its own discussion. It can be found here: [webmasterworld.com...]