
Forum Moderators: goodroi


Do you have robot.txt on your site?

     
12:24 am on Apr 11, 2010 (gmt 0)

Senior Member


joined:Dec 9, 2006
posts:784
votes: 16


What's in it? I have this:

User-agent: Mediapartners-Google*
Disallow:


Do I need to put robot.txt on my site?
10:11 am on Apr 11, 2010 (gmt 0)

Senior Member


joined:Nov 27, 2003
posts: 1642
votes: 0


If the robots.txt file would be empty, you don't have to have one, but I tend to keep an empty one in place just to keep missing-file errors out of the error log.

If your only entry is 'this bot has full access' then you don't need one.

(Note that the file is called robots.txt, with an 's'. If you misname the file, the bots will not find it.)
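
A quick way to check both cases above is to feed them to a parser. This is only a rough sketch using Python's standard-library `urllib.robotparser`; the helper name `may_fetch`, the test paths, and the bot name `SomeOtherBot` are made up for illustration, and real crawlers may interpret the file differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The single-entry file from the first post: an empty Disallow blocks nothing.
ONE_ENTRY = "User-agent: Mediapartners-Google*\nDisallow:\n"
# A completely empty robots.txt.
EMPTY = ""

# Hypothetical paths and one hypothetical bot name, just to exercise the parser.
for agent in ("Mediapartners-Google", "Googlebot", "SomeOtherBot"):
    for path in ("/", "/private/page.html"):
        print(agent, path, may_fetch(ONE_ENTRY, agent, path), may_fetch(EMPTY, agent, path))
```

Under this parser neither file blocks anything, which is why the one-entry file adds nothing over an empty one.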
1:31 pm on Apr 11, 2010 (gmt 0)

Senior Member


joined:Dec 29, 2001
posts:1098
votes: 28


User-agent: Googlebot
Disallow: /bunch-of-stuff
Allow: /

User-agent: Googlebot-Mobile
Disallow: /bunch-of-stuff
Allow: /


User-agent: Mediapartners-Google
Disallow: /bunch-of-stuff
Allow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: asterias
Disallow: /

User-agent: aibot
Disallow: /

User-agent: Alexibot
Disallow: /

User-agent: BackDoorBot
Disallow: /

User-agent: BecomeBot
Disallow: /

User-agent: Bloodhound
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: Bullseye
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: CherryPickerSE
Disallow: /

User-agent: CherryPickerElite
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: Crescent Internet ToolPak
Disallow: /

User-agent: combine
Disallow: /

User-agent: Copernic
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: Down2Web
Disallow: /

User-agent: dumbot
Disallow: /

User-agent: e-collector
Disallow: /

User-agent: Email
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: Enterprise_Search
Disallow: /

User-agent: es
Disallow: /

User-agent: EroCrawler
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: Exabot
Disallow: /

User-agent: FairAd Client
Disallow: /

User-agent: Flaming AttackBot
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: Francis
Disallow: /

User-agent: FreeFind
Disallow: /

User-agent: Gaisbot
Disallow: /

User-agent: grub
Disallow: /

User-agent: grub-client
Disallow: /

User-agent: Googlebot
Disallow: /*.gif$

User-agent: Hatena Antenna
Disallow: /

User-agent: Harvest
Disallow: /

User-agent: Heritrix
Disallow: /


User-agent: hloader
Disallow: /

User-agent: htmlgobble
Disallow: /

User-agent: httplib
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: JavaBee
Disallow: /

User-agent: JoBo
Disallow: /

User-agent: Java
Disallow: /

User-agent: Jetbot/
Disallow: /

User-agent: Jetbot
Disallow: /

User-agent: Kenjin Spider
Disallow: /

User-agent: Larbin
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: LinkWalker
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: Mata Hari
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: moget
Disallow: /

User-agent: naver
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Nutch
Disallow: /

User-agent: Offline
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Openbot
Disallow: /

User-agent: Openfind data gathere
Disallow: /

User-agent: Openfind
Disallow: /

User-agent: PerMan
Disallow: /

User-agent: PentonMediabot
Disallow: /

User-agent: psbot
Disallow: /

User-agent: ProPowerBot
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: Robofox
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: SiteVigil
Disallow: /

User-agent: Sohu
Disallow: /

User-agent: tarspider
Disallow: /

User-agent: The Intraformant
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: Teleport Pro
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: Twiceler
Disallow: /


User-agent: URL_Spider_Pro
Disallow: /

User-agent: w3mir
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: webbandit
Disallow: /

User-agent: WebCapture
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: webmirror
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: Webster
Disallow: /

User-agent: Web Downloader
Disallow: /

User-agent: WebFetcher
Disallow: /

User-agent: WebEnhancer
Disallow: /

User-agent: Webster Pro
Disallow: /

User-agent: Wget
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebWasher
Disallow: /

User-agent: webvac
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: WWW-Collector-E
Disallow: /

User-agent: Xenu's Link Sleuth
Disallow: /

User-agent: Xenu's
Disallow: /

User-agent: Zeus
Disallow: /

User-agent: Zeus Link Scout
Disallow: /
5:43 pm on Apr 11, 2010 (gmt 0)

Senior Member from US 

tangor

joined:Nov 29, 2005
posts:6968
votes: 389


Edge... it's been my experience that if Google is allowed in, they assume all the rest of their little bots are allowed too. So if you DON'T want Googlebot-Mobile (for example), you have to disallow that one explicitly.

I generally whitelist (allow) a handful of useful bots and disallow all others. Makes for a much shorter robots.txt!
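
For anyone wondering what the whitelist approach looks like, here is a minimal sketch. The two allowed bots are just examples picked from names in this thread, not a recommendation, and the check uses Python's standard-library `urllib.robotparser`, which may not match how every crawler reads the file. The helper `allowed` is a made-up name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist: two named bots get full access, everyone else gets none.
WHITELIST_ROBOTS = """\
User-agent: Googlebot
Disallow:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(WHITELIST_ROBOTS.splitlines())


def allowed(agent: str, path: str = "/") -> bool:
    """Return True if this parser lets `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


# Bots named in the file fall under their own group; all others hit the '*' group.
for agent in ("Googlebot", "Mediapartners-Google", "HTTrack", "Wget"):
    print(agent, allowed(agent, "/page.html"))
```

The whole file is three groups, no matter how many unwanted bots exist.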
6:56 pm on Apr 12, 2010 (gmt 0)

Senior Member


joined:Dec 29, 2001
posts:1098
votes: 28


Each Googlebot crawls different data/media. There are places where I don't want a particular bot, and other places that are OK.

Different rules for each Google bot.
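
As a rough illustration of per-bot rules, here is a sketch with two groups taken from the file posted earlier in the thread, checked with Python's standard-library `urllib.robotparser`. One assumption to flag loudly: this particular parser uses the first group whose name appears inside the crawler's name, so the more specific `Googlebot-Image` group is listed first here. Real crawlers may pick the matching group differently. The file names in the checks are made up.

```python
from urllib.robotparser import RobotFileParser

# Two groups from the posted file. Googlebot-Image comes first because this
# parser takes the first group whose name is contained in the crawler's name.
PER_BOT_ROBOTS = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot
Disallow: /bunch-of-stuff
Allow: /
"""

parser = RobotFileParser()
parser.parse(PER_BOT_ROBOTS.splitlines())

# Hypothetical paths, used only to show each bot getting its own answer.
checks = [
    ("Googlebot-Image", "/photo.gif"),
    ("Googlebot", "/index.html"),
    ("Googlebot", "/bunch-of-stuff/page.html"),
]
results = {check: parser.can_fetch(*check) for check in checks}
for (agent, path), ok in results.items():
    print(agent, path, "allowed" if ok else "blocked")
```

With this parser the image bot is blocked everywhere, while the main bot is blocked only under `/bunch-of-stuff`.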
 
