Google Not Displaying Site in Results

Robots.txt to blame?

7:33 pm on Apr 2, 2007 (gmt 0)

New User

5+ Year Member

joined:Dec 13, 2006
posts:24
votes: 0


I've got a big-time business site with thousands of pages. A Google search for site:domain.com shows many, many pages. However, I can't get even ONE of my pages to show up in any Google search. Yahoo loves my site, as do MSN and the others.

Could my robots.txt file be to blame?

Here's a snippet of the robots.txt file. Is this OK, or is it keeping Google out?

User-agent: Googlebot
Disallow: /enter.cgi?
Disallow: /enter2.cgi?
Disallow: /enter3.cgi?
Disallow: /enter5.cgi?
Disallow: /enter6.cgi?
Disallow: /enter8.cgi?
Disallow: /aafra/
Disallow: /art-sitemap/
Disallow: /blumberg/
Disallow: /bmwe/
Disallow: /clickbank/
Disallow: /data/
Disallow: /divorceinfo/
Disallow: /expertlounge/
Disallow: /fbroker/
Disallow: /findforms/

There's more, but that's essentially it for the Googlebot User-Agent line.
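
For reference, a quick way to sanity-check rules like these locally is Python's urllib.robotparser. This is only a rough sketch under assumptions: the rules are truncated to a few of the lines quoted above, and the test URLs are made-up stand-ins for real pages on the site.

from urllib import robotparser

# The Googlebot group pasted in as-is (truncated here; the real file has more rules).
rules = """\
User-agent: Googlebot
Disallow: /enter.cgi?
Disallow: /findforms/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Hypothetical URLs -- substitute real ones from the site.
for url in ("http://www.domain.com/",
            "http://www.domain.com/enter.cgi?id=1",
            "http://www.domain.com/findforms/index.html"):
    print(url, "->", "allowed" if rp.can_fetch("Googlebot", url) else "blocked")

With just those rules, the homepage comes back allowed and the two disallowed paths come back blocked, which is what the file appears intended to do. This parser only implements the original standard, though, so Google's own tester is the authority on how Googlebot reads the full file.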

Any ideas?

8:59 pm on Apr 2, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


If a site: search shows many pages, then googlebot is indexing those pages. But only you can tell whether those Disallow lines still permit the pages you want in the index to actually be indexed. It's certainly valid syntax.

9:15 pm on Apr 2, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 6, 2005
posts:1678
votes: 71


P0pcornB0y

Btw, how old is your business site?

2:16 pm on Apr 3, 2007 (gmt 0)

New User

5+ Year Member

joined:Dec 13, 2006
posts:24
votes: 0


My site is very old. Back in 2005 it took a sudden, catastrophic loss of placement in Google. I still get top-notch placement in the other search engines.

I may just need to ask G what the deal is. I know that they're "banning" my site to some degree. I can search for unique text strings on the front page of my site and none of them show up in the G serps.

However (again) I can do a site:domain.com search and I get a couple of thousand pages returned.

My PR is zero. Before the 2005 drop it was 6.

I'm baffled. We're a nationwide services company with many affiliates, and we're kicking butt on the other SEs.

2:22 pm on Apr 3, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Nov 10, 2005
posts: 240
votes: 0


What percentage of the URLs you have in Google's index are tagged supplemental?

9:36 pm on Apr 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 6, 2005
posts:1678
votes: 71


P0pcornB0y

You may wish to file a reinclusion request [mattcutts.com].

You can do that within Google Sitemaps. It's under "Tools": Submit a reinclusion request.

Good luck!

10:54 pm on Apr 3, 2007 (gmt 0)

New User

10+ Year Member

joined:Oct 6, 2005
posts:39
votes: 0


It looks like you have been suppressed using "site-unique bias" like Kinderstart.

Google does not like directories. A site with "thousands of pages" sounds like a directory. It is therefore likely that nothing you can do will fix the problem.

In order to request reinclusion you now have to stipulate that you did something wrong and fixed it. This is hard to do if you don't know what you did wrong. It is even harder if you did not do anything wrong and were suppressed because Google doesn't like your site. Google does not solicit reinclusion requests from sites that they banned or suppressed for editorial reasons.

Good Luck.

11:10 pm on Apr 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


" It looks like you have been suppressed using "site-unique bias" like Kinderstart."

When I looked at their site way back when, they also had a very bad www/non-www issue that amounted to massive subdomain cross-linking.

Ask EFV all about that little problem, or you could ask me: been there, done that, don't care to repeat it.

11:26 pm on Apr 3, 2007 (gmt 0)

Junior Member

joined:Mar 15, 2007
posts:120
votes: 0


In Google Webmaster Tools you can test your robots.txt file very effectively to find out whether anything prevents the site from being crawled. I would suggest you sign up and use that tool before making reinclusion requests, etc.

2:22 pm on Apr 4, 2007 (gmt 0)

New User

10+ Year Member

joined:Oct 6, 2005
posts:39
votes: 0


Google bans sites for using deceptive practices but they also ban or use site-unique bias to suppress sites that they just don't like for undisclosed "editorial" reasons.

If your "nationwide" business is a large business, you are probably in luck. Google pretty much does not ban or use site-unique bias on sites belonging to large businesses strictly for editorial or competitive reasons. (I am not aware of a single case.) There are huge multi-million page data base driven and highly duplicative sites (e.g. Amazon) out there that are heavily indexed in Google. (They will indeed ban a site that uses deceptive practices regardless of business size but will usually rapidly reinstate a large business that corrects the problem. Smaller businesses might have to wait a long time for a sandbox penalty to run out.)

I would recommend the following:

Submit the site for reinclusion. If you have inadvertently triggered a spam trap (seems unlikely), they might tell you and you can fix it and resubmit. If the site was suppressed for editorial reasons you will get no response. The people in the reinclusion dept. certainly don't have the authority to reinclude sites that violate Google's editorial policy, regardless of merit. The reinclusion procedure is clearly only for sites that have fixed a deceptive practice.

If that doesn't work try to contact Matt Cutts. Explain the nature of the business and why the material in the site is non-duplicative and helpful to customers. That has worked for some people. Matt has more discretion regarding merit of your case.

If that doesn't work have someone in your management try to call Google. Writing to Google is probably futile. They likely get 1,000 letters a day that go directly into a special dumpster.

If that doesn't work, you are probably "SOL".

[edited by: tedster at 5:07 pm (utc) on April 4, 2007]

2:20 am on Apr 5, 2007 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38047
votes: 11


> Disallow: /enter.cgi?

There are no guarantees how Google will respond to wildcards in robots.txt. Those are not part of the robots.txt standard, and usage is potluck. I would remove your robots.txt and see if that does it in 30 days.
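
For what it's worth, a small sketch of how much parsers can differ (the /*.cgi rule below is hypothetical, not from the site's file): Python's urllib.robotparser implements only the original standard and treats "*" as a literal character, so a Google-style wildcard rule simply never matches an ordinary URL.

from urllib import robotparser

# Hypothetical wildcard rule written in Google's extended syntax.
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: Googlebot", "Disallow: /*.cgi"])

# A strict, standard-only parser does not expand "*", so this URL is
# reported as allowed; Googlebot, which documents "*" as a wildcard in
# its extension, would treat it as blocked.
print(rp.can_fetch("Googlebot", "http://www.domain.com/enter.cgi?id=1"))

That is the sense in which usage is potluck: the same line can mean different things to different crawlers.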

1:47 pm on Apr 7, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


Brett,

Google accepts an extension to the robot exclusion rules as outlined here:

[google.com...]

7:18 pm on Apr 7, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Using Disallow: /enter would disallow all URLs that start with /enter and would make your robots.txt file simpler.

Would that be useful, or is there an enter4.cgi that you do want to be spidered?
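
As a rough illustration of how far that single prefix reaches (a sketch; /entertainment/ is a made-up path, not a real directory on the site):

from urllib import robotparser

# Single-prefix version of the rule suggested above.
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: Googlebot", "Disallow: /enter"])

# Everything whose path begins with /enter is covered, including paths
# you might not intend to block (the last one here is hypothetical).
for path in ("/enter.cgi?id=1", "/enter5.cgi", "/entertainment/"):
    ok = rp.can_fetch("Googlebot", "http://www.domain.com" + path)
    print(path, "->", "allowed" if ok else "blocked")

All three come back blocked, which is exactly why it matters whether there is an enter4.cgi (or anything else starting with /enter) that you still want crawled.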

10:44 pm on Apr 7, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 9, 2005
posts:1509
votes: 0


If it is not the robots.txt, my guess is it is 'on page' factors.

To make a blanket 'Google does not like directories' statement is not entirely accurate. There are ways to get a large directory indexed and ranked, but you will need to follow the 'clear hierarchy' Google suggests to a T. Make sure you *do not* duplicate titles, descriptions, or headings (unless there is a good reason to), and find a way to make a 'clear' indication of where the most important page(s) are located.

I have been 'playing' with a directory for the last two years, running multiple versions of the software I am using in different sections of the site to gauge search engine responses. The toughest time a SE has with a directory site is finding the 'key' page for a search term, and where all pages are 'weighted' evenly by whatever system you are running, *no* (or very few) pages will rank.

IOW, if you have a 2000-page directory site and try to weight all 2000 pages evenly, you may end up having issues, but if you are running a 2000-page site and clearly define 200 'upper-level' pages that allow visitors to easily locate all 2000 pages, you will probably have an easier time.

Justin