
Traffic/rank drop after robots.txt change

     
7:24 pm on Jun 7, 2013 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 11, 2009
posts: 141
votes: 0


On 30 May I disallowed a directory that contained only projects created by my site's users. I also set all users' folders to return a 404 error when accessed.

The next day GWT reported a drop in tracked pages from about 800 to about 50.

After 10 days I started noticing a moderate traffic/rank drop, and it's been getting worse since then.

What was responsible for the drop? The robots.txt policy or the 404 responses?
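
Roughly, the robots.txt change looks like this (the /projects/ name is just a placeholder for the real directory):

User-agent: *
Disallow: /projects/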
9:02 pm on June 7, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24: Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month

joined:Apr 9, 2011
posts:14246
votes: 551


Addressing only the mechanical aspects of the question:

Hunch: You've got an inappropriate use of belt-plus-suspenders. Or belt-plus-braces, depending on dialectal preference. The googlebot is going haywire because it isn't allowed into a directory, so it doesn't know that the subdirectories inside that directory don't exist any more.

I also set all users' folders to return a 404 error when accessed.

Normally you don't have to set a 404 explicitly when you're dealing with physical files. Did you really delete the directories? If so, a 410 will make the googlebot go away faster -- but only if you let it ask for the pages. You can remove the whole directory in GWT at the same time.

If this area was formerly visited by humans, make sure you've got a nice custom 410 page. Or at least use your existing 404 page. The Apache default 410 page is scary.

If the files are still there but you've changed them to restricted access, that should yield a 401 without any more work from your end.
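
If the directories really are gone, a minimal .htaccess sketch along those lines might be (the /projects path and the error page are placeholders, and again the googlebot only sees the 410 if robots.txt lets it request those URLs):

# mod_alias: anything under the removed directory answers 410 Gone
Redirect gone /projects
# reuse the existing friendly error page instead of the bare Apache default
ErrorDocument 410 /errors/404.html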
5:20 am on June 8, 2013 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 11, 2009
posts: 141
votes: 0


My bad, I didn't explain it clearly. What I really did was:

I disallowed crawling with a robots.txt rule AND set .htaccess to return a 403 error when the directories are accessed, using Options -Indexes.

PS: The directories still exist, but they were never visited by humans.
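
So, concretely, something like this (with /projects/ standing in for the real directory name):

# robots.txt at the site root
User-agent: *
Disallow: /projects/

# .htaccess inside the directory: no auto-generated listing,
# so a request for the directory itself returns 403
Options -Indexes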
9:57 am on June 8, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24: Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month

joined:Apr 9, 2011
posts:14246
votes: 551


Gotcha. Not the whole content of the directories, just their indexes. And when your fingers typed 404 in the first post, your brain really meant to say 403.

Did all those subdirectories formerly have automatic index files, so any passing robot could see what's there? By switching off the auto-indexing, you've prevented google and other robots from discovering any new pages in the directories -- unless they learn about them by other means -- but you haven't stopped them from requesting the pages they already know about.

I kinda think it would be safer to slap a global no-index label on the directory. If it isn't practical to add meta tags to all the existing files, Option B is to make a supplementary little htaccess file and put it in your target directory. You may already have one there if that's how you turned off auto-indexing for the directory. Add a line that says

Header set X-Robots-Tag "noindex"

and it will cover everything in the directory.
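
So the supplementary htaccess file in that directory could be as small as this (assuming mod_headers is available, and remembering that a robot only sees the header on files it's still allowed to request):

# .htaccess dropped into the blocked directory
Options -Indexes
# every file served from this directory now carries a noindex instruction
Header set X-Robots-Tag "noindex"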

:: looking vaguely around for someone who knows the answer to the SEO aspect of the question, on which subject I am clueless ::
1:07 pm on June 8, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member, 10+ Year Member

joined:Nov 11, 2007
posts:774
votes: 3


Hey rlopes,

Blocking 750 of 800 indexed pages could drastically affect your site's traffic and ranking. That's 94% of your pages.

If the web pages in the directories you've blocked have a significant number of inbound links, the link juice coming into those blocked pages can no longer be passed around to the other pages on your site. Blocked pages cannot accumulate PageRank/link juice, and since search engines can't crawl the blocked pages to find their outbound links, the internal pages linked to from those blocked pages are no longer being passed PageRank/link juice.

Also, if a lot of those blocked pages were actually ranking for various keyword phrases, then by blocking them from being crawled you will have killed your rankings for those phrases (some less relevant, lower-ranking page may now rank instead, but much lower), and so the traffic will have diminished as well.

Just a guess...
4:12 am on June 10, 2013 (gmt 0)

Junior Member

5+ Year Member

joined:Sept 11, 2009
posts: 141
votes: 0


Thank you lucy24 and ZydoSEO for the responses.

What if I 301 redirected all of these directories to the homepage, so as not to lose the link juice?
6:24 am on June 10, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24: Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month

joined:Apr 9, 2011
posts:14246
votes: 551


What if I 301 redirected all of these directories to the homepage

Aaaack!

There may exist situations where a mass redirect of all requests to the home page is the best solution. I can't remember ever personally hearing of one.

The people at the other end of those links aren't linking to your home page, or to your site generically. They are -- or were -- linking to a specific page.
1:55 pm on June 10, 2013 (gmt 0)

Junior Member

5+ Year Member, Top Contributor of the Month

joined:Mar 16, 2012
posts: 124
votes: 0


were never visited by humans


1. Disallow: in robots.txt (done)

2. Remove URLs from index and cache in WMT (* linked from; see note below)

3. Remove /dir/ from index

Options -Indexes is the root config for my domains

No humans ever visited, so nothing should change visitor-wise.

* Be sure to remove any internal links. For any inbound links to those pages I use:

# serve 410 Gone when the request arrives via a known broken inbound link
RewriteCond %{HTTP_REFERER} peekyou\.com [NC]
RewriteRule ^.* - [G,L]

This code I use initially to get broken-link issues cleared up. I'll have to see what effect it has down the road. In my case peekyou and others have broken links to the site; I could send them to the site map instead, or...
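
If I do end up sending them to the site map instead of a 410, it would be something like this (/sitemap.html is just an example path):

RewriteCond %{HTTP_REFERER} peekyou\.com [NC]
# temporary redirect while testing the effect
RewriteRule ^ /sitemap.html [R=302,L]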
 
