Pages Crawled But Not Indexed



9:04 pm on Jan 12, 2005 (gmt 0)

10+ Year Member

We recently restructured our site. The domain name stayed the same as did the content. However, all the page URLs were changed, and consequently so were all the internal links.

We launched the restructured site over four months ago. Although Googlebot seems to visit often, Google hasn't indexed many of our "new" pages -- most of our listings in the SERPs still show the old URLs.

Four months seems especially long. Has anyone else experienced such a long lag with Google?


10:22 pm on Jan 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Have you done 301 redirects from the old pages to the new? If not, that's your problem. Google doesn't index the new pages because it sees them as duplicates of the old URLs it already has. Definitely redirect ASAP.
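
For anyone unsure what a 301 looks like in practice, here is a minimal sketch in Python (WSGI). It is not the poster's actual setup: the paths in REDIRECTS are hypothetical, since the thread never lists the real URLs. The only point is that a request for an old URL is answered with status 301 and a Location header naming the new one.

```python
# Hypothetical old-to-new URL map; the real paths are not given in the thread.
REDIRECTS = {
    "/old-products.html": "/catalog/products.html",
    "/old-about.html": "/company/about.html",
}


def app(environ, start_response):
    """Minimal WSGI app: permanent redirect for moved URLs, 404 otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 marks the move as permanent; a 302 would mark it as temporary.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]


def respond(path):
    """Call the app directly, without a server, and return (status, headers)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status
        seen["headers"] = dict(headers)

    app({"PATH_INFO": path}, start_response)
    return seen["status"], seen["headers"]


print(respond("/old-products.html"))
```

On Apache the same effect is usually achieved with mod_alias or mod_rewrite rules rather than application code. Whichever route you take, what matters is the status line the crawler actually receives.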


11:04 pm on Jan 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

If, as I once was, you are on an ISP where you can't do 301s, add a META ROBOTS tag with "NOINDEX, FOLLOW" to all your old pages so that when Google comes back, it drops them.
And make sure the old pages link to the new pages, and are not simply interlinked amongst themselves, or the new pages won't have enough ways in for Google to call by very often.
I did this quite successfully - lost no pages on the way - but it took quite a time for the changeover to complete!
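
A rough way to confirm that an old page really carries the tag is to parse its HTML and read the robots directives back. This is only a sketch using Python's standard html.parser; the OLD_PAGE markup and the /new-page.html link are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of lower-cased robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical old page: marked NOINDEX, FOLLOW and linking to its replacement.
OLD_PAGE = (
    '<html><head><meta name="robots" content="NOINDEX, FOLLOW">'
    "<title>Old page</title></head>"
    '<body><a href="/new-page.html">This page has moved</a></body></html>'
)

print(robots_directives(OLD_PAGE))
```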


4:18 am on Jan 13, 2005 (gmt 0)

10+ Year Member

Dear Friends,

I'm facing the same problem:

Googlebot is crawling my site but not indexing it.

Last week I made a lot of changes. I implemented mod_rewrite for my PHP pages, so index.php has become index.html, module.php has become modules.html, and so on.
The page names have remained the same in most cases. I have also implemented dynamic title modification. That means lots of changes across the whole site.

After the modification I watched Google's deep crawl of my website, and I was very happy to see that it had indexed my new, modified pages.
But that lasted only 1 day. After 1 day Google dropped all my new pages from its index and put all my previous links back in the search results.

I can see Google crawling my website more than 10 times each day and fetching a large number of pages, consuming a great amount of my bandwidth, but it is not indexing the new pages at all.

I don't have the previous pages anymore, so it is not possible to put NOINDEX in their meta tags.
And I don't know about 301 redirects. Can I use them with my PHP pages? If so, how?
I have become frustrated with this situation.
What should I do?



2:48 pm on Jan 13, 2005 (gmt 0)

10+ Year Member

Yep, I've got a 301 redirect in place. It redirects the pages just fine but I wonder if the response.status isn't getting conveyed to Googlebot. Would the log files shed any light on that?
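
The logs can answer that, provided they record the status code. Below is a minimal sketch, assuming an Apache-style combined log format. It tallies which status Googlebot was served for each path, so a redirect script that is quietly sending 302 (or 200) instead of 301 shows up at once. The SAMPLE lines, paths, and IP addresses are invented for illustration and are not taken from anyone's real log.

```python
import re
from collections import Counter

# Matches the request, status, and user-agent parts of a combined-format line.
LOG_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_statuses(lines):
    """Count (path, status) pairs for requests whose agent names Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            counts[(match.group("path"), match.group("status"))] += 1
    return counts


# Invented sample data (documentation-range IPs, hypothetical paths).
SAMPLE = [
    '192.0.2.1 - - [13/Jan/2005:10:00:00 -0800] "GET /old-page.html HTTP/1.1" '
    '301 0 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.1 - - [13/Jan/2005:10:00:05 -0800] "GET /new-page.html HTTP/1.1" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.2 - - [13/Jan/2005:10:01:00 -0800] "GET /old-page.html HTTP/1.1" '
    '302 0 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

for (path, status), hits in sorted(googlebot_statuses(SAMPLE).items()):
    print(path, status, hits)
```

If the old URLs show anything other than 301 next to Googlebot's requests, the redirect script is the first place to look.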

All the old pages are gone, so the noindex, follow idea wouldn't work here... although that's a good one and I'll keep it in mind for future projects. (Thanks, DerekH.)

I'm still at a loss. We have the redirect script, we have a site map, we have lots of interlinking amongst the pages ...


4:25 am on Jan 14, 2005 (gmt 0)

10+ Year Member

this is what I tracked today!

Online? User IP Address Host Name Last Viewed Hits
saint 2005-01-13 20:18:52 9
Yes cache130.156ce.maxonline.com.sg 2005-01-13 20:14:41 2
Yes 2005-01-13 20:05:56 5
Yes host81-153-137-0.range81-153.btcentralplus.com 2005-01-13 20:00:50 1
Yes user-0c99qjc.cable.mindspring.com 2005-01-13 19:56:47 3
rtools3.yst.corp.yahoo.com 2005-01-13 19:19:06 1
crawl-66-249-64-79.googlebot.com 2005-01-13 19:18:42 1
lj1353.inktomisearch.com 2005-01-13 19:14:12 1
adsl-69-225-193-118.dsl.scrm01.pacbell.net 2005-01-13 19:10:10 2
d211-29-175-138.dsl.nsw.optusnet.com.au 2005-01-13 19:06:13 2
lj1020.inktomisearch.com 2005-01-13 18:46:20 1
crawl-66-249-71-72.googlebot.com 2005-01-13 18:40:07 1
crawl-66-249-66-75.googlebot.com 2005-01-13 18:36:14 17
HSE-Toronto-ppp295725.sympatico.ca 2005-01-13 18:11:07 2
crawl-66-249-71-40.googlebot.com 2005-01-13 18:10:00 1
crawl-66-249-71-32.googlebot.com 2005-01-13 18:09:47 1
d198-53-226-252.abhsia.telus.net 2005-01-13 17:58:31 2
crawl-66-249-64-37.googlebot.com 2005-01-13 17:50:41 2
2005-01-13 17:44:39 17
c-24-17-93-170.client.comcast.net 2005-01-13 16:52:42 1
crawl-66-249-71-73.googlebot.com 2005-01-13 16:48:52 2
c-144fe353.545-1-64736c10.cust.bredbandsbolaget.se 2005-01-13 16:48:38 1
ppp- 2005-01-13 16:36:49 1
lutn-cache-5.server.ntli.net 2005-01-13 16:35:57 1
crawl-66-249-71-28.googlebot.com 2005-01-13 16:34:30 1
d57-198-124.home.cgocable.net 2005-01-13 16:16:16 1
dorm83194.dorm-net.louisville.edu 2005-01-13 16:10:42 1
cs214310.pws.uscs.susx.ac.uk 2005-01-13 15:54:46 1
crawl-66-249-64-66.googlebot.com 2005-01-13 15:10:35 1
c211-30-34-71.rivrw6.nsw.optusnet.com.au 2005-01-13 15:08:07 2
mfb.xs4all.nl 2005-01-13 14:55:33 4
dD576522C.access.telenet.be 2005-01-13 14:47:13 2
nperspectief01.nieuw-perspectief.nl 2005-01-13 14:41:00 1
2005-01-13 14:39:12 10
host- 2005-01-13 14:38:51 1
lj2066.inktomisearch.com 2005-01-13 14:34:01 1

Google has visited more than once and has crawled my pages, but it is not indexing them. I removed all my .php files from the previous site with Google's removal tool, and that worked well. But why is it not indexing my new pages now?

Am I penalized for duplicate content? If I am penalized for duplicate content, why is the bot still visiting?

Any answer or suggestion would be great. I can't see any solution so far.


5:11 am on Jan 14, 2005 (gmt 0)

10+ Year Member

I don't understand at all.
My site is showing 2-month-old results marked "Supplemental Result". Googlebot visited my site 6 days ago but there is still no update. The cache date shows 1969. It has not been updating for 2 months now. I think it's serious.

Can anyone please help me out?
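
On the 1969 cache date: one plausible reading, not confirmed by anything in this thread, is that the stored timestamp is simply zero (unset). A Unix timestamp of 0 is midnight UTC on 1 January 1970, which displays as 31 December 1969 in any timezone west of UTC. The fixed -8 hour offset below is an assumption chosen only to illustrate the effect.

```python
from datetime import datetime, timedelta, timezone

# Assumed display timezone: a fixed UTC-8 offset, for illustration only.
PACIFIC = timezone(timedelta(hours=-8))

# Timestamp 0 is the Unix epoch; rendered west of UTC it falls in 1969.
zero = datetime.fromtimestamp(0, tz=PACIFIC)
print(zero.isoformat())
```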



11:22 pm on Jan 14, 2005 (gmt 0)

10+ Year Member

I can see lots of visits from Google in today's records as well.

But when I checked what it crawled, I found that it only requested my web root, the / directory, many times over.

Here are some lines from my raw log file today:
- - [14/Jan/2005:00:25:21 -0800] "GET / HTTP/1.1" 200 10884 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 53938 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:24 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:25 -0800] "GET / HTTP/1.1" 200 51078 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:26 -0800] "GET / HTTP/1.1" 200 10883 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:27 -0800] "GET / HTTP/1.1" 200 53938 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:29 -0800] "GET / HTTP/1.1" 200 10885 "-" "Mediapartners-Google/2.1"
- - [14/Jan/2005:00:25:30 -0800] "GET / HTTP/1.1" 200 51078 "-" "Mediapartners-Google/2.1"

I think GET / means the web root directory. But it has requested this 331 times today. What's the problem? Has it entered a loop?
I'd welcome a reply from the search engine gurus. Please help.
Thanks in advance.

[edited by: coolsaint at 11:32 pm (utc) on Jan. 14, 2005]


11:29 pm on Jan 14, 2005 (gmt 0)

10+ Year Member

Mediapartners-Google/2.1 is not Googlebot. It's the AdSense bot.
Btw, Googlebot is mad! It hasn't updated in 2 months.


12:20 am on Jan 15, 2005 (gmt 0)

10+ Year Member

I don't think you are right, because I think it is Googlebot.

ip :

you can see here for google crawler information :


My website's listing in the Google index has changed more than 5 times over the last month. So when you say Google hasn't updated its index for 2 months, I think that is wrong information.

Can anyone help with this case, please?


12:47 am on Jan 15, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member

If the user agent is Mediapartners-Google/2.1 then that is the adsense bot. No question about it.
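
To separate the two in a log, it is enough to look at the user-agent field. The sketch below classifies each line by its last quoted field. The first three LINES are copied from the log excerpt posted earlier in this thread (IP addresses were already missing there); the fourth, a Googlebot request for /index.html, is a hypothetical line added for contrast.

```python
import re
from collections import Counter


def classify_agent(user_agent):
    """Label a user-agent string as 'adsense', 'googlebot', or 'other'."""
    ua = user_agent.lower()
    if ua.startswith("mediapartners-google"):
        return "adsense"
    if "googlebot" in ua:
        return "googlebot"
    return "other"


def last_quoted_field(line):
    """Return the final double-quoted field of a log line (the user agent)."""
    fields = re.findall(r'"([^"]*)"', line)
    return fields[-1] if fields else ""


LINES = [
    '- - [14/Jan/2005:00:25:21 -0800] "GET / HTTP/1.1" 200 10884 "-" "Mediapartners-Google/2.1"',
    '- - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"',
    '- - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 53938 "-" "Mediapartners-Google/2.1"',
    # Hypothetical line, not from the thread, showing a real Googlebot request.
    '- - [14/Jan/2005:00:30:00 -0800] "GET /index.html HTTP/1.1" 200 10884 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

counts = Counter(classify_agent(last_quoted_field(line)) for line in LINES)
print(dict(counts))
```

User-agent strings can be spoofed, so this only tells you what the client claims to be.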


4:47 pm on Jan 15, 2005 (gmt 0)

10+ Year Member

We lost serps and cache even though we were being crawled daily.

At last check the index seems to be working OK again, but the SERPs are still changing. Have you seen any changes in the last 24 hours?


2:17 am on Jan 16, 2005 (gmt 0)

10+ Year Member

For me, I had 500+ hits on 8th Jan this month, but Google still shows the same old stuff in its cache/index, dated 1969. I have PR 4.


2:35 am on Jan 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Mediapartners-Google/2.1 info

To clear this up, the above bot merely lets AdSense know what the page contains so relevant ads can be shown. Usually a page is already in the main Google index and its content is known; if there is no info, a copy of the page is requested by the separate bot.

This bot DOES NOT pass the details to the main Google database; it is for AdSense only. The usual pattern is that the AdSense bot can retrieve the page almost immediately, as the strain on it is nowhere near as heavy as that of the crawls Googlebot does.

I can't explain the multiple requests for the same page. Could there be something different going on, such as many domains pointing to the same content (bad), or possibly your logs are not recording query strings or session IDs?

Interestingly, it is possible to get AdSense to show what ads would be shown for any page you wish (find out what Google thinks the page is really about; sometimes a surprise). Just look up and install the AdSense preview tool (IE); it will show the adverts associated with the words it thinks the site is about.

Nothing too exciting, and definitely not a way to jump any queues.


2:44 am on Jan 16, 2005 (gmt 0)

10+ Year Member

I changed 100s of dynamic urls/pages because of duplicate content worries...(anything to get some google traffic!)

Didn't do any good because although 2 months later most of them are now correctly indexed, we are still sandboxed to hell...

