We launched the restructured site over four months ago. Although Googlebot seems to visit often, Google hasn't indexed many of our "new" pages -- most of our listings in the SERPs still show the pages with the old URLs.
Four months seems especially long. Has anyone else experienced such a big lag-time w/ G?
I'm facing the same problem:
Googlebot is crawling my site but not indexing it.
Last week I made a lot of changes, including implementing mod_rewrite for my PHP pages,
so that index.php has become index.html, module.php has become modules.html and so on.
The page names have remained the same in most cases. I have also implemented dynamic title modification. That means lots of changes across the whole site.
After the modification I watched Google's deep crawl of my website, and I was very happy to see that it had indexed my new, modified pages.
But that lasted only one day. After one day Google dropped all my new pages from its index and put all my previous links back in the search results.
I can see Google crawling my website more than 10 times each day and fetching a large number of pages, consuming a great deal of my bandwidth, but it is not indexing the new pages at all. What should I do?
I don't have the previous pages anymore, so it is not possible to put noarchive in their meta tags.
And I don't know about 301 redirects. Can I use one with my PHP pages? If so, how?
I have become frustrated with this situation.
What should I do?
cooolsaint
www.aanchol.com
coolsaint@gmail.com
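On the 301 question above: the idea is that a request for an old .php URL gets answered with status 301 and a Location header naming the new .html URL. Here is a minimal sketch of just the mapping logic, written in Python for illustration. The RENAMED table and both function names are made up for this example, and the rule "swap .php for .html unless the page was renamed" is an assumption based on the index.php and module.php examples in the post. On a real PHP site the same status and header would normally be sent with header(), or by a mod_rewrite rule using the R=301 flag.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of pages whose base name changed as well as the extension.
RENAMED = {"module.php": "modules.html"}


def new_location(old_url):
    """Return the new URL for an old .php URL, or None if no redirect applies."""
    parts = urlsplit(old_url)
    head, _, name = parts.path.rpartition("/")
    if not name.endswith(".php"):
        return None
    # Use the explicit rename if there is one, otherwise just swap the extension.
    new_name = RENAMED.get(name, name[:-4] + ".html")
    return urlunsplit(
        (parts.scheme, parts.netloc, head + "/" + new_name, parts.query, parts.fragment)
    )


def redirect_response(old_url):
    """Return (status, headers) for a permanent redirect, or None."""
    target = new_location(old_url)
    if target is None:
        return None
    return 301, {"Location": target}
```

One thing to watch: if the .html URLs are themselves rewritten internally to the .php scripts, the redirect should only fire on the URL the visitor actually requested, otherwise the two rules can end up chasing each other.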
All the old pages are gone, so the noindex, nofollow idea wouldn't work here... altho that's a good one and I'll keep it in mind for future projects. (Thanks, DerekH.)
I'm still at a loss. We have the redirect script, we have a site map, we have lots of interlinking amongst the pages ...
Online? User IP Address Host Name Last Viewed Hits
saint 203.91.159.171 203.91.159.171 2005-01-13 20:18:52 9
Yes 202.156.2.130 cache130.156ce.maxonline.com.sg 2005-01-13 20:14:41 2
Yes 203.91.159.171 203.91.159.171 2005-01-13 20:05:56 5
Yes 81.153.137.0 host81-153-137-0.range81-153.btcentralplus.com 2005-01-13 20:00:50 1
Yes 24.148.234.108 user-0c99qjc.cable.mindspring.com 2005-01-13 19:56:47 3
66.228.164.141 rtools3.yst.corp.yahoo.com 2005-01-13 19:19:06 1
66.249.64.79 crawl-66-249-64-79.googlebot.com 2005-01-13 19:18:42 1
66.196.91.133 lj1353.inktomisearch.com 2005-01-13 19:14:12 1
69.225.193.118 adsl-69-225-193-118.dsl.scrm01.pacbell.net 2005-01-13 19:10:10 2
211.29.175.138 d211-29-175-138.dsl.nsw.optusnet.com.au 2005-01-13 19:06:13 2
66.196.90.36 lj1020.inktomisearch.com 2005-01-13 18:46:20 1
66.249.71.72 crawl-66-249-71-72.googlebot.com 2005-01-13 18:40:07 1
66.249.66.75 crawl-66-249-66-75.googlebot.com 2005-01-13 18:36:14 17
64.231.34.119 HSE-Toronto-ppp295725.sympatico.ca 2005-01-13 18:11:07 2
66.249.71.40 crawl-66-249-71-40.googlebot.com 2005-01-13 18:10:00 1
66.249.71.32 crawl-66-249-71-32.googlebot.com 2005-01-13 18:09:47 1
198.53.226.252 d198-53-226-252.abhsia.telus.net 2005-01-13 17:58:31 2
66.249.64.37 crawl-66-249-64-37.googlebot.com 2005-01-13 17:50:41 2
196.40.5.138 196.40.5.138 2005-01-13 17:44:39 17
24.17.93.170 c-24-17-93-170.client.comcast.net 2005-01-13 16:52:42 1
66.249.71.73 crawl-66-249-71-73.googlebot.com 2005-01-13 16:48:52 2
83.227.79.20 c-144fe353.545-1-64736c10.cust.bredbandsbolaget.se 2005-01-13 16:48:38 1
210.86.223.133 ppp-210.86.223.133.revip.asianet.co.th 2005-01-13 16:36:49 1
62.252.64.16 lutn-cache-5.server.ntli.net 2005-01-13 16:35:57 1
66.249.71.28 crawl-66-249-71-28.googlebot.com 2005-01-13 16:34:30 1
24.57.198.124 d57-198-124.home.cgocable.net 2005-01-13 16:16:16 1
136.165.83.194 dorm83194.dorm-net.louisville.edu 2005-01-13 16:10:42 1
139.184.36.15 cs214310.pws.uscs.susx.ac.uk 2005-01-13 15:54:46 1
66.249.64.66 crawl-66-249-64-66.googlebot.com 2005-01-13 15:10:35 1
211.30.34.71 c211-30-34-71.rivrw6.nsw.optusnet.com.au 2005-01-13 15:08:07 2
213.84.221.11 mfb.xs4all.nl 2005-01-13 14:55:33 4
213.118.82.44 dD576522C.access.telenet.be 2005-01-13 14:47:13 2
62.234.133.74 nperspectief01.nieuw-perspectief.nl 2005-01-13 14:41:00 1
213.190.147.194 213.190.147.194 2005-01-13 14:39:12 10
81.10.40.176 host-81.10.40.176.tedata.net 2005-01-13 14:38:51 1
68.142.249.76 lj2066.inktomisearch.com 2005-01-13 14:34:01 1
Google has visited me more than once and has crawled my pages, but it is not indexing them. I removed all the .php files of my previous website using Google's removal tool, and that worked well. But why is it not indexing my new pages now?
Am I being penalized for duplicate content? If I am penalized for duplicate content, why is the bot still visiting me?
Any answer or suggestion would be great. I can't see any solution so far.
But when I checked what it crawled, I saw that it had only requested my web root, that is the / directory, many times over.
Here are a few lines from my raw log file today.
66.249.66.200 - - [14/Jan/2005:00:25:21 -0800] "GET / HTTP/1.1" 200 10884 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 53938 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:24 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:25 -0800] "GET / HTTP/1.1" 200 51078 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:26 -0800] "GET / HTTP/1.1" 200 10883 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:27 -0800] "GET / HTTP/1.1" 200 53938 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:29 -0800] "GET / HTTP/1.1" 200 10885 "-" "Mediapartners-Google/2.1"
66.249.66.200 - - [14/Jan/2005:00:25:30 -0800] "GET / HTTP/1.1" 200 51078 "-" "Mediapartners-Google/2.1"
I think GET / means the web root directory. But it has requested this 331 times today. What's the problem? Has it entered a loop?
I'd welcome a reply from the search engine gurus. Please help.
Thanks in advance.
[edited by: coolsaint at 11:32 pm (utc) on Jan. 14, 2005]
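To check a figure like the 331 requests for / mentioned above, the raw log can be tallied by user agent and requested path. This is a rough sketch that assumes the log is in the combined format shown in the lines above; the LOG_LINE pattern and the count_requests name are made up for this example, and lines that do not match the pattern are simply skipped.

```python
import re
from collections import Counter

# Pattern for one line in the combined log format seen in the excerpt above.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_requests(lines):
    """Count requests per (user agent, path) pair; unparseable lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("agent"), match.group("path"))] += 1
    return counts


# Two of the lines quoted in the post, used as sample input.
sample = [
    '66.249.66.200 - - [14/Jan/2005:00:25:21 -0800] "GET / HTTP/1.1" 200 10884 "-" "Mediapartners-Google/2.1"',
    '66.249.66.200 - - [14/Jan/2005:00:25:23 -0800] "GET / HTTP/1.1" 200 10886 "-" "Mediapartners-Google/2.1"',
]

for (agent, path), hits in count_requests(sample).most_common():
    print(hits, agent, path)
```

Run over the whole day's log, this would show at a glance whether the repeated requests for / all come from one user agent.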
hostname:
crawl-66-249-66-200.googlebot.com
ip :
66.249.66.200
You can find Google crawler information here:
[google-dance-tool.com...]
Can anyone help on this case please?
To clear this up, the above bot is merely for letting AdSense know what the page contains so relevant ads can be shown. Usually a page will already be in the main Google index and its content is already known; if there is no info, a copy of the page is requested by the separate bot.
This bot DOES NOT pass the details to the main Google database; it is for AdSense only. The usual pattern is that the AdSense bot can retrieve the page almost immediately, as the strain on it is nowhere near as heavy as that from the crawls Googlebot does.
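When reading logs it helps to separate the two bots by user agent rather than by hostname, since the IP quoted earlier in the thread resolves to a googlebot.com hostname even though its requests carry the Mediapartners-Google agent. A small sketch of that split follows; the crawler_kind name and its three labels are made up for this example, and matching on these two substrings is an assumption about how the agents identify themselves.

```python
def crawler_kind(user_agent):
    """Label a user-agent string as the AdSense bot, the search crawler, or other."""
    ua = user_agent.lower()
    # Check the AdSense agent first so it is never mistaken for the search crawler.
    if "mediapartners-google" in ua:
        return "adsense"
    if "googlebot" in ua:
        return "search"
    return "other"


print(crawler_kind("Mediapartners-Google/2.1"))
```

Only requests labelled "search" here say anything about crawling for the main index.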
I can't explain the multiple requests for the same page. Could it be that something differs between them, such as many domains pointing to the same content (bad), or possibly your logs are not recording query strings or session IDs?
Interestingly, it is possible to get AdSense to show what ads would be shown for any page you wish (find out what Google thinks the page is really about, sometimes a surprise). Just look up and install the AdSense preview tool (IE); it will show the adverts associated with the words it thinks the site is about.
Nothing too exciting, and definitely not a way to jump any queues.