Google News Archive Forum

This 182 message thread spans 7 pages (this is page 5 of 7).
Googlebot not crawling
Seeks index page, then leaves
feeder




msg:185700
 6:12 am on Feb 12, 2004 (gmt 0)

Googlebot visits often. It requests the index page, but doesn't crawl any deeper. This happens two or three times a day.

The MediaBot crawls deeper into the site without issue. The site runs AdSense.

Could there be anything in the server config that is causing this? It isn't robots.txt. The index page is lo-fi and xenu crawls it fine, as does the searchengineworld sim spider.

Any ideas?
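
One way to double-check the "it isn't robots.txt" assumption is to run the file through a parser rather than read it by eye. This is only a minimal sketch using Python's standard `urllib.robotparser`; the sample rules and the paths are made up for illustration and are not from the site discussed here.

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]

# Hypothetical robots.txt content and paths, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    paths = ["/", "/articles/page1.html", "/private/notes.html"]
    print(blocked_paths(SAMPLE_ROBOTS, paths))
```

If the list comes back empty for the deep pages, robots.txt is probably not what keeps the bot on the index page, though this says nothing about other server settings.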

 

HarryM




msg:185820
 1:39 am on Mar 16, 2004 (gmt 0)

Only a few pages of my site were taken for a week or so, but Google reappeared towards the end of last week and my stats show it took almost all of my pages (about 300), including new ones I had put up during the week. I'm hosted in the UK - is location significant to Google crawling patterns?

trimmer80




msg:185821
 2:49 am on Mar 16, 2004 (gmt 0)

>> is location significant to Google crawling patterns?

I would not think so. Crawl pattern is related to your Inbound Links, thus geographic location should not be significant.

mr_strong




msg:185822
 8:12 am on Mar 16, 2004 (gmt 0)

Update for me:

New site launched middle of February with 2-3 good incoming links to it - Google found the site within a few days and indexed the home page. Almost every day since then Googlebot has been back but only grabbed the robots.txt page and home page.

That changed as of this morning (almost exactly a month after Google first discovered the site) and Google is currently slowly working through the pages of my site. Big relief ;)

Good luck everyone and be patient, I'm sure Googlebot will get to your sites - it's just taking a bit longer that's all ;)

BallochBD




msg:185823
 8:12 am on Mar 16, 2004 (gmt 0)

My home page has not been crawled for three weeks. Along with one other page from more than 80 it was the only one which had retained title, cache and description.

Yesterday it also lost this data but today it reappeared but in the same old version. I cannot find any bot activity in my logs for this site. What is going on?

WRT
For the people who are having this problem and the people who are not, can we post details in the hope a common link will be found.

November - 5000 Google hits/month
January - 2000 Google hits/month
February - negligible Google hits/month
March - Zero Google hits/month

The only traffic I now get from Google is my site:www.mydomain.com check :o(

Moncher




msg:185824
 9:41 am on Mar 16, 2004 (gmt 0)

My home page still has not been crawled for almost a month. Same old cache. Some changes on inner pages got reflected, but not the home page changes. Since yesterday I have observed that the SERPs look similar to pre-Brandy.

BallochBD




msg:185825
 3:09 pm on Mar 16, 2004 (gmt 0)

Is this essentially the same subject as [webmasterworld.com...] (Big sites suffering no title...)?

RussellC




msg:185826
 3:22 pm on Mar 16, 2004 (gmt 0)

I finally got a full crawl of my site last night and this morning as well. Whew. Good luck everyone.

stcrim




msg:185827
 3:43 pm on Mar 16, 2004 (gmt 0)

Very IMPORTANT QUESTION!

Of the sites that are being ignored, how are you linking your sites? Or, how are you getting them found by Google?

I for one created directories that allow a user to visit any of our sites from any of our sites. But now that I look at it they smack of "LINK FARMS" because we have hundreds of sites? Hmmmmmmmmmm

Anyone else?

-s-

Duke_of_Url




msg:185828
 4:16 pm on Mar 16, 2004 (gmt 0)

As I mentioned elsewhere, I launched a site early Feb and it took just over a month to get a deep crawl from gglbot. The pages appeared in Google's index a couple of days later. That site is now getting some decent ggl traffic. The domain name for that site is a subdomain of a related subject site that I've had for over a year, i.e.

keyword1-keyword2.myoldersite.co.uk

I launched another site last Friday, 12th March, and am already seeing some gglbot crawling to deep pages (i.e. not just the homepage) today, after just these few days. Quite why or how is anyone's guess. This is also under a subdomain of the same related subject existing site

keyword1-keyword2-keyword3.myoldersite.co.uk

(myoldersite.co.uk has been in Google's index for over a year now)

Both have similar inbounds setup to them, the former site has about 1300 pages with affiliate related links on it, the latter approx 120 pages and showing adsense links down the RHS.

Both sites have a selection of non commercial links out, to content sites on related topics.

The bot I am seeing on the new 4 day old site is Googlebot/2.1, 64.68.82.178, so am I right in thinking this is a proper crawl I'm beginning to see? (I also see the Adsense Mediapartners bot as expected)

Last Wednesday (10th) I launched another different site, also a subdomain, and that too is seeing some limited deepcrawls today from 64.68.82.136.

Are both these 64xx ips the pukka deepcrawling bots?

thanks
DoU
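
For anyone who wants to answer the "index only or deep crawl?" question from their own logs, here is a rough sketch. It assumes the Apache "combined" log format and matches on the user-agent string only. The function names and the sample "shallow" paths are invented for this example, and a user-agent string can be faked, so this is an indication rather than proof that the real Googlebot made the requests.

```python
import re
from collections import Counter

# Apache "combined" format:
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def crawler_hits(lines, agent_substring="Googlebot"):
    """Count requested paths on lines whose user-agent contains agent_substring."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and agent_substring in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

def deep_paths(hits, shallow=("/", "/index.html", "/robots.txt")):
    """Return the crawled paths other than the home page and robots.txt, sorted."""
    return sorted(path for path in hits if path not in shallow)
```

Feeding it the lines of an access log (for example `crawler_hits(open("access.log"))`, where the file name is an assumption) and then calling `deep_paths` on the result shows whether the bot went past the index page. Passing `agent_substring="Mediapartners-Google"` does the same for the AdSense bot.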

johannamck




msg:185829
 5:34 pm on Mar 16, 2004 (gmt 0)

I've seen major crawling over the weekend, and it's still going on.

A couple new sites that had been waiting for many weeks are finally in the index.

Another site that has been waiting for an update (new URL's, titles etc.) had 100+ new pages included as of this morning.

I don't see a major shake-up in the search results yet though, and the brand new pages aren't ranking well (whereas the updated ones are doing fine). The new pages only show up in searches for very specific long phrases copied from the page text.

One strange thing: When I do a site:www.mydomain.com search, only a handful of the new pages are shown, and the others are lumped together under the "repeat search with omitted results included", even though they have very different content, titles, etc., with static URL's. I've never seen this happen to this extent.

Does anybody else have the impression, that brand new pages are not ranking to their full potential yet?

stcrim




msg:185830
 8:08 pm on Mar 16, 2004 (gmt 0)

Every one of my sites with this problem has 2 things in common:

I have redirects from old pages to new pages that replaced them and

The sites are linked to each other...

Does anyone remember how to contact Google for a penalty removal or review?

-s-

stcrim




msg:185831
 2:21 am on Mar 17, 2004 (gmt 0)

Somehow being listed in DMOZ is a factor in this.

-s-

Powdork




msg:185832
 4:14 am on Mar 17, 2004 (gmt 0)

Is being listed in DMOZ a factor or NOT being listed in DMOZ?

experienced




msg:185833
 7:10 am on Mar 17, 2004 (gmt 0)

being listed in DMOZ a factor ;)

experienced




msg:185834
 7:19 am on Mar 17, 2004 (gmt 0)

mr_strong - AGREE WITH YOU

I also started 3 new projects at the same time and finished all of them in mid-February. This time I have all of them indexed in Google (not the full sites), but more than 70% of the pages are in the SERPs. People have to wait some more time to get their sites crawled by Googlebot.

Best of luck to All

Thanks
Exp...

Troppo




msg:185835
 8:42 am on Mar 17, 2004 (gmt 0)

Is anybody seeing a correlation between this type of problem and domains that have been registered with a private whois option (aka proxy registration)?

MrSpeed




msg:185836
 2:04 pm on Mar 17, 2004 (gmt 0)

My whois info is public.

I'm not sure I'm ready to start exploring any conspiracy theories with regard to whois or dmoz.

It would be nice to hear from GG with some explanation. Are they retooling the crawlers so there has been less crawl activity lately? Is there some sort of a tweak in the crawl algo?

stevenb 1959




msg:185837
 2:18 pm on Mar 17, 2004 (gmt 0)

As I see it, Google is still crawling consistently every day. I am still seeing a new cache page every day of my website entry page.
Keep up the consistent work of staying on top of things, Google.
Steve

BallochBD




msg:185838
 2:30 pm on Mar 17, 2004 (gmt 0)

LOL - you can brown nose to Google all you like but just don't get complacent. As you see it Google is crawling YOUR site every day. You are there by the grace of God and Google!

stevenb 1959




msg:185839
 9:37 pm on Mar 17, 2004 (gmt 0)

On the topic of Google not crawling: from what I see in the Google SERPs, it seems that only web pages with a PageRank of 5 or more have a cache date beside their listing.

stevenb 1959




msg:185840
 9:41 pm on Mar 17, 2004 (gmt 0)

Regarding the above comment I made, I may not have analyzed that correctly, but any comments and evaluations would be greatly appreciated.

Troppo




msg:185841
 2:51 am on Mar 18, 2004 (gmt 0)

>> My whois info is public.

>> I'm not sure I'm ready to start exploring any conspiracy theories with regard to whois or dmoz.

Thank you for answering my question, which was in no way intended to start a conspiracy theory. I am however intrigued by your mention of dmoz. Am I missing something?

added ...woops, checked the previous page and I see what you are getting at about dmoz. Sorry about that.

Patricio




msg:185842
 1:54 pm on Mar 18, 2004 (gmt 0)

While carefully reading all the posts in this topic, I was thinking there's a new pattern in how Google crawls new sites.

I'm experiencing the same things with a site we can consider new: in fact the site is more than two years old, but until this January it was not SE friendly (it was not user friendly either!). It had very long variables in the URL and used frames. I rewrote the entire code in the first days of this year, took out the frames, and set URLs with only two variables max.

The site is indexed in several directories such as Yahoo, dmoz, etc., but I added fresh external links on a daily-crawled site, and Googlebot came very soon, only to visit the homepage.

During January, Googlebot never returned, but the new homepage was indexed very soon. Then I realized (thanks to WebmasterWorld!) that I couldn't get success passing variables named "ID" through the URL, and fixed this in the first days of February. Googlebot returned, again only to visit the index, but came six or seven times a day. And thanks to another external link, it also indexed an internal page. In the last two weeks of February, I had no news of Googlebot. (The homepage is changed every day.)

So, starting in March, I decided to use Apache's mod_rewrite to show URLs as if they were static. Since I did so, Googlebot has been coming every day, but only to reach the homepage. This went on for two weeks. The daily visit was very regular up to March 16th. Since that day, Googlebot hasn't returned.

After reading this topic I'm expecting a deep crawl soon. If it happens, I'll tell you. If you think there's any way of helping that, please tell me.

[edited by: Patricio at 2:43 pm (utc) on Mar. 18, 2004]
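
The URL rules of thumb in the post above (no more than two variables, nothing named "ID") are easy to check across a list of URLs. The sketch below only encodes those two rules as described by the poster. They are forum folklore from this thread, not documented Google behaviour, and the function name and defaults are made up for the example.

```python
from urllib.parse import urlsplit, parse_qsl

def url_warnings(url, max_params=2, risky_names=("id",)):
    """Return a list of warnings for a URL, based on the thread's rules of thumb."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > max_params:
        warnings.append(f"{len(params)} query parameters (more than {max_params})")
    for name, _value in params:
        # Compare case-insensitively so "ID", "Id" and "id" are all caught.
        if name.lower() in risky_names:
            warnings.append(f"parameter named '{name}'")
    return warnings

if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    for url in ("http://www.example.com/item.php?ID=7&cat=2&lang=en",
                "http://www.example.com/items/7.html"):
        print(url, url_warnings(url))
```

A URL rewritten to look static, like the second one, has no query string and so produces no warnings.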

Leosghost




msg:185843
 2:42 pm on Mar 18, 2004 (gmt 0)

Subject : Googlebot not crawling
Description: Seeks index page, then leaves

Diagnosis : insufficient spammy content to merit hanging around and indexing?

Second opinion : no friends on page ( no adwords? )..

MrSpeed




msg:185844
 3:33 pm on Mar 18, 2004 (gmt 0)

>> Diagnosis : insufficient spammy content to merit hanging around and indexing?

Googlebot needs to crawl pages before it can be determined if the site is spammy, duplicate content etc.

Leosghost




msg:185845
 3:49 pm on Mar 18, 2004 (gmt 0)

Normally it finds the adwords on the index page ...

If they aren't there they probably aren't anywhere on site!

Powdork




msg:185846
 4:05 pm on Mar 18, 2004 (gmt 0)

Googlebot has been hitting me hard since last night. Strangely, she is stopping at the index pages of all my photo galleries rather than following the links to the different pages within. The gallery index pages have little content other than 'click to enter gallery' while the pages in the gallery all at least have captions. It's early yet but I am wondering if this may be signaling her to go away (or at least not to go further).
<META name="generator" content="Adobe Photoshop(R) 6.0 Web Photo Gallery">

[edited by: Powdork at 4:06 pm (utc) on Mar. 18, 2004]

nkakar




msg:185847
 4:06 pm on Mar 18, 2004 (gmt 0)

Does anyone know how to track all the visitors that come from the different search engines? I mean, how do you know how much traffic you're getting from which engine, and can you have an ROI counter there?

Thanks guys!

nkakar




msg:185848
 4:18 pm on Mar 18, 2004 (gmt 0)

great tool jonathan :) only one question. What if you have a long list of keywords, and each keyword is a different url and price... How would you do that?

BallochBD




msg:185849
 4:21 pm on Mar 18, 2004 (gmt 0)

There are several good statistical tracking packages available for not a lot of money. There are also several that come free, one of which can be had from >snip<. This lets you know where your traffic has come from.

Sorry. I forgot - no URLs. If you want to sticky me I'll send you the link.
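
If you would rather see roughly how such a package works before picking one, the core idea is to look at the referrer of each visit. The following is a bare-bones sketch, not any particular product: the engine table, the hostname matching and the function names are all assumptions for illustration, and the hostname match is deliberately crude.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical table: hostname fragment -> (engine name, query-string parameter).
ENGINES = {
    "google.": ("Google", "q"),
    "yahoo.": ("Yahoo", "p"),
    "msn.": ("MSN", "q"),
}

def classify_referrer(referrer):
    """Return (engine, search phrase) for a search-engine referrer, else None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    for fragment, (engine, param) in ENGINES.items():
        if fragment in host:
            phrase = parse_qs(parts.query).get(param, [""])[0]
            return engine, phrase
    return None

def traffic_by_engine(referrers):
    """Count visits per search engine from an iterable of referrer URLs."""
    counts = Counter()
    for referrer in referrers:
        hit = classify_referrer(referrer)
        if hit:
            counts[hit[0]] += 1
    return counts
```

The referrer strings would come from the referer field of your server log. Counting by the returned search phrase instead of the engine name gives a per-keyword breakdown.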

Powdork




msg:185850
 5:48 pm on Mar 18, 2004 (gmt 0)

You can read in the Tracking and Logging Forum [webmasterworld.com] and get some great ideas. There is usually a 'Which stats program do you use' thread going if you look hard enough. Also, I have found much better tech support there than with the products' producers.

Oops, I forgot why I came here. She just grabbed all the gallery pages she skipped before. I guess she just had to go back for more PR. I'm pretty stoked as my site just grew by a factor of 4 since yesterday in Google's eyes.
