

How long does it take to get listed on Google after a crawl?

Time to get listed by Google after a full crawl


nativenewyorker

9:34 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



My website has now been live for 6 weeks. The keywords related to my site are part of the URL to help with SEO.

example - [keywords*a*b*c*.com...]

When the site was first conceived, I built a template to work from and uploaded roughly 200 pages with various names, but with the same content. This was so the menu would not point to non-existent pages and return 404 errors. Over the course of the last 6 weeks, I have been replacing each of these pages with content. My site was submitted to Google during the first few days and was deep crawled 2 1/2 to 3 weeks ago. There is still no sign of my site on Google and the Google toolbar still shows a PR that is grayed out. Is Google somehow perceiving that I was spamming them with roughly 150 similar pages? All the temporary pages were tagged to expire and set to no-cache so they would not be returned in the results of a search engine query.

<meta http-equiv="expires" content="0">
<meta http-equiv="pragma" content="no-cache">
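A side note on those two tags: they are cache-control hints aimed at browsers, not indexing directives. The tag that actually asks a crawler to leave a page out of the index is the robots meta tag, so placeholder pages could have carried something like this instead (an illustrative sketch, not what was on the site):

```html
<!-- expires/pragma only affect caching; the robots meta tag is the
     directive that tells crawlers not to index the page at all -->
<meta name="robots" content="noindex,nofollow">
```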

The updated pages have the following header with the expire and no-cache tags removed.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>*****</title>
<meta name="description" content="*****">
<meta name="keywords" content="*****">
<meta name="abstract" content="*****">
<meta name="robots" content="index,follow">
<meta name="distribution" content="global">
<meta name="revisit-after" content="5 days">
<meta name="copyright" content="© 2002 *****">
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta http-equiv="window-target" content="_top">
<link rel="stylesheet" type="text/css" href="style.css">
</head>

Despite the revisit tag, the Googlebot has not returned since the deep crawl. These are some of the results from my logs after the deep crawl.

Host: 216.239.46.13 Url: /*****.html Http Code : 200
Date: Nov 11 07:50:28 Http Version: HTTP/1.0 Size in Bytes: 4700
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Host: 216.239.46.27 Url: /*****.html Http Code : 200
Date: Nov 11 07:44:03 Http Version: HTTP/1.0 Size in Bytes: 4700
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Host: 216.239.46.146 Url: /*****.html Http Code : 200
Date: Nov 11 07:35:43 Http Version: HTTP/1.0 Size in Bytes: 4700
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Host: 216.239.46.166 Url: /*****.html Http Code : 200
Date: Nov 11 07:31:21 Http Version: HTTP/1.0 Size in Bytes: 4700
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

Host: 216.239.46.20 Url: /*****.html Http Code : 200
Date: Nov 11 07:29:13 Http Version: HTTP/1.0 Size in Bytes: 4700
Referer: - Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

This appears to have been the main Googlebot and not the Freshbot that I have seen mentioned in these threads. Am I wrong?

As my site is still relatively new and incomplete, I have not yet asked for any reciprocal links; however, all pages have a link back to the homepage and certain key pages. From what I have read, links within my site should help my PR, although outside links are of greater benefit.

Another thought is that Google may perceive that I am putting white text on a white background. In IE, the blue background graphic displays correctly and the text is completely legible. When I checked my site on a friend's Mac, I noticed that the background graphic did not display properly; in that case, the text showed white on white and appeared invisible. This invisible text comprises only about 10 words at the bottom of the page (my copyright info). What are the chances Google is penalizing me for this?

Should I be concerned that my site is still nowhere to be found after 3 weeks of being crawled?

BTW, what does the acronym serps stand for?

Thanks in advance for any suggestions or ideas.
nativenewyorker

troels nybo nielsen

9:50 pm on Nov 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WW

Easiest question first: SERP is search engine results page (see glossary at top of page)

troels nybo nielsen

10:09 pm on Nov 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To get listed on Google you basically have to do two things, and avoid doing a third:

1. Have incoming links from a page already listed on Google. Links are extremely important with Google.

2. Have genuine content on your pages.

3. Don't do anything that may be interpreted as cheating!

It seems that your not being indexed stems from not being fully aware of these facts.

Terry_Plank

10:22 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



[webmasterworld.com...] is a great place to learn about Google.

One thing you mentioned was having 150 similar pages. I would never do that. If they are very similar, Google may see them as spam. Make each page unique, either with a target keyword phrase of its own or with unique material.

I also always make it as easy as possible for Google to find things by using only the tags below. In particular, there is no need to ask them to index and follow with

<meta name="robots" content="index,follow">

since that is the default crawler behavior anyway.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>*****</title>
<meta name="description" content="*****">
<meta name="keywords" content="*****">

</head>
</html>

nativenewyorker

10:54 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



Terry,

The 150 temporary template pages were all very basic. They basically showed my logo, background, JavaScript menu link, and some empty tables where content was to go. I did not fill the temporary pages with nonsense such that they could in any way be perceived as spam. Would it be preferable to delete all these temporary pages and take the chance of driving away visitors with 404 errors? At this point I am down to 88 temporary pages, and I have 138 pages with varying amounts of content.

I do have a robots.txt file, but I thought it would be on the safe side to have the meta robots tag in place as a backup.
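For reference, blocking the remaining placeholder pages in robots.txt would look something like this (the filenames below are hypothetical stand-ins, not my real page names):

```
# Hypothetical robots.txt sketch; the real file would list the actual
# temporary filenames, or a single directory that groups them all.
User-agent: *
Disallow: /placeholder-one.html
Disallow: /placeholder-two.html
```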

I guess I'll spend some time tonight tweaking the keywords on each page so that the remaining 88 temporary pages are not all identical apart from the filename.

Thanks,
Ted

bobmark

11:25 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



The crawl 2 1/2 weeks ago should possibly have gotten you into this update, as it is not unusual for sites in the update to be crawled up to about the 14th; but there is sometimes a one-month lag before a site first appears in the index, so your absence from it may mean nothing.

The consensus here, among people who have a lot of experience with SE technology, is, I think, that Google parses pages to detect mirror pages or spamming using an algo that compares structural and content similarity (quite easy technically, and you can set a threshold of "identicalness").

I think your method of setting up dummy pages is dangerous in this regard and could incur a penalty before you even get included.

Terry_Plank

11:32 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



Ted,

You are best off having unique text on each page and not having any dummy pages, as bobmark mentioned.

Anyway, when you revisit your pages, make sure it's the text of the page you are changing to make it more unique. As you know, Google isn't interested in the meta keywords and description. I always use them just in case, but Google isn't evaluating them because they have been used too much for spamming.

You are supposed to be safe if you use the robots.txt file to tell Google not to visit certain pages. I have heard some reports that they don't pay attention to it, but they say they do, so I don't really know whom to trust. For myself, I'd probably risk the 404s. Just make sure you have a custom 404 page with a link back to your site, so you will still have a good chance of keeping those visitors.

nativenewyorker

11:43 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



Bob and Terry,

Given that the pages have almost no content and are mostly tables and scripts pointing to menus, do you really think that Google would penalize me for these pages?

I had originally tagged these pages as expired and no-cache so that Google would recognize that these pages are under construction. What reason would I have to tag them as such if I was really a spammer?

I'll add those temporary pages to my robots.txt file as an extra precaution, in addition to making some text changes to highlight what is to come on each page.

Thanks for the input guys. It is really appreciated.
Ted

Terry_Plank

11:58 pm on Nov 30, 2002 (gmt 0)

10+ Year Member



Well, they probably won't penalize you, but the pages won't help you either. That was more my direction of thought: what would help.

If there is a clear reason for doing what you are doing, and a human from Google looked at the pages, you would be fine. I would just wonder a bit about the algorithm evaluating all of them. In the final analysis, you have to keep experimenting and see what happens. :-)

I'd just keep changing those pages and getting them clean and crisp, saying what you are about as a site. Then see what happens after two or three more crawls.

Good luck.

bobmark

5:06 pm on Dec 1, 2002 (gmt 0)

10+ Year Member



"I had originally tagged these pages as expired and no-cache so that Google would recognize that these pages are under construction. What reason would I have to tag them as such if I was really a spammer?"

I think you're attributing too much human (or artificial) intelligence to the spidering process and its attendant evaluations of spam/mirror techniques. You have an algo which in essence says "if these 2 pages are 95% (90? 85?) the same according to my page comparison algo, then they are mirrors/spam."
People use no-cache for a variety of reasons, and it's not clear whether Google pays much attention to it; certainly they routinely cache "no-cache" pages.
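To make that concrete, here is a rough sketch (purely illustrative; nobody outside Google knows their actual algorithm) of a shingle-based page comparison with a tunable "identicalness" threshold of the kind described above:

```python
# Illustrative near-duplicate check: compare two pages by the overlap
# of their word "shingles" (overlapping n-grams) and flag them as
# mirrors/spam above a chosen threshold.

def shingles(text, size=3):
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(page_a, page_b, size=3):
    """Jaccard similarity of the two pages' shingle sets (0.0 to 1.0)."""
    a, b = shingles(page_a, size), shingles(page_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def looks_like_mirror(page_a, page_b, threshold=0.90):
    """True if the pages exceed the chosen 'identicalness' threshold."""
    return similarity(page_a, page_b) >= threshold
```

Near-identical template pages that differ only in a filename or a keyword or two would score very close to 1.0 under a scheme like this, which is why dummy pages sharing one template are risky.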