Spiders to pick up my articles?

Only my main page is listed on the major search engines

         

TommyJoc

10:13 am on May 4, 2003 (gmt 0)

10+ Year Member



Hey all,

My website has been indexed by most of the major search engines, but even though I see their spiders visiting my site and generating hits, my main page is the only one indexed.

Many of my pages are help pages (for instance www.mysite.com/Help/HelpTopic1.php and so on), but many are also articles (www.mysite.com/Articles/View.php?articleid=12345).

There are new articles every day, and I link to the 10 most recent on my main page.

My main page has nice PR for some keywords, and the page is also linked to by many other websites. They all link to my main page though, not my help files or articles.

What can I do to get the search engines to pick up and index my other pages?

Any help is much appreciated! =)

takagi

10:21 am on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi TommyJoc, welcome to WebmasterWorld. If the homepage is indexed but the sub pages are not, then you might need a site map. In the long term, Google will only index pages that have at least one link pointing to them. So if you only ever link to the 10 most recently added pages, Google could decide to remove the older articles if no other link to them can be found.

TommyJoc

1:55 pm on May 4, 2003 (gmt 0)

10+ Year Member



Hey Takagi, thanks for the answer.

I do have a site map actually. On all my pages I have a left side navigation bar that links directly to some of my help pages, and also to the sitemap. The sitemap then links to every help page there is.

When it comes to the articles, the links may stay on the main page for a day or two and then be rotated out. The article still exists but can only be accessed through a search.

Still, only my main page is indexed. How long after e.g. Googlebot hits a site will its new findings be added to the Google index?

Thanks,

-Tommy
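A rough sketch of one way to keep a permanent plain link to every article, so that older ones stay reachable after they rotate off the main page. The function name `build_archive_html` and the (id, title) input format are made up for illustration; only the URL pattern comes from the post above. How the article list is actually read from the site's database is left out.

```python
from html import escape


def build_archive_html(articles, base_url="http://www.mysite.com/Articles/View.php"):
    """Return a static HTML page with one plain <a> link per article.

    `articles` is an iterable of (article_id, title) pairs. Both the
    input format and the function name are assumptions for this sketch.
    """
    items = []
    for article_id, title in articles:
        # Plain anchor tags with a normal href, which a spider can follow.
        href = f"{base_url}?articleid={int(article_id)}"
        items.append(
            f'<li><a href="{escape(href, quote=True)}">{escape(title)}</a></li>'
        )
    return (
        "<html><head><title>Article archive</title></head><body>\n"
        "<h1>Article archive</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    sample = [(12345, "First article"), (12346, "Second article")]
    print(build_archive_html(sample))
```

The generated page would itself need a link from the navigation bar or the sitemap, otherwise spiders have no path to it either.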

takagi

2:07 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Tommy,

If the sitemap is not indexed, it won't help for the other sub pages.

A few questions to better understand the problem:
1. When did you see the homepage of your site for the first time in the SERP (Search Engine Results Page)? In other words, is this a very new site?
2. Do you use JavaScript to handle the navigation bar?
3. Do you use frames to display the navigation bar?
4. Do you use a robots.txt or a META with similar function?
5. How do you check the number of pages indexed? Search for:

site:www.mydomain.com -blablabla

and you will see all the pages from the site that do not contain the string 'blablabla' (i.e. including the pages found but not yet indexed).

pageoneresults

5:14 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Have you checked to make sure that those pages are spider friendly?

Search Engine Spider Simulator [searchengineworld.com]

Is your server returning the proper status in the headers? Last modified dates?

Server Header Checker [searchengineworld.com]

And, do you have a robots.txt file in place? If so, have you validated it to make sure there are no errors?

Robots Text Validator [searchengineworld.com]

Those would be the first three things that I'd take a look at. Then come back and let us know if you passed those three tests.

When using the SIM Spider, scroll down the page and make sure that the spider is seeing all of the links you want spidered.
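For a quick local version of that last check, here is a minimal sketch that lists the links a simple HTML parser can see in a page. It is not how the Spider Simulator or any real search engine spider works; the names `LinkCollector` and `extract_links` are hypothetical, and it only looks at plain `<a href="...">` tags, so links created by scripts will not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute = urljoin(self.base_url, value)
                # Skip javascript:, mailto: and similar non-page links.
                if urlparse(absolute).scheme in ("http", "https"):
                    self.links.append(absolute)


def extract_links(html_text, base_url):
    """Return the list of followable links found in html_text."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


if __name__ == "__main__":
    page = '<a href="/Help/HelpTopic1.php">Help</a> <a href="javascript:void(0)">Menu</a>'
    for link in extract_links(page, "http://www.mysite.com/"):
        print(link)
```

If a page you expect to be spidered is missing from the output for your main page and sitemap, there is no plain link path to it.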

TommyJoc

5:40 pm on May 4, 2003 (gmt 0)

10+ Year Member



Hi Takagi and PageOneResults, thanks for helping me out.

Takagi first:

Re sitemap: Right, makes sense.

1. Some time during January 2003. We launched late last year and submitted across the board immediately.
2. Nope. Plain anchor tags (full browser compatibility).
3. No frames.
4. No robots.txt (My understanding is that robots.txt is only there to _limit_ spider activity, not to encourage it. Am I wrong here?)
5. I try searching for specific words/phrases that I know exist on my individual pages, but there are no matches. No matter what I do I can only find my main page (but I can find that page in a dozen different ways).

PageOneResults:
SpiderSim: They seem spider friendly (it doesn't tell me they're not); all the keywords appear, as well as the wanted URLs. (However, I will do some more testing.)

Header Check: It all seems all right. The results are the same for both my main page and my help pages (and the main page works).

Robots.txt check: I don't have one (I have nothing to exclude really).

If a spider picks up the links on my main page today, should they be indexed tomorrow? Or can there be weeks of delay, like when submitting manually?

Thanks guys, and thanks for the links to the tools!

-Tommy

TommyJoc

5:46 pm on May 4, 2003 (gmt 0)

10+ Year Member



I ran some more Server Header Checks, and it does return this:

Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0

Pragma: no-cache

Will these settings affect the indexing?

-Tommy
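To make those header lines easier to compare between pages, here is a small sketch that fetches a page's response headers and splits the Cache-Control value into its individual directives. The function names `fetch_headers` and `parse_cache_control` are invented for this example, and the code only displays what the server sends; it makes no claim about how any search engine treats these directives.

```python
import urllib.request


def parse_cache_control(value):
    """Split a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return (status, headers as a dict).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, and
    repeated header names are collapsed to one entry by dict().
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, dict(response.headers.items())


if __name__ == "__main__":
    # Header value copied from the post above; no network access needed.
    header = "no-store, no-cache, must-revalidate, post-check=0, pre-check=0"
    for name, arg in parse_cache_control(header).items():
        print(name, arg)
```

Running `fetch_headers` against the main page and a help page and comparing the two results side by side would show whether the server treats them differently.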

takagi

5:46 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



4. No robots.txt (My understanding is that robots.txt is only to _limit_ spider activity, and not encouraging. Am I wrong here?)

That is right. But if there is one and you make a typo or forget it is there, it could prevent pages from being indexed.
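As an illustration of how small such a mistake can be, here is a sketch using Python's standard `urllib.robotparser` to test a robots.txt against a URL. The helper name `allowed`, the two sample files, and the choice of "Googlebot" as user agent are assumptions for the example; real spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow line excludes nothing.
INTENDED = "User-agent: *\nDisallow:\n"
# One stray slash excludes the whole site.
MISTYPED = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    url = "http://www.mysite.com/Help/HelpTopic1.php"
    print("intended:", allowed(INTENDED, url))
    print("mistyped:", allowed(MISTYPED, url))
```

The two files differ by a single character, which is why validating an existing robots.txt is worth the minute it takes.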

Do you have inbound links to your site?
What is the PR of the homepage?

TommyJoc

6:09 pm on May 4, 2003 (gmt 0)

10+ Year Member



Yes, I have many incoming links to my site. I have spent a lot of time using 'Add/submit a URL' forms on portals that list websites related to my business.

For some phrases (e.g. "for rent") I have high ranking, like the top page rank at Google (PR1). But that may partially be because the phrase is part of the name of the website, and therefore listed in the page title and elsewhere.

-Tommy