Forum Moderators: open
I cannot trace any visits from the Googlebot in the last two months.
Several sites with a PR4 or more link to my site. They do not show up when searching on link:www.domain.com.
Would appreciate any advice on improving the situation.
I can't see anything wrong with your site, except perhaps overdoing the comment tag on the home page, but that would not cause the problem; it would just be ignored. I suspect it is simply an inbound-links issue. link: shows only one, so a few more will help soon ;)
Have you been involved with any link farms etc.? That could cause this.
Perhaps I should have been clearer, sorry.
I meant linking to any link farms. You know how you get endless emails saying "I found your site in my favorite directory, Yahoo..." and others which are obviously automated. Well, if you start participating in these kinds of schemes it will probably backfire, especially if you link to them.
You are right, any of these schemes that link to you should have no effect, they are probably ignored.
Well, they don't show up yet. Are the pages the links are actually on indexed? Are they links the spider can follow? How many outbound links are on the pages that link to you? Anything beyond 100 is probably ignored (the first 100 will be counted; the rest may be followed but won't count for much).
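If you want a rough count of how many outbound links a page carries before worrying about that 100-link rule of thumb, it can be scripted with Python's standard-library HTML parser. This is a minimal sketch; the sample markup is made up for illustration:

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Counts outbound <a href> links on a page."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Only anchors with an href attribute are links a spider can follow
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1

# A toy page standing in for a page that links to you (hypothetical markup)
page = '<p><a href="/a">one</a> <a href="/b">two</a> <a name="x">not a link</a></p>'
counter = LinkCounter()
counter.feed(page)
print(counter.count)  # 2
```

In practice you'd feed it the fetched HTML of the page that links to you and compare the count against that 100-link threshold.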
Point a. looks unlikely because of the cached page and the link still to be found.
Point b. is more likely. After the Dominic update (early May), a lot of rather new sites got lost. In the thread Listed in Google directory, but thats it... [webmasterworld.com] you wrote: I have similar problem, with my site which has been in the Google index since April. With the Esmeralda update, a lot of problems were solved, so maybe your site still has a Dominic-update related problem.
You could try to get some deep links (links to one of the sub pages) and/or do an 'Add URL [google.com]' for the home page and some sub pages. Or wait for the next update of PR and backlinks.
Many of your internal links are very long with session id's etc. The spider may stumble on those.
Other pages have little content compared to the 'template' content, which is the same on many pages, so these may be ignored until you get the ratio of original content higher. In short, you need more unique content on internal pages. You may be tripping some algo which identifies duplicate content and thus ignores it, leaving a few links and not much else per page. Take a chunk of text from your home page which is repeated throughout the site and pretty unique to you, e.g. "Copyright © 2003- All rights reserved - Your site name". Nothing in Google... so it is ignoring it because it sees it as part of a template.
This is a stab in the dark, but I would beef up the pages with more unique content per page.
I am not surprised niloc is totally mystified by this.
"Many of your internal links are very long with session id's etc. The spider may stumble on those."
I don't see a single session ID. Although some sub pages have 3 or 4 parameters, that shouldn't keep the home page from appearing in the SERPs, especially since there are also a lot of internal links from the home page to sub pages without any parameters. Duplicate content on the same site should just mean that not all pages are shown in the SERPs.
To me it looks like the home page has been assigned a docID and is stored in the repository (otherwise 'cache' and 'info' couldn't work), but for some reason the indexer never parsed it. The snippet shown with the 'info' command matches the text on the cached page, so parsing the HTML cannot be the problem.
Maybe I'm way off track here, but there seem to be lots of links like
[........com...]
These I suspect will be ignored. Also the pages these links are on have very little other content but for the template. If google looks at the site as a whole, it will only see a small percentage of unique content. I know it used to work on a page basis, but things are changing and maybe it looks now at factors covering a whole site.
"Duplicate content on the same site should just mean, not all pages are shown in the SERP"
Maybe, but they have been dropped totally; don't forget that Google needs to keep too much data from clogging up the works. I could understand it if they are getting ruthless. Duplicate content must be a big factor they are addressing, otherwise both their databanks and possibly the SERPs will be swamped with generated pages. If I were running a search engine, I would deem these pages as having little content and just try to find pages where there is useful unique content. In this case, the links stop me doing that.
Index page only listed (and that via manual submission)
Cache image circa first part of June
In reality : good links (you can see some of them in SERP if you enter the right search string) but-
Zero links showing in "link:oursite.org"
I have added a robots.txt file (which is getting a LOT of traffic, BTW)
I'm maintaining an up-to-date site map and revising and adding content
Working my fanny off to get good links, (although, as I've said before, many don't appear to be getting Google's attention)
Possible reason for optimism: from the 18th through the 22nd there was a small flurry of Googlebot visits, which went a little deeper into the site.
Google has set a certain measure or level of "importance" for new sites. Above it and you get crawled and indexed FAST. Below it, and you're outta luck. I feel that the size of the site (number of pages, internal links, text, images, etc) is a factor along with the old standby of backlinks. But it requires more than the old "DMOZ, Yahoo, etc" mantra.
What has changed, is that now, below a certain level and all that you can hope for is to have your index page listed and that's all. Above it, and you're in the clover.
Whatever it is that I've been doing right, between the 18th and the 22nd the bots visited some interior pages for the first time.
Now for the big question: when (if ever) will the material that was just crawled, show up in results?
BREAKING NEWS FLASH
before I hit "submit" I ran one more "site:lucaswidgets.org -qwerrew:" search and hit eight of our pages! The cache image is updated along with a fresh date of the 24th.
(BTW, I also have tried looking the main pages up via the Toolbar a few times. I should have kept track of which ones I did and how often. Despite a complete sitemap, not all pages got crawled and all I can say is "it didn't hurt..." It's very interesting.)
I wonder if they will "stick"?
I have exactly the same problem as Fearless - google had found my index page even before I had submitted the URL (?) and it came up when I did a search. I finished my site and submitted its URL to google, but a search showed that old cover page for one month; then at the beginning of June, nothing - nothing in a search and no links to my site (there were few). Oh, this morning ONE link turned up, from a posting my wife had made on a bbs.
I am new to the web. I've been going it alone, following much helpful advice I've found here and on the web, but I think the time has come for me to ask for help. I am at the same time flummoxed and increasingly paranoid. If I am doing something wrong, believe me, I've read all I can to try to find it...
Here is my site:
this was my robots.txt but I pulled it:
User-Agent: *
Disallow: /images/
Disallow: /cgi_bin/
Disallow: /403.shtml
Disallow: /404.shtml
Disallow: /navWarn.html
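For what it's worth, a file like that can be sanity-checked with Python's standard-library robots.txt parser before you pull it in a fit of paranoia. A quick sketch (the test paths are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, pasted in verbatim
rules = """\
User-Agent: *
Disallow: /images/
Disallow: /cgi_bin/
Disallow: /403.shtml
Disallow: /404.shtml
Disallow: /navWarn.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The home page is crawlable; the images directory is not
print(parser.can_fetch("Googlebot", "/index.html"))      # True
print(parser.can_fetch("Googlebot", "/images/pic.jpg"))  # False
```

If both lines print what you expect, the file itself isn't what's keeping Googlebot away.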
I set up direct links, the ALT attributes, everything; no listing and not even a visit from the Googlebot since early June. Have I been penalized for some reason I am unaware of? I have participated in no 'scams' or 'better listing' schemes, nor done any 'trickery'.
Thank you in advance for any help,
Josefu & Coco.
[edited by: Josefu at 11:52 am (utc) on July 28, 2003]
Welcome to WebmasterWorld. First of all, it is not allowed to put your own site in a message. Well, you did, so I could check the site, but please remove the link in the previous message.
It looks, though, like there are a few problems, but most likely the site is not penalized.
1. There are almost no links to the site (AllTheWeb, Altavista, Google, Inktomi; none of these search engines shows a link to your site). So maybe the guestbook link is the only one.
2. Most pages are only pictures, so there are no keywords on the home page except for the meta-description.
3. Internal links cannot be found by most search engines because they are only implemented in JavaScript & Flash.
4. When I opened the site, I saw only an empty frame hanging on the 'wall' because of the pop-up killer I use.
5. Reading robots.txt gives a 404 error page for me here in Japan.
6. At this moment the whole site looks unknown to Google, none of the pages is indexed. No cache at this moment.
To solve these problems:
a. Make sure other sites link to your site, and if possible not from a guestbook. This will get your home page in the index.
b. Put some keywords on the home page. This helps you to show up in the SERP (Search Engine Result Page), if point a. is done.
c. Make sure there are 'normal' links to sub pages. This will get those pages in the index, if point a. is done, and will also make sure any visitor can get to the next pages, even if a pop-up blocker is installed.
d. Put the robots.txt back in place if you want it to work.
My two yen.
takagi.
So I guess it was true - there has been a mini-update.
No changes in the SERPS for any keywords I look at though, and certainly no new backlink count (we'd be #1 by a margin if that had happened).
Curiouser and curiouser....
TJ
First off, thank you for your rapid reply - and I'm glad you can read our site. I've already removed my URL, sorry for missing that.
I pulled the 'robots.txt' this morning in a fit of... paranoia? I'll put it back; I have seen, and in fact already knew, that that wasn't the problem, but this morning I seem to be doubting everything I've learned over the past months : )
Many of my direct links are in 'noscript', as per the 'scriptless' capabilities of the robot... From the Flash page there is a direct link to the 'Salon' and the site map; the former has links to 'rooms' further in, and the site map has 'scriptless' links to the entire site. That in itself is fine, is it not? Perhaps I should work to 'scriptlessly' link the whole site together.
Even after my site had opened, the 'old' page (that wasn't even supposed to be visible to the outside world yet) was still in the google cache for over a month. Gone now, as of the beginning of July.
You are so right about the pop-up blocker, I hadn't thought of that. Damn the people whose sites abuse pop-ups. Keywords and text in the intro, too - yes, only the 'noscript' text reminding people to turn on javascript... That would be odd to see as a search result : )
As you can see from my site, as a graphic artist I am trying to break the 'columns, text and images' mold the net is trying so hard to impose. I suppose when people get around to seeing the site (Hiroko is working on the links; that takes time) I won't have any problem, but for now I'll be penalized for not... going with the flow. That's a risk I took - but shouldn't the 'noscript' cover that? Does Google consider 'noscript' links as cheating? I didn't abuse it; one really can navigate my site without JavaScript - though not as well. I'll try to work on that.
Thank you very much for your help, and if you have any more suggestions or criticisms please don't hesitate to post; you'll only be helping.
Thank you,
Jojo & Coco.
[edited by: Josefu at 12:22 pm (utc) on July 28, 2003]
Googlebot comes and goes. The other day I looked at my stats across 2 domains, and it was working overtime. The last 2 days, zip. I think it discovered adult sites and we won't see it again for a month or so.
Three things:
1. New sites (especially ones with Flash or Java internal links) MUST have a site map. Googleguy has "suggested" this repeatedly of late. Believe him, he knows whereof he speaks.
2. You've gotta have a robots.txt file. If you hunt through this forum you'll also see Googleguy recommending this. Believe it.
3. Try to cultivate links to your site. It used to be that getting in Yahoo's directory and DMOZ was enough. Not any more. I found that in my realm, digging hard and being creative turned up good links that I had never thought of before.
Try looking up the pages in your site via the Toolbar.
The link to the site map is from the index page, in the <noscript> links - but I guess the only question I have left is: does Google consider <noscript> links as cheating? In my reading I saw that it helps the robot find the content of my site, and I decided, since I was going to do it, to do it right for those who for some reason refuse to use JavaScript.
Take care,
Josefu.
Here is the latest inconsistency relating to my site. Last week I agreed a reciprocal link with a 'charming' web site.
Searching on mydomain finds this link.
Searching on link:www.mydomain.com does not
This is NOT an inconsistency. In fact, it is very consistent with how google periodically updates backlinks.
That is step one. Of course, a well-behaved, politically correct bot like Googlebot first checks what you have stated is permissible in your robots.txt file. And your robots.txt content is perfect :-) Check your logs again in 24 hours ;-)