It was a forum with about 11000 topics.
So when I brought the site back a couple of weeks ago, I set up a good site map with all the topics.
Google spidered all the links: when you do a search on my domain you find the 11000 links... but only the links. No title, no content, as if all the links were in a disallowed directory, which is of course not the case.
Now what? Should I expect a real indexing of all this content during the next full crawl? Or is this definitely a bad sign that all these links will never get updated?
Thanks in advance,
Brakkar
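One quick way to rule out the "disallowed directory" explanation is to test a few topic URLs against the site's robots.txt. The following is only a minimal sketch: the robots.txt content, the example.com URLs and the is_crawlable helper are made up for illustration, and a real check would use the file actually served at the site root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not the poster's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A topic URL outside the disallowed directory, and one inside it.
print(is_crawlable(ROBOTS_TXT, "http://www.example.com/forum/topic-1.html"))
print(is_crawlable(ROBOTS_TXT, "http://www.example.com/private/topic-1.html"))
```

If the topic URLs come back as crawlable, the URL-only listings are not caused by robots.txt and the explanation lies elsewhere.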
The inbound links are still there so Google knows the site is there but has no information on the page.
I once had a site down for about a month and the homepage still remained #1 on the keyword phrase for its main topic. Hmm, that is a good example of how much more important inbound links are compared with on page factors.
Actually this is a result of Google only indexing and updating the WWW once a month. Your website is stored in Google's database. If your site goes offline, that has no effect on Google's database.
Now try taking that site down for just two weeks, and see what happens since Google moved to continual indexing.
Clint
example of how much more important inbound links are compared with on page factors
Actually this is a result of Google only indexing and updating the WWW once a month. Your website is stored in Google's database. If your site goes offline, that has no effect on Google's database.
Oh my god not you polluting another topic..
Brakkar, please go here for the discussion on a disappearing site. We are all experiencing similar problems to the ones you've described recently. Also, sitemaps and new sites seem to be the target of this new penalty(?). Also, thus far, all indications are that this penalty/ban is permanent.
[webmasterworld.com...]
Still having issues, I see. You're beginning to be on the edge of harassing me.
This statement is so wrong it's amazing you are allowed to spread misinformation as you do.
Your comment:
Also sitemaps and new sites seem to be the target of this new penalty(?)
This is from Google. Notice the second bullet explaining to build a site map.
Now if Google suggests building a site map do you THINK they would ban for building one?
Design and Content Guidelines:
* Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.
* Offer a site map to your users with links that point to the important parts of your site. If the site map is larger than 100 or so links, you may want to break the site map into separate pages.
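For a forum of this size, the "100 or so links" guideline means the site map has to be split across many pages. A rough sketch of that split follows; the split_site_map helper and the topic URL pattern are invented for illustration and are not taken from the poster's site.

```python
from typing import List


def split_site_map(links: List[str], per_page: int = 100) -> List[List[str]]:
    """Break a flat list of links into site map pages of at most `per_page` links."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return [links[i:i + per_page] for i in range(0, len(links), per_page)]


# Hypothetical URL pattern for the roughly 11000 topics mentioned in the thread.
topics = [f"/forum/topic-{n}.html" for n in range(1, 11001)]
pages = split_site_map(topics)
print(len(pages))  # 110 site map pages of 100 links each
```

Each page of the site map would then need a static text link from some index page so that every page stays reachable.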
Brakkar, don't let people give you misinformation the way giga did. Tearing apart a site because someone has not done their research is never a good thing.
Clint
I have another similar question though. When I search site:www.widget.com in G I get tons of garbage results in the shape of
http: // www.widget.com/ www.widget.com/index . html
(without the spaces obviously, I just don't want the parser to think that this is a URL)
These links are totally wrong. It reminds me of the effect of forgetting the http in an A tag. Does anyone know what this is all about?
My site is vanilla html, static, handcoded. The index is showing all kinds of crazy urls for the same page of my site, url's that don't exist, like:
www.mysite.com.asp (I have no asp pages at all)
www.mysite.com/ index.shtml
www.mysite.com/?searchengine=3j4hbr2i3uh45f9234f2jb3noifub235098ufvb09ufed
www.mysite.com/?someothersearchengine=gi3h4t9gh394tghv3904uithgv0934ht
www.mysite.com/?businessdirectory=v02489vurh023u4hrbv03u4hbrv032u4hrvmysite/
mysite.com/%1F
www.mysite.com%22
not to mention several listings that include tracking urls from different ppc engines.
My home page, a PR6 has not had a title or description since all of this started months ago, but some of the strange urls and tracking urls have full title and description. The urls with spaces seem to actually point to the correct url, just the green line displayed in the serps has spaces in it.
We asked G last week in Vegas about this and were told to 301 the incorrect urls.
The problem with that is, a 301 does not forward or get rid of tracking urls or snippets that are being added to the end of the url.
To make things worse, if G has an incorrect url for my site and it leads to a 404 page, it indexes the content of the 404 page with a title "404 Not Found" and a snippet "the requested url was not found on this server".
I would like to add that no other search engine in the world is doing this, this is specific to Google.
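To make the "301 the incorrect urls" advice concrete, here is one possible sketch of how a server-side script might compute the redirect target for some of the variants listed above. Everything in it is an assumption for illustration: the canonical host, the set of index aliases, and above all the decision to drop the whole query string, which would also discard any tracking parameters a ppc engine relies on.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.mysite.com"  # placeholder host used in the thread
INDEX_ALIASES = {"/index.html", "/index.shtml"}  # assumed aliases of the home page


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 should point to, or None if `url` is already canonical."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path in INDEX_ALIASES:
        path = "/"
    # Assumption: query strings carry only tracking data and can be dropped.
    canonical = urlunsplit(("http", CANONICAL_HOST, path, "", ""))
    return None if canonical == url else canonical


print(redirect_target("http://www.mysite.com/?searchengine=abc"))
print(redirect_target("http://mysite.com/index.shtml"))
print(redirect_target("http://www.mysite.com/"))
```

This only covers the well-formed variants. Malformed hosts such as www.mysite.com.asp or www.mysite.com%22 never reach the site at all, so no redirect on the server can fix those.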
Write to your web host.
Have your web host turn off symbolic links.
This should eliminate your problem.
I would also do a whois on your site and see how many other sites share your server.
The other explanation is you bought a domain name that had been owned previously and drawing traffic as well as having been indexed by google.
Remember those results that are displayed are pulled from their database of indexed pages.
If a site drops offline, this has no effect on Google's database other than that the pages inside it will not be updated.
Clint
OK, now we are back to 24 million results, up from 5 million five minutes ago, and the page count for the site is now 600, which it has had for 2 days, so it looks like they are making some changes again.
Symlinks and Hardlinks explained.
Unix/Linux files consist of two parts: the data part and the filename part
The data part is associated with something called an 'inode'. The inode carries the map of where the data is, and the permissions, etc for the data.
The filename part carries a name and an associated inode number.
More than one filename can reference the same inode number; these files are said to be 'hard linked' together.
With hard links you can remove the original file you hard-link to and still have the data, whereas with softlinks, if you remove the original file, you are removing the data and the softlink which remains, points to nothing.
Hard links must be on the same partition whereas soft links can span across partitions and networks for that matter.
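The difference described above can be seen in a few lines. This is just a small demonstration, assuming a Unix-like system where creating symlinks is permitted; the file names and the demo_links helper are made up for illustration.

```python
import os
import tempfile


def demo_links():
    """Create a file, a hard link and a symlink, then remove the original."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")

        with open(original, "w") as f:
            f.write("page content")

        os.link(original, hard)     # second filename for the same inode
        os.symlink(original, soft)  # separate entry that only stores a path

        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino

        os.remove(original)

        # The hard link still reaches the data; the symlink now points to nothing.
        with open(hard) as f:
            hard_still_readable = f.read() == "page content"
        soft_is_dangling = os.path.islink(soft) and not os.path.exists(soft)

        return same_inode, hard_still_readable, soft_is_dangling


print(demo_links())
```

On a typical Linux filesystem all three values should come out True: both names share one inode, the data survives through the hard link, and the symlink is left dangling.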
For My3cents
I am not sure this will help you. If you did indeed purchase a previously owned domain name, I doubt that it will. If your domain name was never owned before, this step should help.
Clint
Q. If I have these symbolic links turned off, will the tracking urls for the ppcs still work? What about dynamic links from other directories and search engines? LookSmart and Business.com dynamic links are being shown instead of my real urls. I had LookSmart and Business.com remove the tracking portion of the links 3 or 4 months ago, but they are still in the index with full title and description. No PR, no ranking, and the real urls have lost all positioning. On top of that, every month I see a whole new slew of "dynamic" type urls pointing to my site, usually from a search engine or directory.
I would imagine I am getting a duplicate content penalty since the same page is in the index 5-10 times, and this is happening to most of my main category pages. The pages one level below that are mostly supplemental links now.
Turning off symlinks will only affect your site and not any inbound links.
As far as being outranked by other directories and search engines I believe this is a matter of current relevant content as opposed to a duplicate content penalty.
Duplicate content penalties would be for page content and not for head tags.
I think you have one of two options.
1. Update content per page and wait two weeks to be reindexed and relisted.
2. Change your head tags per page to beat the search engines and directories. Give it two weeks.
My two week rule and current relevant content thoughts come from an article I wrote based on Google moving to continual indexing.
On November 10th I released a press release for a client.
Keywords used were americas diamonds and diamond brokers.
As of November 20th the press release ranked #3 for americas diamonds and #5 for diamond brokers on Google's front page ;->
Hope this helps
Clint
Write to your web host.
Have your web host turn off symbolic links.
This should eliminate your problem.
Symlinks and Hardlinks explained.
Unix/Linux files consist of two parts: the data part and the filename part
I don't think my webhost wants to turn symlinks and hardlinks off.
And I don't understand what they have to do with the urls that Apache dishes up, such that only Google can misinterpret them.
Please explain some more.
[edited by: Brett_Tabke at 11:57 pm (utc) on Nov. 25, 2004]