Does anyone have a similar problem?
Googlebot has visited these pages tons of times over the past few months. However, the cache remains out of date.
Thanks.
BTW the page was not an old hijacked one, but came from an impeccable Italian site dealing with realtime news...
Ahhhh, there you have it. Any site that posts news gathered from another site(s) is likely to receive a duplicate content penalty.
Ahhh... think again, Lorel - not true. A site called Topix does nothing but reprint news from other sources, and they have a PR of 7! Of course, getting 10,000 bogus listings from the ODP probably boosts them pretty well.
The [216.239.59.104...] datacenter holds new and OLD stuff. Besides my existing pages with recent cache dates up to yesterday, there are many old supplementals, some of them cached back on 2004-09-28 (!), and many old non-www pages, which have been redirected to www by long-standing 301s since 2004-09.
These results are very similar to those I get when calling www.google.com/.
Since I have "your visit" date/time stamps on every page, it is easy to check the true age of the cached data.
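If anyone wants to copy that trick: a tiny per-request timestamp is enough, so the date shown in Google's cached copy tells you exactly when Googlebot last fetched the page. A minimal sketch, assuming a Python CGI setup and a hypothetical template.html with a {visit_time} placeholder:

    #!/usr/bin/env python
    # Minimal CGI sketch: embeds the time of the current request in the page,
    # so the date visible in a cached copy shows when the crawler fetched it.
    # template.html and its {visit_time} placeholder are hypothetical.
    import datetime

    print("Content-Type: text/html\n")
    with open("template.html") as f:
        html = f.read()
    stamp = datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M UTC")
    print(html.replace("{visit_time}", "Your visit: " + stamp))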
Google, please roll out the 216.239.37.99 results - they will work for me ...
Regards,
R.
I added those supplemental results to robots.txt and used the removal tool; it says complete, but nothing has changed in the SERPs. I don't think it's possible to remove those with the tool, because they are listed in the supplemental results DB.
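For anyone trying the same route: the robots.txt side of it is just one Disallow line per URL, something like the sketch below (the paths are hypothetical placeholders). The removal tool keys off that exclusion when you submit the URL this way.

    # robots.txt sketch - the paths below are hypothetical placeholders
    User-agent: Googlebot
    Disallow: /old-supplemental-page.html
    Disallow: /another-stale-page.html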
Ah... I wish I had known that. Is that why I get "denied" whenever I try to remove these pages? The problem I am having is that these results are pages that don't exist anymore.
>>>Ahhhh, there you have it. Any site that posts news gathered from another site(s) is likely to receive a duplicate content penalty. <<<
>>>Ahhh... think again, Lorel - not true. A site called Topix does nothing but reprint news from other sources, and they have a PR of 7! Of course, getting 10,000 bogus listings from the ODP probably boosts them pretty well.<<<
PR has nothing to do with it. If they are reprinting news and not getting a supplemental results penalty for it then they must be adding at least 12% more text to the page to prevent that penalty.
See Brett's post re this being the amount needed to bypass that penalty.
I have tried this on the site for the past several months. I cannot get it to work. I started a whole thread about it. I even joined Google Groups and asked there. I have also emailed Google but just got the standard response. I have had the missing pages checked by multiple different header checkers and have also had members here check to make sure that they were coming back as 404 pages. They are. I have tried everything to get rid of these pages in the supplemental results and I just can't get it done. Every time I use the removal tool, I get a "Request Denied." I am totally baffled as to why I cannot get this to work.
Any help or reason on this would be greatly appreciated. The other thread is...
[webmasterworld.com...]
Put a page in the slot. Use the tool. The listing will be removed in hours. This has nothing to do with a page being 404, and the page should not show as a 404.
The site that was hit by hijackers and the Googlebug 302 also had a 301 in place, but that was put up 4 months ago and still nothing has changed.
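In case it helps anyone reading along: the usual way such a 301 gets put in place on Apache is a couple of mod_rewrite lines in .htaccess. A sketch, assuming Apache with mod_rewrite enabled and a hypothetical example.com domain, forcing the non-www host over to www (the variant mentioned earlier in this thread):

    # Permanently (301) redirect non-www requests to the www host.
    # example.com is a placeholder - substitute your own domain.
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]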
About omitted results, which also appear extremely quickly these days: I had a site where I deleted a lot of content just so it REALLY did not look the same - not only the 15% some say is enough - and then all pages were listed with a description like in the good old days. Also remember to change your meta description.
About supplemental results: as soon as you see them you should be on alert. It could be that you have been hit by the Googlebug 302, that scrapers have too much of your content, or maybe hijackers - but remember it does NOT have to be that dramatic.
I am doing exactly what the instructions tell me to do on the Google site. I am going to "Remove an outdated link". At the top of that page, it states, "Enter the URL of your page. We will accept your request only if the page no longer exists on the web". In other words, it has to return a 404. I already verified this on other threads. To use this part of the tool, the page MUST return a 404 or this part of the tool will not work. I am then clicking the radio button that says "Remove anything associated with this URL." I wait a day or two while it shows the request as "Pending," and after the wait I get a "Request Denied."
If I am doing this incorrectly, please tell me in a step by step fashion exactly what I am supposed to do. Are you saying that the pages have to exist before they can be removed from the index?
Remove an outdated ("dead") link: Google updates its entire index automatically on a regular basis. When we crawl the web, we find new pages, discard dead links, and update links automatically. Links that are outdated now will most likely "fade out" of our index during our next crawl.
Note: If you believe your request is urgent and cannot wait until the next time Google crawls your site, use our automatic URL removal system. We'll accept your removal request only if the page returns a true 404 error via the http headers. Please ensure that you return a true 404 error even if you choose to display a more user-friendly body of the HTML page for your visitors. It won't help to return a page that says "File Not Found" if the http headers still return a status code of 200, or normal.
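If you want to double-check that yourself rather than rely on an online header checker, a short Python sketch (the URL below is a hypothetical placeholder) will show whether a page really returns a 404 status code or a "soft" 200 with a friendly error body:

    # Print the real HTTP status code behind a URL.
    # The URL below is a hypothetical placeholder.
    import urllib.request
    import urllib.error

    url = "http://www.example.com/deleted-page.html"
    try:
        response = urllib.request.urlopen(url)
        print(url, "returned", response.getcode())  # 200 here means a "soft" 404
    except urllib.error.HTTPError as err:
        print(url, "returned", err.code)            # a true 404 lands here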
As I suggested above, just use the "Remove a single page using meta tags" option. It is as simple as can be. Add a blank page with the correct meta tag, enter it in the urlconsole, delete the page, and it's gone in hours. Repeat it as many times as necessary to get rid of all the pages.
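For anyone following along, the meta tag in question is the standard robots noindex tag, placed in the head of an otherwise blank page uploaded at the URL you want removed. A minimal sketch:

    <!-- Blank placeholder page whose only job is to carry the noindex tag.
         Upload it at the exact URL you want removed, then submit that URL. -->
    <html>
    <head>
    <meta name="robots" content="noindex">
    <title>Removed</title>
    </head>
    <body></body>
    </html>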