
Google News Archive Forum

This 389 message thread spans 13 pages (page 13 of 13).
Dupe content checker - 302's - Page Jacking - Meta Refreshes
You make the call.
Marcello

10+ Year Member



 
Msg#: 25638 posted 11:35 am on Sep 7, 2004 (gmt 0)

My site, let's call it www.widget.com, has been in Google for over 5 years, steadily growing year by year to about 85,000 pages including forums and archived articles, with a PageRank of 6 and 8287 backlinks in Google. No spam, no funny stuff, no special SEO techniques, nothing.

Normally the site grows at a tempo of 200 to 500 pages a month indexed by Google and others ... but for about a week now I have noticed that my site is losing about 5,000 to 10,000 pages a week in the Google index.

At first I simply presumed that this was the unpredictable Google flux, until yesterday, when the main index page of www.widget.com disappeared completely out of the Google index.

The index-page was always in the top-3 position for our main topics, aka keywords.

I tried all the techniques to find my index page, such as allinurl:, site:, direct link, etc., but the index page has simply vanished from the Google index.

As a last resort I took a special chunk of text which can only belong to my index page, "company name own name town postcode" (a sentence of 9 words), and searched for it in Google.

My index page did not show up, but instead 2 other pages from other sites showed up as having this information on their page.

Let's call them:
www.foo1.net and www.foo2.net

Wanting to know what my "company text" was doing on those pages I clicked on:
www.foo1.com/mykeyword/www-widget-com.html
(with mykeyword being my site's main topic)

The page could not load and the message:
"The page cannot be displayed"
was displayed in my browser window

Still wanting to know what was going on, I clicked "Cached" on the Google SERPs ... AND YES ... there was my index page as fresh as it could be, updated only yesterday by Google itself (I have a daily date on the page).

Thinking that foo was using a 301 or 302 redirect, I used the "Check Headers Tool" from WebmasterWorld, only to get a code 200 for my index page on this other site.

So I figured foo must be using a meta redirect. I quickly made a little robot in Perl using LWP, adding a little code that would recognize any kind of redirect.

Fetched the page, but again got a code 200 with no redirects at all.

Thinking foo's site might be up again, I tried once more to load foo's page with IE, Netscape and Opera but always got:
"The page cannot be displayed"

Tried it a couple of times with the same result: LWP can fetch the page but browsers can not load any of the pages from foo's site.

Wanting to know more I typed in Google:
"site:www.foo1.com"
to get a huge load of pages listed, all constructed in the same way, such as:
www.foo1.com/some-important-keyword/www-some-good-site-com.html

I also found some more of my own best-ranking pages in this list, and after checking the Google index, all of those pages from my site have disappeared from it.

None of the pages found using "site:www.foo1.com" can be loaded with a browser, but they can all be fetched with LWP, and all of those pages are cached in their original form in the Google cache under foo's cache link.

I have sent an email to Google about this and am still waiting for a response.

 

kwngian

10+ Year Member



 
Msg#: 25638 posted 10:48 am on Dec 22, 2004 (gmt 0)


I think Google will fix this problem eventually.

Mine was fixed and my traffic has doubled.

A big portion of the pages on my site were either hijacked or copied, all for the purpose of AdSense. Now all these pages appear as supplemental results.

rocco

10+ Year Member



 
Msg#: 25638 posted 11:08 am on Dec 22, 2004 (gmt 0)

what do you think, some speculation:
[webmasterworld.com...]

tombola

10+ Year Member



 
Msg#: 25638 posted 11:15 am on Dec 22, 2004 (gmt 0)

The least one can say is that these two elements (hijacked pages and this new update) are closely linked. :-(

The_Hitcher

10+ Year Member



 
Msg#: 25638 posted 5:36 pm on Dec 22, 2004 (gmt 0)

I have tried to query this issue with Google and got the standard replies. Lately I've had to contact owners of directories to remove certain links that now seem to look like redirects due to the way Google is now listing them.

A client's site has a major index page problem: no matter what phrases you extract from his home page, they do not appear to have any relevance whatsoever in the Google results (although the page is actually in Google), yet any other page comes up fine. I find myself trying to 'bodge' pages in an effort to get round a bug that Google has made no attempt to fix. Effectively you can take out another site because of this and I'm amazed that Google still hasn't addressed this problem.

Is their algo now so complex that even they don't understand it? This is all related to 'missing index pages', redirects and site hijacking. The issue comes up time and time again with no end in sight.

renee

10+ Year Member



 
Msg#: 25638 posted 5:38 pm on Dec 22, 2004 (gmt 0)

Is there a way of detecting if a request for my page is a redirect? Through a script (PHP)?

zeus

WebmasterWorld Senior Member zeus us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 25638 posted 6:07 pm on Dec 22, 2004 (gmt 0)

renee - just do a site:yourdomain.com search, see if there are other domains listed, and check them out.

Seo1

10+ Year Member



 
Msg#: 25638 posted 7:32 pm on Dec 22, 2004 (gmt 0)

Hi There

Doing site:yourwebsite.com does not show duplicate content, it shows results that mention your website name. That can come from any forum, community, blog, irc chat you have ever joined or written in about your website or that anyone else has written about your website and included the full url as a link.

Also I did a search on google for keyword term - ranking relevancy- and the same article was brought up several times in the first three pages. All on different sites. I don't see there being a dupe content ban with this.

Perhaps people are confused as to what is a hijacked site and one that is linked improperly or simply mentioned several times on other websites.

Clint

renee

10+ Year Member



 
Msg#: 25638 posted 8:34 pm on Dec 22, 2004 (gmt 0)

zeus

thanks for the response. I'm asking though if there is a way of detecting it programmatically when the page is requested. something like the variable HTTP_REFERER yields the page that originated the request (and not the redirecting page!)

crobb305

WebmasterWorld Senior Member crobb305 us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 25638 posted 9:08 pm on Dec 22, 2004 (gmt 0)

Doing site:yourwebsite.com does not show duplicate content, it shows results that mention your website name

Not true. Site: command is supposed to show pages associated with your site. If other domains are showing as they do for my site: searches then those are being incorrectly tied to your domain.

instinct

10+ Year Member



 
Msg#: 25638 posted 9:31 pm on Dec 22, 2004 (gmt 0)

First we google-bomb the phrase "faulty search engine" to point at a page outlining G's problems (hijacking, sandbox etc) and THEN we send out the press release.

Embarrass them into action!

(Only joking)

;-)

dazzlindonna

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 25638 posted 9:40 pm on Dec 22, 2004 (gmt 0)

Bloggers have recently had a lot of power in gaining the attention of the masses. Perhaps a concerted effort of blogging about the problem is due. I have blogged about it in the past, and I blogged about it again today. Those who have blogs, consider doing the same.

crobb305

WebmasterWorld Senior Member crobb305 us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 25638 posted 10:57 pm on Dec 22, 2004 (gmt 0)

I have a friend in the media who says this would certainly be an interesting news topic. I hate to suggest a press release because Google may be working on this problem. The issue is that they are remaining silent and that they have had this problem since 2003.

And legitimate businesses are suffering at the hands of malicious hijackers.

siteseo

10+ Year Member



 
Msg#: 25638 posted 11:22 pm on Dec 22, 2004 (gmt 0)

As I posted in another thread, I believe part of the duty of the Supplemental Index is to weed out dupe pages/hijackers. The definition G gives for the SI is:
"Supplemental results are triggered on a relatively small number of queries for which Google's main index does not provide many results. Because this index is still in testing..."

So clearly it is still being tested.

I can only say that many pages that duplicate my content (especially affiliate links) have only recently been moved to the SI.

crobb305

WebmasterWorld Senior Member crobb305 us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 25638 posted 1:17 am on Dec 23, 2004 (gmt 0)

The fact that the site:mysite.com command is still showing spammy domains being associated with my site tells me that nothing is changed. Yes, they are in the supplemental basket, but they are still incorrectly being shown as part of my site.

walkman



 
Msg#: 25638 posted 1:31 am on Dec 23, 2004 (gmt 0)

"The fact that the site:mysite.com command is still showing spammy domains being associated with my site tells me that nothing is changed. Yes, they are in the supplemental basket, but they are still incorrectly being shown as part of my site."

Agree with you. It doesn't matter where they are. They were in the supplementals before I deleted them but I was still penalized. The supp. links should've been ignored; instead I got penalized too.

I still am penalized but hopefully Google will update itself. I deleted some links a week ago and the last one 3 days ago. The people who inadvertently linked with Meta or 302 have been very cooperative and nice.

energylevel

10+ Year Member



 
Msg#: 25638 posted 1:59 am on Dec 23, 2004 (gmt 0)

I still think we need some clarity on this issue as to the exact nature of these redirects and how to diagnose whether they are hurting us. Maybe some of us are assuming these redirects are the problem when the penalty is for something entirely different (I'm just playing devil's advocate here, you understand).

I've heard to diagnose do a search for:

inurl:www.yourdomain.com
allinurl:www.yourdomain.com
site:www.yourdomain.com

.... now I've seen a site where the allinurl search is showing a few redirects but the site: search is clean, so are we saying that if the redirect doesn't appear in a site:www.mydomain.com search then it's NOT an issue?

I've seen a few redirects similar to this now in allinurl searches:

www.externaldomain.com/somescript.php?url=www.mysite.com
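For anyone with a long list of inurl/allinurl results to wade through, URLs of this shape can be picked out automatically. What follows is only a rough sketch in Python, not anything posted in this thread: the function name `redirect_targets` and the list of parameter names in `TARGET_PARAMS` are assumptions made for illustration, and real redirect scripts may use other parameter names entirely.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical list of query parameter names a redirect script might use.
TARGET_PARAMS = ("url", "u", "link", "goto", "target")

# True when the string starts with a scheme such as http:// or https://
HAS_SCHEME = re.compile(r"[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def redirect_targets(listing_url, my_domain):
    """Return (host, parameter, value) for query values that mention my_domain."""
    # Search results often show URLs without a scheme; urlsplit needs one
    # in order to separate the host from the path.
    if not HAS_SCHEME.match(listing_url):
        listing_url = "http://" + listing_url
    parts = urlsplit(listing_url)
    hits = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name.lower() in TARGET_PARAMS and my_domain.lower() in value.lower():
            hits.append((parts.netloc, name, value))
    return hits


if __name__ == "__main__":
    print(redirect_targets(
        "www.externaldomain.com/somescript/php?url=www.mysite.com",
        "mysite.com"))
```

A hit only says that the URL names your domain in a parameter. It says nothing about whether the script answers with a 302, a meta refresh or JavaScript, so each hit still has to be checked by hand or with a header tool.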

I've been copying and pasting the URLs into the 'View HTTP Request and Response Header' tool at web-sniffer.net and seen a variety of things going on with 302's, meta refreshes, javascript redirects.

The latest one I've seen shows a HTTP Status Code: HTTP/1.1 200 OK with my domain as the host and the content as my homepage content. Does this mean Google might think it's a mirror of my homepage and dish out a duplicate content penalty?

I'm getting to the stage now where I've no idea which redirects may be hurting or not.....

If anyone has a proven list of actual methods that are known to be hurting the sites they redirect to could they please post here or start a new thread. If they think the info is too sensitive please send me a stickymail.....

Has anyone managed to find a method of getting these redirects removed from Google's index quickly when they can't get cooperation from the offending site?

walkman



 
Msg#: 25638 posted 2:21 am on Dec 23, 2004 (gmt 0)

"Has anyone managed to find a method of getting these redirects removed from Google's index quickly when they can't get cooperation from the offending site?"

I would try to email the offending site's host and, if that fails, send a DMCA complaint to Google. Your content is indexed under their domain. It's just as simple as that. Good luck

eddy22

10+ Year Member



 
Msg#: 25638 posted 6:38 am on Dec 23, 2004 (gmt 0)

My site has approx 500 pages but site:mysite.com shows over 1000 pages on google.
Any advice/tips will be appreciated.

thanks in advance,
eddy

rocco

10+ Year Member



 
Msg#: 25638 posted 11:34 am on Dec 23, 2004 (gmt 0)

to find dup content also use:

allintitle:
search for unique text appearing on your sites with quotes: "some aldja aldjfjlkj a whatever"

hunkydory

10+ Year Member



 
Msg#: 25638 posted 11:42 am on Dec 23, 2004 (gmt 0)

I have found a page on a site that uses this to "link" to my site: "<script>document.write('<me'+'ta http-equi'+'v="refresh" content="0;'+'url=http://www.'+'my-site-in-link.'+'com/">');</script>"
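The tag in that snippet is chopped into string fragments, so a plain text search for "http-equiv" or for your own domain will not find it in the page source. A minimal sketch in Python of one way to look for it anyway: glue adjacent single-quoted fragments back together, then search for a meta refresh tag. The function name `hidden_refresh_target` is made up for this example, and it only handles the simple '+' concatenation shown above, not other obfuscation tricks.

```python
import re

# The seam between two adjacent single-quoted JavaScript string literals: '+'
SEAM = re.compile(r"'\s*\+\s*'")

# A meta tag whose http-equiv attribute is "refresh" (quotes optional).
META_REFRESH = re.compile(
    r"""<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>""",
    re.IGNORECASE)

# The url=... part inside the tag's content attribute.
URL_IN_CONTENT = re.compile(r"url\s*=\s*([^\"'>\s]+)", re.IGNORECASE)


def hidden_refresh_target(script_source):
    """Return the meta refresh target hidden in concatenated strings, or None."""
    joined = SEAM.sub("", script_source)   # 'ab'+'cd' becomes 'abcd'
    tag = META_REFRESH.search(joined)
    if tag is None:
        return None
    url = URL_IN_CONTENT.search(tag.group(0))
    return url.group(1) if url else None


if __name__ == "__main__":
    sample = """<script>document.write('<me'+'ta http-equi'+'v="refresh" content="0;'+'url=http://www.'+'my-site-in-link.'+'com/">');</script>"""
    print(hidden_refresh_target(sample))
```

Because the refresh is written by a script, a tool that only reads HTTP headers or static HTML will report the page as an ordinary 200 with no redirect.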

I tried emailing them to ask them to remove it but the email just gets bounced back and the whois info is no good either.

I have also done a google inurl search as mentioned in a post above and that returns loads of results for sites that are nothing to do with me (4 times as many results as there are pages on my site).

What can I do now? My site has gone from top 5 on several keywords in different search engines to not being listed anywhere in Google, MSN or Yahoo.

energylevel

10+ Year Member



 
Msg#: 25638 posted 11:55 am on Dec 23, 2004 (gmt 0)

this is one of my questions .. it's difficult to get any clarity on any of the related issues .. there are so many posts offering variations on this theme that I'm beginning to wonder what's a potentially harmful redirect and what isn't, and what's the correct way to diagnose that you are the victim of such redirects, hijacking, jacking, whatever you wish to call it!

It'd be nice if a senior member posted on this giving all possible scenarios but I guess people are busy and have other things to do

Seo1

10+ Year Member



 
Msg#: 25638 posted 2:18 pm on Dec 23, 2004 (gmt 0)

Hi there

I will try to explain.

1. Why is hijacking done?

Hijacking is done to high traffic and/or high Google Position Rank sites. So it is done for your traffic, your Google Position Rank, or both.

2. How is it done?

Dishonest webmasters place code on their website's pages so that when a user clicks on the offending URL it takes them to the webmaster's site first and then very quickly on to your site, thereby stealing your Google position rank and/or traffic.

3. Are you a victim?

Search in Google's search box for your website in the three ways below:

[yourwebsite.com...]

[yourwebsite.com...]

www.yourwebsite.com

When the Google results page comes up select the link that states: "Find web pages that link to yourwebsite.com".

Your URL will be preceded by the http:// or www. you used to search with above.

Next you will see a list of pages which contain links to your website. Here you will want to look for the redirects in one of three ways.

1. Look for a URL like the one below:

www.dishonestsite.com/tracker2.php?url=http://www.yourwebsite.com/oneofyourpages

If you find this your site is being hijacked or redirected.

2. Look for entries "File Not Found" - Run your mouse over the File Not Found title and look in the lower left corner of your browser window. If you find a URL listed in front of your site URL then your site is being hijacked or redirected.

3. Sometimes the URL will look innocent. However, if it is not a site you exchanged links with, take a look at the site by clicking the link. In your browser click "View", then scroll to "Source" or "Page Source". What you will then see is the underlying code of the site's page.

Looking inside the <head> tags near the top of the page, look for the following line of code.

<meta http-equiv="refresh" content="0; url=http://www.yourwebsite.com">

If this is present then your site is being redirected although in a more innocuous manner.
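Reading page source by eye gets tedious with more than a handful of suspect pages. As a rough illustration only (this is not the Perl/LWP robot from the first post), here is a small Python sketch that scans HTML for meta refresh tags using the standard library's `html.parser`. The names `MetaRefreshFinder` and `meta_refresh_targets` are invented for the example, and it assumes the common "delay; url=target" form of the content attribute.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects (delay, target) pairs from <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values can be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content normally looks like "0; url=http://example.com/"
        delay, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            target = rest[4:].strip().strip("'\"")
            self.targets.append((delay.strip(), target))


def meta_refresh_targets(html):
    """Return every meta refresh target found in the given HTML text."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets


if __name__ == "__main__":
    page = ('<html><head><meta http-equiv="refresh" '
            'content="0; url=http://www.yourwebsite.com"></head></html>')
    print(meta_refresh_targets(page))
```

This only sees tags that are present in the HTML as delivered. A refresh written out by JavaScript, or a page that serves different content to different visitors, will not show up here.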

How to stop this:

1. Write the website.
2. Write the domain name registrar. (Done by typing into the browser address bar: htt p://#*$!/www.thewebsiteyouwantinfoon.com)
3. Write the ISP hosting the website.
4. File a DMCA report with Google and other search engines where this may be occurring.

Hoping this helps answer your questions

Lorel

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 25638 posted 5:38 pm on Dec 23, 2004 (gmt 0)

Hi Energy Level


I've heard to diagnose do a search for:

inurl:www.yourdomain.com
allinurl:www.yourdomain.com
site:www.yourdomain.com

.... now I've seen a site where the allinurl search is showing a few redirects but the site: search is clean, so are we saying that if the redirect doesn't appear in a site:www.mydomain.com search then it's NOT an issue?

I've just checked 4 of my client's sites that I've been trying to get the redirect code removed from and those redirects come up in all 3 of the commands you listed above and they also come up when I search for the domain name and then click on the "other sites with this URL" link.

One of these redirecting people I contacted profusely denied any involvement in a redirect to my client's site and I could find no proof otherwise.

It has been rumored that it is a bug in Google's algorithm instead. I'm beginning to believe it's the latter in "some" cases. (I'm not talking about tracker2.php codes but "file not found" results in the searches above that are actually redirects.)

I have written Google about most of these redirects (using the url posted elsewhere on this site entitled "canonicalpage") and got a reply on two of them that they had turned it over to their engineers.

The reason I'm inclined to believe it is a bug in Google's algorithm is that all of these sites I manage have very little PR on the affected page (this could be because of the redirect), but none of them had a PR over 4 before this started and they are all small sites, so it's not likely it was a deliberate attempt to steal PR. The sites are also totally unrelated in subject.

I think if everyone who has been affected wrote google about these redirects the problem will eventually be solved.

Lorel

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 25638 posted 5:54 pm on Dec 23, 2004 (gmt 0)

PS. I forgot to add to my post above, in my research several of these redirects mentioned above--the two sites (innocent party and apparent redirecting party) are both on the SAME HOST with SAME IP ADDRESS, i.e., shared hosting. When I contacted the host pointing out this problem the reply was "buy a dedicated server". Yeah sure--at $150 per month.

How long have Virtual Servers been in effect, and how long has this redirect problem been in effect? Is there a connection?

Seo1

10+ Year Member



 
Msg#: 25638 posted 7:45 pm on Dec 23, 2004 (gmt 0)

Hi

For energylevel

No. If it is not found by searching through the URLs, that does not mean it is not happening. It is possible the offending site uses mod_rewrite in their .htaccess file and accomplishes the same thing.

Lorel. Toss the DMCA in the host's face and then, if they won't take steps to resolve the issue, contact the state where the hosting company is located or your own state's Consumer Affairs Dept, Better Business Bureau, and Attorney General's Office.

One of these agencies will be glad to get the web host's attention.

Clint

webdude

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 25638 posted 7:51 pm on Dec 23, 2004 (gmt 0)

energylevel,

I was one of the people who was originally and heavily involved in this thread. As posted before, there are things to look for in the hijacking (which I believe in MOST cases is not intentional).

First...
Search for your site. If a link in the SERPs that shows your page title (you can check by running your mouse over the returns) goes to another site, you might have been hijacked.

If you click on that link and it goes to your site, you might have been hijacked.

Copy the link and run it through an HTML checker. I am not going to recommend which checker to use, but use one which shows the html of the page you are checking. If the page shows a meta refresh to your site, it might have been hijacked.

If you try checking with a header check, it will seem fine because you will just get a 200 page found. That tells you nothing.

If you do a link:siteA.com/metarefreshpage.html, it will show the EXACT same backlinks as your page, even though the only page that exists on the offending site is the meta refresh page.

So in a nutshell, the offending page is just a metarefresh that shows your EXACT same backlinks even though there is no code on the page except the metarefresh
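To make the header-check point concrete: a header checker only reports the HTTP status line, so a page that redirects through a meta refresh still answers 200. Below is a minimal sketch in Python, assuming nothing beyond the standard library. The helper names `classify_status` and `fetch_status` are made up for this example, and the `user_agent` default is just a placeholder string.

```python
import http.client
from urllib.parse import urlsplit


def classify_status(status):
    """Label an HTTP status code by the kind of redirect it signals, if any."""
    if status in (301, 308):
        return "permanent-redirect"
    if status in (302, 303, 307):
        return "temporary-redirect"
    if 200 <= status < 300:
        # No redirect at the HTTP level. A meta refresh or a JavaScript
        # redirect inside the HTML is still possible.
        return "no-http-redirect"
    return "other"


def fetch_status(url, user_agent="Mozilla/5.0"):
    """Request url once, without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    for code in (200, 301, 302, 404):
        print(code, classify_status(code))
```

`http.client` does not follow redirects on its own, which is the point here: a 302 shows up as a 302 together with its Location header, and a 200 tells you to go and read the HTML.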

Second...
If you have the toolbar installed, right-click the actual link in the SERPs. Check the cache. If it shows your page as being the cached page, the site has been intentionally or unintentionally hijacked.

Is this making sense?

In other words, the refresh page shows your cache, shows your backlinks, is returned in the SERPs exactly like your site, shows your PR, but the actual page is just a redirect.

There are a lot of examples of this out there, but usually no one catches it because the REAL site ranks higher than the redirect. But in some cases, as with what happened to me, the redirect page was actually ranking higher than mine.

Now some may think, so what? When you click on the link, it still ends up going to your site. BUT, if the site that had the link up removes the redirect page and then adds another redirect to, say, their home page, then you have problems. People will click the SERPs expecting to find your site but will end up going to another site altogether. THAT is what happened to me.

Eventually, Google drops the offending link, but in my case it took several months for that to happen. When the offending link disappeared, within a week or so my site started ranking normally again.

I have been in close contact with the site that had the offending link and I have come to the conclusion that the webmaster did not intentionally try to steal my PR, backlinks or cache; it just turned out that way. The reason for this is that when G follows a redirect (302), it credits the redirect page with all the attributes of the site that the redirect points to. This, in my opinion, is a flaw on Google's part. If Google treated 302s the same way they treat 301s, I don't think there would be a problem. Why this isn't fixed is anyone's guess. Some speculate a fix may upset the SERPs too much to be put in place. Others say that it affects too few sites for Google to get into action.

One of the moderators here posted all the info I had pertaining to my site to GoogleGuy (header info, offending info and everything else we could think of) to see if something could be done. Not sure if it was coincidence or not, but the site was back to normal within a couple of weeks after that.

I hope this is making sense to you.

energylevel

10+ Year Member



 
Msg#: 25638 posted 7:59 pm on Dec 23, 2004 (gmt 0)

I'm investigating redirects that have possibly hurt a site's Google rank. I wouldn't class it as an outright hijacking, but I believe we're bundling variations of this issue into one. The suspect redirects appear in the inurl and allinurl searches but NOT in the site search on Google ... I'm sure at least one of these redirects has resulted in some kind of penalty being dished out by Google, possibly a duplicate content penalty or a penalty that was designed to deter people from creating multiple gateway pages on external domains and using redirects (meta refreshes, for example).

Behind this particular 302 redirect I have seen a meta refresh and JavaScript redirect cocktail. I immediately thought it must be suspect because I see no reason to code a redirect like this and then try to hide it somewhat. The URL was in a similar format to many others reported by people here on these forums:

www.externaldomain.com/scriptname.php?url=www.mydomain.com

For my bit I think:

That if a URL like the one above is appearing in a site search for your domain, then the matter is more serious and you're looking at a potential outright hijacking of your site.

If the URL appears in an inurl or allinurl search but NOT in the site search, and goes directly to your site, then there is a risk of a penalty on your site, resulting in a dramatic drop in your Google rank in search results.

I initially thought these potentially damaging redirects involved the meta refresh and 302 redirect cocktail only but I'm not so sure now how many methods are potentially dangerous now and in use and to what degree they are hurting the targeted sites?

energylevel

10+ Year Member



 
Msg#: 25638 posted 8:05 pm on Dec 23, 2004 (gmt 0)

Lorel .. by the way I'd say yes, it probably is a bug in Google's algo that is being exploited. I just hope they sort it out ASAP if it is!

I'm not sure why a shared IP would be related but you can anyway get a dedicated IP with some hosts without having your own server ..doesn't cost much...

crobb305

WebmasterWorld Senior Member crobb305 us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 25638 posted 8:11 pm on Dec 23, 2004 (gmt 0)

Toss the DMCA in the host's face and then, if they won't take steps to resolve the issue, contact the state where the hosting company is located or your own state's

What if hosting is in Canada?

c

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved