Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Is there a Browser. . .
wilderness




msg:4630680
 5:26 pm on Dec 14, 2013 (gmt 0)

that offers custom configurations to ignore either "footers" or "DIV's used as a footer" (in CSS)?

I'm using FF25 and couldn't see anything like that offered.
Do seem to recall one browser offering custom CSS (create your own), however that option is pretty far-fetched for a typical widget user.

I've a visitor that has been grabbing 8-10 pages on one site for a few weeks now.
Each page request ignores the image contents of "footers" or "DIV's used as a footer". (Most of the images are outside links to other widget orgs, repeated the same on each page; however, some images are thumbnails pertinent to the page body content.)

The former visitor (redirected to contact page) certainly has an interest in widgets, however the page views are not correlated to the content structure of the site.

 

lucy24




msg:4630689
 6:50 pm on Dec 14, 2013 (gmt 0)

Some browsers allow you to use custom CSS. You could make a custom stylesheet that says simply

footer {display: none;}
or
div.footer {display: none;}

Is your visitor using some ordinary human browser? Are the non-footer images getting loaded all in one gulp, the way you'd expect with a human? Or do they come at intervals of a second or more, as you'd get with either a robot or someone loading images manually?

Edit:
Oh, wait. Setting {display: none;} on an element may not prevent its non-html content from being loaded. It definitely doesn't prevent loading of background images; I've never experimented with upfront <img>.
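
The "one gulp or at intervals" question can be checked roughly from the logs. Below is a minimal sketch, assuming you have already pulled one visitor's requests out of the access log as (timestamp, path) pairs. The sample data, the function names and the one-second threshold are all assumptions for illustration, not anything taken from wilderness's logs.

```python
from datetime import datetime

# Hypothetical sample: (timestamp, path) pairs for one visitor.
# All values are made up for illustration.
SAMPLE_HITS = [
    ("14/Dec/2013:17:00:00", "/widgets/page1.html"),
    ("14/Dec/2013:17:00:00", "/images/thumb1.jpg"),
    ("14/Dec/2013:17:00:01", "/images/thumb2.jpg"),
    ("14/Dec/2013:17:00:01", "/images/thumb3.jpg"),
]

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")


def image_gaps(hits):
    """Return the gaps in seconds between consecutive image requests."""
    times = [
        datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        for stamp, path in hits
        if path.lower().endswith(IMAGE_SUFFIXES)
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def looks_like_one_gulp(hits, max_gap=1.0):
    """True if no image arrived more than max_gap seconds after the previous one.

    max_gap is an assumed threshold; logs with one-second resolution
    cannot distinguish anything finer than that.
    """
    return all(gap <= max_gap for gap in image_gaps(hits))
```

Calling `looks_like_one_gulp(SAMPLE_HITS)` on your own extracted data gives a quick yes/no. Longer gaps would point toward a robot or someone loading images by hand.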

keyplyr




msg:4630708
 8:14 pm on Dec 14, 2013 (gmt 0)

You can hide "block elements" in browsers supporting the web developer extension. Not sure whether this is a rendering adjustment or just a display feature.

wilderness




msg:4630739
 11:24 pm on Dec 14, 2013 (gmt 0)

lucy and keyplyr,
thanks for the replies.

The footer is not defined as a footer, however serves the same function.

<div class="container">

The visitor would certainly have to be aware of html when making an intentional effort to disregard loading the images from same "container".

Is your visitor using some ordinary human browser?


The browser shows Chrome:
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36"

Are the non-footer images getting loaded all in one gulp, the way you'd expect with a human?


Yes.

lucy24




msg:4630740
 11:29 pm on Dec 14, 2013 (gmt 0)

You can hide "block elements" in browsers supporting the web developer extension. Not sure whether this is a rendering adjustment or just a display feature.

Can you do it before the page has loaded in the first place? That seems to be what wilderness's selective widget-lover is doing.

some images are thumbnails pertinent to the page body content

So they're unique to the page-- not something that might already be in the browser's cache from earlier visits?

wilderness




msg:4630747
 11:59 pm on Dec 14, 2013 (gmt 0)

So they're unique to the page-- not something that might already be in the browser's cache from earlier visits?


I guess it's possible they're in a cache, lucy; however, these requests are 200's, not 304's. I just don't think it's likely that a cache is eliminating the entire container.
I see some of the handhelds doing strange things with images requests (anywhere from 5-minutes to multiple hours later) retrieving images that are linked from a previous page visit.
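
For anyone wanting to do the same 200-versus-304 check, here is a minimal sketch, assuming Apache "combined"-style log lines. The sample lines, the documentation-range IP 192.0.2.10 and the function name are hypothetical; the regular expression only covers the common layout and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Hypothetical log lines in Apache "combined" style; all values are made up.
SAMPLE_LOG = """\
192.0.2.10 - - [11/Dec/2013:10:00:00 +0000] "GET /widgets/a.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
192.0.2.10 - - [11/Dec/2013:10:00:00 +0000] "GET /images/thumb1.jpg HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.10 - - [11/Dec/2013:10:00:01 +0000] "GET /images/logo.gif HTTP/1.1" 304 - "-" "Mozilla/5.0"
"""

# Captures client IP, request path and status code.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*" (\d{3}) ')

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")


def image_status_counts(log_text, ip):
    """Count status codes for image requests made by a single IP."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        client, path, status = match.groups()
        if client == ip and path.lower().endswith(IMAGE_SUFFIXES):
            counts[status] += 1
    return counts
```

A visitor whose image requests are all 200's is fetching them fresh. A mix with 304's would suggest a browser cache revalidating copies it already holds.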

The IP Provider is Cogeco, and I guess it's possible that provider is caching images (wouldn't be unusual).

Seven images in the container are pretty standard on all pages, however when creating the pages I purposely even varied those images.

Other images in the container are unique and relevant to the page content (body). However, some pages may not be accompanied by related images in the container.

I'm still leaning towards somebody using a custom CSS, however if it didn't closely match the CSS from the site, then the page wouldn't display properly.

Don't believe this is some type of bot and/or automated software, however it's not a normal widget user either (despite their interest in widgets).

In any event, they aren't seeing any pages until they contact me or at least change IP's.

dstiles




msg:4630896
 6:33 pm on Dec 15, 2013 (gmt 0)

wilderness - You do not give any access details - UA, IP etc. Could it be a prefetcher or previewer? Something that grabs what it can in the probably unlikely case that its owner wants to view it later? It's possible it only downloads part of a page before getting "switched off" for some reason?

wilderness




msg:4630913
 8:52 pm on Dec 15, 2013 (gmt 0)

The browser shows Chrome (previous reply):
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36"

The IP Provider is Cogeco (previous reply)

24.141.193.zzz

FWIW, this visitor grabbed the same page twice on a specific search for that page, and based upon the topic of that directory, in Sept 2013.

Didn't visit at all during OCT & NOV.

One page on Dec 5th.

Another page on Dec 10th.

Four pages on Dec 11th.

Pages on Dec 5 & 10 include the container requests (previously explained).
The first two page requests on Dec 11 include the container requests (previously explained).
The second two pages (and every other page since) are absent container requests (previously explained).
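
The with/without-container comparison above can be sketched in code. This is only a rough illustration, assuming the visitor's requests have been reduced to (path, Referer path) pairs and that you keep a list of which images sit inside the container; every path and name below is hypothetical. It also relies on the Referer header being sent, which is not guaranteed.

```python
# Hypothetical data: (requested path, Referer path) pairs for one visitor.
SAMPLE_HITS = [
    ("/widgets/a.html", "-"),
    ("/images/org1.gif", "/widgets/a.html"),
    ("/images/thumb_a.jpg", "/widgets/a.html"),
    ("/widgets/b.html", "/widgets/a.html"),
    ("/images/body_b.jpg", "/widgets/b.html"),
]

# Hypothetical set of images that live inside <div class="container">.
CONTAINER_IMAGES = {
    "/images/org1.gif",
    "/images/thumb_a.jpg",
    "/images/thumb_b.jpg",
}


def pages_missing_container(hits, container_images):
    """List page requests that were not followed by any container image."""
    pages = [path for path, _ in hits if path.endswith(".html")]
    # Pages named as Referer on at least one container image request.
    with_container = {ref for path, ref in hits if path in container_images}
    return [page for page in pages if page not in with_container]
```

With the sample data, page `b.html` is reported because none of its image requests belong to the container set.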

dstiles,
The delay (anywhere from 5-minutes to multiple hours later) in grabbing some of the images (linked from thumbnails) could be a browser action.

The activities have increased dramatically (at least compared to the former 1-2 per day requests) the last three days.

lucy24




msg:4630934
 10:41 pm on Dec 15, 2013 (gmt 0)

I see some of the handhelds doing strange things with images requests (anywhere from 5-minutes to multiple hours later) retrieving images that are linked from a previous page visit.

I don't know if this applies to all handhelds, but the iPad version of Safari doesn't seem to have a "start" screen. This means that when you go into the browser it shows whatever page you were on last time-- and presumably it then re-loads any material that wasn't cached.

wilderness




msg:4630957
 12:53 am on Dec 16, 2013 (gmt 0)

Many thanks for the replies and suggestions of possibilities.

Please consider this matter closed.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved