Forum Moderators: phranque

Screen Scraping

Is it possible to read text via screen scraping?

IanTurner

10:57 am on Dec 17, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you do a screen scrape of the currently displayed screen, is it then possible to extract the text from the resulting file?

Longhaired Genius

11:40 am on Dec 17, 2002 (gmt 0)

10+ Year Member



It probably is, with text recognition software. Maybe you've got some if you have a recent flatbed scanner. I've never done it, so that's all I know.

Staffa

1:56 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ian, what is a screen scrape?

Thanks

veritysystems

1:57 pm on Dec 17, 2002 (gmt 0)

10+ Year Member



Do you mean 'print screen'? If so it is saved in a graphical format.

IanTurner

2:02 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Screen scraping is extracting information off of the currently displayed screen.

veritysystems

2:05 pm on Dec 17, 2002 (gmt 0)

10+ Year Member



OK - well, if you press 'print scrn', Windows will copy a screen grab to the clipboard (nothing visible will happen on the screen).

Open a graphics editor and hit paste. Hey presto - the screen will be recreated as an image. As to extracting the data, I'm not sure.

korkus2000

2:06 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How are you scraping the screen?

veritysystems

2:06 pm on Dec 17, 2002 (gmt 0)

10+ Year Member



You could create a PDF from the image... upload it to the web and wait for Google to turn it into text HTML for you!

IanTurner

2:23 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



That is a little outside the timescales needed here; we are looking at performing some data capture work within seconds rather than days.

korkus2000

2:30 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ian, how are you scraping? Are you changing it into a graphic? If so, what format?

john316

2:36 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are you scraping web pages or something else on your screen?

IanTurner

3:04 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



At the moment we are looking at scraping web pages only and converting to a bitmap (this being the easiest to do).

veritysystems

3:12 pm on Dec 17, 2002 (gmt 0)

10+ Year Member



OK - so does anyone know how to turn a bitmap into text?

Is this possible?

korkus2000

3:25 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You will have to find OCR software that you can script to run on the fly.
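
As a rough sketch of what "scripting OCR to run on the fly" could look like: the snippet below shells out to a command-line OCR engine and reads the recognised text back. It assumes the Tesseract command-line tool is installed and on the PATH, which nobody in this thread has named; the file name `screen.bmp` and both function names are made up for illustration.

```python
import shutil
import subprocess
import sys


def build_ocr_command(image_path, language="eng"):
    """Build the command line for the Tesseract CLI (an assumed OCR engine)."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognised text instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", language]


def ocr_image(image_path, language="eng", timeout=30):
    """Run OCR on one image file and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_ocr_command(image_path, language),
        capture_output=True,  # collect the text tesseract prints
        text=True,            # decode the output as str
        timeout=timeout,      # give up if recognition takes too long
        check=True,           # raise if tesseract exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Usage: python ocr_sketch.py screen.bmp   (hypothetical file names)
    if len(sys.argv) > 1:
        print(ocr_image(sys.argv[1]))
```

Whether this meets a "within seconds" target depends on the OCR engine and the image size, so it would need timing on real captures.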

Staffa

6:10 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the explanation, Ian.

"Screen scraping is extracting information off of the currently displayed screen."

However, I'm still confused.
Let's say it's a page like this one and you want to catch all the text: you could just save the page, or select all, copy and paste into Notepad.

I guess I'm missing something here :o
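
For a plain HTML page, the "save the page and keep the text" route can also be scripted rather than done by hand. The following is a minimal sketch using only Python's standard library; the class name `TextExtractor` and the function `page_text` are illustrative names, and it assumes the saved page is ordinary HTML with no content generated by plugins.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping scripts and styles."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    sample = "<html><body><h1>Screen Scraping</h1><p>Is it possible?</p></body></html>"
    print(page_text(sample))
```

This only sees what is in the HTML source itself, so anything drawn by an applet or a Flash movie is out of its reach.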

IanTurner

8:53 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Yes you can, Staffa. However, if you save a page like this and it has a Java applet or Flash in it, all you get is a link to the applet or the Flash movie. You don't get to save the current information, because the next time you open the page you will get the latest information pulled by the applet or the Flash movie.
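
To illustrate the point, here is a small sketch that lists what a saved HTML file actually contains for embedded content: only the tag and the file it refers to, not the data being displayed. It uses Python's standard library; `EmbedFinder`, `embedded_content` and the sample file names `Ticker.class` and `chart.swf` are hypothetical.

```python
from html.parser import HTMLParser


class EmbedFinder(HTMLParser):
    """Record applet/embed/object tags and the resource each one points to."""

    # Tag name -> attribute that usually names the external resource.
    EMBED_TAGS = {"applet": "code", "embed": "src", "object": "data"}

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in self.EMBED_TAGS:
            attributes = dict(attrs)
            self.found.append((tag, attributes.get(self.EMBED_TAGS[tag])))


def embedded_content(html):
    """Return (tag, resource) pairs for plugin content referenced by the page."""
    finder = EmbedFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    saved_page = (
        '<p>Latest prices:</p>'
        '<applet code="Ticker.class" width="200" height="50"></applet>'
        '<embed src="chart.swf">'
    )
    for tag, resource in embedded_content(saved_page):
        print(tag, resource)
```

The saved file holds the references only, which is why capturing what is on screen at that moment needs a screen grab rather than a saved page.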

Staffa

9:26 pm on Dec 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thank you Ian, now I understand what you mean :)