MarkMarkov 08-11-2009, 03:46 AM Hello friends,
Is there anyone here that can explain how to write a program that goes out to the Internet, loads a particular web site, reads information off the screen and stores that information in a variable? I am a beginner, so I really need a down-to-earth explanation ;) This must be in VB6. Thank you!
the master 08-11-2009, 04:39 AM This is quite a simple one and there are many examples here already. You don't need to write any complicated code to capture the text from the screen. You can use the Inet control.
Add the Inet control from the list of components, then add it to your form. In the code window, call the Inet control's OpenURL() method. You pass it the URL of the page you want to read, and the HTML source code is returned as a string. Now you can parse out any information you need.
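A minimal sketch of the OpenURL() call, assuming the control on the form is named Inet1 and using a placeholder URL. The title-extraction part is only an illustration of "parsing out" a value, not a general HTML parser:

```vb
' Assumes a form with an Internet Transfer Control named Inet1
' (Project > Components > "Microsoft Internet Transfer Control 6.0").
Private Sub Form_Load()
    Dim strHtml As String
    Dim lngStart As Long
    Dim lngEnd As Long

    ' OpenURL blocks until the download finishes; icString asks for text.
    strHtml = Inet1.OpenURL("http://www.example.com/", icString)

    ' Illustrative parsing step: print whatever sits between the <title> tags.
    lngStart = InStr(1, strHtml, "<title>", vbTextCompare)
    lngEnd = InStr(1, strHtml, "</title>", vbTextCompare)
    If lngStart > 0 And lngEnd > lngStart Then
        ' 7 is the length of "<title>"
        Debug.Print Mid$(strHtml, lngStart + 7, lngEnd - lngStart - 7)
    End If
End Sub
```

The same InStr/Mid$ pattern works for any value that sits between two known pieces of text in the page source.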
RyanS2000 08-11-2009, 08:39 AM You can also use the Winsock control. With it, you will need to construct the HTTP request headers yourself and then parse whatever you need out of the incoming data.
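A rough sketch of the Winsock approach, assuming a control named Winsock1 on the form and a placeholder host. It sends an HTTP/1.0 request, which should normally keep the server from replying with a chunked body, and it does no error handling:

```vb
' Assumes a form with a Winsock control named Winsock1
' (Project > Components > "Microsoft Winsock Control 6.0").
Private mstrResponse As String

Private Sub Form_Load()
    Winsock1.RemoteHost = "www.example.com"   ' placeholder host
    Winsock1.RemotePort = 80
    Winsock1.Connect
End Sub

Private Sub Winsock1_Connect()
    ' Every header line ends in CRLF; an empty line ends the request.
    Winsock1.SendData "GET / HTTP/1.0" & vbCrLf & _
                      "Host: www.example.com" & vbCrLf & _
                      "Connection: close" & vbCrLf & vbCrLf
End Sub

Private Sub Winsock1_DataArrival(ByVal bytesTotal As Long)
    Dim strChunk As String
    ' The reply can arrive in several pieces, so keep appending.
    Winsock1.GetData strChunk, vbString
    mstrResponse = mstrResponse & strChunk
End Sub

Private Sub Winsock1_Close()
    Dim lngBody As Long
    Winsock1.Close
    ' The response headers and the body are separated by the first blank line.
    lngBody = InStr(mstrResponse, vbCrLf & vbCrLf)
    If lngBody > 0 Then Debug.Print Mid$(mstrResponse, lngBody + 4)
End Sub
```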
the master 08-11-2009, 11:08 AM There is the API method too, and even though both can be very useful (especially the Winsock method), they are not very easy for a beginner to use. The Inet control can do everything in one line of code, whereas the Winsock method requires you to know a bit about the HTTP protocol first: you have to construct the correct headers and then parse the response, which a lot of servers like to send chunked. You'll also need to know how to use Winsock, which can be confusing at first.
MarkMarkov: If you have the time and patience, then learning the Winsock method might be a good idea. It's a steep learning curve, but what you learn will be very useful in the future.
kassyopeia 08-11-2009, 05:25 PM For HTML4-compliant pages, the most convenient approach is usually MSXML. Using an HTTP request, it reads the page source directly into a DOM (document object model), from which any subset of the contents can then be retrieved without the need for any parsing.
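A sketch of the MSXML route, with a placeholder URL and a hypothetical choice of element ("td") to read. It is late-bound so it needs no project reference. One caveat worth assuming: the DOM is only filled in when the page parses as well-formed XML, so the sketch falls back to the raw source otherwise:

```vb
Private Sub ReadPage()
    Dim objHttp As Object
    Dim objNodes As Object
    Dim i As Long

    Set objHttp = CreateObject("MSXML2.XMLHTTP")
    ' Third argument False = synchronous: send returns when the page has arrived.
    objHttp.Open "GET", "http://www.example.com/page.xhtml", False
    objHttp.send

    If objHttp.Status = 200 Then
        If objHttp.responseXML.documentElement Is Nothing Then
            ' The page did not parse as XML; only the raw source is available.
            Debug.Print objHttp.responseText
        Else
            ' Read the text of every <td> element straight from the DOM.
            Set objNodes = objHttp.responseXML.getElementsByTagName("td")
            For i = 0 To objNodes.length - 1
                Debug.Print objNodes.Item(i).Text
            Next i
        End If
    End If
End Sub
```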
MarkMarkov 08-13-2009, 04:28 PM
the master, thank you, I actually got it to work using the HTML object library! Like this:
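Dim H As HTMLDocument   ' needs a reference to the Microsoft HTML Object Library
Dim s As String
' Web is a WebBrowser control on the form
Web.Navigate "http://www.myURLHERE.com"
' Wait until the page has finished loading
Do While Web.ReadyState <> READYSTATE_COMPLETE
    DoEvents
Loop
Set H = Web.Document
s = H.body.innerText   ' this gives me all the text on the site!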
***HOWEVER*** I realized what is very complex about my idea: for most of the sites I need to get information from, I have to log in first! I imagine this is very complicated. Maybe a 3rd party tool? Please advise if you know the answer! Thanks!
kassyopeia 08-13-2009, 05:29 PM Oh, yes, I should have mentioned that XMLHTTP also provides for sending authentication info along with the request. :cool:
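A sketch of what that looks like, with a hypothetical function name. Note the assumption: the user name and password passed to Open are used for HTTP authentication (the kind where a browser pops up its own login prompt), not for a login form inside the page:

```vb
Private Function FetchWithLogin(ByVal strUrl As String, _
                                ByVal strUser As String, _
                                ByVal strPassword As String) As String
    Dim objHttp As Object
    Set objHttp = CreateObject("MSXML2.XMLHTTP")

    ' The 4th and 5th arguments of Open are the user name and password.
    objHttp.Open "GET", strUrl, False, strUser, strPassword
    objHttp.send

    ' Returns an empty string if the server did not answer 200 OK.
    If objHttp.Status = 200 Then FetchWithLogin = objHttp.responseText
End Function
```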
MarkMarkov 08-14-2009, 04:59 AM I think this is slightly different. I need something that will go to a web site, find the username and password fields, enter the username and password and then, when the login is successful, read the information on the screen like I did in my successful test with the Microsoft HTML Object Library (Dim H As HTMLDocument, etc.).
The sites I need to gather information from don't have any human verification code windows (not sure what they're called).
Help!
kassyopeia 08-14-2009, 09:48 AM Is the URL the same for the login page and the one you're interested in/what happens if you try to access the page you're interested in without logging in?
MarkMarkov 08-15-2009, 01:00 PM kassyopeia, I am not sure I follow your last post. Obviously I wouldn't be able to get to the data without logging in. Say, for example, you want the program to log into PayPal and get your totals, then log into some other systems and get totals from there, then present it all on one screen.
I have learned how to read HTML pages, but I have no clue how to write a program that logs in first (it needs to find the username and password fields on the screen...).
kassyopeia 08-15-2009, 06:16 PM "it needs to find a username and password fields on the screen..."
No, it doesn't. At least not necessarily. There are many ways to supply information besides entering text into a field and pressing a submit button. An automated solution is far less limited than a human user in some respects: it doesn't need to rely on the graphical UI any more than it needs to press keys to enter text or move the mouse to move the cursor.
I apologize if the two questions above seemed arbitrary, but I assure you that the answers could be quite pertinent.
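To illustrate one such way: a login form usually just sends its field values to the server in an HTTP POST, and a program can send that request itself. This is only a sketch under assumptions. The URL, the field names and the values are all hypothetical and would have to be taken from the login form's HTML source, and real sites may require extra hidden fields or HTTPS:

```vb
Private Function PostLogin() As String
    Dim objHttp As Object
    Dim strBody As String

    ' Hypothetical field names and values; "@" must be URL-encoded as %40.
    strBody = "loginName=" & "me%40somewhere.nl" & _
              "&password=" & "secret"

    Set objHttp = CreateObject("MSXML2.XMLHTTP")
    ' Hypothetical address: use the URL from the form's action attribute.
    objHttp.Open "POST", "http://www.example.com/login", False
    ' Tells the server the body is ordinary form data.
    objHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    objHttp.send strBody

    ' Whatever page the server returns after the login attempt.
    PostLogin = objHttp.responseText
End Function
```

Whether later requests stay logged in depends on how the site tracks the session (typically a cookie), so that part would need testing against the actual site.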
PeetSoft 08-16-2009, 11:56 PM I used a browser control and the following for logging in to a page:
Private Sub PSNBrowser_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' Fires for every page (and frame) that finishes loading, so only
    ' fill in the form when the login field is actually present.
    With PSNBrowser.Document.All
        If Not .Item("loginName") Is Nothing Then
            .Item("loginName").Value = "me@somewhere.nl"
            .Item("password").Value = "12345678980"
            .Item("loginButton").Click
        End If
    End With
End Sub
the master 08-20-2009, 08:51 AM If it's actually PayPal that you want to log in to, then they offer an API which allows programs to fetch account information, transaction history, etc. It saves your program from having to go through all of the HTML code to get the values it needs. Other similar sites should offer some kind of API too.