Problem Parsing (not getting all html)

demosthenes705
08-13-2005, 08:19 AM
OK, I am having a problem running my script because it does not get all the HTML source code from the site.

What it does get is this (it's only part of the source):



<html>
<head>
<title>Pro Football - Scoreboards</title>

<script language="JavaScript1.1" src="http://ar.atwola.com/file/adsWrapper.js"></script>
<script language="JavaScript1.1">
<!--
adSetExt('aoladp');
adSetTarget('_blank');
//-->
</script>

<script language="javascript" src="http://sports.channel.aol.com/chromejs.adp?top=nfl&recursion=1&path=nfl.scoreboards"></script>
<link rel="stylesheet" type="text/css" href="http://i.cnn.net/si/.element/ssi/css/1.0/common.css">
<link rel="stylesheet" type="text/css" href="http://i.cnn.net/si/.element/ssi/css/1.0/story.css">
<link rel="stylesheet" type="text/css" href="http://i.cnn.net/si/.element/ssi/css/1.0/sd.css">
<link rel="stylesheet" href="http://i.cnn.net/si/.includes/stylesheets/nfl2.css">

<!-- START AOL SCRIPT CALL -->
<script language="javascript">
if (window.aolCSS) {
document.write(aolCSS);
}
if (window.aolJS) {
document.write(aolJS);
}
</script>
<!-- END AOL SCRIPT CALL -->

<!-- START AOL HAT INSERT -->
<script src="/includes/aol_hat.js" type="text/javascript"></script>
<!-- END AOL HAT INSERT -->

</head>
<body>

<!-- START AOL HAT INSERT -->
<div style="width=728;"><script language="javascript">printHat();</script></div>
<!-- END AOL HAT INSERT -->

<!-- START AOL AD CALL -->
<script language="javascript">
if (window.aolMN) {
adCall = '<script language="JavaScript1.1">\n';
adCall += '<!--\n';
adCall += 'adSetType(\'J\');\n';
adCall += 'htmlAdWH(aolMN, \'728\', \'90\');\n';
adCall += 'adSetType(\'\');\n';
adCall += '//-->\n';
adCall += '<\/script>\n';

document.write(adCall);
}
</script>
<!-- END AOL AD CALL -->

<!-- START OMNITURE INSERT -->
<script language="javascript">
<!-- //
// SiteCatalyst code version: G.7.
// Copyright 1997-2004 Omniture, Inc. More info available at
// http://www.omniture.com

//var s_account="devaolsports"
var s_account="aolsports,aolsvc"
var kids=""
var teens=""
var yteens=""
var mteens=""
var vlpc=""
var s_pfxID="spr"
var s_pageName=s_pfxID + " : " + document.title;
var s_server=""
var s_channel="us.sports"
var s_pageType=""
var s_prop3=""
var s_prop4=""
var s_prop5=""
var s_prop6=""
var s_prop7=""
var s_prop8=""
var s_prop9=""
var s_prop10=""
var s_prop11=""
var s_prop12=""
var s_prop13=""
var s_prop14=""
var s_prop15=""
var s_prop16=""
var s_prop17=""
var s_prop18=""
var s_prop19=""
var s_prop20=""
var s_prop21=""
var s_prop22=""
var s_prop23=""
var s_prop24=""
var s_prop25=""
/* E-commerce Variables */
var s_campaign=""
var s_state=""
var s_zip=""
var s_events=""
var s_products=""
var s_purchaseID=""
var s_prop1 = setProp1(s_pfxID, document.URL)
var s_prop2 = setProp2(s_pfxID, document.URL)
// -->
</script>

<script language="javascript" src="/includes/us_1.0_sports_1.0.js"></script>
<!-- END OMNITURE INSERT -->

<!-- START AOL HEADER -->
<script language="javascript">
if


And the code I am using to get to the site is this:


Option Explicit
Dim text
Dim textparse1 As String
Dim textparse2 As String
Dim getscores As String


Private Sub Command1_Click()
Status = "Connecting to Score Score."
If Combo1 = "Pre Week 1" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/1/0.html")
...


The thing is, I think it has something to do with the way I have the text variable dimmed. Right now that data is being displayed in a text box. Tell me if I need to change it.

shawnrgr
08-13-2005, 10:56 AM
Because you're sending the data to a textbox. The textbox can only hold up to 64k (see http://www.xtremevbtalk.com/showthread.php?t=227237). Try:

Dim strHtml as String
strHtml = Inet1.OpenURL("http://www.google.com")
Or just save it to a file.
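
For the file route, a minimal sketch might look like the following. It assumes the form already has the Inet control named Inet1; the SaveHtml name and the path passed in are hypothetical, picked only for illustration.

```vb
' Sketch only: download a page and dump it to a file for inspection.
' Assumes an Inet control named Inet1 on the form. SaveHtml and strPath
' are hypothetical names, not part of the original project.
Private Sub SaveHtml(ByVal strUrl As String, ByVal strPath As String)
    Dim strHtml As String
    Dim intFile As Integer

    ' OpenURL returns the document as a string by default
    strHtml = Inet1.OpenURL(strUrl)

    intFile = FreeFile                      ' next unused file number
    Open strPath For Output As #intFile
    Print #intFile, strHtml
    Close #intFile

    ' Shows how much was actually received, independent of any control
    Debug.Print "Saved " & Len(strHtml) & " characters to " & strPath
End Sub
```

Calling something like SaveHtml "http://www.google.com", "C:\page.html" and opening the file in Notepad shows what was downloaded without a textbox in the way.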

demosthenes705
08-13-2005, 12:31 PM
Well, that is just for temporary use. I need to see why it isn't getting all the HTML.

Edit:

The page in its entirety is not even 10 KB.

shawnrgr
08-13-2005, 12:39 PM
Right, you can't see all the HTML because the textbox is cutting it off. All the HTML is there, but the textbox can't hold it all.

LaVolpe
08-13-2005, 12:43 PM
^^ A temporary solution to allow you to debug more fully would possibly be to use a RichTextBox instead of a text box.

shawnrgr
08-13-2005, 12:45 PM
True, didn't even think of that. I still myself favor saving to a text file on my desktop rather than adding controls that clutter my workspace. But either will work fine.

demosthenes705
08-13-2005, 12:46 PM
That still shows the same amount. Is there a way that the site can stop my program from getting all the HTML?

LaVolpe
08-13-2005, 12:49 PM
How are you parsing? From the text box? If so, use Shawn's suggestion & save the html to a file & parse it line by line which you can easily step thru, including the ability to terminate the parsing when you get what you need.

P.S. I believe the answer to your last question is yes. Sites can have the capability to detect leechers. Leeching obviously has a negative effect on their total bandwidth, and apps can literally hit those sites thousands of times a minute, making things worse for them.
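
A rough sketch of the line-by-line idea, assuming the HTML was already saved to disk. FindLineContaining and both of its parameters are hypothetical names. One caveat: Line Input splits on carriage returns, so a page that uses bare line feeds may come back as one long line.

```vb
' Sketch only: scan a saved HTML file one line at a time and stop at the
' first line that contains the search text. Returns "" if nothing matches.
' FindLineContaining is a hypothetical helper name.
Private Function FindLineContaining(ByVal strPath As String, _
                                    ByVal strFind As String) As String
    Dim intFile As Integer
    Dim strLine As String

    intFile = FreeFile
    Open strPath For Input As #intFile

    Do While Not EOF(intFile)
        Line Input #intFile, strLine
        If InStr(1, strLine, strFind, vbTextCompare) > 0 Then
            FindLineContaining = strLine
            Exit Do                         ' stop as soon as we have a hit
        End If
    Loop

    Close #intFile
End Function
```

Setting a breakpoint on the Line Input statement lets you step through the file and watch each line as it is read.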

shawnrgr
08-13-2005, 12:55 PM
What web page are you trying to grab? Let me see if I get the same result

demosthenes705
08-13-2005, 01:05 PM
http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/1/0.html

That's the site.

shawnrgr
08-13-2005, 01:11 PM
Worked fine for me, and it all fit in the textbox:

Private Sub Command1_Click()
Text1.Text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/1/0.html")
End Sub


Show us your code so we can find the problem.

demosthenes705
08-13-2005, 01:34 PM
Option Explicit
Dim text
Dim iStart As String
Dim iEnd As String
Dim sText As String


Private Sub Command1_Click()
Status = "Connecting to Score Score."
If Combo1 = "Pre Week 1" Then
text = Inet1.OpenURL("http://www.msnbc.com/modules/sports/scoreboard2/score_board.asp?s=nfl&date=8/12/2005")
Status = "Connecting to Score Site."
ElseIf Combo1 = "Pre Week 2" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/2/0.html")
ElseIf Combo1 = "Pre Week 3" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/3/0.html")
ElseIf Combo1 = "Pre Week 4" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/4/0.html")
ElseIf Combo1 = "Pre Week 5" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/5/0.html")
ElseIf Combo1 = "Week 1" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/1/0.html")
ElseIf Combo1 = "Week 2" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/2/0.html")
ElseIf Combo1 = "Week 3" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/3/0.html")
ElseIf Combo1 = "Week 4" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/4/0.html")
ElseIf Combo1 = "Week 5" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/5/0.html")
ElseIf Combo1 = "Week 6" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/6/0.html")
ElseIf Combo1 = "Week 7" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/7/0.html")
ElseIf Combo1 = "Week 8" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/8/0.html")
ElseIf Combo1 = "Week 9" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/9/0.html")
ElseIf Combo1 = "Week 10" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/10/0.html")
ElseIf Combo1 = "Week 11" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/11/0.html")
ElseIf Combo1 = "Week 12" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/12/0.html")
ElseIf Combo1 = "Week 13" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/13/0.html")
ElseIf Combo1 = "Week 14" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/14/0.html")
ElseIf Combo1 = "Week 15" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/15/0.html")
ElseIf Combo1 = "Week 16" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/16/0.html")
ElseIf Combo1 = "Week 17" Then
text = Inet1.OpenURL("http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/17/0.html")
Else
MsgBox "No week was chosen.", vbInformation, "Please hit Ok and Chose a week"
End
End If

iStart = InStr(1, Text1.text, "<td noWrap=""true"" class=""ScoreMData1"" align=""left"">", vbTextCompare) + 58
iEnd = InStr(iStart, Text1.text, "<span class=""ScoreSmData1"">&nbsp;", vbTextCompare)
'sText = Mid$(Text1.text, iStart, iEnd - iStart)

Text1.text = text

End Sub

The problem is that when I ran what you gave me, it still didn't get the entire page. It just showed me what I have been seeing.

Also, if you want to try it:

http://www.snpgta.afraid.org/Score_checker.exe

The only thing in that which isn't in the code above is the cmd 2 button, and that contains only the code you gave me.

shawnrgr
08-13-2005, 01:48 PM
Well, there are a number of things wrong here. First of all, you don't need to Dim your textbox control. Second, your iStart and iEnd should be Long, not String. Third, you can't use the textbox with the Inet control like we said above, because it's not always guaranteed that the HTML will fit. Dim a new string variable like "Dim strHtml As String" and place the HTML in there. Also, the InStr() searches for iStart and iEnd are both coming up 0, which means the text you're looking for has not been found.
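
To illustrate the Long and zero-check points, here is a minimal sketch. TextBetween is a hypothetical helper name, and the markers you pass in would have to match the page's HTML exactly.

```vb
' Sketch only: return the text between two markers, or "" when either
' marker is missing. TextBetween is a hypothetical helper name.
Private Function TextBetween(ByVal strSource As String, _
                             ByVal strOpen As String, _
                             ByVal strClose As String) As String
    Dim lngStart As Long                    ' InStr returns a position, so use Long
    Dim lngEnd As Long

    lngStart = InStr(1, strSource, strOpen, vbTextCompare)
    If lngStart = 0 Then Exit Function      ' opening marker not found
    lngStart = lngStart + Len(strOpen)      ' skip past the marker itself

    lngEnd = InStr(lngStart, strSource, strClose, vbTextCompare)
    If lngEnd = 0 Then Exit Function        ' closing marker not found

    TextBetween = Mid$(strSource, lngStart, lngEnd - lngStart)
End Function
```

Checking for 0 before calling Mid$ matters because a failed search would otherwise hand Mid$ a bad start position or a negative length.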

Are you trying to get scores? What exactly on the page are you trying to extract? I work with HTML files all the time, so I could help you get what you need.

demosthenes705
08-13-2005, 01:54 PM
Well, I am trying to get the scores for now. Then I'll move on to this: when they double-click one of the team names, it will bring up more stats. But that is for version 2.0.

LaVolpe
08-13-2005, 02:01 PM
Kinda a quirk w/me when I see the list of IFs you have. You can replace them all with just a few lines.... I am assuming your 1st combo item is Pre-Week 1 & the last one is Week 17

Dim strURL As String

Status = "Connecting to Score Site."

Select Case Combo1.ListIndex ' 0 is the 1st item, -1 means nothing is selected
    Case -1
        MsgBox "First Select an Item from the Listing", vbOKOnly
        Exit Sub ' or whatever
    Case 0 ' first item - pre-season Week 1
        strURL = "http://www.msnbc.com/modules/sports/scoreboard2/score_board.asp?s=nfl&date=8/12/2005"
    Case Is < 5 ' remaining pre-season items 1-4
        strURL = "http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/preseason/" & (Combo1.ListIndex + 1) & "/0.html"
    Case Else ' regular season are items 5-end of list
        strURL = "http://aolsvc.cnnsi.sports.aol.com/football/nfl/scoreboards/2005/regularseason/" & (Combo1.ListIndex - 4) & "/0.html"
End Select
Then, using Shawn's suggestion, you can simply call:
strHtml = Inet1.OpenURL(strURL)

shawnrgr
08-13-2005, 02:06 PM
You need to use the same website for everything, or else you're going to have to parse the HTML differently for each website. Here is an example to get the scores section of the HTML, and ONLY the scores section, for preseason week 1 from nfl.com:


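Private Sub Command1_Click()
    Dim strHtml As String
    Dim strPage As String
    Dim lngPos As Long

    strPage = "http://www.nfl.com/scores/2005/preseason/week1"
    strHtml = Inet1.OpenURL(strPage)

    ' Drop everything before the scores heading
    lngPos = InStr(strHtml, "Preseason Week 1")
    If lngPos > 0 Then strHtml = Mid(strHtml, lngPos)

    ' Drop everything from the key/legend onward
    lngPos = InStr(strHtml, "<b>Key:</b>")
    If lngPos > 0 Then strHtml = Left(strHtml, lngPos - 1)

    Text1.text = strHtml
End Sub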

demosthenes705
08-13-2005, 06:57 PM
Oh, that other site was a mistake.

OK, I got it to work, kinda. This is my plan:

I get the data and put it into a RichTextBox. Then I am wondering, how do I parse that data and add it to a list box?

jlm
08-14-2005, 03:46 AM
multiline=false?

demosthenes705
08-14-2005, 03:10 PM
OK, here is my current problem.

I have the following code. I just don't know how to put it to use.

Sub GetDefinition()
'We start by going down the rows for the parsing.
On Error GoTo Err
Search = "Final</td></tr><tr><td class="
spot = InStr(RichTextBox1, """>")
done = Mid(RichTextBox1, spot - 2, spot)
spot2 = InStr(done, "</a> <span id=""")
done = Mid(done, 1, spot2 - 1)
'Replace Begin
done = pReplace(Trim(done), "CORE MEANING:", "CORE MEANING: ")
done = pReplace(Trim(done), ">", "")
done = Trim(done)
'Parser Finish 'Replace Finish
'Dump into Listbox for Viewing.
List1 = done
Exit Sub
Err:
Alternative
End Sub

Sub Alternative()
MsgBox "Please e-mail me and tell me what you were doing.", vbInformation, "Some Error happened."
Status = "No Idea what Error Happened!!"
End Sub
OK, what that code does is search the text box. What I need help understanding is what to do after I search the text box for what I need. Basically I need to search for the team ID, but the problem is that the team ID is different for each team. Or at least I need to search for something that gets me close to the team and then start looking for the team names, out of this code:
<a href="/nfl/clubhouse?teamId=11">Indianapolis</a> <span id="250806011-atr"> (0-1, 0-1 away) </span></td></tr><tr><td class="teamLine"><a href="/nfl/clubhouse?teamId=1">Atlanta</a> <span id="250806011-htr"> (1-0, 1-0 home) </span></td></tr></table></div><div class="tvNetwork_pre" id="250806011-tv_pre" style="display:none;text-align:center;">

Basically I need to search for the team names for now. And the example that was shown didn't work for me.

shawnrgr
08-14-2005, 03:52 PM
Here is a start. This was only based off of the last html string you gave in the previous post, nothing else. But it should be a good example of one way to strip html.

Private Sub Command1_Click()
    Dim iCnt As Long
    Dim bRec As Boolean         ' True while we are outside a tag
    Dim sChar As String
    Dim StripHtml As String
    Dim arrTeams() As String

    For iCnt = 1 To Len(Text1.Text)
        DoEvents
        sChar = Mid(Text1.Text, iCnt, 1)
        If sChar = "<" Then
            bRec = False        ' entering a tag: stop recording
        ElseIf sChar = ">" Then
            bRec = True         ' leaving a tag: start recording
            ' Mark the boundary with "|" when text (not another tag) follows
            If Mid(Text1.Text, iCnt + 1, 1) <> "<" Then
                StripHtml = StripHtml & "|"
            End If
        Else
            If bRec = True Then StripHtml = StripHtml & sChar
        End If
    Next iCnt

    ' Tidy the separators, then drop the leading and trailing "|"
    StripHtml = Replace(StripHtml, " |", "|")
    StripHtml = Replace(StripHtml, "| ", "|")
    StripHtml = Replace(StripHtml, "||", "|")
    StripHtml = Mid(StripHtml, 2, Len(StripHtml) - 2)

    arrTeams = Split(StripHtml, "|")

    ' Show one piece of text per line in Text2
    For iCnt = 0 To UBound(arrTeams)
        Text2.Text = Text2.Text & arrTeams(iCnt) & vbNewLine
    Next iCnt
End Sub
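
If only the team names are needed, another possible approach is to use the "teamId=" text in each link as an anchor, since it appears for every team even though the number after it changes. This is only a sketch based on the HTML snippet posted above. ListTeams is a hypothetical name, and it assumes a ListBox named List1 on the form.

```vb
' Sketch only: add the text of every link containing "teamId=" to List1.
' Assumes links shaped like <a href="/nfl/clubhouse?teamId=11">Name</a>.
' ListTeams is a hypothetical name.
Private Sub ListTeams(ByVal strHtml As String)
    Const MARKER As String = "teamId="
    Dim lngPos As Long
    Dim lngOpen As Long
    Dim lngClose As Long

    List1.Clear
    lngPos = InStr(1, strHtml, MARKER, vbTextCompare)

    Do While lngPos > 0
        lngOpen = InStr(lngPos, strHtml, ">")               ' end of the <a ...> tag
        If lngOpen = 0 Then Exit Do
        lngClose = InStr(lngOpen, strHtml, "</a>", vbTextCompare)
        If lngClose = 0 Then Exit Do

        ' Everything between ">" and "</a>" is the team name
        List1.AddItem Mid$(strHtml, lngOpen + 1, lngClose - lngOpen - 1)

        ' Look for the next team link after this one
        lngPos = InStr(lngClose, strHtml, MARKER, vbTextCompare)
    Loop
End Sub
```

With the snippet from the previous post, this should add Indianapolis and Atlanta to the list. It would need adjusting if the site changes how it builds those links.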
