ASP.NET Data Scraping
Something that has been annoying me for a while on a hobby website is maintaining a points table of F1 drivers by hand. Recently I thought there had to be a way to pull the drivers' standings from the official website; there was.
Using the WebClient class, a few String functions and some Regex, I was able to pull down the drivers' standings from the official website… nice! Here is the code, very easy to do! Enjoy ;)
Default.aspx:
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:Label ID="OfficialF1DriverStandings" runat="server" Text="Label"></asp:Label>
    </form>
</body>
</html>
Default.aspx.cs:
using System;
using System.Net;
using System.Text.RegularExpressions;
public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Download the raw HTML of the standings page.
        string html;
        using (WebClient wc = new WebClient())
        {
            html = wc.DownloadString("http://www.formula1.com/results/driver/2011/");
        }
        // Find the standings container and the first closing </div> after it.
        int startDriversPointsIndex = html.IndexOf("<div class=\"contentContainer\">");
        int endDriversPointsIndex = startDriversPointsIndex < 0 ? -1 : html.IndexOf("</div>", startDriversPointsIndex);
        if (endDriversPointsIndex < 0)
        {
            // Markers not found (e.g. the page layout changed): show nothing rather than throw.
            OfficialF1DriverStandings.Text = "";
            return;
        }
        int lengthOfDriverPointsContent = endDriversPointsIndex - startDriversPointsIndex;
        string officialF1DriverStandingsWithLinks = html.Substring(startDriversPointsIndex, lengthOfDriverPointsContent);
        // Strip the anchor tags but keep the text inside them.
        string officialF1DriverStandings = Regex.Replace(officialF1DriverStandingsWithLinks, @"<a\s+href[^>]+>|</a>", "", RegexOptions.IgnoreCase);
        OfficialF1DriverStandings.Text = officialF1DriverStandings;
    }
}
Result: the label shows the drivers' standings markup from the official site, with the links stripped out.
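If you want to try the string slicing and the Regex without hitting the website, the two steps can be pulled out of Page_Load into a small helper. This is only a rough sketch: the StandingsExtractor class and its method names are made up for illustration, and the sample HTML in the usage note is invented, not real data from the official site.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original page code.
public static class StandingsExtractor
{
    // Returns the text from the first startMarker up to (but not including)
    // the next endMarker, or null if either marker is missing.
    public static string ExtractBetween(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;

        int end = html.IndexOf(endMarker, start + startMarker.Length, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return null;

        return html.Substring(start, end - start);
    }

    // Removes opening <a href=...> tags and closing </a> tags,
    // leaving the link text in place.
    public static string StripLinks(string fragment)
    {
        return Regex.Replace(fragment, @"<a\s+href[^>]+>|</a>", "", RegexOptions.IgnoreCase);
    }
}
```

For example, calling ExtractBetween on the made-up string `<p>x</p><div class="contentContainer"><a href="/d/1">Driver</a> 10</div>` with the same two markers as above, then passing the result to StripLinks, gives `<div class="contentContainer">Driver 10`. As in the page code, the opening div tag is kept and the closing one is left out.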

alanfeekery posted this
