Login To Website Using HTMLAgilityPack
Solution 1:
Is there any way for this to be done?
Not directly, at least not with what the HTML Agility Pack (HAP) library provides.
The HAP is great for getting a single page and parsing it, but it is not designed for continued interaction with a site. It lacks cookie management, JavaScript execution, and more.
To log in you will probably need to send an HTTP POST to the server containing the login data; the HAP can't help with that.
You will need to use a class like WebRequest to make the POST. I suggest using Fiddler to see what the request should look like and constructing yours accordingly, though that may be just the first step.
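As a rough sketch of such a POST, assuming the site accepts a plain form-encoded login: the class name FormLogin, the helper names, and the field names below are hypothetical, and the real URL and fields have to be taken from what Fiddler shows. WebRequest.Create is marked obsolete on recent .NET versions, where HttpClient is the usual replacement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

public static class FormLogin
{
    // Percent-encode name/value pairs and join them as a form body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // POST the body and return the response text. Cookies set by the server
    // end up in the supplied container so later requests can reuse them.
    public static string Post(string url, string body, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] payload = Encoding.UTF8.GetBytes(body);
        request.ContentLength = payload.Length;
        using (Stream stream = request.GetRequestStream())
            stream.Write(payload, 0, payload.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```

Passing the same CookieContainer to every later request is what keeps the session alive; without it each request looks like a new, anonymous visitor.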
You may want to investigate web automation tools such as Selenium or WatiN instead.
Solution 2:
You need to observe the POST request via Fiddler and see how it's structured. For instance:
{"userName":"you","password":"pwd"}
A site usually recognizes that you are logged in because your requests carry the cookie it issued.
By default, HttpClient sends the cookies it received from a domain with each subsequent request to that domain, until you dispose of that HttpClient instance.
1) Create a cookie container and assign it to the HttpClientHandler used by your HttpClient instance.
2) Use HttpClient to make the login POST request.
3) Use HttpClient to make the data GET request.
4) Read the HTML string from the response.
5) Use HtmlAgilityPack's HtmlDocument to load the document from the HTML string rather than from the web (as most examples do).
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json; // PostAsJsonAsync on .NET 5+; older frameworks get it from the Microsoft.AspNet.WebApi.Client package instead
using HtmlAgilityPack;

string baseUrl = "https://www.yourwebsite.com";
string loginUrl = "/Account/LogOn";
string dataUrl = "/Data";
var uri = new Uri(baseUrl);

// The handler stores cookies from the login response and sends them on later requests.
var cookies = new CookieContainer();
var handler = new HttpClientHandler { CookieContainer = cookies };

using (var client = new HttpClient(handler))
{
    client.BaseAddress = uri;

    // Log in; the field names must match what the site expects.
    var request = new { userName = "you", password = "pwd" };
    var resLogin = client.PostAsJsonAsync(loginUrl, request).Result;
    if (resLogin.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Could not login -> StatusCode = " + resLogin.StatusCode);

    // See what cookies are returned
    foreach (Cookie cookie in cookies.GetCookies(uri).Cast<Cookie>())
        Console.WriteLine(cookie.Name + ": " + cookie.Value);

    // Fetch the protected page with the same client, and therefore the same cookies.
    var resData = client.GetAsync(dataUrl).Result;
    if (resData.StatusCode != HttpStatusCode.OK)
        Console.WriteLine("Could not get data html -> StatusCode = " + resData.StatusCode);

    // Parse the HTML from the string instead of letting the HAP download it.
    var html = resData.Content.ReadAsStringAsync().Result;
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
}
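Once the HTML string is loaded, the HAP takes over for the parsing. A minimal sketch of that last step, assuming the goal is to list the links on the page: the class LinkExtractor, the method GetLinks, and the XPath expression are illustrative only and would be replaced by whatever the data page actually contains.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Return the href value of every anchor element in the given HTML string.
    public static List<string> GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        return links;
    }
}
```

The null check matters in practice: if the login silently failed, the page you get back is often the login form again and the expected nodes are simply absent.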
Solution 3:
I don't know if you're using the WPF WebBrowser control, but if you are, you can use something along the lines of
doc.GetElementById("submit_signin").Click();
That's what works for me.