Scraping Off VNSGU Exam Results

on August 1, 2018

During my schooldays, I had a classmate. Let’s call him/her X. We always used to compete fiercely for the 1st rank in school exams.

Fast forward past school: I’m now in college. Three years zoomed by, and I hadn’t heard anything about X.

As humans, we tend to be interested in stories of our old rivals. I’m no different. I wanted to know how X has been doing in college.

In India, exam results are often seen as an indication of one’s brilliance and success. Though I thoroughly disagree with that notion, I started to wonder: is there any way I could get X’s results? It could be a good weekend project.


From mutual friends, I got to know that X was pursuing a B.Sc. at VNSGU.

I opened their result website. http://vnsgu.ac.in/vnsguexam/

VNSGU Website Homepage [1]

 

Nothing fancy. A pretty old website built on PHP, served by an Apache server. Next, their result page.

VNSGU Result List [2]

 

Wow! No authentication here: the results of all candidates were publicly available. Most result websites ask for a birth date as well.

VNSGU Candidate Result [3]

You just enter a seat number and you’ll get the scorecard in a jiffy. Cool!

Remember my goal? I wanted to see X’s result. However, I didn’t have X’s seat number.

Solution: brute-forcing.

Iterate through the results for all seat numbers one by one and match each record against X’s name. When there’s a match, terminate the loop.
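
The idea above can be sketched as a loop that stops at the first match. This is only a minimal sketch under assumptions, not the script used in this post: `findSeatByName` and the `$fetchRecord` callback are hypothetical names, and the callback stands in for whatever fetches one seat number’s record as text (returning null when a seat number has no result). The sample records are made up.

```php
<?php

/**
 * Try seat numbers 1..$maxSeat and stop at the first record containing $name.
 * $fetchRecord is a hypothetical callback: it takes a seat number and returns
 * the record text for it, or null when there is no result for that seat.
 */
function findSeatByName(callable $fetchRecord, string $name, int $maxSeat): ?int
{
    for ($seat = 1; $seat <= $maxSeat; $seat++) {
        $record = $fetchRecord($seat);
        if ($record === null) {
            continue; // No result for this seat number, try the next one
        }
        // Case-insensitive substring match against the candidate's name
        if (stripos($record, $name) !== false) {
            return $seat; // Match found, terminate the loop
        }
    }
    return null; // Went through every seat number without a match
}

// Usage with made-up in-memory records instead of real HTTP requests
$records = [1 => 'DOE JOHN', 3 => 'ROE JANE'];
$lookup = function (int $seat) use ($records): ?string {
    return $records[$seat] ?? null;
};

var_dump(findSeatByName($lookup, 'roe jane', 5)); // int(3)
```

Passing the fetcher in as a callback keeps the matching logic separate from the HTTP request, so the loop can be tried on sample data first.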

As I had experience scraping data with PHP cURL, writing a script to iterate through each result record was a simple task. Here’s how I did it:

<?php

// Simple HTML DOM parser to manipulate HTML
include_once '../class/simplehtmldom/simple_html_dom.php';

// To store cookies, just in case
$cookie_file_path = "cookie.txt";

// This file will have list of all candidates, their seat number and college name
$file = 'data.txt';

// The same cURL handle and options are reused for every request
$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
// URL of result page
curl_setopt($ch, CURLOPT_URL, 'http://vnsgu.ac.in/vnsguexam/result/science/BSCSEM5/bscsem5o.php');

// We have total 3281 candidates, so include seat number 3281 itself
for ($i = 1; $i <= 3281; $i++) {
    // Send post request for every seat number
    curl_setopt($ch, CURLOPT_POSTFIELDS, "seat=$i&submit=Submit");
    $output1 = curl_exec($ch);

    // Skip this seat number if the request failed
    if ($output1 === false) continue;

    // str_get_html() returns false for an empty or oversized response
    $html = str_get_html($output1);
    if (!$html) continue;

    $ret = $html->find('table[width=900] td b');

    // Skip pages that do not contain the four <b> fields used below
    if (count($ret) < 4) continue;

    $seat = explode(": &nbsp;", (string) $ret[3]);
    if (count($seat) < 2) continue;

    $towrite = $ret[0].";".$ret[1].";".$ret[2].";".$seat[1]."\r\n";
    file_put_contents($file, $towrite, FILE_APPEND);

    // Free the parsed DOM before the next iteration
    $html->clear();
}

curl_close($ch);
?>

I uploaded the script to a server, executed it, and done! I had the data of all candidates in a text file.

CTRL + F and I found our X!
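
The same lookup could also be done in code instead of with CTRL + F. Here is a minimal sketch, assuming only that each line of the output file holds one candidate’s record. `findMatchingLines` is a hypothetical helper, and the sample lines (including their field order) are made up for illustration.

```php
<?php

/**
 * Return every line whose text contains $name (case-insensitive).
 * Hypothetical helper; it does not assume any particular field order.
 */
function findMatchingLines(array $lines, string $name): array
{
    $matches = [];
    foreach ($lines as $line) {
        if (stripos($line, $name) !== false) {
            // Drop the trailing line ending before returning the record
            $matches[] = rtrim($line, "\r\n");
        }
    }
    return $matches;
}

// Made-up sample lines in place of the real file contents
$sample = ["DOE JOHN;COLLEGE A;PASS;1\r\n", "ROE JANE;COLLEGE B;PASS;3\r\n"];
print_r(findMatchingLines($sample, 'roe jane'));
```

For the real file, the lines could come from `file('data.txt')`, which returns the file as an array of lines.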

Mission accomplished.

1 Comment

Milan, August 25, 2016:

Smart 🙂 that technical thing
I am not from IT/CE branch so i dont have language knowledge but i can build blog based on that 😛

Darpan Dodiya
Raleigh, NC, USA

Hi! I'm Darpan. 23 yr old ex-Software Engineer, now pursuing Master's in Computer Science at NC State, USA. This is my personal blog to share my interests in travelling, photography, programming and life. Glad to see you here, have a look around the website, you'll enjoy. Drop a comment or reach out to me or get connected via links below. Have a good day! :)
