Vlad

[Code with Me] HTML Form Parser in Python: Threat Hunting Social Media Sites and Suspicious Pages





Here is a scenario: you are investigating a potential phishing site and want to check which URL a certain login form will send you to. In this video we'll code a Python program that parses an HTML form tag.

This is impromptu coding, so code with me.


Algorithm:

ask user for URL input -->

is there a login on the webpage? -->

if yes:

    output login redirect -->

    ask if you want to see the entire form tag -> if yes, print the entire form

    -> if no, exit program

if no login on the page:

    but it has forms:

        > output the entire form

    else:

        > "no form identified"

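The branches above can be sketched as a small function that works on an HTML string, so the decision can be tried on sample markup without fetching a live page. This is only a sketch under a few assumptions: "login" is taken to mean that the first form tag has an action attribute, the names FormCollector and triage are made up for illustration, and it uses the standard library's html.parser so that it runs without installing anything.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect the attributes of every <form> start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "form":
            self.forms.append(dict(attrs))


def triage(html):
    """Return (verdict, action) following the branches of the algorithm."""
    collector = FormCollector()
    collector.feed(html)
    if not collector.forms:
        return ("no form identified", None)
    first = collector.forms[0]  # assumption: only the first form is checked
    if "action" in first:
        return ("login redirect", first["action"])
    return ("form without action", None)


print(triage('<form action="/login.php" method="post"></form>'))
print(triage('<form id="search"><input name="q"></form>'))
print(triage('<p>hello</p>'))
```

Returning a verdict instead of printing keeps the decision separate from the prompts, which makes it easy to check against saved copies of suspicious pages.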




Code:

# Requires Python 3, requests, beautifulsoup4 and lxml.
import sys
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

input_url = input("Input entire URL to analyze: ")

# Fetch the page; stop early if the request itself fails.
try:
    with requests.Session() as c:
        page = c.get(input_url, timeout=10).content
except requests.RequestException as err:
    sys.exit("Could not fetch the page: {}".format(err))

soup = BeautifulSoup(page, 'lxml')
form = soup.find('form')  # first html form tag on the page, or None

if form is None:
    print("No form tag identified on the page")
elif form.get('action') is None:
    # A form exists but has no action attribute (no redirect URL).
    print("No login found, but did find a form tag")
    print(form)
else:
    # Resolve the action against the page URL so relative paths work.
    print("Login will redirect to")
    print(urljoin(input_url, form['action']))
    cond = input("Do you want to see the entire form? yes or no? ")
    if cond == "yes":
        print(form)
    elif cond != "no":
        print("Wrong input, will now exit")
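A form's action can be relative, root-relative, absolute (possibly on another domain), or empty. As a minimal sketch of how those cases resolve against the page URL, the snippet below uses urljoin and urlparse from Python's standard urllib.parse module. The helper names resolve_form_action and posts_off_site and the example.com and example.net URLs are placeholders invented for illustration, not part of the program above.

```python
from urllib.parse import urljoin, urlparse


def resolve_form_action(page_url, action):
    """Resolve a form's action attribute against the URL of the page it is on."""
    # An empty or missing action means the form submits back to the page itself.
    return urljoin(page_url, action or "")


def posts_off_site(page_url, action):
    """True when the resolved target is on a different host than the page."""
    target = resolve_form_action(page_url, action)
    return urlparse(target).netloc != urlparse(page_url).netloc


page = "https://example.com/account/login.html"  # placeholder URL
for action in ["submit.php", "/auth", "https://collector.example.net/c", ""]:
    print(resolve_form_action(page, action), posts_off_site(page, action))
```

When hunting phishing pages, a login form that posts to a different host than the page it sits on is often worth a closer look, which is why the second helper compares the two hosts.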



For the full tutorial, watch the YouTube video below. If you like the video, click Like and subscribe to my channel.




