This is a Python script that finds which IP addresses hit our servers the most and looks up the location of each one, using the access log and ipstack.
- Sorts the log by hit count and shows the top 10 IP addresses
- Prints the location of each of those IP addresses
- Requires an ipstack login and API key for the location lookup
- Requires the API key to be added before running the script: get the key from the ipstack site and set it in "apikey.py"
- Requires python3
- To get a key, go to the ipstack site and click "GET FREE API KEY" in the top right corner
Steps: (Amazon-Linux)
sudo yum install git -y
sudo yum install python3
git clone https://github.com/yousafkhamza/log-analyzer-pyscript.git
cd log-analyzer-pyscript
$ python3 log.py
Enter your log file name (absolute path): ../Downloads/Python/access.log
193.106.31.130 : 313055 [Ukraine]
197.52.128.37 : 40777 [Egypt]
45.133.1.60 : 7514 [Netherlands]
173.255.176.5 : 5220 [United States]
172.93.129.211 : 4195 [United States]
178.44.47.170 : 2824 [Russia]
51.210.183.78 : 2684 [France]
84.17.45.105 : 2360 [United States]
193.9.114.182 : 2205 [Belgium]
45.15.143.155 : 1927 [United States]
Output format: most-hitting IP address : hit count [location]
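The column layout of each output row comes from the format string used in log.py. A minimal sketch of how that format string pads its fields follows; `format_row` is a hypothetical helper name used only for illustration and is not part of the repository.

```python
# Hypothetical helper: formats one result row with the same format string as log.py.
def format_row(ip, hits, country):
    # {:20} left-aligns the IP string in a 20-column field;
    # {:10} right-aligns the integer hit count in a 10-column field.
    return "{:20}:{:10} [{}]".format(ip, hits, country)

print(format_row("193.106.31.130", 313055, "Ukraine"))
```

Strings are left-aligned and integers right-aligned by default, which is why the hit counts line up on the right when the script prints several rows.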
- ipstack (custom module)
- logparser (custom module)
- apikey (custom module that passes the API key)
- requests (HTTP library used to call the ipstack API)
- re (regular expression module)
- os (used to check that the log file exists)
# cat apikey.py (passes the API key to the script)
api = '<enter your apikey from ipstack site>'  # <--- Replace with the API key you got from ipstack
# eg:
# api = 'a37f9a05417225606d6650e16167'
# cat ipstack.py (connects to the ipstack API and fetches the country name)
import requests

def get_country(ip=None, key=None):
    # Returns None when either argument is missing.
    if ip is not None and key is not None:
        url_ipstack = "http://api.ipstack.com/{}?access_key={}".format(ip, key)
        response = requests.get(url=url_ipstack)
        geodata = response.json()
        # .get() returns None instead of raising KeyError when the field is absent.
        return geodata.get('country_name')
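The lookup above assumes every response contains a country name. The sketch below shows one way to read the response body more defensively. It assumes that failed lookups come back as a JSON object with "success" set to false and an "error" object, which should be checked against the ipstack documentation. `country_from_body` is a hypothetical helper, not part of the repository.

```python
import json

def country_from_body(body):
    """Return the country name from an ipstack response body, or None."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None  # body was missing or not valid JSON
    if not isinstance(payload, dict):
        return None
    # Assumption: failed lookups look like {"success": false, "error": {...}}.
    if payload.get("success") is False:
        return None
    return payload.get("country_name")

print(country_from_body('{"ip": "84.17.45.105", "country_name": "United States"}'))
print(country_from_body('{"success": false, "error": {"code": 101}}'))
```

Inside get_country it could be called as `country_from_body(response.text)`, so that an invalid key or a quota error prints `None` as the location instead of stopping the script.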
# cat logparser.py (parses the log lines; this is a third-party script, and many thanks to its author)
#!/usr/bin/env python3
import re
regex_host = r'(?P<host>.*?)'
regex_identity = r'(?P<identity>\S+)'
regex_user = r'(?P<user>\S+)'
regex_time = r'\[(?P<time>.*?)\]'
regex_request = r'\"(?P<request>.*?)\"'
regex_status = r'(?P<status>\d{3})'
regex_size = r'(?P<size>\S+)'
regex_referer = r'\"(?P<referer>.*?)\"'
regex_agent = r'\"(?P<agent>.*?)\"'
regex_space = r'\s'
pattern = regex_host + regex_space + regex_identity + regex_space + \
          regex_user + regex_space + regex_time + regex_space + \
          regex_request + regex_space + regex_status + regex_space + \
          regex_size + regex_space + regex_referer + regex_space + \
          regex_agent
def parser(s):
    """
    return type : dict()
    return format: {
                host:str , identity:str , user:str ,
                time:str ,request:str , status:str ,
                size:str , referer:str, agent:str
                }
    returns None if failed.
    """
    try:
        parts = re.match(pattern, s)
        return parts.groupdict()
    except Exception as err:
        # re.match returns None for a non-matching line, so groupdict() raises here.
        print(err)
        return None
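To see what the parser returns, here is a small self-contained sketch that applies the same named groups to one line in combined log format. The sample line is made up for illustration, and `PATTERN` is just the pieces above joined into a single expression.

```python
import re

# Same named groups as logparser.py, joined into one pattern.
PATTERN = re.compile(
    r'(?P<host>.*?)\s(?P<identity>\S+)\s(?P<user>\S+)\s'
    r'\[(?P<time>.*?)\]\s"(?P<request>.*?)"\s(?P<status>\d{3})\s'
    r'(?P<size>\S+)\s"(?P<referer>.*?)"\s"(?P<agent>.*?)"'
)

# Made-up sample line (not taken from a real log).
line = ('203.0.113.7 - - [10/Oct/2021:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.79.1"')

match = PATTERN.match(line)
fields = match.groupdict() if match else None
print(fields["host"], fields["status"], fields["request"])
```

A line without the quoted referer and agent fields does not match this pattern, so logs written in the shorter common log format would be reported as parse failures.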
# cat log.py (sorts the IPs by hit count and looks up the location of each one)
import os
import ipstack
import logparser
import apikey

def get_hit(t):
    return t[1]

path = input('Enter your log file name (absolute path): ')
if path.lower().split('/')[-1].endswith('log') and os.path.isfile(path):
    ipcount = {}
    # The with block closes the file even if parsing fails.
    with open(path, 'r') as file:
        for line in file:
            part = logparser.parser(line)
            if part is None:
                continue  # skip lines the parser could not match
            ip = part['host']
            if ip not in ipcount:
                ipcount[ip] = 1
            else:
                ipcount[ip] += 1
    result = sorted(ipcount.items(), key=get_hit, reverse=True)[:10]
    for item in result:
        ip, hit = item
        country = ipstack.get_country(ip=ip, key=apikey.api)
        print("{:20}:{:10} [{}]".format(ip, hit, country))
else:
    print('This is not an access log')
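The counting and sorting steps in log.py can also be written with `collections.Counter` from the standard library. This is an alternative sketch, not the way the script does it; `top_hosts` is a hypothetical helper name and the sample addresses are made up.

```python
from collections import Counter

def top_hosts(hosts, n=10):
    """Return the n most frequent hosts as (host, hit_count) pairs."""
    # most_common sorts by count, highest first.
    return Counter(hosts).most_common(n)

sample = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.2"]
print(top_hosts(sample, n=2))
```

In log.py the `hosts` argument would be the `part['host']` value of every successfully parsed line, and the returned pairs have the same shape as the `result` list the script already loops over.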
In short, this Python script analyzes the access log you give it and reports the most frequently hitting IP addresses along with their locations.