Access Log Analyzer + Location Finder



Description

A Python script that finds which IP addresses hit our servers most often in an access log and looks up each address's location with the help of ipstack.


Features

  • Sorts IP addresses by hit count and shows the top 10
  • Prints the location of each listed IP address

Prerequisites

  • An ipstack account and API key for the location lookup
  • The API key must be added to "apikey.py" before running the script: get the key from the ipstack site and set it in that file
  • Python 3 must be installed
  • The requests package must be installed (ipstack.py imports it)

How to get an ipstack API key

  • Go to the ipstack site and click "GET FREE API KEY" in the top right corner.



How to get the script

Steps: (Amazon-Linux)

sudo yum install git -y
sudo yum install python3
git clone https://github.com/yousafkhamza/log-analyzer-pyscript.git
cd log-analyzer-pyscript

Script running Demonstration

$ python3 log.py
Enter your log file name (absolute path): ../Downloads/Python/access.log
193.106.31.130      :    313055  [Ukraine]
197.52.128.37       :     40777  [Egypt]
45.133.1.60         :      7514  [Netherlands]
173.255.176.5       :      5220  [United States]
172.93.129.211      :      4195  [United States]
178.44.47.170       :      2824  [Russia]
51.210.183.78       :      2684  [France]
84.17.45.105        :      2360  [United States]
193.9.114.182       :      2205  [Belgium]
45.15.143.155       :      1927  [United States]

Output format: IP address : hit count [location]
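
The column layout in the demonstration comes from the format string that log.py passes to print. As a small sketch of how that string lays out one row (the helper name format_row is hypothetical and not part of the repository):

```python
# Same format string that log.py uses for each output row.
ROW_FORMAT = "{:20}:{:10}  [{}]"


def format_row(ip, hits, country):
    """Build one report row (format_row is an illustrative name).

    The IP string is left-aligned in 20 columns and the integer hit
    count is right-aligned in 10 columns, followed by the country.
    """
    return ROW_FORMAT.format(ip, hits, country)


print(format_row("193.106.31.130", 313055, "Ukraine"))
```

Strings are left-aligned and integers right-aligned by default, which is why the addresses line up on the left and the counts on the right.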


Modules used

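  • ipstack (custom module that queries the ipstack API)
  • logparser (custom module that parses log lines)
  • apikey (custom module that holds the API key)
  • requests (HTTP library used to call the ipstack API)
  • re (regular expression module)
  • os (used to check that the log file exists)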

Behind the code

# cat apikey.py (passes the API key to the script)

api = '<enter your apikey from ipstack site>'            # <--- Replace with the API key you got from ipstack
# eg:
# api = 'a37f9a05417225606d6650e16167'

# cat ipstack.py (connects to the ipstack API and fetches the country name)

import requests

def get_country(ip=None, key=None):
    # Look up the country only when both an IP and an API key are supplied.
    if ip is not None and key is not None:
        url_ipstack = "http://api.ipstack.com/{}?access_key={}".format(ip, key)
        response = requests.get(url=url_ipstack)
        geodata = response.json()
        # Returns None instead of raising if the response has no country_name.
        return geodata.get('country_name')
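
get_country reads country_name from the decoded JSON response. A minimal sketch of a more defensive version of that one step follows. It assumes that ipstack reports a failed lookup (for example an invalid key) with "success": false and an "error" object; that shape should be checked against the ipstack documentation. The helper name country_from_response is hypothetical.

```python
def country_from_response(geodata):
    """Return the country name from a decoded ipstack response, or None.

    Assumption: an error payload carries "success": False together with
    an "error" object, and a normal payload carries "country_name".
    """
    if not isinstance(geodata, dict):
        return None
    if geodata.get("success") is False:
        # Treat an error payload as "country unknown".
        return None
    return geodata.get("country_name")


# Hand-written dictionaries stand in for response.json() here.
print(country_from_response({"ip": "203.0.113.7", "country_name": "France"}))
print(country_from_response({"success": False, "error": {"code": 101}}))
```

Working on the decoded dictionary keeps the check separate from the network call, so it can be tried without an API key.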

# cat logparser.py (parses the log lines; this is a third-party script, with thanks to its original author)

#!/usr/bin/env python3

import re

regex_host = r'(?P<host>.*?)'
regex_identity = r'(?P<identity>\S+)'
regex_user = r'(?P<user>\S+)'
regex_time = r'\[(?P<time>.*?)\]'
regex_request = r'\"(?P<request>.*?)\"'
regex_status = r'(?P<status>\d{3})'
regex_size = r'(?P<size>\S+)'
regex_referer = r'\"(?P<referer>.*?)\"'
regex_agent = r'\"(?P<agent>.*?)\"'
regex_space = r'\s'

pattern = regex_host + regex_space + regex_identity + regex_space + \
          regex_user + regex_space + regex_time + regex_space + \
          regex_request + regex_space + regex_status + regex_space + \
          regex_size + regex_space + regex_referer + regex_space + \
          regex_agent


def parser(s):
    """
    return type : dict()
    return format: {
                   host:str , identity:str , user:str ,
                   time:str , request:str , status:str ,
                   size:str , referer:str , agent:str
                   }
    returns None if the line does not match.
    """
    parts = re.match(pattern, s)
    # re.match returns None for lines that do not fit the pattern.
    if parts is None:
        return None
    return parts.groupdict()
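
To see what the pattern in logparser.py extracts, here is a small self-contained sketch that rebuilds the same regular expression and applies it to a made-up log line. The sample line and the name parse_line are illustrative assumptions and are not part of the repository.

```python
import re

# Same pieces as logparser.py, joined by single whitespace characters.
PATTERN = r'\s'.join([
    r'(?P<host>.*?)',
    r'(?P<identity>\S+)',
    r'(?P<user>\S+)',
    r'\[(?P<time>.*?)\]',
    r'\"(?P<request>.*?)\"',
    r'(?P<status>\d{3})',
    r'(?P<size>\S+)',
    r'\"(?P<referer>.*?)\"',
    r'\"(?P<agent>.*?)\"',
])


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = re.match(PATTERN, line)
    return match.groupdict() if match else None


# Made-up sample line for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
          '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"')

fields = parse_line(sample)
print(fields['host'], fields['status'], fields['request'])
print(parse_line('not a log line'))
```

Only the host field is used by log.py; the other fields are available if the report is extended later.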

# cat log.py (sorts IPs by hit count and looks up the location of each)

import os

import ipstack
import logparser
import apikey

def get_hit(t):
    # Sort key: the hit count in an (ip, hits) pair.
    return t[1]

path = input('Enter your log file name (absolute path): ')

if path.lower().endswith('log') and os.path.isfile(path):
    ipcount = {}
    # The with block closes the file even if parsing fails.
    with open(path, 'r') as file:
        for line in file:
            part = logparser.parser(line)
            if part is None:
                continue  # skip lines the parser could not match
            ip = part['host']
            ipcount[ip] = ipcount.get(ip, 0) + 1

    result = sorted(ipcount.items(), key=get_hit, reverse=True)[:10]
    for ip, hit in result:
        country = ipstack.get_country(ip=ip, key=apikey.api)
        print("{:20}:{:10}  [{}]".format(ip, hit, country))
else:
    print('This is not an access log')
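
The counting and sorting step in log.py can also be written with collections.Counter from the standard library. This is an alternative sketch and not the script's own code; the function name top_hosts and the sample addresses are made up for illustration.

```python
from collections import Counter


def top_hosts(hosts, limit=10):
    """Return up to `limit` (host, hits) pairs, most frequent first."""
    # Counter tallies each host; most_common sorts by count, descending.
    return Counter(hosts).most_common(limit)


# Made-up host values, as they would come from part['host'] per log line.
sample_hosts = [
    "203.0.113.7", "198.51.100.4", "203.0.113.7",
    "192.0.2.9", "203.0.113.7", "198.51.100.4",
]

for ip, hit in top_hosts(sample_hosts, limit=2):
    print("{:20}:{:10}".format(ip, hit))
```

Counter replaces both the manual dictionary bookkeeping and the sorted call with a custom key.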



Conclusion

This Python script analyzes the access log you give it and reports the IP addresses with the most hits along with their locations.
