I had to do some analysis of a client's site to verify whether it was actually as popular as the data usage and web statistics said it was. Below are some of the shell commands I used to analyze the access log on the server the site was hosted on:
Unique IPs and number of entries in access_log:
======================================
cat <path to log file> | awk '{print $1}' | sort -n | uniq -c
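On a busy site that list gets long, so it can help to rank the IPs by request count. Here is a minimal sketch of that idea, assuming the Apache Common/Combined Log Format where the client IP is the first field. The log lines, the addresses, and the variable names (log, top_ips) are made up for illustration; point the pipeline at your own log file instead.

```shell
# Hypothetical sample data standing in for a real access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET /about.html HTTP/1.1" 200 1200
203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /logo.png HTTP/1.1" 200 5000
203.0.113.5 - - [10/Oct/2023:13:56:09 +0000] "GET /index.html HTTP/1.1" 304 -
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.44 - - [10/Oct/2023:13:58:30 +0000] "GET /missing HTTP/1.1" 404 209
EOF

# Count requests per IP, rank by count (highest first), keep the top 2,
# and print each result as "IP count" without uniq's leading padding.
top_ips=$(awk '{print $1}' "$log" | sort | uniq -c | sort -rn | head -n 2 | awk '{print $2, $1}')
echo "$top_ips"

rm -f "$log"
```

With this sample data the busiest address is 203.0.113.5 with 3 requests, followed by 198.51.100.7 with 2.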
Number of unique IPs:
===================
cat <path to log file> | awk '{print $1}' | sort -n | uniq -c | wc -l
Total bytes of data served for the logged requests:
=================================
cat <path to log file> | awk '{sum+=$10}END{print sum}'
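A raw byte count is hard to read at a glance, and some entries (a 304 response, for example) log "-" instead of a size. A rough sketch of a variant that only adds up numeric values and also prints the total in MiB, assuming the byte count really is field 10 in your log format. The sample lines and the variable names (log, total) are hypothetical.

```shell
# Hypothetical sample data standing in for a real access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1048576
203.0.113.5 - - [10/Oct/2023:13:56:09 +0000] "GET /index.html HTTP/1.1" 304 -
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /video.mp4 HTTP/1.1" 200 2097152
EOF

# Sum field 10 only when it is purely numeric, then report the total
# both in bytes and in MiB (1 MiB = 1048576 bytes).
total=$(awk '$10 ~ /^[0-9]+$/ {sum += $10}
             END {printf "%d bytes (%.2f MiB)\n", sum, sum / 1048576}' "$log")
echo "$total"

rm -f "$log"
```

Note that a request path containing spaces would shift the field positions, so treat the result as an estimate.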
Unique files and the number of times each was accessed
====================================
cat <path to log file> | awk '{print $7}' | sort | uniq -c | sort -n -r
Unique files and their size in bytes
============================
cat <path to log file> | awk '{print $10, $7}' | sort -n -r | uniq
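To see which files account for the most traffic overall, the size and hit count can be combined per file. This is only a sketch under the same assumptions as the commands above (request path in field 7, bytes in field 10); the sample lines and the names log, per_file, bytes, and hits are hypothetical.

```shell
# Hypothetical sample data standing in for a real access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2000
198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET /logo.png HTTP/1.1" 200 5000
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2000
203.0.113.5 - - [10/Oct/2023:13:58:09 +0000] "GET /index.html HTTP/1.1" 304 -
EOF

# For every path, add up the bytes sent and count the requests that
# carried a numeric size, then list the heaviest paths first.
# Output columns: total_bytes hits path
per_file=$(awk '$10 ~ /^[0-9]+$/ {bytes[$7] += $10; hits[$7]++}
                END {for (f in bytes) print bytes[f], hits[f], f}' "$log" | sort -rn)
echo "$per_file"

rm -f "$log"
```

Here /logo.png comes out on top with 5000 bytes from a single request, ahead of /index.html with 4000 bytes across two requests; the 304 entry is skipped because it has no numeric size.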