The above command lists the IP address for each request, so the output will contain duplicate values. You can pipe this output to the uniq command to get a unique list of IP addresses accessing your website. If you are looking for a specific IP address, you can filter the output for it with grep. If you need to find the top 10 most frequent IP addresses accessing your website, use the following awk command.
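A minimal sketch of such a pipeline, assuming the default combined log format and an access log at /var/log/nginx/access.log (both the field position and the path are assumptions):

    # Count requests per client IP (field 1), then show the 10 most frequent
    awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10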
If you want to run these commands regularly, it is advisable to put them in a shell script. Create a blank shell script with the following command.
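A minimal sketch, where the script name log_report.sh is just a placeholder:

    touch log_report.sh
    chmod +x log_report.sh

The script itself could then wrap the one-liners above, for example:

    #!/bin/bash
    # Print the 10 most frequent client IPs from the access log
    awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10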
You may also create a cron job to run the above script regularly. Open your crontab with the following command and add a line to run the shell script every day at a time of your choosing.
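A sketch of this, assuming the script created above and a daily run at midnight (the schedule and the path are assumptions):

    crontab -e

and then add:

    # Run the log report script every day at midnight
    0 0 * * * /path/to/log_report.sh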
Since NGINX writes information about every client request, on large websites with a lot of traffic you can end up with access log files that run to hundreds of thousands of lines. The default logging format is called combined and is usually defined in the configuration as follows.
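In nginx.conf the combined format is predefined and is equivalent to the following log_format directive:

    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';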
The first sort enables uniq to properly identify and count (-c) unique status codes. The final sort orders the result numerically (-n) in descending (-r) order. As you can see, we can already identify some issues: there are a bunch of error responses, which usually point to serious issues in the application, but that is a different story.
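The pipeline being described can be sketched like this, assuming the status code is the ninth whitespace-separated field of a combined-format line and the log path used earlier:

    # Count occurrences of each HTTP status code, most frequent first
    awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn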
The same output as the previous command can be achieved directly with AWK, using a little more of its power. AWK supports associative arrays, which are like traditional arrays except that they use strings as their indexes rather than numbers. In the version sketched below we create a new array with a key for each HTTP status code found in the input file and progressively increment the corresponding value on each occurrence.
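A sketch of that awk one-liner, under the same assumptions about field position and log path:

    # Tally status codes in an associative array, then print count and code
    awk '{codes[$9]++} END {for (code in codes) print codes[code], code}' /var/log/nginx/access.log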
At the end of the file we print the contents of this array by iterating over its keys and values.

I actually had this big beautiful regex I had written by hand to parse all of my apache custom logs down to individual fields for submission into a database.
I am kicking myself that I don't have it anymore. It was a one-liner; it gave you back one variable for each log element, and then I was inserting them into MySQL. If I find it I'll post it here.
Once you're in awk, that's usually enough. Elegant and simple. Beware: spaces seem to be allowed in the "authuser" third field, which breaks everything, and I personally think it should be forbidden, to let us keep doing this ;-)

Here's an excerpt from one server's apache config, starting with a comment that we don't want to log bots, they're our friends, followed by a BrowserMatch Pingdom directive.
This means that if I want to do some analysis in Python, maybe show non-200 statuses for example, I can simply loop over the lines of the access log and split each one on tabs. With this format, it's simple: cut -f 8 log | uniq -c | sort -n, exactly the same as the above.
Your examples for the new format are actually still overcomplicated: IP counts become cut -f 3 log | uniq -c | sort -n, user agents cut -f 8 log | uniq -c | sort -n. You're right, that is simpler. I've updated the examples to reflect that.
I have no excuse, and have updated the example accordingly.

Interesting, but I would think you might run into problems if your logs are particularly large.
Also, how well does it cope with custom log formats? I am trying it at the moment and the load time is very slow, at least in the version I tried; loading the log took more than five minutes. I must say that after the load time (it took around 15 minutes) the syntax of this program is great: you can sort, count and group by. Really nice. Yes, writes might take long, but a threaded proxy sandwiched in the middle might do just the right thing. Anyhow, that will make querying logs with an SQL-like syntax a lot faster.
There is no loading involved either, since the database server is perpetually "on". I had no idea something like this existed! For web devs who are already very experienced with SQL, this is a great option.

Here is a script to find the top URLs, top referrers and top user agents from the most recent N log entries.
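A sketch of what such a script might look like, assuming a combined-format access log; the log path, the default for N and the field positions are assumptions:

    #!/bin/bash
    # Report top URLs, referrers and user agents from the most recent N entries
    N=${1:-10000}                          # number of recent entries to examine
    LOG=${2:-/var/log/nginx/access.log}    # log path is an assumption

    echo "== Top URLs =="
    tail -n "$N" "$LOG" | awk '{print $7}' | sort | uniq -c | sort -rn | head -10

    echo "== Top referrers =="
    tail -n "$N" "$LOG" | awk -F'"' '{print $4}' | sort | uniq -c | sort -rn | head -10

    echo "== Top user agents =="
    tail -n "$N" "$LOG" | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn | head -10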
My common questions are: why did the hit rate change? A more complex example in perl might be to visualize a change in hit rate for a pattern.
There is a lot to chew on in the script below, especially if you are unfamiliar with perl.