When I run the command, the output looks like:
    $ sudo tcpdump -n -l arp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on en0, link-type EN10MB (Ethernet), capture size 96 bytes
    19:10:48.212755 arp who-has 192.168.1.110 (85:70:48:a0:00:10) tell 192.168.1.10
    19:10:48.743185 arp who-has 192.168.1.96 tell 192.168.1.92
    19:10:48.743189 arp reply 192.168.1.2 is-at 00:0e:e7:7a:b2:24
    19:10:48.743198 arp who-has 192.168.1.96 tell 192.168.1.111
    ^C
To get the output to stop, I press Ctrl-C. Otherwise, it will run forever.
If you get a permission error, you may not be running the command as root.
After the header, we see these 'arp who-has X tell Y' lines. Y is the host that asked the question. The question was, 'Will the host at IP address X please respond so that I know your Ethernet (MAC) address?' The question is sent out as a broadcast, so we should see any ARP requests on our local LAN. However, we won't see many of the answers because they are sent as unicast packets, and we are on a switch. In this case, we see one reply because we're on the same hub as that machine (or maybe that is the machine running the command; I won't tell you which it is). That's OK because we only need to see one side of the question.
That's our data source. Now, let's transform the data into something we can use.
First, let's isolate just the lines of output that we want. In our case, we want the 'arp who-has' lines:
    $ sudo tcpdump -l -n arp | grep 'arp who-has'
We can run that and see that it is doing what we expect. The only problem now is that this command runs forever, waiting for us to stop it by pressing Ctrl-C. We want enough lines to do something useful, and then we'll process it all. So, let's take the first 100 lines of data:
    $ sudo tcpdump -l -n arp | grep 'arp who-has' | head -100
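If you want to see why head makes the pipeline finish without waiting on real network traffic, here is a small sketch. fake_arp_stream is a made-up function for this illustration, not a real tool; it stands in for tcpdump by printing who-has lines until nobody is reading them anymore.

```shell
# Hypothetical stand-in for `sudo tcpdump -l -n arp`: keeps printing
# request lines until a write fails because the reader has gone away.
fake_arp_stream() {
    i=0
    while echo "19:10:48.743185 arp who-has 192.168.1.96 tell 192.168.1.$((i % 5 + 90))"; do
        i=$((i + 1))
    done
}

# head exits after 10 lines, which closes the pipe and stops everything upstream.
fake_arp_stream | grep 'arp who-has' | head -10
```

The same thing should happen with the real command: once head has its lines, the rest of the pipeline shuts down the next time it tries to write.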
Again, we run this and see that it comes out OK. Of course, I'm impatient and changed the 100 down to 10 when I was testing this. However, that gave me the confidence that it worked and that I could use 100 in the final command. You'll notice that there are a bunch of headers that are output, too. Those go to stderr (directly to the screen) and aren't going into the grep command.
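A quick way to convince yourself that the headers bypass the pipe is to fake it. In this sketch, noisy is a hypothetical function invented for the example; it writes one header line to stderr and one data line to stdout, roughly the way tcpdump does.

```shell
# Hypothetical stand-in for tcpdump: header on stderr, data on stdout.
noisy() {
    echo "listening on en0 (header, written to stderr)" >&2
    echo "19:10:48.743185 arp who-has 192.168.1.96 tell 192.168.1.92"
}

# Only stdout travels through the pipe, so wc counts one line.
# The header still shows up on the screen.
noisy | wc -l

# Sending stderr to /dev/null hides the header completely.
noisy 2>/dev/null | grep 'arp who-has'
```

Adding 2>/dev/null to the real tcpdump command would probably quiet the headers in the same way, but it would also hide error messages such as a permission problem, so I would only do that once the pipeline is known to work.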
So, now we have 100 lines of the kind of data we want. It's time to calculate the statistic we were looking for. That is, which hosts are generating the most ARP packets? Well, we're going to need to extract each host IP that generated an ARP and count it somehow. Let's start by extracting out the host IP address, which is always the sixth field of each line, so we can use this command to extract that field's data:
    awk '{ print $6 }'
That little bit of
I should point out that I was too lazy to count which field had the data I wanted. It looked like it was about the fifth word, so I first tried it with $5. That didn't work. So I tried $6. Oh yeah, I need to remember that
I'm lazy and I'm impatient. I didn't want to wait for all 100 ARPs to be collected. Therefore, I stored them once and kept reusing the results.
I stored them in a temporary file:
    $ sudo tcpdump -l -n arp | grep 'arp who-has' | head -100 >/tmp/x
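Here is a minimal sketch of the capture-once, reuse-many-times idea that runs without tcpdump. It assumes mktemp is available (it picks a unique file name instead of the fixed /tmp/x), and it fills the file with two request lines copied from the sample output above rather than with live traffic.

```shell
# Assumption: mktemp exists on this system; it returns a unique temp file name.
capture=$(mktemp)

# Stand-in for the tcpdump pipeline: two lines from the sample output above.
printf '%s\n' \
    '19:10:48.743185 arp who-has 192.168.1.96 tell 192.168.1.92' \
    '19:10:48.743198 arp who-has 192.168.1.96 tell 192.168.1.111' > "$capture"

# The saved file can be re-read as often as needed, with no waiting for packets.
wc -l < "$capture"
grep -c 'tell 192.168.1.92' "$capture"
```

Remove the file with rm "$capture" when you are done experimenting.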
Then I ran my
Dang! It isn't the fifth. I'll try the sixth:
    $ cat /tmp/x | awk '{ print $6 }'
    192.168.1.110
    192.168.1.10
    192.168.1.92
    ...
Ah, that's better.
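The fifth-versus-sixth experiment is easy to repeat on a couple of known lines. This sketch uses two request lines copied from the sample tcpdump output earlier in this section; the variable name lines is just for the example.

```shell
# Two request lines copied from the sample tcpdump output.
lines='19:10:48.743185 arp who-has 192.168.1.96 tell 192.168.1.92
19:10:48.743198 arp who-has 192.168.1.96 tell 192.168.1.111'

# Field 5 is the literal word "tell" on these lines.
printf '%s\n' "$lines" | awk '{ print $5 }'

# Field 6 is the IP address of the host that asked the question.
printf '%s\n' "$lines" | awk '{ print $6 }'
```

One caveat: the first line of the sample output carries an extra MAC address in parentheses, so on a line shaped like that, counting six fields from the left lands on the word tell instead of the address.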
Anyway, I then realized I could be lazy in a different way. $NF means 'the last field' and