How can I find the total number of TCP connections for a given port and period of time by IP?

Turn on iptables and set it to LOG for incoming connections. Example rule:

 -A INPUT -p tcp --dport 4711 -m state --state NEW -j LOG

(where 4711 is the port you want to track).

Then run the resulting log through whatever script you like that can do the summary for you.
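As a rough sketch of such a script, the shell function below counts logged packets per source address. It assumes the usual kernel LOG line layout with SRC= and DPT= fields; the function name count_log_sources is made up for this example, and where the log ends up (for instance /var/log/kern.log or /var/log/messages) depends on your syslog setup.

```shell
# Print "address: count" for every source address found in iptables LOG
# lines (read from stdin) whose destination port equals $1.
# Assumes each line carries SRC=<addr> and DPT=<port> fields.
count_log_sources() {
  awk -v port="$1" '
    {
      src = ""; dpt = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^SRC=/) src = substr($i, 5)   # text after "SRC="
        if ($i ~ /^DPT=/) dpt = substr($i, 5)   # text after "DPT="
      }
      if (src != "" && dpt == port) count[src]++
    }
    END { for (addr in count) printf "%s: %d\n", addr, count[addr] }
  ' | sort
}
```

To limit the summary to a period of time, select the lines first, for example with something like journalctl -k --since "09:00" --until "10:00" | count_log_sources 4711 on a systemd machine, or with grep on the syslog timestamps.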


You can use tcpdump to log all SYN (without ACK) packets:

tcpdump "dst port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn"

or log all SYN+ACK packets (connections the server answered):

tcpdump "src port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == (tcp-syn|tcp-ack)"

Then pipe the output through wc -l to count the lines, one per connection attempt.

You'd also need a way to measure fixed periods of time. For example, a cron job could send tcpdump a SIGINT at regular intervals. tcpdump prints packet totals when it exits, but the only time information it logs is the per-packet timestamp.
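Since wc -l only gives a total, here is a minimal sketch for a per-IP count over a fixed period. It assumes tcpdump's default one-line text output ("timestamp IP src.port > dst.port: ...") and coreutils timeout as the interval timer instead of cron; the function name count_syn_sources is invented for this example.

```shell
# Read tcpdump text output from stdin and print "address: count"
# for the source address of every IPv4/IPv6 packet line.
count_syn_sources() {
  awk '
    $2 == "IP" || $2 == "IP6" {
      src = $3
      sub(/\.[0-9]+$/, "", src)   # drop the trailing ".port"
      count[src]++
    }
    END { for (addr in count) printf "%s: %d\n", addr, count[addr] }
  ' | sort
}

# Example (needs root): capture for 60 seconds, then summarize.
# -nn keeps addresses and ports numeric, -l makes stdout line buffered.
# timeout -s INT 60 tcpdump -nn -l "dst port 4711 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn" | count_syn_sources
```

With the SYN+ACK filter the first address on each line is the server, so in that case you would count the destination field instead.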

Update: needless to say, have a look at the tcpdump man page and consider options such as -i (listen on only one interface), -p (don't put the interface into promiscuous mode; less invasive), or some of the output options. tcpdump needs root permissions, and your boss may not like it because it is kind of a hacker tool. On the other hand, you don't need to change anything on your system to run it (in contrast to the iptables LOG solution).

Please also note the small src/dst difference in the filter. If you catch SYN+ACK packets and want to count connections to a server at port 4711, you need src. If you catch SYN packets without ACK for the same result, you need dst. If you count connections on the server itself, you always have to use the reverse.


SystemTap solution

Script inspired by the tcp_connections.stp example:

#!/usr/bin/env stap
# To monitor another TCP port run:
#     stap -G port=80 tcp_connections.stp
# or
#     ./tcp_connections.stp -G port=80
global port = 22
global connections

function report() {
  foreach (addr in connections) {
    printf("%s: %d\n", addr, @count(connections[addr]))
  }
}

probe end {
  printf("\n=== Summary ===\n")
  report()
}

probe kernel.function("tcp_accept").return?,
      kernel.function("inet_csk_accept").return? {
  sock = $return
  if (sock != 0) {
    local_port = inet_get_local_port(sock)
    if (local_port == port) {
      remote_addr = inet_get_ip_source(sock)
      connections[remote_addr] <<< 1
      printf("%s New connection from %s\n", ctime(gettimeofday_s()), remote_addr)
    }
  }
}

Output:

[root@bubu ~]# ./tcp_connections.stp -G port=80
Mon Mar 17 04:13:03 2014 New connection from 192.168.122.1
Mon Mar 17 04:13:04 2014 New connection from 192.168.122.1
Mon Mar 17 04:13:08 2014 New connection from 192.168.122.4
^C
=== Summary ===
192.168.122.1: 2
192.168.122.4: 1

strace solution

Either start the program under strace:

strace -r -f -e trace=accept -o /tmp/strace ${PROGRAM} ${ARGS}

or trace an already running program:

strace -r -f -e trace=accept -o /tmp/strace -p ${PID_OF_PROGRAM}

-r prints a relative timestamp upon entry to each system call, in case it's needed later for extra performance analysis. -f also traces child processes and might not be needed.

The output looks something like this:

999        0.000000 accept(3, {sa_family=AF_INET, sin_port=htons(34702), sin_addr=inet_addr("192.168.122.4")}, [16]) = 5
999        0.008079 --- SIGCHLD (Child exited) @ 0 (0) ---
999        1.029846 accept(3, {sa_family=AF_INET, sin_port=htons(34703), sin_addr=inet_addr("192.168.122.4")}, [16]) = 5
999        0.008276 --- SIGCHLD (Child exited) @ 0 (0) ---
999        3.580122 accept(3, {sa_family=AF_INET, sin_port=htons(50114), sin_addr=inet_addr("192.168.122.1")}, [16]) = 5

and can be filtered with:

# gawk 'match($0, /^([0-9]+)[[:space:]]+([0-9.]+)[[:space:]]+accept\(.*htons\(([^)]+)\),.*inet_addr\("([^"]+)"\).*[[:space:]]+=[[:space:]]+([1-9][0-9]*)/, m) {connections[m[4]]++} END {for (addr in connections) printf("%s: %d\n", addr, connections[addr]); }' /tmp/strace
192.168.122.4: 3
192.168.122.1: 2

Short explanation of the AWK one-liner: m[1] is the PID, m[2] is the timestamp, m[3] is the remote port, m[4] is the remote address and m[5] is the file descriptor returned by accept.

The advantage of this solution is that root is not required if the server runs under the same user. The disadvantage is that every accepted connection is counted with no filtering by port, so it won't give per-port numbers if the application listens on multiple ports.
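To restrict the count to a period of time, the -r column can be summed up. The sketch below is only an illustration under assumptions: it expects the log layout shown above (PID, relative timestamp, then the call), treats the running sum of the -r values as seconds elapsed since the first logged line, and uses an invented function name, count_accepts_within. It sticks to POSIX awk, so gawk is not needed.

```shell
# Count successful accept() calls per remote address within the first
# $1 seconds of a "strace -r -f" log read from stdin.
count_accepts_within() {
  awk -v limit="$1" '
    {
      elapsed += $2                # running sum of the -r deltas
      if (elapsed > limit) exit    # past the window: stop, END still runs
    }
    $3 ~ /^accept\(/ && $0 ~ / = [1-9][0-9]*$/ && match($0, /inet_addr\("[^"]+"\)/) {
      # skip the 11 characters of inet_addr(" and the 2 closing characters
      addr = substr($0, RSTART + 11, RLENGTH - 13)
      count[addr]++
    }
    END { for (addr in count) printf "%s: %d\n", addr, count[addr] }
  ' | sort
}
```

For example, count_accepts_within 3600 < /tmp/strace would summarize roughly the first hour of the trace.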