Python Socket Receive Large Amount of Data

The accepted answer is fine but it will be really slow with big files -string is an immutable class this means more objects are created every time you use the + sign, using list as a stack structure will be more efficient.

This should work better

while True: 
    chunk = s.recv(10000)
    if not chunk: 
        break
    fragments.append(chunk)

print "".join(fragments)

You can use it as: data = recvall(sock)

def recvall(sock):
    BUFF_SIZE = 4096 # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # either 0 or end of data
            break
    return data

TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving the exact data sent—it might receive the data piece-meal, split across multiple recv() calls, due to packet fragmentation.

You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.

One simple way of sending a message is to prefix each message with its length. Then to read a message, you first read the length, then you read that many bytes. Here's how you might do that:

def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)

def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)

def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced on the network level.

Tags:

Python

Sockets