Topic : Using Internet Sockets
Author : Beej
Page : << Previous 12  Next >>
Go to page :


    while(total < *len) {
            n = send(s, buf+total, bytesleft, 0);
            if (n == -1) { break; }
            total += n;
            bytesleft -= n;
        }

        *len = total; // return number actually sent here

        return n==-1?-1:0; // return -1 on failure, 0 on success
    }



In this example, s is the socket you want to send the data to, buf is the buffer containing the data, and len is a pointer to an int containing the number of bytes in the buffer.

The function returns -1 on error (and errno is still set from the call to send().) Also, the number of bytes actually sent is returned in len. This will be the same number of bytes you asked it to send, unless there was an error. sendall() will do it's best, huffing and puffing, to send the data out, but if there's an error, it gets back to you right away.

For completeness, here's a sample call to the function:


    char buf[10] = "Beej!";
    int len;

    len = strlen(buf);
    if (sendall(s, buf, &len) == -1) {
        perror("sendall");
        printf("We only sent %d bytes because of the error!\n", len);
    }



What happens on the receiver's end when part of a packet arrives? If the packets are variable length, how does the receiver know when one packet ends and another begins? Yes, real-world scenarios are a royal pain in the donkeys. You probably have to encapsulate (remember that from the data encapsulation section way back there at the beginning?) Read on for details!

6.4. Son of Data Encapsulation
What does it really mean to encapsulate data, anyway? In the simplest case, it means you'll stick a header on there with either some identifying information or a packet length, or both.

What should your header look like? Well, it's just some binary data that represents whatever you feel is necessary to complete your project.

Wow. That's vague.

Okay. For instance, let's say you have a multi-user chat program that uses SOCK_STREAMs. When a user types ("says") something, two pieces of information need to be transmitted to the server: what was said and who said it.

So far so good? "What's the problem?" you're asking.

The problem is that the messages can be of varying lengths. One person named "tom" might say, "Hi", and another person named "Benjamin" might say, "Hey guys what is up?"

So you send() all this stuff to the clients as it comes in. Your outgoing data stream looks like this:

    t o m H i B e n j a m i n H e y g u y s w h a t i s u p ?


And so on. How does the client know when one message starts and another stops? You could, if you wanted, make all messages the same length and just call the sendall() we implemented, above. But that wastes bandwidth! We don't want to send() 1024 bytes just so "tom" can say "Hi".

So we encapsulate the data in a tiny header and packet structure. Both the client and server know how to pack and unpack (sometimes referred to as "marshal" and "unmarshal") this data. Don't look now, but we're starting to define a protocol that describes how a client and server communicate!

In this case, let's assume the user name is a fixed length of 8 characters, padded with '\0'. And then let's assume the data is variable length, up to a maximum of 128 characters. Let's have a look a sample packet structure that we might use in this situation:


len (1 byte, unsigned) -- The total length of the packet, counting the 8-byte user name and chat data.

name (8 bytes) -- The user's name, NUL-padded if necessary.

chatdata (n-bytes) -- The data itself, no more than 128 bytes. The length of the packet should be calculated as the length of this data plus 8 (the length of the name field, above).

Why did I choose the 8-byte and 128-byte limits for the fields? I pulled them out of the air, assuming they'd be long enough. Maybe, though, 8 bytes is too restrictive for your needs, and you can have a 30-byte name field, or whatever. The choice is up to you.

Using the above packet definition, the first packet would consist of the following information (in hex and ASCII):

      0A     74 6F 6D 00 00 00 00 00      48 69
   (length)  T  o  m    (padding)         H  i



And the second is similar:

      14     42 65 6E 6A 61 6D 69 6E      48 65 79 20 67 75 79 73 20 77 ...
   (length)  B  e  n  j  a  m  i  n       H  e  y     g  u  y  s     w  ...



(The length is stored in Network Byte Order, of course. In this case, it's only one byte so it doesn't matter, but generally speaking you'll want all your binary integers to be stored in Network Byte Order in your packets.)

When you're sending this data, you should be safe and use a command similar to sendall(), above, so you know all the data is sent, even if it takes multiple calls to send() to get it all out.

Likewise, when you're receiving this data, you need to do a bit of extra work. To be safe, you should assume that you might receive a partial packet (like maybe we receive "00 14 42 65 6E" from Benjamin, above, but that's all we get in this call to recv()). We need to call recv() over and over again until the packet is completely received.

But how? Well, we know the number of bytes we need to receive in total for the packet to be complete, since that number is tacked on the front of the packet. We also know the maximum packet size is 1+8+128, or 137 bytes (because that's how we defined the packet.)

What you can do is declare an array big enough for two packets. This is your work array where you will reconstruct packets as they arrive.

Every time you recv() data, you'll feed it into the work buffer and check to see if the packet is complete. That is, the number of bytes in the buffer is greater than or equal to the length specified in the header (+1, because the length in the header doesn't include the byte for the length itself.) If the number of bytes in the buffer is less than 1, the packet is not complete, obviously. You have to make a special case for this, though, since the first byte is garbage and you can't rely on it for the correct packet length.

Once the packet is complete, you can do with it what you will. Use it, and remove it from your work buffer.

Whew! Are you juggling that in your head yet? Well, here's the second of the one-two punch: you might have read past the end of one packet and onto the next in a single recv() call. That is, you have a work buffer with one complete packet, and an incomplete part of the next packet! Bloody heck. (But this is why you made your work buffer large enough to hold two packets--in case this happened!)

Since you know the length of the first packet from the header, and you've been keeping track of the number of bytes in the work buffer, you can subtract and calculate how many of the bytes in the work buffer belong to the second (incomplete) packet. When you've handled the first one, you can clear it out of the work buffer and move the partial second packed down the to front of the buffer so it's all ready to go for the next recv().

(Some of you readers will note that actually moving the partial second packet to the beginning of the work buffer takes time, and the program can be coded to not require this by using a circular buffer. Unfortunately for the rest of you, a discussion on circular buffers is beyond the scope of this article. If you're still curious, grab

Page : << Previous 12  Next >>