TCP PORT FORWARD Summary

Overall Framework of the Program#

Introduction#

The program is single-threaded and non-blocking, and consists of two parts: a client and a server. It maps a TCP-based service (app) listening on a port (app port, such as 80) inside an internal network to a dynamic port (mapping port, such as 44567) on an external server.

So that the two sides can talk to each other, the server listens on a port (c-s port, such as 7080). Because of NAT, the server cannot initiate connections to the client, so the client first connects to the c-s port (7080). This connection is kept as masterfd and carries the server's requests asking the client to open new connections. When the server receives the client's request, it starts listening on a dynamic port (such as 44567), which becomes the mapping port for the internal service, and sends the port number to the client over masterfd.

After that, for each request the server receives, it asks the client over masterfd for a new connection. The client then sets up a data forwarding tunnel between the app port (such as 80) and the c-s port (such as 7080), and the server sets up a data forwarding tunnel between the c-s port and the mapping port. Together these connections implement port forwarding, and epoll is used to manage many such tunnels at once.

Data Structures#

Internally, the client and the server do the same thing: they receive data from one end and forward it to the other. These operations can therefore be abstracted into a single object.

// Wrapper for fd
// buf: the buffer used to receive data for this fd
// iptr, optr: pointers to record the read and write positions of the data
// fd_fin, closed, connected: variables to record the status of the fd
class fd_buf{
    public:
        int fd;
        char buf[MAXLINE];
        int iptr, optr;
        bool fd_fin,closed,connected;
        // uint32_t mod;
        fd_buf() {
            fd = -1;  // no descriptor attached yet
            iptr = optr = 0;
            fd_fin = false;
            connected = false;
            closed = false;
            // mod = EPOLLIN;
        }
        fd_buf(int fd){
            iptr = optr = 0;
            fd_fin = false;
            connected = false;
            closed = false;
            // mod = EPOLLIN;
            this->fd = fd;
        }

        bool is_buf_empty() {
            return iptr == optr;
        }

        bool is_buf_full() {
            return iptr == MAXLINE;
        }
        

};

// Object used for data forwarding
// From the user's perspective, one end is for upstream and the other end is for downstream
class tunnel{
    public:
        fd_buf *fd_up, *fd_down;
        double sum_up, sum_down;
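        tunnel() {
            // Null so that the destructor's delete is safe on a default-constructed tunnel
            fd_up = fd_down = nullptr;
            sum_up = 0;
            sum_down = 0;
        }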
        tunnel(int fd_user, int fd_app) {
            sum_up = 0;
            sum_down = 0;
            fd_up = new fd_buf(fd_user);
            fd_down = new fd_buf(fd_app);
        }
        int do_read(fd_buf* fd_buf_in, fd_buf* pair_fd_buf);

        int do_write(fd_buf* fd_buf_out, fd_buf* pair_fd_buf);

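        void fd_user_read(){
            int n = do_read(fd_up, fd_down);
            // Count only bytes actually read (assuming do_read returns <= 0 on EOF or error, like read())
            if (n > 0) sum_up += n;
        }

        void fd_app_read(){
            do_read(fd_down, fd_up);
        }

        void fd_user_write(){
            int n = do_write(fd_up, fd_down);
            // Count only bytes actually written (same assumption as above)
            if (n > 0) sum_down += n;
        }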

        void fd_app_write(){
            do_write(fd_down, fd_up);
        }
        
        bool is_user_fd(int fd){
            return fd == fd_up->fd;
        }
        ~tunnel() {
            delete fd_down;
            delete fd_up;
        }


};
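
The class above declares do_read and do_write without showing their bodies. The following is only a rough sketch of the receive-then-forward idea, not the author's implementation. It assumes that the bytes in [optr, iptr) are the ones waiting to be forwarded. The names relay_buf, fill_from, drain_to, and kBufSize are hypothetical, and kBufSize merely stands in for MAXLINE.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

constexpr std::size_t kBufSize = 4096;  // assumed value standing in for MAXLINE

// Hypothetical stand-in for fd_buf: bytes in [optr, iptr) are waiting to be forwarded.
struct relay_buf {
    char buf[kBufSize];
    std::size_t iptr = 0;  // next free slot for incoming data
    std::size_t optr = 0;  // next byte to forward
    bool empty() const { return iptr == optr; }
    bool full() const { return iptr == kBufSize; }
};

// Read once from fd into the free tail of the buffer.
// Returns bytes read, 0 on EOF, -1 on error (errno is ENOBUFS if the buffer is full).
ssize_t fill_from(int fd, relay_buf& b) {
    assert(fd >= 0);
    if (b.full()) { errno = ENOBUFS; return -1; }
    ssize_t n = read(fd, b.buf + b.iptr, kBufSize - b.iptr);
    if (n > 0) b.iptr += static_cast<std::size_t>(n);
    return n;
}

// Write once from the buffer to fd.
// Returns bytes written, 0 if there was nothing to send, -1 on error.
ssize_t drain_to(int fd, relay_buf& b) {
    assert(fd >= 0);
    if (b.empty()) return 0;
    ssize_t n = write(fd, b.buf + b.optr, b.iptr - b.optr);
    if (n > 0) {
        b.optr += static_cast<std::size_t>(n);
        if (b.optr == b.iptr) b.iptr = b.optr = 0;  // everything sent: rewind to reuse the buffer
    }
    return n;
}
```

A tunnel would call fill_from on one fd and drain_to on its peer. Rewinding both positions once the buffer is drained is one simple way to reuse the space without a ring buffer.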

Non-blocking#

Why do calls block? For a single connection, a later step cannot proceed correctly until the current step has finished, so the call waits. Why make them non-blocking? Because we want to manage multiple connections: in a single-threaded program, one blocked connection stalls all the others. There are two common solutions. One is multi-threading, with one thread per connection so that connections do not affect each other. The other is I/O multiplexing: use epoll/select to monitor the socket fds and report which ones are ready, so that no single connection holds up the rest. In TCP socket programming, connect, accept, read, and write can all block; setting the sockets to non-blocking prevents one socket fd from blocking the others and improves performance.
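
As a small illustration of the I/O multiplexing approach, the sketch below registers several fds with one epoll instance and asks which of them are ready, so a single thread never waits on one particular connection. The helper names watch_readable and ready_fds are hypothetical and are not taken from the program described here.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>
#include <vector>

// Register fd for readability on the epoll instance epfd.
// Returns 0 on success, -1 on error.
int watch_readable(int epfd, int fd) {
    assert(epfd >= 0 && fd >= 0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

// Wait up to timeout_ms and return the fds that can be read without blocking
// (including fds with a pending hang-up or error, which read() will report).
std::vector<int> ready_fds(int epfd, int timeout_ms) {
    epoll_event events[16];
    std::vector<int> out;
    int n = epoll_wait(epfd, events, 16, timeout_ms);
    for (int i = 0; i < n; ++i) {  // n is -1 on error, so the loop is skipped
        if (events[i].events & (EPOLLIN | EPOLLHUP | EPOLLERR))
            out.push_back(events[i].data.fd);
    }
    return out;
}
```

The event loop then only calls read on the fds returned by ready_fds, which is what keeps an idle connection from stalling the active ones.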

setnonblocking function#

// A setnonblocking function that works across multiple systems
#include <fcntl.h>
#include <sys/ioctl.h>

int setnonblocking(int fd) {
    if (fd < 0) return -1;
    int flags;
    /* If they have O_NONBLOCK, use the Posix way to do it */
    #if defined(O_NONBLOCK)
        /* Fixme: O_NONBLOCK is defined but broken on SunOS 4.1.x and AIX 3.2.5. */
        if (-1 == (flags = fcntl(fd, F_GETFL, 0)))
            flags = 0;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    #else
        /* Otherwise, use the old way of doing it */
        flags = 1;
        return ioctl(fd, FIONBIO, &flags);
    #endif
}

Non-blocking connect#

Reasons for blocking#

The connect function starts TCP's three-way handshake by sending a SYN to the server, and a blocking connect does not return until the handshake completes or fails. If the server does not reply within a certain period of time, the client retransmits the SYN, and this can go on for several tens of seconds.

How to set it to non-blocking#

Call setnonblocking(fd) before calling connect.

Handling non-blocking connect#

When the fd is set to non-blocking before connect and the TCP connection cannot be completed immediately, connect returns -1 with errno set to EINPROGRESS. The fd then becomes writable when the connection succeeds, and both readable and writable when the connection fails. Because writability alone does not distinguish the two cases, the usual check is to call getsockopt() with SO_ERROR once the fd is reported ready and confirm that no error is pending.

int error = 0;
socklen_t len = sizeof(error);

if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) == 0 && error == 0) {
    printf("connect success\n");
} else {
    printf("connect failed\n");
}
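
Putting the steps together, a complete non-blocking connect might look like the sketch below. It is an illustration under assumptions rather than the program's actual code: connect_nonblocking is a hypothetical helper, it handles IPv4 only, and it waits with poll for brevity where the real program would register EPOLLOUT with epoll.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Connect to addr without blocking in connect() itself.
// Returns the connected fd, or -1 on failure or timeout.
int connect_nonblocking(const sockaddr_in& addr, int timeout_ms) {
    assert(addr.sin_family == AF_INET);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Set non-blocking before calling connect.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    int rc = connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr));
    if (rc == 0) return fd;  // completed immediately
    if (errno != EINPROGRESS) {
        close(fd);
        return -1;
    }

    // Handshake in progress: wait until the fd is writable (success or failure).
    pollfd pfd{fd, POLLOUT, 0};
    if (poll(&pfd, 1, timeout_ms) <= 0) {  // timeout, or poll failed (EINTR is not retried here)
        close(fd);
        return -1;
    }

    // Writable does not imply success; SO_ERROR tells the two cases apart.
    int error = 0;
    socklen_t len = sizeof(error);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) < 0 || error != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned fd stays in non-blocking mode, so it can be handed directly to the event loop.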

Non-blocking accept#

Reasons for blocking#

Normally, when using epoll/select, there is no need to worry about accept blocking, because accept is only called after epoll/select has reported the listening socket as ready. There is one special case, however: a client initiates a connection and epoll/select reports the event, but before the server handles it the client aborts the connection. On some implementations the kernel then removes the aborted connection from the queue, so when the server finally calls accept there is nothing to accept, and the entire program blocks until a new connection arrives.

Handling method#

Set the corresponding listenfd to non-blocking;
Ignore EWOULDBLOCK, ECONNABORTED, EPROTO, EINTR errors after accept.
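
The two steps above can be sketched as follows. The helper names listen_nonblocking and accept_pending are hypothetical, the listener is bound to 127.0.0.1 only to keep the example self-contained, and making each accepted fd non-blocking is an added assumption about what the caller wants.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Step 1: create a listening socket on 127.0.0.1 and make it non-blocking.
// Port 0 lets the kernel pick a free port. Returns the fd, or -1 on error.
int listen_nonblocking(std::uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
        bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Step 2: accept everything that is pending, skipping the transient errors.
// Accepted fds are appended to out; returns how many were accepted.
int accept_pending(int listenfd, std::vector<int>& out) {
    assert(listenfd >= 0);
    int count = 0;
    for (;;) {
        int fd = accept(listenfd, nullptr, nullptr);
        if (fd >= 0) {
            int fl = fcntl(fd, F_GETFL, 0);
            if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0) {
                close(fd);  // could not make it non-blocking: drop it
                continue;
            }
            out.push_back(fd);
            ++count;
            continue;
        }
        // Aborted or interrupted: ignore and try the next pending connection.
        if (errno == EINTR || errno == ECONNABORTED || errno == EPROTO) continue;
        // EAGAIN/EWOULDBLOCK means the queue is empty; any other errno is a
        // real error the caller can inspect. Either way, stop here.
        break;
    }
    return count;
}
```

Because the listening socket is non-blocking, accept_pending returns as soon as the queue is empty, even if the connection that triggered the event has already been aborted.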

Non-blocking read and write#

Conditions for determining readability and writability#

Readable

Data is available to read
Received FIN packet from the other end
New connection arrived (on a listening socket)
Pending errors

Writable

Space is available for writing
Pending errors

Non-blocking handling#

Call setnonblocking(fd), then use epoll to monitor the fd for readable and writable events. When an event fires and read or write returns -1, EAGAIN and EWOULDBLOCK should not be treated as failures in the error handling: they only mean that the operation would have blocked, and it can be retried on the next event.
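
A minimal sketch of this error handling on the read side is shown below. The names read_status and drain_fd are hypothetical, and the sketch assumes the fd has already been made non-blocking (for example with setnonblocking).

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

enum class read_status { would_block, eof, error };

// Read everything currently available from a non-blocking fd into out.
// would_block: no more data for now, wait for the next readable event.
// eof:         the peer closed its end (read returned 0).
// error:       a real error; errno holds the cause.
read_status drain_fd(int fd, std::string& out) {
    assert(fd >= 0);
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == 0) return read_status::eof;
        if (errno == EINTR) continue;  // interrupted by a signal: retry
        if (errno == EAGAIN || errno == EWOULDBLOCK) return read_status::would_block;
        return read_status::error;
    }
}
```

Only the error result calls for closing the connection; would_block simply hands control back to the event loop.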

EPOLL Event-driven#

Taking tunnel read as an example

 // Preconditions: fd has reported EPOLLIN; the status of fd_peer is unknown
 n = read(fd)
 if (n < 0) {
     // Errors caused by non-blocking (EAGAIN, EWOULDBLOCK) do not need to be handled
     if (errno != EAGAIN && errno != EWOULDBLOCK) {
         // Handle the conditions that can be handled immediately under the current conditions
         // Then classify the processing based on the status of the peer and the application buffer

         // After closing fd, epoll will automatically stop monitoring that fd.
         close(fd)

         // If the current buffer is not empty and the peer is not closed, need to listen for the peer's writability
         if (!fd_peer.closed && fd.buf not empty) {
             enable fd_peer EPOLLOUT
             // Because fd has been closed, the peer's readability needs to be disabled
             disable fd_peer EPOLLIN
         }
     }
 }
 else if (n == 0) {
     if (fd_peer.buf not empty) {
         enable fd EPOLLOUT
     }

     if (!fd_peer.closed && fd.buf is empty) {
         enable fd_peer EPOLLIN
     }

 }
 else {
     if (fd.buf is empty) {
         if (fd_peer.buf not full && !fd_peer.fd_fin) {
             enable fd_peer EPOLLOUT|EPOLLIN
         }
         else {
             enable fd_peer EPOLLOUT
         }
     }

     if (fd.buf is full) {
         if (fd_peer.buf not empty) {
             enable fd EPOLLOUT
         }
         else {
             disable fd EPOLLOUT
         }
     }
 }

Handling the Characteristics of TCP Stream Sockets#

TCP guarantees that data is delivered to the other end in order and without errors (if an error occurs, the data is retransmitted or the error is reported), but it does not preserve message boundaries: the number of bytes sent in one send call need not match the number of bytes returned by one receive call on the other end. An application-layer protocol built on TCP must take this characteristic into account.

There are generally two solutions. One is fixed-length messages: a fixed number of bytes is defined as one message. The receiver buffers the bytes it receives, and if a full message has not arrived yet, it keeps receiving from the other end. This method is inflexible and is not suitable for transmitting complex application-layer data.
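
To make the fixed-length approach concrete, here is a small sketch of a receiver-side buffer. The class name fixed_framer and the 8-byte message size kMsgLen are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kMsgLen = 8;  // assumed fixed message size

// Buffers incoming bytes and hands back every complete kMsgLen-byte message.
class fixed_framer {
public:
    // Feed whatever one read() returned; any partial message stays buffered.
    std::vector<std::string> feed(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        pending_.append(data, len);
        std::vector<std::string> msgs;
        std::size_t pos = 0;
        while (pending_.size() - pos >= kMsgLen) {
            msgs.emplace_back(pending_, pos, kMsgLen);
            pos += kMsgLen;
        }
        pending_.erase(0, pos);  // keep only the incomplete tail
        return msgs;
    }

private:
    std::string pending_;
};
```

Feeding 5 bytes and then 8 more yields one message and leaves 5 bytes buffered, regardless of how the sender split its writes.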

The other solution is to design a packet header. Part of the header carries the fields of the user's own protocol, and a length field is added when sending. The receiver first parses the header to obtain the length of the data, then receives that many bytes, after which it can parse the next header.
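
A minimal sketch of the header-with-length approach follows. The header layout is an assumption made for illustration (only a 4-byte big-endian length field, with no other protocol fields), and encode_frame and length_framer are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kHeaderLen = 4;  // assumed header: 4-byte big-endian payload length

// Sender side: prepend the length header to the payload.
std::string encode_frame(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Receiver side: buffer bytes, parse a header, then wait for that many payload bytes.
class length_framer {
public:
    std::vector<std::string> feed(const char* data, std::size_t len) {
        pending_.append(data, len);
        std::vector<std::string> msgs;
        std::size_t pos = 0;
        while (pending_.size() - pos >= kHeaderLen) {
            std::uint32_t n = 0;
            for (std::size_t i = 0; i < kHeaderLen; ++i)
                n = (n << 8) | static_cast<unsigned char>(pending_[pos + i]);
            if (pending_.size() - pos - kHeaderLen < n) break;  // payload incomplete
            msgs.emplace_back(pending_, pos + kHeaderLen, n);
            pos += kHeaderLen + n;
        }
        pending_.erase(0, pos);  // keep the unparsed remainder
        return msgs;
    }

private:
    std::string pending_;
};
```

A production protocol would also cap the accepted length so that a corrupt or hostile header cannot make the receiver buffer an unbounded amount of data; that check is omitted here for brevity.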