Translated by AI
Revisiting TCP/UDP Header Knowledge
Background and Purpose
Since I originally came from an application development background, I hadn't really been conscious of this area and hadn't studied TCP/UDP in depth...
However, I recently witnessed a friend encounter trouble related to TCP knowledge while implementing an application that executes APIs. This made me realize that baseline knowledge is necessary, so I started studying.
I've decided to organize this information for my own future reference.
What are TCP and UDP?
TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are protocols in the "Transport Layer" that control data communication in IP networks such as the Internet.
TCP and UDP are protocols used for different purposes—namely "reliability" versus "transfer speed and real-time performance." One is not inherently better than the other; they are chosen and used as needed.
Overview of Header Characteristics:
- TCP Header: Contains control information (20–60 bytes) to ensure "reliability and order control."
- UDP Header: Contains only the bare minimum information (fixed 8 bytes) to prioritize "transfer speed and real-time performance."
1. TCP Header
The TCP header is designed for reliability. It carries the fields needed to establish a connection with the peer (3-way handshake), acknowledge received data so that lost packets can be retransmitted, restore the original order, and perform flow control.
Its size depends on whether options are present: 20 bytes at minimum, 60 bytes at maximum.
Header Components
- Source Port Number (16-bit): Identifies the return address (source) of the communication.
- Destination Port Number (16-bit): Identifies the destination.
- Sequence Number (32-bit): Manages the order of sent data and is used for correct reordering on the receiving side.
- ACK Number (32-bit): An acknowledgment number that indicates the starting position of the data the receiver wants to receive next, confirming data arrival.
- Data Offset (4-bit): Indicates the header length in 32-bit words (needed because options make the header length variable).
- Control Flags (1-bit each): A group of important flags that control the communication state.
  - SYN: Connection request
  - ACK: Acknowledgment
  - FIN: Connection termination
  - RST: Forced connection reset
  - PSH: Data push function
  - URG: Presence of urgent data
- Window Size (16-bit): Advertises how much data the receiver can currently accept (free buffer space); used for flow control.
- Checksum (16-bit): Verifies the integrity of the header and data (presence of errors).
- Urgent Pointer (16-bit): Indicates the location of urgent data when the URG flag is set.
- Options (Variable): Used for notifying MSS (Maximum Segment Size), etc.
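To make the field layout above concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's `struct` module. The function name `parse_tcp_header` and the sample segment are my own illustration, not taken from any library. The sketch only reads the six flags listed above and returns options as raw bytes without decoding them.

```python
import struct

# Bit masks for the six flags listed above (low bits of the offset/flags word).
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # Data Offset is counted in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for name, bit in TCP_FLAGS.items() if offset_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_len],  # raw option bytes, left undecoded here
    }


# Hypothetical SYN segment: port 50000 -> 443, no options (Data Offset = 5 words).
sample_syn = struct.pack("!HHIIHHHH", 50000, 443, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
parsed = parse_tcp_header(sample_syn)
print(parsed["flags"], parsed["header_len"])  # ['SYN'] 20
```

The format string `"!HHIIHHHH"` mirrors the list: two 16-bit ports, two 32-bit numbers, then four 16-bit fields, all in network byte order.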
2. UDP Header
UDP prioritizes speed: it performs no retransmission or order control, which keeps header overhead to a minimum. It follows a best-effort model in which data is simply sent out, making it suitable for latency-sensitive communication such as DNS, VoIP (voice calls), and streaming.
The header is a simple, fixed 8-byte structure. Unlike TCP, it has no fields for arrival confirmation or retransmission.
Header Components
- Source Port Number (16-bit): May be omitted (0) if a reply is not required.
- Destination Port Number (16-bit): Identifies the destination application.
- Length (16-bit): Indicates the total length of the UDP datagram in bytes, including the 8-byte header.
- Checksum (16-bit): Verifies data integrity (optional in IPv4, where a value of 0 means no checksum was computed; mandatory in IPv6).
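The four fields above can be packed and unpacked in a few lines. This is a rough sketch for study purposes: `build_udp_datagram` and `parse_udp_header` are names I made up, and the checksum is left at 0 (the IPv4 "not computed" value) because a real checksum also covers an IP pseudo-header that is not modeled here.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to payload (checksum left at 0)."""
    length = UDP_HEADER_LEN + len(payload)  # Length covers header + data
    checksum = 0  # assumption: skipped, which is only allowed over IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def parse_udp_header(datagram: bytes) -> dict:
    """Split a UDP datagram into its header fields and payload."""
    if len(datagram) < UDP_HEADER_LEN:
        raise ValueError("a UDP header is always 8 bytes long")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:UDP_HEADER_LEN])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER_LEN:length],
    }


# Hypothetical datagram: source port 0 means "no reply expected".
datagram = build_udp_datagram(0, 53, b"hello")
fields = parse_udp_header(datagram)
print(fields["length"], fields["payload"])  # 13 b'hello'
```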
Comparison of Major Differences
| Item | TCP Header | UDP Header |
|---|---|---|
| Standard Size | 20 bytes (up to 60 bytes with options) | 8 bytes (fixed) |
| Main Functions | Order control, retransmission control, flow control | Port specification, error detection |
| Overhead | High | Very low |
| Reliability | High (with acknowledgment) | Low (no acknowledgment) |
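To put a number on the "Overhead" row, the following small calculation compares the share of bytes taken by the transport header for a given payload. The 100-byte payload is an arbitrary value chosen for illustration, and the IP header is deliberately ignored so that only the TCP/UDP difference shows.

```python
def header_overhead(header_bytes: int, payload_bytes: int) -> float:
    """Fraction of the transport-layer bytes that is header rather than data."""
    return header_bytes / (header_bytes + payload_bytes)


PAYLOAD = 100  # bytes, arbitrary example value

for label, header in (("TCP (no options)", 20), ("TCP (max options)", 60), ("UDP", 8)):
    print(f"{label}: {header_overhead(header, PAYLOAD):.1%}")
# TCP (no options): 16.7%
# TCP (max options): 37.5%
# UDP: 7.4%
```

The gap narrows as the payload grows, so the header-size difference matters most for small, frequent messages.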
Summary
The biggest difference between TCP and UDP is the design philosophy: "Whether to guarantee communication reliability at the protocol level or delegate it to the application side."
Why Is Differentiation Necessary?
While TCP is highly versatile and reliable, it incurs overhead from the back-and-forth "interactions" required for the 3-way handshake and retransmission control.
UDP, on the other hand, strips away everything except "just sending."
When to Choose TCP:
When data loss directly leads to critical bugs.
(API communication, database operations, file transfers)
When to Choose UDP:
When real-time performance directly impacts the user experience, and latency is a bigger issue than the loss of a few packets.
(Voice calls, live streaming, coordinate synchronization in multiplayer games)
Application to Troubleshooting
When encountering network trouble in application development, knowledge of header information becomes a powerful weapon.
Examples:
- Connection fails to establish: Check if the TCP header's "Control Flags (SYN/ACK)" are being exchanged correctly.
- Poor performance: Suspect flow control based on TCP "Window Size" or frequent retransmissions.
- Data does not arrive: In the case of UDP, the protocol does not perform retransmissions, so retry logic must be implemented at the application layer.
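For the last case, application-layer retry over UDP could look roughly like the sketch below. It assumes a simple request/reply exchange in which the peer answers every datagram. `send_with_retry`, the retry count, and the timeout are illustrative choices of mine, not a standard API.

```python
import socket


def send_with_retry(sock, payload: bytes, addr, retries: int = 3, timeout: float = 0.5) -> bytes:
    """Send a datagram and wait for a reply, resending when the wait times out."""
    sock.settimeout(timeout)
    for _attempt in range(retries):
        sock.sendto(payload, addr)
        try:
            reply, _peer = sock.recvfrom(2048)
            return reply
        except socket.timeout:
            continue  # nothing arrived in time: assume loss and resend
    raise TimeoutError(f"no reply from {addr} after {retries} attempts")


# Usage (hypothetical peer address; needs a UDP server that replies):
# with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
#     reply = send_with_retry(sock, b"ping", ("127.0.0.1", 9999))
```

Note that a resend can reach the peer twice if only the reply was lost, so the request should be safe to repeat or carry an ID that lets the receiver discard duplicates.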
Conclusion
In modern application development, I came to feel that understanding how the lower layers work directly improves both performance optimization and the ability to isolate the cause of a failure.
In particular, the incident involving my friend that triggered this study was NAT port exhaustion in a job that called a large number of APIs. For performance reasons, each API call was issued without waiting for the previous response, and ports that had finished being used were not released immediately, so the available ports ran out.
Introducing a connection pool resolved the issue: it eliminated unnecessary connection establishment traffic, and the API's performance improved as well.
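As a rough illustration of the idea (not my friend's actual implementation), a connection pool caps the number of open connections and hands finished ones to the next caller instead of opening new ones. `ConnectionPool`, `FakeConnection`, and the pool size below are all hypothetical names and values for this sketch.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hand out at most `size` connections and reuse them between calls."""

    def __init__(self, factory, size: int):
        self._factory = factory
        self._idle = queue.LifoQueue(maxsize=size)
        for _ in range(size):
            self._idle.put(None)  # empty slot: the connection is created on first use

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while all `size` connections are in use
        if conn is None:
            conn = self._factory()
        try:
            yield conn
        except Exception:
            conn = None  # drop a possibly broken connection; the slot is refilled later
            raise
        finally:
            self._idle.put(conn)


class FakeConnection:
    """Stand-in for a real client connection, used only to count creations."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1


pool = ConnectionPool(FakeConnection, size=4)
for _ in range(100):
    with pool.connection() as conn:
        pass  # a real job would send one API request over `conn` here
print(FakeConnection.created)  # 1
```

With a real client, the factory might be something like `lambda: http.client.HTTPSConnection("api.example.com")` (hypothetical host). The key point is that 100 calls share a handful of connections, so only a handful of source ports are ever in use.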
I felt that aiming to be an engineer who can understand communication requirements and select or control the optimal protocol—rather than just going with "TCP for now"—will become a strength that crosses the boundary between infrastructure and applications.