The IP protocol (Internet Protocol)

The Internet Protocol (IP) is one of the most widely used protocols today. You may not realize it, but you are using IP at this very instant. To understand how, please see the diagram below, which is usually called the networking stack diagram:


The stacked organisation implies that whenever you use a high-level protocol (as you are doing right now with HTTP, the web protocol), a number of other protocols are at work underneath it to allow HTTP to function. Examples include ARP, DNS, IP and TCP, but of these only IP and TCP actually carry the HTTP traffic. The others, such as DNS and ARP, are supporting protocols: they help to set up the communication (DNS resolves names to IP addresses, ARP maps IP addresses to hardware addresses) rather than carrying the HTTP data themselves. This will be clarified even further later on, so keep reading.



This is an important observation, because it means that if you can capture the packets of the lower-level protocols, you can also analyse and decipher the higher-level protocols carried inside them. Let's focus on IP, the Internet Protocol.

The Internet Protocol (IP) is the basis of internet-related protocols. It is a packet-switching network communications protocol that allows hosts to be addressed within a network or even across an inter-network (internet). Its delivery is unreliable: IP itself does not guarantee that a packet arrives, so reliability is left to higher-level protocols such as TCP. This is what the structure of an IP network packet looks like:


A number of things of interest can be identified when observing the structure. Every IP packet has one source and one destination address. There are a number of other fields, relating to the correct operation of the IP protocol, including:
  1. Version number
  2. Flags, signifying various things, such as whether packet fragmentation is allowed
  3. Header checksum, allowing for header validation
  4. TTL, i.e. Time-To-Live, which describes the maximum number of hops allowed for a packet as it is routed towards its destination (each router decrements it, and the packet is discarded when it reaches zero)
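
To make the fields listed above more concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of an IPv4 header and validates its checksum. The function names (ipv4_checksum, parse_ipv4_header, build_sample_header) and the sample addresses are made up for illustration, and the sketch ignores IP options and IPv6.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then inverted."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, _tos, total_len, _ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "header_length": header_len,
        "total_length": total_len,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "ttl": ttl,
        "protocol": proto,  # 6 = TCP, 17 = UDP
        "checksum": checksum,
        # Summing a valid header, checksum field included, yields zero.
        "checksum_ok": ipv4_checksum(packet[:header_len]) == 0,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


def build_sample_header(ttl: int = 64) -> bytes:
    """Hypothetical header for testing; the addresses are documentation-only."""
    fields = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, ttl, 6, 0,
                         socket.inet_aton("192.0.2.1"),
                         socket.inet_aton("192.0.2.2"))
    # The checksum is computed with its own field set to zero, then patched in.
    return fields[:10] + struct.pack("!H", ipv4_checksum(fields)) + fields[12:]
```

Calling parse_ipv4_header(build_sample_header()) returns a dictionary with the version (4), the "don't fragment" flag, the TTL and both addresses, plus a checksum_ok entry that turns False if any header byte is altered afterwards.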

The Internet Protocol is itself 'layered' over hardware-level protocols, such as Ethernet and the 802.11 family, so it is independent of the underlying technology. Some examples of how addressing works follow:

127.0.0.1 => the local machine (also known as 'localhost')
212.58.226.75 => BBC, news sub-domain (i.e. news.bbc.co.uk)
161.74.14.28 => University of Westminster primary web domain (i.e. www.wmin.ac.uk)
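
The dotted notation used above is just a readable way of writing a 32-bit number. As a small illustration, the following Python sketch uses the standard ipaddress module to show what an address looks like underneath; the helper name describe is an assumption made for this example.

```python
import ipaddress


def describe(address: str) -> dict:
    """Show the raw forms behind a dotted IPv4 address."""
    ip = ipaddress.IPv4Address(address)
    return {
        "as_integer": int(ip),         # the address as one 32-bit number
        "as_bytes": ip.packed,         # the 4 bytes that travel in the IP header
        "is_loopback": ip.is_loopback  # True for 127.0.0.0/8, i.e. the local machine
    }
```

For instance, describe("127.0.0.1") reports the integer 2130706433, the bytes 7f 00 00 01, and that the address is a loopback address.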

So in other words, the human-friendly names that you use to identify websites and other hosts are artificial and map to one or more IP addresses. To find the IP address of a host, you may use various tools provided by your operating system, such as ping, traceroute or nslookup. You may also use a web-based WHOIS lookup service to do so.
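
The same name-to-address lookup can also be done programmatically. Below is a minimal Python sketch that asks the operating system's resolver for the IPv4 addresses behind a name; resolve_ipv4 is a hypothetical helper name, and the answer for any real hostname depends on your DNS configuration at the time you run it.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the distinct IPv4 addresses the system resolver gives for a name."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

A call such as resolve_ipv4("news.bbc.co.uk") performs a real lookup and may return several addresses, which is why a single name can map to more than one host.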

IP is the most prevalent protocol for network communications on the internet. Why?
Because it is very mature (it dates back to the early 1980s), it is standardized, it comes for free with all major computing operating systems and now some mobile ones, and it is fast and efficient.

IP is a binary protocol. Examining it using packet sniffing tools such as Wireshark (see related blog post) whilst fetching a web-page yields the image shown on the left. The actual data is shown on the right part of the image, whereas the hex representation is on the left.
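
To get a feel for this kind of view without a packet sniffer, here is a small Python sketch that prints raw bytes in a similar two-column layout, with hex on the left and printable characters on the right. The function name hexdump and the exact column layout are choices made for this example, not Wireshark's own format.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex column and printable-text column."""
    pad = width * 3 - 1  # room for 'width' two-digit values plus separators
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as packet sniffers usually do.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)
```

For example, hexdump(b"GET / HTTP/1.1\r\n") produces the single line `0000  47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a  GET / HTTP/1.1..`.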






Each coloured section of the image on the left corresponds to a different protocol. The cyan section is 16 bytes in length and contains Ethernet frame data (i.e. hardware-related communications data). The yellow section is IP and TCP protocol data, and the red section is HTTP data.
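
The boundary between the yellow (IP and TCP) and red (HTTP) sections is not fixed; it is derived from length fields inside the headers themselves. The Python sketch below shows one way to find it, assuming the link-layer (Ethernet) part has already been stripped and the packet is IPv4 carrying TCP. The function names and the sample packet are made up for illustration, and the sample leaves both checksums at zero for brevity.

```python
import struct


def split_ip_tcp_payload(packet: bytes) -> tuple[bytes, bytes, bytes]:
    """Split an IPv4/TCP packet into (IP header, TCP header, payload)."""
    ip_len = (packet[0] & 0x0F) * 4  # IP header length, in bytes
    if packet[9] != 6:               # protocol field: 6 means TCP
        raise ValueError("not a TCP segment")
    total_len = struct.unpack("!H", packet[2:4])[0]
    tcp_len = (packet[ip_len + 12] >> 4) * 4  # TCP data offset, in bytes
    payload_start = ip_len + tcp_len
    return (packet[:ip_len],
            packet[ip_len:payload_start],
            packet[payload_start:total_len])


def build_sample_packet(payload: bytes) -> bytes:
    """Hypothetical packet for testing; checksums are left as zero."""
    tcp = struct.pack("!HHIIBBHHH", 49152, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp) + len(payload),
                     0, 0x4000, 64, 6, 0,
                     bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
    return ip + tcp + payload
```

Feeding it build_sample_packet(b"GET / HTTP/1.1\r\n\r\n") gives back two 20-byte headers and the original HTTP request text, which is the same peeling of layers that the colours in the capture illustrate.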