
Introduction to VoIP

 ·  🎃 kr0m

Many of you may already have an idea of what VoIP is, but in this article we will clarify certain terms that will help us understand more complex concepts in the future.


CODECS:

To begin with, the voice is digitized using an algorithm called a codec. The chosen codec determines the audio quality and the degree of compression. The bandwidth used by the most common codecs is:

Audio codec   Bandwidth
GSM           13 kbps
G711          64 kbps
G721          32 kbps
G722          64 kbps
G722.1        24/32 kbps
G723          24/40 kbps
G723.1        5.6/6.3 kbps
G726          16/24/32/40 kbps
G727          Variable
G728          16 kbps
G729          8 kbps
LPC10         2.4 kbps
Speex         2.15 to 44.2 kbps (variable; sampled at 8/16/32 kHz)
iLBC          13.33/15.2 kbps
DoD CELP      4.8 kbps
EVRC          9.6/4.8/1.2 kbps
DVI           32 kbps
L16           128 kbps

Video codec   Bandwidth
H261          Between 40 kbps and 2 Mbps
H263          From less than 64 kbps up to 583.9 Mbps without compression
H263p         From less than 64 kbps up to 583.9 Mbps without compression
H264          Between 64 kbps and 960 Mbps
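The figures above are codec payload rates. On the wire, every RTP packet also carries IP, UDP and RTP headers, so the real consumption is higher. A minimal sketch of that arithmetic, assuming IPv4, no link-layer overhead and a packetization time of 20 ms (the function name and defaults are ours, not part of any library):

```python
# Header sizes in bytes for IPv4 + UDP + RTP (no extensions, no link-layer overhead).
IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def wire_bandwidth_kbps(codec_kbps: float, ptime_ms: int = 20) -> float:
    """Approximate IP-level bandwidth of one RTP stream, in kbps, per direction."""
    packets_per_second = 1000 / ptime_ms
    # Bytes of encoded audio carried by each packet.
    payload_bytes = codec_kbps * 1000 / 8 / packets_per_second
    packet_bytes = payload_bytes + IP_HEADER + UDP_HEADER + RTP_HEADER
    return packet_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    for name, rate in [("G711", 64), ("G729", 8)]:
        print(f"{name}: {wire_bandwidth_kbps(rate):.1f} kbps per direction")
```

Under these assumptions G711 needs about 80 kbps per direction and G729 about 24 kbps, which shows that the header overhead weighs far more on low-bitrate codecs.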
 

SIP:

SIP (Session Initiation Protocol) is the signaling protocol used in VoIP to set up calls, hang up, etc.

SIP is a plain text protocol very similar to HTTP. Thanks to this, we can “grep” certain patterns in real time using ngrep, a tool that will be very useful to us later on.
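Because SIP is plain text, a message can be taken apart with ordinary string handling. The following is only a sketch: the example message, its addresses and the function name are made up for illustration, and real parsers must also handle repeated headers (such as several Via lines), compact header names and folded lines.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into its start line, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Simplification: a repeated header overwrites the previous one.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical INVITE, similar to what a capture would show.
EXAMPLE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@192.0.2.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    start_line, headers, body = parse_sip_message(EXAMPLE)
    print(start_line)
    print(headers["call-id"])
```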

SDP:

SDP (Session Description Protocol) is carried in the body of SIP messages and describes the media session: the codecs each side offers, in order of preference, and the IP address and port where it expects to receive the RTP.
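To see how the codec order is expressed, here is a small sketch that reads the m=audio line and the a=rtpmap attributes of an SDP body. The SDP shown is a hypothetical offer and the function name is ours; payload types without an rtpmap line are reported as static types rather than resolved.

```python
def offered_codecs(sdp: str):
    """Return the audio codecs in the order listed on the m=audio line."""
    payload_types = []
    rtpmap = {}
    for line in sdp.splitlines():
        if line.startswith("m=audio"):
            # m=audio <port> <proto> <pt> <pt> ...  (order = preference)
            payload_types = line.split()[3:]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<pt> <encoding name>/<clock rate>
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            rtpmap[pt] = encoding
    return [rtpmap.get(pt, f"static payload type {pt}") for pt in payload_types]


# Hypothetical SDP offer: G711 A-law preferred, then G711 u-law, then DTMF events.
EXAMPLE_SDP = """v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 8 0 101
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
"""

if __name__ == "__main__":
    print(offered_codecs(EXAMPLE_SDP))
```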

RTP:

RTP (Real-time Transport Protocol) is the data stream that carries the digitized voice. Through SIP, we make the destination ring and manage the call, and through RTP, we deliver the audio.

It should be noted that the RTP flow does not have to pass through our PBX: it is a separate flow, independent of the SIP signaling, so RTP can travel directly between the two clients.
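Each RTP packet starts with a 12-byte fixed header followed by the encoded audio. As an illustration, this sketch decodes that header with the standard struct module; the sample packet is fabricated (payload type 0, 160 bytes of dummy audio) and header extensions or CSRC lists are not handled.

```python
import struct


def parse_rtp_header(packet: bytes):
    """Decode the 12-byte fixed RTP header defined in RFC 3550."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Fabricated packet: version 2, payload type 0, sequence 1, timestamp 160.
SAMPLE_PACKET = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234ABCD) + b"\xff" * 160

if __name__ == "__main__":
    print(parse_rtp_header(SAMPLE_PACKET))
```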

RTCP:

RTCP (RTP Control Protocol) is a protocol designed to monitor the quality of the RTP flow: jitter, latency, packet loss, etc. By capturing the RTP and RTCP traffic we will be able to analyze these parameters with tshark.

DID or DDI:

DID (Direct Inward Dialing), also known as DDI (Direct Dial-In), is a public number that, when called, delivers the call to our PBX.

PSTN:

PSTN (Public Switched Telephone Network) is the traditional telephone network.

FXO:

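FXO (Foreign Exchange Office) is the interface that receives the analog line: it expects the signaling (dial tone, line voltage, ringing) from the other end. The socket on an analog terminal is an FXO interface.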

FXS:

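FXS (Foreign Exchange Station) is the interface of the PBX that provides the signaling (dial tone, line voltage, ringing) to an analog terminal.

WARNING: If the FXS interface is connected to the PSTN, the card can be damaged, because both ends would be feeding voltage into the same line!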

DTMF:

DTMF (Dual-Tone Multi-Frequency) is the signaling system used in analog telephony to send key presses: each key is transmitted as a pair of simultaneous tones, one low frequency and one high frequency.
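The tone pairs follow the standard keypad grid: each row has a low frequency and each column a high one. The sketch below builds that table and synthesizes the samples for one key; the 8000 Hz sample rate, the 0.1 s duration and the function name are assumptions chosen for the example.

```python
import math

# Standard DTMF grid: one low (row) and one high (column) frequency per key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

DTMF = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def dtmf_samples(key: str, duration_s: float = 0.1, rate_hz: int = 8000):
    """Return float samples in [-1, 1] for the tone pair of one key."""
    low, high = DTMF[key]
    count = round(duration_s * rate_hz)
    return [
        0.5 * math.sin(2 * math.pi * low * n / rate_hz)
        + 0.5 * math.sin(2 * math.pi * high * n / rate_hz)
        for n in range(count)
    ]


if __name__ == "__main__":
    print(DTMF["5"])
    print(len(dtmf_samples("5")))
```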

IVR:

IVR (Interactive Voice Response) is an automatic call answering system that makes decisions based on the DTMF tones it receives from the caller.

SIP Router:

Software highly specialized in routing SIP traffic.

Multimedia Server:

Here we must make a distinction between SIP routers, such as Kamailio, OpenSIPS and OpenSER, and multimedia servers, such as Asterisk and SEMS. The former are specialized in routing SIP traffic based on a routing file, while multimedia servers offer services such as automatic responders, IVRs, voicemail, etc.

The basic messages sent in a SIP environment are:

  • INVITE –> Call initiation
  • ACK –> Confirmation of call initiation
  • BYE –> End of call
  • CANCEL –> Cancellation of a pending request (for example, the caller hangs up before the call is answered)
  • REGISTER –> Registration of a client on the server
  • OPTIONS –> Request for the options supported by the server

Most important responses:

  • 1XX provisional/informational responses (100 Trying, 180 Ringing, 183 Session Progress).
  • 2XX operation successfully completed (200 OK).
  • 3XX call redirection (302 Moved Temporarily, 305 Use Proxy).
  • 4XX client error (403 Forbidden).
  • 5XX server error (500 Server Internal Error, 501 Not Implemented).
  • 6XX global failure (606 Not Acceptable).
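Since the first digit of the code determines the class of the response, classifying a status line is straightforward. A minimal sketch, with names of our own choosing:

```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def classify_status_line(status_line: str):
    """Return (code, reason phrase, class) for a line like 'SIP/2.0 180 Ringing'."""
    version, code, reason = status_line.strip().split(" ", 2)
    if not version.startswith("SIP/"):
        raise ValueError("not a SIP status line")
    status = int(code)
    return status, reason, RESPONSE_CLASSES.get(status // 100, "unknown")


if __name__ == "__main__":
    for line in ("SIP/2.0 180 Ringing", "SIP/2.0 200 OK", "SIP/2.0 486 Busy Here"):
        print(classify_status_line(line))
```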

Latency:

Time a packet takes to travel across the network from sender to receiver.

Jitter:

Variation in latency between successive packets.
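For RTP, a common way to put a number on jitter is the running interarrival estimate described in RFC 3550, which smooths the change in transit time between consecutive packets with a gain of 1/16. The sketch below assumes arrival times and RTP timestamps are already expressed in the same units; the function name and input format are ours.

```python
def interarrival_jitter(packets):
    """Running jitter estimate in the style of RFC 3550.

    packets: iterable of (arrival_time, rtp_timestamp) pairs in the same units.
    Returns the estimate after the last packet.
    """
    jitter = 0.0
    previous = None
    for arrival, sent in packets:
        if previous is not None:
            prev_arrival, prev_sent = previous
            # D: how much the transit time changed between the two packets.
            d = (arrival - prev_arrival) - (sent - prev_sent)
            jitter += (abs(d) - jitter) / 16
        previous = (arrival, sent)
    return jitter


if __name__ == "__main__":
    steady = [(0, 0), (20, 20), (40, 40)]
    bursty = [(0, 0), (36, 20), (40, 40)]
    print(interarrival_jitter(steady), interarrival_jitter(bursty))
```

A stream whose packets all suffer the same delay yields a jitter of zero, however large that delay is; only the variation counts.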


An important concept regarding REGISTER is that it is only needed to receive calls: it tells the server where to deliver the calls addressed to us, so the REGISTER process is completely independent of the INVITE process. This allows us to make calls WITHOUT being registered, or to receive calls WITHOUT being able to initiate them. Confusing the two is a very common mistake among people who use Asterisk but do not know how the SIP protocol works.
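The idea can be illustrated with a toy location table. This is only a sketch of the concept, not how any real server is implemented: the class and the addresses are hypothetical, and authentication, expiry of registrations and multiple contacts per user are deliberately left out.

```python
class ToyRegistrar:
    """Toy location service: REGISTER stores where a user can be reached."""

    def __init__(self):
        self.bindings = {}

    def register(self, address_of_record: str, contact: str) -> None:
        # REGISTER: remember the current contact for this user.
        self.bindings[address_of_record] = contact

    def route_invite(self, caller: str, callee: str):
        # INVITE: only the callee needs a binding; the caller is never looked up.
        return self.bindings.get(callee)


if __name__ == "__main__":
    registrar = ToyRegistrar()
    registrar.register("sip:bob@example.com", "sip:bob@192.0.2.20:5060")
    # Alice never registered, yet her call to Bob can be routed.
    print(registrar.route_invite("sip:alice@example.com", "sip:bob@example.com"))
    # Bob cannot reach Alice, because there is no binding for her.
    print(registrar.route_invite("sip:bob@example.com", "sip:alice@example.com"))
```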

Another very important factor is QoS, since jitter can severely affect the quality of the conversation. In addition, we will have to make sure that NAT and firewalls are not blocking our traffic. It is very common to have audio in only one direction: the person who initiated the conversation can send traffic but cannot receive it, because the incoming RTP flow is not matched as RELATED by the firewall rules of the router and is dropped.
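A quick hint of NAT trouble is an SDP body that announces a private address as the place to send the RTP, since the other end cannot reach it from the Internet. The sketch below checks the first IPv4 c= line of an SDP body with the standard ipaddress module; the function name and the sample addresses are ours, and IPv6 or per-media c= lines are not considered.

```python
import ipaddress


def sdp_media_address_is_private(sdp: str) -> bool:
    """True if the first IPv4 connection address in the SDP is not publicly routable."""
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            # c=IN IP4 <address>
            address = line.split()[2]
            return ipaddress.ip_address(address).is_private
    raise ValueError("no IPv4 c= line found")


if __name__ == "__main__":
    behind_nat = "v=0\nc=IN IP4 192.168.1.20\nm=audio 49170 RTP/AVP 0\n"
    if sdp_media_address_is_private(behind_nat):
        print("SDP announces a private address: expect one-way audio without NAT handling")
```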

With this, we conclude this introductory chapter on VoIP.
