What is SRTP?
SRTP (Secure Real-time Transport Protocol) is the encrypted profile of RTP, defined in RFC 3711, that wraps real-time audio and video streams in AES encryption for confidentiality and HMAC-SHA1 for integrity. SRTP secures the media path of a VoIP call — the actual voice packets — while SIP-over-TLS secures the signaling path. Together they prevent anyone with packet capture on the wire from listening in or tampering with the audio.
What SRTP protects
SRTP adds three guarantees to plain RTP:
- Confidentiality: payloads are encrypted with AES, typically 128-bit AES in counter mode (AES-CM). An attacker who captures packets cannot decode the audio without the session key.
- Integrity: each packet carries an HMAC-SHA1 authentication tag. Tampering invalidates the tag and the packet is rejected.
- Replay protection: a sliding window of accepted sequence numbers prevents an attacker from re-injecting captured packets.
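The replay check in the last bullet can be sketched as a bitmap over recently seen packet indices. This is a minimal Python illustration under stated assumptions, not a production implementation: the `ReplayWindow` class, its method names, and the 64-packet window size are hypothetical choices for the example. It also merges "check" and "mark as seen" into one step, whereas a real SRTP stack would only mark a packet as seen after its authentication tag has verified.

```python
def packet_index(roc: int, seq: int) -> int:
    """Combine the rollover counter (ROC) and the 16-bit RTP sequence
    number into one monotonically increasing packet index."""
    return (roc << 16) | (seq & 0xFFFF)


class ReplayWindow:
    """Sliding-window replay detector (illustrative sketch).

    Bit i of `bitmap` is set when index (highest - i) has been accepted.
    """

    def __init__(self, size: int = 64):  # window size is an assumption
        self.size = size
        self.highest = -1   # no packet accepted yet
        self.bitmap = 0

    def accept(self, index: int) -> bool:
        """Return True and record the index if it is new; False if it is
        a replay or has fallen behind the window."""
        if index > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
            return True

        offset = self.highest - index
        if offset >= self.size:
            return False            # too old to judge, so reject
        if self.bitmap & (1 << offset):
            return False            # already seen: replay
        self.bitmap |= 1 << offset  # late but new: accept and record
        return True
```

A packet that arrives late but inside the window is still accepted once, which is what lets the check coexist with normal network reordering.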
The RTP header itself is authenticated but not encrypted, so intermediate elements (jitter buffers, SBCs, monitoring tools) can still read it and operate. Only the payload is encrypted.
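To make the split between the readable header and the protected payload concrete, the sketch below parses the fixed 12-byte RTP header without any key and then checks the authentication tag. It is a hedged illustration that uses only the Python standard library: the function names are hypothetical, the key is assumed to be an already-derived session authentication key, the tag is assumed to be the 80-bit truncation used by the common AES_CM_128_HMAC_SHA1_80 suite, and the packet is assumed to carry no MKI field.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # 80-bit tag; assumes the AES_CM_128_HMAC_SHA1_80 suite


def parse_rtp_header(packet: bytes) -> dict:
    """Read the fixed RTP header fields. No key is needed, which is why
    monitoring tools can still see sequence numbers and timestamps."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "payload_type": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def auth_tag(auth_key: bytes, header_and_ciphertext: bytes, roc: int) -> bytes:
    """HMAC-SHA1 over the header, the encrypted payload, and the 32-bit
    rollover counter, truncated to TAG_LEN bytes."""
    data = header_and_ciphertext + struct.pack("!I", roc)
    return hmac.new(auth_key, data, hashlib.sha1).digest()[:TAG_LEN]


def verify_packet(auth_key: bytes, srtp_packet: bytes, roc: int) -> bool:
    """Return True only if the trailing tag matches the packet contents."""
    if len(srtp_packet) < 12 + TAG_LEN:
        return False
    body, tag = srtp_packet[:-TAG_LEN], srtp_packet[-TAG_LEN:]
    return hmac.compare_digest(auth_tag(auth_key, body, roc), tag)
```

Because the tag covers the header as well as the payload, flipping a bit in either one makes `verify_packet` return False, even though the header was never encrypted.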
How SRTP key exchange works
SRTP itself does not negotiate keys — it relies on a separate handshake. Two approaches dominate:
- SDES (SDP Security Descriptions): the SRTP master key and salt are carried base64-encoded, but not encrypted, in an a=crypto attribute inside the SDP body of the SIP message. Security depends entirely on the signaling channel being protected by SIP-over-TLS, since anyone who can read the SDP can read the key. Common in enterprise SIP trunking.
- DTLS-SRTP: endpoints run a DTLS handshake over the media path itself and derive SRTP keys from the negotiated session. Required by WebRTC. The keys never appear in signaling; only the certificate fingerprints that authenticate the handshake travel in the SDP.
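To show why SDES stands or falls with the signaling channel, here is a small sketch that pulls the key straight out of an SDP crypto attribute. The function name `parse_sdes_crypto` is hypothetical, and the sketch assumes the common `inline:` key method with an AES_CM_128 suite, where the base64 blob decodes to a 16-byte master key followed by a 14-byte master salt.

```python
import base64


def parse_sdes_crypto(line: str) -> dict:
    """Parse an SDP line of the general shape
        a=crypto:<tag> <suite> inline:<base64 key||salt>[|lifetime][|MKI]
    and return the master key and salt it carries."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not an a=crypto line")

    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("expected tag, suite, and key parameters")
    tag, suite, key_params = fields[0], fields[1], fields[2]

    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError("only the inline key method is handled here")

    # Anything after the first '|' is lifetime / MKI, not key material.
    key_salt = base64.b64decode(key_info.split("|")[0])
    if suite.startswith("AES_CM_128") and len(key_salt) != 30:
        raise ValueError("AES_CM_128 suites carry 16 key + 14 salt bytes")

    return {
        "tag": int(tag),
        "suite": suite,
        "master_key": key_salt[:16],
        "master_salt": key_salt[16:],
    }
```

Nothing in this function needs a secret: whoever can read the SDP can run the same few lines and recover the media key.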
DTLS-SRTP is the modern preference because the media keys stay secret even when SIP-over-TLS is not correctly deployed on every hop. SDES is still common in PBX and trunking deployments where DTLS-SRTP is not universally implemented.
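With DTLS-SRTP, the SRTP keys are cut from keying material exported by the DTLS session rather than read from the SDP. The sketch below covers only that last step and assumes the exported bytes are already in hand; it does not perform a DTLS handshake. The function name is hypothetical, and the default lengths assume an AES-128 counter-mode profile (16-byte keys, 14-byte salts), with the material laid out as client key, server key, client salt, server salt.

```python
def split_dtls_srtp_keying_material(
    material: bytes, key_len: int = 16, salt_len: int = 14
) -> dict:
    """Split exported DTLS keying material into per-direction SRTP
    master keys and salts.

    Assumed layout:
        client_key | server_key | client_salt | server_salt
    """
    expected = 2 * (key_len + salt_len)
    if len(material) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(material)}")

    client_key = material[:key_len]
    server_key = material[key_len:2 * key_len]
    salts = material[2 * key_len:]
    client_salt = salts[:salt_len]
    server_salt = salts[salt_len:]

    # Each side encrypts with its own write key and decrypts with the
    # peer's, so the two media directions never share a key.
    return {
        "client": {"master_key": client_key, "master_salt": client_salt},
        "server": {"master_key": server_key, "master_salt": server_salt},
    }
```

Because the input comes from a handshake that ran on the media path, none of these bytes are ever visible to a SIP proxy.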
SRTP vs. RTP vs. ZRTP
Three related protocols, often confused:
- RTP: the basic real-time media transport. No encryption.
- SRTP: RTP with payload encryption and integrity. Requires an external key exchange (SDES or DTLS-SRTP).
- ZRTP: an end-to-end media-encryption protocol that negotiates SRTP keys directly between endpoints via Diffie-Hellman, without trusting the signaling path. Used in some privacy-focused softphones but not widely deployed in enterprise.
For enterprise VoIP, the modern default is SIP-over-TLS for signaling plus SRTP (with DTLS-SRTP or SDES) for media.
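In practice, which of these a call is using can often be read from the transport profile on the SDP media line. The sketch below is a heuristic only, with a hypothetical function name and return labels: it maps the usual profile strings to the categories above, and it will misclassify offers that signal encryption some other way (for example, ZRTP negotiated over a plain RTP/AVP profile).

```python
def classify_media_line(m_line: str) -> str:
    """Classify an SDP media line such as
        m=audio 49170 RTP/SAVP 0
    by its transport profile (the third field)."""
    fields = m_line.strip().split()
    if len(fields) < 4 or not fields[0].startswith("m="):
        raise ValueError("not an SDP media line")

    proto = fields[2].upper()
    if proto.startswith("UDP/TLS/RTP/SAVP"):
        return "dtls-srtp"   # keys come from a DTLS handshake
    if proto in ("RTP/SAVP", "RTP/SAVPF"):
        return "srtp"        # keys typically come from SDES
    if proto in ("RTP/AVP", "RTP/AVPF"):
        return "rtp"         # no media encryption advertised
    return "unknown"
```

A check like this is useful as a first pass over captured signaling, for instance to list trunks that still offer only plain RTP.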
Why SRTP is increasingly required
A handful of trends have pushed SRTP from “nice to have” to baseline:
- Compliance frameworks: HIPAA, PCI-DSS, SOC 2, and similar frameworks expect encryption in transit for any call carrying regulated data, and auditors apply that expectation to media as well as signaling.
- WebRTC ubiquity: every browser-based WebRTC call already uses DTLS-SRTP; the rest of the stack needs to match.
- Network exposure: calls that traverse the public internet, mobile data, or shared cloud paths are easy targets for passive capture if media is in plaintext.
- Carrier expectations: premium SIP trunking providers increasingly require SRTP on interconnects.
Plain RTP still works, but is fast becoming an audit finding rather than a default.
SRTP frequently asked questions
What does SRTP stand for?
SRTP stands for Secure Real-time Transport Protocol. It is defined in RFC 3711 as a profile of RTP that adds AES encryption for confidentiality and HMAC-SHA1 for integrity protection of real-time audio and video media on IP networks.
What is the difference between RTP and SRTP?
RTP carries real-time media payloads in plaintext. SRTP carries the same payloads encrypted under AES and authenticated under HMAC-SHA1, and adds replay protection. The wire format is nearly identical: the RTP header is unchanged and an authentication tag is appended to each packet. The difference is whether anyone on the path can listen or tamper.
What is the difference between SDES and DTLS-SRTP?
SDES exchanges the SRTP encryption key inside the SDP body of the SIP signaling message — secure only if the signaling channel is encrypted with TLS. DTLS-SRTP runs a DTLS handshake on the media path itself and derives SRTP keys independently of signaling. DTLS-SRTP is the WebRTC standard.
Is SRTP required for HIPAA or PCI compliance?
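Neither standard names SRTP, but both call for protecting regulated data in transit: PCI-DSS requires strong cryptography for cardholder data sent over open, public networks, and the HIPAA Security Rule lists transmission encryption as an addressable safeguard. For voice calls that carry protected health information or payment-card data, the practical implementation is SIP-over-TLS for signaling plus SRTP for media. Plain RTP on those calls is almost certainly an audit finding.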
See how DialPhone fits
DialPhone enables SRTP on every voice path by default, with SIP-over-TLS signaling, DTLS-SRTP or SDES media encryption, and end-to-end MOS reporting. Call audio is protected without per-call configuration, and the operations team avoids the audit finding that plain RTP is increasingly becoming.