WebRTC is now quite mature, with stable playback and the protocol already being an RFC. There are also quite a few corresponding open-source projects. However, I believe that WebRTC still lacks a high-performance, simple and easy-to-use server. I have analyzed the existing servers before and found various issues. SRS has a great opportunity to solve these problems.
There is an improved way to merge multiple NALUs into one since the protocol layer generally does not need to understand NALUs.
For example:
In RTMP, a NALU is generally in IBMF (ISO Base Media File Format) form, where xxxx represents the four-byte size prefix:
xxxx
If there are two NALUs in a frame:
NALU-A(1B header, xB payload)
NALU-B(1B header, xB payload)
SRS will parse it into SrsVideoFrame.samples:
NALU-A(1B header, xB payload)
NALU-B(1B header, xB payload)
It can be packed into one FUA:
STAP(SPS+PPS) // if IDR
FUA(FU Indicator, FU header, xB payload + 001 + NALUB)
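The merge step described above can be sketched as follows. This is a minimal illustration, not the SRS implementation: `merge_nalus` is a hypothetical helper, and it assumes each sample is a raw NALU (1-byte header plus payload) with no size prefix or start code of its own.

```python
# Annex B 3-byte start code, used here as the separator between NALUs.
ANNEXB_START_CODE = b"\x00\x00\x01"


def merge_nalus(samples):
    """Concatenate raw NALUs into one buffer, separated by 0 0 1.

    `samples` is a list of bytes objects, each one raw NALU
    (hypothetical stand-in for SrsVideoFrame.samples).
    """
    return ANNEXB_START_CODE.join(samples)


nalu_a = bytes([0x41, 0x9A, 0x20])  # 1B header + payload
nalu_b = bytes([0x41, 0x01, 0x82])  # 1B header + payload
merged = merge_nalus([nalu_a, nalu_b])
```

The merged buffer starts with the header of NALU-A, so it can be treated by the protocol layer as one larger NALU.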
Taking the plaintext RTP in the attached file as an example:
No.57 to No.61: these 5 RTP packets have the same timestamp and carry the 5 NALUs of a B-frame.
The NALU in No.57 is 54 bytes long and has the following HEX values: 41 9a 20 16 3c ... 02 98 f7 f0
The NALU in No.58 is 104 bytes long and has the following HEX values: 41 01 82 68 80 ... 3a 94 e6 11 7b c0
In general, each NALU is sent as a separate RTP packet (unless it exceeds the maximum packet size and has to be fragmented with FUA). As seen above, there are 5 RTP packets. With our algorithm, we can append the last 4 NALUs to the end of the first NALU, separated by the Annex B start code, for example:
No.57 0 0 1 No.58 0 0 1 No.59 ......
In the resulting capture, No.63 has the same timestamp as the packets above, and its content is the previous 5 NALUs concatenated with 0 0 1.
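When the merged NALU is larger than one RTP payload, it is carried as FU-A fragments. The sketch below follows the FU-A layout from RFC 6184 (FU indicator, FU header, then a slice of the payload). `fragment_fu_a` is a hypothetical helper, not an SRS function, and the default of 1300 bytes is only borrowed from the FU payload size discussed in this issue.

```python
FU_A_TYPE = 28  # NAL unit type for FU-A in RFC 6184


def fragment_fu_a(nalu, max_payload=1300):
    """Split one raw NALU into RTP payloads.

    Returns [nalu] unchanged if it fits in a single packet, otherwise a
    list of FU-A payloads (FU indicator + FU header + payload slice).
    """
    if not nalu:
        raise ValueError("empty NALU")
    header, payload = nalu[0], nalu[1:]
    if len(payload) <= max_payload:
        return [nalu]  # small enough: single NAL unit packet
    # FU indicator keeps F and NRI bits of the original header, type = 28.
    indicator = (header & 0xE0) | FU_A_TYPE
    nal_type = header & 0x1F
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)]
    out = []
    for idx, chunk in enumerate(chunks):
        fu_header = nal_type
        if idx == 0:
            fu_header |= 0x80  # S bit: first fragment
        if idx == len(chunks) - 1:
            fu_header |= 0x40  # E bit: last fragment
        out.append(bytes([indicator, fu_header]) + chunk)
    return out


frags = fragment_fu_a(bytes([0x41]) + bytes(range(10)), max_payload=4)
```

The receiver rebuilds the NALU header from the FU indicator and FU header, then appends the fragment payloads in order.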
Note: Linux kernel 4.18.0 and above supports GSO, and SRS automatically detects the kernel version. Packets captured with GSO enabled look the same as packets captured with GSO disabled.
You can analyze the packet statistics from the perf API with the tool: ./scripts/perf_gso.py http://localhost:1985/api/v1/perf
No GSO, Fragment at Source
Without GSO, when packets are received from the source, the messages are divided into RTP packets there, and the statistics are as follows:
Note: The number of RTP packets is about 1.6 times the number of RTMP packets. If each packet has to be processed by the kernel, this has a significant impact on performance.
No GSO, Fragment at Connection
Without GSO, the messages are divided into RTP packets when they are sent (Connection). The statistics are as follows:
Note: The result is similar to the previous one. Whether fragmentation happens at the Source or at the Connection has little impact on the packet distribution.
GSO, Fragment at Connection
With GSO enabled, the messages are divided into RTP packets when they are sent (Connection), and the statistics are as follows:
Note: With GSO enabled, the number of packets passing through the kernel is lower than for RTMP, and performance improves. GSO does not actually reduce the number of RTP packets on the wire; it reduces the number of packets handed to the kernel, which is the count we consider here.
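To make the effect on kernel-level packet counts concrete, here is a rough model of how consecutive RTP packets could be grouped into GSO sends. It relies on the Linux UDP GSO rule that all segments in one send share a size, except the last one, which may be shorter. `gso_batches` is a hypothetical helper, the limit of 64 segments per send is an assumption taken from the kernel's `UDP_MAX_SEGMENTS`, and the total-size limit of a send is ignored.

```python
def gso_batches(sizes, max_segments=64):
    """Group consecutive packet sizes into batches for single GSO sends.

    A packet joins the current batch only if the batch is still open
    (no shorter trailing segment yet) and the packet is not larger
    than the batch's segment size.
    """
    batches = []
    for size in sizes:
        last = batches[-1] if batches else None
        can_join = (
            last is not None
            and len(last) < max_segments
            and last[-1] == last[0]  # no short trailing segment yet
            and size <= last[0]
        )
        if can_join:
            last.append(size)
        else:
            batches.append([size])
    return batches


# One video frame as three full FU-A fragments plus a tail, then audio.
batches = gso_batches([1300, 1300, 1300, 500, 257, 256, 255])
```

In this model the 7 RTP packets are handed to the kernel in 3 sends, while the number of packets on the wire stays at 7.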
GSO, Larger FU-Payload
Previously, the length of the FU payload was 1200 bytes; it was changed to 1300, see bfc70d64 and b91e07f4.
After the modification, the maximum size of an IP packet is 1356 bytes, which is still smaller than the 1500-byte MTU. The results show that the number of RTP packets decreased from 1.56 times to 1.49 times, and GSO fragmentation is not affected.
GSO, Padding Packets
Audio packets are usually more numerous, and their sizes often differ only slightly. For example, take three packets of 257, 256, and 255 bytes: with a little padding, they can be sent as a single GSO packet, see c95a8517.
The data shows that enabling padding (127) lowers the GSO packet multiplier from 0.74 to 0.67, and improves the efficiency from 0.67 to 0.74.
Note: Enabling padding does not significantly increase the payload. It is at the level of one in ten million (N), as padding is only added to packets when GSO is enabled.
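The padding idea can be modeled in the same way: if a packet is only slightly shorter than the segment size of the current batch, pad it up to that size so the batch stays open. This is a sketch under assumptions, not the logic of c95a8517: `pad_into_batches` is a hypothetical helper, `max_padding=127` mirrors the padding value mentioned above, and the limit of 64 segments per send is assumed.

```python
def pad_into_batches(sizes, max_padding=127, max_segments=64):
    """Group packet sizes into GSO batches, padding small gaps.

    Returns the batches with sizes after padding. A packet shorter than
    the batch's segment size by at most `max_padding` bytes is padded up;
    a much shorter packet becomes the short trailing segment.
    """
    batches = []
    for size in sizes:
        last = batches[-1] if batches else None
        is_open = (last is not None and len(last) < max_segments
                   and last[-1] == last[0])
        if is_open:
            gap = last[0] - size
            if 0 <= gap <= max_padding:
                last.append(last[0])  # padded up to the segment size
                continue
            if gap > 0:
                last.append(size)  # short trailing segment closes batch
                continue
        batches.append([size])
    return batches


sizes = [257, 256, 255]
padded = pad_into_batches(sizes)
extra_bytes = sum(map(sum, padded)) - sum(sizes)
```

For the three packets from the example, 3 bytes of padding are enough to send them as one batch.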
Note: Padding is part of the standard RTP protocol, as described in RTP Fixed Header Fields (RFC 3550). Padding may be needed by encryption algorithms with fixed block sizes, or for carrying several RTP packets in one lower-layer protocol data unit.
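On the wire, RTP padding works as RFC 3550 describes: the P bit (0x20) in the first header byte is set, padding bytes are appended after the payload, and the last padding byte holds the number of padding bytes including itself. The helpers below are a minimal sketch with hypothetical names, operating on an unencrypted RTP packet.

```python
def add_rtp_padding(packet, pad_len):
    """Append pad_len padding bytes to a plaintext RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    if not 1 <= pad_len <= 255:
        raise ValueError("pad_len must fit in one byte and be at least 1")
    first = packet[0] | 0x20  # set the P bit
    # Zero bytes, then the count of padding bytes (including this one).
    padding = bytes(pad_len - 1) + bytes([pad_len])
    return bytes([first]) + packet[1:] + padding


def strip_rtp_padding(packet):
    """Remove padding from a plaintext RTP packet, if the P bit is set."""
    if not packet[0] & 0x20:
        return packet
    pad_len = packet[-1]
    return bytes([packet[0] & 0xDF]) + packet[1:-pad_len]


# 12-byte header (version 2, payload type 111 is only an example) + payload.
rtp = bytes([0x80, 111]) + bytes(10) + b"opus"
rtp_padded = add_rtp_padding(rtp, 2)
```

A receiver that honors the P bit drops the padding before handing the payload to the decoder, so padded and unpadded packets decode identically.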
Activity
Jianru-Lin commented on May 29, 2015
PengZheng commented on Sep 11, 2016
823639792 commented on Sep 22, 2017
Title changed from "支持WebRTC" to "Support WebRTC"
winlinvip commented on Jan 19, 2020
For #307, #1070, define FLV CodecID for AV1 and Opus. 3.0.101
winlinvip commented on Mar 14, 2020
winlinvip commented on Mar 14, 2020
candidate is the address at which RTC serves external clients. Some deployment scenarios require configuring it in different ways.
It is also the only RTC setting that must be explicitly configured; the others can be left at their defaults.
For detailed configuration instructions, please refer to: https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#config-candidate
For #1638, #307, rtc conf support ENV.
For #307, RTC RTP support padding
winlinvip commented on Apr 13, 2020
rtc-plaintext-multiple-slices.pcapng.zip
rtc-plaintext-multiple-slices-as-one-NALU.pcapng.zip
SRS merges NALUs by default, and this can be disabled through configuration.
For #307, support merge multiple slices/NALUs to one NALU/RTP/FUA
winlinvip commented on Apr 13, 2020
Linux GSO can delay the segmentation of multiple UDP packets to improve performance; see "UDP GSO principle and application".
rtc-plaintext-linux4-gso-ok.pcapng.zip
rtc-plaintext-multiple-slices-as-one-NALU.pcapng.zip
rtc-plaintext-linux3-gso-invalid.pcapng.zip
SRS has added an API for packet performance analysis: http://localhost:1985/api/v1/perf.
For #307, support linux GSO for RTC
For #307, allow dedicated cache for GSO.
For #307, remove dedicate GSO cache