DTLS error when using 4096 key/cert #252

Closed
@saghul (Contributor)

I'm getting this:

[1675940553] DTLS timeout on component 1 of stream 1, retransmitting
[WARN] [1675940553] The DTLS stack is trying to send a packet of 2236 bytes, this may be larger than the MTU and get dropped!

This started after I switched to a 4096-bit key to avoid #251.

Eventually all components time out and fail.

Activity

saghul (Contributor, Author) commented on Jun 4, 2015

I checked, and 2048-bit certificates have the same problem. 1024-bit certificates work fine.

lminiero (Member) commented on Jun 4, 2015

That's a well-known issue, and it has been discussed several times on the Google group. The reason is that the DTLS stack in OpenSSL does not fragment packets that exceed the MTU when a BIO is used instead of UDP directly.

Normally, the DTLS stack should first try sending the "huge" packet and, when the timeout fires (because the message never reached the other side), fragment the original packet into smaller ones and send those instead. This apparently only works when the DTLS stack runs over UDP directly, that is, when OpenSSL has more control over the transport. When a BIO is used, as in Janus, where we need to handle the transport ourselves (libnice), this never happens. As a result, the message containing the oversized certificate is dropped somewhere in the network and never reaches the destination, which leads to a handshake failure.

I tried investigating ways to fix this and force the right behaviour, but never managed to get it to work as expected. As such, the only solution for now is to rely on "smaller" certificates. Hopefully someone will find a proper solution in the near future. I'm not sure, for instance, whether BoringSSL handles this properly, as I never managed to use it as a stack in place of OpenSSL.

I was documenting this guideline in an additional .md file in the certs folder, also to account for the feedback in #251. It will basically say that yes, as of now, certificates must be 1024 bits. Any additional text or clarification (or even example if you have any) is more than welcome!

saghul (Contributor, Author) commented on Jun 4, 2015

> That's a well-known issue, and it has been discussed several times on the Google group. The reason is that the DTLS stack in OpenSSL does not fragment packets that exceed the MTU when a BIO is used instead of UDP directly.

Oh, sorry about that! I looked through the open issues and didn't find anything related. Will search the group next time!

> I was documenting this guideline in an additional .md file in the certs folder, also to account for the feedback in #251. It will basically say that yes, as of now, certificates must be 1024 bits. Any additional text or clarification (or even example if you have any) is more than welcome!

Cool, I'll have a look once it lands!

lminiero (Member) commented on Jun 4, 2015

Just to add some more details to what I explained in my previous posts: the DTLS stack in OpenSSL does indeed take care of fragmenting the packets according to what it assumes to be the MTU (1472 by default). The problem is that the mem BIO ignores that fragmentation info completely, so when you do a BIO_read it hands the whole message to the application anyway. The whole buffer is then passed to nice_agent_send, so in practice nothing is fragmented. You can verify this by using, e.g., a 4096-bit certificate and capturing the DTLS traffic with Wireshark: you'll see that the message is recognized as composed of not only multiple messages, but also fragments.

My guess is that the mem BIO simply isn't smart enough to inspect the actual messages being transported: it probably doesn't care whether it's DTLS, TLS or anything else, and just acts as an opaque transport. So when the internal stack writes the fragmented packets in a bunch, that bunch is what you get when you fetch the pending data to send. I'm not sure whether this means we'll have to inspect the payload ourselves, e.g., do a BIO_read, process the packet to see if there are fragments (length+offset), and if so send each of them separately through libnice. That would probably do it, although it sounds a bit silly that the application is required to do so, especially considering that the application is not supposed to be aware of the protocol specifics in the first place (that's usually why you rely on a library).
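For illustration only, here is a minimal sketch in C of the "inspect the payload ourselves" idea described above. It is not code from Janus. It splits at the DTLS record layer rather than at the handshake fragment length+offset fields, relying on the 13-byte DTLS 1.0/1.2 record header (type, version, epoch, sequence number, length), and it assumes each handshake fragment sits in its own record. The names `split_dtls_records`, `send_fn`, `size_log` and `log_size` are hypothetical; in a real application the callback would presumably wrap nice_agent_send.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DTLS 1.0/1.2 record header: type(1) + version(2) + epoch(2) + seq(6) + length(2) */
#define DTLS_RECORD_HEADER_LEN 13

/* Hypothetical callback type: sends exactly one datagram. */
typedef void (*send_fn)(const uint8_t *data, size_t len, void *user);

/*
 * Walk a buffer holding back-to-back DTLS records and pass each record to
 * `send` as its own datagram. Returns the number of records sent, or -1 if
 * the buffer is truncated (records before the bad one have already been sent).
 */
static int split_dtls_records(const uint8_t *buf, size_t len, send_fn send, void *user)
{
    assert(buf != NULL && send != NULL);
    int count = 0;
    size_t off = 0;
    while (off < len) {
        if (len - off < DTLS_RECORD_HEADER_LEN)
            return -1; /* not even a full header left */
        /* The last two header bytes hold the big-endian payload length. */
        size_t body = ((size_t)buf[off + 11] << 8) | buf[off + 12];
        size_t total = DTLS_RECORD_HEADER_LEN + body;
        if (total > len - off)
            return -1; /* header claims more bytes than we have */
        send(buf + off, total, user);
        off += total;
        count++;
    }
    return count;
}

/* Example callback, for illustration: only records the size of each datagram. */
struct size_log { size_t sizes[8]; int n; };

static void log_size(const uint8_t *data, size_t len, void *user)
{
    struct size_log *log = user;
    (void)data;
    if (log->n < 8)
        log->sizes[log->n++] = len;
}
```

Sending one record per datagram is the conservative choice here. Several small records could also be coalesced into one datagram as long as the total stays under the MTU, at the cost of a little more bookkeeping.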

lminiero (Member) commented on Jun 5, 2015

I asked about this on the OpenSSL mailing list, and I already received useful feedback:

https://mta.openssl.org/pipermail/openssl-users/2015-June/001503.html

They basically confirm that the mem BIO does not have enough knowledge to handle this, in particular about datagram semantics. One suggestion they made is to write a BIO filter that wraps the mem BIO in order to handle fragmentation automatically. I'll try to do that ASAP.
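To make the datagram-semantics point concrete, here is a small sketch in C of the bookkeeping such a filter could do. It is deliberately free of OpenSSL calls and is not the code from #254. It assumes the DTLS stack performs one write per fragment, so remembering the size of each write is enough to give fragments back one at a time. All names (`dgram_queue`, `dgq_write`, `dgq_read`) and the capacity limits are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DGQ_MAX_BYTES  8192 /* arbitrary capacity for this sketch */
#define DGQ_MAX_WRITES 32

/* A byte buffer plus the length of every write made into it. */
struct dgram_queue {
    unsigned char data[DGQ_MAX_BYTES];
    size_t sizes[DGQ_MAX_WRITES];
    size_t nwrites; /* writes recorded so far */
    size_t next;    /* index of the next write to hand back */
    size_t rpos;    /* read offset into data */
    size_t wpos;    /* write offset into data */
};

static void dgq_init(struct dgram_queue *q)
{
    memset(q, 0, sizeof *q);
}

/* Append one write and remember its size. Returns len, or -1 if it does not fit. */
static int dgq_write(struct dgram_queue *q, const void *buf, size_t len)
{
    assert(q != NULL && buf != NULL);
    if (len == 0 || q->nwrites == DGQ_MAX_WRITES || len > DGQ_MAX_BYTES - q->wpos)
        return -1;
    memcpy(q->data + q->wpos, buf, len);
    q->sizes[q->nwrites++] = len;
    q->wpos += len;
    return (int)len;
}

/*
 * Copy out exactly one earlier write (never more), oldest first.
 * Returns its length, 0 if nothing is pending, -1 if `out` is too small.
 */
static int dgq_read(struct dgram_queue *q, void *out, size_t cap)
{
    assert(q != NULL && out != NULL);
    if (q->next == q->nwrites)
        return 0;
    size_t len = q->sizes[q->next];
    if (len > cap)
        return -1;
    memcpy(out, q->data + q->rpos, len);
    q->rpos += len;
    q->next++;
    if (q->next == q->nwrites) /* drained: reuse the buffer from the start */
        q->nwrites = q->next = q->rpos = q->wpos = 0;
    return (int)len;
}
```

With this in place, two writes of, say, 1200 and 1036 bytes come back as two reads of 1200 and 1036 bytes rather than one read of 2236 bytes (the sizes are made up and only chosen to add up to the figure in the log above). A plain byte-stream buffer would return all pending bytes in a single read, which is the behaviour described for the mem BIO.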

lminiero (Member) commented on Jun 5, 2015

@saghul can you check if #254 works for you?

saghul (Contributor, Author) commented on Jun 5, 2015

Great, will check it!

          DTLS error when using 4096 key/cert · Issue #252 · meetecho/janus-gateway