Fix reconnection logic to not exhaust retries prematurely #785
TMysliwiec wants to merge 2 commits into livekit:main
…exhaust a lot of retries prematurely as with the default one - MissedTickBehavior::Burst
@TMysliwiec thank you for the contribution! Could you sign the CLA so we can merge the fix?

I already signed it. Maybe it takes some time to process? When I click the button, it says it is already signed.

@davidzhao I'm not sure there is anything more I can do about the CLA. When I click to recheck the status, it reroutes me to this page, and on their side it says it's already signed.

I think I mixed up commits and submitted them using my 2 accounts. Let me recreate the PR.

@davidzhao I created a new one: #786
This is a fix for: #481, which was just closed again, likely due to inactivity.
Problem:
The reconnection logic doesn't properly honor the reconnection interval. The interval starts counting from engine creation, not from when reconnection begins. If the engine runs for over 50 seconds (RECONNECT_ATTEMPTS × RECONNECT_INTERVAL), the reconnection loop immediately executes all attempts without waiting, causing it to fail after a couple of seconds. That is because we are using Tokio's default interval, which utilizes MissedTickBehavior::Burst. Any accumulated ticks will fire immediately.
rust-sdks/livekit/src/rtc_engine/mod.rs
Line 52 in 0699735
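To make the timing concrete, here is a minimal sketch of the burst catch-up described above. It is a simplified, std-only model of the documented MissedTickBehavior::Burst semantics, not Tokio's implementation and not the engine's code. The function name `burst_tick_times` is hypothetical, times are whole seconds since the interval was created, and each reconnection attempt is assumed to fail instantly. The values 5 and 10 used below are assumptions; the description only states that their product is 50 seconds.

```rust
/// Simplified model of an interval that catches up in bursts.
/// `period`: seconds between scheduled ticks.
/// `idle`: seconds between interval creation and the first wait on a tick.
/// Returns the time at which each of `attempts` tick waits completes.
fn burst_tick_times(period: u64, idle: u64, attempts: usize) -> Vec<u64> {
    let mut now: u64 = idle; // reconnection starts after `idle` seconds
    let mut deadline: u64 = 0; // first tick was scheduled at creation
    let mut fired = Vec::with_capacity(attempts);
    for _ in 0..attempts {
        // A tick wait completes as soon as its deadline has passed.
        now = now.max(deadline);
        fired.push(now);
        // Burst: the next deadline stays on the original schedule,
        // so deadlines already in the past complete immediately.
        deadline += period;
    }
    fired
}
```

Under these assumptions, `burst_tick_times(5, 60, 10)` returns ten entries that are all 60: every attempt runs the moment reconnection starts, with no wait between them.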
Solution:
By changing the interval's missed tick behavior to MissedTickBehavior::Delay, we get what was presumably the originally intended wait before each reconnection attempt. That way, the reconnection window is significantly larger, which increases the likelihood of reconnecting.
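The effect of the change can be sketched with a simplified, std-only model of the documented MissedTickBehavior::Delay semantics. This is not Tokio's implementation (it ignores Tokio's small tolerance for late ticks), and `delay_tick_times` is a hypothetical name. Times are whole seconds since the interval was created, each attempt is assumed to fail instantly, and the 5 second period and 10 attempts used below are assumptions. In Tokio itself, the behavior is selected with `Interval::set_missed_tick_behavior`.

```rust
/// Simplified model of an interval that delays after a missed tick.
/// `period`: seconds between scheduled ticks.
/// `idle`: seconds between interval creation and the first wait on a tick.
/// Returns the time at which each of `attempts` tick waits completes.
fn delay_tick_times(period: u64, idle: u64, attempts: usize) -> Vec<u64> {
    let mut now: u64 = idle; // reconnection starts after `idle` seconds
    let mut deadline: u64 = 0; // first tick was scheduled at creation
    let mut fired = Vec::with_capacity(attempts);
    for _ in 0..attempts {
        let missed = now > deadline;
        // A tick wait completes as soon as its deadline has passed.
        now = now.max(deadline);
        fired.push(now);
        // Delay: after a missed tick, reschedule relative to the current
        // time instead of staying on the original schedule.
        deadline = if missed { now + period } else { deadline + period };
    }
    fired
}
```

Under these assumptions, `delay_tick_times(5, 60, 10)` returns 60, 65, 70, and so on up to 105: only the first attempt runs immediately, and the remaining attempts are spread over 45 seconds.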
Reproduction steps:
The problem is quite simple to reproduce. Run the basic_room example, wait a minute, disconnect from the internet, and observe how quickly the reconnection attempts are exhausted.