Automatic Reconnect Crashes MQTT when client.reconnect() is called #465

Open
lizziemac opened this issue Dec 6, 2023 · 7 comments

@lizziemac

Issue Description

When automatic reconnection is enabled in the MQTT client, manually calling client.reconnect() after a disconnect causes the application to crash. This issue occurs specifically when the client doesn't automatically reconnect after a manual disconnect. Disabling automatic reconnection prevents this crash.

Steps to Reproduce

  1. Enable automatic reconnection in the MQTT client.
     auto connOpts = mqtt::connect_options_builder()
                         .clean_session(false)
                         .automatic_reconnect(seconds(2), seconds(30))
                         .finalize();
  2. Manually disconnect the client.
     cli->disconnect()->wait();
  3. Attempt to manually reconnect using client.reconnect().
     cli->reconnect()->wait();

Expected Behavior

The client should successfully reconnect without causing the application to crash, regardless of whether automatic reconnection is enabled or not.

Actual Behavior

The application crashes when client.reconnect() is called with automatic reconnection enabled. Backtrace from a modified multithr_pub_sub.cpp:

0x0000007ff7b1b7e8 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x0000007ff7b1b7e8 in raise () from /lib/libc.so.6
#1  0x0000007ff7b08dd4 in abort () from /lib/libc.so.6
#2  0x0000007ff7dc0cf8 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x0000007ff7dbe85c in ?? () from /usr/lib/libstdc++.so.6
#4  0x0000007ff7dbe8c0 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x0000007ff7dbebb0 in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00000000004127c0 in mqtt::async_client::reconnect() ()
#7  0x0000000000406210 in publisher_func(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>) ()
#8  0x000000000040f288 in void std::__invoke_impl<void, void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> >(std::__invoke_other, void (*&&)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>&&, std::shared_ptr<multithr_counter>&&) ()
#9  0x000000000040f174 in std::__invoke_result<void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> >::type std::__invoke<void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> >(void (*&&)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>&&, std::shared_ptr<multithr_counter>&&) ()
#10 0x000000000040f10c in void std::thread::_Invoker<std::tuple<void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> > >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) ()
#11 0x000000000040f098 in std::thread::_Invoker<std::tuple<void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> > >::operator()() ()
#12 0x000000000040ec58 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter>), std::shared_ptr<mqtt::async_client>, std::shared_ptr<multithr_counter> > > >::_M_run() ()
#13 0x0000007ff7de9eec in ?? () from /usr/lib/libstdc++.so.6
#14 0x0000007ff7fa5478 in ?? () from /lib/libpthread.so.0
#15 0x0000007ff7bb4c1c in ?? () from /lib/libc.so.6
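
Reading the backtrace above: frame #6 is mqtt::async_client::reconnect(), frame #5 is __cxa_throw, and frame #4 is std::terminate(). That pattern suggests reconnect() threw a C++ exception that nothing in publisher_func caught, so the runtime aborted the process, rather than a memory fault inside the library. A minimal sketch of guarding the call is below. FakeClient and try_reconnect are hypothetical stand-ins used only for illustration, not the Paho API; with the real library the guarded call would be cli->reconnect() and the exception would presumably be an mqtt::exception.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the MQTT client; NOT the Paho API.
// reconnect() throws when constructed with fail == true, mimicking
// the throw seen in frame #5/#6 of the backtrace.
class FakeClient {
public:
    explicit FakeClient(bool fail) : fail_(fail) {}

    void reconnect() {
        if (fail_) {
            throw std::runtime_error("reconnect failed");
        }
    }

private:
    bool fail_;
};

// Returns true if reconnect() completed, false if it threw.
// Catching here keeps the exception from escaping the thread function,
// where an unhandled exception ends in std::terminate().
bool try_reconnect(FakeClient& cli, std::string& error) {
    try {
        cli.reconnect();
        error.clear();
        return true;
    } catch (const std::exception& exc) {
        error = exc.what();
        return false;
    }
}
```

Catching the exception only turns the abort into a reportable error; it does not explain why reconnect() fails when automatic reconnection is enabled, which is the open question in this issue.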

Additional Context

I have a callback for when the network is disconnected, and then reconnected. If I don't force the MQTT client to disconnect, it will never update to think it is disconnected, and therefore never retry connecting. As a result, I want to make sure that I do reconnect. I was hoping to avoid re-instantiating all of my options again with a standard client->connect() but if that is the suggested/correct approach, I can do that.

@fpagliughi
Contributor

What version of the Paho C client are you using? Something that sounds just like this was recently fixed in the C v1.3.13 client.

@fpagliughi
Contributor

Oh, actually, this may be different. What platform? What compiler/version?

I can’t say that the expected behavior would be exactly what you want. Perhaps the library might ignore an explicit request to reconnect if the automatic option is turned on? I’d have to look at the implementation.

But either way, it definitely should never hard crash.

@lizziemac
Author

lizziemac commented Dec 7, 2023

As of this morning, I've reproduced it on two separate OSes, both on ARM platforms. To reproduce it, I just modified the multithr_pub_sub.cpp example (attached with my changes): multithr_pubsub.txt

macOS (just tested this morning)

  • Using MacPorts with MQTTVersion 1.3.13
  • Compiler is clang
$ clang --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: arm64-apple-darwin23.1.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
  • Backtrace:
* thread #4, stop reason = signal SIGABRT
  * frame #0: 0x000000018ba0111c libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000018ba38cc0 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000018b948a40 libsystem_c.dylib`abort + 180
    frame #3: 0x000000018b9f06d8 libc++abi.dylib`abort_message + 132
    frame #4: 0x000000018b9e07ac libc++abi.dylib`demangling_terminate_handler() + 320
    frame #5: 0x000000018b68b8a4 libobjc.A.dylib`_objc_terminate() + 160
    frame #6: 0x000000018b9efa9c libc++abi.dylib`std::__terminate(void (*)()) + 16
    frame #7: 0x000000018b9f2a48 libc++abi.dylib`__cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 36
    frame #8: 0x000000018b9f29f4 libc++abi.dylib`__cxa_throw + 140
    frame #9: 0x00000001004c7e94 libpaho-mqttpp3.1.dylib`mqtt::async_client::reconnect() + 324
    frame #10: 0x0000000100004588 harbor.out`publisher_func(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>) + 292
    frame #11: 0x000000010000e540 harbor.out`decltype(static_cast<void (*>(fp)(static_cast<std::__1::shared_ptr<mqtt::async_client>>(fp0), static_cast<std::__1::shared_ptr<multithr_counter>>(fp0))) std::__1::__invoke<void (*)(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>), std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter> >(void (*&&)(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>), std::__1::shared_ptr<mqtt::async_client>&&, std::__1::shared_ptr<multithr_counter>&&) + 84
    frame #12: 0x000000010000e4a0 harbor.out`void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>), std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>), std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter> >&, std::__1::__tuple_indices<2ul, 3ul>) + 76
    frame #13: 0x000000010000d91c harbor.out`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter>), std::__1::shared_ptr<mqtt::async_client>, std::__1::shared_ptr<multithr_counter> > >(void*) + 84
    frame #14: 0x000000018ba39034 libsystem_pthread.dylib`_pthread_start + 136

Linux (initial PR) - Debian Bullseye

  • running off of master branch (I probably shouldn't, and will likely address after this to run on 1.3.13 or greater)
  • Building with CMake & clang under the hood, version is what is provided in Bullseye:
[bullseye (oldstable)](https://packages.debian.org/bullseye/clang) (devel): C, C++ and Objective-C compiler (LLVM based), clang binary
1:11.0-51+nmu5: amd64 arm64 armel armhf i386 mips64el mipsel ppc64el s390x

@fpagliughi
Contributor

Awesome, thanks. I'll have a look. Probably over the weekend.

But in general, think about what might work better for you: automatic or manual reconnects. It's not too tough to do it manually; many of the example apps demonstrate it. But the auto/backoff algorithm is usable in most situations.
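
For reference, a manual reconnect loop with backoff can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's algorithm: the doubling rule, the function names (next_delay, connect_with_backoff), and the injected try_connect and sleep_fn callables are all hypothetical. The 2 second and 30 second bounds in the usage note come from the automatic_reconnect(seconds(2), seconds(30)) call in the issue description.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>

// Next delay in a doubling backoff, capped at max_delay.
// Doubling is an assumption made for this sketch.
std::chrono::seconds next_delay(std::chrono::seconds current,
                                std::chrono::seconds max_delay) {
    return std::min(std::chrono::seconds(current * 2), max_delay);
}

// Calls try_connect until it returns true or max_attempts is reached.
// Returns the 1-based attempt number that succeeded, or 0 on failure.
// sleep_fn is injected so the wait can be replaced in tests.
int connect_with_backoff(
        const std::function<bool()>& try_connect,
        std::chrono::seconds initial,
        std::chrono::seconds max_delay,
        int max_attempts,
        const std::function<void(std::chrono::seconds)>& sleep_fn) {
    auto delay = initial;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_connect()) {
            return attempt;
        }
        if (attempt < max_attempts) {
            sleep_fn(delay);                      // wait before retrying
            delay = next_delay(delay, max_delay); // grow the wait, up to the cap
        }
    }
    return 0;
}
```

With a real client, try_connect would wrap the connect call in a try/catch and return false on an exception, and sleep_fn would call std::this_thread::sleep_for. Starting at 2 seconds with a 30 second cap, the waits between failed attempts are 2, 4, 8, 16, 30, 30, and so on.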

@lizziemac
Author

Sounds good, thanks! I was having some issues with the connection_lost callback being called for manual (yet autoreconnect working) but I'll take another stab at it.

@fpagliughi
Contributor

fpagliughi commented Dec 8, 2023

I haven't figured out if the C lib fires the "connection lost" callback if it gets a clean disconnect from the server (i.e. receives a DISCONNECT packet). At first I assumed it did, then I thought it didn't. Now I'm not sure. I'm currently trying to set up the test broker to figure it out and/or check the C code to confirm.

That means you may need to set an on_disconnect handler via set_disconnected_handler() to make sure you're always being informed when the connection is lost.

I'll confirm ASAP, since I need to resolve #458.
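
One way to act on the suggestion above, whichever callback the C library ends up firing, is to route every disconnect notification into a single shared flag. The sketch below is a hypothetical helper, not part of Paho: ConnectionState and its method names are made up for illustration, and the int reason code is a simplification. The assumption is that the application would register on_connection_lost with the client's "connection lost" callback and on_disconnected with the handler set via set_disconnected_handler().

```cpp
#include <atomic>
#include <string>

// Hypothetical helper, NOT part of the Paho API.
// Tracks whether the application believes the client is connected,
// no matter which notification path reports the disconnect.
class ConnectionState {
public:
    // Call when a connect or reconnect completes.
    void on_connected() { connected_.store(true); }

    // Call from the "connection lost" callback (unexpected drop).
    void on_connection_lost(const std::string& /*cause*/) {
        connected_.store(false);
    }

    // Call from the disconnected handler (DISCONNECT packet from the server).
    void on_disconnected(int /*reason_code*/) {
        connected_.store(false);
    }

    bool connected() const { return connected_.load(); }

private:
    std::atomic<bool> connected_{false};
};
```

The flag is atomic because these callbacks typically run on a library thread while the application reads the state from its own threads.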

@lizziemac
Author

Thanks! I'll look into implementing that.

@fpagliughi fpagliughi modified the milestones: v1.4, v1.5 Jun 16, 2024

2 participants