RiakCS get/put object "overload" error #1340

Open
tartemov opened this issue Jul 18, 2017 · 1 comment


@tartemov

We use RiakCS 2.1.0 in our project. Our cluster handles about 600 requests per second.
Over the last month, however, we have been seeing these errors in the RiakCS error.log:

2017-07-18 15:40:01.586 [error] <0.3884.892> gen_fsm <0.3884.892> in state waiting_update_command terminated with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289
2017-07-18 15:40:01.586 [error] <0.3884.892> CRASH REPORT Process <0.3884.892> with 1 neighbours exited with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in gen_fsm:terminate/7 line 622
2017-07-18 15:40:01.586 [error] <0.1575.0> Supervisor riak_cs_put_fsm_sup had child undefined started with {riak_cs_put_fsm,start_link,undefined} at <0.1266.888> exit with reason no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in context child_terminated
2017-07-18 15:40:01.587 [error] <0.22414.830> gen_server <0.22414.830> terminated with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289
2017-07-18 15:40:01.587 [error] <0.5694.893> Webmachine error at path "/buckets/integration/objects/7e3e3fab-2345-4e67-890b-7d7381e5403f" : {error,{exit,{{function_clause,[{riakc_obj,get_update_metadata,[<<"overload">>],[{file,"src/riakc_obj.erl"},{line,289}]},{riak_cs_manifest,manifests_from_riak_object,1,[{file,"src/riak_cs_manifest.erl"},{line,62}]},{riak_cs_manifest_fsm,update,6,[{file,"src/riak_cs_manifest_fsm.erl"},{line,389}]},{riak_cs_manifest_fsm,get_and_update,4,[{file,"src/riak_cs_manifest_fsm.erl"},{line,319}]},{riak_cs_manifest_fsm,waiting_update_command,3,[{file,"src/riak_cs_manifest_fsm.erl"},{line,225}]},{gen_fsm,...},...]},...},...}} in gen_fsm:sync_send_event/3 line 214
2017-07-18 15:40:01.587 [error] <0.22414.830> CRASH REPORT Process <0.22414.830> with 0 neighbours exited with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in gen_server:terminate/6 line 744
2017-07-18 15:40:02.299 [error] <0.3194.895> gen_fsm <0.3194.895> in state waiting_update_command terminated with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289
2017-07-18 15:40:02.299 [error] <0.3194.895> CRASH REPORT Process <0.3194.895> with 1 neighbours exited with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in gen_fsm:terminate/7 line 622
2017-07-18 15:40:02.300 [error] <0.1575.0> Supervisor riak_cs_put_fsm_sup had child undefined started with {riak_cs_put_fsm,start_link,undefined} at <0.30674.898> exit with reason no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in context child_terminated
2017-07-18 15:40:02.300 [error] <0.5694.893> Webmachine error at path "/buckets/integration/objects/7e3e3fab-2345-4e67-890b-7d7381e5403f" : {error,{exit,{{function_clause,[{riakc_obj,get_update_metadata,[<<"overload">>],[{file,"src/riakc_obj.erl"},{line,289}]},{riak_cs_manifest,manifests_from_riak_object,1,[{file,"src/riak_cs_manifest.erl"},{line,62}]},{riak_cs_manifest_fsm,update,6,[{file,"src/riak_cs_manifest_fsm.erl"},{line,389}]},{riak_cs_manifest_fsm,get_and_update,4,[{file,"src/riak_cs_manifest_fsm.erl"},{line,319}]},{riak_cs_manifest_fsm,waiting_update_command,3,[{file,"src/riak_cs_manifest_fsm.erl"},{line,236}]},{gen_fsm,...},...]},...},...}} in gen_fsm:sync_send_event/3 line 214
2017-07-18 15:40:02.300 [error] <0.32743.900> gen_server <0.32743.900> terminated with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289
2017-07-18 15:40:02.300 [error] <0.32743.900> CRASH REPORT Process <0.32743.900> with 0 neighbours exited with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289 in gen_server:terminate/6 line 744

How can we determine the cause of the "overload" error?

@Bob-The-Marauder

The Riak CS logs indicate that Riak KV nodes are entering the overload state. This state is governed by the {fsm_limit, 50000} configuration option in Riak KV's app.config, or by max_concurrent_requests in Riak KV's riak.conf (this setting is hidden by default). If more than 50000 FSMs are active on a local Riak node, the node returns <<"overload">> for all requests. In Riak CS, log entries like the following indicate this return value:

2017-07-18 15:40:01.586 [error] <0.3884.892> gen_fsm <0.3884.892> in state waiting_update_command terminated with reason: no function clause matching riakc_obj:get_update_metadata(<<"overload">>) line 289

This error returns a 503 to the client. Once the FSM count on the Riak KV node drops below 50000, the node exits the overload state and returns to normal operation. You can increase max_concurrent_requests above 50000, even all the way to "infinite", but doing so removes overload protection. If you do increase max_concurrent_requests, make sure to also set erlang.process_limit to at least three times the value of max_concurrent_requests.
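As a rough illustration of the sizing rule above, here is a minimal Python sketch that derives the smallest erlang.process_limit for a chosen max_concurrent_requests and renders both as riak.conf-style `key = value` lines. The function names and the example value of 100000 are hypothetical and only for illustration; check the rendered lines against your own Riak KV version before applying anything.

```python
def overload_settings(max_concurrent_requests: int) -> dict:
    """Pair a max_concurrent_requests value with the minimum process limit.

    Applies the "at least three times" rule from the comment above.
    The function name is hypothetical, not part of any Riak tooling.
    """
    if max_concurrent_requests <= 0:
        raise ValueError("max_concurrent_requests must be a positive integer")
    return {
        "max_concurrent_requests": max_concurrent_requests,
        # Lower bound only; a larger value is also acceptable.
        "erlang.process_limit": 3 * max_concurrent_requests,
    }


def render_riak_conf(settings: dict) -> str:
    """Render the settings as riak.conf-style 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


if __name__ == "__main__":
    # 100000 is an arbitrary example value, not a recommendation.
    print(render_riak_conf(overload_settings(100000)))
```

The result is a lower bound only, so an existing erlang.process_limit that is already higher can be left alone.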

Given the comparatively low load you have, this may be related to #828, but the chances are that a rolling restart will correct your issue.
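To see how often and when the nodes report overload, one option is to count the log entries that mention <<"overload">> per second. The following is a minimal Python sketch, assuming the timestamp format shown in the log excerpt above. The function name is hypothetical, and the log path is supplied by you on the command line. The text is split at timestamps rather than at newlines, so the count still works if several entries were pasted onto one line.

```python
import re
import sys
from collections import Counter

# Timestamp prefix as it appears in the excerpt, e.g.
# "2017-07-18 15:40:01.586 [error]"; the group keeps second resolution.
ENTRY_START = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{3} \[error\]"
)
OVERLOAD_MARKER = '<<"overload">>'


def overload_counts_per_second(log_text: str) -> Counter:
    """Count error entries mentioning <<"overload">>, keyed by second."""
    counts = Counter()
    # re.split with one capturing group yields
    # [prefix, ts1, body1, ts2, body2, ...].
    parts = ENTRY_START.split(log_text)
    for timestamp, body in zip(parts[1::2], parts[2::2]):
        if OVERLOAD_MARKER in body:
            counts[timestamp] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_overload.py /path/to/your/error.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        per_second = overload_counts_per_second(handle.read())
    for timestamp in sorted(per_second):
        print(timestamp, per_second[timestamp])
```

Bursts that line up with traffic peaks would point to genuine load, while a steady stream at low traffic would be more consistent with the restart suggestion above.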
