taskprocessor: Enable subsystems and overload by subsystem

To prevent one subsystem's taskprocessors from causing others
to stall, new capabilities have been added to taskprocessors.

* Any taskprocessor name that has a '/' will have the part
  before the '/' saved as its "subsystem".
  Examples:
  "sorcery/acl-0000006a" and "sorcery/aor-00000019"
  will be grouped to subsystem "sorcery".
  "pjsip/distributor-00000025" and "pjsip/distributor-00000026"
  will be grouped to subsystem "pjsip".
  Taskprocessors with no '/' have an empty subsystem.

* When a taskprocessor enters high-water alert status and it
  has a non-empty subsystem, the subsystem alert count will
  be incremented.

* When a taskprocessor leaves high-water alert status and it
  has a non-empty subsystem, the subsystem alert count will be
  decremented.

* A new API, ast_taskprocessor_get_subsystem_alert(), has been
  added that returns the number of taskprocessors in alert for
  the subsystem.

* A new CLI command "core show taskprocessor alerted subsystems"
  has been added.

* A new unit test was added.
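
The grouping and counting rules above can be sketched in a few lines of
standalone C. This is an illustration only, not the code from this commit:
the fixed-size table, the size limits, and the names subsystem_from_name(),
taskprocessor_alert_changed() and subsystem_alert_get() are hypothetical,
and splitting at the first '/' is an assumption (the message only says
"the part before the '/'").

```c
#include <stddef.h>
#include <string.h>

#define MAX_SUBSYSTEMS 8       /* illustrative limit, not from the commit */
#define SUBSYSTEM_NAME_LEN 32  /* illustrative limit, not from the commit */

struct subsystem_alert {
    char name[SUBSYSTEM_NAME_LEN];
    unsigned int alert_count;
};

struct subsystem_alert alerts[MAX_SUBSYSTEMS];
size_t alerts_used;

/* Copy the part of name before the first '/' into buf.
 * A name with no '/' yields an empty subsystem. */
const char *subsystem_from_name(const char *name, char *buf, size_t len)
{
    const char *slash = strchr(name, '/');
    size_t n = slash ? (size_t)(slash - name) : 0;

    if (len == 0) {
        return buf;
    }
    if (n >= len) {
        n = len - 1;
    }
    memcpy(buf, name, n);
    buf[n] = '\0';
    return buf;
}

/* Look up a subsystem entry, optionally creating it. */
struct subsystem_alert *find_subsystem(const char *subsystem, int create)
{
    size_t i;

    for (i = 0; i < alerts_used; i++) {
        if (strcmp(alerts[i].name, subsystem) == 0) {
            return &alerts[i];
        }
    }
    if (!create || alerts_used == MAX_SUBSYSTEMS) {
        return NULL;
    }
    strncpy(alerts[alerts_used].name, subsystem, SUBSYSTEM_NAME_LEN - 1);
    alerts[alerts_used].name[SUBSYSTEM_NAME_LEN - 1] = '\0';
    alerts[alerts_used].alert_count = 0;
    return &alerts[alerts_used++];
}

/* Called when a taskprocessor enters (entering != 0) or leaves
 * (entering == 0) high-water alert status. */
void taskprocessor_alert_changed(const char *tps_name, int entering)
{
    char subsystem[SUBSYSTEM_NAME_LEN];
    struct subsystem_alert *entry;

    subsystem_from_name(tps_name, subsystem, sizeof(subsystem));
    if (subsystem[0] == '\0') {
        return; /* empty subsystem: nothing to count */
    }
    entry = find_subsystem(subsystem, entering);
    if (!entry) {
        return;
    }
    if (entering) {
        entry->alert_count++;
    } else if (entry->alert_count > 0) {
        entry->alert_count--;
    }
}

/* Number of taskprocessors currently in alert for a subsystem. */
unsigned int subsystem_alert_get(const char *subsystem)
{
    struct subsystem_alert *entry = find_subsystem(subsystem, 0);

    return entry ? entry->alert_count : 0;
}
```

In this sketch, reporting alerts for "pjsip/distributor-00000025" and
"pjsip/distributor-00000026" leaves the "pjsip" count at 2, while a name
without a '/' changes no counter. A real implementation would also need
locking, which is omitted here.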

REMINDER: The taskprocessor code itself doesn't take any action
based on high-water alerts or overloading.  It's up to taskprocessor
users to check and take action themselves.  Currently only the pjsip
distributor does this.

* A new pjsip/global option "taskprocessor_overload_trigger"
  has been added that allows the user to select the trigger
  mechanism the distributor uses to pause accepting new requests.
  "none": Don't pause on any overload condition.
  "global": Pause on ANY taskprocessor overload (the default and
  current behavior)
  "pjsip_only": Pause only on pjsip taskprocessor overloads.

* The core pjsip pool was renamed from "SIP" to "pjsip" so it can
  be properly grouped into the "pjsip" subsystem.

* Stasis taskprocessor names were changed to use "stasis" as the
  subsystem.

* Sorcery core taskprocessor names were changed to "sorcery" to
  match the object taskprocessors.
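
The three "taskprocessor_overload_trigger" values described above map
naturally onto a small decision function. The following standalone sketch
shows one way a taskprocessor user might apply them. It is not the
distributor's actual code: the enum, struct alert_snapshot,
parse_overload_trigger() and should_pause() are hypothetical names, the
two counters stand in for whatever the real alert queries return, and
case-sensitive matching of the option value is an assumption.

```c
#include <string.h>

enum overload_trigger {
    TRIGGER_NONE,       /* "none": never pause */
    TRIGGER_GLOBAL,     /* "global": pause on any taskprocessor overload */
    TRIGGER_PJSIP_ONLY  /* "pjsip_only": pause only on pjsip overloads */
};

/* Hypothetical snapshot of alert counts taken by the caller. */
struct alert_snapshot {
    unsigned int global_alerts; /* taskprocessors in alert, all subsystems */
    unsigned int pjsip_alerts;  /* taskprocessors in alert in "pjsip" */
};

/* Map an option string to a trigger. Returns 0 on success, -1 if the
 * value is not one of the three documented names. */
int parse_overload_trigger(const char *value, enum overload_trigger *out)
{
    if (strcmp(value, "none") == 0) {
        *out = TRIGGER_NONE;
    } else if (strcmp(value, "global") == 0) {
        *out = TRIGGER_GLOBAL;
    } else if (strcmp(value, "pjsip_only") == 0) {
        *out = TRIGGER_PJSIP_ONLY;
    } else {
        return -1;
    }
    return 0;
}

/* Decide whether new requests should be paused under the given trigger. */
int should_pause(enum overload_trigger trigger, const struct alert_snapshot *snap)
{
    switch (trigger) {
    case TRIGGER_GLOBAL:
        return snap->global_alerts > 0;
    case TRIGGER_PJSIP_ONLY:
        return snap->pjsip_alerts > 0;
    case TRIGGER_NONE:
    default:
        return 0;
    }
}
```

With one non-pjsip taskprocessor in alert, this sketch pauses under
"global" but not under "pjsip_only" or "none", which is the difference in
scope the option is meant to expose.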

Change-Id: I8c19068bb2fc26610a9f0b8624bdf577a04fcd56
George Joseph
2019-02-15 11:53:50 -07:00
parent 8681fc9db7
commit c2adeb9dc2
13 changed files with 523 additions and 10 deletions

@@ -1908,6 +1908,26 @@
<configOption name="send_contact_status_on_update_registration" default="no">
<synopsis>Enable sending AMI ContactStatus event when a device refreshes its registration.</synopsis>
</configOption>
<configOption name="taskprocessor_overload_trigger">
<synopsis>Trigger scope for taskprocessor overloads</synopsis>
<description><para>
This option specifies the trigger the distributor will use for
detecting taskprocessor overloads. When it detects an overload condition,
the distributor will stop accepting new requests until the overload is
cleared.
</para>
<enumlist>
<enum name="global"><para>(default) Any taskprocessor overload will trigger.</para></enum>
<enum name="pjsip_only"><para>Only pjsip taskprocessor overloads will trigger.</para></enum>
<enum name="none"><para>No overload detection will be performed.</para></enum>
</enumlist>
<warning><para>
The "none" and "pjsip_only" options should be used
with extreme caution and only to mitigate specific issues.
Under certain conditions they could make things worse.
</para></warning>
</description>
</configOption>
</configObject>
</configFile>
</configInfo>
@@ -5298,7 +5318,7 @@ static int load_module(void)
/* The serializer needs threadpool and threadpool needs pjproject to be initialized so it's next */
sip_get_threadpool_options(&options);
options.thread_start = sip_thread_start;
-sip_threadpool = ast_threadpool_create("SIP", NULL, &options);
+sip_threadpool = ast_threadpool_create("pjsip", NULL, &options);
if (!sip_threadpool) {
goto error;
}