The res_pjsip_mwi previously required a reload to set up the proper
subscriptions to allow unsolicited MWI to work. This change
makes it so the act of registering will also cause this to occur.
This is particularly useful if realtime is involved as no reload
needs to occur within Asterisk to cause the MWI information
to get sent.
ASTERISK-25180 #close
Change-Id: Id847b47de4b8b3ab8858455ccc2f07b0f915f252
This resolves two observed race conditions.
First, a bit of background on what the Stasis application does:
1a Creates a stasis_app_control structure. This structure is linked into
a global container and can be looked up using a channel's unique ID.
2a Puts the channel in an event loop. The event loop can exit either
because the stasis_app_control structure has been marked done, or
because of some other factor, such as a hangup. In the event loop, the
stasis_app_control determines if any specific ARI commands need to be
run on the channel and will run them from this thread.
3a Checks if the channel is bridged. If the channel is bridged, then
ast_bridge_depart() is called since channels that are added to Stasis
bridges are always imparted as departable.
4a Unlink the stasis_app_control from the container.
When an ARI command is received by Asterisk, the following occurs
1b A thread is spawned to handle the HTTP request
2b The stasis_app_control(s) that corresponds to the channel(s) in the
request is/are retrieved. If the stasis_app_control cannot be
retrieved, then it is assumed that the channel in question has exited
the Stasis app or perhaps was never in Stasis in the first place.
3b A command is queued onto the stasis_app_control, and the channel's
event loop thread is signaled to run the command.
4b While most ARI commands do nothing further, some, such as adding or
removing channels from a bridge, will block until the command they
issued has been completed by the channel's event loop.
The first race condition that is solved by this patch involves a crash
that can occur due to faulty detection of the channel's bridged status
in step 3a. What can happen is that in step 2a, the event loop may run
the ast_bridge_impart() function to asynchronously place the channel
into a bridge, then immediately exit the event loop because the channel
has hung up. In step 3a, we would detect that the channel was not
bridged and would not call ast_bridge_depart(). The reason that the
channel did not appear to be bridged was that the depart_thread that is
spawned by ast_bridge_impart() had not yet started. That is the thread
where the channel is marked as being bridged. Since we did not call
ast_bridge_depart(), the Stasis application would exit, and then the
channel would be destroyed Then the depart_thread would start up and
try to manipulate the destroyed channel, causing a crash.
The fix for this is to switch from using ast_channel_is_bridged() to
checking the NULLity of ast_channel_internal_bridge_channel() to
determine if ast_bridge_depart() needs to be called. The channel's
internal bridge_channel is set when ast_bridge_impart() is called and
is NULLed by the call to ast_bridge_depart(). If the channel's internal
bridge_channel is non-NULL, then the channel must have been imparted
into the bridge and needs to be departed, even if the actual bridging
operation has not yet started. By departing the channel when necessary,
the thread that is running the Stasis application will block until the
bridge gives the okay that the depart_thread has exited.
The second race condition that is solved by this patch involves a leak
of HTTP handler threads. The problem was that step 2b would successfully
retrieve a stasis_app_control structure. Then step 2a would exit the
channel from the event loop due to a hangup. Steps 3a and 4a would
execute, and then finally steps 3b and 4b would. The problem is that at
step 4b, when attempting to add a channel to a bridge, the thread would
block forever since the channel would never execute the queued command
since it was finished with the event loop. This meant that the HTTP
handling thread would be leaked, along with any references that thread
may have owned (in my case, I was seeing bridges leaked).
The fix for this is to hone in better on when the channel has exited the
event loop. The stasis_app_control structure has an is_done field that
is now set at each point where the channel may exit the event loop. If
step 2b retrieves a valid stasis_app_control structure but the control
is marked as done, then the attempted operation exits immediately since
there will be nothing to service the attempted command.
ASTERISK-25091 #close
Reported by Ilya Trikoz
Change-Id: If66265b73b4c9f8f58599124d777fedc54576628
To prevent confusion I am removing the prefetch option until such
time as it is implemented. All other functionality, however, has
been implemented.
ASTERISK-25067
Change-Id: I9ce6aa3e5c6c5bc3c5baa8ff90fa036d73939895
This event was added some time ago in order to clarify when a channel
took the place of another channel in a parking lot. However, there was
no XML documentation added for the event. This patch adds the XML
documentation.
ASTERISK-24900 #close
Reported by Rusty Newton
Change-Id: I4cfe7777c4b94bbff91c9221c6096a7a02a92eac
Some phones send g.726 audio packed for AAL2, which differs from what is
recommended by RFC 3351. If Asterisk receives audio formatted as such when
negotiating g.726 then it sounds a bit distorted. Added an option to
res_pjsip_endpoint that allows g.726 negotiated audio to be treated as g.726
AAL2 packed.
ASTERISK-25158 #close
Reported by: Steve Pitts
Change-Id: Ie7e21f75493d7fe53e75e12c971e72f5afa33615
All send/receive processing for a SIP transaction needs to be done under
the same threadpool serializer to prevent reentrancy problems inside
pjproject when using an external DNS resolver to process messages for the
transaction.
* Add threadpool API call to get the current serializer associated with
the worker thread.
* Pick a serializer from a pool of default serializers if the caller of
res_pjsip.c:ast_sip_push_task() does not provide one.
This is a simple way to ensure that all outgoing SIP request messages are
processed under a serializer. Otherwise, any place where a pushed task is
done that would result in an outgoing out-of-dialog request would need to
be modified to supply a serializer. Serializers from the default
serializer pool are picked in a round robin sequence for simplicity.
A side effect is that the default serializer pool will limit the growth of
the thread pool from random tasks. This is not necessarily a bad thing.
* Made pjsip_resolver.c use the requesting thread's serializer to execute
the async callback.
* Made pjsip_distributor.c save the thread's serializer name on the
outgoing request tdata struct so the response can be processed under the
same serializer.
ASTERISK-25115 #close
Reported by: John Bigelow
Change-Id: Iea71c16ce1132017b5791635e198b8c27973f40a
* Fix query_set destruction before we are done kicking the queries off.
* Fixed no queries requested handling.
* Add empty queries request unit test.
* Added missing allocation check in ast_dns_query_set_add().
* Made initial pjsip resolving query vector slightly larger.
ASTERISK-25115
Reported by: John Bigelow
Change-Id: Ie8be8347d0992e93946d72b6e7b1299727b038f2
This patch fixes use-after-free bugs caught by AddressSanitizer.
1. PJSIP transport manager may decide to destroy transport on its own.
For example, when the contact registered via websocket has not renewed
its registration in time. The transport was destoyed, but the websocket
listener thread was still active until the socket closes, and then tried
to call transport_shutdown on transport that has been freed.
Also, the transport destructor accessed wstransport->rdata.tp_info.pool
right after freeing memory that contained wstransport itself.
This patch converts transport to an ao2 object, allowing it to be
refcounted, so that it is available until both websocket listener and
pjsip transport manager are finished with it.
2. The websocket listener deletes the last reference on websocket session
when the tcp connection is closed, and it gets destroyed, but
the transport manager may still use it, for example when disconnect
happens in the middle of a SIP transaction.
A new reference to websocket session has been added that is released
with the transport to prevent this.
ASTERISK-25096 #close
Reported by: Josh Kitchens
ASTERISK-24963 #close
Reported by: Badalian Vyacheslav
Change-Id: Idc0b63eb6e459c1ddfb2430127d34b3c4d8d373b
* Add some type casting so tv_usec can really be a long, instead of
some strange platform specific type.
* Add some .dylib style files to .gitignore.
* Switch from using -Xlinker to -Wl,. For [reasons unknown][], newer
versions of GCC, when compiling the Homebrew formula for Asterisk,
are not properly passing the -Xlinker options to the linker. Given
that -Wl, does exactly the [same thing][], and does it properly, this
patch changes the -Xlinker options to use -Wl, instead.
[reasons unknown]: http://bit.ly/1SUbEYx
[same thing]: https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html
Change-Id: Id5e6b3c6cc86282ea5fca630dc3991137c5bf4dd
This change implements the expire_on_reload option for memory caches.
If enabled and a reload is performed all objects within the cache
will be expired and the cache emptied.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: Id46aa1957d660556700e689e195eed57c989b85e
This change adds a CLI command which can perform memory cache thrashing as well
as unit tests which perform thrashing under the following configurations:
1. Low number of unique objects that go stale after 1 second
2. Low number of unique objects that expire after 1 second
3. Low number of unique objects which are constantly updated
4. Large number of unique objects which exceed a defined cache size
5. Large number of unique objects which exceed a defined cache size
that also expire and go stale rapidly
6. Large number of unique objects which expire and go stale rapidly
7. Large number of unique objects
For all of the above there are a large number of threads constantly
attempting to retrieve random objects and each test runs for a few
seconds.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: I8c8ceff977332c80ed4a31f10d694d48552b2f78
This change adds a testsuite event for when a refresh occurs.
This is useful as it provides a guaranteed mechanism of knowing when
it has occurred instead of waiting an arbitrary amount of time.
ASTERISK-25067
Reported by: Matt Jordan
Change-Id: Iaa6b8d2d6bab7f99ee08e1c8908b8272a8987e65
It is possible to receive incoming requests or responses after the channel
on an ast_sip_session has been destroyed and NULLed out. Handlers of these
sorts of requests or responses need to be prepared for the possibility
that the channel is NULL or else they could cause a crash.
While several places have been amended to deal with NULL channels, there
were still a couple of places that needed updating.
res_pjsip_dtmf_info.c: When handling incoming INFO requests, we need to
return early if there is no channel on the session.
res_pjsip_session.c: When handling a 302 response, we need to stop the
redirecting attempt if there is no channel on the session.
ASTERISK-25148 #close
reported by Mark Michelson
Change-Id: Id1a75ffc3d0eaa168b0b28188fb54d6cf9fc47a9
contact_apply_handler calls ast_res_pjsip_find_or_create_contact_status
to force the creation of a contact_status object whenever a new
contact is added but it didn't unref the returned object.
Added an ao2_cleanup(status) to plug the leak.
ASTERISK-25141
Change-Id: Icc1401cae142855a1abc86ab5179dfb3ee861c40
Reported-by: Corey Farrell
app_control_register_rule and app_control_unregister_rule lock/unlock
the queue, which is a mutating operation according to the
ao2_lock/_unlock prototype. Depending on the specific (implicit) casts
in SCOPED_LOCK and RAII_VAR, the compiler may warn or not. As the only
callers of those functions do not have the const, get consistent results
by just dropping it.
Change-Id: Ib9e6296155a39bc5d627142a3828180c3cfe8fbb
When the remote peer requires authentication for in-dialog requests then
re-INVITEs to the peer cause the call to be disconnected and other
in-dialog requests to the peer like MESSAGE just don't go through.
* Made session_inv_on_tsx_state_changed() handle in-dialog authentication
for re-INVITEs and other methods. Initial INVITEs cannot be handled here
because the INVITE transaction must be restarted earlier.
* Pulled needed code from res/res_pjsip/pjsip_outbound_auth.c in
preparation for removing the file. The generic outbound authentication
code did not work as well as anticipated.
* Created outbound_invite_auth() to only handle initial outbound INVITEs.
Re-INVITEs cannot be handled here. The re-INVITE transaction is still in
progress and the PJSIP library cannot handle the overlapping INVITE
transactions. Other method types should not be handled here as this code
only works on outgoing calls and we need to handle incoming and outgoing
calls.
ASTERISK-25131 #close
Reported by: Richard Mudgett
Change-Id: I12bdd7ddccc819b4ce4b091e826d1e26334601b0
The loop to find the first available contact of an endpoint grabbed
contact from the iterator, then checked for offline state. This
caused the first contact after the state was found to leak a reference.
ASTERISK-25141
Change-Id: Id0f1d87410fc63742db0594eb4b18b36e99aec08
When permanent_uri_handler was creating the contact status
object for each contact, it wasn't unreffing it at the
end of the loop.
ASTERISK-25141 #close
Reported-by: Corey Farrell
Change-Id: I7bb127994677bb3d459f87952f8425c9b9967b12
This change adds the following CLI commands and AMI actions:
sorcery memory cache show
sorcery memory cache dump
sorcery memory cache expire
sorcery memory cache stale
SorceryMemoryCacheExpire
SorceryMemoryCacheExpireObject
SorceryMemoryCacheStale
SorceryMemoryCacheStaleObject
These allow both examination and manipulation of sorcery memory
caches from external sources.
Cached objects can be explicitly expired from a cache or marked
as stale. If expired they are immediately removed. If marked as
stale they will be background refreshed when next retrieved.
ASTERISK-25067
Reported by Matt Jordan
Change-Id: I68e03cfd8c34b5e07f4b6ee4fd93a3f4a00a3d9e
This change introduces a check of object_lifetime_stale when retrieving
cached objects. If the amount of time the object has been in the cache
exceeds the lifetime, then a task is scheduled to update the cached
object based on an object retrieved from other sorcery wizards instead.
To prevent the cached object from being retrieved during a refresh,
thread-local storage is used to mark the thread as being a stale object
update. This results in the cache returning no object, leading to
sorcery querying other wizards for the object instead.
A test has been added for stale objects as well. This test ensures that
stale objects are retrieved the same as freshly-cached objects. The test
also ensures that after an object is stale, changes in the backend are
reflected in the cache, to include if the object has been deleted from
the backend.
ASTERISK-25067
Reported by Matt Jordan
Change-Id: I9bd7c049adf6939bfe2899f393c2bfbbf412d217
Add a new ContactStatus AMI event.
Publish the following status/state changes:
Created
Removed
Reachable
Unreachable
Unknown
Contact URI, new status/state, aor and endpoint names, and the
last qualify rtt result are included in the event.
ASTERISK-25114 #close
Change-Id: Id25aae5f7122facba183273efb3e8f36c20fb61e
Reported-by: George Joseph <george.joseph@fairview5.com>
Tested-by: George Joseph <george.joseph@fairview5.com>
Use function PQescapeStringConn for escaping the name of the table and
schema instead of doing it manually.
ASTERISK-25132 #close
Reported By: Rodrigo Ramírez Norambuena <decipher.hk@gmail.com>
Change-Id: I302a263f7210d20925f14716b508b081998b7608
Incoming SIP packets larger than PJSIP_MAX_PKT_LEN were themselves
truncated before passing to pjsip_tpmgr_receive_packet, but the length
was passed unaltered, thus causing memory corruption and segfault.
ASTERISK-25122 #close
Change-Id: I608a6b6b7f229eacc33a0a7d771d18e27e5b08ab
Many uses of stasis_unsubscribe in modules can be reached through unload.
These have been switched to stasis_unsubscribe_and_join.
Some subscription callbacks do nothing, for these I've created a noop
callback function in stasis.c. This is used by some modules that monitor
MWI topics in order to enable cache, since the callback does not become
invalid after dlclose it is safe to use stasis_unsubscribe on these, even
during module unload.
ASTERISK-25121 #close
Change-Id: Ifc2549fbd8eef7d703c222978e8f452e2972189c
In addition to specifying lists of 'presence' and 'message-summary',
users can also create lists of type 'dialog'. These should be treated in
the same fashion as 'presence'.
Change-Id: I583bb69cd9f88b0b29bf09ddaddeac4e84189f6e
When a SUBSCRIBE request is made to a dialplan hint that doesn't exist,
the current NOTICE message informing users of this swaps the context and
extension parameters. This can cause a bit of confusion.
Thanks to CptBurger in #asterisk for helping to point this out.
Change-Id: Ie584d1a58ae217385c87a450ca25b55ca0e36e43
Prior to this patch, when a WebSocket connection is made, ARI would not
be informed of the connection until after the WebSocket layer had
accepted the connection. This created a brief race condition where the
ARI client would be notified that it was connected, a channel would be
sent into the Stasis dialplan application, but ARI would not yet have
registered the Stasis application presented in the HTTP request that
established the WebSocket.
This patch resolves this issue by doing the following:
* When a WebSocket attempt is made, a callback is made into the ARI
application layer, which verifies and registers the apps presented in
the HTTP request. Because we do not yet have a WebSocket, we cannot
have an event session for the corresponding applications. Some
defensive checks were thus added to make the application objects
tolerant to a NULL event session.
* When a WebSocket connection is made, the registered application is
updated with the newly created event session that wraps the WebSocket
connection.
ASTERISK-24988 #close
Reported by: Joshua Colp
Change-Id: Ia5dc60dc2b6bee76cd5aff0f69dd53b36e83f636
This patch refactors the transaction timeout processing to eliminate
calling the lower level public pjsip functions and reverts to calling
pjsip_endpt_send_request again. This is the result of me noticing
a possible incompatibility with pjproject-2.4 which was causing
contact status flapping.
The original version of this feature used the lower level calls to
get access to the tsx structure in order to cancel the transaction
when our own timer expires. Since we no longer have that access,
if our own timer expires before the pjsip timer, we call the callbacks
and just let the pjsip transaction take it's own course. When the
transaction ends, it discovers the callbacks have already been run
and just cleans itself up.
A few messages in pjsip_configuration were also added/cleaned up.
ASTERISK-25105 #close
Change-Id: I0810f3999cf63f3a72607bbecac36af0a957f33e
Reported-by: George Joseph <george.joseph@fairview5.com>
Tested-by: George Joseph <george.joseph@fairview5.com>