A certain situation can result in our attempting to send a NOTIFY on a
destroyed dialog. Say we attempt to send a NOTIFY to a subscriber, but
that subscriber has dropped off the network. We end up retransmitting
that NOTIFY until the appropriate SIP timer says to destroy the NOTIFY
transaction. When the pjsip evsub code is told that the transaction has
been terminated, it responds in kind by alerting us that the
subscription has been terminated, destroying the subscription, and then
removing its reference to the dialog, thus destroying the dialog.
The problem is that when we get told that the subscription is being
terminated, we detect that we have not sent a terminating NOTIFY
request, so we queue up such a NOTIFY to be sent out. By the time that
queued NOTIFY gets sent, the dialog has been destroyed, so attempting to
send that NOTIFY can result in a crash.
The fix being introduced here is actually a reintroduction of something
the pubsub code used to employ. We hold a reference to the dialog and
wait to decrement our reference to the dialog until our subscription
tree object is destroyed. This way, we can send messages on the dialog
even if the PJSIP evsub code wants to terminate earlier than we would
like.
In doing this, some NULL checks for subscription tree dialogs have been
removed since NULL dialogs are no longer actually possible.
Change-Id: I013f43cddd9408bb2a31b77f5db87a7972bfe1e5
When sending a NOTIFY, we lock the dialog and then unlock the dialog
when finished. A recent change made it so that the subscription tree's
dialog pointer will be set NULL when sending the final NOTIFY request
out. This means that when we attempt to unlock the dialog, we pass a
NULL pointer to pjsip_dlg_dec_lock(). The result is that the dialog
remains locked after we think we have unlocked it. When a response to
the NOTIFY arrives, the monitor thread attempts to lock the dialog, but
it cannot because we never released the dialog lock. This results in
Asterisk being unable to process incoming SIP traffic any longer.
The fix in this patch is to use a local pointer to save off the pointer
value of the subscription tree's dialog when locking and unlocking the
dialog. This way, if the subscription tree's dialog pointer is NULLed
out, the local pointer will still have point to the proper place and the
dialog lock will be unlocked as we expect.
Change-Id: I7ddb3eaed7276cceb9a65daca701c3d5e728e63a
The SIP dialog is removed from the subscription tree when the final
NOTIFY is sent. However, after the final NOTIFY is sent, the persistence
update function still attempts to access the cseq from the dialog,
resulting in a crash.
This fix removes the subscription persistence at the same time that the
dialog is removed from the subscription tree. This way, there is no
attempt to update persistence when the subscription is being destroyed.
Change-Id: Ibb46977a6cef9c51dc95f40f43446e3d11eed5bb
There have been crashes seen where a taskprocessor's listener is NULL
unexpectedly.
Looking at backtraces, the problem was specifically seen in PJSIP
serializers.
Subscriptions make the mistake of removing a serializer from a dialog
during subscription tree destruction. Since subscription trees are
reference-counted, guaranteeing the circumstances behind the destruction
are not possible. This makes it so that the dialog serializer can be
removed while not holding the dialog lock. This makes it possible for
the distributor to get a pointer to the dialog serializer and have that
serializer get freed out from under it.
The fix for this is to remove the serializer from a subscription dialog
when sending the final NOTIFY. This guarantees that the serializer is
removed with the dialog lock held. By doing this, we guarantee that if
the distributor gains access to the dialog's serializer, it will not be
possible for the serializer to get freed by another thread.
Change-Id: I21f5dac33529f65cec45679bdace60670800ff66
If an old persistent subscription is recreated but then immediately
destroyed because it is out of date, the subscription tree will have no
leaf subscriptions on it. This was resulting in a crash when attempting
to destroy the subscription tree.
A simple NULL check fixes this problem.
Change-Id: I85570b9e2bcc7260a3fe0ad85904b2a9bf36d2ac
There have been crashes and general instability seen in the pubsub code,
so this patch introduces three changes to increase the stability.
First, the ownership model for subscriptions has been modified. Due to
RLS, subscriptions are stored in memory as a tree structure. Prior to my
patch, the PJSIP subscription was the owner of the subscription tree.
When the PJSIP subscription told us that it was terminating, we started
destroying the subscription tree along with all of the individual leaf
subscriptions that belong to the tree. The problem with this model is
that the two actors in play here, the PJSIP subscription and the
individual leaf subscriptions, need to have joint ownership of the
subscription tree. So now, the PJSIP subscription and the individual
leaf subscriptions each have a reference to the subscription tree. This
way, we will not actually free memory until no players are left that
care. The PJSIP subscription is a bigger stakeholder, in that if the
PJSIP subscription's reference to the subscription tree is removed, the
subscription tree instructs the leaf subscriptions to shut down and drop
their references to the subscription tree when possible. The individual
leaf subscriptions, upon being told to shut down, can drop their stasis
subscriptions or whatever they use to learn of new state, and then drop
their reference to the subscription tree once they are ready to die.
Second, the lifetime of a PJSIP subscription's reference to our
subscription tree has been altered. As I learned from doing a deep dive,
the PJSIP evsub code can tell Asterisk multiple times that the
subscription has been terminated, and not all of these times
are especially helpful. I have altered the message flow that we use for
SIP subscriptions such that we will always drop the PJSIP subscription's
reference to the subscription tree when we send the NOTIFY that
terminates a SIP subscription. This also means that we will now queue
NOTIFY requests to be sent after responding to incoming SUBSCRIBEs so
that we can have predictable state changes from the PJSIP evsub code.
Third, the synchronization of operations has been improved. PJSIP can
call into our code from a serializer thread (e.g. upon receiving an
incoming request) or from the monitor thread (e.g. when a subscription
times out). Because of this, there is the possibility of competing
threads stepping on each other. PJSIP attempts to do some
synchronization on its own by always keeping the dialog lock held when
it calls into us. However, since we end up pushing tasks into the
serializer, the result was that serialized operations were not grabbing
the dialog lock and could, as a result, step on something that was being
attempted by a different thread. Now we ensure that serialized
operations grab the dialog lock, then check for extenuating
circumstances, then proceed with their operation if they can.
Change-Id: Iff2990c40178dad9cc5f6a5c7f76932ec644b2e5
Users of functions which call __ast_str_helper() such as the ones listed
below are likely to not check the return value for failure so ensuring
that the string is always nil terminated is a good safety measure.
ast_str_set_va()
ast_str_append_va()
ast_str_set()
ast_str_append()
Change-Id: I36ab2d14bb6015868b49329dda8639d70fbcae07
In a realtime based system with a limited number of threadpool threads
it is possible for a deadlock to occur. This happens when permanent
endpoint state is updated, which will cause database queries to be done.
These queries may result in URI validation being done which is done
synchronously using a PJSIP thread. If all PJSIP threads are in use
processing traffic they themselves may be blocked waiting to get the
permanent endpoint container lock when identifying an endpoint.
This change moves URI validation to occur at use time instead of
configuration time. While this comes at a cost of not seeing a problem
until you use it it does solve the underlying deadlock problem.
ASTERISK-25486 #close
Change-Id: I2d7d167af987d23b3e8199e4a68f3359eba4c76a
In September 2006, the maximum packetization time (ptime) were set to such a
low value, packetization was disabled for many codecs actually. This was fixed
for many codecs but not for iLBC 30. This enables packetization for iLBC which
can be enabled for example via allow=ilbc:60,gsm,alaw,ulaw in the file sip.conf.
ASTERISK-7803
Change-Id: I2ef90023d35efb7cb8fe96ed74f53f6846ffad12
With Asterisk 13, the structures ast_format and ast_codec changed. Because of
that, the paketization timing (framing) of the RTP channel moved away from the
formats/codecs. In the course of that change, the ptime of the callee was not
honored anymore, when the optional autoframing was enabled.
ASTERISK-25484 #close
Change-Id: Ic600ccaa125e705922f89c72212c698215d239b4
Error response code descriptions may contain wiki markup that need to be
escaped. Without this patch, Confluence will reject the document being sent
and the responsible script will raise an exception.
Change-Id: I21fcb66fee7f6332381f2b99b1b0195dff215ee5
When ab803ec342 was committed, it accidentally forgot to actually *add* the
HOLD_INTERCEPT function. This highlights two interesting points:
* Gerrit forces you to put the patch as it is going to into the repo up for
review, which Review Board did not. Yay Gerrit.
* No one apparently bothered to use this feature, or else they don't know about
it. I'm going to go with the latter explanation.
ASTERISK-24922
Change-Id: Ida38278f259dd07c334a36f9b7d5475b5db72396
This patch adds the functions
ast_cdr_modifier_register()
ast_cdr_modifier_unregister()
That work much like ast_cdr_register() and ast_cdr_unregister().
Modules registered will be given a chance to modify (or to do whatever
they want) CDR fields just before they are passed to registered engines.
Thus, for instance, if a module change the "userfield" field of a CDR,
the modified value will be passed to every registered CDR backend for
logging.
ASTERISK-25479 #close
Change-Id: If11d8fd19ef89b1a66ecacf1201e10fcf86ccd56
This patch adds some minor tweaks for autosupport to update it for Asterisk 13.
This includes:
* Finally removing most references to Zaptel
* Adding support for some additional 'core' commands, and fixing nomenclature
that generally hasn't been used for some time
* Adding some PJSIP/SIP commands to gather endpoints/peers and active channels
Change-Id: Ic997b418cbd9313588b6608e50f47b0ce6f4f1f1
(cherry picked from commit 9fc9777fa3)
In app_queue added value Paused Reason on QueueMemberStatus when a member
on queue is paused and the reason was set.
ASTERISK-25480 #close
Reporte by: Rodrigo Ramírez Norambuena
Change-Id: Ia5db503482f50764c15e2020196c785f59d4a68e
ASTERISK-25308 fixed a memory leak in res_ari_events.c, but
this file is regenerated by a template and the template was
not fixed.
Change-Id: Ied4c6deae89d21f87f9cf99676b1d055aa83b38b
On v13, loading several thousand PJSIP endpoints on Asterisk start causes
a deadlock most of the time.
Thanks to mdu113 for discovering that there was a call to pgsql_exec() not
protected by the pgsql_lock reentrancy lock.
{quote}
I believe a code path exists that attempts to use pgsql connection without
locking pgsql_lock. I believe what happens during that deadlock that I
see is two concurrent threads are both attempting to send query to pgsql,
one of the thread is using a code path without locking pgsql_lock. If
they managed to send queries at the same time, it seems postgres ignores
one of the queries and replies only to the one of them. If it happens so
that the thread holding the lock didn't receive the reply it will wait for
it (and hold the lock) forever (or at least for very long time), thus
completely blocking all access to db.
{quote}
* Added missing reentrancy locking around pgsql_exec() in find_table().
* Moved unlock of pgsql_lock in unload_module() to avoid locking inversion
between the psql_tables list lock and the pgsql_lock.
ASTERISK-25455 #close
Reported by: mdu113
Patches:
res_config_pgsql.c-connlock2.diff (license #5543) patch uploaded by mdu113
Change-Id: Id9e7cdf8a3b65ff19964b0cf942ace567938c4e2
To quote Olle:
"When issuing a hangup due to RTP timeouts the cause code is not set. I have
selected 44 based on Cisco's implementation..."
ASTERISK-25135 #close
Reported by: Olle Johansson
patches:
rtp-timeout-cause-1.8.diff uploaded by Olle Johansson (License 5267)
Change-Id: Ia62100c55077d77901caee0bcae299f8dc7375fc
The memory corruption could happen if the [section](+) is the last section
in the file with trailing comments. In this case process_text_line() has
left *last_cat is set to newcat and newcat is destroyed.
Change-Id: I0d1d999f553986f591becd000e7cc6ddfb978d93
An #include right after a [section](+) would associate any variable
assignments before a new section in the #include with the wrong section.
* Fix section association by setting the current section to the appended
section.
* Fix '+' and '!' section flag interaction corner case depending upon
which flag came first. If the '!' came first then it would be ignored.
If the '!' came after then it would affect the appended section. The '!'
will now no longer be ignored.
ASTERISK-25461 #close
Reported by: Sean Pimental
Change-Id: Ic9d3191c8758048e2cbce6432f854b32531731c3
Versions of libunbound before v1.4.21 do not compile with Asterisk.
However, since v1.4.21 has a configure script bug that fails to detect the
ldns library (which is fixed in v1.4.22) and v1.4.22 is not an easily
detectable version we will require v1.5.0 as a minimum version of the
library to work with Asterisk.
ASTERISK-25108 #close
Reported by: Richard Mudgett
Change-Id: Ieb228bfb01467573fc121c7356a9dde27128894d
Wrote the skeleton framework for the Asterisk StatsD dialplan
application. This includes a load function, unload function, a
callback for execution, and XML documentation.
ASTERISK-25419
Reported By: Ashley Sanders
Change-Id: I9597730e134c6e82c8a55ef4d5334b62dd473363
The struct send_request_wrapper has a pjsip lock associated with it that
is created non-recursive. There is a code path for the struct
send_request_wrapper lock that will attempt to lock it recursively. The
reporter's deadlock showed that the thread calling endpt_send_request()
deadlocked itself right after the wrapper object got created.
Out-of-dialog requests such as MESSAGE, qualify OPTIONS, and unsolicited
MWI NOTIFY messages can hit this deadlock.
* Replaced the struct send_request_wrapper pjsip lock with the mutex lock
that can come with an ao2 object since all of Asterisk's mutexes are
recursive. Benefits include removal of code maintaining the pjsip
non-recursive lock since ao2 objects already know how to maintain their
own lock and the lock will show up in the CLI "core show locks" output.
ASTERISK-25435 #close
Reported by: Dmitriy Serov
Change-Id: I458e131dd1b9816f9e963f796c54136e9e84322d
In ast_rtp_read, the value of the variable 'mark' which we try to assign to a
frame->subclass.frame_ending may be 0, 1 or (1<<23), but we should translate
it to 0 or 1.
ASTERISK-25451 #close
Change-Id: I53bdf5c026041730184a6a809009c028549ce626
Return AST_PRESENCE_NOT_SET when CustomPresence AstDB key does not
exist, i.e. when a new CustomPresence is added in the dialplan.
ASTERISK-25400 #close
Reported by: Andrew Nagy
Change-Id: I6fb17b16591b5a55fbffe96f3994ec26b1b1723a
When we decide we will no longer schedule an RTCP write, we remove the
reference to the RTP instance, then assign -1 to the stored scheduler ID
in case something else comes along and wants to see if anything is scheduled.
That scheduler ID is on the RTP instance. After 60a9172d7e was merged to
fix the regression introduced by 3cf0f29310, this improper assignment on a
potentially destroyed object started getting tripped on the build agents.
Frankly, this should have been crashing a lot more often earlier. I can only
assume that the timing was changed just enough by both changes to start
actually hitting this problem.
As it is, simply moving the assignment prior to the ao2 deference is sufficient
to keep the RTP instance from being referenced when it is very, truly,
aboslutely dead.
(Note that it is still good practice to assign -1 to the scheduler ID when we
know we won't be scheduling it again, as the ao2 deref *may* not always destroy
the ao2 object.)
ASTERISK-25449
Change-Id: Ie6d3cb4adc7b1a6c078b1c38c19fc84cf787cda7
If a Via header containes an IPv6 address and a port number is ommitted,
as it is the standard port, we now leave the port empty and to not set it
to the value after the first colon of the IPv6 address.
ASTERISK-25443 #close
Change-Id: Ie3c2f05471cd006bf04ed15598589c09577b1e70
Apparently some endpoints attempt to send a reINVITE before completing the
initial INVITE transaction. In this case PJSIP responds appropriately to
the reINVITE with a 491 INVITE request pending. Unfortunately chan_pjsip
is using the initial INVITE transaction state to determine if an INVITE is
the initial INVITE or a reINVITE. Since the initial INVITE transaction
has not been confirmed yet chan_pjsip thinks the reINVITE is an initial
INVITE and starts another PBX thread on the channel. The extra PBX thread
ensures that hilarity ensues.
* Fix checks for a reINVITE on incoming requests to look for the presence
of a to-tag instead of the initial INVITE transaction state.
* Made caller_id_incoming_request() determine what to do if there is a
channel on the session or not. After a channel is created it is too late
to just store the new party id on the session because the session's party
id has already been copied to the channel's caller id.
ASTERISK-25404 #close
Reported by: Chet Stevens
Change-Id: Ie78201c304a2b13226f3a4ce59908beecc2c68be
When 5c713fdf18 was merged, it allowed for scheduled items to have an ID of
'0' returned. While this was valid per the documentation for the API, it was
apparently never returned previously. As a result, several users of the
scheduler API viewed the result as being invalid, causing them to reschedule
already scheduled items or otherwise fail in interesting ways.
This patch corrects the users such that they view '0' as valid, and a returned
ID of -1 as being invalid.
Note that the failing HEP RTCP tests now pass with this patch. These tests
failed due to a duplicate scheduling of the RTCP transmissions.
ASTERISK-25449 #close
Change-Id: I019a9aa8b6997584f66876331675981ac9e07e39
Some systems require the REFER packet to include a Referred-By header.
If the channel variable SIPREFERREDBYHDR is set, it passes that value as the
Referred-By header value. Otherwise, it adds the current dialog’s local info.
Reported by: Dan Cropp
Tested by: Dan Cropp
Change-Id: I3d17912ce548667edf53cb549e88a25475eda245
When GetConfigJSON was introduced back in 1.6, it returned each
section as an array of strings: ["key=value", "key2=value2"].
Afterwards, it was changed a few times and became
["key": "value", "key2": "value2"], which is not a correct JSON.
This patch fixes that by constructing a JSON object {} instead of
an array [].
Also, the keys "istemplate" and "tempates" that are used to
indicate templates and their inherited categories are now wrapped in
quotes.
ASTERISK-25391 #close
Reported by: Bojan Nemčić
Change-Id: Ibbe93c6a227dff14d4a54b0d152341857bcf6ad8