This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
ASTERISK-25023
Change-Id: I9c11b9d597468f63916c99e1dabff9f4a46f84c1
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
* Fix clearing autokillid in __sip_autodestruct() even though we could
reschedule.
ASTERISK-25023
Change-Id: I450580dbf26e2e3952ee6628c735b001565c368f
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
* Fix retrans_pkt() to call check_pendings() with both the owner channel
and the private objects locked as required.
* Refactor dialog retransmission packet list to safely remove packet
nodes. The list nodes are now ao2 objects. The list has a ref and the
scheduled entry has a ref.
ASTERISK-25023
Change-Id: I50926d81be53f4cd3d572a3292cd25f563f59641
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
ASTERISK-25023
Change-Id: I98a694fd42bc81436c83aa92de03226e6e4e3f48
This patch is part of a series to resolve deadlocks in chan_sip.c.
* Make dialog_unlink_all() unschedule all items at once in the sched
thread.
ASTERISK-25023
Change-Id: I7743072fb228836e8228b72f6dc46c8cc50b3fb4
This patch is part of a series to resolve deadlocks in chan_sip.c.
The reordering of chan_sip's shutdown is to handle any immediate events
that get put onto the scheduler so resources aren't leaked. The typical
immediate events at this time are going to be concerned with stopping
other scheduled events.
ASTERISK-25023
Change-Id: I3f6540717634f6f2e84d8531a054976f2bbb9d20
This patch is part of a series to resolve deadlocks in chan_sip.c.
Delaying destruction of the chan_sip sip_pvt structures caused the
/channels/chan_sip/test_sip_rtpqos unit test to crash. That test
registers a special test ast_rtp_engine with the rtp engine module. When
the unit test completes it cleans up by unregistering the test
ast_rtp_engine and exits. Since the delayed destruction of the sip_pvt
happens after the unit test returns, the destructor tries to call the rtp
engine destroy callback of the test ast_rtp_engine auto variable which no
longer exists on the stack.
* Change the test ast_rtp_engine auto variable to a static variable. Now
the variable can still exist after the unit test exits so the delayed
sip_pvt destruction can complete successfully.
ASTERISK-25023
Change-Id: I61e34a12d425189ef7e96fc69ae14993f82f3f13
This patch is part of a series to resolve deadlocks in chan_sip.c.
* Updated sched unit test to check new behavior.
ASTERISK-25023
Change-Id: Ib69437327b3cda5e14c4238d9ff91b2531b34ef3
This prevents pbx_core from hanging up the channel if the app isn't
registered.
ASTERISK-25846 #close
Change-Id: I63216a61f30706d5362bc0906b50b6f0544aebce
Remove destructor calling destroy_it calling really_destroy_it
for no benefit. Just make the destructor the really_destroy_it
function.
Change-Id: Idea0d47b27dd74f2488db75bcc7f353d8fdc614a
Older versions of PJSIP do not have the proto field on the TLS transport
setting structure. This change adds a configure check so even if it is
not present we will still be able to build.
Change-Id: Ibf3f47befb91ed1b8194bf63888baa6fee05aba9
I can't ever recall actually needing the intermediate files or the checking
that a double compile produces. What I CAN remember is every DONT_OPTIMIZE
build needing 3 invocations of gcc instead of 1 just to do the checks and
produce those intermediate files.
Having said that, Richard pointed out that the reason for the double compile
was that there were cases in the past where a submitted patch failed to compile
because the submitter never tried it with the optimizations turned on.
To get the best of both worlds, COMPILE_DOUBLE has been split into its own
option. If DONT_OPTIMIZE is turned on, COMPILE_DOUBLE will also be selected
BUT you can then turn it off if all you need are the debugging symbols. This
way you have to make an informed decision about disabling COMPILE_DOUBLE.
To allow COMPILE_DOUBLE to be both auto-selected and turned off, a new feature
was added to menuselect. The <use> element can now contain an "autoselect"
attribute which will turn the used member on but not create a hard dependency.
The cflags.xml implementation for COMPILE_DOUBLE looks like this...
<member name="DONT_OPTIMIZE" displayname="Disable Optimizations ...">
<use autoselect="yes">COMPILE_DOUBLE</use>
<support_level>core</support_level>
</member>
<member name="COMPILE_DOUBLE" displayname="Pre-compile with ...>
<depend>DONT_OPTIMIZE</depend>
<support_level>core</support_level>
</member>
When DONT_OPTIMIZE is turned on, COMPILE_DOUBLE is turned on because
of the use.
When DONT_OPTIMIZE is turned off, COMPILE_DOUBLE is turned off because
of the depend.
When COMPILE_DOUBLE is turned on, DONT_OPTIMIZE is turned on because
of the depend.
When COMPILE_DOUBLE is turned off, DONT_OPTIMIZE is left as is because
it only uses COMPILE_DOUBLE, it doesn't depend on it.
I also made a few tweaks to the ncurses implementation to move things
left a bit to allow longer descriptions.
Change-Id: Id49ca930ac4b5ec4fc2d8141979ad888da7b1611
The pjproject Makefile now uses the Asterisk optimization flags which
are determined by the setting of the DONT_OPTMIZE menuselect flag.
The Makefile was also restructured so a change to the top level
menuselect.makeopts will result in a rebuild of pjproject.
Also, "--disable-resample" was removed from the pjproject configure
options. Without resample, pjsua (which is used by the testsuite)
can't make audio calls. When it can't, it segfaults.
Change-Id: I24b0a4d0872acef00ed89b3c527a713ee4c2ccd4
Channel masquerading had a conflict with autochannel locking.
When locking autochannel->channel, the channel is fetched from the
autochannel and then locked. During the fetch, the autochannel -- which
has no locks itself -- can be modified by someone who owns the channel
lock. That means that the value of autochan->channel cannot be trusted
until you hold the lock.
In practice, this caused problems with Local channels getting
masqueraded away while the ChanSpy attempted to get info from that
channel. The old channel which was about to get removed got locked, but
the new (replaced) channel got unlocked (no-op). Because the replaced
channel was now locked (and would never get unlocked), it couldn't get
removed from the channel list in a timely manner, and would now cause
deadlocks when iterating over the channel list.
This change checks the autochannel after locking the channel for changes
to the autochannel. If the channel had been changed, the lock is
reobtained on the new channel.
In theory it seems possible that after this fix, the lock attempt on the
old (wrong) channel can be on an already destroyed lock, maybe causing
a crash. But that hasn't been observed in the wild and is harder induce
than the current deadlock.
Thanks go to Filip Frank for suggesting a fix similar to this and
especially to IRC user hexanol for pointing out why this deadlock was
possible and testing this fix. And to Richard for catching my rookie
while loop mistake ;)
ASTERISK-25321 #close
Change-Id: I293ae0014e531cd0e675c3f02d1d118a98683def
Refactor and created function ast_cli_print_timestr_fromseconds to print
seconds formatted: year(s) week(s) day(s) hour(s) second(s)
This function now is used in addons/cdr_mysql.c,cdr_pgsql.c, main/cli.c,
res_config_ldap.c, res_config_pgsql.c.
Change-Id: Ibeb8634102cd11d3f8623398b279cb731bcde36c
RedHat/CentOS needs python-devel
Debian/Ubuntu needs automake, libsrtp-dev and python-dev
Ubuntu also needed libncurses5-dev for cmenuselect so while not
needed for pjproject, I adedd it anyway.
Change-Id: Idf5fa16e2d87c687439621507e122cb9461d7089
Per RFC3325, the 'From' header is now anonymized on outgoing calls when
caller id presentation is prohibited.
TID = trust_id_outbound
PRO = Set(CALLERID(pres)=prohib)
USR = endpoint/from_user
DOM = endpoint/from_domain
PAI = YES(privacy=off), NO(not sent), PRI(privacy=full) (assumes send_pai=yes)
Conditions |Result
--------------------|----------------------------------------------------
TID PRO USR DOM |PAI FROM
--------------------|----------------------------------------------------
Y Y abc def.ghi |PRI "Anonymous" <sip:abc@def.ghi>
Y Y abc |PRI "Anonymous" <sip:abc@anonymous.invalid>
Y Y def.ghi |PRI "Anonymous" <sip:anonymous@def.ghi>
Y Y |PRI "Anonymous" <sip:anonymous@anonymous.invalid>
Y N abc def.ghi |YES <sip:abc@def.ghi>
Y N abc |YES <sip:abc@<ip_address>>
Y N def.ghi |YES "Caller Name" <sip:<caller_exten>@def.ghi>
Y N |YES "Caller Name" <sip:<caller_exten>@<ip_address>>
N Y abc def.ghi |NO "Anonymous" <sip:abc@def.ghi>
N Y abc |NO "Anonymous" <sip:abc@anonymous.invalid>
N Y def.ghi |NO "Anonymous" <sip:anonymous@def.ghi>
N Y |NO "Anonymous" <sip:anonymous@anonymous.invalid>
N N abc def.ghi |YES <sip:abc@def.ghi>
N N abc |YES <sip:abc@<ip_address>>
N N def.ghi |YES "Caller Name" <sip:<caller_exten>@def.ghi>
N N |YES "Caller Name" <sip:<caller_exten>@<ip_address>>
ASTERISK-25791 #close
Reported-by: Anthony Messina
Change-Id: I2c82a5ca1413c2c00fb62ea95b0ae8e97af54dc9
Apparently the != operator is fairly new so I've replaced it with
the old $(shell ...) syntax.
Change-Id: I16b2e1878a4f91e7e9740abd427f9639f933c479
Reported-by: Richard Mudgett
Although we use the RTLD_LAZY flag when calling dlopen
the first time on a module, this only defers resolution
for function calls. Pointer references to functions are
determined at link time so dlopen expects them to be there.
Since we don't cross-module link, pointers to functions
in other modules won't be available and dlopen will fail.
Doing a "hardened" build also causes problems because it
typically sets "-z now" on the ld command line which
overrides RTLD_LAZY at run time.
If the failing module isn't a GLOBAL_SYMBOLS module, then
dlopen will be called again after all the GLOBAL_SYMBOLS
modules have been loaded and they'll eventually resolve.
If the calling module IS a GLOBAL_SYMBOLS module itself
and a third module depends on it, then there's an issue
because the second time through the dlopen loop,
GLOBAL_SYMBOLS modules aren't given any special treatment
and since the order in which dlopen is called isn't
deterministic, the dependent may again be tried before the
module it needs is loaded.
Simple solution: Save modules that fail load_resource
because of a dlopen error in a list and retry them
immediately after the first pass. Keep retrying until
the failed list is empty or we reach a #defined max
retries. Error messages are suppressed until the final
pass which also gets rid of those confusing error messages
about module failures that are later corrected.
Change-Id: Iddae1d97cd2f00b94e61662447432765755f64bb
It's possible for the transferer channel to get hung up early during the
attended transfer process. For instance, a phone may send a "bye" immediately
upon receiving a sip notify that contains a sip frag 100 (I'm looking at you
Jitsi). When this occurs a race begins between the transferer being hung up
and completion of the transfer code.
If the channel hangs up too early during a transfer involving stasis bridging
for instance, then when the created local channel goes to look up its swap
channel (and associated datastore) it can't find it (since it is no longer in
the bridge) thus it fails to enter the stasis application. Consequently, the
created local channel(s) hang up as well. If the timing is just right then the
bridging code attempts to add the message link with missing local channel(s).
Hence the crash.
Unfortunately, there is no great way to solve the problem of the unexpected
"bye". While we can't guarantee we won't receive an early hangup, and in this
case still fail to enter the stasis application, we can make it so asterisk
does not crash.
This patch does just that by locking the local channel structure, checking
that the local channel's peer has not been lost, and then continuing. This
keeps the local channel's peer from being ripped out from underneath it by
the local/unreal hangup code while attempting to set the stasis message link.
ASTERISK-25771
Change-Id: Ie6d6061e34c7c95f07116fffac9a09e5d225c880
During the transfer process, some phones (okay it was the Jitsi softphone,
but maybe others are out there) send a "bye" immediately after receiving a
SIP Notify. When a "bye" is received early for some types of transfers the
transferer channel may no longer be available during late stage transfer
processing.
For instance, during an attended transfer involving stasis bridging at one
point the created local channel looks for an associated swap channel in
order to retrieve the stasis application name. If the transferer has hung
up then the local channel will fail to find it. The local channel then has
no way to know which stasis app to enter, so it fails and hangs up as well.
Thus the transfer does not complete as expected.
This patch delays the sending of the initial notify in order to give the
transfer process enough time to gather the necessary data for a successful
transfer.
ASTERISK-25771
Change-Id: I09cfc9a5d6ed4c007bc70625e0972b470393bf16
PJSIP does not ensure that when printing the message body the
buffer will be NULL terminated. This is problematic when searching
for the signal and duration values of the DTMF.
This change ensures the buffer is always NULL terminated.
Change-Id: I52653a1a60c93092d06af31a27408d569cc98968