utf8.c: Add UTF-8 validation and utility functions

There are various places in Asterisk, particularly around database
integration, where UTF-8 validation would be beneficial. This patch
adds:

* Functions to validate that a given string contains only valid UTF-8
  sequences.

* A function to copy a string (similar to ast_copy_string) stopping when
  an invalid UTF-8 sequence is encountered.

* A UTF-8 validator that allows for progressive validation.
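
To make the three pieces above concrete, here is a minimal, self-contained
sketch of how they can fit together. It is not the code added by this patch:
every name in it (sketch_utf8_*) is hypothetical, and it uses a simple
hand-written byte-range check rather than the table-driven DFA referenced
below. It assumes NUL-terminated input and rejects overlong forms, UTF-16
surrogates, and code points above U+10FFFF.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the API added by the patch. */
enum sketch_utf8_status { SKETCH_UTF8_OK, SKETCH_UTF8_INCOMPLETE, SKETCH_UTF8_BAD };

struct sketch_utf8_check {
	unsigned char need; /* continuation bytes still expected */
	unsigned char lo;   /* lowest value allowed for the next byte */
	unsigned char hi;   /* highest value allowed for the next byte */
	int bad;            /* sticky: set once an invalid byte is seen */
};

static void sketch_utf8_check_init(struct sketch_utf8_check *c)
{
	memset(c, 0, sizeof(*c));
}

/* Consume one byte. Returns 0 if it is acceptable so far, -1 if invalid. */
static int sketch_utf8_step(struct sketch_utf8_check *c, unsigned char b)
{
	if (c->need) {
		if (b < c->lo || b > c->hi) {
			return -1;
		}
		c->need--;
		c->lo = 0x80;
		c->hi = 0xBF;
		return 0;
	}
	c->lo = 0x80;
	c->hi = 0xBF;
	if (b <= 0x7F) {
		return 0;                        /* ASCII */
	} else if (b >= 0xC2 && b <= 0xDF) {
		c->need = 1;                     /* C0 and C1 would be overlong */
	} else if (b >= 0xE0 && b <= 0xEF) {
		c->need = 2;
		if (b == 0xE0) c->lo = 0xA0;     /* reject overlong 3-byte forms */
		if (b == 0xED) c->hi = 0x9F;     /* reject UTF-16 surrogates */
	} else if (b >= 0xF0 && b <= 0xF4) {
		c->need = 3;
		if (b == 0xF0) c->lo = 0x90;     /* reject overlong 4-byte forms */
		if (b == 0xF4) c->hi = 0x8F;     /* reject values above U+10FFFF */
	} else {
		return -1;                       /* stray continuation or bad lead */
	}
	return 0;
}

/* Progressive validation: may be called repeatedly with successive chunks. */
static enum sketch_utf8_status sketch_utf8_check_feed(struct sketch_utf8_check *c,
	const char *s)
{
	for (; !c->bad && *s; s++) {
		if (sketch_utf8_step(c, (unsigned char) *s) < 0) {
			c->bad = 1;
		}
	}
	if (c->bad) {
		return SKETCH_UTF8_BAD;
	}
	return c->need ? SKETCH_UTF8_INCOMPLETE : SKETCH_UTF8_OK;
}

/* One-shot validation of a whole string. */
static int sketch_utf8_is_valid(const char *s)
{
	struct sketch_utf8_check c;

	sketch_utf8_check_init(&c);
	return sketch_utf8_check_feed(&c, s) == SKETCH_UTF8_OK;
}

/* Copy src into dst (size bytes, always NUL-terminated when size > 0),
 * stopping at the first invalid byte and never leaving a partial sequence. */
static void sketch_utf8_copy(char *dst, const char *src, size_t size)
{
	struct sketch_utf8_check c;
	size_t i = 0, good = 0;

	if (size == 0) {
		return;
	}
	sketch_utf8_check_init(&c);
	while (i + 1 < size && src[i]) {
		if (sketch_utf8_step(&c, (unsigned char) src[i]) < 0) {
			break;
		}
		dst[i] = src[i];
		i++;
		if (c.need == 0) {
			good = i;                    /* end of a complete sequence */
		}
	}
	dst[good] = '\0';
}
```

Keeping the per-byte state in a small struct is what lets the same step
function serve both the one-shot check and the progressive validator, and
lets the copy routine truncate at the last complete sequence when the
destination buffer is too small.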

All of this is based on the excellent UTF-8 decoder by Björn Höhrmann.
More information is available here:

    https://bjoern.hoehrmann.de/utf-8/decoder/dfa/

The API is written so that the implementation can be replaced later,
should we determine that we need something more comprehensive.

Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9
Sean Bright
2020-07-13 16:06:14 -04:00
committed by Kevin Harwell
parent c10ed8d4d6
commit 7d96b3e437
3 changed files with 570 additions and 0 deletions


@@ -242,6 +242,7 @@ int daemon(int, int); /* defined in libresolv of all places */
 #include "asterisk/media_cache.h"
 #include "asterisk/astdb.h"
 #include "asterisk/options.h"
+#include "asterisk/utf8.h"
 #include "../defaults.h"
@@ -4068,6 +4069,7 @@ static void asterisk_daemon(int isroot, const char *runuser, const char *rungrou
 check_init(ast_json_init(), "libjansson");
 ast_ulaw_init();
 ast_alaw_init();
+ast_utf8_init();
 tdd_init();
 callerid_init();
 ast_builtins_init();