| InDesign SDK 20.5 |
#include <UnicodeSavvyString.h>

Public Types | |
| typedef int32 | size_type |
| typedef std::ptrdiff_t | difference_type |
| typedef UTF16TextChar | code_value |
| typedef UTF32TextChar | code_point |
| typedef code_value * | code_value_iterator |
| typedef code_value const * | const_code_value_iterator |
| typedef const UnicodeSavvyString & | const_reference |
| typedef UTF16TextChar | value_type |
Public Member Functions | |
| bool16 | HasMultiWordUnicode () const |
| size_type | CharCount () const |
| size_type | NumUTF16TextChars () const |
| size_type | capacity (void) const |
| void | reserve (size_type newCapacity) |
| void | resize (size_type newSize, code_value fill=code_value()) |
| void | clear () |
| const UTF16TextChar * | GrabUTF16Buffer (int32 *numUTF16s) const |
| int32 | CodePointIndexToUTF16Index (int32 index) const |
| void | Truncate (CharCounter count) |
| void | Remove (int32 position, CharCounter count) |
| UTF32TextChar | GetUTF32TextChar (int32 pos) const |
| const_code_value_iterator | begin () const |
| const_code_value_iterator | end () const |
Protected Types | |
| enum | { kMaxSmallString = 23 } |
Protected Member Functions | |
| UnicodeSavvyString (adobe::move_from< UnicodeSavvyString > other) | |
| UnicodeSavvyString (UnicodeSavvyString &&other) noexcept | |
| void | move_from (UnicodeSavvyString &other) noexcept |
| template<class IteratorType > | |
| UnicodeSavvyString (IteratorType b, IteratorType e, size_type nCodePoints=0) | |
| int32 | CountChars () const |
| int32 | CountCharsUtil (const UTF16TextChar *buffer, int32 bufferLength) const |
| void | InsertGap (uint32 wordWiseIndex, size_type numberOfSpaces) |
| void | RemoveGap (uint32 wordWiseIndex, size_type numberOfSpaces) |
| void | InsertUTF32TextChar (UTF32TextChar c, int32 pos=0) |
| void | InsertUTF16String (const UTF16TextChar *buf, int32 len, int32 position=0) |
| void | AppendUTF32TextChar (UTF32TextChar c32) |
| void | CopyFrom (const UnicodeSavvyString &other) |
| bool16 | operator== (const UnicodeSavvyString &s) const |
| template<class IteratorType > | |
| UnicodeSavvyString & | assign (IteratorType b, IteratorType e, size_type nCodePoints=0) |
| UnicodeSavvyString & | replace (size_type pos, size_type n1, code_value const *s, size_type n2) |
| UnicodeSavvyString & | append (code_value const *s, size_type nCodeValues, size_type nCodePoints=0) |
| UTF32TextChar | surro_GetUTF32TextChar (int32 pos) const |
| const UTF16TextChar * | ConstBuffer () const |
| void | insert_safe (code_value_iterator i, const_code_value_iterator sb, const_code_value_iterator se) |
| void | erase_safe (code_value_iterator b, code_value_iterator e) |
| void | replace_safe (code_value_iterator b, code_value_iterator e, const_code_value_iterator sb, const_code_value_iterator se) |
| template<class InputIterator > | |
| void | assign_impl (InputIterator b, InputIterator e, size_type nCodePoints, std::input_iterator_tag) |
| template<class FwdIterator > | |
| void | assign_impl (FwdIterator b, FwdIterator e, size_type nCodePoints, std::forward_iterator_tag) |
| bool16 | UnicodeBufferIsValid () const |
| UTF16TextChar * | GetBufferForWriting (size_type size) |
Protected Attributes | |
| StringStorage * | fStorage |
| UTF16TextChar | fSmallStorage [kMaxSmallString+1] |
| size_type | fUTF16BufferLength |
| size_type | fNumChars |
Friends | |
| void | swap (UnicodeSavvyString &lhs, UnicodeSavvyString &rhs) noexcept |
This is a base class that handles UTF16 code values. Users of this class must understand the distinction between a code value and a code point. A code point (a Unicode character) is stored as one or more code values. In the UTF16 encoding, each code point takes one or two code values. When a code point takes two code values, those two values are called a surrogate pair. Most of the functions of this class work on code values, not code points.
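The relationship between code points and code values can be illustrated outside the SDK. The sketch below is not part of the InDesign API: `CodeValue`, `CodePoint` and `EncodeUtf16` are hypothetical names that stand in for `UTF16TextChar` and `UTF32TextChar`, and the input is assumed to be a valid Unicode scalar value.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for the SDK typedefs, assumed here for illustration only.
using CodeValue = char16_t;  // one UTF-16 code value
using CodePoint = char32_t;  // one Unicode code point

// Encodes a single code point as one or two UTF-16 code values.
// Assumes cp is a valid scalar value (<= 0x10FFFF, not itself a surrogate).
std::vector<CodeValue> EncodeUtf16(CodePoint cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single code value is enough.
        return { static_cast<CodeValue>(cp) };
    }
    // Supplementary plane: split into a high and a low surrogate.
    const std::uint32_t v = static_cast<std::uint32_t>(cp) - 0x10000;
    return {
        static_cast<CodeValue>(0xD800 + (v >> 10)),
        static_cast<CodeValue>(0xDC00 + (v & 0x3FF))
    };
}
```

For example, U+0041 yields one code value, while U+1F600 yields the surrogate pair 0xD83D 0xDE00.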
| inlineprotected |
Move constructor: assumes ownership of the remote part of other.
| inlineprotected |
Constructs the string using a range of code values [b, e). The code values in the range need to be UTF16 encoded.
| b | [IN] - beginning of the range. |
| e | [IN] - end of the range (one past last one). |
| nCodePoints | [IN, OPTIONAL] - number of code points in the range. This parameter can be used for optimization purposes, if the caller knows the number of code points represented in the range. |
| protected |
Appends the code values from the C-array s at the end of the current string.
| s | [IN] - C-array of code values that will be added to this string. |
| nCodeValues | [IN] - number of code values to be added. |
| nCodePoints | [IN, OPTIONAL] - number of code points that nCodeValues represent. This can be used for optimization purposes if the caller knows how many code points are added. |
| inlineprotected |
Assigns to the string the code values in the specified range [b, e). The code values in the range need to be UTF16 encoded.
| b | [IN] - beginning of the range. |
| e | [IN] - end of the range (one past last one). |
| nCodePoints | [IN, OPTIONAL] - number of code points in the range. This parameter can be used for optimization purposes, if the caller knows the number of code points represented in the range. |
| inline |
Returns a const iterator for the beginning of the storage of the string. The iterator works only over code values and it is agnostic of code points.
| inline |
Retrieves the number of UTF16 code values that fit in the string without reallocating. A Unicode code value is not the same as a Unicode code point (Unicode character). Beware of code points that span two code values (surrogate pairs)!
| inline |
Retrieves the number of code points stored in this string. The number of code points can differ from the number of code values if surrogates are present.
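How the two counts diverge can be sketched with a standalone function. This is an assumption about one way to count, not the SDK's implementation; `CountCodePoints` is a hypothetical name and `std::u16string` is used as a stand-in for the class, with well-formed input assumed.

```cpp
#include <cstddef>
#include <string>

// Counts the code points in a UTF-16 sequence: every code value counts
// except the low (trailing) half of a surrogate pair.
// Assumes the input is well-formed UTF-16.
std::size_t CountCodePoints(const std::u16string& s)
{
    std::size_t n = 0;
    for (char16_t c : s) {
        const bool isLowSurrogate = (c >= 0xDC00 && c <= 0xDFFF);
        if (!isLowSurrogate) {
            ++n;
        }
    }
    return n;
}
```

A string holding 'a', U+1F600 and 'b' has four code values but three code points.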
| void UnicodeSavvyString::clear | ( | ) |
| inline |
Converts a code point index to a code value index inside the string.
| index | [IN] - zero based index of the code point. |
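The conversion can be pictured as a forward walk that skips two code values for every surrogate pair. The following is a minimal sketch under stated assumptions, not the SDK's code: `CodePointToCodeValueIndex` is a hypothetical free function over `std::u16string`, the input is well-formed, and the index does not exceed the number of code points.

```cpp
#include <cstddef>
#include <string>

// Returns the code value index at which the code point with the given
// zero-based index starts. Assumes well-formed UTF-16 and
// codePointIndex <= number of code points in s.
std::size_t CodePointToCodeValueIndex(const std::u16string& s,
                                      std::size_t codePointIndex)
{
    std::size_t i = 0;
    while (codePointIndex > 0 && i < s.size()) {
        const bool isHighSurrogate = (s[i] >= 0xD800 && s[i] <= 0xDBFF);
        i += isHighSurrogate ? 2 : 1;  // skip both halves of a pair
        --codePointIndex;
    }
    return i;
}
```

With 'a', U+1F600, 'b' stored in the string, code point index 2 (the 'b') maps to code value index 3.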
| inline |
Returns a const iterator for the end of the storage of the string. The iterator works only over code values and it is agnostic of code points.
| inline |
Retrieves the Unicode code point at the specified position.
| pos | [IN] - position (in code points) where the character is. |
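Because the position is expressed in code points, a lookup has to locate the right code value first and then combine a surrogate pair if one is present. The sketch below shows that decoding step with a hypothetical helper, `CodePointAt`, over `std::u16string`; it assumes well-formed input and a valid position, and it is not taken from the SDK sources.

```cpp
#include <cstddef>
#include <string>

// Returns the code point at zero-based code point position pos.
// Assumes well-formed UTF-16 and pos < number of code points in s.
char32_t CodePointAt(const std::u16string& s, std::size_t pos)
{
    auto isHigh = [](char16_t c) { return c >= 0xD800 && c <= 0xDBFF; };

    // Walk forward pos code points to find the starting code value.
    std::size_t i = 0;
    for (; pos > 0; --pos) {
        i += isHigh(s[i]) ? 2 : 1;
    }

    if (!isHigh(s[i])) {
        return s[i];  // single code value
    }
    // Combine the surrogate pair into one code point.
    return 0x10000
         + ((static_cast<char32_t>(s[i]) - 0xD800) << 10)
         + (static_cast<char32_t>(s[i + 1]) - 0xDC00);
}
```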
| inline |
Retrieves a pointer to a null-terminated, UTF16 encoded representation of the string. This function is analogous to std::string::c_str().
| numUTF16s | [OUT, OPTIONAL] - if the pointer is not nil, on return the function sets the pointed-to value to the number of code values in the buffer. |
| inline |
Checks if the string has surrogates.
| inlineprotectednoexcept |
Moves the data from other into this, leaving other in a destructible state.
| inline |
Retrieves the number of code values present in this string. The number of code points (characters) can be smaller than this number if surrogates are present.
| protected |
Equality check for two strings.
| s | [IN] - other string to compare with. |
| void UnicodeSavvyString::Remove | ( | int32 | position, |
| CharCounter | count | ||
| ) |
Removes the specified number of code points starting at position.
| position | [IN] - index of code point from where the removal should start. |
| count | [IN] - number of code points to remove. If the value of count is kMaxInt32 the function will remove all the code points after position. |
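Removal by code points means both boundaries must be translated to code value offsets before erasing, so that no surrogate pair is cut in half. The sketch below assumes well-formed input and uses hypothetical names throughout: `RemoveCodePoints` operates on a `std::u16string`, and `kRemoveToEnd` plays the role that kMaxInt32 plays for count.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical sentinel mirroring the "remove everything after position" convention.
constexpr std::size_t kRemoveToEnd = std::numeric_limits<std::size_t>::max();

// Removes count code points starting at code point index position.
// Assumes well-formed UTF-16.
void RemoveCodePoints(std::u16string& s, std::size_t position, std::size_t count)
{
    // Advances n code points from code value index i, clamped to the end.
    auto advance = [&s](std::size_t i, std::size_t n) {
        while (n > 0 && i < s.size()) {
            const bool isHigh = (s[i] >= 0xD800 && s[i] <= 0xDBFF);
            i += isHigh ? 2 : 1;
            --n;
        }
        return i < s.size() ? i : s.size();
    };

    const std::size_t first = advance(0, position);
    const std::size_t last =
        (count == kRemoveToEnd) ? s.size() : advance(first, count);
    s.erase(first, last - first);  // erase whole code points only
}
```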
| protected |
Replaces the code values in range [pos, pos + n1) with n2 code values from the C-array s. WARNING: This function operates on CODE VALUES only. It doesn't know anything about surrogate pairs. It is the caller's responsibility to make sure that the replacement leaves the string in a consistent state. The function grows the string if necessary to accommodate the replacement string.
| pos | [IN] - index of the code value from where the replacement starts. |
| n1 | [IN] - number of code values to be replaced. |
| s | [IN] - C-array of code values that will replace the existing code values in the string. Needs to have at least n2 code values in it. |
| n2 | [IN] - length of the replacement sequence. |
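The warning above can be made concrete with a stand-in. In this sketch, `std::u16string::replace` plays the role of the documented function (it has the same pos, n1, s, n2 shape), and `IsWellFormedUtf16` and `ReplaceCodeValues` are hypothetical helpers, not SDK functions. It shows a check a caller might run to confirm that a replacement did not split a surrogate pair.

```cpp
#include <cstddef>
#include <string>

// Returns true if every high surrogate is followed by a low surrogate
// and no low surrogate appears on its own.
bool IsWellFormedUtf16(const std::u16string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        if (c >= 0xD800 && c <= 0xDBFF) {
            if (i + 1 >= s.size() || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF) {
                return false;  // high surrogate without its low half
            }
            ++i;  // skip the low half of a valid pair
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            return false;  // unpaired low surrogate
        }
    }
    return true;
}

// Replaces code values [pos, pos + n1) with n2 code values from repl.
// Like the documented function, it knows nothing about surrogate pairs.
std::u16string ReplaceCodeValues(std::u16string s, std::size_t pos, std::size_t n1,
                                 const char16_t* repl, std::size_t n2)
{
    s.replace(pos, n1, repl, n2);
    return s;
}
```

Replacing only one code value of a pair leaves the other half orphaned, whereas replacing both halves together keeps the string consistent.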
| void UnicodeSavvyString::reserve | ( | size_type | newCapacity | ) |
Reserves internal memory for at least newCapacity UTF16 code values. If newCapacity is smaller than the current capacity, the call is treated as a non-binding request to shrink the capacity. The capacity is never reduced below the current number of code values in the string (a call to reserve() doesn't modify the number of code values in the string). Each reallocation invalidates all references, pointers and iterators and carries a cost, so calling reserve() up front is useful to improve speed and to avoid invalidating references and iterators.
| newCapacity | [IN] - the minimum capacity that the string should have. |
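The benefit of reserving up front can be observed with `std::u16string`, used here purely as a stand-in on the assumption that the SDK class behaves comparably for the growth case. `AppendReallocates` is a hypothetical helper that detects reallocation by comparing buffer addresses.

```cpp
#include <cstddef>
#include <string>

// Appends n copies of c and reports whether the buffer moved,
// judged by whether the address of the storage changed.
bool AppendReallocates(std::u16string& s, std::size_t n, char16_t c)
{
    const char16_t* before = s.data();
    s.append(n, c);
    return s.data() != before;
}
```

After `s.reserve(256)` on an empty string, appending 256 code values stays within the reserved capacity, so pointers obtained before the append remain valid.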
| void UnicodeSavvyString::resize | ( | size_type | newSize, |
| code_value | fill = code_value() | ||
| ) |
Changes the number of code values of *this to newSize. If newSize is bigger than the current size, new code values initialized with the fill value are appended to the string. If the fill parameter is not specified, a default-constructed code_value ('\0') is used. If newSize is smaller, code values are removed from the end of the string. Calling resize(0) has the same effect as clearing the string.
| newSize | [IN] - the new size of the string. |
| fill | [IN] - the fill value for new code values if size increases. |
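The grow and shrink cases can be tried with `std::u16string::resize`, which has the same (newSize, fill) shape. This is only a stand-in: `ResizedCopy` is a hypothetical helper, and the SDK class may differ in details not documented here.

```cpp
#include <cstddef>
#include <string>

// Returns a copy of s resized to newSize code values. New code values,
// if any, are initialized with fill (default: value-initialized, '\0').
std::u16string ResizedCopy(std::u16string s, std::size_t newSize,
                           char16_t fill = char16_t())
{
    s.resize(newSize, fill);
    return s;
}
```

Growing "ab" to four code values with a '-' fill gives "ab--", and shrinking "abcd" to two gives "ab". Since the size is counted in code values, shrinking can end in the middle of a surrogate pair.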
| void UnicodeSavvyString::Truncate | ( | CharCounter | count | ) |
Truncates the string so it contains the specified number of code points.
| count | [IN] - the desired number of code points the string should contain. |
| friend |
Swaps this object with another one. swap() should never throw. The swap idiom is used to efficiently exchange two objects. Declaring swap() for your own data structures matters because it lets classes that hold them as data members implement their own swap() and a correct assignment operator.
| rhs | [IN/OUT] - the other object. |
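How a client class can build on a non-throwing swap() is sketched below with the copy-and-swap idiom. `Label` is a hypothetical class, and `std::u16string` stands in for a string member. Nothing here is taken from the SDK.

```cpp
#include <string>
#include <utility>

// Hypothetical client class holding a string-like data member.
class Label {
public:
    explicit Label(std::u16string text) : fText(std::move(text)) {}

    // Copy-and-swap assignment: other is taken by value (already a copy),
    // then its contents are exchanged with ours. If the copy throws,
    // *this is left untouched.
    Label& operator=(Label other) noexcept
    {
        swap(*this, other);
        return *this;
    }

    const std::u16string& Text() const { return fText; }

    // Non-throwing swap, found through argument-dependent lookup.
    friend void swap(Label& lhs, Label& rhs) noexcept
    {
        using std::swap;
        swap(lhs.fText, rhs.fText);  // delegates to the member's swap
    }

private:
    std::u16string fText;
};
```

Because the member's swap does not throw, the class's swap and its assignment operator inherit that guarantee.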