#include <UnicodeSavvyString.h>


Public Types
typedef int32	size_type

typedef std::ptrdiff_t	difference_type

typedef UTF16TextChar	code_value

typedef UTF32TextChar	code_point

typedef code_value *	code_value_iterator

typedef code_value const *	const_code_value_iterator

typedef const UnicodeSavvyString &	const_reference

typedef UTF16TextChar	value_type

Public Member Functions
bool16	HasMultiWordUnicode () const

size_type	CharCount () const

size_type	NumUTF16TextChars () const

size_type	capacity (void) const

void	reserve (size_type newCapacity)

void	resize (size_type newSize, code_value fill=code_value())

void	clear ()

const UTF16TextChar *	GrabUTF16Buffer (int32 *numUTF16s) const

int32	CodePointIndexToUTF16Index (int32 index) const

void	Truncate (CharCounter count)

void	Remove (int32 position, CharCounter count)

UTF32TextChar	GetUTF32TextChar (int32 pos) const

const_code_value_iterator	begin () const

const_code_value_iterator	end () const

Protected Types
enum	{ kMaxSmallString = 23 }

Protected Member Functions
	UnicodeSavvyString (adobe::move_from< UnicodeSavvyString > other)

	UnicodeSavvyString (UnicodeSavvyString &&other) noexcept

void	move_from (UnicodeSavvyString &other) noexcept

template<class IteratorType >
	UnicodeSavvyString (IteratorType b, IteratorType e, size_type nCodePoints=0)

int32	CountChars () const

int32	CountCharsUtil (const UTF16TextChar *buffer, int32 bufferLength) const

void	InsertGap (uint32 wordWiseIndex, size_type numberOfSpaces)

void	RemoveGap (uint32 wordWiseIndex, size_type numberOfSpaces)

void	InsertUTF32TextChar (UTF32TextChar c, int32 pos=0)

void	InsertUTF16String (const UTF16TextChar *buf, int32 len, int32 position=0)

void	AppendUTF32TextChar (UTF32TextChar c32)

void	CopyFrom (const UnicodeSavvyString &other)

bool16	operator== (const UnicodeSavvyString &s) const

template<class IteratorType >
UnicodeSavvyString &	assign (IteratorType b, IteratorType e, size_type nCodePoints=0)

UnicodeSavvyString &	replace (size_type pos, size_type n1, code_value const *s, size_type n2)

UnicodeSavvyString &	append (code_value const *s, size_type nCodeValues, size_type nCodePoints=0)

UTF32TextChar	surro_GetUTF32TextChar (int32 pos) const

const UTF16TextChar *	ConstBuffer () const

void	insert_safe (code_value_iterator i, const_code_value_iterator sb, const_code_value_iterator se)

void	erase_safe (code_value_iterator b, code_value_iterator e)

void	replace_safe (code_value_iterator b, code_value_iterator e, const_code_value_iterator sb, const_code_value_iterator se)

template<class InputIterator >
void	assign_impl (InputIterator b, InputIterator e, size_type nCodePoints, std::input_iterator_tag)

template<class FwdIterator >
void	assign_impl (FwdIterator b, FwdIterator e, size_type nCodePoints, std::forward_iterator_tag)

bool16	UnicodeBufferIsValid () const

UTF16TextChar *	GetBufferForWriting (size_type size)

Protected Attributes
StringStorage *	fStorage

UTF16TextChar	fSmallStorage [kMaxSmallString+1]

size_type	fUTF16BufferLength

size_type	fNumChars

Friends
void	swap (UnicodeSavvyString &lhs, UnicodeSavvyString &rhs) noexcept

Detailed Description

This is a base class that handles UTF16 code values. Users of this class must understand the distinction between a code value and a code point. A code point (a Unicode character) is stored as one or more code values; in the UTF16 encoding, each code point takes one or two code values. When a code point takes two code values, those two values are called a surrogate pair. Most of the functions of this class work on code values, not code points.
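The distinction can be seen in a minimal sketch that uses std::u16string as a stand-in for the class storage. The helper names IsHighSurrogate, IsLowSurrogate and CountCodePoints are hypothetical and are not part of UnicodeSavvyString; the sketch assumes well-formed UTF16 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; not part of UnicodeSavvyString.
inline bool IsHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool IsLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Counts code points in a UTF16 sequence. A high surrogate followed by a
// low surrogate is one code point stored in two code values.
inline std::size_t CountCodePoints(const std::u16string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (IsHighSurrogate(s[i]) && i + 1 < s.size() && IsLowSurrogate(s[i + 1]))
            ++i; // skip the second half of the pair
        ++count;
    }
    return count;
}
```

For the literal u"a\U0001D11Eb" (U+1D11E lies outside the Basic Multilingual Plane), size() reports four code values while CountCodePoints reports three code points.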

Constructor & Destructor Documentation

UnicodeSavvyString::UnicodeSavvyString ( adobe::move_from< UnicodeSavvyString > other )

inline protected

Move constructor. Assumes ownership of the remote part of other.

template<class IteratorType >

UnicodeSavvyString::UnicodeSavvyString ( IteratorType b, IteratorType e, size_type nCodePoints = 0 )

inline protected

Constructs the string using a range of code values [b, e). The code values in the range need to be UTF16 encoded.

Parameters

b	[IN] - beginning of the range.
e	[IN] - end of the range (one past last one).
nCodePoints	[IN, OPTIONAL] - number of code points in the range. This parameter can be used for optimization purposes, if the caller knows the number of code points represented in the range.

Member Function Documentation

UnicodeSavvyString & UnicodeSavvyString::append ( code_value const * s, size_type nCodeValues, size_type nCodePoints = 0 )

protected

Appends the code values from the C-array s at the end of the current string.

Parameters

s	[IN] - C-array of code values that will be added to this string.
nCodeValues	[IN] - number of code values to be added.
nCodePoints	[IN, OPTIONAL] - number of code points that nCodeValues represent. This can be used for optimization purposes if the caller knows how many code points are added.

Returns: reference to this string.

template<class IteratorType >

UnicodeSavvyString & UnicodeSavvyString::assign ( IteratorType b, IteratorType e, size_type nCodePoints = 0 )

inline protected

Assigns to the string the code values in the specified range [b, e). The code values in the range need to be UTF16 encoded.

Parameters

b	[IN] - beginning of the range.
e	[IN] - end of the range (one past last one).
nCodePoints	[IN, OPTIONAL] - number of code points in the range. This parameter can be used for optimization purposes, if the caller knows the number of code points represented in the range.

const_code_value_iterator UnicodeSavvyString::begin ( ) const

inline

Returns a const iterator for the beginning of the storage of the string. The iterator works only over code values and it is agnostic of code points.

size_type UnicodeSavvyString::capacity ( void ) const

inline

Retrieves the number of UTF16 code values that fit in the string without reallocating. A Unicode code value is not the same as a Unicode code point (Unicode character): a code point can span two code values (a surrogate pair).

Returns: current capacity in code values that the string can hold.

size_type UnicodeSavvyString::CharCount ( ) const

inline

Retrieves the number of code points stored in this string. The number of code points can differ from the number of code values if surrogates are present.

Returns: number of unicode code points.

void UnicodeSavvyString::clear ( )

Erases the string making it empty. Capacity stays the same.

See Also: reserve, capacity

int32 UnicodeSavvyString::CodePointIndexToUTF16Index ( int32 index ) const

inline

Converts a code point index to a code value index inside the string.

Parameters

index [IN] - zero based index of the code point.

Returns: the code value index where the code point starts in the UTF16 buffer.
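One way to picture the conversion is a linear walk over the buffer that advances two code values for each surrogate pair. The function below is a hypothetical stand-in written against std::u16string, not the class's implementation, and it assumes well-formed UTF16.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the conversion; not the SDK implementation.
// Returns the code value index at which code point number cpIndex starts,
// or s.size() when cpIndex is at or past the end of the string.
inline std::size_t CodePointToCodeValueIndex(const std::u16string& s, std::size_t cpIndex)
{
    std::size_t i = 0;
    while (i < s.size() && cpIndex > 0) {
        const bool pair = s[i] >= 0xD800 && s[i] <= 0xDBFF &&
                          i + 1 < s.size() &&
                          s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        i += pair ? 2 : 1; // a surrogate pair occupies two code values
        --cpIndex;
    }
    return i;
}
```

In u"a\U0001D11Eb" the code point 'b' has code point index 2 but code value index 3, because the preceding character occupies two code values.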

const_code_value_iterator UnicodeSavvyString::end ( ) const

inline

Returns a const iterator for the end of the storage of the string. The iterator works only over code values and it is agnostic of code points.

UTF32TextChar UnicodeSavvyString::GetUTF32TextChar ( int32 pos ) const

inline

Retrieves the unicode code point at the specified position.

Parameters

pos	[IN] - position (in code points) where the character is.

Returns: the unicode character.
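A rough sketch of what reading a code point by position involves, assuming well-formed UTF16 held in a std::u16string. CodePointAt is a hypothetical name; the arithmetic is the standard UTF16 surrogate decoding rule, not code taken from the class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper; not part of UnicodeSavvyString.
// Returns the code point at code point position cpPos.
inline char32_t CodePointAt(const std::u16string& s, std::size_t cpPos)
{
    std::size_t i = 0;
    for (;;) {
        assert(i < s.size()); // cpPos must address an existing code point
        const char16_t hi = s[i];
        const bool pair = hi >= 0xD800 && hi <= 0xDBFF && i + 1 < s.size() &&
                          s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (cpPos == 0) {
            if (!pair)
                return hi; // single code value, returned as is
            // Combine the two 10-bit halves and add the 0x10000 offset.
            return 0x10000 + ((char32_t(hi) - 0xD800) << 10) +
                   (char32_t(s[i + 1]) - 0xDC00);
        }
        i += pair ? 2 : 1;
        --cpPos;
    }
}
```

The walk makes the cost of positional access visible: without extra bookkeeping, reaching code point n requires scanning the code values before it.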

const UTF16TextChar * UnicodeSavvyString::GrabUTF16Buffer ( int32 * numUTF16s ) const

inline

Retrieves a pointer to a UTF16 encoded representation of the string (null terminated). This function is analogous to std::string::c_str().

Parameters

numUTF16s [OUT, OPTIONAL] - if the pointer is not nil, on return the function sets the pointed-to value to the number of code values in the buffer.

Returns: a pointer to a null terminated buffer of code values. This pointer can (and will) change after a non-const method is called on the string, so do not cache it.

bool16 UnicodeSavvyString::HasMultiWordUnicode ( ) const

inline

Checks if the string has surrogates.

Returns: true if the string has surrogate pairs.

void UnicodeSavvyString::move_from ( UnicodeSavvyString & other )

inline protected noexcept

Moves the data from other into this, leaving other in a destructible state.

size_type UnicodeSavvyString::NumUTF16TextChars ( ) const

inline

Retrieves the number of code values present in this string. The number of code points (characters) can be smaller than this number if surrogates are present.

bool16 UnicodeSavvyString::operator== ( const UnicodeSavvyString & s ) const

protected

Equality check for two strings.

Parameters

s	[IN] - other string to compare with.

Returns: kTrue if the strings are equal, kFalse otherwise.

void UnicodeSavvyString::Remove ( int32 position, CharCounter count )

Removes the specified number of code points starting at position.

Parameters

position	[IN] - index of code point from where the removal should start.
count	[IN] - number of code points to remove. If the value of count is kMaxInt32 the function will remove all the code points after position.
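Because position and count are expressed in code points, a removal has to be translated into a code value range first. The sketch below shows one way to do that on a std::u16string; Advance and RemoveCodePoints are hypothetical names, and the input is assumed to be well-formed UTF16.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: starting at code value index i, steps over n code
// points and returns the resulting code value index (clamped to s.size()).
inline std::size_t Advance(const std::u16string& s, std::size_t i, std::size_t n)
{
    for (; i < s.size() && n > 0; --n) {
        const bool pair = s[i] >= 0xD800 && s[i] <= 0xDBFF && i + 1 < s.size() &&
                          s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        i += pair ? 2 : 1;
    }
    return i;
}

// Hypothetical stand-in for a code point based removal. A count that runs
// past the end removes everything from position onward.
inline void RemoveCodePoints(std::u16string& s, std::size_t position, std::size_t count)
{
    const std::size_t first = Advance(s, 0, position);
    const std::size_t last  = Advance(s, first, count);
    s.erase(first, last - first);
}
```

Removing one code point at position 1 from u"a\U0001D11Ebc" erases two code values and leaves u"abc".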

UnicodeSavvyString & UnicodeSavvyString::replace ( size_type pos, size_type n1, code_value const * s, size_type n2 )

protected

Replaces the code values in range [pos, pos + n1) with n2 code values from the C-array s. WARNING: This function operates on CODE VALUES only. It doesn't know anything about surrogate pairs. It is the caller's responsibility to make sure that the replacement leaves the string in a consistent state. The function grows the string if necessary to accommodate the replacement string.

Parameters

pos	[IN] - index of the code value from where the replacement starts.
n1	[IN] - number of code values to be replaced.
s	[IN] - C-array of code values that will replace the existing code values in the string. Needs to have at least n2 code values in it.
n2	[IN] - length of the replacement sequence.

Returns: reference to this string.
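A caller that works at the code value level may want to check that a range does not cut a surrogate pair in half before replacing it. The following is a sketch of such a check on a std::u16string; both function names are hypothetical and nothing here is provided by the class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical check: true if cutting the string at code value index i
// would separate a high surrogate from the low surrogate that follows it.
inline bool SplitsSurrogatePair(const std::u16string& s, std::size_t i)
{
    return i > 0 && i < s.size() &&
           s[i - 1] >= 0xD800 && s[i - 1] <= 0xDBFF &&
           s[i] >= 0xDC00 && s[i] <= 0xDFFF;
}

// Hypothetical check: true if the code value range [pos, pos + n1) lies
// inside the string and both of its ends fall on code point boundaries.
inline bool IsSafeReplaceRange(const std::u16string& s, std::size_t pos, std::size_t n1)
{
    return pos <= s.size() && n1 <= s.size() - pos &&
           !SplitsSurrogatePair(s, pos) && !SplitsSurrogatePair(s, pos + n1);
}
```

The check covers only the range being replaced. The replacement sequence must itself be well-formed UTF16 for the result to stay consistent.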

void UnicodeSavvyString::reserve ( size_type newCapacity )

Reserves internal memory for at least newCapacity UTF16 code values. If newCapacity is smaller than the current capacity, the call is taken as a nonbinding request to shrink the capacity. The capacity is never reduced below the current number of code values in the string (a call to reserve() doesn't modify the number of code values in the string). Each reallocation invalidates all references, pointers and iterators and carries a cost, so a preemptive call to reserve() is useful both to increase speed and to avoid invalidating references and iterators.

Parameters

newCapacity [IN] - the minimum capacity that the string should have.
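The benefit of reserving up front can be illustrated by analogy with std::u16string, whose reserve() gives a comparable guarantee. This is only an analogy under that assumption; AppendWithoutReallocation is a hypothetical helper and does not use UnicodeSavvyString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper using std::u16string as an analogy.
// Reserves room for n more code values, appends n copies of c, and reports
// whether the buffer stayed at the same address while appending.
inline bool AppendWithoutReallocation(std::u16string& s, std::size_t n, char16_t c)
{
    s.reserve(s.size() + n);           // one allocation at most, done here
    const char16_t* before = s.data(); // pointer taken after reserving
    for (std::size_t i = 0; i < n; ++i)
        s.push_back(c);                // stays within the reserved capacity
    return s.data() == before;
}
```

Pointers and iterators obtained after the reserve() call remain usable for the whole append loop, which is the situation the description above recommends.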

void UnicodeSavvyString::resize ( size_type newSize, code_value fill = code_value() )

Changes the number of code values of *this to newSize. If newSize is bigger than the current size, new code values initialized with the fill value are appended to the string. If the fill parameter is not specified, the default constructor for code_value is used ('\0'). If newSize is smaller, code values are removed from the end of the string. Calling resize(0) has the same effect as clearing the string.

Parameters

newSize	[IN] - the new size of the string.
fill	[IN] - the fill value for new code values if size increases.

void UnicodeSavvyString::Truncate ( CharCounter count )

Truncates the string so it contains the specified number of code points.

Parameters

count [IN] - the desired number of code points the string should contain.
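Truncating by code points means finding the code value index at which the first count code points end and cutting there, so that a surrogate pair is either kept whole or dropped whole. A minimal sketch on std::u16string, with the hypothetical name TruncateToCodePoints and assuming well-formed UTF16:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a code point based truncation.
// Keeps the first `count` code points; does nothing if the string is shorter.
inline void TruncateToCodePoints(std::u16string& s, std::size_t count)
{
    std::size_t i = 0;
    for (; i < s.size() && count > 0; --count) {
        const bool pair = s[i] >= 0xD800 && s[i] <= 0xDBFF && i + 1 < s.size() &&
                          s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        i += pair ? 2 : 1; // never stop between the halves of a pair
    }
    s.resize(i);
}
```

Truncating u"\U0001D11Eab" to one code point keeps two code values, not one.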

Friends And Related Function Documentation

void swap ( UnicodeSavvyString & lhs, UnicodeSavvyString & rhs )

friend

Swaps the two objects. swap() should never throw. The swap idiom is used to efficiently exchange two objects. Defining swap() for your own data structures is important because it lets classes that contain them as data members implement their own swap() and a correct assignment operator.

Parameters

lhs	[IN/OUT] - one of the objects to swap.
rhs	[IN/OUT] - the other object.
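The way a non-throwing swap() helps client classes can be sketched with the copy-and-swap idiom. Label is a hypothetical client type, and std::u16string stands in for a string member that provides swap(); none of this is SDK code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical client class holding a string as a data member.
class Label
{
public:
    explicit Label(std::u16string text) : fText(std::move(text)) {}
    Label(const Label&) = default;

    // Copy-and-swap: the argument is already a copy, so if copying fails
    // *this has not been touched; the swap itself cannot fail.
    Label& operator=(Label other) noexcept
    {
        swap(*this, other);
        return *this;
    }

    // Member-wise swap built on the swap() of each data member.
    friend void swap(Label& lhs, Label& rhs) noexcept
    {
        using std::swap;
        swap(lhs.fText, rhs.fText);
    }

    const std::u16string& Text() const { return fText; }

private:
    std::u16string fText;
};
```

The assignment operator needs no self-assignment check and no manual cleanup, because the old contents leave with the temporary when it is destroyed.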
