mbsrtowcs, mbsrtowcs_s

From cppreference.com
< c‎ | string‎ | multibyte
Defined in header <wchar.h>
(1)
size_t mbsrtowcs( wchar_t* dst, const char** src, size_t len, mbstate_t* ps );
(since C95)
(until C99)
size_t mbsrtowcs( wchar_t *restrict dst, const char **restrict src, size_t len,
                  mbstate_t *restrict ps );
(since C99)
errno_t mbsrtowcs_s( size_t *restrict retval,

                     wchar_t *restrict dst, rsize_t dstsz,
                     const char **restrict src, rsize_t len,

                     mbstate_t *restrict ps );
(2) (since C11)
1) Converts a null-terminated multibyte character sequence, which begins in the conversion state described by *ps, from the array whose first element is pointed to by *src to its wide character representation. If dst is not null, converted characters are stored in the successive elements of the wchar_t array pointed to by dst. No more than len wide characters are written to the destination array. Each multibyte character is converted as if by a call to mbrtowc. The conversion stops if:
  • The multibyte null character was converted and stored. *src is set to null pointer value and *ps represents the initial shift state.
  • An invalid multibyte character (according to the current C locale) was encountered. *src is set to point at the beginning of the first unconverted multibyte character.
  • the next wide character to be stored would exceed len. *src is set to point at the beginning of the first unconverted multibyte character. This condition is not checked if dst is a null pointer.
2) Same as (1), except that
  • the function returns its result as an out-parameter retval
  • if no null character was written to dst after len wide characters were written, then L'\0' is stored in dst[len], which means len+1 total wide characters are written
  • the function clobbers the destination array from the terminating null and until dstsz
  • If src and dst overlap, the behavior is unspecified.
  • the following errors are detected at runtime and call the currently installed constraint handler function:
  • retval, ps, src, or *src is a null pointer
  • dstsz or len is greater than RSIZE_MAX/sizeof(wchar_t) (unless dst is null)
  • dstsz is not zero (unless dst is null)
  • There is no null character in the first dstsz multibyte characters in the *src array and len is greater than dstsz (unless dst is null)
As with all bounds-checked functions, mbsrtowcs_s is only guaranteed to be available if __STDC_LIB_EXT1__ is defined by the implementation and if the user defines __STDC_WANT_LIB_EXT1__ to the integer constant 1 before including <wchar.h>.

Parameters

dst - pointer to wide character array where the results will be stored
src - pointer to pointer to the first element of a null-terminated multibyte string
len - number of wide characters available in the array pointed to by dst
ps - pointer to the conversion state object
dstsz - max number of wide characters that will be written (size of the dst array)
retval - pointer to a size_t object where the result will be stored

Return value

1) On success, returns the number of wide characters, excluding the terminating L'\0', written to the character array. If dst is a null pointer, returns the number of wide characters that would have been written given unlimited length. On conversion error (if invalid multibyte character was encountered), returns (size_t)-1, stores EILSEQ in errno, and leaves *ps in unspecified state.
2) zero on success (in which case the number of wide characters excluding terminating zero that were, or would be written to dst, is stored in *retval), non-sero on error. In case of a runtime constraint violation, stores (size_t)-1 in *retval (unless retval is null) and sets dst[0] to L'\0' (unless dst is null or dstmax is zero or greater than RSIZE_MAX)

Example

#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#include <string.h>
 
void print_as_wide(const char* mbstr)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);
    size_t len = 1 + mbsrtowcs(NULL, &mbstr, 0, &state);
    wchar_t wstr[len];
    mbsrtowcs(&wstr[0], &mbstr, len, &state);
    wprintf(L"Wide string: %ls \n", wstr);
    wprintf(L"The length, including L'\\0': %zu\n", len);
}
 
int main(void)
{
    setlocale(LC_ALL, "en_US.utf8");
    print_as_wide(u8"z\u00df\u6c34\U0001f34c"); // u8"zß水🍌"
}

Output:

Wide string: zß水🍌
The length, including L'\0': 5

References

  • C11 standard (ISO/IEC 9899:2011):
  • 7.29.6.4.1 The mbsrtowcs function (p: 445)
  • K.3.9.3.2.1 The mbsrtowcs_s function (p: 648-649)
  • C99 standard (ISO/IEC 9899:1999):
  • 7.24.6.4.1 The mbsrtowcs function (p: 391)

See also

converts a narrow multibyte character string to wide string
(function)
converts the next multibyte character to wide character, given state
(function)
converts a wide string to narrow multibyte character string, given state
(function)
C++ documentation for mbsrtowcs