Introduction
Recently I was working on a PPM encoder. The encoder was supposed to take an image in a raw format, which is an array of unsigned chars where each byte holds a numerical value in the range from 0 to 255, and encode it in the PPM format, which is basically a very simple text file. I needed a function that would translate a number to a C-style null-terminated string.
// Input
const UINT number = 255;
// Output
const char array[4] = { '2', '5', '5', '\0' };
When I was researching this problem, it turned out that there are several solutions available to us:
- the snprintf function
- the itoa function
- the stringstream
- the std::to_string
- the std::to_chars
Let’s review all of them, and at the end I’ll explain which solution I decided to implement.
The snprintf function
The first potential solution was to use the sprintf function. The basic sprintf function takes a variable number of arguments and a formatting string that describes how to put those arguments together, and it produces the composed C-style string. This is very similar to the printf function, but instead of printing, the sprintf function puts the output string to a buffer, which is an array of chars. A pointer to that buffer is the first argument to this function, so you need to manage that memory yourself and you have to make sure that it is big enough to store the output string, which could be tricky.
Now, there is a better and safer function we could use here, which is snprintf. It is very similar to sprintf, but it takes one additional argument, which represents the maximum number of characters to be written, including the terminating \0. You can simply provide the size of the buffer that will store your output, and the function itself then protects against buffer overflow. It is worth noting that this function appends the \0 character at the end.
Once the function is done, it returns the number of characters that would have been written if the buffer had been large enough, not including the \0 character at the end. If anything went wrong during the encoding, the return value is negative. So a quick way to check that the function worked correctly is to check that the returned count is both bigger than zero and smaller than the output buffer size; a count equal to or larger than the buffer size means the output was truncated.
Here I create a local STL array of chars with a fixed number of elements. In my case, it has to be big enough to store the maximum value of an unsigned integer plus the terminating \0. Then I simply call the snprintf function, where I want it to output an unsigned integer (so the “%u” formatting string, since “%d” is meant for signed integers), and I give it the number I want to translate. Lastly, I check if the function worked correctly, and I use all the information I gathered to append my new string to the output buffer using my custom function.
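The steps described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the helper name append_with_snprintf is hypothetical, and a std::string stands in for the output buffer. It uses the “%u” specifier because the value is an unsigned integer.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Hypothetical helper: appends the decimal text of `number` to `out`.
// Returns false if snprintf reported an error or the text did not fit.
bool append_with_snprintf(std::string& out, unsigned int number)
{
    // digits10 is the number of decimal digits that always fit in the type;
    // +1 for a possible extra leading digit, +1 for the terminating '\0'.
    constexpr std::size_t kSize = std::numeric_limits<unsigned int>::digits10 + 2;
    std::array<char, kSize> buffer{};

    // "%u" is the conversion specifier for unsigned int.
    const int written = std::snprintf(buffer.data(), buffer.size(), "%u", number);

    // Negative: encoding error. Greater or equal to size: output was truncated.
    if (written < 0 || static_cast<std::size_t>(written) >= buffer.size())
        return false;

    // `written` excludes the '\0', so it is exactly the number of digits.
    out.append(buffer.data(), static_cast<std::size_t>(written));
    return true;
}
```

Calling append_with_snprintf(out, 255u) on an empty string leaves out holding the three characters 255.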
The itoa function
The second option was to use the itoa function. Note that we’re talking here about the ITOA function, not IOTA, which has a very similar name, but does something completely different.
The itoa function does exactly what we want: it converts an integer value to a null-terminated string using the specified base and stores the result in the array given by the str parameter. The function also returns a pointer to that char buffer, which is the same as the pointer passed in as the str parameter. The buffer needs to be big enough to store the output characters, which is the tricky part, since we don’t know the exact number of digits in the number in advance.
There are a couple of problems here. First, we don’t know if the string was encoded correctly, because there is no error reporting. Second, we don’t know how many characters have been written, and therefore we don’t know how many characters we need to append to our buffer, so we need to find that out some other way. Third, this function doesn’t provide any protection against buffer overflow. And lastly, which I think is the most important, this function is not defined in ANSI C and is not part of C++; it is only supported by some compilers. So this might reduce the portability of the code.
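Because itoa is not available on every compiler, the sketch below uses a hand-written stand-in with the same shape (value, buffer, base) to show the interface and its weak points. The name itoa_sketch is hypothetical, and real vendor implementations may differ in details such as how negative values are handled in bases other than 10.

```cpp
#include <algorithm>
#include <cstring>

// Hypothetical stand-in with the same shape as the non-standard itoa:
// writes `value` in `base` (2..36) into `str` and returns `str`.
// Like itoa, it cannot check the size of `str`; the caller must guarantee it.
char* itoa_sketch(int value, char* str, int base)
{
    static const char kDigits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (base < 2 || base > 36) {
        str[0] = '\0';
        return str;
    }

    // Work on the magnitude as unsigned so the most negative int is safe.
    const bool negative = (base == 10 && value < 0);
    unsigned int magnitude = negative ? 0u - static_cast<unsigned int>(value)
                                      : static_cast<unsigned int>(value);
    const unsigned int ubase = static_cast<unsigned int>(base);

    char* p = str;
    do {
        *p++ = kDigits[magnitude % ubase];
        magnitude /= ubase;
    } while (magnitude != 0);
    if (negative)
        *p++ = '-';
    *p = '\0';

    std::reverse(str, p);  // digits were produced least significant first
    return str;
}
```

Since the return value is only the buffer pointer, the caller has to recover the length separately, for example with std::strlen, which is the second problem mentioned above.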
Using stringstream
The third option was to use string and stringstream, but it seems like using a big, big axe for such a simple task.
This solution is convenient, but it constructs a whole stream object just to format one number. Additionally, we need a temporary string object from the standard library to hold the result.
The std::to_string function
It is fairly easy and convenient to use the std::to_string function. We can basically get everything that we want in a single line.
Very clean, very convenient. However, this is not a template function but a set of overloads, so it only works for a subset of types, namely integers and floating-point values. It does the job, but for instance it won’t work for string literals. The memory is managed by the returned std::string, which for some might be problematic, but for sure it is convenient.
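A minimal sketch of that single line in context, assuming the output is collected in a std::string. The helper name append_with_to_string is hypothetical.

```cpp
#include <string>

// Hypothetical helper: appends the decimal text of `number` to `out`.
void append_with_to_string(std::string& out, unsigned int number)
{
    // std::to_string returns a temporary std::string that owns its memory.
    out += std::to_string(number);
}
```

There is no buffer size to choose and no error code to check here, but the caller also has no control over where the temporary string stores its characters.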
The std::to_chars function
Finally, std::to_chars is one of the newer functions (added in C++17), and it is very interesting and appealing because it is all about performance. It is declared in the <charconv> header and converts an integer or floating-point value to a sequence of chars. The conversion functions don’t allocate memory; you own the output buffer in all cases. Note that for floating-point numbers the conversion functions aren’t locale-aware, which means they always print and parse decimal points as '.', and never as ',' for locales that use commas.
You need to provide a range in a char array where the function writes its result. The function does not append a null terminator itself, so we have to reduce the range by one, because eventually we want to append the null-terminating character at the end. The function returns a to_chars_result, which is very simple and has all the information we need.
The return structure has only two fields. The ec field contains an error code, if there is any. If an error occurs, you can still try to recover from it. The second field is ptr, and if the function succeeded, it is the one-past-the-end pointer of the written characters. This is very convenient in our case, because we can use that pointer to add a ‘\0’ at the very end.
Conclusion
I created a little table to summarize the properties of each solution, to make sure I’d pick the right one for my case. Eventually, I decided to go with this solution.
Basically, I have two overloaded functions. The first one takes an unsigned integer, translates it to an array of chars using the modern and fast to_chars function, and appends the null-terminating character at the end. The second function takes a pointer to a char array, which represents the null-terminated C-style string, calculates how many characters we need to copy, and then copies them. What is also worth noting is that at the very beginning of my functions, as soon as I know roughly how many characters I will need, I reserve that many bytes so I can avoid repeated memory allocations at run time.
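A rough sketch of such a pair of overloads is shown below. It is not the original code: the names append and encode_values are hypothetical, a std::string stands in for the real output buffer, and a space is assumed as the separator between values.

```cpp
#include <charconv>
#include <cstddef>
#include <cstring>
#include <limits>
#include <string>
#include <system_error>

// Overload for C-style strings: measures the text, then copies it.
void append(std::string& out, const char* text)
{
    out.append(text, std::strlen(text));
}

// Overload for numbers: converts with to_chars, terminates the result,
// then forwards to the C-style string overload.
void append(std::string& out, unsigned int number)
{
    char buffer[std::numeric_limits<unsigned int>::digits10 + 2];
    const auto result = std::to_chars(buffer, buffer + sizeof buffer - 1, number);
    if (result.ec != std::errc{})
        return;  // cannot happen with this buffer size, kept for safety
    *result.ptr = '\0';
    append(out, buffer);
}

// Hypothetical caller: writes each byte as decimal text plus a separator.
std::string encode_values(const unsigned char* data, std::size_t count)
{
    std::string out;
    out.reserve(count * 4);  // at most 3 digits and 1 separator per byte
    for (std::size_t i = 0; i < count; ++i) {
        append(out, static_cast<unsigned int>(data[i]));
        append(out, " ");
    }
    return out;
}
```

Reserving up front means the loop should not trigger reallocations, since a byte never needs more than three digits plus the separator.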