
std::string (C++) and char* (or c-string "string" for C)

You're not working with strings. You're working with pointers. var1 is a char pointer (const char*). It is not a string. If it is null-terminated, then certain C functions will treat it as a string, but it is fundamentally just a pointer.

So when you compare it to a char array, the array decays to a pointer as well, and the compiler then tries to find an operator == (const char*, const char*).

Such an operator does exist. It takes two pointers and returns true if they point to the same address. So the compiler invokes that, and your code breaks.

If you want to do string comparisons, you have to tell the compiler that you want to deal with strings, not pointers.

The C way of doing this is to use the strcmp function:

strcmp(var1, "dev");

This will return zero if the two strings are equal. (It will return a value greater than zero if the left-hand side is lexicographically greater than the right-hand side, and a value less than zero otherwise.)

So to compare for equality you need to do one of these:

if (!strcmp(var1, "dev")){...}
if (strcmp(var1, "dev") == 0) {...}

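However, C++ has a very useful string class. If we use that, your code becomes a fair bit simpler. Of course we could create strings from both arguments, but we only need to do it with one of them. Note that getenv returns a null pointer when the variable is not set, and constructing a std::string from a null pointer is undefined behavior, so check for that first:

const char* raw = getenv("myEnvVar");
std::string var1 = raw ? raw : "";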

if(var1 == "dev")
{
   // do stuff
}

Now the compiler encounters a comparison between string and char pointer. It can handle that, because a char pointer can be implicitly converted to a string, yielding a string/string comparison. And those behave exactly as you'd expect.
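
To make the difference concrete, here is a minimal sketch of the three kinds of comparison discussed above. The helper names (same_address, same_text, env_or) are made up for illustration and are not part of any library; only std::strcmp and std::getenv are real standard functions.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Pointer comparison: true only if both pointers hold the same address.
bool same_address(const char* a, const char* b) { return a == b; }

// C-style comparison of the characters the pointers refer to.
bool same_text(const char* a, const char* b) { return std::strcmp(a, b) == 0; }

// Hypothetical helper: read an environment variable into a std::string,
// falling back to a default when it is unset (std::getenv returns nullptr).
std::string env_or(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}
```

A char array such as char buf[] = "dev"; holds the same text as the literal "dev" but lives at a different address, so same_address(buf, "dev") is false while same_text(buf, "dev") is true. With the helper, env_or("myEnvVar", "") == "dev" is an ordinary std::string comparison.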

--------

Today we'll continue our C-to-C++ migration theme by focusing on std::string, a container-like class used to manage strings. std::string provides much more straightforward string management interfaces, allows you to utilize SBRM (scope-bound resource management) design patterns, and helps eliminate string management overhead.

Let's start off by reviewing built-in string support in C/C++.

Built-in String Support (C-style Strings)

From here on, I'll refer to built-in strings as "C-style strings" or "C-strings".

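Neither C nor C++ has a default built-in string type. C-strings are simply implemented as a char array which is terminated by a null character ('\0', which has the value 0). This last part of the definition is important: all C-strings are char arrays, but not all char arrays are C-strings.

C-strings of this form are called "string literals":

const char * str = "This is a string literal. See the double quotes?";

String literals are indicated by using the double quote (") and are stored as a constant (const) C-string. The null character is automatically appended at the end for your convenience.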

The standard library contains functions for processing C-strings, such as strlen, strcpy, and strcat. These functions are defined in the C header string.h and in the C++ header cstring. These standard C-string functions require the strings to be terminated with a null character to function correctly.

DISADVANTAGES OF C-STRINGS

C arrays do not track their own size. You must keep track of the size on your own or rely on the linear-time strlen function to determine the size of each string during runtime. Since C has no concept of boundary protection, the use of the null character is of paramount importance: the C library functions require it, or else they operate past the bounds of the array.

Working with C-strings is not intuitive. Functions are required to compare strings, and the return value of strcmp is not intuitive either (zero means "equal"). For functions like strcpy and strcat, the programmer is required to remember the correct argument order for each call. Inverting arguments can have a non-obvious yet negative effect.

Many C-strings are used as fixed-size arrays. This is true for literals as well as arrays that are declared in the form char str[32]. For dynamically sized strings, programmers must worry about manually allocating, resizing, and copying strings.

The concept of C-string size/length is not intuitive and commonly results in off-by-one bugs. The null character that marks the end of a C-string requires a byte of storage in the char array. This means that a string of length 24 needs to be stored in a 25-byte char array. However, the strlen function returns the length of the string without the null character. This simple fact has tripped up many programmers (including myself) when copying around memory. Eventually, you end up with a non-null-terminated string, causing the string library functions to operate out-of-bounds.
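
The off-by-one described above can be sketched in a few lines. Both function names here are hypothetical helpers written for this note, not standard library functions; the point is only that the buffer needs strlen + 1 bytes and that a copy has to write the terminator itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Bytes needed to store a copy of s, including the terminating '\0'.
std::size_t bytes_needed(const char* s) { return std::strlen(s) + 1; }

// Hypothetical helper: copy src into dst (capacity dst_size bytes) and
// always null-terminate. Returns false if src had to be truncated.
bool copy_terminated(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;
    const std::size_t len = std::strlen(src);
    const std::size_t n = len < dst_size - 1 ? len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';  // the byte that strlen does not count
    return n == len;
}
```

For example, strlen("abc") is 3 but sizeof("abc") is 4, so a 3-byte buffer cannot hold "abc" as a C-string.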

What If We Could Fix Those Disadvantages?

What if we could fix those disadvantages? What would our ideal string use-case look like? Here are some ideas:

  • Flexible storage capacity

  • Constant-time string length retrieval (rather than a linear-time functional check)

  • No need to worry about manual memory management or resizing strings

  • Boundary issues are handled for me, with or without a null character.

  • Intuitive assignment using the = operator rather than strcpy

  • Intuitive comparison using the == operator rather than strcmp

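  • Intuitive interfaces for other operations such as concatenation (+ operator is nice!) and tokenization

Luckily, the C++ std::string class scratches most of these itches for us. Fundamentally, you can consider std::string as a container for handling char arrays, similar to std::vector<char> with some specialized function additions.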

std::string

The std::string class manages the underlying storage for you, storing your strings in a contiguous manner. You can get access to this underlying buffer using the c_str() member function, which will return a pointer to a null-terminated char array. This allows std::string to interoperate with C-string APIs.
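
As a small sketch of that interoperability, the function below hands a std::string to a C API (std::strlen). The name c_length is an assumption made up for this example. It also shows one caveat: a std::string tracks its own length and may contain embedded null characters, while a C API stops at the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by a C API: counts characters up to the first '\0'.
std::size_t c_length(const std::string& s) { return std::strlen(s.c_str()); }
```

For ordinary text, c_length(s) and s.size() agree. For a string built as std::string("ab\0cd", 5), size() reports 5 while the C view sees only "ab".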

Let's take a look at using std::string.

DECLARATION AND ASSIGNMENT

Declaring a std::string object is simple:

std::string my_str;

You can also initialize it with a C-string:

std::string name("Phillip");

Or initialize it by copying another std::string object:

std::string name2(name);

Or even by making a substring out of another std::string:

std::string lip(name, 4);

There's also a "fill" constructor for std::string which allows you to populate the buffer with a repeated series of characters:

// fill the string with a char. note the single quotes
std::string filled(16, 'A');

Assigning values to a std::string is also simple, as you just need to use the = operator:

// c-string assignment
my_str = "Phillip";

// Copy assignment
my_str = filled;

// Move assignment (name2 is left in a valid but unspecified state)
my_str = std::move(name2);

Isn't this so much easier than using strcpy?

COMPARING STRINGS

Comparing strings for equality using std::string is also much more intuitive, as the == operator has been overloaded for comparison:

if(my_str == name2)
{
    std::cout << "my_str and name2 match!" << std::endl;
}

The use of the == operator works as long as one of the values is a std::string. This means we can compare the std::string to a string literal:

if(my_str == "Phillip")
{
    std::cout << "my_str and \"Phillip\" match!" << std::endl;
}

You can also compare strings lexicographically using the other comparison operators (<, >):

if(string1 < string2)
{
    std::cout << "string1 comes first lexicographically" << std::endl;
}

If you're not familiar with lexicographical ordering, strings are compared character by character, and the first differing character decides the order based on its ASCII value. In ASCII, all upper case letters come before the lower case letters, so "apple" > "Apple".

If you prefer a functional comparison interface, std::string also provides a compare function. This function is similar to strcmp:

  • 0 indicates equality

  • Positive values indicate that the second string comes first lexicographically

  • Negative values mean your string object comes first lexicographically.

if(!str1.compare(str2))
{
    std::cout << "These strings are equal" << std::endl;
}

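You can also compare against a substring of another string object. The substring is of length Y, starting at position X. The substring overloads of compare take the position and length within the calling string first, so comparing all of str1 against part of str2 looks like this:

if(!str1.compare(0, std::string::npos, str2, x, y))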
{
    std::cout << "String 1 is equal to the substring of String 2" << std::endl;
}
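
Here is a minimal sketch of the compare() results described above. The wrapper names compare_sign and equals_substring are hypothetical; they only exist to make the sign convention and the argument order easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Collapse compare()'s result to -1, 0, or +1.
int compare_sign(const std::string& a, const std::string& b) {
    const int r = a.compare(b);
    return (r > 0) - (r < 0);
}

// True if `whole` equals the substring of `other` that starts at pos and
// has length len. Throws std::out_of_range if pos > other.size().
bool equals_substring(const std::string& whole, const std::string& other,
                      std::size_t pos, std::size_t len) {
    return whole.compare(0, std::string::npos, other, pos, len) == 0;
}
```

With these, compare_sign("apple", "Apple") is positive because 'a' has a larger ASCII value than 'A', and equals_substring("lip", "Phillip", 4, 3) is true.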

CONCATENATING STRINGS

I'm sure at this point you won't be surprised: concatenating two strings is a trivial operation that involves using the + operator:

//Concatenation is also simple!
my_str = lip + name2;
my_str += "lip"; //C-string cat works too

If you prefer a functional interface, std::string also provides an append function. Each of these functions appends something onto the end of your std::string object:

std::string my_str("test");
std::string str2("boo");
const char * c_str = "This is a c_str";

// We can append a string
my_str.append(str2);
my_str.append(c_str);

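// We can append the first X characters of a C-string
my_str.append(c_str, x);

// For a std::string argument, the second parameter is a starting index:
// this appends the substring of str2 from index X to the end
my_str.append(str2, x);

// We can also append a substring, starting at index X and of length Y
my_str.append(str2, x, y);
my_str.append(c_str, x, y);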
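
The two-argument append overloads are easy to mix up, so here is a short sketch of the difference. The wrapper names append_tail and append_head are made up for this example; the append calls inside them are the standard overloads.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// std::string source: the number is a starting index into src.
// Appends src[pos..end] to base.
std::string append_tail(std::string base, const std::string& src, std::size_t pos) {
    base.append(src, pos, std::string::npos);
    return base;
}

// C-string source: the number is a character count.
// Appends the first `count` characters of src (count must not exceed its length).
std::string append_head(std::string base, const char* src, std::size_t count) {
    base.append(src, count);
    return base;
}
```

So append_tail("", "Phillip", 4) yields "lip", while append_head("", "Phillip", 4) yields "Phil".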

ACCESSING CHARACTERS

Similar to C-strings, std::string supports the indexing operator [] to access specific characters. Just as with C-strings and arrays, indexing starts at 0. As with other containers, the indexing operator does not perform bounds checking. If you wish to have bounds checking applied, you can use the at() member function, which throws std::out_of_range for an invalid index.
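
A minimal sketch of the bounds-checked access, assuming you want a fallback character rather than a propagated exception. The function name char_or is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the character at index i, or `fallback` if i is out of range.
// at() throws std::out_of_range when i >= s.size(); operator[] would not check.
char char_or(const std::string& s, std::size_t i, char fallback) {
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```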

OTHER STD::STRING INTERFACES

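std::string provides many other useful interfaces. I'll just provide a brief overview of functionality; full interface documentation can be found at cppreference.

For handling storage: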

  • size() and length() both return the length of the std::string

    • size is provided to maintain a common interface with container classes

  • capacity() provides the current number of characters that can be held in the currently allocated storage

  • empty() returns true if a string is currently empty

  • clear() resets the container to an empty string

  • reserve() requests that the underlying storage buffer be grown to at least the requested capacity; the length of the string is unchanged

  • resize() changes the length of the string itself, truncating it or appending characters (optionally with a specific fill value)

  • shrink_to_fit() requests that the buffer be shrunk to the current string size, freeing up unused storage capacity (the request is non-binding)
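
Since reserve() and resize() are easy to confuse, here is a small sketch of the difference: one changes only capacity, the other changes the string's length. Both wrapper names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// resize() changes the length: it truncates, or pads with `fill`.
std::string resized(std::string s, std::size_t n, char fill) {
    s.resize(n, fill);
    return s;
}

// reserve() changes only the capacity: the contents stay as they were.
// Returns true if that holds for this call.
bool reserve_keeps_contents(std::string s, std::size_t n) {
    const std::string before = s;
    s.reserve(n);
    return s == before && s.capacity() >= n;
}
```

For example, resized("ab", 4, '-') gives "ab--" and resized("abcdef", 3, '-') gives "abc", while reserving 64 characters for "ab" leaves it as "ab".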

For modifying strings:

  • insert() inserts characters or strings at a specific position

  • replace() replaces characters in a substring

  • push_back() appends a character to the end of the string

  • pop_back() removes the last character of the string

  • erase() removes specific characters

For working with substrings:

  • substr() returns a copy of the substring at the specified position

  • find() identifies the first position within a string where the specified character or substring can be found

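  • rfind() finds the last occurrence of a character or substring

  • find_first_of() finds the first character that matches any character in a given set

  • find_last_of() finds the last character that matches any character in a given set

  • find_first_not_of() finds the first character that matches none of the characters in a given set

  • find_last_not_of() finds the last character that matches none of the characters in a given set

Remember, full documentation can be found on cppreference.com.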
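
As a rough sketch of how the search functions combine, the hypothetical helper below pulls the first token out of a string. It is only one way to do simple tokenization, built from find_first_not_of(), find_first_of(), and substr().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: return the first run of characters in s that
// contains none of the characters in delims, or "" if there is none.
std::string first_token(const std::string& s, const std::string& delims) {
    const std::size_t start = s.find_first_not_of(delims);  // skip leading delimiters
    if (start == std::string::npos) return "";
    const std::size_t end = s.find_first_of(delims, start); // next delimiter, if any
    return s.substr(start, end == std::string::npos ? std::string::npos : end - start);
}
```

Note that the argument to the find_first_of() family is treated as a set of characters, not as a substring: on "hello world", find("wor") returns 6, while find_first_of("ow") returns 4 (the first 'o').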

A Note on Avoiding Copy Overhead

Unless you want to make a copy of your std::string, you will want to avoid passing around strings by value:

void foo(std::string str);

Instead, you should pass the argument by reference if you want to modify the string:

void foo(std::string &str);

Or by const reference if the string will not be modified:

void foo(const std::string &str);

I rarely find myself passing around std::string containers by value, since I want to avoid the unnecessary copies.
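
A short sketch of the two reference-passing styles in use. The function names are made up for this example; the first only reads the string, the second modifies the caller's string in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Read-only access: a const reference avoids copying the string.
std::size_t count_char(const std::string& s, char c) {
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), c));
}

// In-place modification: a non-const reference changes the caller's string.
void to_upper_in_place(std::string& s) {
    for (char& ch : s) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
}
```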

When Should I Use std::string?

Great, now we have some idea of what we can do with a std::string. When and why should I use std::string over C-strings?

Let's consider some of the advantages to using std::string:

  • The interfaces are much more intuitive to use, leading to fewer chances of messing up argument order

  • Better searching, replacement, and string manipulation functions (cf. the cstring library)

  • The size/length functions are constant time (cf. the linear-time strlen function)

  • Reduced boilerplate by abstracting memory management and buffer resizing

  • Reduced risk of segmentation faults by utilizing iterators and the at() function

  • Compatible with STL algorithms

  • Ability to utilize SBRM design patterns

Overall, std::string provides a modern interface for string management and will help you write much more straightforward code than C-strings. In general, prefer std::string to C-strings, and especially so for mutable strings.
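
To illustrate the STL compatibility point, here is a minimal sketch that treats a std::string like any other container with iterators. The two function names are hypothetical; std::reverse and std::equal are the standard algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Works on a copy, so the caller's string is untouched.
std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}

// Compare the string against itself read backwards.
bool is_palindrome(const std::string& s) {
    return std::equal(s.begin(), s.end(), s.rbegin());
}
```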

std::string Limitations

There's storage overhead involved with using a std::string object. C-strings are the simplest possible storage method for a string, making them attractive in situations where memory must be conserved. However, similar to other C++ containers, I find that this minor overhead is worth the convenience.

When utilizing a std::string, memory must be dynamically allocated and initialized during runtime. You cannot pre-allocate a std::string buffer at compile time, and you cannot supply a pre-existing buffer for std::string to assume ownership over. Unlike std::string, C-strings can utilize compile-time allocation and determination of size. Additionally, memory allocation is handled by the std::string class itself. If you need fine-grained control of memory management, look to manual management with C-strings.

One major gripe I have with std::strings is that they don't play nicely with string literals. String literals are placed in static storage and cannot be taken over by a std::string. Initializing a std::string using a string literal will always involve a copy. C-strings still seem to be the best storage option for string literals, especially if you want to avoid unnecessary copies (such as in an embedded environment).

C++17: std::string_view

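If you are using C++17, you can avoid memory allocation and still enjoy the C++ string interfaces by using std::string_view. The entire purpose of std::string_view is to avoid copying data which is already owned and of which only a fixed view is required. A std::string_view can refer to either a C++ string or a C-string. All that std::string_view needs to store is a pointer to the character sequence and a length.

std::string_view provides most of the same read-only API that std::string does, so it is a good match for C-style string literals.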

std::string_view my_view("Works with a string literal");

The only catch with std::string_view is that it is non-owning, so the programmer is responsible for making sure the std::string_view does not outlive the string which it points to. Embedded applications are mostly interested in forcing static memory allocations, so there is little worry about lifetime problems when using string literals with std::string_view.
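
A minimal sketch of a function that takes std::string_view, assuming a C++17 compiler. The name starts_with is a hypothetical helper written for this note. Because the parameters are views, the same function accepts string literals and std::string objects without copying either.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// True if `text` begins with `prefix`. substr() on a view returns another
// view, so no characters are copied and nothing is allocated.
bool starts_with(std::string_view text, std::string_view prefix) {
    return text.substr(0, prefix.size()) == prefix;
}
```

The lifetime caveat still applies: a view passed in like this is fine, but storing a std::string_view that points into a temporary std::string would leave it dangling.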

Putting it All Together

I've written a basic std::string example, which can be found in the embedded-resources GitHub repository.

Last updated 5 years ago