How to split a string in C/C++, Python and Java?
Splitting a string by some delimiter is a very common task. For example, we have a comma-separated list of items from a file and we want individual items in an array.
Almost all programming languages provide a function to split a string by some delimiter.
In C:
    // Splits str[] according to given delimiters
    // and returns next token. It needs to be called
    // in a loop to get all tokens. It returns NULL
    // when there are no more tokens.
    char * strtok(char str[], const char *delims);
    // A C/C++ program for splitting a string
    // using strtok()
    #include <stdio.h>
    #include <string.h>

    int main()
    {
        char str[] = "Geeks-for-Geeks";

        // Returns first token
        char *token = strtok(str, "-");

        // Keep printing tokens while one of the
        // delimiters present in str[].
        while (token != NULL) {
            printf("%s\n", token);
            token = strtok(NULL, "-");
        }

        return 0;
    }

Output:

    Geeks
    for
    Geeks
In C++:
Note: The main disadvantage of strtok() is that it only works with C-style strings, so a C++ string must first be explicitly converted into a char array. Many programmers are unaware that C++ has two additional APIs that are more elegant and work directly with C++ strings.
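To make the conversion mentioned in the note concrete, here is a minimal sketch of calling strtok() on a C++ string. The helper name split_with_strtok is hypothetical (it is not part of any library); the sketch assumes that copying the input into a writable buffer is acceptable, which is needed because strtok() overwrites delimiters in the buffer it scans.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits s on any character in delims using strtok().
// strtok() writes '\0' into the buffer it scans, so we work on a copy
// instead of the caller's string.
std::vector<std::string> split_with_strtok(const std::string& s,
                                           const char* delims) {
    // Writable, null-terminated copy of the input.
    std::vector<char> buffer(s.begin(), s.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* token = std::strtok(buffer.data(), delims);
         token != nullptr;
         token = std::strtok(nullptr, delims)) {
        tokens.emplace_back(token);  // copy the token out of the buffer
    }
    return tokens;
}
```

Calling split_with_strtok("Geeks-for-Geeks", "-") returns the three tokens Geeks, for and Geeks. Because strtok() treats a run of delimiters as a single separator, this approach never produces empty tokens.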
Method 1: Using stringstream API of C++
A stringstream object can be initialized from a string object; reading from it with operator >> automatically tokenizes the string on whitespace. Just like the “cin” stream, a stringstream lets you read a string as a stream of words.
Some of the most commonly used functions of stringstream:
- clear(): resets the stream's error state flags (it does not erase the contents).
- str(): returns the contents of the stream as a C++ string object; str(s) replaces the contents with s.
- operator <<: pushes a string object into the stream.
- operator >>: extracts a word from the stream.
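As an illustration of how these four functions fit together, the following sketch reuses one stringstream object for two inputs. The function name words_from_both is hypothetical and exists only for this example. Note that clear() only resets the stream's error flags, so the sketch also calls str("") to discard the old contents before reusing the stream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the whitespace-separated words of two
// inputs using a single stringstream object.
std::vector<std::string> words_from_both(const std::string& first,
                                         const std::string& second) {
    std::vector<std::string> words;
    std::stringstream ss;
    std::string word;

    ss << first;               // operator<< pushes text into the stream
    while (ss >> word) {       // operator>> extracts one word at a time
        words.push_back(word);
    }

    // The last (failed) extraction set eofbit and failbit.
    // clear() resets those flags; str("") replaces the buffered text.
    ss.clear();
    ss.str("");

    ss << second;
    while (ss >> word) {
        words.push_back(word);
    }
    return words;
}
```

For example, words_from_both("How do", "you do!") returns the four words How, do, you and do!. Without the call to clear(), the second extraction loop would not run, because the stream would still be in a failed state.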
The code below demonstrates it.
- C++
    #include <iostream>
    #include <sstream>
    #include <string>
    using namespace std;

    // A quick way to split strings separated via spaces.
    void simple_tokenizer(string s)
    {
        stringstream ss(s);
        string word;
        while (ss >> word) {
            cout << word << endl;
        }
    }

    int main(int argc, char const* argv[])
    {
        string a = "How do you do!";

        // Takes only space separated C++ strings.
        simple_tokenizer(a);
        cout << endl;
        return 0;
    }
Output:

    How
    do
    you
    do!
Method 2: Using C++ find() and substr() APIs.
This method is more robust and can parse a string with any delimiter, not just spaces (although the function below defaults to splitting on a single space). The logic is simple to follow from the code below.
- C++
    #include <iostream>
    #include <string>
    using namespace std;

    void tokenize(string s, string del = " ")
    {
        // find() returns string::npos when del is not found
        string::size_type start = 0;
        string::size_type end = s.find(del);
        while (end != string::npos) {
            cout << s.substr(start, end - start) << endl;
            start = end + del.size();
            end = s.find(del, start);
        }
        // Print the last token (everything after the final delimiter)
        cout << s.substr(start);
    }

    int main(int argc, char const* argv[])
    {
        // Takes C++ string with any separator
        string a = "Hi$%do$%you$%do$%!";
        tokenize(a, "$%");
        cout << endl;
        return 0;
    }
Output:

    Hi
    do
    you
    do
    !
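In practice the tokens are usually needed in a container, as in the comma-separated example from the introduction, so here is a sketch of the same find() and substr() loop that collects the tokens instead of printing them. The name split_by and the handling of an empty delimiter (returning the whole string as one token) are assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of the find()/substr() approach: returns the tokens
// in a vector instead of printing them.
std::vector<std::string> split_by(const std::string& s,
                                  const std::string& del) {
    std::vector<std::string> tokens;

    // Assumption: an empty delimiter means "do not split". Without this
    // guard, find("") would always match and the loop would never end.
    if (del.empty()) {
        tokens.push_back(s);
        return tokens;
    }

    std::string::size_type start = 0;
    std::string::size_type end = s.find(del);
    while (end != std::string::npos) {
        tokens.push_back(s.substr(start, end - start));
        start = end + del.size();
        end = s.find(del, start);
    }
    tokens.push_back(s.substr(start));  // text after the last delimiter
    return tokens;
}
```

Unlike strtok(), this version keeps empty tokens: split_by("a,,b", ",") returns three strings, the middle one empty. Whether that is desirable depends on the data being parsed.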
Method 3: Using temporary string
If the delimiter is known to be a single character, you can simply build each token in a temporary string while scanning the input once. This avoids the repeated find() and substr() calls of Method 2.
- C++
    #include <iostream>
    #include <string>
    using namespace std;

    void split(string str, char del)
    {
        // declaring temp string to store the curr "word" upto del
        string temp = "";

        for (int i = 0; i < (int)str.size(); i++) {
            // If cur char is not del, then append it to the cur "word", otherwise
            // you have completed the word, print it, and start a new word.
            if (str[i] != del) {
                temp += str[i];
            }
            else {
                cout << temp << " ";
                temp = "";
            }
        }

        cout << temp;
    }

    int main()
    {
        string str = "geeks_for_geeks"; // string to be split
        char del = '_';                 // delimiter around which string is to be split

        split(str, del);

        return 0;
    }
Output
geeks for geeks
In Java:
In Java, split() is a method of the String class.
    // regexp is the delimiting regular expression;
    // limit (when positive) is the maximum number of returned strings
    public String[] split(String regexp, int limit);

    // We can call split() without limit also
    public String[] split(String regexp)
- Java
    // A Java program for splitting a string
    // using split()
    public class Test
    {
        public static void main(String args[])
        {
            String Str = "Geeks-for-Geeks";

            // Split above string in at-most two strings
            for (String val : Str.split("-", 2))
                System.out.println(val);

            System.out.println("");

            // Splits Str into all possible tokens
            for (String val : Str.split("-"))
                System.out.println(val);
        }
    }
Output:
    Geeks
    for-Geeks

    Geeks
    for
    Geeks
In Python:
The split() method in Python returns a list of strings after breaking the given string by the specified separator.
    # sep is the delimiter string (not a regular expression); if it is
    # omitted, the string is split on runs of whitespace
    # maxsplit limits the number of splits to be made (-1 means no limit)
    str.split(sep=None, maxsplit=-1)
- Python
    line = "Geek1 \nGeek2 \nGeek3"
    print(line.split())
    print(line.split(' ', 1))
Output:
    ['Geek1', 'Geek2', 'Geek3']
    ['Geek1', '\nGeek2 \nGeek3']
Last Updated on October 25, 2021 by admin