Pandas is a widely-used library in Python for data manipulation and analysis. It provides a powerful data structure called DataFrame, which represents tabular data in rows and columns. When a large DataFrame or Series is printed, pandas truncates the output by default and shows only the first and last few rows and columns. In this article, we will discuss how to print the entire Pandas DataFrame or Series without truncation using different methods.
Method 1: Using to_string() Method
The to_string() method is the simplest approach: it renders the entire DataFrame as a single string, regardless of the row and column display limits. Because the whole table is built in memory and printed at once, it is best suited to DataFrames with up to a few thousand rows and is not recommended for very large datasets. Here’s an example:
import numpy as np
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
df = pd.DataFrame(data.data, columns = data.feature_names)
# Render the whole DataFrame as a string and print it
print(df.to_string())
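The same call also works on a Series, which the example above does not show. The following is a minimal sketch that uses made-up stand-in data (not the iris dataset) so that it runs without scikit-learn; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 200 rows, more than print(df) shows by default
df = pd.DataFrame(np.arange(600).reshape(200, 3), columns=["a", "b", "c"])

full_text = df.to_string()              # one header line plus every data row
full_lines = full_text.splitlines()

# A Series has to_string() as well; by default it emits one line per value
series_lines = df["a"].to_string().splitlines()

print(len(full_lines), len(series_lines))
print(full_text[:200])                  # peek at the start instead of dumping it all
```

Because the result is an ordinary Python string, it can also be written to a log or a file instead of being printed.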
Method 2: Using pd.option_context() Method
Pandas allows changing display settings via the option_context() and set_option() functions. option_context() changes the settings only inside its with block and restores the previous values on exit, while set_option() changes them for the rest of the session, until they are changed again or reset. Here’s an example using option_context():
import numpy as np
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
df = pd.DataFrame(data.data, columns = data.feature_names)
# Change pandas settings locally with a context manager
with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 3):
    print(df)
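To see that the change really is local to the with block, the option can be read before, inside, and after it. This is a small sketch using pd.get_option(); the variable names are illustrative.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None):
    inside = pd.get_option("display.max_rows")   # None while the block is active

after = pd.get_option("display.max_rows")        # back to the earlier value

print(before, inside, after)
```

The previous value is restored even if the code inside the block raises an exception, which makes this approach a reasonable default for one-off printing.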
Method 3: Using pd.set_option() Method
The pd.set_option() function takes the same options as pd.option_context(), but the change stays in effect for the rest of the session. To revert it, call pd.reset_option() with the name of an option, or pd.reset_option('all') to restore every option to its default. Here’s an example:
import numpy as np
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
df = pd.DataFrame(data.data, columns = data.feature_names)
# Change the pandas settings for the rest of the session
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
# None removes the column-width limit (-1 is no longer accepted by recent pandas)
pd.set_option('display.max_colwidth', None)
# Display the dataframe
print(df)
# Reset the options
print('**RESET_OPTIONS**')
pd.reset_option('all')
print(df)
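When only a couple of options were changed, resetting them by name is a narrower alternative to resetting everything. The sketch below assumes just two options were touched; the list name `changed` is illustrative.

```python
import pandas as pd

changed = ["display.max_rows", "display.max_columns"]

for name in changed:
    pd.set_option(name, None)           # lift the limit for the rest of the session

during = [pd.get_option(name) for name in changed]

for name in changed:
    pd.reset_option(name)               # restore the default for this option only

after = [pd.get_option(name) for name in changed]
print(during, after)
```

This leaves any unrelated options that were customized elsewhere in the session untouched.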
Method 4: Using to_markdown() Method
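The to_markdown() method is similar to to_string(), but it renders the DataFrame as a Markdown table, with pipe-delimited columns and a header separator row. It relies on the optional tabulate package, which must be installed. Here’s an example: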
import numpy as np
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
df = pd.DataFrame(data.data, columns = data.feature_names)
# Convert the dataframe to a string object with formatting
print(df.to_markdown())
Each of the methods above prints the entire Pandas DataFrame or Series without truncation. It’s important to select the method based on the size of the dataset and the desired output format. The methods below cover a few more variations.
Method 5: Using option_context() as a Context Manager
This method temporarily changes the display settings using pd.option_context() as a context manager, which means that the settings apply only within the with statement. It is the same mechanism as Method 2, here lifting only the row limit.
import pandas as pd
import numpy as np
# create a DataFrame with 1000 rows and 10 columns
df = pd.DataFrame(np.random.randn(1000, 10))
# temporarily set the display.max_rows option to None
with pd.option_context('display.max_rows', None):
    print(df)
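If this pattern is needed in several places, it can be wrapped in a small function that accepts either a DataFrame or a Series. The helper name `print_full` is hypothetical and not part of pandas; this is only a sketch of one way to package the context manager.

```python
import numpy as np
import pandas as pd


def print_full(obj):
    """Hypothetical helper: print a DataFrame or Series with the limits lifted."""
    with pd.option_context("display.max_rows", None,
                           "display.max_columns", None,
                           "display.width", None):
        print(obj)


# A Series with 100 values, more than pandas prints by default
s = pd.Series(np.arange(100), name="value")
print_full(s)
```

The global options are unchanged after the call, so later print() calls are truncated as usual.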
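Method 6: Using .style
Pandas provides a style attribute for DataFrames, which returns a Styler object that applies custom CSS formatting to the DataFrame. A Styler renders as HTML rather than plain text, so it is shown by displaying it in a Jupyter notebook or by exporting it with to_html(); calling print() on it shows only the object's repr. This method can be useful for displaying DataFrames with a large number of columns.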
import pandas as pd
import numpy as np
# create a DataFrame with 1000 rows and 10 columns
df = pd.DataFrame(np.random.randn(1000, 10))
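# set_properties() takes CSS properties, not pandas display options
custom_style = {'white-space': 'nowrap'}
# apply the custom style; a Styler renders as HTML (requires jinja2)
styled = df.style.set_properties(**custom_style)
# export the HTML, or end a notebook cell with `styled` to display it
html = styled.to_html()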
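Method 7: Using IPython.display
If you’re working in a Jupyter notebook or IPython console, you can use the display function from the IPython.display module to render the DataFrame. Note that display() still honors the pandas display options, so the row limit must be lifted for the entire DataFrame to appear.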
import pandas as pd
import numpy as np
from IPython.display import display
# create a DataFrame with 1000 rows and 10 columns
df = pd.DataFrame(np.random.randn(1000, 10))
# display() still honors display.max_rows, so lift the limit first
with pd.option_context('display.max_rows', None):
    display(df)
Method 8: Using pandas.options.display
This method sets attributes on pandas.options.display, the attribute-style interface to the same options used by pd.set_option(), to raise the maximum number of rows and columns to be displayed.
import pandas as pd
# Create a sample dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [6, 7, 8, 9, 10],
                   'C': [11, 12, 13, 14, 15],
                   'D': [16, 17, 18, 19, 20]})
# Set the maximum number of rows and columns to be displayed
pd.options.display.max_rows = len(df)
pd.options.display.max_columns = len(df.columns)
# Print the entire dataframe
print(df)
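Setting the limit to len(df) only fits that one DataFrame, and the assignment stays in effect afterwards. A variation, sketched below with made-up data, assigns None to lift the limit for any size and restores the earlier value when done; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical sample data with 100 rows
df = pd.DataFrame({"A": range(100), "B": range(100, 200)})

# Attribute access and get_option() read the same underlying setting
previous = pd.options.display.max_rows
try:
    pd.options.display.max_rows = None          # None lifts the limit for any size
    same_setting = pd.get_option("display.max_rows") is None
    lines = repr(df).splitlines()               # the text that print(df) would show
finally:
    pd.options.display.max_rows = previous      # put the earlier value back

print(same_setting, len(lines))
```

The try/finally block plays the role that option_context() plays in the earlier methods, so the context manager is usually the shorter choice when the change is meant to be temporary.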
Last Updated on April 16, 2023 by admin