Monday, October 3, 2022

How To Add a New Empty Column in Pandas


Adding an empty column to an existing pandas DataFrame with Python

Picture by Kelly Sikkema on Unsplash

In a few of my recent articles here on Medium, we discussed how to add new columns to pandas DataFrames based on the values of other columns.

Another similar operation that users often need to apply to DataFrames is adding a new empty column to an existing frame. This is useful when we want to fill in that column later on, rather than initialising it with specific values that are either hardcoded or based on other calculations.

In today's short tutorial we will demonstrate how to add such an empty column to an existing pandas DataFrame.

First, let's create an example DataFrame that we will reference throughout this article in order to demonstrate a few concepts.

import pandas as pd

df = pd.DataFrame(
    [
        (1, 121, True),
        (2, 425, False),
        (3, 176, False),
        (4, 509, False),
        (5, 120, True),
        (6, 459, False),
        (7, 981, True),
        (8, 292, True),
    ],
    columns=['colA', 'colB', 'colC']
)
print(df)
   colA  colB   colC
0     1   121   True
1     2   425  False
2     3   176  False
3     4   509  False
4     5   120   True
5     6   459  False
6     7   981   True
7     8   292   True

Creating a new empty column in Pandas

Since we want the new column to be empty, we can assign numpy.nan to every record. Inserting a new column filled with NaN values is as simple as:

import numpy as np

df['colD'] = np.nan
print(df)
   colA  colB   colC  colD
0     1   121   True   NaN
1     2   425  False   NaN
2     3   176  False   NaN
3     4   509  False   NaN
4     5   120   True   NaN
5     6   459  False   NaN
6     7   981   True   NaN
7     8   292   True   NaN
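
Plain assignment modifies the DataFrame in place and appends the column at the end. As a minimal sketch of two alternatives, assuming a smaller stand-in frame than the one above: assign() returns a new DataFrame and leaves the original untouched, while insert() lets us choose the position of the empty column.

```python
import numpy as np
import pandas as pd

# Small stand-in frame; the column names mirror the article's example.
df = pd.DataFrame({"colA": [1, 2, 3], "colB": [121, 425, 176]})

# assign() returns a new DataFrame; df itself is not modified.
with_empty = df.assign(colD=np.nan)

# insert() modifies the frame in place and takes a target position;
# here the empty column is placed right after colA (position 1).
positioned = df.copy()
positioned.insert(1, "colD", np.nan)

print(with_empty.columns.tolist())  # ['colA', 'colB', 'colD']
print(positioned.columns.tolist())  # ['colA', 'colD', 'colB']
print(df.columns.tolist())          # ['colA', 'colB']
```

Which one to use is mostly a matter of style: assign() fits method chaining, while insert() is convenient when column order matters.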

Specifying the dtype of the empty column

Even though the newly created column in the DataFrame is empty, we may want to give it a specific dtype. The expectation is that at some point this empty column will be filled in with values (otherwise, what is the point of creating it in the first place?).

We can print out the data type of each column in our DataFrame by accessing the dtypes property:

>>> df.dtypes
colA      int64
colB      int64
colC       bool
colD    float64
dtype: object

Pandas automatically created the empty column with dtype float64. This is because np.nan is itself a float:

>>> type(np.nan)
<class 'float'>
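
The same reasoning applies to other placeholders: the dtype of the new column follows from the value assigned to it. A minimal sketch, assuming a small stand-in frame and hypothetical column names (as_nan, as_nat, as_none) chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Small stand-in frame, not the article's full example.
df = pd.DataFrame({"colA": [1, 2, 3]})

df["as_nan"] = np.nan   # float placeholder, gives a float64 column
df["as_nat"] = pd.NaT   # "not a time" placeholder, gives a datetime-like column
df["as_none"] = None    # generic Python placeholder, typically an object column

# All three columns are reported as missing, but their dtypes differ.
print(df.dtypes)
print(df.isna().all())
```

The exact dtype labels printed may vary slightly between pandas versions, so it is worth checking df.dtypes on your own installation.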

Suppose instead that we intend to store boolean values in the newly created empty column. In this case, we can cast the column right after creating it, as illustrated below:

import numpy as np

df['colD'] = np.nan
df['colD'] = df['colD'].astype('boolean')
print(df.dtypes)
colA      int64
colB      int64
colC       bool
colD    boolean
dtype: object
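
Note that the cast targets the nullable 'boolean' dtype rather than NumPy's plain 'bool', which cannot represent missing values. The sketch below, which assumes a small stand-in frame and hypothetical column names (flag, n_items), shows the difference and also builds the empty column with its final dtype in one step using pd.array:

```python
import numpy as np
import pandas as pd

# Small stand-in frame, not the article's full example.
df = pd.DataFrame({"colA": [1, 2, 3]})

# Create the empty column directly with the nullable "boolean" dtype,
# skipping the intermediate float64 step.
df["flag"] = pd.array([pd.NA] * len(df), dtype="boolean")

# The same pattern works for other nullable dtypes, e.g. "Int64".
df["n_items"] = pd.array([pd.NA] * len(df), dtype="Int64")

# Filling in a value later keeps the column's dtype.
df.loc[0, "flag"] = True
print(df.dtypes)
print(df)

# By contrast, casting NaN to plain bool does not stay "empty":
# NaN is truthy, so every entry becomes True.
plain = pd.Series([np.nan, np.nan]).astype(bool)
print(plain.tolist())  # [True, True]
```

This is one possible approach rather than the only one; the two-step cast shown in the article yields the same 'boolean' column.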