pyspark add multiple columns to dataframe code example

Example: python add multiple columns to pandas dataframe

# Basic syntax:
df[['new_column_1_name', 'new_column_2_name']] = pd.DataFrame([[np.nan, 'word']], index=df.index)
# Where the right-hand side is a pandas dataframe that shares df's index
# and has one column of values per new column name

# Example usage:
# Define example dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7]
})

print(df)
#    col_1  col_2
# 0      0      4
# 1      1      5
# 2      2      6
# 3      3      7

# Add several columns simultaneously:
df[['new_col_1', 'new_col_2', 'new_col_3']] = pd.DataFrame([[np.nan, 42, 'wow']], index=df.index)
print(df)
#    col_1  col_2  new_col_1  new_col_2 new_col_3
# 0      0      4        NaN         42       wow
# 1      1      5        NaN         42       wow
# 2      2      6        NaN         42       wow
# 3      3      7        NaN         42       wow

# Note: this isn't much more efficient than simply doing three
# separate assignments, e.g.:
df['new_col_1'] = np.nan
df['new_col_2'] = 42
df['new_col_3'] = 'wow'
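
As an alternative to the indexing form and the separate assignments above, `DataFrame.assign` can also add several columns in one call. The following is a minimal sketch and not part of the original example. It reuses the same example dataframe and column names, and the name `df_new` is only illustrative. Unlike the assignments above, `assign` returns a new dataframe and leaves `df` unchanged.

```python
import numpy as np
import pandas as pd

# Same example dataframe as above
df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7]
})

# Each keyword becomes a new column; scalar values are repeated for
# every row. The result is a new dataframe, so df is not modified.
df_new = df.assign(new_col_1=np.nan, new_col_2=42, new_col_3='wow')
print(df_new)
```

This form may be easier to read when the new values are scalars or are computed from existing columns, since each column name sits next to its value.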
