Convert pandas DataFrame column of comma separated strings to one-hot encoded

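Note that, strictly speaking, you're not dealing with one-hot encodings (OHEs) here: a row can contain several 1s, or none at all, so what you are building is a multi-label indicator matrix.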

str.split + stack + get_dummies + sum

import pandas as pd
data = {"mesh": ["A, B, C", "C,B", ""]}
df = pd.DataFrame(data)
df

      mesh
0  A, B, C
1      C,B
2         

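# Split on commas (ignoring surrounding whitespace), stack into one long Series,
# build indicator columns, then add them back up per original row.
# groupby(level=0).sum() is used because sum(level=0) was removed in pandas 2.0.
(df.mesh.str.split(r'\s*,\s*', expand=True)
   .stack()
   .str.get_dummies()
   .groupby(level=0).sum())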

   A  B  C
0  1  1  1
1  0  1  1
2  0  0  0
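
For readers on a recent pandas, here is a minimal self-contained sketch of the same idea. It swaps `expand=True` + `stack` for `Series.explode` (available from pandas 0.25), which is my substitution rather than part of the recipe above; the names `labels` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"mesh": ["A, B, C", "C,B", ""]})

# Split each cell into a list of labels, then give every label its own row.
# The original row index is repeated, so it can be used for grouping later.
labels = df["mesh"].str.split(r"\s*,\s*").explode()

# str.get_dummies ignores empty strings, so the blank row stays all zeros.
result = labels.str.get_dummies().groupby(level=0).sum()

print(result)
```

Because the original index survives the explode step, `groupby(level=0)` collapses the indicators back to one row per input row.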

apply + value_counts

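# Count the labels in each row; iloc[:, 1:] drops the leading empty-string
# column that comes from the blank row.
(df.mesh.str.split(r'\s*,\s*', expand=True)
   .apply(pd.Series.value_counts, axis=1)
   .iloc[:, 1:]
   .fillna(0)
   .astype(int))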

   A  B  C
0  1  1  1
1  0  1  1
2  0  0  0

pd.crosstab

# Cross-tabulate the original row index against the stacked labels.
x = df.mesh.str.split(r'\s*,\s*', expand=True).stack()
pd.crosstab(x.index.get_level_values(0), x.values).iloc[:, 1:]

col_0  A  B  C
row_0         
0      1  1  1
1      0  1  1
2      0  0  0

I figured there is a simpler answer, or at least one that feels simpler to me than chaining multiple operations.

  1. Make sure each cell holds its values separated by commas, with no extra spaces.

  2. Use the built-in sep parameter of str.get_dummies to set the separator to a comma. The default separator is the pipe character, '|'.

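    import pandas as pd

    data = {"mesh": ["A, B, C", "C,B", ""]}
    sof_df = pd.DataFrame(data)
    # Remove spaces so that every label is delimited by a bare comma
    sof_df.mesh = sof_df.mesh.str.replace(' ', '')
    # One indicator column per comma-separated label
    sof_df.mesh.str.get_dummies(sep=',')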
    

OUTPUT:

       A  B  C
    0  1  1  1
    1  0  1  1
    2  0  0  0
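
A possible variation, sketched under my own assumptions: removing every space would also alter labels that contain spaces themselves (a hypothetical "New York", say), so this version only removes whitespace around the commas and then joins the indicator columns back onto the original frame. The names `cleaned`, `dummies` and `result` are illustrative, not from the answer above.

```python
import pandas as pd

sof_df = pd.DataFrame({"mesh": ["A, B, C", "C,B", ""]})

# Normalise separators: collapse whitespace around commas, trim the ends.
cleaned = sof_df["mesh"].str.replace(r"\s*,\s*", ",", regex=True).str.strip()

# One indicator column per comma-separated label.
dummies = cleaned.str.get_dummies(sep=",")

# Keep the original column alongside the indicators.
result = sof_df.join(dummies)

print(result)
```

Passing `regex=True` explicitly keeps the behaviour the same across pandas versions, since the default for `str.replace` changed in pandas 2.0.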