How to drop rows with nulls in one column in PySpark

DataFrames are immutable, so applying a filter that keeps only non-null values creates a new DataFrame without the records that had nulls in that column.

df = df.filter(df.col_X.isNotNull())

Use either drop with a subset, df.na.drop(subset=["col_X"]), or filter with isNotNull() as above.


If you want to drop every row in which any value is null, use df.na.drop() (same as df.na.drop("any"); the default is "any").

To drop a row only if all of its values are null, use df.na.drop("all").

To restrict the check to specific columns, pass a column list: df.na.drop(how="all", subset=["col1", "col2", "col3"]). (Seq(...) is the Scala API; in PySpark the subset is a plain Python list.)