Create an if-else condition column in dask dataframe

Answers:

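  1. What you're doing now is almost right. You don't need to call compute until you're ready for your final answer, so drop the list(...) and compute() calls and assign the lazy series directly.

    # Before: computes eagerly and builds a Python list
    # ddf1 = ddf.assign(col1 = list(ddf.shop_week.apply(f).compute()))
    # After: stays lazy until you call compute()
    ddf1 = ddf.assign(col1 = ddf.shop_week.apply(f))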
    

    In some cases dd.Series.where might be a good fit. It keeps the original value where the condition is true and substitutes other elsewhere:

    ddf1 = ddf.assign(col1 = ddf.shop_week.where(cond=ddf.balance > 0, other=0))
    
  2. As of version 0.10.2 you can insert columns directly into dask dataframes:

    ddf['col'] = ddf.shop_week.apply(f)
    

You could just use:

    f = lambda x: 'THIS' if x == 200607 else 'NOT THIS' if x == 200608 else 'THAT' if x == 200609 else 'NONE'

And then:

    ddf1 = ddf.assign(col1 = list(ddf.shop_week.apply(f).compute()))

Unfortunately, I don't have an answer to the second question, or I don't understand it.
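
A chained conditional expression like the one above gets harder to read as more weeks are added. One possible alternative, as a rough sketch, is a dictionary lookup with a default. The names LABELS and label_week are hypothetical and not part of the original answer; the mapping reproduces the same four outcomes.

```python
# Hypothetical names for illustration: LABELS and label_week.
LABELS = {
    200607: "THIS",
    200608: "NOT THIS",
    200609: "THAT",
}

def label_week(x):
    """Return the label for a shop_week value, or 'NONE' if it is not mapped."""
    return LABELS.get(x, "NONE")
```

It can be passed wherever f is used, for example ddf.shop_week.apply(label_week).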