sklearn.compose.ColumnTransformer: fit_transform() takes 2 positional arguments but 3 were given

There are two major reasons why this will not work for your purpose.

  1. LabelEncoder() is designed to be used for the target variable (y). That is why you get the positional-argument error when ColumnTransformer() tries to call it with X, y=None, fit_params={}.

From Documentation:

Encode labels with value between 0 and n_classes-1.

fit(y)
Fit label encoder

Parameters:
y : array-like of shape (n_samples,)
Target values.

  2. Even if you do a workaround to remove the empty dictionary, LabelEncoder() still cannot take a 2D array (basically multiple features at a time), because it only accepts 1D y values.

Short answer: we should not use LabelEncoder() for input features.
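
You can see the 1D restriction directly. A minimal sketch (the fruit data here is made up, and the exact error text may vary between scikit-learn versions):

>>> from sklearn.preprocessing import LabelEncoder
>>> le = LabelEncoder()
>>> le.fit_transform(['apple', 'orange', 'apple'])   # 1D, target-like: works
array([0, 1, 0])
>>> le.fit_transform([['apple', 'green'], ['orange', 'blue']])   # 2D features: fails
Traceback (most recent call last):
  ...
ValueError: y should be a 1d array, got an array of shape (2, 2) instead.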

Now, what is the solution for encoding the input features?

Use OrdinalEncoder() for ordinal features, or OneHotEncoder() for nominal features.

Example:

>>> import numpy as np
>>> from sklearn.compose import ColumnTransformer
>>> from sklearn.preprocessing import OrdinalEncoder, OneHotEncoder
>>> X = np.array([[1000., 100., 'apple', 'green'],
...               [1100., 100., 'orange', 'blue']])
>>> ct = ColumnTransformer(
...     [("ordinal", OrdinalEncoder(), [0, 1]),    # numeric columns -> ordinal codes
...      ("nominal", OneHotEncoder(), [2, 3])])    # string columns -> one-hot vectors
>>> ct.fit_transform(X)
array([[0., 0., 1., 0., 0., 1.],
       [1., 0., 0., 1., 1., 0.]])

I believe this is actually an issue with LabelEncoder. The LabelEncoder.fit and LabelEncoder.fit_transform methods only accept self and y as arguments (which is odd, since most transformers follow the fit(X, y=None, **fit_params) paradigm). In a pipeline or ColumnTransformer, the transformer's fit_transform gets called as fit_transform(X, y, **fit_params) regardless of what you have passed. In this particular situation, LabelEncoder.fit_transform therefore receives X plus y=None, i.e. three positional arguments counting self where it accepts only two, thus raising the error.
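
A minimal sketch that triggers that call chain (the fruit data is made up, and the exact traceback text may vary between scikit-learn versions):

>>> import numpy as np
>>> from sklearn.compose import ColumnTransformer
>>> from sklearn.preprocessing import LabelEncoder
>>> X = np.array([['apple', 'green'],
...               ['orange', 'blue']])
>>> ct = ColumnTransformer([("le", LabelEncoder(), [0])])
>>> ct.fit_transform(X)   # ColumnTransformer calls fit_transform(X, y, **fit_params)
Traceback (most recent call last):
  ...
TypeError: fit_transform() takes 2 positional arguments but 3 were given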

From my point of view this is a bug in LabelEncoder, but you should take that up with the sklearn folks as they may have some reason for implementing the fit method differently.