import pandas as pd

threshold = 125  # If unique values > threshold, fill with mean; otherwise fill with mode

# Function to fill NaN values based on the threshold
def fill_nulls_with_mean_or_mode(df, threshold):
    for column in df.columns:
        unique_count = df[column].nunique()
        print(f"Processing column: {column}")
        if unique_count > threshold and pd.api.types.is_numeric_dtype(df[column]):
            # Fill NaN values with the mean if the column is numeric and its
            # unique value count exceeds the threshold
            df[column] = df[column].fillna(df[column].mean())
        else:
            # Otherwise fill NaN values with the mode (skipped if the column is entirely NaN)
            modes = df[column].mode()
            if not modes.empty:
                # Assign back instead of using inplace=True on a column selection
                df[column] = df[column].fillna(modes.iloc[0])
    return df

train2 = fill_nulls_with_mean_or_mode(train2, threshold)
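For comparison, here is a minimal sketch of the same mean-or-mode rule that collects one fill value per column and then calls `fillna` once on the whole frame. The helper name `build_fill_values` and the `demo` frame are hypothetical, introduced only for illustration. Whether this changes the behaviour reported here is not known without the actual data.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def build_fill_values(df, threshold):
    """Map each column that contains NaNs to its fill value (mean or mode)."""
    fill_values = {}
    for column in df.columns:
        series = df[column]
        if not series.isna().any():
            continue  # nothing to fill in this column
        if is_numeric_dtype(series) and series.nunique() > threshold:
            # Numeric column with many distinct values: use the mean
            fill_values[column] = series.mean()
        else:
            # Few distinct values or non-numeric: use the most frequent value
            modes = series.mode()
            if not modes.empty:
                fill_values[column] = modes.iloc[0]
    return fill_values

# Tiny hypothetical frame standing in for train2
demo = pd.DataFrame({"x": [1.0, 2.0, None, 4.0], "y": ["a", None, "a", "b"]})
filled = demo.fillna(build_fill_values(demo, threshold=2))
print(filled)
```

Passing a dict to `DataFrame.fillna` fills each listed column with its own value, so the original frame is left untouched and the result is a new frame.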
We can't investigate without the data; our guess is that train2 is very large. If you can share a minimal reproducible notebook and data, we can take a look, but this doesn't seem to be Colab-specific.
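One way to put together such a minimal reproducible example is to build a small synthetic frame with missing values and report its memory footprint alongside the failing call. The sketch below is only an illustration: `make_sample_frame`, its column names, and the 10% missing rate are assumptions, not a description of train2.

```python
import numpy as np
import pandas as pd

def make_sample_frame(n_rows=1_000, seed=0):
    """Build a small synthetic frame with NaNs (hypothetical stand-in for train2)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        # Continuous column: many unique values
        "numeric_many": rng.normal(size=n_rows),
        # Float column with only a handful of unique values
        "numeric_few": rng.integers(0, 5, size=n_rows).astype(float),
        # Non-numeric column
        "category": rng.choice(["a", "b", "c"], size=n_rows),
    })
    # Blank out roughly 10% of the values in each column
    for column in df.columns:
        mask = rng.random(n_rows) < 0.1
        df.loc[mask, column] = np.nan
    return df

sample = make_sample_frame()
print(sample.isna().sum())  # NaN count per column
print(f"{sample.memory_usage(deep=True).sum() / 1e6:.2f} MB")  # approximate size in memory
```

Running the same `memory_usage(deep=True)` line on the real frame would show how large it is, which helps tell a memory problem apart from a logic error.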
This is the code:

# Define your threshold for unique value counts
threshold = 125  # If unique values > threshold, fill with mean; otherwise fill with mode

# Function to fill NaN values based on the threshold
def fill_nulls_with_mean_or_mode(df, threshold):
    for column in df.columns:
        unique_count = df[column].nunique()
        print(f"Processing column: {column}")

fill_nulls_with_mean_or_mode(train2, threshold)