
Solve Pandas drop_duplicates Still Not Unique in value_counts

When using pandas drop_duplicates(), we may encounter rows that still appear duplicated when checking with

df.column_name.value_counts()

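It is not obvious at first why drop_duplicates() gives a seemingly inconsistent result. A likely cause is that, by default, it compares all columns, so rows that share a key but differ in any other column are kept. To remove duplicate rows and get values that are 100% unique on an index or key column, you can use this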

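# Drop every row whose key repeats an earlier row (the first occurrence is kept)
df_unique = df_unique.drop(df_unique[df_unique["key_column_name"].duplicated()].index)
# Every count should now be 1
df_unique.key_column_name.value_counts()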
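
For comparison, here is a minimal sketch of the same idea using the `subset` argument of `drop_duplicates()`. The sample data and the `value` column are made up for illustration; only `key_column_name` comes from the snippet above. One possible reason to prefer this form is that dropping by index labels can remove more rows than intended if the index itself contains repeated labels, while `subset` works on the column values directly.

```python
import pandas as pd

# Hypothetical sample data: "key_column_name" repeats, "value" differs
df = pd.DataFrame({
    "key_column_name": ["a", "b", "a", "c", "b"],
    "value": [1, 2, 3, 4, 5],
})

# Default drop_duplicates() compares all columns, so nothing is dropped here
all_columns = df.drop_duplicates()

# Restricting the comparison to the key column keeps the first row per key
df_unique = df.drop_duplicates(subset="key_column_name", keep="first")

# Every key should now appear exactly once
counts = df_unique["key_column_name"].value_counts()
print(counts)
```

If the last occurrence of each key should be kept instead, `keep="last"` can be passed in place of `keep="first"`.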
