First time here? Checkout the FAQ!
x
0 votes
132 views
asked in Exploratory Data Analysis by (150 points)  
So say I have a column with categorical data like different styles of temperature: 'Lukewarm', 'Hot', 'Scalding', 'Cold', 'Frostbite',... etc.


I know that we can use pd.get_dummies to convert the column to numerical data within the dataframe, but I also know that there are other 'converters' (not sure if that's the correct terminology) that we can use, i.e. OneHotEncoder from Sk-learn (like I could use the pipeline module to make a nice pipeline and feed my dataframe through the pipeline to also get my categorical data encoded to numerical).


How do I know which to use? Does it matter? If it does matter, when does it matter the most (i.e. what types of problems? When there are lots of categorical variables, or few?) If anyone can give me any pointers on this type of stuff I'd greatly appreciate it.
  

Please log in or register to answer this question.

...