I preprocessed the data, normalized the numerical features, and one-hot encoded the categorical ones. I ended up with a model with R^2 = 0.7 and an RMSE that is 15% of the range of values.
I'm okay with the accuracy, but I was wondering whether there's a way to reduce the RMSE to around 7%?


Let me know please.


Thanks!

1 Answer


There is no straightforward path to better results from this point. A common recommendation is to look at the records that produce the largest errors and work out what causes them. Are they outliers? Do you need more data or more features?
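As a rough illustration of that error inspection, here is a minimal sketch in plain Python. The helper names (`rmse_as_fraction_of_range`, `largest_errors`) and the sample numbers are made up for this example; in practice you would pass in your own targets and model predictions.

```python
import math

def rmse_as_fraction_of_range(y_true, y_pred):
    """RMSE divided by the range (max - min) of the true values."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    value_range = max(y_true) - min(y_true)
    return math.sqrt(mse) / value_range

def largest_errors(y_true, y_pred, k=3):
    """Indices of the k records with the largest absolute error, worst first."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    order = sorted(range(len(abs_err)), key=lambda i: abs_err[i], reverse=True)
    return order[:k]

# Hypothetical values, for illustration only
y_true = [10.0, 12.0, 15.0, 20.0, 50.0]
y_pred = [11.0, 12.5, 13.0, 19.0, 30.0]

ratio = rmse_as_fraction_of_range(y_true, y_pred)
worst = largest_errors(y_true, y_pred, k=2)

print(f"RMSE as a fraction of the range: {ratio:.3f}")
for i in worst:
    print(f"record {i}: true={y_true[i]}, predicted={y_pred[i]}")
```

If a handful of records dominate the squared error, as the last record does in this toy data, checking whether they are outliers or are missing an informative feature is usually a better first step than tuning the model further.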

The other guidelines are presented in this article.
