I've been researching sentiment analysis with word embeddings. Several papers state that word embeddings ignore the sentiment information of words. One paper reports that, among the 10 words most semantically similar to a given word, around 30 percent have the opposite polarity (e.g. happy and sad).

So I trained word embeddings on my own dataset (Amazon reviews) with the GloVe algorithm in R. I then looked up the most similar words by cosine similarity and found that every neighbour had the same sentiment polarity as the query word (e.g. beautiful: lovely, gorgeous, pretty, nice, love). This is the opposite of what I expected from the papers. What could explain my findings?
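
To make the comparison with the papers concrete, the check described above can be turned into a number: for a query word, take its top-k neighbours by cosine similarity and count how many carry the opposite polarity label. The following is only a minimal sketch in Python (not the R code actually used). The vectors in `toy_vectors` and the labels in `toy_polarity` are made-up values for illustration, and the function names are hypothetical. In practice the vectors would come from the trained GloVe model and the labels from a sentiment lexicon.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(word, vectors, k=10):
    """Return the k words most similar to `word`, most similar first."""
    sims = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]


def opposite_polarity_rate(word, vectors, polarity, k=10):
    """Fraction of labelled top-k neighbours whose polarity differs from `word`.

    Neighbours without a polarity label are skipped. Returns None when the
    query word itself is unlabelled or no neighbour is labelled.
    """
    if word not in polarity:
        return None
    labelled = [w for w, _ in top_k(word, vectors, k) if w in polarity]
    if not labelled:
        return None
    flipped = sum(1 for w in labelled if polarity[w] != polarity[word])
    return flipped / len(labelled)


# Hypothetical 3-dimensional vectors, invented for this example only.
toy_vectors = {
    "happy": [0.9, 0.8, 0.1],
    "glad": [0.8, 0.9, 0.2],
    "sad": [0.7, 0.6, -0.5],
    "table": [-0.2, 0.1, 0.9],
}
# Hypothetical polarity labels: +1 positive, -1 negative.
toy_polarity = {"happy": 1, "glad": 1, "sad": -1}

neighbours = [w for w, _ in top_k("happy", toy_vectors, k=2)]
rate = opposite_polarity_rate("happy", toy_vectors, toy_polarity, k=2)
```

With these toy values the two nearest neighbours of "happy" are "glad" and "sad", so the rate is 0.5. Averaging this rate over many query words would give a figure that can be set directly against the roughly 30 percent reported in the paper.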

Two of the many papers I read:

  • Yu, L. C., Wang, J., Lai, K. R. & Zhang, X. (2017). Refining Word Embeddings Using Intensity Scores for Sentiment Analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26(3), 671-681.
  • Tang, D., Wei, F., Yang, N., Zhou, M., Liu, T. & Qin, B. (2014). Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 1: Long Papers, 1555-1565.
