Passing variable length sentences to Tensorflow LSTM

Question

Passing variable length sentences to Tensorflow LSTM

asked Feb 10, 2019 in General by sujitha (130 points)

I have a tensorflow LSTM model for predicting the sentiment. I build the model with the maximum sequence length 150. (Maximum number of words) While making predictions, i have written the code as below:

batchSize = 32
maxSeqLength = 150

def getSentenceMatrix(sentence):
    arr = np.zeros([batchSize, maxSeqLength])
    sentenceMatrix = np.zeros([batchSize,maxSeqLength], dtype='int32')
    cleanedSentence = cleanSentences(sentence)
    cleanedSentence = ' '.join(cleanedSentence.split()[:150])
    split = cleanedSentence.split()
    for indexCounter,word in enumerate(split):
        try:
            sentenceMatrix[0,indexCounter] = wordsList.index(word)
        except ValueError:
            sentenceMatrix[0,indexCounter] = 399999 #Vector for unkown words
    return sentenceMatrix

input_text = "example data"
inputMatrix = getSentenceMatrix(input_text)

In the code i'm truncating my input text to 150 words and ignoring remaining data.Due to this my predictions are wrong.

cleanedSentence = ' '.join(cleanedSentence.split()[:150])

I know that if we have lesser length than sequence length we can pad with zero's. What we need to do if we have more length. Can you suggest me the best way to do this. Thanks in advance.

Passing variable length sentences to Tensorflow LSTM

Please log in or register to add a comment.

Please log in or register to answer this question.

Related questions

Categories