I have trained a sentiment analysis model using TFIDF vectorizer features with Logistic Regression as the classifier. At test time, I preprocess and normalize the content, then pass the resulting string of text to the TFIDF vectorizer. However, the following error keeps appearing when I use the vectorizer to transform the features:
ValueError: Iterable over raw text documents expected, string object received.
The code works fine when I use more than one sample for testing.
If anyone can tell me what the problem is, that would be great!
I will explain this with an example.
Let's assume that you have one sample to test, for instance the content of an article. After preprocessing you have the following output:
x = "['doctor', 'mbbs', 'student', 'found', 'shot', 'dead', 'hostel']"
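The above output is currently a single string. Note the quotes around the [] brackets.
The TFIDF vectorizer expects a list (or any other iterable) of strings, one per document. For a single sample, that means a list containing one element, which is the string itself. That is also why the code works when you test with more than one sample: in that case you are already passing a list.
The error can be removed by adding the following line: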
x = [x]
After this, x will have the following form:
["['doctor', 'mbbs', 'student', 'found', 'shot', 'dead', 'hostel']"]
Now x is a list (an iterable) containing a single string, which is the input format the vectorizer accepts.
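To make this concrete, here is a minimal sketch that reproduces the error and then applies the fix. It assumes scikit-learn's TfidfVectorizer and LogisticRegression, and the training texts and labels are toy data made up for illustration, not the data from the question.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data, invented purely for illustration.
train_texts = [
    "great movie loved it",
    "wonderful acting and story",
    "terrible plot hated it",
    "awful boring film",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
clf = LogisticRegression().fit(X_train, train_labels)

# A single test sample, held as a plain string.
x = "loved the wonderful story"

# Passing the bare string raises the ValueError from the question.
try:
    vectorizer.transform(x)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Wrapping the string in a list gives an iterable with one document.
features = vectorizer.transform([x])
prediction = clf.predict(features)
print(features.shape[0], "row;", len(prediction), "prediction")
```

The feature matrix has one row per document in the list, so a single wrapped string yields one row and one prediction.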
Is there any possibility you could share the code?
The only code needed to resolve this error is to put the variable x into a list using square brackets, i.e. [x]. Whichever variable holds your preprocessed content can be wrapped the same way. This satisfies the vectorizer's requirement on the input format.
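If the same prediction code has to handle both a single string and a list of strings, one option is a small wrapper like the sketch below. The names as_documents and predict_sentiment are hypothetical helpers invented for this example; vectorizer and classifier are assumed to be an already fitted TFIDF vectorizer and classifier with the usual transform and predict methods.

```python
def as_documents(text_or_docs):
    """Return a list of documents, wrapping a bare string in a list."""
    if isinstance(text_or_docs, str):
        # A lone string is one document, not an iterable of documents.
        return [text_or_docs]
    return list(text_or_docs)


def predict_sentiment(vectorizer, classifier, text_or_docs):
    """Hypothetical helper: vectorize one or many documents and predict."""
    docs = as_documents(text_or_docs)
    features = vectorizer.transform(docs)
    return classifier.predict(features)


print(as_documents("doctor mbbs student"))
print(as_documents(["first doc", "second doc"]))
```

With this in place, predict_sentiment(vectorizer, classifier, x) works whether x is one string or a list of strings.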