I’ve been working on an application which has an Alexa/Siri-like feature. There are a few commands in my application like “Play music”, “Take a picture”, “Search on the internet” etc. I’ve integrated a speech-to-text API which gives me the input as text, but I don’t understand how to match that text against the predefined commands in my app to determine which command should be executed.
I’ve read that Hamming distance and Levenshtein distance work well for string matching, but I was wondering if someone could explain them to me using code samples.
Also, can I somehow use TF-IDF vectors for this task?
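You are correct: when comparing raw strings you can use Hamming or Levenshtein distance. Note that Hamming distance is only defined for strings of equal length, so Levenshtein distance is usually the better fit for free-form speech input. Both can be implemented as simple functions of two arguments (the two strings to compare).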
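To illustrate the "function of two arguments" idea, here is a minimal sketch of Hamming distance. The function name `hamming_distance` is just an illustrative choice, and the equal-length check reflects the definition of the metric rather than any particular library.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Hamming distance is undefined for strings of different lengths.
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


print(hamming_distance("play music", "play magic"))  # 2
```

Because recognized speech rarely has the same length as a stored command, this is mostly useful as a stepping stone to Levenshtein distance.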
Python has a package for computing the Levenshtein distance. Here’s an example of using it:
import Levenshtein
Levenshtein.distance('Play music', 'Take a picture') # 11
As for TF-IDF (or other feature vectors for strings), you can encode your strings into vectors and then compute geometry-based distances (e.g. Euclidean, cosine) on these vectors. This could work for your task, and there are Python packages which provide this vectorization (scikit-learn has TF-IDF).
I would suggest you start simple by using the Levenshtein library (or implement it yourself as an exercise) and only later try out the vector approaches if you find them necessary.
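As a rough sketch of the vector approach, the following uses only the standard library so it runs without extra installs; in practice scikit-learn's TF-IDF vectorizer would do the same job. The command list, the whitespace tokenizer, the smoothed IDF variant, and all function names here are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical command set, taken from the examples in the question.
COMMANDS = ["play music", "take a picture", "search on the internet"]


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the documents."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector as a dict; words unknown to the commands are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


IDF = build_idf(COMMANDS)
COMMAND_VECTORS = {c: tfidf_vector(c, IDF) for c in COMMANDS}


def closest_command(text):
    """Return (command, similarity) for the most similar command."""
    query = tfidf_vector(text, IDF)
    best = max(COMMANDS, key=lambda c: cosine(query, COMMAND_VECTORS[c]))
    return best, cosine(query, COMMAND_VECTORS[best])


command, score = closest_command("please play some music")
print(command)  # play music
```

A similarity of 0 means the input shares no words with any command, so the caller should treat low scores as "no match" rather than trusting the returned command.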
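If you do want to try it as an exercise, a minimal pure-Python sketch of Levenshtein distance and of matching recognized text to the nearest command might look like this. The command list comes from the question; `match_command` and the `max_ratio` cutoff of 0.4 are hypothetical choices that you would tune for your own app.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic dynamic-programming recurrence."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


# Hypothetical command set, taken from the examples in the question.
COMMANDS = ["Play music", "Take a picture", "Search on the internet"]


def match_command(text, commands=COMMANDS, max_ratio=0.4):
    """Return the closest command, or None if even the best match is too far off."""
    query = text.strip().lower()
    distance, best = min((levenshtein(query, c.lower()), c) for c in commands)
    # Reject matches that need too many edits relative to the string length.
    if distance > max_ratio * max(len(query), len(best)):
        return None
    return best


print(match_command("play musik"))       # Play music
print(match_command("what time is it"))  # None
```

Lowercasing both sides before comparing keeps capitalization differences in the speech-to-text output from counting as edits.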
The utility of TF-IDF is to determine which words in a body of text (corpus) are important. Importance is gauged by how often a word occurs in a document, discounted by how many documents it occurs in: words that appear almost everywhere tend to be prepositions and other parts of speech that give sentences structure but carry little meaning.
TF-IDF will tell you which entities the user's command refers to, e.g., “When was Nelson Mandela born?”. It will tell you that “Nelson Mandela” and “born” are of interest, but you need a secondary mechanism to determine exactly which operation is required.
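For reference, one common formulation of this weighting is shown below; libraries differ in smoothing and normalization details, so treat it as a sketch rather than the exact formula any given package uses. Here tf(t, d) is the number of times term t occurs in document d, N is the number of documents, and df(t) is the number of documents containing t.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)}
```

A word that appears in every document gets log(N/N) = 0, which is how very common words end up discounted.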