StringSimilarity for ContainsText

Posted: October 16th, 2018, 7:42 pm
by PaPaVero
Hello!

I have read your article "How to match two strings approximately" on http://www.delphiarea.com/articles/how- ... oximately/

So I wonder how I could compare the probability that a string (e.g. 'Path Copy Copy 11.0.2') is contained in each of two strings of approximately 500 characters, where:
• the first 500-char-string (A) contains the string 'PathCopyCopySettings' and
• the second 500-char-string (B) does not contain the string 'PathCopyCopySettings'

Re: StringSimilarity for ContainsText

Posted: November 17th, 2018, 12:23 pm
by Kambiz
One way is to split your string into words, then calculate how likely it is that your search word is one of the words in the string.

\[P(word \in sentence) = \max_{w_i \in sentence} P(word = w_i)\]
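As a rough illustration of the single-word case, here is a minimal Python sketch. It assumes that P(word = w_i) can be approximated by a similarity score between 0 and 1. The helper names `similarity` and `word_in_sentence` are hypothetical, and `difflib.SequenceMatcher` is only a stand-in for the article's StringSimilarity function, so the actual scores will differ.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1].

    Stand-in only: this is difflib's ratio, not the article's StringSimilarity.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def word_in_sentence(word: str, sentence: str) -> float:
    """Best similarity between `word` and any whitespace-separated word."""
    # default=0.0 covers a sentence that contains no words at all
    return max((similarity(word, w) for w in sentence.split()), default=0.0)
```

Calling `word_in_sentence` on each of the two long strings gives two scores that can be compared directly; the string with the higher score is the one more likely to contain the word.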
If your search phrase is not a single word, then apply the above procedure to the n-grams of your string (runs of n consecutive words), where n is the number of words in your search phrase.
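A minimal Python sketch of the n-gram variant, under the same kind of assumptions: a window of n consecutive words slides over the text, and each window is scored against the whole phrase. The names `similarity` and `phrase_in_text` are hypothetical, `difflib.SequenceMatcher` is a stand-in for the article's StringSimilarity, and the two sample texts are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for StringSimilarity)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def phrase_in_text(phrase: str, text: str) -> float:
    """Best similarity between `phrase` and any n-word window of `text`."""
    phrase_words = phrase.split()
    n = len(phrase_words)
    if n == 0:
        return 0.0  # an empty phrase has nothing to match
    target = " ".join(phrase_words)
    words = text.split()
    # A text shorter than n words is treated as a single window.
    window_count = max(len(words) - n + 1, 1)
    windows = (" ".join(words[i:i + n]) for i in range(window_count))
    return max(similarity(target, window) for window in windows)


# Hypothetical sample texts, much shorter than the 500-character strings
text_a = "shell extension Path Copy Copy settings dialog"
text_b = "shell extension for renaming many files"
score_a = phrase_in_text("Path Copy Copy", text_a)
score_b = phrase_in_text("Path Copy Copy", text_b)
```

With n = 1 this reduces to the single-word procedure. Comparing `score_a` with `score_b` answers the original question of which text is more likely to contain the phrase.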