How to find the difference between two strings in Excel

I have two similar strings as below.
str1: this is an example
str2: An example of the script
How can I edit the script in Custom Actions to get the result as below?
diff: this is
Thank you!

Hi Shawn,
Two questions:
So you only want it to output the diff of what exists in the first string but not in the second, i.e. you want the diff to output “this is” but not “of the script”?
Is there a minimum length of string you want it to compare? Do you simply want it to find the longest possible matching substring between the two then perform the diff or do you want it to cut off at a minimum length so it doesn’t just perform a comparison based on a single character if nothing longer than that matches?
Thanks
-Ethan Morris
RPath RPA Manager

Yes, I’d like to attain the following result by the function:
FindDiff(str1, str2) = this is
FIndDiff(str2, str1) = of the script

The minimum length is uncertain because the delimiter is one space.

Sorry, originally tried to paste this as a code snippet and I did a rather poor job.
The custom script action can be found in the .rpae in this zip,
it accepts two input strings: str_string_one and str_string_two and creates an output list lst_outputs.
The reason the output is a list is in case there are differing texts at both the beginning and end of String One (i.e. if String One were ‘this is an example of the script’ and string two were ‘i need an example for this’ it would output [this is, of the script], whereas with your initial example it would output a list of one containing the requested output of [this is]

You can take it from the attached folder.string_comparison_as_custom_action.zip (1.7 KB)

-Ethan Morris
RPath RPA Manager

2 Likes

Hi @emorris, the script is very useful, thanks a lot for your help!
I can use the script to find the different strings between two strings as below.

I have two questions:

  1. How to use this script in For Each loop for check Excel data?
  2. How to make this script ignore case for text comparison?

To loop through an excel document (I am assuming similar to the one you have in this screenshot) I would write columns A and B to two separate lists, and use an iterator number variable as discussed in this thread:


To get String1 and String2 for each row and then feed them through the script.

To make the comparisons case-agnostic, the step which identifies the longest common sub-string between the two strings should be adjusted to compare versions of the strings that have both been converted entirely to upper case, then once the diff has been extracted the length of the diff results from the beginning and end of the string should be determined and substrings should be extracted from the ORIGINAL, non-case adjusted string one based on that. I will post an updated version that accounts for this a little later on today. (Alternatively, if the case of your output doesn’t matter, you can just add ‘Change Case’ steps prior to using the script to convert both strings entirely to upper or lower case.)
Hope that answers your questions!
-Ethan Morris
RPath RPA Manager

1 Like

I’ve attached the modified script below, when I tested it against your samples it worked on my end, feel free to let me know if that is not the case for you though.

-Ethan Morris
RPath RPA Manager

string_comparison_as_custom_action_text_agnostic.zip (1.8 KB)

3 Likes

Hi @emorris,
I truly appreciate your help in resolving the problem!

The new script is perfect for ignore case for text comparison.
And I use the script in for each loop follow your instructions.


Tips:

  1. Counter and for each loop start from 2 in order to skip the title row.
  2. After assigning result value to the array, clear the lst_outputs list.

Thanks again for all your help.

1 Like

Fantastic!
Glad this helped, Shawn.
-Ethan Morris
RPath RPA Manager

2 Likes

Hi @emorris,
I used this script to compare two columns string with each other, but the result is not correct.
Is there something wrong with my actions flow? Or how do I modify the script?
I have attached the folder and Excel file. Looking forward to your help.
String_Comparison.zip (3.0 KB) StringComparison_test.xlsx (10.1 KB)

Here’s a VBA function - WORDDIF for your reference.
VBA FunctionWORDDIF.txt (760 Bytes)

I’m looking into this, Shawn,
The good news is it’s not a problem in any of the more complex functions, it’s getting held up on the final step which makes the whole thing much easier (Specifically I think I’m doing a count from beginning and need to be doing a count from end). The bad news, though, is that I noticed a discrepancy that WILL occur that I want to double check with you:

As the script stands now, between the strings “Aristo Telenet Pvt Ltd” and “ARISTO TELENET PRIVATE LIMITED” the script (once repaired) would output “vt Ltd” and “RIVATE LIMITED” respectively, rather than what your goal is, which is “Pvt Ltd” and “PRIVATE LIMITED”. So my question is how big of a problem is this? Should there be a failsafe included that says “If the next-to-last character is a space, assume that the last two characters should be excluded” or something to that effect?
Thanks
-Ethan Morris
RPath RPA Manager

Hi @emorris,
Thanks for your quick reply.
I’d like to get the different words like “Pvt Ltd” and “PRIVATE LIMITED”, not “vt Ltd” and “RIVATE LIMITED”.

Good News!
I realized string comparison by variables and current Excel Actions as below Actions Flow.
WORDDIF.zip (2.2 KB)


Tips:

  1. Split the String A by space into the temp array.
  2. Upper temp array words and String B to ignore case.
  3. Consolidate words as the result string if not exist in String B.
  4. Trim the result string.
  5. Add the result string into the result array.

Automation Result:

Thanks a lot for your kind support as always!