How to read HTML attributes into a variable and use it in XPath

I am trying to read an HTML attribute of an element (in this case the “top” value from its style information). I can find the element via XPath. I want to extract the “top” value of this element and use it in an XPath search to find another element, the target element. Is that possible, and if so, how?

I need this because the target element changes its position in each of the documents I process, but it always has the reference element (“LIQUIDO DE TOTALES”) to its left.

Hi Tim, if you know that there are only two elements on the page with the same “top” property, then you can do the following:

  • use the 1st element’s XPath to get the value of the attribute “style”

  • use Substring between to select the top property value and save it to a string variable

  • use the value of the string variable in the XPath that selects the second element on the page with the same top value
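
The three steps above can be sketched outside the tool as well. The following is a minimal Python illustration only, not how the automation tool does it internally. The sample markup, the `top_of` and `find_target` helpers and the pixel values are all made up for the example, and real PDF-to-HTML output will look different. Python's standard-library ElementTree does not support the XPath function `contains()`, so the sketch builds the XPath string you would paste into the tool and does the equivalent filtering in plain Python.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup (an assumption for this sketch).
SAMPLE = """
<div>
  <span style="position:absolute; top:412px; left:40px">LIQUIDO DE TOTALES</span>
  <span style="position:absolute; top:412px; left:380px">1.234,56</span>
  <span style="position:absolute; top:440px; left:40px">OTRO CONCEPTO</span>
</div>
"""

def top_of(style):
    """Return the value of the 'top' property in a style string, or None."""
    # Anchor on start-of-string or ';' so 'margin-top' is not matched by mistake.
    match = re.search(r"(?:^|;)\s*top\s*:\s*([^;]+)", style or "")
    return match.group(1).strip() if match else None

def find_target(root, reference_text):
    # Step 1: locate the reference element and read its style attribute.
    reference = next(
        el for el in root.iter() if (el.text or "").strip() == reference_text
    )
    # Step 2: cut the 'top' value out of the style string (the "Substring between" step).
    top = top_of(reference.get("style"))
    # Step 3a: the XPath you would build from the variable in a full XPath engine.
    xpath = f"//*[contains(@style,'top:{top}')]"
    # Step 3b: the same selection done in Python, skipping the reference itself.
    matches = [
        el for el in root.iter()
        if el is not reference and top_of(el.get("style")) == top
    ]
    return top, xpath, matches

root = ET.fromstring(SAMPLE)
top, xpath, matches = find_target(root, "LIQUIDO DE TOTALES")
print(top)
print(xpath)
print([el.text for el in matches])
```

Note that the generated XPath only matches if the style attribute writes the property exactly as `top:412px`, with no space after the colon. If your documents format it differently, adjust the string you build accordingly.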

Hi Alesia,
yes!! The attribute extraction works like that. Thanks!
BUT now I have a different problem. The reference element I thought I could find easily doesn’t give me a unique XPath as I expected, and I don’t know why. The text is unique in the PDF document, but a red exclamation mark is shown in the XPath options:


Any idea why?

Hi Tim, that is because the explorer is not sure whether it can recognize this element as unique. It’s just a warning. I see you are using XPath on a PDF file, so maybe my post in Tips and Tricks is helpful to you: Working with PDF using Xpath as substitute for OCR

In this post I recommend a different approach to identifying the Xpath when reading PDF files.

Hi Wilbert
it was precisely your first post from last week that made me change the process from OCR to XPath! Many thanks for that! Also for the tips&tricks article. I am working on the change now and expect a reduction of processing time from about 12 hours to 4 hours!!
Maybe I will come back to you in a private message with some more detailed questions, if you allow me.

Hi Tim,

Great to hear your enthusiasm! If there are questions, feel free to contact me. If they are generic questions, however, it would be more useful to ask them on the forum so others can comment and learn as well.

Tim, try using the XPath //*[contains(text(),'LIQUIDO DE TOTALES')]

Hi @ashapkina
Thanks for the suggestion. I tried that one and get an error:

The deactivated step 4 in my flow works, though:

I understand that it is a less precise XPath, but in the approximately 1,200 documents I processed this way I got the correct result.

Weird, it does work in my pdf docs.
Anyway, glad you found a solution.

