Scrape a table on a web page

It might sound basic, but is there a good pattern for this? I’m trying to scrape data from a table that has between 1 and about 20 rows and might be spread across several pages (with links to pages 1…x and a next-page link).

It would be nice to simply select the table using XPath, but I can’t figure out how to make that work.

I’m fairly new to RPA, but willing to put in time to understand any suggestions.

Hello Chris,

You can find information on XPath here: https://kb.workfusion.com/display/RPAe/XPath+Guide

A solution for your task could look like this:

  1. Get all the necessary rows from the current page.
  2. Iterate through these rows and extract the information you need.
  3. Click the next-page button.
  4. Repeat steps 1-3 until there is no next page.
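
As a rough illustration of the loop above, here is a minimal Python sketch. It is not WorkFusion code: the `PAGES` dictionary, the `results` table id, the `next` link class and the `scrape_all` helper are all made up for the example, and the "pages" are in-memory strings so it runs without a browser. In a real bot, the fetch and the click would be your RPA actions.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a paginated site: page name -> well-formed markup.
PAGES = {
    "page1": """<html><body>
        <table id="results">
          <tr><th>Name</th><th>Qty</th></tr>
          <tr><td>Apple</td><td>3</td></tr>
          <tr><td>Pear</td><td>5</td></tr>
        </table>
        <a class="next" href="page2">next</a>
    </body></html>""",
    "page2": """<html><body>
        <table id="results">
          <tr><th>Name</th><th>Qty</th></tr>
          <tr><td>Plum</td><td>7</td></tr>
        </table>
    </body></html>""",
}


def scrape_all(pages, start):
    """Collect the data rows of the 'results' table from every page."""
    rows = []
    current = start
    while current is not None:
        root = ET.fromstring(pages[current])
        # Steps 1-2: select every row of the table, then read its cells.
        for tr in root.findall(".//table[@id='results']/tr"):
            cells = [(td.text or "").strip() for td in tr.findall("td")]
            if cells:  # the header row has only <th> cells, so it is skipped
                rows.append(cells)
        # Step 3: follow the next-page link, or stop when there is none.
        next_link = root.find(".//a[@class='next']")
        current = next_link.get("href") if next_link is not None else None
    return rows


ALL_ROWS = scrape_all(PAGES, "page1")
print(ALL_ROWS)
```

The main design point is that the loop ends when the next-page link is missing, so you never need to know the number of pages in advance.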

To add to this - sometimes a good way to find how many “things” there are on a page is to:

  • work out the pattern between the XPaths (i.e. link1 = xpath…11, link2 = xpath…15, link3 = xpath…19, etc.)

  • then “click” on each XPath address in turn, incrementing a counter after every successful click

  • when one of the clicks fails (the XPath doesn’t exist), capture this with a try-catch and stop

You now know how many items you are dealing with (the value in counter) and can load up a list to work your way through.
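
Here is a small Python sketch of that counting trick, under some assumptions: the `PAGE` markup, the `click` helper and the step of 2 between positions are invented for the example (your own pattern might step by 4, as in the xpath…11, xpath…15 case), and `click` only looks the element up instead of driving a browser. It raises an error when the XPath matches nothing, which is what the try/except catches.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: links sit in every second <div> (positions 1, 3, 5).
PAGE = ET.fromstring("""<html><body>
  <div><a href="/item/a">A</a></div>
  <div>spacer</div>
  <div><a href="/item/b">B</a></div>
  <div>spacer</div>
  <div><a href="/item/c">C</a></div>
</body></html>""")


def click(xpath):
    """Stand-in for a robot click: fails if the XPath matches nothing."""
    element = PAGE.find(xpath)
    if element is None:
        raise LookupError(f"no element at {xpath}")
    return element.get("href")


def count_items(first=1, step=2, limit=1000):
    """Probe XPaths that follow a fixed pattern until one of them fails."""
    counter = 0
    found = []
    index = first
    while counter < limit:  # safety cap so a bad pattern cannot loop forever
        try:
            found.append(click(f"./body/div[{index}]/a"))
        except LookupError:
            break  # first missing XPath: we have seen every item
        counter += 1
        index += step
    return counter, found


COUNT, LINKS = count_items()
print(COUNT, LINKS)
```

After the loop, `COUNT` holds the number of items and `LINKS` is the list you can then work your way through.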


@chris_hogben perhaps this post can be helpful in building the XPath.