Workfusion studio: connection reset error while using <html-to-xml>

Hi All,

I am getting a Connection reset error while trying to convert an HTML file to XML using <html-to-xml>. Here is the sample code. Could you please help?

<?xml version="1.0" encoding="UTF-8"?>
<config xmlns="http://web-harvest.sourceforge.net/schema/1.0/config" scriptlang="groovy">

	<var-def name="url">
		https://kb.workfusion.com
	</var-def>
	
	<var-def name="xml_from_html">
		<html-to-xml> 
			<http url="${url}"></http>
		</html-to-xml>
	</var-def>
	
	<file action="write" path="D:\\Temp\\xml_from_html.xml" type="binary">
		<var name="xml_from_html"/>
	</file>

	<export include-original-data="true">
	</export>
</config>

Hi @prudviraj_b.

Could you please clarify: are you using the standalone WorkFusion Studio or RPA Express?

Hi Lera,

I am using WorkFusion Studio. Here is the snapshot

Thank you. I would recommend checking your Studio version in the menu “Help” -> “About WorkFusion Studio”. Please send a screenshot.
Also, please let me know whether this is your own script or part of an Automation Academy assignment.

Hi Lera,

The WorkFusion Studio version is 2.3.0. Here is the screenshot. I am trying this with my own script.


As far as I can see, you’re using RPA Express, not the standalone Studio. Please provide a screenshot of Window > Preferences > WorkFusion Studio > Server Profiles and the files from the folder C:\RPAExpress\RPA\logs.

Thank you in advance.

Here is the server profiles snapshot and log files.

rpa-node0-2019-06-05.0.log (5.8 KB)
rpa-hub-2019-06-05.0.log (4.9 KB)

Thank you. Could you please show me the rest of this window? There should be some more settings at the bottom.

Hi Lera, Please see attached.

Hi @prudviraj_b,

Could you please change the server bot port to 15444 and run the bot again?

Replace the bot URL with this localhost URL, which works fine: http://localhost:15444/wd/hub

Thank you for posting, and please reach out if you have any further concerns :slight_smile:
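
A quick way to confirm that something is actually listening at that URL before re-running the bot is to request it from plain Java. This is only a minimal sketch with assumptions: `HubCheck` and its methods are hypothetical names, the 3-second timeouts are arbitrary, and the `/status` path is the endpoint a Selenium-style hub usually exposes under `/wd/hub`. Whether the RPA Express hub answers it the same way is not confirmed in this thread.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: checks whether a bot hub URL responds at all.
class HubCheck {

    // Builds the status URL from the hub URL, tolerating a trailing slash.
    // Assumption: the hub exposes a Selenium-style "/status" endpoint.
    public static String statusUrl(String hubUrl) {
        String base = hubUrl.endsWith("/")
                ? hubUrl.substring(0, hubUrl.length() - 1)
                : hubUrl;
        return base + "/status";
    }

    // Sends a GET request and returns the HTTP status code.
    public static int responseCode(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        conn.setRequestMethod("GET");
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Defaults to the URL suggested in this thread.
        String hub = args.length > 0 ? args[0] : "http://localhost:15444/wd/hub";
        String url = statusUrl(hub);
        try {
            System.out.println(url + " -> HTTP " + responseCode(url));
        } catch (IOException e) {
            System.out.println(url + " is not reachable: " + e);
        }
    }
}
```

If the request fails with a connection error, the port in the server profile probably does not match the port the hub is running on.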


Thank you @aravindhan_mr! You’re faster than me :slight_smile: I was just about to suggest the same.


Hi @aravindhan_mr and Lera,
I changed it as suggested but am still getting the same error. Here are the snapshots.

Thank you. Could you please copy the full log from the console? By the way, I would recommend restarting Studio after changing the server profile.

Here is the log from the console. I also restarted Studio after changing the server profile.

19:59:04 [INFO] VarDefProcessorValidated starts processing…
19:59:04 [INFO] ConstantProcessor starts processing…
19:59:04 [INFO] ConstantProcessor processor executed in 0ms.
19:59:04 [INFO] VarDefProcessorValidated processor executed in 7ms.
19:59:04 [INFO] VarDefProcessorValidated starts processing…
19:59:04 [INFO] HtmlToXmlProcessor starts processing…
19:59:04 [INFO] HttpProcessor starts processing…
19:59:06 [ERROR] IO error during HTTP execution for URL: https://kb.workfusion.com
org.webharvest.exception.HttpException: IO error during HTTP execution for URL: https://kb.workfusion.com
at org.webharvest.runtime.web.HttpClientManager.execute(HttpClientManager.java:224)
at org.webharvest.runtime.processors.HttpProcessor.execute(HttpProcessor.java:106)
at org.webharvest.runtime.processors.BaseProcessor.run(BaseProcessor.java:127)
at org.webharvest.runtime.processors.BodyProcessor.execute(BodyProcessor.java:27)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(BaseProcessor.java:181)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(BaseProcessor.java:189)
at org.webharvest.runtime.processors.BaseProcessor.getBodyTextContent(BaseProcessor.java:193)
at org.webharvest.runtime.processors.HtmlToXmlProcessor.execute(HtmlToXmlProcessor.java:65)
at org.webharvest.runtime.processors.BaseProcessor.run(BaseProcessor.java:127)
at org.webharvest.runtime.processors.BodyProcessor.execute(BodyProcessor.java:27)
at org.webharvest.runtime.processors.VarDefProcessor.execute(VarDefProcessor.java:59)
at com.freedomoss.crowdcontrol.webharvest.processors.VarDefProcessorValidated.execute(VarDefProcessorValidated.java:28)
at org.webharvest.runtime.processors.BaseProcessor.run(BaseProcessor.java:127)
at org.webharvest.runtime.Scraper.execute(Scraper.java:169)
at org.webharvest.runtime.Scraper.execute(Scraper.java:182)
at com.freedomoss.crowdcontrol.webharvest.executor.LocalWebharvestTaskExecutor.executeWebHarvestTask(LocalWebharvestTaskExecutor.java:173)
at com.workfusion.studio.launch.SingleThreadWebHarvestProcess.processTaskInputs(SingleThreadWebHarvestProcess.java:77)
at com.workfusion.studio.launch.SingleThreadWebHarvestProcess.start(SingleThreadWebHarvestProcess.java:46)
at com.workfusion.studio.launch.WebHarvestMainLauncher.launch(WebHarvestMainLauncher.java:108)
at com.workfusion.studio.launch.WebHarvestMainLauncher.main(WebHarvestMainLauncher.java:180)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:747)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at org.apache.commons.httpclient.HttpConnection.flushRequestOutputStream(HttpConnection.java:828)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2116)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.webharvest.runtime.web.HttpClientManager.execute(HttpClientManager.java:215)
… 19 more
19:59:06 [INFO] -------------------------------------------
19:59:06 [INFO] EXECUTION FAILED
19:59:06 [INFO] Connection reset (HTML_to_XML.xml:10)
19:59:06 [INFO] -------------------------------------------

I cannot reproduce your issue; everything looks fine with your script. Could you please check whether you are able to open this page manually in a browser? Did you try some other web page?

I tried http://www.nba.com and it worked. But I cannot understand why it does not work with https://kb.workfusion.com, which I am able to open manually in a browser.


Glad to know that it works for other web pages. Perhaps this is caused by some security restrictions on your side.


It is only HTTPS sites that I am unable to access from Studio; there is no problem opening them manually in a browser. If possible, could you please suggest what security restrictions might be preventing it?
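
For anyone hitting the same trace: the `Caused by` section of the log above shows the reset arriving inside `SSLSocketImpl.performInitialHandshake`, that is, during the TLS handshake, before any HTTP request is sent. One way to narrow this down is to try the handshake from plain Java, outside Studio, one protocol version at a time. The sketch below is only an illustration: the class name `TlsProbe` and its helper methods are hypothetical, the 5-second timeout is arbitrary, and it assumes it is run with the same Java runtime that Studio uses. It does not confirm the cause; a protocol mismatch, a proxy, or a firewall are all possibilities.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical diagnostic: tries a TLS handshake per protocol version.
class TlsProbe {

    // Protocols this JVM enables by default for outgoing TLS connections.
    public static String[] defaultProtocols() throws Exception {
        return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
    }

    // Attempts a handshake restricted to one protocol; true on success.
    public static boolean handshakeWith(String host, int port, String protocol) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
            // Restrict the socket before connecting, so an unsupported
            // protocol name fails fast without touching the network.
            socket.setEnabledProtocols(new String[] { protocol });
            socket.connect(new InetSocketAddress(host, port), 5000);
            socket.setSoTimeout(5000);
            socket.startHandshake();
            return true;
        } catch (Exception e) {
            System.out.println("  " + protocol + " -> " + e);
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("JVM default protocols: "
                + Arrays.toString(defaultProtocols()));
        if (args.length == 0) {
            return; // no host given: only report the JVM defaults
        }
        for (String protocol : new String[] { "TLSv1", "TLSv1.1", "TLSv1.2" }) {
            boolean ok = handshakeWith(args[0], 443, protocol);
            System.out.println(protocol + ": "
                    + (ok ? "handshake OK" : "handshake failed"));
        }
    }
}
```

Run it as `java TlsProbe kb.workfusion.com`. If every protocol fails with the same `Connection reset`, something between the machine and the site (for example a proxy or firewall) is the more likely suspect. If only some versions fail, compare them with the JVM defaults printed on the first line.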

My pleasure, @Lera. I will try my best to help solve this when I am free from work. :slight_smile:


Hi @prudviraj_b,

There shouldn’t be a problem extracting HTTPS sites unless some internal security measures have been put in place on your system, or those websites have scripts that block requests from unknown ports.

Could you please make sure that the bot URL in “Running Configurations” is the same as the one given in the replies above before running the script?

Please let me know if this helps :slight_smile:
