Import data from a current web session?

In 11.3, given a session started with StartExternalSession["WebDriver-Chrome"], you can do:

ExternalEvaluate[session, "JavascriptExecute" -> 
  "return document.documentElement.outerHTML;"]

In 12.0 and up, the syntax has changed a little:

session = StartWebSession[]; 

WebExecute[session, "OpenWebPage" -> 
  "https://weather.com/weather/tenday/l/New+York+NY+10017:4:US"]

html = WebExecute[ session, "JavascriptExecute" -> 
  "return document.documentElement.outerHTML;"]

Note that the Wolfram Language comes with a WeatherData function built-in, so you don't need to scrape data from a web page.

Also, the National Weather Service has a public API, which might give you a more structured way to get this sort of data.

You can import the HTML above with this:

ImportString[ html, "XMLObject" ]

This gives you a Wolfram Language expression that you can traverse with Part and Take, or search with functions like Cases.
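
As a small illustration of that kind of traversal, here is a minimal sketch that pulls every link target out of a parsed page with Cases. The sampleHtml string is a made-up stand-in for the html captured from the session, and spelling the format out as {"HTML", "XMLObject"} is my assumption about what suits HTML input, not something the session requires.

```wolfram
(* sampleHtml is a hypothetical stand-in for the html string captured above *)
sampleHtml = "<html><body><a href=\"https://example.com/a\">A</a><p>text</p><a href=\"https://example.com/b\">B</a></body></html>";

(* Parse the string as HTML into a symbolic XMLObject / XMLElement expression *)
xml = ImportString[sampleHtml, {"HTML", "XMLObject"}];

(* Collect the href attribute of every <a> element, at any depth *)
links = Cases[xml, XMLElement["a", {___, "href" -> url_, ___}, _] :> url, Infinity]
```

The same pattern should carry over to other tags and attributes by changing "a" and "href"; for a real page you would pass html instead of sampleHtml.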

If your actual interest is stocks, then you should probably be aware of the FinancialData function.


This works in Mathematica 11.3 but not in 12, where websession[] seems to have changed.

Module[
 {
  session = StartExternalSession["WebDriver-Chrome"],
  iws, chromedo, img, links
 }, 
 chromedo[cmd_] := ExternalEvaluate[session, cmd];
 Pause[1];
 iws = ExternalEvaluateWebDriver`Private`websession[];
 Pause[1];(*Time to load chrome*)
 chromedo[
  "OpenWebPage" -> 
  "https://www.barchart.com/stocks/quotes/SPY/options?moneyness=allRows"
   ];
 Pause[15];(*Time to load the page*)
 Echo@WebUnit`GetURL[iws];
 html = WebUnit`GetPageHtml[iws];
 DeleteObject[session];
 ]
TableForm[ImportString[html, {"HTML", "Data"}][[1, 2, 2, 2, 1, 2]]]



Here is a fairly stupid workaround to solve the above problem.

As suggested by rhermans, we can first obtain the html text of the webpage after it has finished loading:

session = StartExternalSession["WebDriver-Chrome"];
ffoxdo[cmd_] := ExternalEvaluate[session, cmd];
iws = ExternalEvaluateWebDriver`Private`websession[];
ffoxdo[ "OpenWebPage" -> "https://weather.com/weather/tenday/l/New+York+NY+10017:4:US"];
Pause[3];
html=WebUnit`GetPageHtml[iws];
DeleteObject[session];

Then, since Import cannot be used directly on a string of html text, we save it to disk and load from there:

Export["my.txt", html];
RenameFile["my.txt", "my.html"];
data = Import["my.html", "Data"];
DeleteFile["my.html"];

Now data indeed contains the output I was hoping for. But the workaround of writing to disk first is kind of unsatisfactory.
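
One way to avoid the round trip through disk might be ImportString, which takes the string directly and accepts the same {"HTML", "Data"} specification used earlier in this thread. The sketch below uses a made-up sampleTable string in place of the real html, so the names and contents are illustrative only.

```wolfram
(* sampleTable is a hypothetical stand-in for the html string obtained from the session *)
sampleTable = "<html><body><table><tr><td>Mon</td><td>21</td></tr><tr><td>Tue</td><td>19</td></tr></table></body></html>";

(* Import the string in memory as HTML and extract its tabular/list data; no temporary file is written *)
tableData = ImportString[sampleTable, {"HTML", "Data"}]
```

With the real page, data = ImportString[html, {"HTML", "Data"}] would presumably replace the Export, RenameFile, Import and DeleteFile steps, though I have not checked that it returns exactly the same structure for that page.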