Extracting a Hyperlink with WebExecute

As stated in the comment, I think the question is underspecified, so it is hard to give a fully satisfactory answer. However, I did not know about WebExecute before and got curious about it. After experimenting a bit and reading the documentation, my conclusion is that there is no direct way to get the HTML of an element, or to manipulate the document, through the WebElementObject expressions returned by WebExecute.

However, there is a workaround in the form of the JavascriptExecute command. The idea is to use a JavaScript snippet to extract the sought-after element(s) and then store their entire HTML as the text of a new element with a unique ID. Back in Mathematica, we locate this new element and use ElementText to read out the HTML that was collected in JavaScript. Since no HTML was provided, I demonstrate on an arbitrary page and assume that the OP can narrow the selection to the required element with JavaScript.

The example below extracts an img tag from https://imgur.com/gallery/uGNoByn.

First, here is the original HTML found when inspecting the main image on that page:

<div class="image post-image"><img alt="" src="//i.imgur.com/qE3kW4Z.jpg" style="max-width: 100%; min-height: 574px;" original-title=""></div>

For the sake of this example, we use the image class to find the appropriate elements and extract their HTML:

      var els = document.getElementsByClassName('image');
      var sls = Array(els.length);
      for (var j = 0; j < els.length; ++j) {
          sls[j] = els[j].outerHTML;
      }

We can then put this HTML into a new element in the document, which we will later locate from Mathematica:

      var el = document.createElement('div');
      el.id = 'myUniqueID';
      el.textContent = JSON.stringify(sls);
      document.children[0].appendChild(el);
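
One caveat with the snippet above: every run creates another div with the same ID, so executing it twice leaves duplicates in the page. Below is a minimal sketch of a variant that reuses the element when it already exists. The function name publishPayload and its doc parameter are hypothetical; they are there only so the logic can be tried outside a browser.

```javascript
// Hypothetical helper (not part of the original answer): store `payload` as
// the text of a single element with the given id, creating that element only
// if it is not already in the page. `doc` is assumed to behave like the
// browser's `document` object.
function publishPayload(doc, id, payload) {
    var el = doc.getElementById(id);
    if (!el) {
        el = doc.createElement('div');
        el.id = id;
        // documentElement is the same node as document.children[0] above
        doc.documentElement.appendChild(el);
    }
    // Overwrite any previous payload with the new one
    el.textContent = JSON.stringify(payload);
    return el;
}
```

In the page this would be called as publishPayload(document, 'myUniqueID', sls), in place of the four lines that create and append the element.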

We start a new session and load the required page:

In[42]:= session = StartWebSession[];
In[43]:= WebExecute[session,"OpenPage" -> "https://imgur.com/gallery/uGNoByn"];

Now we execute the code from above on the page:

In[44]:= js = "(function() {
      var els = document.getElementsByClassName('image');
      var sls = Array(els.length);
      for (var j = 0; j < els.length; ++j) {
          sls[j] = els[j].outerHTML;
      }
      var el = document.createElement('div');
      el.id = 'myUniqueID';
      el.textContent = JSON.stringify(sls);
      document.children[0].appendChild(el);
      })()";
In[45]:= WebExecute[session, "JavascriptExecute" -> js];

Now, back in Mathematica, we locate the newly created element and get the HTML out of it:

In[46]:= els = 
  WebExecute[session, 
   "LocateElements" -> "CSSSelector" -> "#myUniqueID"];
In[47]:= WebExecute[session, "ElementText" -> els]
Out[47]= {"[\"<div class=\\\"image post-image\\\"><img alt=\\\"\\\" \
src=\\\"//i.imgur.com/qE3kW4Z.jpg\\\" style=\\\"max-width: 100%; \
min-height: 574px;\\\" original-title=\\\"\\\"></div>\"]"}
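
The stacked backslashes in Out[47] come from two layers of escaping: JSON.stringify escapes the quotes inside each HTML string, and Mathematica escapes the result once more when it prints the string. The JavaScript half can be checked in isolation with the sketch below, which uses a shortened stand-in string rather than the real page content.

```javascript
// Shortened stand-in for an element's outerHTML (an assumption for the demo,
// not the full markup from the page).
var sample = ['<img src="//i.imgur.com/qE3kW4Z.jpg">'];

// JSON.stringify wraps each string in double quotes and escapes inner quotes.
var encoded = JSON.stringify(sample);
console.log(encoded); // prints ["<img src=\"//i.imgur.com/qE3kW4Z.jpg\">"]

// JSON.parse reverses the encoding and gives back the original strings.
var decoded = JSON.parse(encoded);
console.log(decoded[0] === sample[0]); // prints true
```

The same decoding step is needed on the Mathematica side before the HTML can be used, for instance by importing the string as JSON.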

Now that the HTML is in Mathematica, it is a matter of munging the required part out of it. Given the vague nature of the question, this is left as an exercise for the reader.
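
One way to shrink that munging step is to narrow the selection in JavaScript before serializing, so that only the link itself reaches Mathematica. This is a minimal sketch, assuming the wanted value is an attribute of an element nested inside each matched container (here, the src of an img). The helper name collectAttribute is hypothetical, and it takes the element list as an argument so it can be tried outside a browser.

```javascript
// Hypothetical helper (not part of the original answer): for each container,
// find the first descendant matching `selector` and read the attribute `name`
// from it. Containers with no match, or without the attribute, are skipped.
function collectAttribute(containers, selector, name) {
    var values = [];
    for (var j = 0; j < containers.length; ++j) {
        var target = containers[j].querySelector(selector);
        var value = target ? target.getAttribute(name) : null;
        if (value !== null) {
            values.push(value);
        }
    }
    return values;
}
```

In the page this would replace the outerHTML loop, e.g. sls = collectAttribute(document.getElementsByClassName('image'), 'img', 'src'). For the HTML snippet shown earlier, the stored payload should then contain only the protocol-relative URL //i.imgur.com/qE3kW4Z.jpg.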


Here is a way with no JavaScript. Using the getAttribute function from this answer by @swish (it calls the internal WebUnit`Private`attribute, so it may change between versions)

getAttribute[element_WebElementObject, attribute_String] := 
 getAttribute[$CurrentWebSession, element, attribute]

getAttribute[session_WebSessionObject, element_WebElementObject, attribute_String] := 
 With[{sessionInfo = session /@ {"SessionID", "Browser", "URL"}}, 
  WebUnit`Private`attribute[sessionInfo, element["ElementId"], attribute]]

you can do:

StartWebSession["Chrome"];
WebExecute["OpenPage" -> "www.wolfram.com"];
links = WebExecute["LocateElements" -> "CSSSelector" -> "#_footer-bc > ul > li > a"];
getAttribute[#, "href"] & /@ links

to get:

{http://www.wolfram.com/legal/?source=footer,
 http://www.wolfram.com/legal/privacy/wolfram/?source=footer,
 http://www.wolfram.com/site-map/?source=footer,
 http://www.wolframalpha.com/?source=footer,
 https://www.wolframcloud.com/?source=footer}