php - How to get href and text content from HTML content


Solution:

Currently, you're only getting the text content of each <td>, and textContent doesn't include the href of the anchor tag inside it. To get the link as well, you'll need to dig deeper into the <td>.

Here's one way to do it using XPath:

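$xpath = new DOMXPath($DOM);
$base = 'https://www.iana.org/';
foreach ($Detail as $sNodeDetail)
{
    // Cell text, as before
    $aDataTableDetailHTML[$j][] = trim($sNodeDetail->textContent);
    // string(...) yields '' when the cell has no matching link, so the if skips it
    if ($link = $xpath->evaluate("string(./span[contains(@class, 'domain')]/a/@href)", $sNodeDetail)) {
        // ltrim avoids a double slash if the href starts with '/'
        $aDataTableDetailHTML[$j][] = $base . ltrim($link, '/');
    }
    $i = $i + 1;
    // Move to the next result row once a full row of cells has been collected
    $j = $i % count($aDataTableHeaderHTML) == 0 ? $j + 1 : $j;
}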

Basically, the query is evaluated relative to the current <td> in the iteration: if that cell contains <span class="domain tld"><a href="xxxx">xxx</a></span>, it returns the href value; otherwise it returns an empty string and nothing is added.
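To see how the relative query behaves on its own, here is a minimal, self-contained sketch. The sample markup, the manager name, and the href value are made up for illustration and only mimic the structure described above; the real page may differ.

```php
<?php
// Hypothetical sample markup, modeled on the structure described above.
$html = <<<HTML
<table id="tld-table">
  <tbody>
    <tr>
      <td><span class="domain tld"><a href="/domains/root/db/aaa.html">.aaa</a></span></td>
      <td>generic</td>
      <td>Example Manager</td>
    </tr>
  </tbody>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$results = [];
foreach ($xpath->query('//td') as $td) {
    // The leading "./" makes the path relative to the current <td>.
    // string(...) returns '' when nothing matches, so cells without a link give ''.
    $href = $xpath->evaluate("string(./span[contains(@class, 'domain')]/a/@href)", $td);
    $results[] = ['text' => trim($td->textContent), 'href' => $href];
}

print_r($results);
```

Only the first cell should come back with a non-empty href; the other two cells have no matching span, so their href is an empty string.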

Another way is to iterate over each <tr> instead of each <td>:

$aDataTableDetailHTML = [];
$DOM = new DOMDocument();
$DOM->loadHTML($htmlContent);
$xpath = new DOMXPath($DOM);
$base = 'https://www.iana.org/';
// One result row per <tr>: domain, type, TLD manager, absolute link
foreach ($xpath->query('//table[@id="tld-table"]/tbody/tr') as $row) {
    $domain = trim($xpath->evaluate("string(./td[1])", $row));
    $type = trim($xpath->evaluate("string(./td[2])", $row));
    $tld_manager = trim($xpath->evaluate("string(./td[3])", $row));
    $url = $xpath->evaluate("string(./td[1]/span/a/@href)", $row);
    // ltrim avoids a double slash if the href starts with '/'
    $aDataTableDetailHTML[] = [$domain, $type, $tld_manager, $base . ltrim($url, '/')];
}
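
Both snippets build the absolute link by joining a base URL and the href. If some rows might have no link, or an href that is already absolute, a small helper can keep that step tidy. This is only a sketch: the function name toAbsoluteUrl is hypothetical, the sample paths are made up, it assumes PHP 7.1+ for the nullable return type, and it does not attempt full relative-URL resolution.

```php
<?php
// Hypothetical helper: joins a base URL and an href taken from the page.
function toAbsoluteUrl(string $base, string $href): ?string
{
    if ($href === '') {
        return null; // the cell had no link
    }
    if (preg_match('#^https?://#i', $href)) {
        return $href; // already absolute, leave untouched
    }
    // Ensure exactly one slash between the base and the path
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

// Sample inputs (made up for illustration)
$examples = [
    toAbsoluteUrl('https://www.iana.org/', '/domains/root/db/aaa.html'),
    toAbsoluteUrl('https://www.iana.org/', ''),
    toAbsoluteUrl('https://www.iana.org', 'https://example.com/x'),
];
var_dump($examples);
```

In the loops above, the result could then be stored only when it is not null.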
