PHP XPath: parsing HTML

one text

Solution:

var_dump won't parse the DOM :)

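You just need to query again for the elements inside each tile, passing the tile as the context node, then assign the values to the array.

Define a working item array up front if the structure matters (every key is then present even when an element is missing); otherwise just build up the result as you go.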

<?php
$str = '<article class="tile">
    <div class="tile-content">
        <a href="link-1">ignore</a>
        <div class="tile-content__text tile-content__text--arrow-white">
            <label class="label-date label-date--blue">02.12.2021</label>         
            <h4><a class="link-color-black" href="link-2">title-2</a></h4>
            <p class="tile-content__paragraph tile-content__paragraph--gray pd-ver-10">
             content-2
            </p>
        </div>
        <a href="link-2" class="btn btn-link btn-link__more btn-link--arrow-right float-right">more</a>
    </div>
</article>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($str);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$result = [];
foreach ($xpath->query("//article[contains(@class, 'tile')]") as $tile) {
    // define item structure
    $item = [
        'title' => '',
        'link' => '',
        'date' => '',
        'content' => ''
    ];
    
    // find date (the leading "." keeps the search inside this tile;
    // a bare "//" would search the whole document, even with $tile as context)
    $query = $xpath->query(".//label[contains(@class, 'label-date')][1]", $tile);
    if (count($query)) {
        $item['date'] = $query[0]->nodeValue;
    }

    // find link/title
    $query = $xpath->query(".//h4/a[1]", $tile);
    if (count($query)) {
        $item['link'] = $query[0]->getAttribute('href');
        $item['title'] = $query[0]->nodeValue;
    }
    
    // find content
    $query = $xpath->query(".//p[contains(@class, 'tile-content__paragraph')][1]", $tile);
    if (count($query)) {
        $item['content'] = $query[0]->nodeValue;
    }
    
    // assign
    $result[] = $item;
    
    // cleanup
    unset($item, $query);
}

print_r($result);

Output:

Array
(
    [0] => Array
        (
            [title] => title-2
            [link] => link-2
            [date] => 02.12.2021
            [content] => 
             content-2
            
        )

)
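
One thing to watch once the page has more than one tile: in XPath, an expression that starts with "//" searches from the document root even when you pass a context node, so queries inside the loop should start with ".//" to stay within the current tile. Here is a minimal sketch of that, assuming a made-up two-tile snippet; the markup and the firstText() helper are hypothetical and only there for illustration. It also trims the text, which gets rid of the stray whitespace around the content.

```php
<?php
// Hypothetical two-tile markup, invented for this sketch.
$html = '<article class="tile">
    <label class="label-date">01.12.2021</label>
    <h4><a href="link-1">title-1</a></h4>
    <p class="tile-content__paragraph">
        content-1
    </p>
</article>
<article class="tile">
    <label class="label-date">02.12.2021</label>
    <h4><a href="link-2">title-2</a></h4>
    <p class="tile-content__paragraph">
        content-2
    </p>
</article>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hypothetical helper: trimmed text of the first match below $context, or ''.
function firstText(DOMXPath $xpath, string $expr, DOMNode $context): string
{
    $nodes = $xpath->query($expr, $context);
    return ($nodes !== false && $nodes->length > 0)
        ? trim($nodes->item(0)->nodeValue)
        : '';
}

$result = [];
foreach ($xpath->query("//article[contains(@class, 'tile')]") as $tile) {
    $result[] = [
        // ".//" searches below $tile only; "//" would search the whole document
        'title'   => firstText($xpath, ".//h4/a", $tile),
        'date'    => firstText($xpath, ".//label[contains(@class, 'label-date')]", $tile),
        'content' => firstText($xpath, ".//p[contains(@class, 'tile-content__paragraph')]", $tile),
    ];
}

print_r($result);
```

With the two sample tiles, each entry should carry its own title, date, and content. If you swap ".//" for "//" in the sketch, every entry would pick up the first tile's values instead.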
