regex - PHP preg_match returns only first match

  1. The first question is this:

I am using http://www.phpliveregex.com/ to check that my regex is right, and there it finds more than one matching line.

I am running this regex:

$lines = explode('\n', $text);
foreach($lines as $line) {
    $matches = [];
    preg_match("/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+\-[0-9]+T[0-9]+/uim", $line, $matches);

    print_r($matches);
}

on the $text which looks like this: http://pastebin.com/9UQ5wNRu

The problem is that only one match is printed:

Array
(
     [0] => 3Bajus StanislavS2415079249-2615T01
)

Why is it doing this? Any ideas what could fix the problem?

  2. The second question:

Maybe you've noticed the non-ASCII alphabetic characters of the Slovak language inside the text (from pastebin). How can I match those characters and select the users that have this format:

{number}{first_name}{space}{last_name}{id_number}

How can I do that?

OK, the first issue is fixed. Thank you @chris85. I should have used preg_match_all and run it on the whole text. Now I get an array of all students whose names contain only non-Slovak (English) letters.

Answer

Solution:

preg_match stops after the first match. You need to use preg_match_all for a global search.
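A minimal sketch of the difference, using a made-up two-line $text (the second student is hypothetical, not taken from the pastebin). It also shows a detail in the question's code that likely contributes to the symptom: '\n' in single quotes is a literal backslash plus "n", so explode() does not split the text into lines and the loop runs once over the whole text.

```php
<?php
// Hypothetical sample input; the real data lives in the linked pastebin.
$text = "3Bajus StanislavS2415079249-2615T01\n4Novak PeterS2415079250-2615T02";

$pattern = '/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+\-[0-9]+T[0-9]+/uim';

// '\n' in single quotes is NOT a newline, so nothing is split here:
// the result is a one-element array holding the whole text.
$unsplit = explode('\n', $text);

// preg_match stops at the first match and returns 1 (or 0 if none).
$firstCount = preg_match($pattern, $text, $first);

// preg_match_all keeps scanning; the full matches end up in $all[0].
$allCount = preg_match_all($pattern, $text, $all);

print_r($first);   // only the first student
print_r($all[0]);  // every matching student
```

If you do want to process the text line by line, explode("\n", $text) with double quotes splits on real newlines.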

[A-Z] does not include any characters outside that range. Since you are using the i modifier, that character class is actually [A-Za-z], which may or may not be what you want. You can use \p{L} in its place to match letters from any language.
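As a small illustration, assuming the source file and the input are UTF-8 encoded, here is how the two character classes behave on a name with Slovak diacritics. The name is a made-up example, not taken from the pastebin.

```php
<?php
// Hypothetical name containing Slovak diacritics.
$name = 'Ľubomír';

// ASCII-only class: 'Ľ' and 'í' are outside A-Z/a-z, so this fails.
$ascii = preg_match('/^[A-Za-z]+$/u', $name);

// \p{L} matches any Unicode letter; the u modifier makes PCRE treat
// the pattern and the subject as UTF-8 instead of raw bytes.
$unicode = preg_match('/^\p{L}+$/u', $name);

var_dump($ascii, $unicode);
```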

Demo: https://regex101.com/r/L5g3C9/1

So your PHP code should just be:

preg_match_all("/^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$/uim", $text, $matches);
print_r($matches);
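If you also want the individual parts of the {number}{first_name}{space}{last_name}{id_number} format, named groups are one option. This is only a sketch: the group names are my own, the second input line is invented, and it assumes the id always starts with an uppercase S followed by digits, as in the one sample line shown in the question.

```php
<?php
// Hypothetical input; the second line is invented for illustration.
$text = "3Bajus StanislavS2415079249-2615T01\n12Kováč ĽubomírS2415079251-2615T03";

// Named groups mirror the format described in the question.
$pattern = '/^(?<number>[0-9]+)(?<first_name>\p{L}+) (?<last_name>\p{L}+)'
         . '(?<id_number>S[0-9]+-[0-9]+T[0-9]+)$/um';

// PREG_SET_ORDER yields one array per matched line instead of one per group.
preg_match_all($pattern, $text, $rows, PREG_SET_ORDER);

$students = [];
foreach ($rows as $row) {
    $students[] = [
        'number'     => (int) $row['number'],
        'first_name' => $row['first_name'],
        'last_name'  => $row['last_name'],
        'id_number'  => $row['id_number'],
    ];
}

print_r($students);
```

The greedy \p{L}+ for the last name first swallows the trailing S as well, then backtracks one character so that the id group can match.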

Answer

Solution:

You can also use the T-Regx library:

pattern("^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$", 'uim')->match($text)->all();
