regex - PHP preg_match returns only first match
- The first question is this:
I am using http://www.phpliveregex.com/ to check that my regex is right, and it finds more than one matching line.
I am running this code:
$lines = explode('\n', $text);
foreach($lines as $line) {
$matches = [];
preg_match("/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+\-[0-9]+T[0-9]+/uim", $line, $matches);
print_r($matches);
}
on the $text, which looks like this: http://pastebin.com/9UQ5wNRu
The problem is that the printed output contains only one match:
Array
(
[0] => 3Bajus StanislavS2415079249-2615T01
)
Why is it doing this? Any ideas what could fix the problem?
- The second question
Maybe you've noticed the non-ASCII alphabetic characters of the Slovak language in the text (from the pastebin). How can I match those characters and select the users that have this format:
{number}{first_name}{space}{last_name}{id_number}
How do I do that?
OK, the first issue is fixed. Thank you @chris85. I should have used preg_match_all and run it on the whole text. Now I get an array of all students whose names contain only non-Slovak (English) letters.
Answer
Solution:
preg_match returns only the first match. You need to use preg_match_all for a global search.
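To make the difference concrete, here is a minimal sketch run on two made-up lines in the same shape as the question's data (the second line is hypothetical, not taken from the pastebin). It also shows a likely contributing factor in the original loop: in PHP, '\n' in single quotes is a literal backslash followed by n, not a newline, so explode('\n', $text) probably returned the whole text as a single element.

```php
<?php
// Hypothetical sample data; only the first line appears in the question.
$text = "3Bajus StanislavS2415079249-2615T01\n4Novak PeterS2415079250-2616T02";
$pattern = '/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+-[0-9]+T[0-9]+/u';

// preg_match stops at the first match and returns 1 (found) or 0 (not found).
$found = preg_match($pattern, $text, $first);

// preg_match_all keeps scanning and returns the number of full matches.
$count = preg_match_all($pattern, $text, $all);

// Single quotes: '\n' is two characters (backslash, n), so nothing is split.
$badSplit = explode('\n', $text);
// Double quotes: "\n" is a real newline, so the text splits into lines.
$goodSplit = explode("\n", $text);

print_r($first);   // only the first match
print_r($all[0]);  // every full match
printf("single-quoted split: %d, double-quoted split: %d\n", count($badSplit), count($goodSplit));
```

With the whole text treated as one "line", preg_match would report just the first student, which is consistent with the output shown in the question.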
[A-Z] does not include any characters outside that range. Since you are using the i modifier, that character class is actually [A-Za-z], which may or may not be what you want. You can use \p{L} in place of it to match letters from any language.
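As a rough illustration of the point about character classes, the sketch below compares an ASCII-only class with \p{L} on two hypothetical names (neither is taken from the pastebin). It assumes the source file and the strings are UTF-8 encoded, which the u modifier requires.

```php
<?php
// ASCII-only letters versus "any Unicode letter".
$asciiPattern   = '/^[A-Z][a-z]+ [A-Z][a-z]+$/u';
$unicodePattern = '/^\p{L}+ \p{L}+$/u';

// Hypothetical names; the second one contains Slovak letters outside A-Z.
$names = ['Stanislav Bajus', 'Ľubomír Šťastný'];

$results = [];
foreach ($names as $name) {
    $results[$name] = [
        'ascii'   => preg_match($asciiPattern, $name),
        'unicode' => preg_match($unicodePattern, $name),
    ];
    printf("%s: ascii=%d unicode=%d\n", $name, $results[$name]['ascii'], $results[$name]['unicode']);
}
```

The ASCII pattern accepts only the first name pair, while the \p{L} pattern accepts both, which matches the behaviour described in the question.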
Demo: https://regex101.com/r/L5g3C9/1
So your PHP code should just be:
preg_match_all("/^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$/uim", $text, $matches);
print_r($matches);
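If you also want the individual fields from the {number}{first_name}{space}{last_name}{id_number} format, one possible extension is to add named capture groups and ask for one row per match with PREG_SET_ORDER. This is only a sketch: the group names are made up here, the second sample line is hypothetical, and it assumes the id_number is the whole S...-...T... part.

```php
<?php
// Hypothetical sample data; the second line is invented to include Slovak letters.
$text = "3Bajus StanislavS2415079249-2615T01\n7Šťastný ĽubomírS2415079251-2617T03";

// Named groups follow the field order given in the question (names are assumptions).
$pattern = '/^(?<number>[0-9]+)(?<first_name>\p{L}+) (?<last_name>\p{L}+)(?<id_number>S[0-9]+-[0-9]+T[0-9]+)$/um';

// PREG_SET_ORDER returns one array per matched line instead of one array per group.
$count = preg_match_all($pattern, $text, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    printf(
        "%s | %s | %s | %s\n",
        $row['number'],
        $row['first_name'],
        $row['last_name'],
        $row['id_number']
    );
}
```

The greedy \p{L}+ for the second name first swallows the trailing S and then backtracks by one letter so that the S[0-9]+ part can match.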
Answer
Solution:
You can also use the T-Regx library:
pattern("^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$", 'uim')->match($text)->all();