php - Unable to json decode string from json file

I have a JSON array of objects in a .json file. I can read the content of the file using file_get_contents: $str = file_get_contents($jsonFile); However, when I call json_decode on the content, I just get null as the result. Below is some of the content from the .json file:

[
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11749,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de sant?� McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En pr???�sentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Context 04072022 12h41",
    "shortname.fr": "Formation Context 04072022 12h41",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Context 04072022 12h41",
    "title.fr": "Formation Context 04072022 12h41",
    "visible": false
},
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11748,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de sant?� McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En pr???�sentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Contexte 040722 08h51m",
    "shortname.fr": "Formation Contexte 040722 08h51m",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Contexte 040722 08h51m",
    "title.fr": "Formation Contexte 040722 08h51m",
    "visible": true
},
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11747,
    "duree": "",
    "dureeminutes": 0,
    "establishmentaltname": "06-ciusss-cusm",
    "establishmentfullname": "Centre universitaire de sant?� McGill",
    "fcpresponsable": "",
    "idnumber": "",
    "idnumberalt": "",
    "imgurl": null,
    "ispartageable": false,
    "keywords": null,
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En pr???�sentiel",
    "nombreinscriptions": 1,
    "parentestablishmentfullname": "Territoire CUSM",
    "parentestablishmentshortname": "CUSM-FCP",
    "partageable": "Locale",
    "shortname.en": "Formation Contexte 04072022",
    "shortname.fr": "Formation Contexte 04072022",
    "summary.en": "",
    "summary.fr": "",
    "title.en": "Formation Contexte 04072022",
    "title.fr": "Formation Contexte 04072022",
    "visible": false
}]

How can I convert it into a valid JSON string for PHP, or into an array? The JSON is valid, but after I use file_get_contents it inserts line breaks and \n, like here: https://3v4l.org/5Zd7O Below is a snippet of my code:

$str = file_get_contents('jsondump.json');
var_dump(gettype($str));
var_dump($str);

$jsonArr = json_decode($str,1); // decode the JSON into an associative array
var_dump($jsonArr) ;
echo json_last_error_msg();

I tried detecting the encoding with mb_detect_encoding() and converting with mb_convert_encoding(); however, the result is still the same. I did:

$str = file_get_contents($jsonFile); 

$encoding = mb_detect_encoding($str, 'UTF-8, ISO-8859-1', true);
$str2 =  mb_convert_encoding($str, 'UTF-8', $encoding);
var_dump(gettype($str));
var_dump($str);
var_dump($str2);
var_dump($encoding);

When I display the var_dump results, I get the $encoding value as "\nstring(5) "UTF-8". The first $str looks like this:

[{\n "accreditation": false,\n "category.en": "Template",\n "category.fr": "Gabarit",\n "clientele.en": n ull,\n "clientele.fr": null,\n "courseid": 816,\n "duree": "1h00m",\n "dureeminutes": 60,\n "establishmentaltname": "06-ciusss-cusm",\n "establishmentfullname": "Centre universitaire de sant \xc3\xa9 McGill",\n "fcpresponsable": "",\n "idnumber": "",\n "idnumberalt": "",\n "imgurl": null,\n "ispartageable": true,\n "keywords": null,\n "lastupdate": 1483246800,\n "m odalite.en": "Online",\n "modalite.fr": "En ligne",\n "nombreinscriptions": 6,\n "parentestablishmentfullname": "Territoire CUSM",\n "parentestablishmentshortname": "CUSM-FCP",\n "partageable": "Pa rtageable",\n "shortname.en": "E-learning Course Template",\n "shortname.fr": "gabarit d\'une formation en ligne",\n "summary.en": "This template is to be used when creating an e-learning course as part of the F CP program. It is important that we standardize the training structure to allow users a more user friendly experience. ",\n "summary.fr": "Ce gabarit devra \xc3\xaatre utilis\xc3\xa9 lors de la cr\xc3\xa9ation d\'un cours FCP en ligne. Il est important d\'uniformiser la structure de formation afin de permettre une exp\xc3\xa9rience plus conviviale aux apprenants.",\n "title.en": "FCP E-learning Course Template",\n "title.fr": "FCP Gabarit de formation en ligne",\n "visible": false\n }]

and $str2 is exactly the same as $str.

Answer

Solution:

  1. The JSON text you posted is OK. Unfortunately, that's NOT the text you're passing to json_decode(). Hence the error.

  2. Assuming your original .json file is OK, it appears that file_get_contents() is corrupting the JSON text.

  3. SUGGESTION:

http://truelogic.org/wordpress/2018/08/19/php-file_get_contents-for-utf-encoded-content/

One of the problems of file_get_contents() is that it messes up the data if the file contains special characters outside the standard ASCII character set.

The solution is to convert the encoding of the contents to UTF-8, but only after it has detected the desired encoding. So for instance if we know the file contains European languages like Spanish or French then we specify the detection for ISO-8859-1. For Arabic it would be ISO-8859-6 and so on.

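function file_get_contents_utf8($fn) {
    // Read the raw bytes of the file.
    $content = file_get_contents($fn);
    // Detect the source encoding (strict mode), then convert to UTF-8.
    return mb_convert_encoding($content, 'UTF-8',
        mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}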

It sounds like your file contains French text encoded as ISO-8859-1, in which case all you should have to do is use mb_convert_encoding() to convert it to UTF-8 before attempting json_decode().

See also mb_detect_encoding for more details.
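The conversion step above can be combined with error reporting, so that a failed decode explains itself instead of returning null. This is only a sketch: the function name decode_json_or_explain is hypothetical, and the fallback to ISO-8859-1 is an assumption based on the file containing French text.

```php
<?php
// Hypothetical helper (not from the original post): decode JSON text and
// report why decoding failed instead of silently returning null.
function decode_json_or_explain(string $raw)
{
    // A UTF-8 byte order mark at the start makes json_decode() fail
    // with a syntax error, so drop it if present.
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        $raw = substr($raw, 3);
    }

    // json_decode() only accepts valid UTF-8.
    if (!mb_check_encoding($raw, 'UTF-8')) {
        // Assumption: any non-UTF-8 input is ISO-8859-1.
        $raw = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
    }

    $data = json_decode($raw, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('json_decode failed: ' . json_last_error_msg());
    }
    return $data;
}
```

Calling decode_json_or_explain(file_get_contents('jsondump.json')) would then either return the decoded array or throw with the message from json_last_error_msg().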


Per the OP, he's reading a perfectly legal JSON file like this:

[
{
    "accreditation": false,
    "category.en": "Administration and Management",
    "category.fr": "Administration et gestion",
    "clientele.en": null,
    "clientele.fr": null,
    "courseid": 11749,
    ...
    "lastupdate": 0,
    "modalite.en": "In Person",
    "modalite.fr": "En pr???�sentiel",
    "nombreinscriptions": 1,
    ...
    "partageable": "Locale",

... but file_get_contents() is corrupting the text, like this:

[{
        "accreditation": false,
        "category.en": "Template",
        "category.fr": "Gabarit",
        "clientele.en": n ull,
        ...
        "m odalite.en": "Online",
        "modalite.fr": "En ligne",
        "nombreinscriptions": 6,
        ...
        "partageable": "Pa rtageable",

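file_get_contents() doesn't always appear to "play nice" with non-ASCII, multi-byte text, per the link I cited above. Strictly speaking, it returns the file's bytes unchanged and performs no encoding conversion, so a file that isn't UTF-8 reaches json_decode() as invalid UTF-8. A common solution is to call mb_convert_encoding() to convert the string to UTF-8. I gave an example above.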

It appears, however, that mb_convert_encoding() doesn't help in the OP's case: mb_detect_encoding() already reports UTF-8, so the conversion changes nothing, yet json_decode() still fails. I can't explain this.

SUGGESTED ALTERNATIVE: read the bytes directly (instead of using file_get_contents()). Then call mb_convert_encoding(), to ensure json_decode() gets UTF-8 text:

Is there an alternative to file_get_contents?

fwrite() and UTF8

https://stackoverflow.com/a/31214886/421195
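A minimal sketch of the "read the bytes directly" alternative, assuming a hypothetical helper named read_file_bytes. It opens the file in binary mode and returns the raw bytes. Both approaches are normally expected to return the same bytes, so comparing its result with file_get_contents() is mainly a sanity check on where the corruption comes from.

```php
<?php
// Hypothetical helper: read a whole file as raw bytes with fopen()
// instead of file_get_contents().
function read_file_bytes(string $path): string
{
    $handle = fopen($path, 'rb'); // 'b' = binary mode, no newline translation
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $bytes = stream_get_contents($handle);
    fclose($handle);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return $bytes;
}
```

The returned string can then be passed through mb_convert_encoding() and json_decode() exactly as before.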

@Karan -

Q: Are you SURE the input file is 100% OK? There seem to be a few minor discrepancies between the examples.

Q: Have you looked at one of the "bad" files in a hex editor? Perhaps the "mysterious spaces" might be due to "hidden characters" that would only show up if you viewed the file in hex?

Q: What's your PHP version? Perhaps upgrading might resolve the problem?
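If no hex editor is at hand, the same check can be sketched in PHP. The helper below (find_suspicious_bytes is a hypothetical name) lists every byte that is neither printable ASCII nor ordinary whitespace, with its offset and hex value. Note that it also reports the bytes of legitimate UTF-8 characters such as é (0xC3 0xA9), so the output needs to be read with that in mind.

```php
<?php
// Hypothetical helper: list every byte that is not printable ASCII or
// ordinary whitespace (tab, LF, CR), with its offset and hex value.
function find_suspicious_bytes(string $raw): array
{
    $found = [];
    $length = strlen($raw);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($raw[$i]);
        $isPlainWhitespace = $byte === 0x09 || $byte === 0x0A || $byte === 0x0D;
        $isPrintableAscii = $byte >= 0x20 && $byte <= 0x7E;
        if (!$isPlainWhitespace && !$isPrintableAscii) {
            $found[] = sprintf('offset %d: 0x%02X', $i, $byte);
        }
    }
    return $found;
}
```

Running print_r(find_suspicious_bytes(file_get_contents('jsondump.json'))) would show whether anything unexpected sits inside tokens such as null.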

Answer

Solution:

I tried the JSON data you provided and it works fine. Check that the path you pass to file_get_contents() really points to where your JSON file is stored.

Below is the example: https://3v4l.org/bFS59
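The path check can be made explicit before decoding. This is a sketch with a hypothetical helper name, load_json_file. It fails loudly when the file is missing, unreadable, or empty, rather than letting json_decode() receive false or an empty string.

```php
<?php
// Hypothetical helper: verify the path before reading and decoding.
function load_json_file(string $path)
{
    if (!is_file($path) || !is_readable($path)) {
        throw new InvalidArgumentException("JSON file not found or not readable: $path");
    }
    $raw = file_get_contents($path);
    if ($raw === false || $raw === '') {
        throw new RuntimeException("JSON file is empty or could not be read: $path");
    }
    // Decode objects as associative arrays, as in the question.
    return json_decode($raw, true);
}
```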
