Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error: charset malformed JSON... this happen with columns fields from db with specials characters... any fix? #8

Open
rapgithub opened this issue Aug 27, 2018 · 7 comments
Assignees

Comments

@rapgithub
Copy link

I have a table with a column text with French and Spanish characters and the JSON returned from the API gives me error

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /Applications/MAMP/htdocs/www/cpanel_tripsbro/vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

I think the issue it is the charset problem... my table is utf-8 charset in the db and header and html charset is ok but the issue it is from the API returning malformed Json when the column has accents in the field to show...

is there a way to set charset when using the line below?
$db = new \atk4\data\Persistence_SQL connection here?

or where?

thanks

@rapgithub rapgithub changed the title Charset for mysql connect JSON Charset for mysql connect for the API or for fields? Aug 27, 2018
@rapgithub rapgithub changed the title JSON Charset for mysql connect for the API or for fields? how to set the Charset for mysql connect or for fields it gives me error for fields with special characters, any fix? Aug 27, 2018
@rapgithub rapgithub changed the title how to set the Charset for mysql connect or for fields it gives me error for fields with special characters, any fix? how to set the Charset for mysql connect in the API or for fields? Aug 27, 2018
@rapgithub rapgithub changed the title how to set the Charset for mysql connect in the API or for fields? how to set the Charset when using mysql connect in the API or set it up for the problematic fields? Aug 27, 2018
@rapgithub rapgithub changed the title how to set the Charset when using mysql connect in the API or set it up for the problematic fields? error: charset --- how to set the Charset when using mysql connect in the API or set it up for the problematic fields? Aug 27, 2018
@rapgithub rapgithub changed the title error: charset --- how to set the Charset when using mysql connect in the API or set it up for the problematic fields? error: charset malformed JSON... this happen with columns fields from db with specials characters... any fix? Aug 27, 2018
@rapgithub
Copy link
Author

rapgithub commented Aug 28, 2018

I found a line that seems to solve this issue of Malformed JSON when the field contains special UTF-8 chars... if I change the Persistence_SQL.php and add the line below it fix the issue for the API JSON but for the Model edit in ATK4 it creates the problem with these fields showing wrong encoding...

change to Persistence_SQL.php added this line to fix the API but it affects the Model...

        case 'text':
            $v = utf8_encode($v);
            break;

Anyone knows how to fix it in the API only without affecting the ATK4 Model edit mode for those fields?
thanks

@DarkSide666
Copy link
Member

DarkSide666 commented Aug 28, 2018

Can you please try adding | JSON_UNESCAPED_UNICODE constant in these two places?

Please report back if this helps you and I will include this fix in repo code then.

@DarkSide666 DarkSide666 self-assigned this Aug 28, 2018
@rapgithub
Copy link
Author

rapgithub commented Aug 28, 2018

First of all I installed the API using composer and checking the link you sent me it is a different API.php file than the one I installed via composer.. mine was having 368 lines...

I have added too these line JSON_UNESCAPED_UNICODE you mentioned me before and it did not work either... I tried with your version of API.php and added the JSON_UNESCAPED_UNICODE and still the problem persist...

Error again the same:

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

can you test it first with text fields in my case those text fields are:

$this->addField('description_en', ['type' => 'text']);
$this->addField('description_it', ['type' => 'text']);
$this->addField('description_fr', ['type' => 'text']);
$this->addField('description_de', ['type' => 'text']);

I have English not problem, I have italian, French and German and I have the problem.. the table that has these columns is collation utf8_general_ci and every field is defined with collation utf8_general_ci
try to create a text column in db and enter French German or italian characters you will see the problem then...

In ATK4 in edit mode the field are rendered OK and the character are well saved in the db... showing perfect but in the API the JSON gives this error...

I put back also the Persistence_SQL.php to its normal code taking out the lines I have added and the problem it is the same:

Fatal error: Uncaught InvalidArgumentException: Unable to encode data to JSON in Zend\Diactoros\Response\JsonResponse: Malformed UTF-8 characters, possibly incorrectly encoded in /vendor/zendframework/zend-diactoros/src/Response/JsonResponse.php:164 Stack trace: #0 

adding this | JSON_UNESCAPED_UNICODE in those lines:

https://github.com/atk4/api/blob/develop/src/Api.php#L240
https://github.com/atk4/api/blob/develop/src/Api.php#L429

does not fix the problem yet...
thanks

@rapgithub
Copy link
Author

rapgithub commented Aug 28, 2018

can you verify if your Persistence_SQL.php it is like mine here?

 public function _typecastLoadField(Field $f, $value)
    {
        // LOB fields return resource stream
        if (is_resource($value)) {
            $value = stream_get_contents($value);
        }

        // work only on copied value not real one !!!
        $v = is_object($value) ? clone $value : $value;

        switch ($f->type) {
        case 'integer':
            $v = (int) $v;
            break;
        case 'float':
            $v = (float) $v;
            break;
        case 'money':
            $v = round($v, 4);
            break;
        case 'boolean':
            if (isset($f->enum) && is_array($f->enum)) {
                if (isset($f->enum[0]) && $v == $f->enum[0]) {
                    $v = false;
                } elseif (isset($f->enum[1]) && $v == $f->enum[1]) {
                    $v = true;
                } else {
                    $v = null;
                }
            } else {
                $v = (bool) $v;
            }
            break;
        case 'date':
        case 'datetime':
        case 'time':
            $dt_class = isset($f->dateTimeClass) ? $f->dateTimeClass : 'DateTime';
            $tz_class = isset($f->dateTimeZoneClass) ? $f->dateTimeZoneClass : 'DateTimeZone';

            if (is_numeric($v)) {
                $v = new $dt_class('@'.$v);
            } elseif (is_string($v)) {
                // ! symbol in date format is essential here to remove time part of DateTime - don't remove, this is not a bug
                $format = ['date' => '+!Y-m-d', 'datetime' => '+!Y-m-d H:i:s', 'time' => '+!H:i:s'];
                $format = $f->persist_format ?: $format[$f->type];

                // datetime only - set from persisting timezone
                if ($f->type == 'datetime' && isset($f->persist_timezone)) {
                    $v = $dt_class::createFromFormat($format, $v, new $tz_class($f->persist_timezone));
                    if ($v === false) {
                        throw new Exception(['Incorrectly formatted datetime', 'format' => $format, 'value' => $value, 'field' => $f]);
                    }
                    $v->setTimeZone(new $tz_class(date_default_timezone_get()));
                } else {
                    $v = $dt_class::createFromFormat($format, $v);
                    if ($v === false) {
                        throw new Exception(['Incorrectly formatted date/time', 'format' => $format, 'value' => $value, 'field' => $f]);
                    }
                }

                // need to cast here because DateTime::createFromFormat returns DateTime object not $dt_class
                // this is what Carbon::instance(DateTime $dt) method does for example
                if ($dt_class != 'DateTime') {
                    $v = new $dt_class($v->format('Y-m-d H:i:s.u'), $v->getTimeZone());
                }
            }
            break;
        case 'array':
            // don't decode if we already use some kind of serialization
            $v = $f->serialize ? $v : json_decode($v, true);
            break;
        case 'object':
            // don't decode if we already use some kind of serialization
            $v = $f->serialize ? $v : json_decode($v, false);
            break;
        }

        return $v;
    }

I added those lines I mentioned firstly because without them the problem exists at least for the API trying to create the JSON.

I do not understand why 'TEXT' type does not exist in the type definition inside _typecastLoadField function

this works for the API if I add it but then stop working will in ATK4 in edit mode when editing this text fields with special characters...

        case 'text':
            $d = mb_detect_encoding($v);
            if ($d=='ASCII') { $v=utf8_encode($v); } else { $v=utf8_encode($v); };
            if ($d=='UTF-8') { $v=$v; };
            break;

or this too

        case 'text':
       $v=utf8_encode($v);
            break;

but then in ATK4 modifying the Persistence_SQL.php in those lines adding this lines of code above in edit mode shows then strange symbols and this is logical because in ATK4 the fields are will rendered all the time but in the API those fields does not render will if the have special characters...

so until now I do not find still the correct solution when using the API with ATK4...

@DarkSide666
Copy link
Member

DarkSide666 commented Aug 29, 2018

OK I will make a testcase on my local machine and test this better in next few days and get back to you.

Btw, how your composer.json looks like?
Should be "atk4/api": "dev-develop", and same for atk4/data. I am testing on develop versions.
Also sometimes composer have a bug and not update. Then you can try to delete your vendor folder and then run composer update once again. Then it should pull last commit from develop branches.

@rapgithub
Copy link
Author

rapgithub commented Aug 30, 2018

thanks for having the time to test it! I wait...

for the compose.json I have this on the root of ATK4/composer.json folder: I do not see why atk4/data is not listed there ...

{
"minimum-stability": "dev",
"prefer-stable": false,
"require":{
"atk4/ui": "^1.2",
"atk4/api": "^0.1.0",
"atk4/mastercrud": "^1.0",
"atk4/login": "dev-master",
"atk4/filestore": "dev-master"
}

}

@DarkSide666
Copy link
Member

So here is fully featured test case - no problem found.
Please try this out and see if it works for you.

<?php
require_once 'vendor/autoload.php';

// Config
$config = [
    'db' => 'mysql://root@localhost/sampledb;charset=utf8',
    'uri' => '.'.$_SERVER['SCRIPT_NAME'],
];
/*
// No need to add character set and collation because these are default settings anyway
CREATE TABLE test_utf8 (
    id int unsigned not null auto_increment,
    desc_en varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_it varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_fr varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    desc_de varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

this is white cat
questo è gatto bianco
c'est chat blanc
Das ist weiße Katze
*/

// Initialize App
$app = new \atk4\ui\App(['API UTF8 test', 'always_run' => false]); // always_run=false to not render App on shutdown
$app->initLayout('Centered');
$app->init();

// Connect DB
$db = $app->dbConnect($config['db']);

// Header
$app->add(['Header', 'subHeader'=>'https://github.com/atk4/api/issues/8'])
    ->set('Testing utf8');

// Model
class TestModel extends \atk4\data\Model {
    public $table = 'test_utf8';
    public function init() {
        parent::init();
        $this->getElement($this->id_field)->ui['visible'] = true; // to see it in CRUD
        $this->addField('desc_en', ['type' => 'string']);
        $this->addField('desc_it', ['type' => 'string']);
        $this->addField('desc_fr', ['type' => 'string']);
        $this->addField('desc_de', ['type' => 'string']);
    }
}
$m = new TestModel($db);
$m->tryLoadAny();

// CRUD test - works perfectly
$c = $app->add('CRUD');
$c->setModel($m);

// API test
$api = $app->add(new \atk4\api\Api());
$api->rest('/test_utf8', new TestModel($db));

/*
    By default GET converts utf8 entities to \u notation
    See https://stackoverflow.com/a/16498304/1466341
    
    http://localhost/utf8.php/test_utf8/1 returns
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
    If you add JSON_UNESCAPED_UNICODE in Api->successResponse, then it will not convert characters to \u notation,
    but French character ' still will be converted to \u0027 because of this contant JSON_HEX_APOS we have in api.
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo è gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist weiße Katze"
}
*/

$endpoint = $config['uri'].'/test_utf8/';
$c->addModalAction('GET', 'GET request', function($p, $id) use ($m, $endpoint){
    $p->add('Console')->set(function ($console) use ($id, $m, $endpoint, $id) {

        // Make GET request
        $console->debug('Making GET request... ');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Decode JSON
        $console->debug('Decoded JSON:');
        $data = json_decode($result, true);
        $console->output(var_export($data, true));
        $before = $data;
        
        // Set values in model
        $console->debug('Set received values in model:');
        $m->set($data);
        $fields = $m->get();
        unset($fields['field_sql']); // new bug in atk4/data? what the hell is field_sql ???
        $console->output(var_export($fields, true));
        
        // POST request
        $console->debug('Making POST request...');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
            CURLOPT_POST => 1,
            CURLOPT_POSTFIELDS => $fields,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Make GET request again
        $console->debug('Making GET request again...');
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_URL => $endpoint.$id,
        ]);
        $result = curl_exec($curl);
        curl_close($curl);

        // Response
        $console->debug('Response:');
        $console->output($result);

        // Decode JSON
        $console->debug('Decoded JSON:');
        $data = json_decode($result, true);
        $console->output(var_export($data, true));
        $after = $data;
        
        // Compare old and new values
        if ($before == $after) {
            $console->info('Old and new values are equal');
        } else {
            $console->error('Old and new values are different');
        }
    });
});

// Render App
// If API URL was called, then we don't go here and App will not be rendered becauseof always_run=false
$app->run();

Output:

Making GET request... 
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Decoded JSON:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)
Set received values in model:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)
Making POST request...
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Making GET request again...
Response:
{
    "id": "1",
    "desc_en": "this is white cat",
    "desc_it": "questo \u00e8 gatto bianco",
    "desc_fr": "c\u0027est chat blanc",
    "desc_de": "Das ist wei\u00dfe Katze"
}
Decoded JSON:
array (
  'id' => '1',
  'desc_en' => 'this is white cat',
  'desc_it' => 'questo è gatto bianco',
  'desc_fr' => 'c\'est chat blanc',
  'desc_de' => 'Das ist weiße Katze',
)

Old and new values are equal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants