<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Question2Answer Q&amp;A - Recent questions tagged parser</title>
<link>https://www.question2answer.org/qa/tag/parser</link>
<description>Powered by Question2Answer</description>
<item>
<title>Import content from external website, UTF-8 character problem in post</title>
<link>https://www.question2answer.org/qa/67065/import-content-form-external-website-character-problem-post</link>
<description>

&lt;p&gt;I am working on a custom plugin that imports posts from an external website.&lt;/p&gt;

&lt;p&gt;I use the following code to create a post on the QA platform with data retrieved from the external website.&lt;/p&gt;

&lt;blockquote&gt;

&lt;p&gt;qa_post_create('Q', null, $title, $content, $format = '', $categoryid, $tags, $userid, $notify = null, $email = null, $extravalue = null, $name = null);&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I retrieve $title, $content, and $tags successfully from the external website using Simple HTML DOM Parser, and I use the strip_tags() function to remove the HTML tags and keep only the text.&lt;/p&gt;

&lt;p&gt;Everything works: the qa_post_create() function creates the post. However, the content of the newly created post shows HTML entities literally instead of the actual characters, such as:&lt;/p&gt;

&lt;blockquote&gt;

&lt;p&gt;&amp;amp;yacute; (ý)
&lt;br&gt;&amp;amp;uuml;
&lt;br&gt;&amp;amp;ldquo;
&lt;br&gt;&amp;amp;mdash;
&lt;br&gt;&amp;amp;rdquo;
&lt;br&gt;&amp;amp;Ccedil; (Ç)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;img alt=&quot;&quot; src=&quot;https://www.question2answer.org/qa/?qa=blob&amp;amp;qa_blobid=7145867896208243289&quot; style=&quot;height:350px; width:600px&quot;&gt;&lt;/p&gt;

&lt;p&gt;The $title also contains characters such as &lt;em&gt;ý&lt;/em&gt;, but there they appear normally, not as &amp;amp;yacute;. The problem occurs only in the question content.&lt;/p&gt;

&lt;p&gt;When I just retrieve the data and print it with echo or print_r(), the text appears normal, without any irregular symbols. The symbols appear only when I use qa_post_create() to create the question.&lt;/p&gt;

&lt;p&gt;Also, when I change the format from '' to 'html' in qa_post_create(), the problem goes away. But I do not want to use format='html'. Is there any other way to fix it?&lt;/p&gt;</description>
<category>Plugins</category>
<guid isPermaLink="true">https://www.question2answer.org/qa/67065/import-content-form-external-website-character-problem-post</guid>
<pubDate>Sun, 09 Sep 2018 19:41:37 +0000</pubDate>
</item>
</channel>
</rss>