Freetag - 8bit-tags list in entry-editor

LazyBadger
Regular
Posts: 176
Joined: Mon Aug 25, 2008 12:25 pm
Location: Russia
Contact:

Post by LazyBadger »

OK, I can understand why pure 8-bit tags are a big problem for European developers, but English isn't the only (or even the most used) language in the world.
But everybody with an alphabet of 8 or more bits wants to see tags in their native language sorted and ordered the same way as English tags, which isn't possible for now (I updated to 3.12.2 and re-checked), because the output looks something like this:

Code:

|A: adobe, |B: bluetooth, |D: DVCS, |F: FIDO, |G: Git, |H: headsets, |M: Mercurial, |N: nokia, |�: я.ру, шутки, телефоны, фразы, фантастика, филология, фото, русский язык, религия, |P: pdf, pdf reader, |S: SCM, sony, |W: wireless, |�: Анонс, АИ, Россия, Тарковский, Яндекс, аудио, аудиофильское, агрегатор, алармизм, анонс, бред, байки, блоги, гарнитуры, детектив, злое, история России, мысли, масскульт, мифы, мониторы, на злобу дня, новость, псевдолитература, па, История,
and even worse, the tag output is browser-dependent (tried Safari, Firefox, Opera, QtWeb).
I'm not asking for an ASAP fix, I'm just asking: where can I read the code that generates this output? (I tried reading the sources, no luck.)
Quis custodiet ipsos custodes?
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Freetag - 8bit-tags list in entry-editor

Post by garvinhicking »

Hi!

I believe the problem is that the first character is extracted with substr() instead of checking for mb_substr(). The keys are sorted with PHP's own sorting, which may well not order multibyte UTF-8 characters properly.
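
To illustrate the difference, here is a minimal sketch. It is not the freetag plugin's actual code: group_tags_by_initial() is a hypothetical helper that takes the first character with mb_substr() and an explicit 'UTF-8' encoding, then sorts the group keys with ksort(). It assumes the mbstring extension is loaded and the tags are UTF-8 strings.

```php
<?php
// Hypothetical helper (not the plugin's real function): groups tags by
// their upper-cased first character in a multibyte-safe way.
function group_tags_by_initial(array $tags): array
{
    $groups = [];
    foreach ($tags as $tag) {
        // mb_substr() counts characters, so a two-byte Cyrillic letter stays whole.
        $initial = mb_strtoupper(mb_substr($tag, 0, 1, 'UTF-8'), 'UTF-8');
        $groups[$initial][] = $tag;
    }
    // Byte-wise string sort of the keys: for UTF-8 this is code point order,
    // not a locale-aware alphabetical order.
    ksort($groups, SORT_STRING);
    return $groups;
}

// substr() counts bytes: it returns only the first byte of "я" (0xD1 0x8F),
// which is not a valid UTF-8 string on its own.
$broken = substr('я.ру', 0, 1);
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// Print the groups in the same "|X: tag, tag" style as the editor's tag list.
foreach (group_tags_by_initial(['Git', 'я.ру', 'adobe', 'Анонс', 'аудио']) as $initial => $members) {
    echo '|', $initial, ': ', implode(', ', $members), "\n";
}
```

A locale-aware ordering would need more than byte-wise sorting, for example the intl extension's Collator class, if that extension is available on the server.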

I'll try to look into this in the next few days!

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/

Post by LazyBadger »

OK. Please also kill the lowercasing of tags, or make it an option (probably the best of the possible choices).
Don Chambers
Regular
Posts: 3652
Joined: Mon Feb 13, 2006 2:40 am
Location: Chicago, IL, USA
Contact:

Post by Don Chambers »

LazyBadger wrote: OK. Please also kill the lowercasing of tags, or make it an option (probably the best of the possible choices).
Forcing tags to be lower case already IS an option in the freetag event plugin.
=Don=

Post by LazyBadger »

Don Chambers wrote: Forcing tags to be lower case already IS an option in the freetag event plugin.
Mea culpa, found it now. Thanks.

Post by garvinhicking »

Hi!

The freetag plugin in fact already uses mb_strtolower. Can you check that your PHP has the mb* functions enabled?
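
One way to check this from a script, as a rough sketch: report_mb_support() is a hypothetical helper name, and the list of functions is simply the ones mentioned in this thread.

```php
<?php
// Hypothetical helper (not part of Serendipity or the freetag plugin):
// reports whether the mbstring extension and a few mb_* functions are usable.
function report_mb_support(): array
{
    $functions = ['mb_strtolower', 'mb_substr', 'mb_internal_encoding'];
    $report = ['mbstring loaded' => extension_loaded('mbstring')];
    foreach ($functions as $name) {
        $report[$name] = function_exists($name);
    }
    return $report;
}

// Print one "name: yes/no" line per check.
foreach (report_mb_support() as $item => $ok) {
    echo $item, ': ', $ok ? 'yes' : 'no', "\n";
}
```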

Regards,
Garvin

Post by LazyBadger »

Garvin, I'm a bit too lazy to write code, but phpinfo() tells me:
mbstring
Multibyte Support: enabled
Multibyte string engine: libmbfl
Multibyte (japanese) regex support: enabled
Multibyte regex (oniguruma) version: 4.4.4
Multibyte regex (oniguruma) backtrack check: On

Post by garvinhicking »

Hi!

Can you create a test.php file with:

Code:

<?php
echo mb_strtolower('teststring');
?>

Regards,
Garvin

Post by LazyBadger »

garvinhicking wrote: Can you create a test.php file
Yes, but it's not so easy for UTF-8 texts. Everything is OK with 8-bit ANSI. Maybe you can tell me which part of the code (and where) I can verify, and I'll write a full report here?

Post by garvinhicking »

Hi!

I'm not sure I understand. Does the test.php code work or does it produce errors? Does it not properly lowercase your UTF-8 characters?

If a UTF-8 character cannot be lowercased, it should be shown as the original character...

Regards,
Garvin

Post by LazyBadger »

garvinhicking wrote: I'm not sure I understand. Does the test.php code work or does it produce errors? Does it not properly lowercase your UTF-8 characters?
I can't understand the results of my test either, and it's driving me nuts.
Test code (extended version):

Code:

<?php
echo('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">');
echo('<HTML>');
echo(' <HEAD>');
echo('  <TITLE> Test of mb_ functionality v.1.5</TITLE>');
echo('  <META NAME="Generator" CONTENT="EditPlus">');
echo('  <META NAME="Author" CONTENT="LazyBadger">');
echo('  <META NAME="Keywords" CONTENT="test, mb_ functions">');
echo('  <META NAME="Description" CONTENT="Test Page">');
echo('  <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="TEXT/HTML; CHARSET=UTF-8" />');

echo(' </HEAD>');

echo(' <BODY>');
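// Lowercase with the multibyte function if it is available.
if (is_callable('mb_strtolower')) {
    echo mb_strtolower('mb_strtolower is callable- Пробные РаЗные ТЕксты; ');
} else {
    echo('mb_strtolower NOT callable!!! ');
}

// Unmodified text, for comparison.
echo ('Pure UTF8 - Пробные РаЗные ТЕксты; ');
// Byte-oriented strtolower(), for comparison.
echo strtolower('strtolower - Пробные РаЗные ТЕксты');
echo('</BODY>');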
echo('</HTML>');
?>
Saved as a "UTF-8 without BOM" file (yes, stupid code, but AFAIK it must work for our needs), it doesn't show the lowercased text as readable text - see http://www.lazybadger.ru/test.php
garvinhicking wrote: If a UTF-8 character cannot be lowercased, it should be shown as the original character...
Not in my case. Is it a hosting-related issue? If yes, we can close my bug report with a "not a bug" resolution.

Post by garvinhicking »

Hi!

Let's not give up yet. Your code is a good basis for further tests.

Try adding the encoding as a second parameter:

Code:

mb_strtolower('...your strings...', 'UTF-8');
This should instruct mb_strtolower to treat your original encoding as UTF-8. It seems as if your server is set up with an internal mb encoding other than UTF-8.

You might also want to add:

Code:

$cur = mb_internal_encoding();
mb_internal_encoding('UTF-8');
echo "Internal encoding set to UTF-8, was: $cur<br />\n";
before any calls to mb_strtolower.
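
A small sketch of why the encoding matters, assuming the mbstring extension is loaded and this file is saved as UTF-8. ISO-8859-1 is used here only as an example of a non-UTF-8 encoding, and the variable names are illustrative.

```php
<?php
// Sample text from the test page earlier in the thread.
$text = 'Пробные РаЗные ТЕксты';

// Explicit encoding: the result does not depend on mb_internal_encoding().
$utf8 = mb_strtolower($text, 'UTF-8');

// Same bytes, but interpreted as ISO-8859-1: the lead bytes of the Cyrillic
// letters (0xD0, 0xD1) are treated as Latin-1 capital letters and lowercased,
// which changes the byte sequence.
$latin1 = mb_strtolower($text, 'ISO-8859-1');

echo $utf8, "\n";
echo mb_check_encoding($latin1, 'UTF-8')
    ? "ISO-8859-1 result is still valid UTF-8\n"
    : "ISO-8859-1 result is no longer valid UTF-8\n";
```

Passing the encoding on every call keeps the fix local to that call; changing mb_internal_encoding() affects every later mb_* call in the request that does not name an encoding.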

Regards,
Garvin

Post by LazyBadger »

Code:

Internal encoding set to UTF-8, was: ISO-8859-1
Result (OK):

Code:

mb_strtolower is callable- пробные разные тексты; Pure UTF8 - Пробные РаЗные ТЕксты;
Passing the second parameter to mb_strtolower, without calling mb_internal_encoding(), produces the same result as above (good).

Post by garvinhicking »

Hi!

Excellent. I patched the plugin to add that call:

http://php-blog.cvs.sourceforge.net/vie ... sion=1.141

Regards,
Garvin

Post by LazyBadger »

garvinhicking wrote: Excellent. I patched the plugin to add that call:
Patch applied, but nothing changed in the output. The local browser cache was cleared before testing, of course.