Hello,
I'm running s9y for almost five years now, using SQLite as database engine. On the migration 1.4.1 -> 1.5.2, I also changed the database format from Latin-1 to UTF-8. After that the search does not find anything if I search for an umlaut, which worked when the blog was run using Latin-1. Searching for "word1 word2" does fail also, even though an article contains the exact phrase. A search for "word1*word2" does fail also.
I've tried another blog that runs on UTF-8 since it's creation and the bugs appear there also. Any hints on where to start looking for problem causing this?
Search broken using UTF-8 and SQLite
Re: Search broken using UTF-8 and SQLite
After some more testing and by accident I found the real cause. It has nothing to do with umlauts and probably nothing with the database: you can't search anything that's the text of a link.
If your text includes:
a search for "jump" or "page" will not find the article containing that link.
Sorry if I caused any inconvenience,
SvOlli
If your text includes:
Code: Select all
<a href="http://www.s9y.org">jump to s9y page</a>Sorry if I caused any inconvenience,
SvOlli
-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: Search broken using UTF-8 and SQLite
Hi!
That's strange, because I believe the SQL search uses "LIKE %...%" which should also find something if you have "ajumper" and search for "jump"...?!
Regards,
Garvin
That's strange, because I believe the SQL search uses "LIKE %...%" which should also find something if you have "ajumper" and search for "jump"...?!
Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
-
kleinerChemiker
- Regular
- Posts: 765
- Joined: Tue Oct 17, 2006 2:36 pm
- Location: Vienna/Austria
- Contact:
Re: Search broken using UTF-8 and SQLite
does sqlite use the same placeholder like mysql?
Re: Search broken using UTF-8 and SQLite
Hmmm... I tried it again and it's not quite that either. Fact is, that it doesn't find obvious texts. Please try it yourself: go to http://blog.h8u.de/ and search for the terms "Qt 4.6" and "lighttpd". Both are available in the latest article "Das Ende meines Apachen...?", but cannot be found using the search.
I tried to dig up the search function by greping the the code for strings like "searchTerm", which is passed as an argument to search for, but could not dig up something. Could anyone point me out on where to look for the function in question?
And from another project, where I was using databases from Qt 4.x, I can tell that SQLite and MySQL are "more compatible" as SQLite and PostgreSQL for example. So, yes, SQLite uses the same placeholders as described in the SQL standard: "_" for one random character and "%" for zero or more random characters.
I tried to dig up the search function by greping the the code for strings like "searchTerm", which is passed as an argument to search for, but could not dig up something. Could anyone point me out on where to look for the function in question?
And from another project, where I was using databases from Qt 4.x, I can tell that SQLite and MySQL are "more compatible" as SQLite and PostgreSQL for example. So, yes, SQLite uses the same placeholders as described in the SQL standard: "_" for one random character and "%" for zero or more random characters.
-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: Search broken using UTF-8 and SQLite
Hi!
Do you have an external SQLite3 tool that you could use to execute SQL queries? If so, you could try:
If this does not yield results, SQLite might not support it properly, and I wouldn'T know how to change that in the application, if the database query does not return expected results...
The function is in include/functions_entries.incphp, serendipity_searchEntries().
HTH,
GArvin
Do you have an external SQLite3 tool that you could use to execute SQL queries? If so, you could try:
Code: Select all
SELECT * FROM serendipity_entries WHERE body LIKE '%lighttpd%'
The function is in include/functions_entries.incphp, serendipity_searchEntries().
HTH,
GArvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
Re: Search broken using UTF-8 and SQLite
Using the SQLite2 tool, it works as expected: it finds the article.garvinhicking wrote: Do you have an external SQLite3 tool that you could use to execute SQL queries? If so, you could try:
Code: Select all
SELECT * FROM serendipity_entries WHERE body LIKE '%lighttpd%'
I tried it again on some other data, and it seems to me that everything that's in between tags, e.g. <span>...</span>, <table>...</table> can not be found?
I'll dig into function_entries as soon as possible.
Greetings,
SvOlli
Re: Search broken using UTF-8 and SQLite
Who needs sleep anyway? 
When the search didn't bring up enough matches, the search was restarted with an appended "*". SQLite treats "*" like any other character, so the second search didn't bring up anything at all.
Here's my suggestion for a fix / workaround:
So it wasn't my imagination... just some bad and misinterpreted coincidences. 
When the search didn't bring up enough matches, the search was restarted with an appended "*". SQLite treats "*" like any other character, so the second search didn't bring up anything at all.
Here's my suggestion for a fix / workaround:
Code: Select all
diff --git a/include/functions_entries.inc.php b/include/functions_entries.inc.p
index 5518e4a..fae7913 100644
--- a/include/functions_entries.inc.php
+++ b/include/functions_entries.inc.php
@@ -773,6 +773,7 @@ function &serendipity_searchEntries($term, $limit = '', $sea
} elseif ($serendipity['dbType'] == 'sqlite' || $serendipity['dbType'] == '
// Very extensive SQLite search. There currently seems no other way to
// But it's better than no search at all :-D
+ $term = str_replace( "*", "%", $term );
$cond['group'] = 'GROUP BY e.id';
$cond['distinct'] = '';
$term = serendipity_mb('strtolower', $term);
@@ -863,7 +864,7 @@ function &serendipity_searchEntries($term, $limit = '', $sea
//if * wasn't already appended and if there are none or not enough
//results, search again for entries containing the searchterm as a part
- if (strpos($term, '*') === false) {
+ if ((strpos($term, '*') === false) && (strpos($term, '%') === false)) {
if (! is_array($search)) {
return serendipity_searchEntries($term.'*', $orig_limit);
}else if (count($search) < 4){
-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: Search broken using UTF-8 and SQLite
Hi!
That's a very good catch, thanks a lot! I just committed your patch.
Regards,
Garvin
That's a very good catch, thanks a lot! I just committed your patch.
Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
Re: Search broken using UTF-8 and SQLite
You're welcome, and I couldn't have done it without you pointing me to include/functions_entries.inc.php, serendipity_searchEntries().
I've updated my patch, as my first attempt had one drawback: it searched twice just to get the same results, if there was none or less than four matches.
I've updated my patch, as my first attempt had one drawback: it searched twice just to get the same results, if there was none or less than four matches.
Code: Select all
diff --git a/include/functions_entries.inc.php b/include/functions_entries.inc.php
index 5518e4a..ef6643d 100644
--- a/include/functions_entries.inc.php
+++ b/include/functions_entries.inc.php
@@ -773,6 +773,7 @@ function &serendipity_searchEntries($term, $limit = '', $searchresults = '') {
} elseif ($serendipity['dbType'] == 'sqlite' || $serendipity['dbType'] == 'sqlite3') {
// Very extensive SQLite search. There currently seems no other way to perform fulltext search in SQLite
// But it's better than no search at all :-D
+ $term = str_replace( "*", "%", $term );
$cond['group'] = 'GROUP BY e.id';
$cond['distinct'] = '';
$term = serendipity_mb('strtolower', $term);
@@ -863,7 +864,9 @@ function &serendipity_searchEntries($term, $limit = '', $searchresults = '') {
//if * wasn't already appended and if there are none or not enough
//results, search again for entries containing the searchterm as a part
- if (strpos($term, '*') === false) {
+ if ((strpos($term, '*') === false) &&
+ ($serendipity['dbType'] != 'sqlite') &&
+ ($serendipity['dbType'] != 'sqlite3')) {
if (! is_array($search)) {
return serendipity_searchEntries($term.'*', $orig_limit);
}else if (count($search) < 4){-
garvinhicking
- Core Developer
- Posts: 30022
- Joined: Tue Sep 16, 2003 9:45 pm
- Location: Cologne, Germany
- Contact:
Re: Search broken using UTF-8 and SQLite
Hi!
Thanks a lot, committed your fix!
Regards,
Garvin
Thanks a lot, committed your fix!
Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/