some issues with RSS import

romulus
Regular
Posts: 49
Joined: Fri Sep 24, 2004 4:31 pm
Contact:

some issues with RSS import

Post by romulus »

I installed s9y from scratch and wanted to import my blog entries from my WordPress installation. I ran into some issues, which I will describe here:

- Although I named my categories the same as in WordPress, none of the imported entries were categorized. Strangely, it worked as expected (articles were categorized) when the WordPress RSS feed was set to deliver only 10 articles. After I set it to deliver all articles, no categories were assigned to the imported articles. I don't know whether this is a WP or an s9y issue.
- Unfortunately, the import tries to fill both bodies: the entry body and the extended body. This works less than optimally, because the article text is split into two halves regardless of content, sometimes even in the middle of a word. For instance, if the entry body should end with "serendipity", the import splits it so that the entry body ends with "serend" and the extended body starts with "ipity". This is very ugly, because I have to edit all entries manually to fix it!
- The text that ends up in the entry body is stripped of all links and HTML code, so I have to add my links and HTML again manually.

I think it would be better to import the articles into the entry body only and let the user decide whether an article needs to be split into a teaser and full text.

Nonetheless, s9y is a fine piece of software and IMHO much better and more stable than WordPress. Thanks! :)
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: some issues with RSS import

Post by garvinhicking »

Hi romulus!

Can you point me to the URL of your WordPress installation? The category problem is most likely caused by WP delivering a different feed with the 'all articles' setting than without it, because our parsing logic is the same no matter how many entries you import...

The 'both bodies' function is kinda hard to explain: it uses both the 'content:encoded' and the 'description' element. I have fixed some of that concatenation logic in our codebase, thanks!

Anyway, the HTML is stripped because WordPress does not deliver it in one of the mentioned elements. Serendipity does not strip HTML code anywhere!

I have now added a switch to not import any text into the 'extended entry' field, thanks for that suggestion!
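
To illustrate the idea behind such a switch, here is a minimal Python sketch. It is not Serendipity's actual PHP code: the function name `map_item`, the `body_only` parameter, and the sample feed are assumptions made up for this example. It shows one way an importer could fill only the entry body from 'content:encoded' (falling back to 'description'), or put the teaser and the full text into separate fields without ever cutting the text in the middle of a word.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module that defines <content:encoded>.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def map_item(item, body_only=True):
    """Map one RSS <item> element to a dict with 'body' and 'extended' text.

    Hypothetical helper for illustration only.
    """
    description = (item.findtext("description") or "").strip()
    full_text = (item.findtext(f"{{{CONTENT_NS}}}encoded") or "").strip()
    if body_only or not full_text:
        # Prefer the full HTML text; fall back to the description.
        return {"body": full_text or description, "extended": ""}
    # Teaser goes to the body, the full article to the extended body.
    return {"body": description, "extended": full_text}


SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel><item>
<title>Hello</title>
<description>Short teaser</description>
<content:encoded><![CDATA[<p>Full text with a <a href="http://example.com/">link</a>.</p>]]></content:encoded>
</item></channel></rss>"""

item = ET.fromstring(SAMPLE).find("channel/item")
entry = map_item(item, body_only=True)
print(entry["body"])
```

With `body_only=True` the HTML and links from 'content:encoded' end up unchanged in the entry body, and the extended body stays empty so the user can split the article later by hand.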

Best regards,
Garvin.
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/

Post by romulus »

my WP installation is my HP ;)

www.romulus23.de; the RSS feed is www.romulus23.de/feed/rss2

Right now, WP delivers only 10 articles again. If you want to try it with all articles (actually I set it to deliver 1000 posts ;) ), let me know and I will set it to all again.

I haven't tried it yet, but if I enable the switch to not import text into the "extended entry", does it import all text with all links and HTML into the entry body?

Post by garvinhicking »

Hi Romulus!

romulus wrote: "my WP installation is my HP ;)"

Ack, now I remember you - the nickname alone didn't help enough ;)

I fixed some more things thanks to your feed. The categories look fine to me; just check whether your 1000-post feed also contains the '<category>' element.
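
One quick way to run that check is a small script like the following Python sketch. The function name `category_report` and the sample feed are assumptions for illustration; in practice you would pass in the XML text downloaded from your own feed URL.

```python
import xml.etree.ElementTree as ET


def category_report(feed_xml):
    """Return (number of items, titles of items without a <category>)."""
    root = ET.fromstring(feed_xml)
    items = root.findall("channel/item")
    missing = [
        item.findtext("title", default="(untitled)")
        for item in items
        if item.find("category") is None
    ]
    return len(items), missing


SAMPLE = """<rss version="2.0"><channel>
<item><title>With category</title><category>News</category></item>
<item><title>Without category</title></item>
</channel></rss>"""

total, missing = category_report(SAMPLE)
print(f"{total} items, {len(missing)} without <category>: {missing}")
```

If the list of items without a category is empty for the large feed as well, the feed itself is fine and the problem is more likely on the importing side.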

You should be able to fetch our autogenerated tarball on s9y.org, which will contain the latest patches. It should be built at 13:37 today. :)

Links/HTML is now imported properly!

Regards,
Garvin.

Post by romulus »

I have tested it now, and the new switch to fill only one body works perfectly. Thx :)

But my categories were again not assigned, although I imported only 10 articles from my existing feed. Have a look at my test installation:
www.romulus23.de/s9y

The first few (new) articles are all without a category, although the categories are named the same as in my old blog.
Why isn't it working?

Post by garvinhicking »

Hi!

I spotted the bug. Please open your serendipity_rss_exchange.inc.php and replace all references to $entry['category'] with $item['category']. Leave $entry['categories'] as is. There should be two occurrences, on lines 47 and 48.
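
The kind of mix-up described here can be sketched in Python as follows. This is only an illustration of the pattern, not the contents of serendipity_rss_exchange.inc.php, which are not shown in this thread; `build_entry` and `known_categories` are hypothetical names.

```python
def build_entry(item, known_categories):
    """Build a blog entry dict from a parsed feed item (hypothetical helper).

    `known_categories` maps category names to IDs of existing categories.
    """
    entry = {"title": item["title"], "categories": []}
    # The category name must be read from the feed item. Reading it from
    # `entry` instead would always find nothing, because the entry being
    # built has no 'category' key yet, so no category would ever match.
    name = item.get("category")
    if name in known_categories:
        entry["categories"].append(known_categories[name])
    return entry


known = {"News": 1, "Tech": 2}
entry = build_entry({"title": "Hello", "category": "Tech"}, known)
print(entry)
```

Reading from the wrong variable in such a lookup fails silently, which would match the symptom of entries being imported without any category.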

I also committed those changes to CVS...

Regards,
Garvin.

Post by romulus »

Thanks! It works now :)

Although I have found yet another (minor) bug: if the entered URL is not valid, you get the following PHP error:

Error on line 126 of /www/xxx/xxx/s9y/bundled-libs/Onyx/RSS.php: The specified file could not be opened.

Unfortunately, after that the message "Einträge erfolgreich importiert!" ("Entries imported successfully!") is still shown. It seems the import doesn't check whether the URL is valid.
Actually, I had only forgotten to begin the URL with "http://". Maybe you could add that automagically if it's missing?
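
A check of this kind could look roughly like the Python sketch below. It is an assumption about how such a check might be written, not the code that went into Serendipity; the function name `normalize_feed_url` is made up for this example.

```python
from urllib.parse import urlparse


def normalize_feed_url(raw):
    """Prepend "http://" if no scheme is given, then validate the result.

    Raises ValueError so the caller can show an error instead of a
    success message when the URL cannot be used.
    """
    url = raw.strip()
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid feed URL: {raw!r}")
    return url


print(normalize_feed_url("www.romulus23.de/feed/rss2"))
```

The importer would then report success only if this function returns without raising and the feed could actually be fetched and parsed.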

Post by garvinhicking »

Hi romulus!

Okay, I will add this check to our development branch for the next release when I have the time. :)

Great that it's working now. :)

Thanks for your help,
Garvin.