Joseph Scott

Problems With libxml2 For WordPress XML-RPC Users


Updates
3 Feb 2009 @ 2:00pm : Update on libxml2 issues

4 Mar 2009 @ 2:38pm : Conclusion of libxml2 Issues – Use PHP 5.2.9 & libxml2 2.7.3

17 Mar 2009 @ 3:05pm : WordPress & libxml2 Episode IV: A New Plugin

A gradually growing list of people have run into a very odd problem using XML-RPC methods in WordPress, where the left angle bracket ( < ) gets stripped. There's been a fair bit of discussion about this on ticket #7771. The bottom line: the behavior of the PHP XML extension when built against newer versions of libxml2 changed, such that left angle brackets get stripped when parsing XML.
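To make the symptom concrete, here is a minimal sketch in Python (used purely for illustration; the affected code path is PHP's XML extension). It shows that an XML-RPC payload carries the left angle bracket as the entity `&lt;`, that a correct parser decodes it back, and what the reported stripping looks like by comparison. The method name `demo.echo` and the helpers `simulate_stripping` and `looks_stripped` are assumptions made up for this example; the simulation only mimics the symptom and is not the actual libxml2 or PHP behavior.

```python
import xmlrpc.client

content = "<p>Hello <b>world</b></p>"

# Serialize an XML-RPC call; markup inside string values is entity-encoded.
payload = xmlrpc.client.dumps((content,), methodname="demo.echo")

# A correctly working parser decodes &lt; back into "<".
params, method = xmlrpc.client.loads(payload)
round_trip = params[0]


def simulate_stripping(text):
    """Hypothetical helper: mimic the reported symptom by dropping every '<'."""
    return text.replace("<", "")


def looks_stripped(sent, received):
    """Heuristic: the received text equals the sent text minus its '<' characters."""
    return "<" in sent and received == sent.replace("<", "")


broken = simulate_stripping(content)
```

Comparing what a client sent with what the blog stored, as `looks_stripped` does, is one rough way to tell this problem apart from other kinds of content mangling.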

There’s been some back and forth between the libxml2 folks (email list) and the PHP folks (bug 45996), with no real solution for those using the tainted versions of libxml2. So what are your options if you’ve got this problem? Here are two:

  • Stick with older, known-to-work versions of libxml2. Others have reported that libxml2 <= 2.6.32 works. I've personally only tested up to 2.6.30, which has been working fine for me.
  • Build the PHP XML module against the expat parser instead of libxml2.
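The version guidance can be sketched as a small check. This is a rough Python illustration under stated assumptions: the lower bound comes from the reports that libxml2 <= 2.6.32 works, the upper bound from the 4 Mar 2009 update recommending libxml2 2.7.3 with PHP 5.2.9, and everything in between is labeled "suspect" because only some of those versions are actually reported as broken. The function and constant names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.6.32' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


LAST_REPORTED_GOOD = (2, 6, 32)  # reported by others as working
FIRST_FIXED = (2, 7, 3)          # per the 4 Mar 2009 update, with PHP 5.2.9


def classify_libxml2(version_text):
    """Classify a libxml2 version against the ranges mentioned in this post."""
    version = parse_version(version_text)
    if version <= LAST_REPORTED_GOOD:
        return "reported working"
    if version >= FIRST_FIXED:
        return "fixed per update (with PHP 5.2.9)"
    # Assumption: anything in between is treated as suspect, not proven broken.
    return "suspect"
```

Comparing tuples of integers avoids the trap of comparing version strings as text, where "2.6.9" would sort after "2.6.32".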

Both of these options require some server admin abilities and know-how, making them unrealistic options for many WordPress users. Undoubtedly many hosting services will roll out these newer versions of libxml2 as part of their regular updates. This will leave some WordPress users with sudden errors that weren’t there before.

As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets). Short of that happening, perhaps the XML extension for PHP will need to change to work correctly with the new way that libxml2 works. Either way, I’d like to see PHP XML parsing work correctly again.

If you aren’t having any of these problems right now I recommend NOT upgrading libxml2 on your system until this has been sorted out.

Written by Joseph Scott

December 30th, 2008 at 6:01 pm


52 Responses to 'Problems With libxml2 For WordPress XML-RPC Users'


  1. Colin Barrett said, on December 30th, 2008 at 6:14 pm

    “As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets).”

    In the thread you linked to, Rob Richards mentioned[1] that it was due to a gross hack in the PHP XML extension that this broke:

    “So basically the extension was using voodoo code to get the entities to work as it wanted them to and it has finally caught up with it.”

    Instead of being snippy at the libxml2 guys, you should direct your anger towards the PHP maintainers.

    [1]: http://thread.gmane.org/gmane.comp.gnome.lib.xml.general/14595/focus=14610

    • Joseph Scott said, on December 31st, 2008 at 8:17 am

      It may be the case that PHP is using a gross hack, but that doesn’t change the fact that libxml2 2.6.30 works fine and libxml2 2.7.1 doesn’t, with the exact same version of PHP. Sounds like you didn’t read the rest of my post, where I clearly stated that the bottom line was to make PHP XML parsing work correctly. I don’t have an interest in “being snippy”, simply in getting things working again.

      At this point the problem is only going to get worse: as more hosts update libxml2, we’ll see more and more people complain that XML-RPC in WordPress is suddenly broken.

  2. Michael Moncur said, on December 30th, 2008 at 6:57 pm

    I was going to say that large chunks of your post were missing, making it unreadable in my RSS reader (Bloglines), but after clicking through here I found that Bloglines has a bug where it strips left angle brackets, and most of the paragraphs that follow them.

    I found that strangely ironic…

  3. Caleb said, on December 31st, 2008 at 2:05 am

    While I haven’t delved too deeply, it seems like both the libxml and PHP camps admit it was because PHP’s code was a hack that finally caught up with them. So to me it seems like the PHP folks should probably be the ones to either update their code to handle the fact that libxml is exhibiting more proper behavior and isn’t so easy to fool into acting like expat, or provide a patch that can enable a compatible mode. Either way it would be nice to see a fix; it’s been a known issue for way too long.

    I have some sites on Joyent; switching the PHP version from 5 to 4 worked to keep this vital feature working in the meantime.

  4. Matt said, on January 1st, 2009 at 11:06 pm

    And people say you shouldn’t use regex for these things. ;)

    • Joseph Scott said, on January 2nd, 2009 at 12:17 pm

      For parsing XML-RPC, regexes probably wouldn’t be too bad. For XML in general? That would probably be one monster of a regex.

  5. [...] with WordPress 2.7 XML-RPC: Joseph Scott reports on problems with libxml2 For WordPress XML-RPC users that he’s been finding on the WordPress Support Forums and elsewhere. The problem is not [...]

  6. [...] Scott reports on problems with libxml2 For WordPress XML-RPC users that he’s been finding on the WordPress Support Forums and [...]

  7. Sander said, on January 4th, 2009 at 8:09 am

    And who is working on it? Wordpress? or other developers?

    • Joseph Scott said, on January 4th, 2009 at 7:45 pm

      From what I can see it isn’t clear that anyone is working on fixing the situation with PHP and libxml2.

  8. [...] with WordPress 2.7 XML-RPC: Joseph Scott reports on problems with libxml2 For WordPress XML-RPC users that he’s been finding on the WordPress Support Forums and elsewhere. The problem is not [...]

  9. Ajay said, on January 8th, 2009 at 11:00 am

    Hi Joseph,

    Is there a solution to this in the end, or do we just wait it out?

    • Joseph Scott said, on January 8th, 2009 at 12:14 pm

      If your current setup is working, don’t upgrade libxml2. If it isn’t working, employ one of the workarounds (downgrade libxml2, or build against expat) and start bugging the PHP and libxml2 folks to get this sorted out. Someone from libxml2 and/or PHP needs to get this addressed.

  10. Ajay said, on January 8th, 2009 at 12:49 pm

    Current setup is messed up.

    For those interested, I came across http://blog.code-head.com/fixing-libxml-php-bug-and-issues-with-html-entities-downgrading-libxml which details the process of downgrading on cPanel servers.

    • Joseph Scott said, on January 11th, 2009 at 2:57 pm

      Thanks, looks like a good resource for cPanel users who run into this.

      • Ajay said, on January 11th, 2009 at 3:22 pm

        No problem. I ran the setup as mentioned in the link on my VPS and everything works perfectly now.

        Unfortunately, if you don’t have root access you’re kind of stuck :(

  11. [...] v2.7. Posting to the localhost Wordpress v2.7 does not have same problem. After some googling, I found the problem is caused by a new behavior of PHP XML library libxml2.dll not backward compatib…. My localhost is using libxml2.lib v2.6.26 which turns out to be OK. I wonder what libxml2.dll my [...]

  12. WPMU Tutorials » Bug affecting XMLRPC said, on January 9th, 2009 at 6:32 pm

    [...] Updated to add: Joseph Scott was one of the WP devs who worked on tracking down and testing libxml versions for the issue. He has lots of additional links on the bug here. [...]

  13. Pete Ware said, on January 10th, 2009 at 1:42 pm

    Here are my notes on downgrading libxml2 for OpenSuse 11.1. This assumes you have administrative rights on the box:

    http://www.peteware.com/blog/2009/01/fixing-libxml2-php-wordpress-and-the-missing-angle-brackets/

    • Joseph Scott said, on January 11th, 2009 at 2:55 pm

      Thanks for the link. Since downgrading libxml2 is a common fix for this issue it’s helpful to have how-to guides for various systems.

  14. [...] across this article by Joseph Scott. Joseph is one of the developers of WordPress. He writes: A gradually growing list of people have [...]

  15. ishara said, on January 14th, 2009 at 2:33 am

    This is a workaround patch for WordPress users, in case you cannot downgrade your libxml2 version or cannot wait for libxml2 to fix it later.

    http://blog.hoofoo.net/2009/01/14/wordpress-patch-for-problamatic-libxml2-version/

  16. david said, on January 14th, 2009 at 8:38 pm

    I’ve tried the patch on a test blog and it seems to work OK.

    I noticed, however, that … even though my system is reporting that libxml2 2.7.2 is installed, php is reporting (via phpinfo) that it thinks that 2.6.32 is installed.

    I have no idea why this is.

    • Joseph Scott said, on January 15th, 2009 at 10:33 am

      My concern for these patches that do global search and replace like that is the potential to mess up other types of data.

      Have you tried restarting your web server? That might get PHP to pick up on the correct library version.

      • Ajay said, on January 15th, 2009 at 12:53 pm

        Hi Joseph,

        Considering that there is no solution from the PHP folks and the libxml folks in the near future, is it possible for a similar workaround to be put into the WordPress core?

        • Joseph Scott said, on January 15th, 2009 at 3:51 pm

          It’s possible, but awkward. In general trying to fix your foundation while standing on it is not an approach I’d like to take. I’ll try to make some time to test out those patches.

  17. MGD King said, on January 26th, 2009 at 1:00 pm

    Anyone have any luck with the patches? I’ve tried both manually editing the files (the HOO FOO method I call it) and the zip file method (the AJAY method) and neither work. Can anyone shed some light my way?

    • Joseph Scott said, on January 28th, 2009 at 1:51 pm

      So far I’ve seen conflicting reports as to how well these patches work. I’m going to set up a test system to try them out and see for myself.

  18. Ajay said, on January 28th, 2009 at 4:46 pm

    I’m seeing more failures than successes so far… :(

  19. Update On libxml2 Issues || Joseph Scott said, on February 3rd, 2009 at 1:30 pm

    [...] the end of December I detailed problems people were seeing with WordPress, XML-RPC and libxml2. I’ve got good news, both PHP and libxml2 have been updated to fix the issue. You can send [...]

  20. [...] you can tell them exactly which versions do work. Feel free to point them this and previous posts (Problems With libxml2 For WordPress XML-RPC Users and Update On libxml2 Issues) if they want some history and context of the [...]

  21. [...] Update: thank God, I found the solution, which is via this link http://josephscott.org/archives/2008/12/problems-with-libxml2-for-wordpress-xml-rpc-users [...]

  22. Patrick Mackaaij said, on April 4th, 2009 at 1:41 pm

    I’ll contact my webhost to update the library but came by to thank you for your plugin. This works!
    http://wordpress.org/extend/plugins/libxml2-fix/

    I found several links to manual fixes which comment out certain code but that didn’t work out for me.

    So thank you!

  23. Bakawan Web Design | My Sticky Note said, on April 27th, 2009 at 5:54 pm

    [...] and “>” in the coding examples have been fixed following the instructions from blaszta, josep scott 1, Joseph scott 2, hoofoo. It turns out there is a problem with the process of posting via external tools such as [...]

  24. [...] did encounter an issue when I tried to migrate my old Blogger blog into the hosted WP. Apparently, there are problems with libxml2 for WP XML-RPC users. I’ve consulted the web hosting service about this and they suggested that I first import the [...]

  25. misiek said, on June 9th, 2009 at 9:43 am

    I still have this issue with the correct versions.

    Please check this topic:
    http://wordpress.org/support/topic/193720?replies=8

    or go here to check how the rss appears:

    http://maugustyniak.corpface.com/picasa/

    thanks for any suggestions

    • Joseph Scott said, on June 10th, 2009 at 10:52 am

      Double check that your PHP is actually built against the proper version of libxml2. I’ve seen some cases where libxml2 was updated, but PHP was still using an older version.
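One rough way to do that double check is to compare the libxml2 version the operating system reports with what shows up in PHP's own info output. The Python sketch below is only an illustration under stated assumptions: the sample text imitates the kind of line phpinfo() prints but real output varies by PHP build, the 2.7.2 versus 2.6.32 mismatch is the one reported in an earlier comment, and both function names are hypothetical.

```python
import re


def libxml2_versions_from_phpinfo(text):
    """Collect dotted version numbers from lines that mention libxml."""
    versions = []
    for line in text.splitlines():
        if "libxml" in line.lower():
            versions.extend(re.findall(r"\b\d+\.\d+\.\d+\b", line))
    return versions


def php_matches_system(phpinfo_text, system_version):
    """True only if PHP reports at least one version and all equal the system's."""
    reported = libxml2_versions_from_phpinfo(phpinfo_text)
    return bool(reported) and all(v == system_version for v in reported)


# Illustrative sample only; not copied from a real server.
sample = """\
XML Support => active
libxml2 Version => 2.6.32
"""
```

With this sample, a system package at 2.7.2 would not match, which is the situation where PHP is still using an older library than the one installed.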

  26. misiek said, on June 10th, 2009 at 11:01 am

    I checked that in phpinfo().

    It uses the correct version.

    • Joseph Scott said, on June 10th, 2009 at 1:45 pm

      Since you are using fetch_rss(), did you check to see if it was using a cached copy of the feed? If it cached the feed while having libxml2 problems then it might still be using it.

  27. misiek said, on June 10th, 2009 at 2:48 pm

    very interesting !

    I replaced the url of the rss from the same user picasa account and this WP took it.

    Caching of the old RSS may be possible, but I have no idea how to clear it.

    Thanks for the help and the responses.

  28. misiek said, on June 10th, 2009 at 3:51 pm

    I figured out that it caches in the DB, in the wp_options table. I found my record, but first I changed the URL of the RSS feed to see whether it actually caches it there. A new record showed up, awesome.

    So I removed the new record and that bad one.

    Now it does not cache at all, and my RSS feed won’t show up because fetch_rss() returns false.

    I wonder why it doesn’t want to cache new feeds.

    Any ideas?

    I know that this is not the right place to ask about that, but you may know the solution.

    Thanks

    • Joseph Scott said, on June 10th, 2009 at 8:22 pm

      I’d have to look through that chunk of code to figure out why it isn’t trying to refresh the cache for that feed. If I were to guess there might be another option that lists what feeds have been cached and where in the options table they are. If that’s the case and you only deleted the cached feeds then it might think that the feed is cached, but then find nothing in the cache.

  29. misiek said, on June 10th, 2009 at 8:52 pm

    Yeah, there is another option, a timestamp for that feed. Deleting it didn’t help; I will look at it again tomorrow.

    I just thought that there were WP functions which deal with caching.

    I also played with the code a little bit and found cache_age vars (or similar) which may be responsible for keeping the cache, but there must be something else, like you said.

  30. misiek said, on June 10th, 2009 at 10:06 pm

    wheee

    I finally figured it out. I went through the code and displayed the status var of the rss object, which said 500. Then I could finally display the error, which said the 2s timeout limit for fetching the RSS had been hit. That made sense, because this RSS feed is pretty big, so increasing the timeout to 5s fixed the problem :)

    Thanks, Joseph, for helping figure out the problem.

