Commit Graph

6 Commits

Author SHA1 Message Date
Brion Vibber fedfde9bbb Bookmark plugin: fixes for bad DOM element nesting in delicious import data
Delicious bookmark exports use the godawful HTML bookmark file format that ancient versions of Netscape used (and which has thus been the common import/export format for bookmarks since the dark ages of the web :)
This arranges bookmark entries as an HTML definition list, using a lot of implied close tags (leaving off the </dt> and </dd>).
DOMDocument->loadHTML() uses libxml2's HTML mode, which generally does ok with muddling through things but apparently is really, really bad about handling those implied close tags.

Sequences of adjacent <dt> elements (e.g. a bookmark without a description followed by another bookmark, "<dt><dt>") end up interpreted as nested ("<dt><dt></dt></dt>") instead of as siblings ("<dt></dt><dt></dt>").
The first round of code tried to resolve the nesting inline, but ended up a bit funky in places.
I've replaced this with a standalone run through the data to re-order the elements, based on our knowing that <dt> and <dd> cannot directly contain one another; once that's done, our main logic loop can be a bit cleaner. I'm not 100% sure it's doing nested sublists correctly, but these don't seem to show up in delicious export (and even if they do, with the way we flatten the input it shouldn't make a difference).
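
The standalone pass described above can be illustrated with a small sketch. This is not the plugin's code (the plugin is PHP and works on a DOMDocument tree); it is a Python illustration under stated assumptions, where the `Node` class and the `flatten` function are hypothetical names invented for the example. It relies only on the rule stated in the message: <dt> and <dd> cannot directly contain one another, so any <dt> or <dd> found inside another one is hoisted out to become a sibling, keeping document order.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for a parsed DOM element; not part of any real library.
@dataclass
class Node:
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

ITEM_TAGS = ("dt", "dd")  # elements that may not directly contain one another

def flatten(node: Node) -> List[Node]:
    """Return every <dt>/<dd> under `node` as a flat list in document order."""
    out: List[Node] = []
    for child in node.children:
        if child.tag in ITEM_TAGS:
            # Keep the item's ordinary content (links, text wrappers) ...
            own = [c for c in child.children if c.tag not in ITEM_TAGS]
            # ... and pull wrongly nested items out to sit after it as siblings.
            nested = [c for c in child.children if c.tag in ITEM_TAGS]
            out.append(Node(child.tag, child.text, own))
            out.extend(flatten(Node("", children=nested)))
        else:
            # Other containers (e.g. a nested list) are simply searched through.
            out.extend(flatten(child))
    return out

if __name__ == "__main__":
    # Mis-parsed input: bookmark "A" has no description, so "B" got nested in it.
    dl = Node("dl", children=[
        Node("dt", "A", [Node("dt", "B", [Node("dd", "desc B")])]),
    ])
    print([(n.tag, n.text) for n in flatten(dl)])
    # [('dt', 'A'), ('dt', 'B'), ('dd', 'desc B')]
```

With the items flattened this way, a main loop can treat each <dt> as a bookmark and an immediately following <dd>, if any, as its description. Because nested sublists are searched through rather than preserved, their items end up in the same flat sequence, in line with the flattening the message mentions.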

Also fixed a clearer edge case where some bookmarks didn't get imported when missing descriptions.
2010-12-31 12:09:54 -08:00
Evan Prodromou ca28140107 remove debugging outputter from delicious backup importer 2010-12-26 21:10:54 -08:00
Evan Prodromou ccb290cb68 Break up delicious import into a queue manager by bookmark 2010-12-21 11:09:01 -05:00
Evan Prodromou 331639d6e4 Code standards for deliciousbackupimporter.php 2010-12-21 09:42:44 -05:00
Evan Prodromou 704a20f58b some corrections for double-posting of bookmarks 2010-12-20 13:39:07 -05:00
Evan Prodromou 510e79a96c Starting point for adding bookmarks 2010-12-20 12:04:02 -05:00