Parsing hcards for the data we need wasn't hard enough to justify using
hkit. It was dependent on a number of external systems (something to
run tidy), and only could handle XHTML.
We now parse HTML with the PHP dom libraries used elsewhere, and
scrape out our own hcards. Seems to work nicer and faster and most of
all works with Google Buzz profile URLs.
* Subscription::start was sometimes passing users instead of profiles to hooks, which broke OStatus subscription notifications; now normalizing to profiles for processing.
* H-card parsing would trigger a lot of PHP warnings and notices in hKit. Now suppressing warnings and notices for the duration of the call to keep them out of output when display_errors is on.
* H-card parsing would trigger a PHP fatal error if the source page was not well-formed XML and Tidy was not present on the system. Switched normalization to use the PHP DOM module which is always present, as we have no need for Tidy's extra features here.
* Trying to fetch avatars from Google profiles failed and triggered a PHP warning due to the relative URL not being resolved during h-card parsing. Now passing profile page URL into hKit by sneaking a <base> tag in while we normalize the HTML source.
* Profile pages without a "Link" header could trigger PHP notices due to a bad NULL -> array(NULL) conversion in LinkHeader::getLink(). Now checking that there was a return value before converting single return value into array.
Base problem is that our caching-on-insert interferes with relying on column default values; the cached object is missing those fields, so they appear to be empty (null) when the object is retrieved from cache.
Now explicitly setting them when inserting subscriptions, and cleaned up some code that had alternate code paths.
May also have made auto-subscription work for remote OStatus subscribers, but can't test until magic sigs are working again.
Under MySQL, new tables will be created as InnoDB with UTF-8 (utf8/utf8_bin) same as core tables.
Existing plugin tables will have table engine and default charset/collation updated, and string columns will have charset updated, at checkschema time.
Switched from 'DESCRIBE' to INFORMATION_SCHEMA for pulling column information in order to get charset. A second hit to INFORMATION_SCHEMA is also needed to get table properties.
Indices were only being created at table creation time, which ain't so hot. Now also adding/dropping indices when they change.
Fixed up some schema defs in OStatus plugin that were a bit flaky, causing extra alter tables to be run.
TODO: Generalize this infrastructure a bit more up to base schema & pg schema classes.