If display_errors is on, typical settings would cause PHP error messages to spew to output before the HTTP headers for setting a 400 error go through.
Also switched from deprecated static DOMDocument::loadXML() to non-static call.
Was triggering errors due to use of common_canonical_nickname() on arbitrary input without checking for exceptions about invalid nicknames (which didn't exist long ago in the before time)
UserActivityStream -- used to create a full activity stream including subscriptions, favorites, notices, etc -- normally buffers everything into memory at once. This is infeasible for accounts with long histories of serious usage; it can take tens of seconds just to pull all records from the database, and working with them all in memory is very likely to hit resource limits.
This commit adds an alternate mode for this class which avoids pulling notices until during the actual output. Instead of pre-sorting and buffering all the notices, empty spaces between the other activities are filled in with notices as we're making output. This means more smaller queries spread out during operations, and less stuff kept in memory.
Callers (backupaccount action, and backupuser.php) which can stream their output pass an $outputMode param of UserActivityStream::OUTPUT_RAW, and during getString() it'll send straight to output as well as slurping the notices in this extra funky fashion.
Other callers will let it default to the OUTPUT_STRING mode, which keeps the previous behavior.
There should be a better way to do this, swapping out the stringer output for raw output more consitently.
file_quota is adjusted from the defined value to take into account the maximum upload size limits in PHP, or cropped to 0 if uploads are disabled.
This can be used by client apps to determine maximum size for an attachment.
Search highlighting was being done with a regex on raw HTML text, followed by a second regex undoing replacements within double-quoted attribute values.
This broke on imported Twitter messages, as the way we generate the markup uses single quotes on the attributes, which didn't get matched by the second regex.
I've replaced this do-then-undo cycle by dividing up the import HTML into freetext spans and tags; the freetext gets replaced, while the tags are left untouched.