common_shorten_links() could only access the web session's logged-in user, so it never properly took user options into account when posting via XMPP, API, mail, etc.
Adds an optional $user parameter on common_shorten_links(), and a $user->shortenLinks() as a clearer interface for that.
Tweaked some lower-level functions so $user gets passed down, which also generalizes the $notice_id param that was previously there for saving URLs at notice save time.
Note also ticket #2919: there's a lot of duplicate code calling the shortening, checking the length, and reporting near-identical error messages. These should be consolidated to aid in code and translation maintenance.
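A minimal sketch of the optional-$user idea, not the actual common_shorten_links() code. The helper names, the 'max_url_length' key, the default of 30 and the example.org placeholder "shortener" are all assumptions for illustration:

```php
<?php
// Hypothetical stand-in for the web session's logged-in user lookup.
function current_session_user(): ?array
{
    return null; // e.g. no web session when posting via XMPP or mail
}

// Hypothetical: shorten URLs longer than the given user's limit.
function shorten_links_for(string $text, ?array $user = null): string
{
    // Fall back to the session user only when no user was passed in.
    $user = $user ?? current_session_user();
    $maxLen = $user['max_url_length'] ?? 30; // assumed default

    return preg_replace_callback(
        '#https?://\S+#',
        function (array $m) use ($maxLen): string {
            if (strlen($m[0]) <= $maxLen) {
                return $m[0];
            }
            // Placeholder only: a real shortener would call a service.
            return 'http://example.org/s/' . substr(md5($m[0]), 0, 6);
        },
        $text
    );
}
```

Callers outside a web session (queue handlers, mail, XMPP) would pass the posting user explicitly instead of relying on the fallback.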
Code was doing a batch call to $avatar->delete() which fails to properly engage the file deletion code. Calling the existing profile->delete_avatars() function deletes them individually, which makes it all work nice again.
This option may be useful for intranet sites that don't have direct access to the internet, as they may be unable to successfully fetch those resources.
Newly supported:
- TwitPic: added a local function using TwitPic's API, since the oohembed implementation for TwitPic produced invalid output which Services_oEmbed rejects. (bug filed upstream)
Tweaked...
- Flickr: works, now using whitelist to use their endpoint directly instead of going through oohembed
- YouTube: worked around a bug in Services_oEmbed which broke the direct use of API discovery info, so we don't have to use oohembed.
Not currently working...
- YFrog: whitelisting their endpoint directly as the oohembed output is broken, but this doesn't appear to work currently as I think things are confused by YFrog's servers giving a '204 No Content' response on our HEAD checks on the original link.
The old code attempted to compare the value of the notice.created field against now() directly, which tends to explode in our current systems. now() generally comes up in the server/connection local timezone, while the created field is currently set as hardcoded UTC from the web servers. This would lead to breakage when we got a difference in seconds that's several hours off in either direction (depending on the local timezone). New code calculates a threshold by subtracting the number of seconds from the current UNIX timestamp and passing that in the correct format for a simple comparison. As a bonus, this should also be more efficient, as it should be able to follow the index on profile_id and created.
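Roughly, the threshold calculation looks like the sketch below. The function name, the injectable $now parameter and the query text are illustrative assumptions, not the committed code:

```php
<?php
// Build a UTC cutoff string from the current UNIX timestamp.
// $now is injectable so the calculation can be checked deterministically.
function created_threshold(int $secondsAgo, ?int $now = null): string
{
    $now = $now ?? time();
    // gmdate() always formats in UTC, regardless of the server timezone.
    return gmdate('Y-m-d H:i:s', $now - $secondsAgo);
}

// Hypothetical query fragment; column names follow the message above.
$sql = sprintf(
    "SELECT COUNT(*) FROM notice WHERE profile_id = %d AND created > '%s'",
    42,
    created_threshold(3600)
);
```

Because the cutoff is a plain constant by the time it reaches the database, no now() call or timezone conversion is involved in the comparison.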
* moved some translator comments that were not directly above the line with the message to the correct location.
* i18n for UI text.
* superfluous whitespace removed.
I've consolidated the checks for which user to use for single-user mode into User::singleUser(), which now uses the configured nickname by preference, falling back to the site owner if it's unset.
This is now called consistently from the places that needed to use the primary user's nickname in routing setup.
Setting $config['singleuser']['nickname'] should now work again as expected.
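The lookup order can be sketched as follows. This is a simplified stand-in for User::singleUser() that works on plain values; the function name and the exception for the "neither is set" case are assumptions:

```php
<?php
// Hypothetical resolver: prefer $config['singleuser']['nickname'],
// fall back to the site owner's nickname if that is unset.
function single_user_nickname(array $config, ?string $ownerNickname): string
{
    $configured = $config['singleuser']['nickname'] ?? '';
    if ($configured !== '') {
        return $configured;
    }
    if ($ownerNickname !== null) {
        return $ownerNickname;
    }
    // Assumed behavior: fail loudly when there is nothing to fall back to.
    throw new RuntimeException('No configured single user and no site owner.');
}
```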
Doesn't clear all possible cached entries, but this should get the ones that matter most: lookups by id, nickname, and alias. This should ensure that if a group name gets reused as a new group or alias, it should work properly.
There are some user-visible areas that aren't cleared, such as the 'top groups' lists on the GroupsAction sidebar; if a deleted group appears in those lists it'll go away within an hour when the cached query expires.
When bogus SSL sites etc were hit through a shortening redirect, sometimes link resolution kinda blew up and the user would get a "Can't linkify" error, aborting their post.
Now catching this case and just passing through the URL without attempting to resolve it. Could benefit from an overall scrubbing of the freaky link/attachment code though...! :)
http://status.net/open-source/issues/2513
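The pass-through behavior amounts to something like this sketch. resolve_redirects() and the bad-ssl.example host are hypothetical stand-ins for the real link resolution code:

```php
<?php
// Hypothetical resolver that follows redirects; here it simply fails for
// a host we pretend has a bogus SSL certificate.
function resolve_redirects(string $url): string
{
    if (strpos($url, 'https://bad-ssl.example/') === 0) {
        throw new RuntimeException('SSL certificate problem');
    }
    return $url;
}

function safe_resolve(string $url): string
{
    try {
        return resolve_redirects($url);
    } catch (Exception $e) {
        // Pass the original URL through rather than aborting the post.
        return $url;
    }
}
```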
SubMirror: redid add-mirror frontend to accept a feed URL, then pass that on to OStatus, instead of pulling from your subscriptions.
Profile: tweaked subscriberCount() so it doesn't subtract 1 for foreign profiles who aren't subscribed to themselves; instead excludes the self-subscription in the count query.
Memcached_DataObject: tweak to avoid extra error spew in the DB error raising
Work in progress: tweaking feedsub garbage collection so we can count other uses
On my test setup, this fixes inbox delivery to 10,000 local recipients from a background queuedaemon running with a 32mb memory limit; the job completes within a minute from start.
Pretty much everything in File and File_redirection initial processing needs to be rewritten to be non-awful; this code is very hard to follow and very easy to make huge bugs. A fair amount of the complication is probably obsoleted by the redirection following being built into HTTPClient now.
* Fake_XMPP back to Queued_XMPP, refactor how we use it and don't create objects and load classes until we need them.
* fix fatal error in IM settings while waiting for a Jabber confirmation.
* Caching fix for user_im_prefs
* fix for saving multiple transport settings
* some fixes for AIM & using normalized addresses for lookups
Users and administrators can set how long a URL can be before it's
shortened, and how long a notice can be before all its URLs are
shortened. They can also turn off shortening altogether.
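A rough sketch of how those three settings could combine. The preference keys ('enabled', 'max_url_length', 'max_notice_length') are invented for illustration and are not the real setting names:

```php
<?php
// Hypothetical decision helper for a single URL inside a notice.
function should_shorten(string $url, string $notice, array $prefs): bool
{
    if (empty($prefs['enabled'])) {
        return false;                 // shortening turned off altogether
    }
    if (strlen($notice) > $prefs['max_notice_length']) {
        return true;                  // long notice: shorten every URL
    }
    return strlen($url) > $prefs['max_url_length'];
}
```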
Squashed commit of the following:
commit d136b39011
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:39:00 2010 -0400
use site and user settings to determine when to shorten URLs
commit 1e1c851ff3
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:38:40 2010 -0400
add a method to force shortening URLs
commit 4d29ca0b91
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:37:41 2010 -0400
static method for getting best URL shortening service
commit a9c6a3bace
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:37:11 2010 -0400
allow 0 in numeric entries in othersettings
commit 767ff2f7ec
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:36:46 2010 -0400
allow 0 or blank string in inputs
commit 1e21af42a6
Author: Evan Prodromou <evan@status.net>
Date: Mon Apr 26 02:01:11 2010 -0400
add more URL-shortening options to othersettings
commit 869a6be0f5
Author: Evan Prodromou <evan@status.net>
Date: Sat Apr 24 14:22:51 2010 -0400
move url shortener superclass to lib from plugin
commit 9c0c9863d5
Author: Evan Prodromou <evan@status.net>
Date: Sat Apr 24 14:20:28 2010 -0400
documentation and whitespace on UrlShortenerPlugin
commit 7a1dd5798f
Author: Evan Prodromou <evan@status.net>
Date: Sat Apr 24 14:05:46 2010 -0400
add defaults for URL shortening
commit d259c37ad2
Author: Evan Prodromou <evan@status.net>
Date: Sat Apr 24 13:40:10 2010 -0400
Add User_urlshortener_prefs
Add a table for URL shortener prefs, a corresponding class, and the
correct mumbo-jumbo in statusnet.ini to make everything work.
* Moved notification sending from Notice::saveReplies to distrib queue handler, so it'll pull from the reply set we've saved regardless of how we got it.
* Set up gettext infrastructure for command-line scripts; gets localization of mail notifications etc. working from background queues.
* Adjusted locale switching: common_switch_locale() works at runtime for bg scripts, forces a message catalog update
Conflicts:
actions/imsettings.php
lib/jabber.php
Made a quick attempt to merge the new JID validation into the XmppPlugin, have not had a chance to test that version live yet.
Should also move over the test cases.
Should help with dupes that come in when inbox distrib jobs die and get restarted, etc.
Conflicts:
classes/Inbox.php
Looks like this was implemented on master recently and not copied up to testing. Merging to my version on testing as I've added some doc comments and extracted a couple functions for future ease of use.
to (profile_id, id) instead of (profile_id, created, id).
It's been falling back to PRIMARY instead, which is really
very inefficient for a profile that hasn't posted in a few
months. Even though forcing the index will cause a filesort,
it's usually going to be better. Even for large profiles it
seems much faster than the badly-indexed query.
The magic __call() method is used to implement a getter and setter interface, and simply didn't bother to throw an error for things it didn't recognize.
This may expose a number of existing errors where mistyped method names are called and we're not noticing that they're failing.
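For illustration, a getter/setter __call() that refuses unknown names might look like the sketch below. The AccessorDemo class and its single 'nickname' field are made up; only the overall pattern reflects the change described:

```php
<?php
class AccessorDemo
{
    private $fields = ['nickname' => null];

    public function __call($name, $args)
    {
        // Map getFoo()/setFoo($v) onto the 'foo' field.
        if (preg_match('/^(get|set)(\w+)$/', $name, $m)) {
            $field = strtolower($m[2]);
            if (array_key_exists($field, $this->fields)) {
                if ($m[1] === 'get') {
                    return $this->fields[$field];
                }
                $this->fields[$field] = $args[0] ?? null;
                return $this;
            }
        }
        // Unrecognized names used to fall through silently.
        throw new BadMethodCallException("Unknown method: $name");
    }
}
```

With this in place, a typo such as getNicknam() fails immediately instead of quietly returning null.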
This bug was hitting a number of places where we had the pattern:
    $dbo->find();
    while ($dbo->fetch()) {
        $x = clone($dbo);
        // do anything with $x other than storing it in an array
    }
The cloned object's destructor would trigger on the second run through the loop, freeing the database result set -- not really what we wanted.
(Loops that stored the clones into an array were fine, since the clones stay in scope in the array longer than the original does.)
Detaching the database result from the clone lets us work with its data without interfering with the rest of the query.
In the unlikely event that somebody is making clones in the middle of a query, then trying to continue the query with the clone instead of the original object, well they're gonna be broken now.
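The detach-on-clone idea can be shown with a toy model. FakeResult and Row are hypothetical classes standing in for the driver result set and the data object; the real fix lives in the data object's own clone handling:

```php
<?php
// Hypothetical stand-in for a driver-level result set.
class FakeResult
{
    public $rows;
    public $freed = false;
    public function __construct(array $rows) { $this->rows = $rows; }
}

class Row
{
    public $id;
    private $result; // shared handle to the in-progress query

    public function find(array $rows): void
    {
        $this->result = new FakeResult($rows);
    }

    public function fetch(): bool
    {
        if ($this->result === null || $this->result->freed || !$this->result->rows) {
            return false;
        }
        $this->id = array_shift($this->result->rows);
        return true;
    }

    public function __clone()
    {
        // Detach: the copy keeps the row data but not the result handle.
        $this->result = null;
    }

    public function __destruct()
    {
        if ($this->result !== null) {
            $this->result->freed = true; // frees the result set
        }
    }
}

$dbo = new Row();
$dbo->find([1, 2, 3]);
$seen = [];
while ($dbo->fetch()) {
    $x = clone $dbo;   // the previous clone is destroyed on reassignment
    $seen[] = $x->id;
}
```

Without the __clone() method, the first clone's destructor would mark the shared result as freed on the second pass and the loop would stop after two rows.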
* Subscription::start was sometimes passing users instead of profiles to hooks, which broke OStatus subscription notifications; now normalizing to profiles for processing.
* H-card parsing would trigger a lot of PHP warnings and notices in hKit. Now suppressing warnings and notices for the duration of the call to keep them out of output when display_errors is on.
* H-card parsing would trigger a PHP fatal error if the source page was not well-formed XML and Tidy was not present on the system. Switched normalization to use the PHP DOM module which is always present, as we have no need for Tidy's extra features here.
* Trying to fetch avatars from Google profiles failed and triggered a PHP warning due to the relative URL not being resolved during h-card parsing. Now passing profile page URL into hKit by sneaking a <base> tag in while we normalize the HTML source.
* Profile pages without a "Link" header could trigger PHP notices due to a bad NULL -> array(NULL) conversion in LinkHeader::getLink(). Now checking that there was a return value before converting single return value into array.
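The last fix in the list boils down to checking for a missing value before wrapping it in an array. A sketch, with find_link_header() and get_links() as hypothetical simplifications of the LinkHeader code:

```php
<?php
// Hypothetical lookup returning a single value, or null when nothing matches.
function find_link_header(array $headers, string $rel)
{
    return $headers[$rel] ?? null;
}

function get_links(array $headers, string $rel): array
{
    $value = find_link_header($headers, $rel);
    if ($value === null) {
        return array();              // not array(null)
    }
    return is_array($value) ? $value : array($value);
}
```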
Base problem is that our caching-on-insert interferes with relying on column default values; the cached object is missing those fields, so they appear to be empty (null) when the object is retrieved from cache.
Now explicitly setting them when inserting subscriptions, and cleaned up some code that had alternate code paths.
May also have made auto-subscription work for remote OStatus subscribers, but can't test until magic sigs are working again.
While deletion is in progress, the account is locked with the 'deleted' role, which disables all actions with rights control.
Todo:
* Pretty up the notice on the profile page about the pending delete. Show status?
* Possibly more thorough account disabling, such as disallowing all use for login and access.
* Improve error recovery; worst case is that an account gets left locked in 'deleted' state but the queue jobs have gotten dropped out. This would leave the username in use and any undeleted notices in place.
It's not currently used, and won't be efficient when we update the notice.profile_id_idx index to optimize for our id-based sorting when pulling user post lists for profile pages, feeds etc.
On my test system (without memcache), while testing the LDAP
authentication plugin, when I sign in for the first time, triggering
auto-registration, I get these messages in the output page:
Warning: ksort() expects parameter 1 to be array, null given in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 219
Warning: Invalid argument supplied for foreach() in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 224
Warning: assert() [function.assert]: Assertion failed in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 241
(plus two "Cannot modify header information..." messages as a result of
the above warnings)
This change appears to fix this (although I can't really explain exactly
why).
Also stripping id from foreign HTML messages (could interfere with UI) and disabled failing attachment popup for a.attachment links that don't have a proper id, so you can click through instead of getting an error.
Issues:
* any other links aren't marked and saved
* inconsistent behavior between local and remote attachments (local displays in lightbox, remote doesn't)
* if the enclosure'd object isn't referenced in the content, you won't be offered a link to it in our UI
We only need one author for user feeds: the user themselves. So, show
the user as the activity:subject, and don't repeat the same
activity:actor for every notice unnecessarily.
In a federated system, "@nickname" is insufficient to uniquely
identify a user. However, it's a very convenient idiom. We need to
guess from context who 'nickname' refers to.
Previously, we were using the sender's profile (or what we knew about
them) as the only context. So, we assumed that they'd be mentioning
someone they followed, or someone who followed them, or someone on
their own server.
Now, we include the notice information for context. We check to see if
the notice is a reply to another notice, and if the author of the
original notice has the nickname 'nickname', then the mention is
probably for them. Alternately, if the original notice mentions someone
with nickname 'nickname', then this notice is probably referring to
_them_.
Doing this kind of context sleuthing means we have to render the
content very late in the notice-saving process.
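A sketch of the context lookup, using plain arrays rather than Notice and Profile objects. The array shapes, the function name, and the exact order in which the notice context and the sender context are tried are assumptions:

```php
<?php
// Hypothetical shapes: a notice is an array with 'author' and 'mentions';
// each profile is an array with 'id' and 'nickname'.
function guess_mention(string $nickname, ?array $parent, array $senderContacts): ?array
{
    if ($parent !== null) {
        // 1. Author of the notice being replied to.
        if ($parent['author']['nickname'] === $nickname) {
            return $parent['author'];
        }
        // 2. Someone the original notice mentioned.
        foreach ($parent['mentions'] as $profile) {
            if ($profile['nickname'] === $nickname) {
                return $profile;
            }
        }
    }
    // 3. The sender's own context (follows, followers, local users).
    foreach ($senderContacts as $profile) {
        if ($profile['nickname'] === $nickname) {
            return $profile;
        }
    }
    return null;
}
```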
We add a local_group table to store data about local groups. It has
the unique key for nickname, so /group/<nickname> looks up here.
Updated DB data object classes and data files.
- added rel="ostatus:attention" links for group delivery
- added events for plugins to override group profile/permalink pages
- pulled Notice::saveGroups up to save-time so we can override;
it's relatively cheap and gives us a clean list of target
groups for distrib time even with customized delivery.
- fixed notice::getGroups to return group objects as expected
- added some doc on new parameters to Notice::saveNew
- 'groups' list of group IDs to push to in place of parsing
- messages that come in via PuSH and contain local group targets
are delivered to local group members
- messages that come in via PuSH and contain remote group targets
are delivered to local members of the remote group
Todo:
- handle group posts that only come through Salmon
- handle conflicts in case something comes in both through Salmon and PuSH
- better source verification
- need a cleaner interface to look up groups by URI
- need a way to handle remote groups with conflicting names
Combined the code that finds mentions of other profiles into one place.
common_find_mentions() finds mentions and calls hooks to allow
supplemental syntax for mentions (like OStatus).
common_linkify_mentions() links mentions.
common_linkify_mention() links a mention.
Notice::saveReplies() now uses common_find_mentions() instead of
trying to parse everything again.
The subs_* functions in subs.php have made a lot of assumptions
about users versus profiles. I've refactored the functions to
be methods of the Subscription class instead, and to use Profile
objects throughout.
Some of the checks for blocks or existing subscriptions depended
on users or profiles, so I've moved those methods around a bit.
I've left stubs for the subs_* functions until we get time to replace
them.
- Multiplexing queues into groups and for multiple sites.
- Sharing vs breakout configurable per site and per queue via $config['queue']['breakout']
- Detect how many times a message is redelivered, discard if it's killed too many daemons
- count configurable with $config['queue']['max_retries']
- can dump the items to files in $config['queue']['dead_letter_dir']
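The redelivery check can be pictured roughly like this. handle_delivery(), the in-memory counter array, the '.msg' file naming and the "more than max_retries" cutoff are illustrative assumptions; only the two config keys come from the notes above:

```php
<?php
// $deliveries: hypothetical map of message id => delivery count so far.
function handle_delivery(string $id, string $body, array &$deliveries, array $config): string
{
    $deliveries[$id] = ($deliveries[$id] ?? 0) + 1;
    $max = $config['queue']['max_retries'];

    if ($deliveries[$id] > $max) {
        // Too many redeliveries: optionally keep a copy, then drop it.
        $dir = $config['queue']['dead_letter_dir'] ?? null;
        if ($dir) {
            file_put_contents($dir . '/' . $id . '.msg', $body);
        }
        return 'discarded';
    }
    return 'process';
}
```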
Queue daemon memory & resource leak fixes:
- avoid unnecessary reconnections to memcached server (switch persistent connections back in on second initialization, assuming it's a child process)
- monkey-patch for leaky .ini loads in DB_DataObject::databaseStructure() - was leaking 200k per active switch
- applied leak fixes to Status_network as well, using intermediate base Safe_DataObject for both it and Memcached_DataObject
Misc queue fixes:
- correct handling of child processes exiting due to signal termination instead of regular exit
- shutdown instead of infinite respawn loop if we're already past the soft memory limit at startup
- Added --all option for xmppdaemon... still opens one xmpp connection per site that has xmpp active
Cache updates:
- add Cache::increment() method with native support for memcached atomic increment
statusnet.links.ini file could not be read anymore due to the entry for nonce containing a comma in its key value.
PHP's parse_ini_file() function no longer allows commas in keys, and rejects the *ENTIRE FILE* if it's present, breaking various automatic joins.
* detection of group feeds is currently a nasty hack based on presence of '/groups/' in URL -- should use some property on the feed?
* listing for the remote group is kinda cruddy; needs to be named more cleanly
* still need to establish per-author profiles (easier once we have the updated Atom code in)
* group delivery probably not right yet
* saving of group messages still triggering some weird behavior
Added support for since_id and max_id on group timeline feeds as a free extra. Enjoy!
* Treat linkless feed posts as status updates; drop the "New post:" prefix and quotes on them.
* Use stable user IDs for atom/rss2 feed links instead of unstable nicknames
* Pull Atom feed preferentially when subscribing -- can now put the remote user's profile page straight into the feed subscription form and get to the right place.
* Clean up naming for push endpoints
No change in efficiency for the common case where nothing's deleted: does the same bulk fetch of just the notices we think we'll need as before, then if we turned up short keeps checking one by one until we've filled up to our $limit.
This can leave us with overlap between pages, but we already have that when new messages come in between clicks; seems to be the lesser of evils versus not getting a 'before' button.
More permanent fix for that will be to switch timeline paging in the UI to use notice IDs.
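The fetch-then-top-up approach, sketched over a plain list of IDs. fill_page() and the $exists callback are hypothetical; the real code works against the notice stream and cache:

```php
<?php
// $ids: candidate notice IDs, newest first. $exists: hypothetical callback
// telling whether a notice is still present (i.e. not deleted).
function fill_page(array $ids, int $offset, int $limit, callable $exists): array
{
    // Bulk pass: just the slice we expect to need.
    $page = array_values(array_filter(array_slice($ids, $offset, $limit), $exists));

    // Top-up pass: only does work if deletions left us short.
    $next = $offset + $limit;
    while (count($page) < $limit && $next < count($ids)) {
        if ($exists($ids[$next])) {
            $page[] = $ids[$next];
        }
        $next++;
    }
    return $page;
}
```

Note that the topped-up items also fall inside the next page's slice, which is the page overlap mentioned above.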