In a federated system, "@nickname" is insufficient to uniquely
identify a user. However, it's a very convenient idiom. We need to
guess from context who 'nickname' refers to.
Previously, we were using the sender's profile (or what we knew about
them) as the only context. So, we assumed that they'd be mentioning to
someone they followed, or someone who followed them, or someone on
their own server.
Now, we include the notice information for context. We check to see if
the notice is a reply to another notice, and if the author of the
original notice has the nickname 'nickname', then the mention is
probably for them. Alternately, if the original notice mentions someone
with nickname 'nickname', then this notice is probably referring to
_them_.
Doing this kind of context sleuthing means we have to render the
content very late in the notice-saving process.
- add event hooks to profile update pings
- send Salmon pings with custom update-profile event to OStatus subscribees and groups (subscribers will see it on your next post)
- fix OStatus queues with overlong transport names, should work on DB queues now
- Ostatus_profile::notifyActivity() and ::notifyDeferred() now can take XML, Notice, or Activity for convenience
Combined the code that finds mentions of other profiles into one place.
common_find_mentions() finds mentions and calls hooks to allow
supplemental syntax for mentions (like OStatus).
common_linkify_mentions() links mentions.
common_linkify_mention() links a mention.
Notice::saveReplies() now uses common_find_mentions() instead of
trying to parse everything again.
* detection of group feeds is currently a nasty hack based on presence of '/groups/' in URL -- should use some property on the feed?
* listing for the remote group is kinda cruddy; needs to be named more cleanly
* still need to establish per-author profiles (easier once we have the updated Atom code in)
* group delivery probably not right yet
* saving of group messages still triggering some weird behavior
Added support for since_id and max_id on group timeline feeds as a free extra. Enjoy!
* Treat linkless feed posts as status updates; drop the "New post:" prefix and quotes on them.
* Use stable user IDs for atom/rss2 feed links instead of unstable nicknames
* Pull Atom feed preferentially when subscribing -- can now put the remote user's profile page straight into the feed subscription form and get to the right place.
* Clean up naming for push endpoints
Moved much of the writing that happens when posting a notice to a new
queuehandler, distribqueuehandler. This updates tags, groups, replies
and inboxes at queue time (or at Web time, if queues are disabled).
To make this work well, I had to break up the monolithic
Notice::blowCaches() and make cache blowing happen closer to where
data is updated.
Squashed commit of the following:
commit 5257626c62750ac4ac1db0ce2b71410c5711cfa3
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 14:56:41 2010 -0500
slightly better handling of blowing tag memory cache
commit 8a22a3cdf6ec28685da129a0313e7b2a0837c9ef
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 01:42:56 2010 -0500
change 'distribute' to 'distrib' so not too long for dbqueue
commit 7a063315b0f7fad27cb6fbd2bdd74e253af83e4f
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 01:39:15 2010 -0500
change handle_notice() to handle() in distributqueuehandler
commit 1a39ccd28b9994137d7bfd21bb4f230546938e77
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 16:05:25 2010 -0500
error with queuemanager
commit e6b3bb93f305cfd2de71a6340b8aa6fb890049b7
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 01:11:34 2010 -0500
Blow memcache at different point rather than one big function for Notice class
commit 94d557cdc016187d1d0647ae1794cd94d6fb8ac8
Author: Evan Prodromou <evan@status.net>
Date: Mon Jan 25 00:48:44 2010 -0500
Blow memcache at different point rather than one big function for Notice class
commit 1c781dd08c88a35dafc5c01230b4872fd6b95182
Author: Evan Prodromou <evan@status.net>
Date: Wed Jan 20 08:54:18 2010 -0500
move broadcasting and distributing to new queuehandler
commit da3e46d26b84e4f028f34a13fd2ee373e4c1b954
Author: Evan Prodromou <evan@status.net>
Date: Wed Jan 20 08:53:12 2010 -0500
Move distribution of notices to new distribute queue handler
Queue handlers for XMPP individual & firehose output now send their XML stanzas
to another output queue instead of connecting directly to the chat server. This
lets us have as many general processing threads as we need, while all actual
XMPP input and output go through a single daemon with a single connection open.
This avoids problems with multiple connected resources:
* multiple windows shown in some chat clients (psi, gajim, kopete)
* extra load on server
* incoming message delivery forwarding issues
Database changes:
* queue_item drops 'notice_id' in favor of a 'frame' blob.
This is based on Craig Andrews' work branch to generalize queues to take any
object, but conservatively leaving out the serialization for now.
Table updater (preserves any existing queued items) in db/rc3to09.sql
Code changes to watch out for:
* Queue handlers should now define a handle() method instead of handle_notice()
* QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning
thread management (RespawningDaemon) infrastructure.
* The polling XmppConfirmManager has been dropped, as the message is queued
directly when saving IM settings.
* Enable $config['queue']['debug_memory'] to output current memory usage at
each run through the event loop to watch for memory leaks
To do:
* Adapt XMPP i/o to component connection mode for multi-site support.
* XMPP input can also be broken out to a queue, which would allow the actual
notice save etc to be handled by general queue threads.
* Make sure there are no problems with simply pushing serialized Notice objects
to queues.
* Find a way to improve interactive performance of the database-backed queue
handler; polling is pretty painful to XMPP.
* Possibly redo the way QueueHandlers are injected into a QueueManager. The
grouping used to split out the XMPP output queue is a bit awkward.
Conflicts:
scripts/xmppdaemon.php
Queue handlers for XMPP individual & firehose output now send their XML stanzas
to another output queue instead of connecting directly to the chat server. This
lets us have as many general processing threads as we need, while all actual
XMPP input and output go through a single daemon with a single connection open.
This avoids problems with multiple connected resources:
* multiple windows shown in some chat clients (psi, gajim, kopete)
* extra load on server
* incoming message delivery forwarding issues
Database changes:
* queue_item drops 'notice_id' in favor of a 'frame' blob.
This is based on Craig Andrews' work branch to generalize queues to take any
object, but conservatively leaving out the serialization for now.
Table updater (preserves any existing queued items) in db/rc3to09.sql
Code changes to watch out for:
* Queue handlers should now define a handle() method instead of handle_notice()
* QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning
thread management (RespawningDaemon) infrastructure.
* The polling XmppConfirmManager has been dropped, as the message is queued
directly when saving IM settings.
* Enable $config['queue']['debug_memory'] to output current memory usage at
each run through the event loop to watch for memory leaks
To do:
* Adapt XMPP i/o to component connection mode for multi-site support.
* XMPP input can also be broken out to a queue, which would allow the actual
notice save etc to be handled by general queue threads.
* Make sure there are no problems with simply pushing serialized Notice objects
to queues.
* Find a way to improve interactive performance of the database-backed queue
handler; polling is pretty painful to XMPP.
* Possibly redo the way QueueHandlers are injected into a QueueManager. The
grouping used to split out the XMPP output queue is a bit awkward.
Key changes:
* Initialization code moved from common.php to StatusNet class;
can now switch configurations during runtime.
* As a consequence, configuration files must now be idempotent...
Be careful with constant, function or class definitions.
* Control structure for daemons/QueueManager/QueueHandler has been refactored;
the run loop is now managed by IoMaster run via scripts/queuedaemon.php
IoManager subclasses are woken to handle socket input or polling, and may
cover multiple sites.
* Plugins can implement notice queue handlers more easily by registering a
QueueHandler class; no more need to add a daemon.
The new QueueDaemon runs from scripts/queuedaemon.php:
* This replaces most of the old *handler.php scripts; they've been refactored
to the bare handler classes.
* Spawns multiple child processes to spread load; defaults to CPU count on
Linux and Mac OS X systems, or override with --threads=N
* When multithreaded, child processes are automatically respawned on failure.
* Threads gracefully shut down and restart when passing a soft memory limit
(defaults to 90% of memory_limit), limiting damage from memory leaks.
* Support for UDP-based monitoring: http://www.gitorious.org/snqmon
Rough control flow diagram:
QueueDaemon -> IoMaster -> IoManager
QueueManager [listen or poll] -> QueueHandler
XmppManager [ping & keepalive]
XmppConfirmManager [poll updates]
Todo:
* Respawning features not currently available running single-threaded.
* When running single-site, configuration changes aren't picked up.
* New sites or config changes affecting queue subscriptions are not yet
handled without a daemon restart.
* SNMP monitoring output to integrate with general tools (nagios, ganglia)
* Convert XMPP confirmation message sends to use stomp queue instead of polling
* Convert xmppdaemon.php to IoManager?
* Convert Twitter status, friends import polling daemons to IoManager
* Clean up some error reporting and failure modes
* May need to adjust queue priorities for best perf in backlog/flood cases
Detailed code history available in my daemon-work branch:
http://www.gitorious.org/~brion/statusnet/brion-fixes/commits/daemon-work
Consolidated several separate implementations of the same weighting algorithm into common_sql_weight() and fixed some bugs...
For MySQL, now using timestampdiff() instead of subtraction for the comparison, so we get sane results when the year doesn't match, and utc_timestamp() rather than now() so we don't get negative ages for recent items with local server timezone.
Unknown whether the same problems affect PostgreSQL, but note that it lacks the timestampdiff() SQL function.
Consolidated several separate implementations of the same weighting algorithm into common_sql_weight() and fixed some bugs...
For MySQL, now using timestampdiff() instead of subtraction for the comparison, so we get sane results when the year doesn't match, and utc_timestamp() rather than now() so we don't get negative ages for recent items with local server timezone.
Unknown whether the same problems affect PostgreSQL, but note that it lacks the timestampdiff() SQL function.