Commit Graph

10 次程式碼提交

作者 SHA1 備註 提交日期
Brion Vibber
532e486a93 Detect when queuedaemon/xmppdaemon parent processes die and kill the child processes.
Keeps stray daemon subprocesses from floating around when we kill the parents via a signal!

Accomplished by opening a bidirectional pipe in the parent process; the children close out the writer end and keep the reader in their open sockets list. When the parent dies, the children see that the socket's been closed out and can perform an orderly shutdown.
2010-03-10 11:54:00 -08:00
Brion Vibber
ce6be4f836 Queues: redid the breakout control model so we can start up and subscribe to queues without running through the complete site list, which is ok at 1k sites but too slow at 10k.
All breakout queues that we're going to need to listen to now need to be explicitly listed in $config['queue']['breakout'].

Until XMPP is moved to component model, this setting will let the individual processes work with their own queues:
$config['queue']['breakout'][] = 'xmpp/xmppout/' . $config['site']['nickname'];
2010-02-17 16:49:00 -08:00
Brion Vibber
c74aea589d Stomp queue restructuring for mass scalability:
- Multiplexing queues into groups and for multiple sites.
- Sharing vs breakout configurable per site and per queue via $config['queue']['breakout']
- Detect how many times a message is redelivered, discard if it's killed too many daemons
 - count configurable with $config['queue']['max_retries']
 - can dump the items to files in $config['queue']['dead_letter_dir']

Queue daemon memory & resource leak fixes:
- avoid unnecessary reconnections to memcached server (switch persistent connections back in on second initialization, assuming it's child process)
- monkey-patch for leaky .ini loads in DB_DataObject::databaseStructure() - was leaking 200k per active switch
- applied leak fixes to Status_network as well, using intermediate base Safe_DataObject for both it and Memcache_DataObject

Misc queue fixes:
- correct handling of child processes exiting due to signal termination instead of regular exit
- shutdown instead of infinite respawn loop if we're already past the soft memory limit at startup
- Added --all option for xmppdaemon... still opens one xmpp connection per site that has xmpp active

Cache updates:
- add Cache::increment() method with native support for memcached atomic increment
2010-02-16 09:16:51 -08:00
Brion Vibber
288dc3452f Log exceptions from queuedaemon.php if they're not already caught 2010-01-28 22:05:14 -08:00
Brion Vibber
58be61b641 Control channel for queue daemons to request graceful shutdown, restart, or update to listen to a newly added or reconfigured site.
queuectl.php --update -s<site>
  queuectl.php --stop
  queuectl.php --restart

Default control channel is /topic/statusnet-control. For external utilities to send a site update ping direct to the queue server, connect via Stomp and send a message formatted thus:

  update:<nickname>

(Nickname here, *not* server hostname! The rest of the queues will be updated to use nicknames later.)

Note that all currently-connected queue daemons will get these notifications, including both queuedaemon.php and xmppdaemon.php. (XMPP will ignore site update requests for sites that it's not handling.)

Limitations:
* only implemented for stomp queue manager so far
* --update may not yet handle a changed server name properly
* --restart won't reload PHP code files that were already loaded at startup. Still need to stop and restart the daemons from 'outside' when updating code base.
2010-01-26 11:49:49 -08:00
Brion Vibber
055a00bcae drop now-unused --skip-xmpp and --xmpp-only options from queuedaemon.php 2010-01-25 09:36:20 -08:00
Brion Vibber
0e852def6a XMPP queued output & initial retooling of DB queue manager to support non-Notice objects.
Queue handlers for XMPP individual & firehose output now send their XML stanzas
to another output queue instead of connecting directly to the chat server. This
lets us have as many general processing threads as we need, while all actual
XMPP input and output go through a single daemon with a single connection open.

This avoids problems with multiple connected resources:
* multiple windows shown in some chat clients (psi, gajim, kopete)
* extra load on server
* incoming message delivery forwarding issues

Database changes:
* queue_item drops 'notice_id' in favor of a 'frame' blob.
  This is based on Craig Andrews' work branch to generalize queues to take any
  object, but conservatively leaving out the serialization for now.
  Table updater (preserves any existing queued items) in db/rc3to09.sql

Code changes to watch out for:
* Queue handlers should now define a handle() method instead of handle_notice()
* QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning
  thread management (RespawningDaemon) infrastructure.
* The polling XmppConfirmManager has been dropped, as the message is queued
  directly when saving IM settings.
* Enable $config['queue']['debug_memory'] to output current memory usage at
  each run through the event loop to watch for memory leaks

To do:
* Adapt XMPP i/o to component connection mode for multi-site support.
* XMPP input can also be broken out to a queue, which would allow the actual
  notice save etc to be handled by general queue threads.
* Make sure there are no problems with simply pushing serialized Notice objects
  to queues.
* Find a way to improve interactive performance of the database-backed queue
  handler; polling is pretty painful to XMPP.
* Possibly redo the way QueueHandlers are injected into a QueueManager. The
  grouping used to split out the XMPP output queue is a bit awkward.
2010-01-21 22:40:35 -08:00
Brion Vibber
598072468c --xmpp-only hack for queuedaemon.php to run separate queue daemon with only xmpp threads 2010-01-15 11:13:06 -08:00
Brion Vibber
58bc33850a temporary --skip-xmpp flag on queuedaemon.php, allows to run queue daemons but skip subscription to xmpp-based queues
(still working on making these behave gracefully when server is down)
2010-01-14 15:32:37 -08:00
Brion Vibber
ec145b73fc Major refactoring of queue handlers to support running multiple sites in one daemon.
Key changes:
* Initialization code moved from common.php to StatusNet class;
  can now switch configurations during runtime.
* As a consequence, configuration files must now be idempotent...
  Be careful with constant, function or class definitions.
* Control structure for daemons/QueueManager/QueueHandler has been refactored;
  the run loop is now managed by IoMaster run via scripts/queuedaemon.php
  IoManager subclasses are woken to handle socket input or polling, and may
  cover multiple sites.
* Plugins can implement notice queue handlers more easily by registering a
  QueueHandler class; no more need to add a daemon.

The new QueueDaemon runs from scripts/queuedaemon.php:

* This replaces most of the old *handler.php scripts; they've been refactored
  to the bare handler classes.
* Spawns multiple child processes to spread load; defaults to CPU count on
  Linux and Mac OS X systems, or override with --threads=N
* When multithreaded, child processes are automatically respawned on failure.
* Threads gracefully shut down and restart when passing a soft memory limit
  (defaults to 90% of memory_limit), limiting damage from memory leaks.
* Support for UDP-based monitoring: http://www.gitorious.org/snqmon

Rough control flow diagram:
QueueDaemon -> IoMaster -> IoManager
                           QueueManager [listen or poll] -> QueueHandler
                           XmppManager [ping & keepalive]
                           XmppConfirmManager [poll updates]

Todo:

* Respawning features not currently available running single-threaded.
* When running single-site, configuration changes aren't picked up.
* New sites or config changes affecting queue subscriptions are not yet
  handled without a daemon restart.
* SNMP monitoring output to integrate with general tools (nagios, ganglia)
* Convert XMPP confirmation message sends to use stomp queue instead of polling
* Convert xmppdaemon.php to IoManager?
* Convert Twitter status, friends import polling daemons to IoManager
* Clean up some error reporting and failure modes
* May need to adjust queue priorities for best perf in backlog/flood cases

Detailed code history available in my daemon-work branch:
http://www.gitorious.org/~brion/statusnet/brion-fixes/commits/daemon-work
2010-01-12 20:45:09 -08:00