Previously, messages once delivered would just get stuck in the queue seemingly forever if they never got ACKed.
Note this could lead to partial duplication, for instance if the OMB or Twitter queue handlers die after 1/2 of the outgoing sends.
Recommendations:
* catch exceptions more aggressively within queue handlers (so only PHP fatal errors are likely to kill in the middle)
* for processing that involves sending to multiple clients, consider a second queue similar to the XMPP output, eg for OMB:
- first queue gets delivery list and builds message data, enqueueing it for each target address
- second queue can handle each individual outgoing message (and attempt redelivery etc separately)
This would also protect better against a recurring error preventing delivery in the second part, and could spread out any slow sends over multiple threads.
Queue handlers for XMPP individual & firehose output now send their XML stanzas
to another output queue instead of connecting directly to the chat server. This
lets us have as many general processing threads as we need, while all actual
XMPP input and output go through a single daemon with a single connection open.
This avoids problems with multiple connected resources:
* multiple windows shown in some chat clients (psi, gajim, kopete)
* extra load on server
* incoming message delivery forwarding issues
Database changes:
* queue_item drops 'notice_id' in favor of a 'frame' blob.
This is based on Craig Andrews' work branch to generalize queues to take any
object, but conservatively leaving out the serialization for now.
Table updater (preserves any existing queued items) in db/rc3to09.sql
Code changes to watch out for:
* Queue handlers should now define a handle() method instead of handle_notice()
* QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning
thread management (RespawningDaemon) infrastructure.
* The polling XmppConfirmManager has been dropped, as the message is queued
directly when saving IM settings.
* Enable $config['queue']['debug_memory'] to output current memory usage at
each run through the event loop to watch for memory leaks
To do:
* Adapt XMPP i/o to component connection mode for multi-site support.
* XMPP input can also be broken out to a queue, which would allow the actual
notice save etc to be handled by general queue threads.
* Make sure there are no problems with simply pushing serialized Notice objects
to queues.
* Find a way to improve interactive performance of the database-backed queue
handler; polling is pretty painful to XMPP.
* Possibly redo the way QueueHandlers are injected into a QueueManager. The
grouping used to split out the XMPP output queue is a bit awkward.
Conflicts:
scripts/xmppdaemon.php
Previously, messages once delivered would just get stuck in the queue seemingly forever if they never got ACKed.
Note this could lead to partial duplication, for instance if the OMB or Twitter queue handlers die after 1/2 of the outgoing sends.
Recommendations:
* catch exceptions more aggressively within queue handlers (so only PHP fatal errors are likely to kill in the middle)
* for processing that involves sending to multiple clients, consider a second queue similar to the XMPP output, eg for OMB:
- first queue gets delivery list and builds message data, enqueueing it for each target address
- second queue can handle each individual outgoing message (and attempt redelivery etc separately)
This would also protect better against a recurring error preventing delivery in the second part, and could spread out any slow sends over multiple threads.
Queue handlers for XMPP individual & firehose output now send their XML stanzas
to another output queue instead of connecting directly to the chat server. This
lets us have as many general processing threads as we need, while all actual
XMPP input and output go through a single daemon with a single connection open.
This avoids problems with multiple connected resources:
* multiple windows shown in some chat clients (psi, gajim, kopete)
* extra load on server
* incoming message delivery forwarding issues
Database changes:
* queue_item drops 'notice_id' in favor of a 'frame' blob.
This is based on Craig Andrews' work branch to generalize queues to take any
object, but conservatively leaving out the serialization for now.
Table updater (preserves any existing queued items) in db/rc3to09.sql
Code changes to watch out for:
* Queue handlers should now define a handle() method instead of handle_notice()
* QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning
thread management (RespawningDaemon) infrastructure.
* The polling XmppConfirmManager has been dropped, as the message is queued
directly when saving IM settings.
* Enable $config['queue']['debug_memory'] to output current memory usage at
each run through the event loop to watch for memory leaks
To do:
* Adapt XMPP i/o to component connection mode for multi-site support.
* XMPP input can also be broken out to a queue, which would allow the actual
notice save etc to be handled by general queue threads.
* Make sure there are no problems with simply pushing serialized Notice objects
to queues.
* Find a way to improve interactive performance of the database-backed queue
handler; polling is pretty painful to XMPP.
* Possibly redo the way QueueHandlers are injected into a QueueManager. The
grouping used to split out the XMPP output queue is a bit awkward.
- NOTICE_INBOX_SOURCE_* constants moved to common.php since Notice_inbox.php not always loaded
- fixed typo in User::staticGet() call which caused user #1 to receive messages once for each subscriber instead of for him/herself
- 'continue' -> 'continue 2' inside switch() statement to fix loop escape (PHP considers switch() a looping construct for break & continue)
Key changes:
* Initialization code moved from common.php to StatusNet class;
can now switch configurations during runtime.
* As a consequence, configuration files must now be idempotent...
Be careful with constant, function or class definitions.
* Control structure for daemons/QueueManager/QueueHandler has been refactored;
the run loop is now managed by IoMaster run via scripts/queuedaemon.php
IoManager subclasses are woken to handle socket input or polling, and may
cover multiple sites.
* Plugins can implement notice queue handlers more easily by registering a
QueueHandler class; no more need to add a daemon.
The new QueueDaemon runs from scripts/queuedaemon.php:
* This replaces most of the old *handler.php scripts; they've been refactored
to the bare handler classes.
* Spawns multiple child processes to spread load; defaults to CPU count on
Linux and Mac OS X systems, or override with --threads=N
* When multithreaded, child processes are automatically respawned on failure.
* Threads gracefully shut down and restart when passing a soft memory limit
(defaults to 90% of memory_limit), limiting damage from memory leaks.
* Support for UDP-based monitoring: http://www.gitorious.org/snqmon
Rough control flow diagram:
QueueDaemon -> IoMaster -> IoManager
QueueManager [listen or poll] -> QueueHandler
XmppManager [ping & keepalive]
XmppConfirmManager [poll updates]
Todo:
* Respawning features not currently available running single-threaded.
* When running single-site, configuration changes aren't picked up.
* New sites or config changes affecting queue subscriptions are not yet
handled without a daemon restart.
* SNMP monitoring output to integrate with general tools (nagios, ganglia)
* Convert XMPP confirmation message sends to use stomp queue instead of polling
* Convert xmppdaemon.php to IoManager?
* Convert Twitter status, friends import polling daemons to IoManager
* Clean up some error reporting and failure modes
* May need to adjust queue priorities for best perf in backlog/flood cases
Detailed code history available in my daemon-work branch:
http://www.gitorious.org/~brion/statusnet/brion-fixes/commits/daemon-work
* Mostly punctuation updates so that the same message is used consistently in all of StatusNet.
* Some cases of "Title Case" removed, because that does not appear to be used consistently.
This reverts commit 5d9a2eb17e.
These are commands that are/were implemented by Twitter, and we don't
(yet) implemented. People will be looking for that information.
* We now cache negative lookups; clear them in Memcached_DataObject->insert()
* Mark file.url as a unique key in statusnet.ini so its negative lookups are cleared properly (first save of a notice with a new URL was failing due to double-insert)
* Now using serialization for default in-process cache instead of just saving objects; avoids potential corruption if you save an object to cache, change the original object, then fetch the same key from cache again
Consolidated several separate implementations of the same weighting algorithm into common_sql_weight() and fixed some bugs...
For MySQL, now using timestampdiff() instead of subtraction for the comparison, so we get sane results when the year doesn't match, and utc_timestamp() rather than now() so we don't get negative ages for recent items with local server timezone.
Unknown whether the same problems affect PostgreSQL, but note that it lacks the timestampdiff() SQL function.
Consolidated several separate implementations of the same weighting algorithm into common_sql_weight() and fixed some bugs...
For MySQL, now using timestampdiff() instead of subtraction for the comparison, so we get sane results when the year doesn't match, and utc_timestamp() rather than now() so we don't get negative ages for recent items with local server timezone.
Unknown whether the same problems affect PostgreSQL, but note that it lacks the timestampdiff() SQL function.
Allows storage of larger objects (over 1mb in size uncompressed), such as huge LDAP schemas.
Should also improve cache efficiency (allows more stuff to be stored in same memory) and reduce network latency (less data transfer)
self-subscription) via the API. Additionally, make it impossible
to block yourself or unsubscribe from yourself, period.
I also made User use the subs.php helper function for unsubscribing
during a block.
Hopefully, these changes will get rid of the problem of people
accidentally deleting their self-subscriptions once and for all
(knock on wood).
* 0.9.x: (141 commits)
Reload the admin design panel page to show the new CSS when the
Only pick up new default site colors if the theme has NOT changed.
Delete design when user chooses to restore default design, instead
Localisation updates for !StatusNet from !translatewiki.net !sntrans
Do not rebuild/add .mo files by default
If an XHR notice is sent form a page that has no timeline, show a
Revert "If the page doesn't have .notices, silently skip the notice XHR response"
Revert "Clear/reset the XHR notice form on pages where there is no timeline"
Clear/reset the XHR notice form on pages where there is no timeline
If the page doesn't have .notices, silently skip the notice XHR response
Remove useless debugging statement
Moved form control styles (i.e., border and radius) out of base
add pluginhandler to list of daemons to shut down
Using box-shadow only on the current navigation item
Updated theme readme
Fix regression in password settings: users have been unable to change their passwords since introduction of ChangePassword event (later StartChangePassword) November 5 in commit d6ddb84132
Ticket 2048: make OMB posting HTTP timeout configurable as $config['omb']['timeout']; defaults to 5 seconds instead of 20-second default in Yadis library
ticket 1100: add Drupal source link
Fix makefile wildcards for locale compilation (now works on Ubuntu 8.04)
typo fix: '$this' now spelled correctly. Looks like this'll fix acceptance of 'source' param for direct messages posted to API
...
Conflicts:
js/util.js
locale/ar/LC_MESSAGES/statusnet.po
locale/bg/LC_MESSAGES/statusnet.po
locale/ca/LC_MESSAGES/statusnet.po
locale/cs/LC_MESSAGES/statusnet.po
locale/de/LC_MESSAGES/statusnet.po
locale/el/LC_MESSAGES/statusnet.po
locale/en_GB/LC_MESSAGES/statusnet.po
locale/es/LC_MESSAGES/statusnet.po
locale/fi/LC_MESSAGES/statusnet.po
locale/fr/LC_MESSAGES/statusnet.po
locale/ga/LC_MESSAGES/statusnet.po
locale/he/LC_MESSAGES/statusnet.po
locale/is/LC_MESSAGES/statusnet.po
locale/it/LC_MESSAGES/statusnet.po
locale/ja/LC_MESSAGES/statusnet.po
locale/ko/LC_MESSAGES/statusnet.po
locale/mk/LC_MESSAGES/statusnet.po
locale/nb/LC_MESSAGES/statusnet.po
locale/nl/LC_MESSAGES/statusnet.po
locale/nn/LC_MESSAGES/statusnet.po
locale/pl/LC_MESSAGES/statusnet.po
locale/pt/LC_MESSAGES/statusnet.po
locale/pt_BR/LC_MESSAGES/statusnet.po
locale/ru/LC_MESSAGES/statusnet.po
locale/statusnet.po
locale/sv/LC_MESSAGES/statusnet.po
locale/te/LC_MESSAGES/statusnet.po
locale/tr/LC_MESSAGES/statusnet.po
locale/uk/LC_MESSAGES/statusnet.po
locale/vi/LC_MESSAGES/statusnet.po
locale/zh_CN/LC_MESSAGES/statusnet.po
locale/zh_TW/LC_MESSAGES/statusnet.po
plugins/Realtime/realtimeupdate.js
* master: (67 commits)
Ticket 2038: fix bad bug tracker link
Fix regression in group posting: bug introduced in commit 1319002e15. Need to use actual profile object rather than an id on a variable that doesn't exist when checking blocks :D
Log database errors when saving notice_inbox entries
Drop the username from the log id for now; seems to trigger an error loop in some circumstances
request id on logs... pid + random id per web request + username + method + url
Add OpenID ini info back into statusnet.ini as a stopgap until we can
Some changes to the OpenID DataObjects to make them emit the exact same
OpenID plugin should set 'user_openid.display' as unique key
Remove relationship: user_openid.user_id -> user.id. I don't think this
Have OpenID plugin DataObjects emit their own .ini info
Revert "Allow plugin DB_DataObject classes to not have to use the .ini file by overriding keys(), table(), and sequenceKey() for them"
Catch and report exceptions from notice_to_omb_notice() instead of letting the OMB queue handler die.
Fix regression in remote subscription; added hasRole() shadow method on Remote_profile.
Fix fatal error on OMB subscription for first-timers
Remove annoying log msg
Drop error message on setlocale() failure; this is harmless, since we actually have a working locale set up.
Catch uncaught exception
Fixed bug where reply-sync bit wasn't getting saved
Forgot to render the nav menu when on FB Connect login tab
Facebook plugin no longer takes over Login and Connect settings nav menus
...
Conflicts:
db/08to09_pg.sql
db/statusnet_pg.sql
locale/pt_BR/LC_MESSAGES/statusnet.mo
plugins/Mapstraction/MapstractionPlugin.php
DB_DataObject hides errors by silently returning null for any non-existent method call, making it harder to tell what the heck's going on... the rights check for blocked remote users returned null for the check for subscribe rights, thus eval'ing to false. We now log a note in this circumstance, which would have cut about 3 hours off of the debug time.
DB_DataObject hides errors by silently returning null for any non-existent method call, making it harder to tell what the heck's going on... the rights check for blocked remote users returned null for the check for subscribe rights, thus eval'ing to false. We now log a note in this circumstance, which would have cut about 3 hours off of the debug time.
Success return code from omb_broadcast_message was dropped in commit ec88d2650e (Aug 10 2009) which switched us to libomb backend. With queues enabled, this would lead to the notice being readded to the outgoing OMB queue for redelivery as the queue system thought the send failed. The resends caused extra load and confusion for third-party sites, and more worryingly just plugged up our own queue so legit messages were badly delayed.
This commit should restore the previous state, where we fire-and-forget; that is, we're not actually checking to see if all remote subscribers received the message successfully and there will be no resends.
Success return code from omb_broadcast_message was dropped in commit ec88d2650e (Aug 10 2009) which switched us to libomb backend. With queues enabled, this would lead to the notice being readded to the outgoing OMB queue for redelivery as the queue system thought the send failed. The resends caused extra load and confusion for third-party sites, and more worryingly just plugged up our own queue so legit messages were badly delayed.
This commit should restore the previous state, where we fire-and-forget; that is, we're not actually checking to see if all remote subscribers received the message successfully and there will be no resends.