Version 61 (Adrian Georgescu, 09/20/2009 04:13 pm) → Version 62/96 (Adrian Georgescu, 09/20/2009 04:14 pm)

= SIP Thor description =

[[TOC(SipThorDescription, SipThorPreparations, SipThorInstallation, depth=2)]]

SIP Thor provides scalability, load-sharing and resilience for the [wiki:MSPPreparations Multimedia Service Platform]. Its implementation is mature and stable, with a good track record after several years in production environments. Based on previous experience, it takes between 6 and 12 weeks to put a SIP infrastructure based on it into service.

The SIP Thor platform uses the same software components for the interfaces with the end-user SIP devices, namely the SIP Proxy, Media Relay and XCAP server used by the [wiki:MSPPreparations Multimedia Service Platform], but it implements a different system architecture for them by using peer-to-peer concepts.

[[Image(http://www.ag-projects.com/images/stories/ag_images/thor-platform-big.png, width=500)]]

== Architecture ==

To implement its functions, SIP Thor introduces several new components to the Multimedia Service Platform. SIP Thor implements a peer-to-peer overlay of several logical network entities called '''roles''' installed on multiple physical machines called '''nodes'''.

Each node can be configured to run one or multiple roles. Typical examples of such roles are '''sip_proxy''' and '''media_relay'''. Nodes that advertise SIP Proxy or Media Relay capabilities will handle the load associated with the SIP and RTP traffic respectively and will inherit the built-in resilience and load distribution provided by the SIP Thor design.

SIP Thor operates at the IP layer and the nodes can be installed at different IP locations, such as different data centers, cities or countries. The sum of all nodes provides a consolidated single logical platform.

The platform provides a fail-proof NAT traversal solution that imposes no requirements on the SIP clients, by using a reverse-outbound technique for SIP signaling and a geographically distributed relay function for RTP media streams.

== References ==

The closest standards-based description of what SIP Thor implements is the [http://tools.ietf.org/html/draft-bryan-p2psip-usecases-00#section-3.4.2 Self-organizing SIP Proxy farm] described in 2007 by the original P2P use cases draft produced by the IETF [http://www.ietf.org/dyn/wg/charter/p2psip-charter.html P2PSIP Working Group]. SIP Thor development started in early 2005 and for this reason the software uses a slight variation of the terminology adopted later by the P2PSIP Working Group.

SIP Thor's particular design and implementation has been explored in several white papers and conference presentations:

* [http://www.sipcenter.com/sip.nsf/html/AG+P2P+SIP Addressing survivability and scalability of SIP networks by using Peer-to-Peer protocols] published by SIP Center in September 2005
* [http://ag-projects.com/docs/Present/20060518-ScalableSIP.pdf Building scalable SIP networks] presented by Adrian Georgescu at the VON Conference held in Stockholm in May 2006
* [http://ag-projects.com/docs/Present/20061004-IMSP2P.pdf Solving IMS problems using P2P technology] presented by Adrian Georgescu at Telecom Signalling World held in London in October 2006
* [http://ag-projects.com/docs/Present/20070227-P2PSIP.pdf Overview of P2P SIP Principles and Technologies] presented by Dan Pascu at International SIP Conference held in Paris in January 2007
* [http://www.imsforum.org/search/imsforum/p2p P2PSIP and the IMS: Can they complement each other?] published by the IMS Forum in June 2008 - accessible online [http://www.ag-projects.com/content/view/519/176/ here]

== P2P design ==

SIP Thor is designed around the concept of a peer-to-peer overlay with equal peers. The overlay is a flat logical network that handles multiple roles. The peers are dedicated servers with good IP connectivity and a low churn rate that handle the application logic of each role, as part of an infrastructure managed by a service provider. The software design and implementation has been fine-tuned for this scope and differs to some degree from classic implementations of P2P overlays, which are typically run by transient end-points.

The nodes interface with native SIP clients that are unaware of the overlay logic employed by the servers. Internally to the SIP Thor network, the lookup of a resource (a node that handles a certain subscriber for a given role at the moment of the query) is a one step lookup in a hash table.

The hash table is an address space with integers arranged on a circle; nodes and SIP addresses map to integers in this space. This concept can be found in classic DHT implementations like [http://en.wikipedia.org/wiki/Chord_(DHT) Chord]. Join and leave primitives take care of the addition and removal of nodes in the overlay in a self-organizing fashion.
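The circular address space can be sketched as a consistent hash ring: a key is hashed to an integer position and the responsible node is the first node at or after that position, wrapping around the circle. The hash function, node names and SIP address used below are illustrative assumptions, not SIP Thor's actual scheme.

```python
import hashlib


def ring_position(key, bits=32):
    """Map a string (a node name or a SIP address) to an integer
    position on the circular address space."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:bits // 8], 'big')


def lookup(nodes, sip_address):
    """One-step lookup: the responsible node is the first node at or
    after the address position, wrapping around the circle."""
    target = ring_position(sip_address)
    ring = sorted((ring_position(node), node) for node in nodes)
    for position, node in ring:
        if position >= target:
            return node
    return ring[0][1]  # wrap around past the highest position


nodes = ['node1.example.com', 'node2.example.com', 'node3.example.com']
owner = lookup(nodes, 'sip:alice@example.com')
```

When a node joins or leaves, only the keys between its predecessor and itself change owner, which is what makes the self-organizing addition and removal of nodes cheap.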

== Security ==

Communication between SIP Thor nodes is encrypted using Transport Layer Security (TLS). Each node in the SIP Thor network is uniquely identified by an X.509 certificate. The certificates are signed by a Certificate Authority managed by the service provider and can be revoked as necessary, for example when a node has been compromised.

The X.509 certificate and its attributes are used for authentication and authorization of the nodes when they exchange overlay messages over the SIP Thor network.

== Scalability ==

Because, by design, the number of peers in the overlay is fairly limited (tens to hundreds of nodes in practice), there is no need for a Chord-like finger table or for iterative or recursive queries. The overlay lookup is one hop, referred to as O(1) in classic P2P terminology, and SIP Thor's implementation handles up to half a million queries per second on a typical server processor, which is several orders of magnitude more than what is expected in normal operations.

Thanks to the single hop lookup mechanism, SIP call flows over the SIP Thor overlay involve a maximum of two nodes, regardless of the number of nodes, subscribers or devices handled by the SIP Thor network. Should SIP devices become 'SIP Thor aware' and able to perform lookups in the overlay themselves, the overall efficiency of the system could greatly improve, as less SIP traffic and fewer queries would be generated inside the SIP Thor network. A publicly reachable lookup interface is exposed over a TCP socket by each node using a simple query syntax.

The current implementation allows SIP Thor to grow to accommodate thousands of physical nodes, which can handle the traffic of any real-time communication service deployable in the real world today. For example, if the SIP server node implementation can handle one hundred thousand subscribers, then 100 nodes (roughly the equivalent of three 19 inch racks of data center equipment) are required to handle a base of 10 million subscribers.
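The capacity arithmetic above generalizes to a one-line sizing rule; the per-node capacity figure is the example value from the text, not a measured constant.

```python
import math


def nodes_needed(subscribers, capacity_per_node=100_000):
    """Number of SIP server nodes required for a subscriber base,
    given a per-node capacity (example figure from the text)."""
    return math.ceil(subscribers / capacity_per_node)


nodes_needed(10_000_000)  # the 10 million subscriber example: 100 nodes
```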

The service scalability is in reality limited by the performance of the accounting sub-system used by the operator or by the presence of centralized functions like prepaid. If the accounting functions are performed outside SIP Thor, for instance in an external gateway system, there is no hard limit on how far the overlay can scale.

== Load sharing ==

SIP Thor is designed to share the traffic equally between all available nodes. This is done by returning, to SIP clients that use standard RFC 3263 style lookups, a random and limited slice of the DNS records that point to actual live nodes performing the SIP server role. The DNS records are managed internally by a special role, '''thor-dns''', running on multiple nodes assigned as DNS servers in the network. This simple DNS query/response mechanism achieves a near perfect distribution without introducing any intermediate load balancer or latency. Internally to SIP Thor, a similar principle is used for load balancing internal functions like XCAP queries or SOAP/XML provisioning requests.
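The random-slice answer can be sketched as follows; the slice size and the node addresses are illustrative assumptions, and a real implementation would build actual DNS records rather than return plain strings.

```python
import random


def dns_answer(live_nodes, slice_size=3):
    """Return a random, limited slice of the addresses of live SIP
    server nodes; successive answers spread clients across nodes."""
    return random.sample(live_nodes, min(slice_size, len(live_nodes)))


live = ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4', '10.0.0.5']
answer = dns_answer(live)  # e.g. three distinct addresses out of five
```

Because each client receives a different random subset, the load converges toward a uniform distribution without any stateful load balancer in the path.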

For functions driven internally by SIP Thor, for instance the reservation of a media relay for a SIP session, other selection techniques could potentially be applied, for instance selecting a candidate based on geographic proximity to the calling party to minimize round trip time. Though captured in the initial design, such techniques have not been implemented because no customer demanded them.

By using a virtualization technique, the peer-to-peer network is able to function with a minimum number of nodes, while still achieving a fair, equal distribution of load when using at least three physical servers.

== Zero configuration ==

Nothing needs to be configured in the SIP Thor network to support the addition of a new node, besides starting it with the right X.509 certificate.

== Thor Event server ==

'''thor-eventserver''' is an event server, which is the core of the messaging system used by the SIP Thor network to implement communication between the network members. The messaging system is based on publish/subscribe messages that are exchanged between network members. Each entity in the network publishes its own capabilities and status for whomever is interested in that information. At the same time, each entity may subscribe to certain types of information published by the other network members, based on the entity's functionality in the network.
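The publish/subscribe pattern described above can be reduced to a small sketch; the class, topic names and message shape are illustrative assumptions, not the actual thor-eventserver protocol.

```python
from collections import defaultdict


class EventServer:
    """Minimal in-process publish/subscribe hub modeling how network
    members exchange capability and status information."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest in one type of published information."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every member subscribed to the topic."""
        for callback in self.subscribers[topic]:
            callback(message)


bus = EventServer()
seen = []
bus.subscribe('sip_proxy', seen.append)  # a member interested in SIP proxy status
bus.publish('sip_proxy', {'node': 'node1', 'status': 'online'})
```

The real event server additionally carries these messages over the network between nodes; the decoupling shown here (publishers need not know who is subscribed) is what the design relies on.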

Multiple event servers can be run as part of a SIP Thor network (on different systems, preferably in different hosting facilities), which will improve the redundancy of the SIP Thor network and its resilience in the face of network/system failures, at the expense of linearly increasing the messaging traffic with the number of network members. It is recommended to run at least 3 event servers in a given SIP Thor network.

== Thor Manager ==

'''thor-manager''' is the SIP Thor network manager, which has the role of maintaining the consistency of the SIP Thor network as members join and leave the network. The manager will publish the SIP Thor network status regularly, or as events occur, to inform all network members of the current network status, allowing them to adjust their internal state as the network changes.

Multiple managers can be run as part of a SIP Thor network (on different systems, preferably in different hosting facilities), which will improve the redundancy of the SIP Thor network and its resilience in the face of network/system failures, at the expense of a slight increase in the messaging traffic with each new manager that is added. If multiple managers are run, they will automatically elect one of them as the active one and the others will stay idle until the active manager stops working or leaves the network. Then a new manager is elected and becomes the active manager. It is recommended to run at least 3 managers in a given SIP Thor network, preferably in separate hosting facilities.
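One simple way such an election can work is deterministically: every manager applies the same rule to the same membership list, so all members agree on the active manager without extra election messages. The rule below (lowest identifier wins) and the manager names are assumptions for illustration; SIP Thor's actual election rule is not documented here.

```python
def active_manager(managers):
    """Deterministic election sketch: given the same set of live
    managers, every member independently picks the same one; here
    the lowest identifier wins (illustrative rule)."""
    return min(managers) if managers else None


managers = {'mgr-b', 'mgr-a', 'mgr-c'}
leader = active_manager(managers)      # every member computes 'mgr-a'
managers.discard(leader)               # the active manager leaves the network
new_leader = active_manager(managers)  # the members re-elect: 'mgr-b'
```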

== Thor Database ==

'''thor-database''' is a component of the SIP Thor network that runs on the central database(s) used by the SIP Thor network. Its purpose is to publish the location of the provisioning database in the network, so that other SIP Thor network members know where to find the central database if they need to access information from it.

== Thor DNS ==

'''thor-dns''' is a component of the SIP Thor network that runs on the authoritative name servers for the SIP Thor domain. Its purpose is to keep the DNS entries for the SIP Thor network in sync with the network members that are currently online. Each authoritative name server needs to run a copy of the DNS manager in combination with a DNS server. The SIP Thor DNS manager will update the DNS backend database with the appropriate records as nodes join or leave the SIP Thor network, making it reflect the network status in real time.

== Thor Node ==

'''thor-node''' is to be run on a system that wishes to become a SIP Thor network member. By running this program, the system will join the SIP Thor network and become part of it, sharing its resources and announcing its capabilities to the other SIP Thor network members.

The network can accommodate one or more nodes with this role; SIP Thor automatically takes care of the addition and removal of each instance. The currently supported roles are '''sip_proxy''' in combination with OpenSIPS and '''voicemail_server''' in combination with Asterisk. Other roles are built directly into MediaProxy ('''media_relay'''), NGNPro ('''provisioning_server''') and OpenXCAP ('''xcap_server'''); for these resources no standalone thor-node component is required.

== Thor Monitor ==

'''thor-monitor''' is a utility that shows the SIP Thor network state in a terminal. It can be used to monitor the SIP Thor network status and events.

== NGNPro ==

The NGNPro component performs the enrollment and provisioning server role. It saves all changes persistently in the bootstrap database and caches the data on the responsible node at the moment of the change. The network can accommodate multiple nodes with this role; SIP Thor automatically takes care of the addition and removal of each instance.

NGNPro exposes a [wiki:ProvisioningGuide SOAP/XML interface] to the outside world and bridges the SOAP/XML queries with the distributed data structures employed by SIP Thor nodes.

NGNPro is also the component used to harvest usage statistics and provide status information from the SIP Thor nodes.

== Third-party software ==

New roles can be added to the system programmatically by conforming to the SIP Thor API; the exact steps depend on the way of working of the component that needs to be integrated into the SIP Thor network.

The following integration steps must be taken to add a new role to the system in the form of a third-party software:

 1. The third-party software must implement a component that publishes its availability in the network. This can also be programmed outside of the specific software by adding it to the generic thor_node configuration and logic
 1. The third-party software must be able to look up resources in the SIP Thor network and use the returned results in its own application logic
 1. Depending on the inner workings of the application performed by the new role, other roles may need to be updated in order to serve it (e.g. adding specific entries into the DNS or moving provisioning data to it)
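The first two integration steps can be sketched as follows. The class, the role name, the node name and the publish/lookup interfaces are all hypothetical placeholders, not the real SIP Thor API; the sketch only shows the shape of what a new role must do.

```python
class ThirdPartyRole:
    """Sketch of a third-party component integrating a new role:
    it publishes its availability (step 1) and can look up resources
    in the overlay (step 2). Interfaces are illustrative only."""

    def __init__(self, role_name, node_id, overlay):
        self.role_name = role_name
        self.node_id = node_id
        self.overlay = overlay  # stand-in for the real overlay state

    def publish_availability(self):
        # Step 1: announce this node as a provider of the new role.
        self.overlay.setdefault(self.role_name, set()).add(self.node_id)

    def lookup(self, role_name):
        # Step 2: find which nodes currently provide a given role.
        return sorted(self.overlay.get(role_name, set()))


overlay = {}
role = ThirdPartyRole('transcoder', 'node7.example.com', overlay)
role.publish_availability()
```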