Comments: Scaling Stateful SIP Proxy Servers

Once on the "Inter-tubes", always on the "Inter-tubes"! http://bit.ly/fZTBHn

Posted by Adam Uzelac at December 16, 2010 09:12 AM

Hi Aswath,

I've used Cassandra, and I think it's an excellent choice for this job.

I would add perhaps one further suggestion by extending your point #5 to say that clients could (not to say they should, just an option) connect to 2 different Proxies at once, for a call. Then you have fault tolerance within a call without a need to reconnect on the fly - you just switch to the other one. Having an essentially idle secondary connection will have close to zero load impact, but it's there in reserve. That would speed up the handover a bit.
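A minimal Python sketch of the dual-connection idea above. The class name, the proxy names and the `is_up` health check are all hypothetical stand-ins for real connections; the sketch only illustrates that handover becomes a swap to an already-open standby rather than a reconnect.

```python
class DualProxyClient:
    """Tracks a primary and a standby proxy for one call (illustrative only)."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def send(self, message, is_up):
        """Send via the active proxy; swap to the standby if the active one is down.

        `is_up` is a hypothetical health check: a callable taking a proxy name
        and returning True if that proxy is reachable.
        """
        if not is_up(self.active):
            if not is_up(self.standby):
                raise ConnectionError("both proxies are unreachable")
            # The standby is already connected, so handover is just a swap.
            self.active, self.standby = self.standby, self.active
        return (self.active, message)
```

For example, `DualProxyClient("proxy-a", "proxy-b").send("hello", is_up)` keeps using `proxy-a` until `is_up("proxy-a")` returns False, at which point traffic moves to `proxy-b`.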

What you describe at the signalling/nameserver end, if SIP were the protocol, is essentially a non-stateful SIP Registrar. Kamailio and OpenSIPS can both do this. They can both run in proxy mode or non-proxy (P2P) mode.

We used to run VoIP User in exactly this configuration using SIP as the protocol, other than we didn't need a hashtable proxy index, as we only had 2 proxies and no requirement to scale beyond that.

We subsequently changed to a stateful SIP proxy model to assist with NAT traversal on the signalling side. There is no reason this couldn't be included in your Proxy model (so that the proxy carries both media and signalling), which is exactly what Skype super-nodes do. Again, if SIP, FreeSWITCH can be configured to behave in this way.

You can also do some interesting tricks to look for the mean best proxy/route for the two parties' traffic and send them to the *best* proxy for their call (either an unloaded route or the mean best distance between the two).
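One way to read "mean best distance" is sketched below in Python, assuming each party has somehow measured its latency to each candidate proxy. The function name and the latency figures are hypothetical; a real deployment might weigh load or route quality instead.

```python
def best_proxy(latency_a, latency_b):
    """Return the proxy with the lowest mean latency to both parties.

    latency_a / latency_b map proxy name -> measured latency in ms
    (hypothetical measurements). Only proxies known to both parties
    are considered; ties are broken alphabetically.
    """
    common = latency_a.keys() & latency_b.keys()
    if not common:
        raise ValueError("no proxy reachable by both parties")
    return min(sorted(common), key=lambda p: (latency_a[p] + latency_b[p]) / 2)
```

With party A at 20 ms and 80 ms from proxies `p1` and `p2`, and party B at 100 ms and 30 ms, the means are 60 ms and 55 ms, so `p2` is chosen even though `p1` is closest to A.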

Dean

Posted by Dean at December 16, 2010 10:18 AM

Thanks Adam for the image.

Thanks Dean for your detailed comment. I considered the second proxy option and dismissed it for the simple reason that the SP must deploy "double" the servers. But you are suggesting that it will not have much impact. Don't the client and standby server need to maintain the TCP connection? Are you thinking the standby server can support a lot more end-points than the primary server?

In my design I want to include stateful Proxy servers, even though I did not explicitly name SIP, so as to stay general. But in the case of SIP, do you think my proposal does not cover stateful Proxies? I am thinking that this design uses a stateful Registrar because the name server maintains the list of registered end points and the corresponding Proxy servers.
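A rough Python sketch of the name-server state described here: a table from registered end point to the Proxy server currently serving it. The class and method names are hypothetical, and a scaled-out deployment would presumably keep this table in a distributed store rather than in one process.

```python
class NameServer:
    """Maps each registered end point to its Proxy server (illustrative only)."""

    def __init__(self):
        self._proxy_for = {}

    def register(self, endpoint, proxy):
        # A re-registration simply overwrites the previous proxy binding.
        self._proxy_for[endpoint] = proxy

    def unregister(self, endpoint):
        self._proxy_for.pop(endpoint, None)

    def lookup(self, endpoint):
        """Return the proxy serving `endpoint`, or None if it is not registered."""
        return self._proxy_for.get(endpoint)
```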

My proposal does not exclude relay nodes, but they are independent of this.

Posted by Aswath at December 16, 2010 12:08 PM

Hi Aswath,

>>Thanks Dean for your detailed comment. I considered the second proxy option and dismissed for the simple reason that the SP must deploy "double" the servers. But you are suggesting that it will not have much impact. Don't the client and standby server need to maintain the TCP connection? Are you thinking the standby server can support lot more end-points than the primary server?<<

>>But in the case of SIP, do you think my proposal does not cover stateful Proxies?<<

>>I am thinking that this design uses stateful Registrar because the name server maintains the list of registered end points and the corresponding Proxy servers.<<

In which case what you describe is exactly as VoIP User is configured at the moment, and is basically a standard design for a SIP network.

Posted by Dean at December 16, 2010 01:00 PM

Sorry, my message got truncated for some reason.

"But you are suggesting that it will not have much impact."

Correct.

"Don't the client and standby server need to maintain the TCP connection?"

Media is almost never sent over TCP. It is sent over UDP, which is connectionless, so an idle standby path consumes next to no resources on either side of the socket. There's a small amount of in-memory "AUTH" state which is required, but that's minimal (negligible).

Dean

Posted by dean at December 16, 2010 07:07 PM

@Aswath: good post, thanks.

@Dean, @Aswath: I know the original intent was to stay generic, but for SIP specifically, are you saying that most SIP clients support the ability to register to multiple proxies, and that there are widely implemented standards-based mechanisms to govern fail-over, re-balancing after the failure has cleared, etc.?

Posted by gzino at December 29, 2010 10:58 PM