Efficiency of HTTP Push Vs Pull
I'm working on a new application (StaffLocation.com) that requires the ability to stream live data from a server to the client web application via JavaScript. There are two ways I could approach this.
The first (and simplest) involves polling a script on the server every x seconds to see if the data has changed. This is the traditional way most web applications retrieve status updates from the server, but it has two nasty side effects:
- Every active client sends a request to the server every x seconds whether there is changed data or not. Therefore, if there are 1,000 users on the website and you want the data to be updated every second, then those users will generate 1,000 hits per second to your application back-end (and that's 1,000 times whatever database reads/writes you do per hit).
- Every hit uses bandwidth whether it contains data or not. A blank request/response will still contain many hundreds of bytes. Therefore, using the example of 1,000 active users polling every second at a 200 byte payload, we would be using 200KB/s (1.6Mbps, or a little over 500GB per month) of bandwidth.
Of course this situation could be improved by decreasing the frequency of polls, and some empirical testing with end-users of our prototypes shows that anything below 4 seconds for an update feels instantaneous. However, polling every 4 seconds still uses 400kbps (about 130GB per month) for 1,000 users. Also, advanced caching techniques will be required, as there is no way MySQL can handle the authorization and status checks required for much more than 1,000 hits per second. Maybe it's time to investigate other options.
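The polling figures can be re-derived with a few lines of Ruby. This is only a sketch of the arithmetic: traffic_estimate is a name made up for this example, the 200 byte payload and 1,000 users are the figures assumed above, and a month is taken as 30 days with 1GB as 10^9 bytes.

```ruby
# Rough cost of N clients each exchanging a fixed payload at a fixed interval.
# Assumptions for this sketch: 30-day month, 1 GB = 10^9 bytes, and no
# allowance for TCP/IP overhead.
def traffic_estimate(users:, interval_s:, payload_bytes:)
  hits_per_s  = users.to_f / interval_s
  bytes_per_s = hits_per_s * payload_bytes
  {
    hits_per_s:   hits_per_s,
    kbps:         bytes_per_s * 8 / 1_000,
    gb_per_month: bytes_per_s * 86_400 * 30 / 1e9
  }
end

every_second = traffic_estimate(users: 1_000, interval_s: 1, payload_bytes: 200)
every_four   = traffic_estimate(users: 1_000, interval_s: 4, payload_bytes: 200)

puts every_second  # 1,000 hits/s, 1,600 kbps, about 518 GB per month
puts every_four    # 250 hits/s, 400 kbps, about 130 GB per month
```

Halving the payload or doubling the interval halves every figure, which is why polling less often helps but never removes the cost of the empty responses.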
The other type of client/server communication is server push. Instead of the client repeatedly asking the server for a new status, with 90% of the responses being "NOT YET!!", the client connects once and the server sends updates on whatever the client is listening for. At this point I hear you all telling me that HTTP and the web just don't work this way; it is a PULL-only technology. Well, some of you might be surprised to know that server-PUSH functionality has existed since Netscape 1.1 in the form of the content-type multipart/x-mixed-replace. The trouble is, Internet Explorer doesn't seem to support this content-type anymore (it did in 3.0) and it is very hard to use with JavaScript. However, there are ways in which JavaScript may be pushed to the client using modern browsers.
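For reference, this is roughly what a multipart/x-mixed-replace response looks like on the wire. The sketch below writes one to any IO object; write_mixed_replace and the boundary string are names invented for this example, and a browser that supports the content-type replaces the displayed part each time a new one arrives.

```ruby
require "stringio"

# Arbitrary part separator chosen for this sketch.
BOUNDARY = "push-boundary"

# Writes a response in which each part replaces the previous one.
def write_mixed_replace(io, parts)
  io.write("HTTP/1.0 200 OK\r\n")
  io.write("Content-Type: multipart/x-mixed-replace; boundary=#{BOUNDARY}\r\n\r\n")
  parts.each do |body|
    io.write("--#{BOUNDARY}\r\n")
    io.write("Content-Type: text/plain\r\n\r\n")
    io.write("#{body}\r\n")
    io.flush  # hand each part to the client as soon as it exists
  end
  io.write("--#{BOUNDARY}--\r\n")  # closing boundary ends the stream
end

# Write to a StringIO so the result can be inspected without a browser.
out = StringIO.new
write_mixed_replace(out, ["status: idle", "status: busy"])
puts out.string
```

In a real server the parts would be written as the data changes rather than from a fixed array, with the connection held open in between.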
Now, the trouble with most web servers (especially when generating dynamic content) is that they don't really like pushing data to the browser without buffering the whole response first (large static files are the exception). Luckily for me, it is relatively easy today to roll your own web server. For this proof of concept I shall be using Ruby and WEBrick.
Of course, WEBrick by default follows the same pattern of, "let's cache all the data before sending it to the client." Luckily, due to the object-oriented nature of Ruby, that assumption is very easy to override. As a proof of concept, I've created a web server that uses JavaScript to add a line to the page every 3 seconds. One of the tricky things to remember is that the browser expects data at regular intervals. If it doesn't receive any within its timeout window (usually 60 seconds), it will think the connection has been closed. This example sends a single space character every second as a KeepAlive packet. The code is a little long, so you can download it here (push_server.rb).
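The idea can be sketched without WEBrick at all. The following is not push_server.rb: it is a minimal illustration built on Ruby's standard socket library, and push_updates and serve are names invented for this sketch. It writes an HTML shell once, then a script line whenever there is something new and a single space on every other tick.

```ruby
require "socket"
require "json"

# Streams to one client. `updates` yields a String (a new line to show)
# or nil (nothing new, send a keep-alive) for each tick; `tick` is the
# pause between writes in seconds.
def push_updates(io, updates, tick: 1)
  io.write("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n")
  io.write("<html><body><div id=\"log\"></div>\n")
  updates.each do |line|
    if line
      js = "document.getElementById('log').innerHTML += #{(line + '<br>').to_json};"
      io.write("<script>#{js}</script>\n")
    else
      io.write(" ")  # keep-alive so the browser does not time out
    end
    io.flush         # send now instead of buffering the whole page
    sleep(tick)
  end
end

# Hypothetical driver: one thread per client, a new line every third tick.
def serve(port = 2000)
  server = TCPServer.new(port)
  loop do
    Thread.new(server.accept) do |client|
      begin
        # Read and discard the request headers before replying.
        while (header = client.gets) && header != "\r\n"; end
        ticks = (1..Float::INFINITY).lazy.map { |n| (n % 3).zero? ? "Line #{n / 3}" : nil }
        push_updates(client, ticks)
      rescue Errno::EPIPE, Errno::ECONNRESET
        # the client closed the page
      ensure
        client.close
      end
    end
  end
end
```

Calling serve and opening http://localhost:2000/ should show a new line appearing every three seconds. A thread per client keeps the sketch short; it is not how you would hold tens of thousands of connections open.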
Now, of course, this doesn't solve the problem of working out when there are new statuses to update the client with, but it does provide an example of a highly efficient server that can easily handle tens of thousands of connections (with the correct file-descriptor limits on your server) with very little load when the data-set isn't changing. Also, it reduces the bandwidth usage down to the KeepAlive packets (1 byte every 10 seconds for 1,000 users is 100 bytes per second: 800bps, or about 260MB per month, not counting TCP/IP packet overhead).
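Since every held-open client costs the server process one file descriptor, it is worth checking the limit before expecting tens of thousands of connections. A small sketch for Unix-like systems follows; max_push_clients and its reserve figure are made up for this example.

```ruby
# Each open connection uses one file descriptor, so the per-process limit
# caps the number of simultaneous push clients.
# `reserve` (descriptors kept back for logs, database sockets and so on)
# is an arbitrary figure for this sketch.
def max_push_clients(reserve: 64)
  soft, _hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  [soft - reserve, 0].max
end

soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "soft limit: #{soft}, hard limit: #{hard}"
puts "room for roughly #{max_push_clients} push clients"
```

If the soft limit is too low, it can usually be raised up to the hard limit with ulimit -n in the shell that starts the server.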
To start the server just run ruby push_server.rb. The server will start on port 2000 and you can access the example page via http://localhost:2000/hold.
The next article will be about putting a scalable observer/listener layer on top of this HTTP push server connected to our back-end status database.
Ruby on Rails - ActiveRecord#build_from_xml function
I was playing with the new to_xml feature of Ruby on Rails and I found myself wondering... if you can create XML from ActiveRecord objects, why can't you create ActiveRecord objects from XML?
After searching for a while in the RoR documentation I wasn't able to find the inverse of to_xml. So now, it seems, I have an opportunity to contribute back to the Rails community with a functional improvement of my own. I give you the build_from_xml method for ActiveRecord.
Just place the code below in your config/environment.rb file.
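require "rexml/document"

module ActiveRecord
  class Base
    # Builds a new, unsaved record from XML shaped like the output of to_xml.
    # Accepts an XML string, a REXML::Document or a REXML::Element.
    def self.build_from_xml(xml)
      xml = REXML::Document.new(xml) if xml.is_a?(String)
      root = xml.is_a?(REXML::Document) ? xml.root : xml
      ar = new
      root.elements.each do |ele|
        sym = ele.name.underscore.to_sym
        if ele.has_elements?
          # An association: build one record per child element, so every
          # <client> under <clients> is kept, not just the first
          klass = reflect_on_association(sym).klass
          ele.elements.each { |child| ar.__send__(sym) << klass.build_from_xml(child) }
        else
          # An attribute
          ar[sym] = ele.text
        end
      end
      ar
    end
  end
end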
You can call this on any ActiveRecord model class. Here is an example.
This Ruby code:
firm_xml = File.read("firm_data.xml")
firm = Firm.build_from_xml(firm_xml)
Will convert this XML file into a fully functional ActiveRecord object, including the associations.
<firm>
  <rating type="integer">1</rating>
  <name>37signals</name>
  <clients>
    <client>
      <rating type="integer">1</rating>
      <name>Summit</name>
      <id type="integer">1</id>
      <firm-id type="integer">1</firm-id>
    </client>
    <client>
      <rating type="integer">1</rating>
      <name>Microsoft</name>
      <id type="integer">2</id>
      <firm-id type="integer">1</firm-id>
    </client>
  </clients>
  <accounts>
    <account>
      <id type="integer">1</id>
      <firm-id type="integer">1</firm-id>
      <credit-limit type="integer">50</credit-limit>
    </account>
  </accounts>
  <id type="integer">1</id>
</firm>
You may have noticed one caveat. This function only accepts well-formed XML that conforms to your model. If the XML doesn't conform, the results may be unpredictable, although in most non-trivial error cases it will probably raise the usual ActiveRecord exceptions. Oh, and it requires REXML, but you knew that already, right?
I will probably convert this to a plugin in the not-too-distant future. That is, if the code isn't included in Rails' release branch (hint, hint).
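To see how the traversal tells attributes from associations without needing a database, here is a sketch that applies the same rule with plain hashes. record_from_xml is a hypothetical helper, not part of Rails: an element with child elements is treated as a collection, anything else as an attribute.

```ruby
require "rexml/document"

# Hypothetical helper: walks an element the way build_from_xml does, but
# collects the result in plain hashes instead of ActiveRecord objects.
def record_from_xml(element)
  record = {}
  element.elements.each do |ele|
    key = ele.name.tr("-", "_").to_sym
    if ele.has_elements?
      # A collection: one nested record per child element
      record[key] = []
      ele.elements.each { |child| record[key] << record_from_xml(child) }
    else
      # A plain attribute; the text is kept as a String
      record[key] = ele.text
    end
  end
  record
end

xml = <<~XML
  <firm>
    <name>37signals</name>
    <clients>
      <client><name>Summit</name><firm-id type="integer">1</firm-id></client>
      <client><name>Microsoft</name><firm-id type="integer">1</firm-id></client>
    </clients>
  </firm>
XML

firm = record_from_xml(REXML::Document.new(xml).root)
puts firm[:name]
puts firm[:clients].map { |client| client[:name] }.join(", ")
```

Note that the type="integer" hints are ignored here and every value stays a string; in the ActiveRecord version the column definitions take care of casting.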
Good vs Great software developers.
I was recently asked an interesting question: do I perceive myself to be a great developer? As someone who has previously been responsible for employing technical staff, here are my opinions on the differences between good developers and absolutely great developers:
- Language is unimportant:
Programming languages are all very much the same and have some common roots.
A good programmer will learn a number of these languages to allow him to be more flexible for his employers.
A great programmer will learn and understand the root language and semantics (like Latin for most European languages) and be able to adjust quickly to other languages as his employer requires.
- Continual self development:
The IT industry is a fast-paced beast. Ruby on Rails only reached version 1.0 in 2005, yet it has revolutionised the way powerful web-based applications can be developed.
A good developer will stay abreast of news and technologies that most affect him (usually within the technologies he's already familiar with).
A great developer will keep informed about as much as possible on as much as possible. Not just IT-related news, but he needs to know what's happening in the rest of the world. A great developer will also continually evaluate and test how these technologies can be used to solve the issues he's faced with on a day-to-day basis. Agility is the great developer's best friend.
- Problem solving:
There is so much more to creating applications than writing code. Most problems don't lend themselves to being easily mapped to solutions. Of course there are usually many different solutions to the same problem, some more correct/elegant/efficient than others.
A good developer will break a large problem into many parts and solve each part individually following a set of standards/rules.
A great developer will also do this, but will be able to take the complete problem-set into account when deciding on solutions to the constituent parts. Also, due to the great developer's large knowledge base, his solutions tend to be more correct/elegant/efficient than those of the good developer.
- Social/communication skills:
Developers (especially web-based developers) are ultimately developing applications to be used by people.
A good developer understands this and has empathy for the end user when developing.
A great developer will involve the user in his design and development process. He will stay in constant contact with the end-user and attempt to provide as much of the core value required from the application as quickly as possible.
Well, that's just my opinion of course. I'm sure I've missed a number of qualities others may feel are more important than these, or listed some that others don't agree with. If you're a great developer, why not drop me a line and we can chat over a hot pot of Ruby code.
