Handling APIs with Ruby XML Parsing

Posted by acts_as_flinn Sat, 26 Jan 2008 03:08:00 GMT

Have you ever wanted (or needed) to write a Ruby wrapper for an XML-based API?

If you have a burning desire to seamlessly exchange data over the web or you just want to use the latest Interweb 2.0 service – you’re most likely contemplating writing your API client in Ruby… yes that’s why you’re here isn’t it?

Why Use Ruby to Parse XML?

It’s a fact – Ruby kicks ass at parsing XML. You can find tons of examples of XML API clients written in Ruby.

ActiveResource parses XML and handles RESTful HTTP

The new ActiveResource gem found in Rails 2 makes pretty light work of handling XML APIs via REST. Unfortunately not everyone is rushing to support REST just yet. If you support a big Rails 1.2 app you can’t just run out and add ActiveResource to your project (which is my case). This post is not about REST or ActiveResource, so if you’re looking for that, click the link you just skipped over.

Enough with the useless Ruby XML facts…

Show me how to write a Ruby XML API wrapper

I’ve been working with an API recently for a third party registration system on a project we’re rolling out soon. The third party provides their API using domain scoped query URLs and HTTP GET params and returns XML documents.

Huh?

http://example.com/aaflinn/lookup?username=billlumberg&password=swingline
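In case the URL anatomy isn't obvious, here is a minimal sketch of how a lookup URL like that could be assembled in Ruby. The `lookup_uri` helper is a name I made up for illustration, and it uses `URI.encode_www_form` from the standard library as a stand-in for the `CGI::escape` calls the wrapper further down uses.

```ruby
require 'uri'

# Hypothetical helper (not part of the wrapper): builds the lookup URL for an
# account name and a hash of query parameters.
def lookup_uri(account, params)
  # encode_www_form escapes every key and value and joins them with & and =
  query = URI.encode_www_form(params)
  URI::HTTP.build(:host => 'example.com', :path => "/#{account}/lookup", :query => query)
end

uri = lookup_uri('aaflinn', :username => 'billlumberg', :password => 'swingline')
puts uri
```

The account name goes in the path (the "domain scoped" part) and everything else rides along as plain GET params.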

XML Messages

When you get a matching user/pass combination you get something like this.

<?xml version="1.0" encoding="ISO-8859-1"?>
<auth>
    <user>
        <username><![CDATA[billlumberg]]></username>
        <fullname><![CDATA[Bill Lumberg]]></fullname>
        <zipcode><![CDATA[92131]]></zipcode>
        <email><![CDATA[bill.lumberg@initech.com]]></email>
    </user>
</auth>

When you put the wrong password you get an error like so.

<?xml version="1.0" encoding="ISO-8859-1"?>
<auth>
  <error><![CDATA[Invalid username/password combination]]></error>
</auth>

If you don’t enter a username at all you’ll get an error thusly.

<?xml version="1.0" encoding="ISO-8859-1"?>
<auth>
  <error><![CDATA[No username given.]]></error>
</auth>

Pretty easy, no?
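To show how little it takes to tell those documents apart, here is a rough sketch using REXML. `parse_auth` is a hypothetical helper, the sample strings are trimmed versions of the messages above (without the XML declaration), and raising `ArgumentError` is just a placeholder for whatever error class you prefer.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of the wrapper): returns the user fields as a
# hash, or raises with whatever message the API put in its <error> node.
def parse_auth(xml)
  doc = REXML::Document.new(xml)
  user = doc.elements['//user']
  if user
    fields = {}
    # each child element of <user> becomes one key in the hash
    user.elements.each { |node| fields[node.name.to_sym] = node.text }
    fields
  else
    error = doc.elements['//error']
    raise ArgumentError, (error ? error.text : 'unknown error')
  end
end

ok  = '<auth><user><username><![CDATA[billlumberg]]></username>' \
      '<zipcode><![CDATA[92131]]></zipcode></user></auth>'
bad = '<auth><error><![CDATA[No username given.]]></error></auth>'

puts parse_auth(ok).inspect
begin
  parse_auth(bad)
rescue ArgumentError => e
  puts "API said: #{e.message}"
end
```

REXML hands back the CDATA content through `text`, so the wrapper never has to care that the values are wrapped in CDATA sections.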

API Wrapper Concepts

Here are some concepts that I felt were important when I started writing my wrapper.

  1. Simple – it should be easy to code (I hate writing stupid code)
  2. DRY – if it’s worth writing the wrapper make sure it’s reusable
  3. Self Documenting – it should be as self documenting as possible (rdoc)
  4. “Exceptional Code” – it should raise errors on exceptions and handle errors from the libraries it makes use of
  5. Don’t Spam – Don’t abuse APIs (grrr!)

Requirements

The wrapper should use a GET query string to perform a query to the service provider passing an md5 hashed password and username combination. If the user exists parse the XML result document and return an instantiated user object based on the XML. If no user exists raise some type of rescuable error (RecordNotFound).
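The hashing step is tiny. Here is a sketch of it, assuming the provider wants the raw MD5 digest Base64 encoded (which is what the wrapper below does). I use `pack('m0')` here, which for a 16 byte digest should give the same string as `Base64.encode64(...).strip`.

```ruby
require 'digest/md5'

# MD5 the plain text password, then Base64 encode the 16 raw digest bytes.
# pack('m0') is Base64 without line breaks.
def md5password(password)
  [Digest::MD5.digest(password)].pack('m0')
end

hashed = md5password('swingline')
puts hashed
puts hashed.length # 16 bytes always encode to 24 Base64 characters
```

Keep in mind that an unsalted MD5 is still easy to replay if someone sniffs the URL, so this only keeps the clear text password out of logs.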

Additionally the wrapper should be able to create a new user. This particular service provider uses an HTTP GET query string but in some cases you might find a plain old POST like you’d see in a form or you’ll need to build XML and pass that back to the service provider (which I am not doing here). The wrapper should be able to interpret error messages and determine the status of our create request in a graceful way.
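For the create request, the wrapper below works out what happened by looking at the redirect URL the provider sends back. A minimal sketch of that idea follows. The `registration_status` helper and the example URLs are my own inventions, and `URI.decode_www_form` stands in for the `CGI::parse` call used in the real code.

```ruby
require 'uri'

# Hypothetical helper: classifies a redirect Location against our two URLs and
# collects any error messages passed back in the query string.
def registration_status(location, success_url, fail_url)
  if location.start_with?(fail_url)
    query = URI.parse(location).query.to_s
    errors = URI.decode_www_form(query).select { |key, _| key == 'error' }.map { |_, value| value }
    [:failed, errors]
  elsif location.start_with?(success_url)
    [:created, []]
  else
    [:unknown, []]
  end
end

status, errors = registration_status(
  'http://example.com/fail?error=Username+taken&error=Bad+email',
  'http://example.com/ok',
  'http://example.com/fail'
)
puts status
puts errors.join(', ')
```

Returning a status plus a list of messages keeps the decision about raising (or not) in one place in the caller.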

Ruby API Wrapper by Example

The names have been changed to protect the innocent. I’ve edited the wrapper a bit to reduce some complexity and renamed it to hide the actual API provider. Read on in the comments of the code: I’ve covered everything on that bullet list, and the comments walk through just about every line.

# =Example API Ruby Wrapper=
#
# Usage
#
# === setup ===
#   require 'ApiExample'
#   ApiExample::account = 'test'
#   ApiExample::logger = Logger.new('example.log')
#
# ===Find A User===
# user = ApiExample::User.find('bill.lumberg', 'swingline')
#
# ===Create A User===
# user = ApiExample::User.create(:username => 'bill.lumberg', :password => 'swingline', :email => 'bill.lumberg@initech.com')
#

# mattr_accessor and blank? come from ActiveSupport, which Rails loads for you
require 'base64'
require 'digest/md5'
require 'net/http'
require 'rexml/document'
require 'cgi'
require 'logger'

module ApiExample
  # Example Database Host
  HOST = 'www.example.com'

  # These params are required to register a new user
  REGISTRATION_PARAMS = [ :username, :password, :email ]

  # ApiExample account (brand account not user account)
  @@account = nil
  @@success_url = nil
  @@fail_url = nil
  @@logger = Logger.new(STDERR)

  mattr_accessor :account, :logger, :success_url, :fail_url

  # Exception handling
  class ApiExampleError < StandardError; end
  class UnexpectedError < ApiExampleError; end
  class RegistrationError < ApiExampleError; end
  class RecordNotFound < ApiExampleError; end

  # Module-level method so it can be called as ApiExample::md5password
  def self.md5password(password)
    Base64.encode64(Digest::MD5.digest(password)).strip
  end

  # Using Struct here allows us to make our object act similar to an
  # ActiveRecord object.  In this example it’s not so obvious how much of a pain
  # in the ass the alternative is, but the real class has about 20 or so attributes.
  # Using Struct means I don’t have to write attribute readers and writers.
  # Attribution – I got this idea from Ben Vinegar’s freshbooks gem
  # (http://rubyforge.org/projects/freshbooks/)

  User = Struct.new(:username, :email, :password, :fullname, :zipcode)

  # Extend the Struct attributes by adding the class and instance methods we want
  class User
    attr_accessor :attributes, :errors, :new_record

    # class method for creating a new user
    # Usage:
    # user = ApiExample::User.create(:username => 'bill.lumberg', :password => 'swingline', :email => 'bill.lumberg@initech.com')
    #
    def self.create(attributes = {})
      object = new(attributes)
      object.create
      object
    end

    # class method for finding an existing user
    # Usage:
    # ApiExample::User.find('bill.lumberg', 'swingline') # the plain text password is sent md5 hashed rather than in clear text
    #
    def self.find(username, password)
      query_params = { :username => username, :password => ApiExample::md5password(password) }
      query = query_params.collect{ |k, v| [k, v].map{ |kv| CGI::escape(kv.to_s) }.join('=') }.join('&')

      # @@account doesn’t need to be encoded because we are setting internally, the other stuff is vulnerable
      uri = URI::HTTP.build(:host => ApiExample::HOST, :path => "/#{ApiExample::account}/lookup", :query => query)

      # I’ve got a bunch of these loggers throughout which are used to debug, remove as you see fit.
      ApiExample::logger.debug("URI: #{uri}")

      # Make the actual HTTP Request
      result = Net::HTTP.get_response(uri)

      # Parse the XML Result of the HTTP Request
      response = REXML::Document.new(result.body)

      ApiExample::logger.debug("RESPONSE: #{response}")

      # Check to see if there is a user node, if there’s not raise an error similar to ActiveRecord
      unless response.elements['//user'].nil?
        attributes = {}

        # Members is the Struct method for the attributes we setup…they correspond to what
        # we expect the XML to return, so lets only handle what we know
        members.each do |field_name|
          node = response.elements['//user'].elements[field_name.to_s]
          next if node.nil?

          attributes[field_name.to_sym] = node.text # you can do casting here if you need, I did
        end

        object = allocate
        object.attributes = attributes
        object
      else
        # Raise an error, setting the message to what the API sets as the error in XML
        # if there is no ‘error’ node in the XML set an ‘unknown error’ message
        raise RecordNotFound, (response.elements['//error'].text rescue "Couldn’t find ApiExample::User with Username: #{username} because of an unknown error!")
      end
    end

    # instance method to setup new object ala ActiveRecord
    # Usage:
    # user = ApiExample::User.new(:username => 'bill.lumberg')
    #
    def initialize(attributes = {})
      @new_record = true
      self.attributes = attributes
    end

    # instance method ala ActiveRecord
    #
    def errors
      @errors ||= []
    end

    # instance method to get an attributes hash ala ActiveRecord
    #
    def attributes
      butes = {}
      members.each{ |member| butes[member.to_sym] = self.send(member) } # you can cast here, I did
      butes
    end

    # instance method to set attributes via a hash ala ActiveRecord
    #
    def attributes=(new_attributes = {})
      members.each do |member|
        # only touch the attributes that were actually passed in
        if new_attributes.has_key?(member.to_sym)
          # #{member}= because you might possibly write an attribute writer to handle the input to =
          self.send("#{member}=", new_attributes[member.to_sym]) # you can cast here, I did
          ApiExample::logger.debug("#{member}: #{self.send(member).inspect}") # for snooping
        end
      end
    end

    # instance method to create the newly instantiated object ala ActiveRecord
    # note the difference between create and self.create is the same as in ActiveRecord
    # Usage:
    # user = ApiExample::User.new(:username => 'bill.lumberg')
    # user.password = 'swingline'
    # user.email = 'bill.lumberg@initech.com'
    # user.create
    #
    def create
      # we pass our success and fail URLs even though we aren’t using them for their intended purpose
      query_params = { :success => ApiExample::success_url, :fail => ApiExample::fail_url }.merge(attributes)

      ApiExample::logger.debug(query_params.inspect) # for snooping

      # check to make sure all required params are present; attributes always has a key
      # for every Struct member, so test the values rather than the keys
      missing_params = ApiExample::REGISTRATION_PARAMS.select{ |param| query_params[param].blank? }

      if missing_params.empty?
        # URL Encode everything
        query = query_params.collect{ |k, v| [k, v].map{ |kv| CGI::escape(kv.to_s) }.join('=') unless v.blank? }.compact.join('&')

        # @@account doesn’t need to be encoded because we are setting internally, the other stuff is vulnerable
        uri = URI::HTTP.build(:host => ApiExample::HOST, :path => "/#{ApiExample::account}/register", :query => query)

        ApiExample::logger.debug("URI: #{uri}") # for snooping

        # Make the actual HTTP Request
        response = Net::HTTP.get_response(uri)

        ApiExample::logger.debug("RESPONSE: #{response}") # for snooping

        # We expect a redirect back; a missing Location header is treated as an empty string
        location = response['location'].to_s

        # Check for failure and errors
        if location.include?(ApiExample::fail_url)
          # Raise one error carrying every message the API sent back
          raise RegistrationError, CGI::parse(URI::parse(location).query.to_s)['error'].join(', ')
        elsif location.include?(ApiExample::success_url)
          # Change the status to reflect the fact that we’ve saved the object…this could get much more
          # in-depth in terms of ensuring our data was correctly saved… you wouldn’t expect to do
          # that kind of thing in an RDBMS so why here?
          self.new_record = false
        else
          # it wasn’t the fail url, it wasn’t the success url so what the hell was it?
          raise UnexpectedError, "Unknown response URL!"
        end
      else
        # For each missing required param add an error message
        missing_params.each do |missing_param|
          self.errors << "#{missing_param} can’t be blank."
        end
      end
    end
  end
end
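If the Struct trick got lost in all that, here it is in isolation. This is only a sketch: `Person` and its two fields are made up for the example, and it leaves out the logging and casting the real wrapper does.

```ruby
# Hypothetical model; the field names are illustrative only.
Person = Struct.new(:username, :email) do
  # Accept a hash ala ActiveRecord instead of Struct's positional arguments.
  def initialize(attributes = {})
    super() # start with every member set to nil
    self.attributes = attributes
  end

  # All members as a hash.
  def attributes
    members.each_with_object({}) { |member, hash| hash[member] = self[member] }
  end

  # Assign only the keys that are real members; ignore anything else.
  def attributes=(new_attributes)
    new_attributes.each do |key, value|
      self[key] = value if members.include?(key.to_sym)
    end
  end
end

person = Person.new(:username => 'bill.lumberg', :shoe_size => 11)
person.email = 'bill.lumberg@initech.com'
puts person.username
puts person.attributes.keys.inspect
```

Struct gives you the readers, writers and `members` list for free, so adding a twentieth attribute is a one word change.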

Ruby on Rails, Logic and best practices - I miss you.

Posted by acts_as_flinn Sat, 01 Sep 2007 04:52:00 GMT

So those of you following my blog know that after two years of being a full time Rubyist I’ve been tossed back into the cruel and unusual work of PHP development. I’ve really had to think long about my own development history when passing judgement on the current code base I am working with. What I’ve come to find is that the things I love about Ruby on Rails are logic, reason and proven best practices. What I dislike about PHP development is preference, opinion and tyranny.

“What do you mean by that?” What I mean is just this. Ruby on Rails has been at the forefront of emerging technologies and is consistently endorsed by well respected leaders in development technology like ThoughtWorks and countless innovators in the industry. It has introduced professional development techniques to the masses in ways that are unobtrusive and encourage thoughtful progressive improvement. What I like most of all – the things that have made it into Rails have been well thought out features that are supported by logic and reason NOT opinion.

What I’m finding in my return to PHP is that people just do things because they like it this way or that way. It’s a matter of personal preference. There is a complete lack of standards on how best to get something done. That usually means brute force wet (!DRY) code that does the job with all the style and grace of a bull in a china shop. I also think you see tyranny over reason in the PHP world because it’s a matter of preference. When you boil it down, Ruby seems to be a philosophy whereas PHP is just a tool.

If you don’t care about efficiency stop reading now. Some readers might have caught that last bit “does the job” and said “well it does the job.” Yes, it does the job for a while…then you change something…then it breaks something unintentionally. But you don’t know that it broke because you have no way of testing other than to put a person in front of a web browser or, worse yet, wait for a support ticket from your customer. Then you have no way of refactoring because everything is so fucking wet that your refactoring is actually rewriting from the ground up. If you’re grumbling – I told you to stop reading.

Modern Design in PHP

Posted by acts_as_flinn Sun, 19 Aug 2007 04:57:00 GMT

Yesterday I finished my first week at my new job. Going back to the PHP world, it is somewhat shocking to see a complete lack of standardized coding methods. I spent some time this week getting to know the project I’ll be helping out on for the next several weeks. The code originates from an overseas outsourcing company with absolutely no standards, no knowledge of design patterns and no concept of writing DRY code.

I find it difficult sometimes to get the concepts I’ve picked up during Ruby development across to PHP programmers. Since most people self-learn PHP there is a tendency to do a lot of brute force coding, copying and pasting and doing stupid stuff like using template languages when there is no need to use templates. Because they self-learn, modern software engineering philosophies and management techniques that are progressing the state of the art are virtually unheard of by many PHP developers. Let’s explore some of the things PHP developers could be doing to make their lives better, save their employers money and maybe get a raise.

Views Instead of Templates

In the PHP world there is this mysterious draw toward templating using engines like Smarty, Flexy (which I have been guilty of), Blitz or any of the other engines available. I think this draw is the experience of the developer wanting to abstract and separate the presentation layer. It’s a natural progression that comes from experience – the kind of experience you get from PHP’s free style – so free that you often see SQL statements, and HTML markup embedded within the same PHP script.

So why use a template engine in the first place? PHP is a template engine. The only valid reason I have ever found is for applications that allow either designers or end users to change the front end presentation layer – and even then I’m skeptical what with the whole existence of CSS…but I can understand the level of flexibility templates allow a designer. A good example that I have used is Shopify’s use of the Liquid Markup Language, which was created for Shopify. Anyway, back to the subject.

If there is no reason for a third party designer to change the presentation layer there is almost no need for a template language, but you see templating in PHP frequently used as a way to force code separation.

So PHP coders pay attention… Forget the template engines unless you really need them. They only indirectly force you to do code separation, they slow down your application, they make you learn a bunch of additional syntax and they limit your ability to present what you want to in PHP. Just choose to do code separation from the design phase of your application. Keep the power of PHP available in your presentation layer views and skip learning a pointless template language.

Testing, Testing, 1. 2. 1. 2.

How do we test in PHP? Write a script, hit the browser and look all over the place to see if what you wrote worked. WRONG. Well, not really – most PHP developers do just that. No automated unit testing or integration testing, and most have never heard of it. I’ll be the first to admit that when I made the leap from PHP to Ruby I had never heard of unit testing and didn’t incorporate testing until after at least a year of Rails development. I’m also the biggest advocate for automated testing.

People often resist writing automated tests because they can be time intensive at the onset of a project. It’s often a matter of “we don’t have enough time or funds to spend writing tests.” The proponents of this argument are usually the ones in favor of brute force eye ball tests. “Let’s make a change and cross our fingers hoping it doesn’t break anything we’ve already eyeballed.”

Well that just doesn’t make any sense! It’s a business decision that budget managers should understand. Why spend more time eyeballing and crossing your fingers when PHP has unit tests, continuous integration tests and other great methods? Managers should be asking their developers why they aren’t using PHP’s automated testing facilities!

Data Modeling and Class Design

I do my little turn on the cat walk, yeah on the cat walk… Modeling seems to be an esoteric subject for some PHP developers. In a lot of database schemas I repeatedly see weird table layouts, aggregate data storage and lots of abbreviations. In Ruby on Rails the convention is to model tables based on object usage. We use the Active Record design pattern. PHP developers often think ActiveRecord and Active Record are one and the same – they’re not.

Active Record the design pattern is a description for creating classes that relate to a row in a database, provide static finder methods and implement an object that both provides an interface to data and the methods to store, destroy and modify the data. The pattern is impossible to implement simply in PHP. But that doesn’t mean it can’t be used to the fullest of PHP’s ability or that you can’t implement the next best data design pattern.
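To make that description concrete, here is a bare bones sketch of the pattern in plain Ruby. It is not the ActiveRecord library: `Contact` is a made up class and a hash called `ROWS` stands in for the database table.

```ruby
class Contact
  ROWS = {} # in-memory stand-in for a database table, keyed by id

  attr_accessor :id, :name

  def initialize(attributes = {})
    @id = attributes[:id]
    @name = attributes[:name]
  end

  # Static finder: builds an object from one stored row.
  def self.find(id)
    row = ROWS.fetch(id) # raises KeyError when the row is missing
    new(row.merge(:id => id))
  end

  # Inserts or updates the row backing this object.
  def save
    self.id ||= (ROWS.keys.max || 0) + 1
    ROWS[id] = { :name => name }
    true
  end

  # Removes the row backing this object.
  def destroy
    ROWS.delete(id)
    self
  end
end

contact = Contact.new(:name => 'Initech')
contact.save
puts Contact.find(contact.id).name
```

One class, one row per object, and the object itself knows how to find, store and destroy its data. That is the whole pattern.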

Why use abbreviations in database tables? Is it really that convenient to abbreviate the hell out of every variable or attribute name? One of the best things you get in Rails is string inflection (huh?). That means we can take a string like “phone_number” and call a method like titleize, resulting in a human readable field label, “Phone Number” (humanize gets you “Phone number”) – using this is a no brainer. It also makes it a lot less tempting to use variable names like “phonum”...but that’s really a no brainer too.
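You don't even need Rails to get the idea. Here is a minimal plain Ruby sketch of that kind of inflection. `field_label` is a hypothetical helper of mine, not the Rails method, and it only handles simple underscore separated names.

```ruby
# Hypothetical helper: turns an underscored column name into a form label.
def field_label(name)
  name.to_s.split('_').map { |word| word.capitalize }.join(' ')
end

puts field_label('phone_number')        # Phone Number
puts field_label(:client_phone_number)  # Client Phone Number
```

Try getting a sensible label out of “phonum” the same way and the case for spelling things out makes itself.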

What looks better to the human eye? “phonum” or “phone_number” – you can see where this is going… ”$phonum”, ”$cli_phonum”, ”$client_phonum”, ”$client_phone_number”, ”$client->phone_number”. So if you were working in a team, what convention works best: esoteric abbreviations or human readable variable names that any programmer can jump into without a secret decoder ring?

If you’ve got a form full of aggregate information keep it separate. This is an easy one – when you’re on the receiving end of a post why reassemble your information?

HTML


<input type="text" name="client_name" id="client_name" value="Joe Smalltime &amp; Son" />
<input type="text" name="client_email" id="client_email" value="" />
<input type="text" name="client_login" id="client_login" value="" />
...

PHP


$client = array();
$client['name'] = $_POST['client_name'];
$client['email'] = $_POST['client_email'];
$client['login'] = $_POST['client_login'];
...

Or do nothing..

HTML


<input type="text" name="client[name]" id="client_name" value="Fortune 500 Corp." />
<input type="text" name="client[email]" id="client_email" value="" />
<input type="text" name="client[login]" id="client_login" value="" />
...

PHP


$client = $_POST['client'];

Do what you mean, and only that.

Why make things more complex than they need to be? If you mean to take information from the web and save it, do just that. Where should we validate information: in every script that needs to accept and save it, or in the one place it actually matters? Why validate manually in every script when we can set our validation conditions at the class level and then do what we mean? What am I talking about?


if (valid_client_info($_POST['client_info'])) {
  save_client_info($_POST['client_info']);
} else {
  throw new Exception("Your client info sucks");
}

Or we could do this…


  $client = new Client($_POST['client']);
  if ($client->save()) {
    $this->render_string("Hazaa!");
  } else {
    throw new Exception("argh!");
  }
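For the Rubyists following along, here is roughly what "validation at the class level" could look like, as a plain Ruby sketch. The `Client` class, its `REQUIRED` list and the error wording are all assumptions for illustration; the actual persistence is left as a comment.

```ruby
class Client
  REQUIRED = [:name, :email] # the one place the rules live

  attr_reader :attributes, :errors

  def initialize(attributes = {})
    @attributes = attributes
    @errors = []
  end

  # Rebuilds the error list and reports whether the object may be saved.
  def valid?
    @errors = REQUIRED.select { |field| attributes[field].to_s.strip.empty? }
                      .map { |field| "#{field} can't be blank" }
    @errors.empty?
  end

  # Returns false instead of saving when validation fails.
  def save
    return false unless valid?
    # persistence would go here
    true
  end
end

client = Client.new(:name => 'Fortune 500 Corp.', :email => '')
if client.save
  puts 'Hazaa!'
else
  puts client.errors.join(', ')
end
```

Every script that saves a client now gets the same checks without repeating a single one of them.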

Conclusion

PHP developers can learn a lot from Ruby on Rails. You really don’t have to use Ruby to take advantage of modern design techniques, but what a lot of people fail to realize about Rails is the fact that most of the conventions weren’t designed just for Ruby – Rails is a laundry list of best practices for web development. Some of the best developers in the world are using Rails because of its best practices that enable rapid development and get you focused on the business logic, not the repetitive tasks you face every project.
