Sunday, October 13, 2013

Closures in Ruby

Closures are anonymous functions with closed scope. I'll dive into each of these concepts in more detail as I go, but first, it's best to oversimplify. Think of a closure as a code block that you can use as an argument, with special scoping rules. I'll use Ruby to show how closures work. Listing 1 shows the simplest possible closure:

Listing 1. The simplest possible closure
3.times {puts "Inside the times method."}

Inside the times method.
Inside the times method.
Inside the times method.

times is a method on the 3 object. It executes the code in the closure three times. {puts "Inside the times method."} is the closure. It's an anonymous function that's passed into the times method and prints a static sentence. This code is tighter and simpler than the alternative with a for loop, shown in Listing 2:

Listing 2. Looping without closures
for i in 1..3
  puts "Inside the times method."
end

The first extension that Ruby adds to the simple code block is an argument list. A method or function can communicate with a closure by passing in arguments. In Ruby, you write the arguments as a comma-separated list between | characters, such as |argument, list|. Using arguments in this way, you can easily build iteration into data structures such as arrays. Listing 3 shows an example of iterating over an array in Ruby:

Listing 3. Using closures with collections
['lions', 'tigers', 'bears'].each {|item| puts item}


The each method is just one way to iterate. Often, you want to produce a new collection with the results of an operation. In Ruby, the method that does this is called collect. You may also want to join the contents of an array with some arbitrary string. Listing 4 shows both. These are just two more of the many ways that you can use closures to iterate.

Listing 4. Passing arguments to a closure
animals = ['lions', 'tigers', 'bears'].collect {|item| item.upcase}
puts animals.join(" and ") + " oh, my."

LIONS and TIGERS and BEARS oh, my.

In Listing 4, the first line of code takes each element of an array, calls the closure on it, and then builds a collection with the results. The second concatenates all of the elements into a sentence, with " and " between each one. So far, I've shown you nothing more than syntactic sugar. You can do all of these things in any language.
From the examples so far, you can see that an anonymous function is simply a function, without a name, that is evaluated in place and determines its context based on where it is defined. But if the only difference between languages with closures and those without were a little bit of syntactic sugar -- the fact that you don't need to declare the function -- there wouldn't be much debate. The advantages of closures go beyond saving a few lines of code, and the usage patterns go beyond simple iteration.
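To make the "function without a name" idea concrete, here is a minimal sketch that stores an anonymous function in a variable and calls it later. The names shout, greeting, and shouted are only illustrative and do not come from the listings above.

```ruby
# An anonymous function stored in a variable. `shout` names the variable,
# not the function itself.
shout = lambda {|word| word.upcase + "!"}

greeting = shout.call("lions")
puts greeting

# A block attached to a method call is closely related: code handed to
# the method, to be run whenever the method decides.
shouted = ['tigers', 'bears'].collect {|word| shout.call(word)}
puts shouted.join(" ")
```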
The second part of the closure is the closed scope, which I can best illustrate with another example. Given an array of prices, I'd like to generate a sales-tax table with each price and its associated tax. I don't want to hardcode the tax rate into the closure. I'd rather configure it somewhere. Listing 5 shows one possible implementation:

Listing 5. Using closures to build a tax table
tax = 0.08

prices = [4.45, 6.34, 3.78]
tax_table = prices.collect {|price| {:price => price, :tax => price * tax}}

tax_table.each {|item| puts "Price: #{item[:price]}    Tax: #{item[:tax]}"}

Price: 4.45    Tax: 0.356
Price: 6.34    Tax: 0.5072
Price: 3.78    Tax: 0.3024

Before dealing with scoping, I should explain a couple of Ruby idioms. First, a symbol is an identifier preceded by a colon. Think of a symbol as a name for some abstract concept. :price and :tax are symbols. Second, you can easily substitute variable values inside strings. The sixth line takes advantage of this technique with puts "Price: #{item[:price]} Tax: #{item[:tax]}". Now, back to the scoping issue.
Look at the first and fourth lines of code in Listing 5. The first assigns a value to the tax variable. The fourth line uses that variable to compute the tax column of the price table. But this usage is in a closure, so this code actually executes in the context of the collect method! Now you have insight into the term closure. The namespace of the environment that defines the code block and that of the function that uses it are essentially one scope: the scope is closed. This characteristic is essential. This closed scope is the communication channel that ties the closure to the calling function and the code that defines it.
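The shared scope works in both directions, which a short sketch can show. Here the closure changes a variable defined outside it, and a second closure picks up a later change to the tax rate. The names increment and with_tax are assumptions made for this illustration.

```ruby
count = 0
increment = lambda { count += 1 }   # the closure sees and changes `count`

3.times { increment.call }
puts "count is now #{count}"

# The closure works with the variable itself, not a copy of its value,
# so a later change to `tax` is visible the next time it runs.
tax = 0.08
with_tax = lambda {|price| price + price * tax}

before = with_tax.call(100.0)
tax = 0.10
after = with_tax.call(100.0)

puts "before: #{before}  after: #{after}"
```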
Customizing with closures
You've seen how to use closures as a client. Ruby also lets you write methods that use your own closures. This freedom means the Ruby API can be more compact because Ruby doesn't need to define every usage model in code. You can build in your own abstractions as needed, with closures. For example, Ruby has a limited set of built-in iterators, but the language gets along fine without more because you can build your own iterative concepts into your code through closures.
To build a function that uses a closure, you simply use the yield keyword to invoke the closure. Listing 6 shows an example. The paragraph function provides the first and last sentences of the output. The user can provide additional sentences with the closure.

Listing 6. Building methods that take closures
def paragraph
  puts "A good paragraph should have a topic sentence."
  yield
  puts "This generic paragraph has a topic, body, and conclusion."
end

paragraph {puts "This is the body of the paragraph."}

A good paragraph should have a topic sentence.
This is the body of the paragraph.
This generic paragraph has a topic, body, and conclusion.

You can easily take advantage of arguments within your custom closures by attaching a parameter list to yield, as in Listing 7:

Listing 7. Attaching a parameter list
def paragraph
  topic = "A good paragraph should have a topic sentence, a body, and a conclusion. "
  conclusion = "This generic paragraph has all three parts."

  puts topic
  yield(topic, conclusion)
  puts conclusion
end

t = ""
c = ""
paragraph do |topic, conclusion|
  puts "This is the body of the paragraph. "
  t = topic
  c = conclusion
end

puts "The topic sentence was: '#{t}'"
puts "The conclusion was: '#{c}'"
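
The same yield-with-arguments technique is enough to build an iterator of your own. The sketch below is one possible example, not part of Ruby's API: each_neighbor_pair is a hypothetical helper that hands the closure each element together with the one that follows it.

```ruby
# Hypothetical helper: yields each element together with its successor.
def each_neighbor_pair(items)
  (0...items.length - 1).each do |i|
    yield(items[i], items[i + 1])
  end
end

pairs = []
each_neighbor_pair(['lions', 'tigers', 'bears']) do |left, right|
  pairs << "#{left} then #{right}"
end

puts pairs
```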

Be careful to get the scoping right, though. The arguments that you declare within the closure are local in scope. Listing 7, for example, works, but Listing 8 would not because the topic and conclusion variables are both local in scope:

Listing 8. Incorrect scoping
def paragraph
  topic = "A good paragraph should have a topic sentence."
  conclusion = "This generic paragraph has a topic, body, and conclusion."

  puts topic
  yield(topic, conclusion)
  puts conclusion
end

my_topic = ""
my_conclusion = ""
paragraph do |topic, conclusion|     # these are local in scope
  puts "This is the body of the paragraph. "
  my_topic = topic
  my_conclusion = conclusion
end

# Raises NameError: topic and conclusion do not exist outside the block
puts "The topic sentence was: '#{topic}'"
puts "The conclusion was: '#{conclusion}'"
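
The difference can be checked directly. In this minimal sketch (the strings passed to yield are placeholders), a variable that exists before the block keeps the value assigned inside it, while the block parameter is no longer defined once the block returns.

```ruby
def paragraph
  yield("topic sentence", "conclusion sentence")
end

my_topic = ""
paragraph do |topic, conclusion|
  my_topic = topic   # my_topic exists outside the block, so this sticks
end

# `topic` was a block parameter, so it is gone once the block returns.
leaked = defined?(topic) ? true : false

puts "my_topic: '#{my_topic}'"
puts "topic visible outside the block? #{leaked}"
```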

Closures in practice
Some of the common closure scenarios are:
  • Refactoring
  • Customization
  • Iterating across collections
  • Managing resources
  • Enforcing policy
When you can build your own closures in a simple and convenient way, you find techniques that open up a whole new range of possibilities. Refactoring turns code that works into better code that works. Most Java programmers look for refactoring possibilities from the inside out. They often look for repetition within the context of a method or loop. With closures, you can also refactor from the outside in.
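As a rough illustration of refactoring from the outside in, suppose several reports share the same header and footer and differ only in the middle. The report method and its output format below are assumptions made for this sketch.

```ruby
# The shared outer part (header and footer) lives in one method...
def report(title)
  lines = ["--- #{title} ---"]
  lines << yield          # ...and the closure supplies the part that varies
  lines << "--- end ---"
  lines
end

prices = [4, 6, 3]
total = prices.inject(0) {|sum, price| sum + price}

total_report = report("total") { "Total: #{total}" }
count_report = report("count") { "Items: #{prices.length}" }

puts total_report
puts count_report
```

The repetition being removed here surrounds the varying code rather than sitting inside it, which is what makes it awkward to extract without a closure.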
Customization with closures can take you in some surprising places. Listing 9, a quick example from Ruby on Rails, shows a closure that is used to code the response of an HTTP request. Rails passes an incoming request to a controller, which should generate the data the client wants. (Technically, the controller renders the result based on the contents the client sets in an HTTP accept header.) This concept is easy to communicate if you use a closure.

Listing 9. Rendering an HTTP result with closures
@person = Person.find(id)
respond_to do |wants|
  wants.html { render :action => "show" }
  wants.xml { render :xml => @person.to_xml }
end

The code in Listing 9 is beautiful to look at, and with a quick glance, you can tell exactly what it does. If the request asks for HTML, it executes the first closure; if it asks for XML, it executes the second. You can also easily imagine the implementation. wants is an HTTP request wrapper. The wrapper has two methods -- xml and html -- that each take closures. Each method can selectively call its closure based on the contents of the accept header, as in Listing 10:

Listing 10. Implementation of the request
  def xml
    yield if self.accept_header == "text/xml"
  end

  def html
    yield if self.accept_header == "text/html"
  end
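
To try the idea outside Rails, the two methods can be placed in a small stand-in class. This is only a sketch: RequestWrapper, respond, and the rendered strings are hypothetical and are not how Rails implements respond_to.

```ruby
# Hypothetical stand-in for the request wrapper; not the Rails implementation.
class RequestWrapper
  attr_reader :accept_header

  def initialize(accept_header)
    @accept_header = accept_header
  end

  def xml
    yield if accept_header == "text/xml"
  end

  def html
    yield if accept_header == "text/html"
  end
end

# Only the closure that matches the accept header ever runs.
def respond(accept_header)
  wants = RequestWrapper.new(accept_header)
  rendered = nil
  wants.html { rendered = "<p>person</p>" }
  wants.xml  { rendered = "<person/>" }
  rendered
end

puts respond("text/html")
puts respond("text/xml")
```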

Iteration is by far the most common use of closures within Ruby, but this scenario is broader than using a collection's built-in closures. Think of the types of collections you use every day. XML documents are collections of elements. Web pages are a special case of XML. Databases are made up of tables, which in turn are made up of rows. Files are collections of characters or bytes, and often rows of text or objects. Ruby solves each of these problems quite well with closures. You've seen several examples of closures that iterate over collections. Listing 11 shows a closure that iterates over a database table:

Listing 11. Iterating through rows of a database
require 'mysql'

db = Mysql.new("localhost", "root", "password")

result = db.query "select * from words"
result.each {|row| do_something_with_row}
db.close


The code in Listing 11 also shows another possible usage pattern. The MySQL API forces users to set up the database resource and close it using the close method. You could actually use a closure to do the resource setup and cleanup instead. Ruby developers often use this pattern to work with resources such as files. Using the Ruby API, you don't need to open or close your file or manage the exceptions. The methods on the File class handle that for you. Instead, you can use a closure, as in Listing 12:

Listing 12. Working with File using a closure
File.open(name) {|file| process_file(file)}
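
The setup-and-cleanup pattern described above can be sketched as a method that yields the resource and closes it in an ensure clause. FakeConnection and with_connection are hypothetical names invented for this example; they stand in for a real database or file handle.

```ruby
# Hypothetical resource used only for illustration.
class FakeConnection
  def initialize
    @open = true
  end

  def open?
    @open
  end

  def close
    @open = false
  end
end

# Set up the resource, hand it to the closure, and always clean up afterward.
def with_connection
  connection = FakeConnection.new
  begin
    yield connection
  ensure
    connection.close   # runs even if the closure raises
  end
end

used = nil
with_connection {|connection| used = connection}
puts "still open after the block? #{used.open?}"
```

Because the cleanup lives in one place, callers cannot forget it, which is the same convenience the block form of the File methods gives you.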

Closures have another huge advantage: they make it easy to enforce policy. For example, if you have a transaction, you can make sure that your code always brackets transactional code with the appropriate function calls. Framework code can handle the policy, and user code -- provided in a closure -- can customize it. Listing 13 shows the basic usage pattern:

Listing 13. Enforcing policy
def do_transaction
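
One way a method like do_transaction could be fleshed out is sketched below. It assumes a hypothetical TransactionLog class that records events in an array instead of talking to a database, so the begin, commit, and rollback steps are visible.

```ruby
# Hypothetical transaction log; a real framework would talk to a database here.
class TransactionLog
  attr_reader :events

  def initialize
    @events = []
  end

  # The framework owns the policy: begin, then commit or roll back.
  def do_transaction
    @events << :begin
    begin
      yield                  # user code supplied in the closure
      @events << :commit
    rescue StandardError
      @events << :rollback
      raise                  # let the caller see the original error
    end
  end
end

log = TransactionLog.new
log.do_transaction { log.events << :work }

begin
  log.do_transaction { raise "something went wrong" }
rescue RuntimeError
  # the caller still sees the error after the rollback
end

puts log.events.inspect
```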

Closures with the Java language
The Java language does not formally support closures per se, but it does allow the simulation of closures. Primarily, you can use anonymous inner classes to implement closures. The Spring framework uses this technique for many of the same reasons that Ruby does. Spring templates, for persistence, allow iteration over result sets without burdening the user with details of exception management, resource allocation, or cleanup. Listing 14 shows an example from Spring's sample petclinic application:

Listing 14. Inner classes as a closure substitute
JdbcTemplate template = new JdbcTemplate(dataSource);
final List names = new LinkedList();
template.query("SELECT id,name FROM types ORDER BY name",
                           new RowCallbackHandler() {
                               public void processRow(ResultSet rs)
                                     throws SQLException
                               {
                                   // collect the name column of each row
                                   names.add(rs.getString("name"));
                               }
                           });

Think of the things Listing 14's programmer doesn't need to do:
  • Open a connection
  • Close the connection
  • Handle iteration
  • Process exceptions
  • Deal with database-independent issues
The programmer is free from these issues because the framework handles those problems. But anonymous inner classes give you only a loose approximation of closures, and they don't go as far as you need to go. Look at the wasted syntax in Listing 14. At least half of the example is overhead. Anonymous classes are like a bucketful of cold water that's spilled in your lap every time you want to use them. All of the extra effort in wasted syntax discourages their use. Sooner or later, you're going to quit. When language constructs are cumbersome and awkward, they don't get used. The dearth of Java libraries that effectively use anonymous inner classes makes this problem manifest. For closures to be practical and prevalent in the Java language, they must be quick and clean.
In the past, closures were never a serious priority for Java developers. In the early years, the Java designers did not support closures because Java users were skittish about automatically allocating the variables on the heap without an explicit new. Today, a tremendous debate circulates around including closures in the base language. In recent years, practical interest in dynamic languages such as Ruby, JavaScript, and even Lisp has led to a groundswell of support for closures in the Java language. It looks like we'll finally get closures in Java 1.7. Good things happen when you keep crossing borders.
