Alf in Ruby
This page provides an introduction to using Alf in Ruby programs. Please refer to the available examples, blog posts, and API documentation for further details and advanced use cases. The following sections cover:
- Querying various data sources using
Alf.query
- Understanding query results and hacking with
Relation
- Towards lazy evaluation and database updates with
Relvar
- More on obtaining data source connections
Connect and query
The easiest way to connect and query a data source using Alf is through
Alf.query
. For instance, suppose that you have the sap.sqlite3 database file available in the
current folder. Then:
require 'alf'
require 'sqlite3'
rel = Alf.query("sap.sqlite3") do
restrict(suppliers, city: 'London')
end
puts rel
Your query (inside the block) can be as complex as you like. As of relational algebra's closure property, you can chain relational operators without any restriction.
Alf.query
's first argument can also be:
- A Path or
Pathname,
or string path to a sqlite file with a
.db
,.sqlite
or.sqlite3
extension, as in the example above. A Path or Pathname instance to a folder containing recognized files
# Let query .csv/.json/.yaml files in the current folder Alf.query(Path.pwd) do ... end
A database URL recognized by Sequel:
Alf.query('postgres://user:password@host/dbname') do ... end
A Sequel
Database
instance.db = ::Sequel.connect('postgres://user:password@host/dbname') Alf.query(db) do ... end
Relation
Alf.query
always returns the query result as a Alf::Relation
instance. The
latter is an in-memory relation, that is, a set of tuples entirely independent
from the concrete datasource it comes from.
This Relation
has a bunch of useful methods, as demonstrated by the
following example:
require 'alf'
require 'sqlite3'
rel = Alf.query("sap.sqlite3") do
restrict(suppliers, city: 'London')
end
# info and metadata
rel.size # number of tuples
rel.empty? # is it an empty set of tuples?
rel.attr_list # the list of attribute names as an Alf::AttrList
rel.heading # information about types as a Alf::Heading
# export methods
rel.to_a # an array of ruby hashes
rel.to_text # the ascii table result
rel.to_json # all tuples, but in json
rel.to_csv # all tuples, but in csv
rel.to_yaml # all tuples, but in yaml
# the entire algebra, in postfix, OO syntax
rel.project([:sid])
rel.rename(...)
rel.join(...)
# tuple iteration
rel.each do |supplier|
puts supplier.name
end
# tuple extraction (raises unless exactly one tuple)
rel.tuple_extract
Alf::Relation
implements a pure immutable value, with Alf's relational
algebra shipped in an object-oriented style. It consistently implements
hash
, ==
and eql?
, you can thus use it in hashes, compare relations for
equality, and so on. No surprise. BUT. The object-oriented relational algebra
implemented by Relation
comes at a price: immediate evaluation, no
optimization, in-memory implementation. Called on a Relation
, a relational
operator returns a Relation
.
For this reason, working with the Relation
class is not always a good
choice. For example, it is not recommended when you want to incrementally
build complex queries against a data source, if you need to avoid loading all
tuples in memory, or if you want to use logical query optimization.
Relvar
Relvar
stands for relation variable
. Conceptually, a relvar is a variable
whose value is a relation. Thus, it has a location and it can be updated. In
addition, we distinguish between two kinds of relvars: base relvars (aka
tables) and virtual also called derived variables (aka views).
Alf uses the concept of relvar as a localized, virtual, set of tuples attached
to the concrete datasource(s) it comes from. Unlike Relation
, Relvar
does
not keep the tuples in memory and always recomputes them if needed. It does
not implement hash
, ==
and eql?
with respect to the set of tuples it
virtually denotes.
Relvar
instances are attached to particular data sources and can only be
obtained and used in the context of at least one database connection (see next
section). Once obtained, a relvar has an object-oriented API very similar to
the one of Relation
. A few differences and additional methods though:
require 'alf'
require 'sqlite3'
# See next section about this
Alf.connect("sap.sqlite3") do |conn|
# Remember that you can no longer use the relvar when the connection
# has been closed!
relv = conn.relvar do
restrict(suppliers, city: 'London')
end
# Returns the set of tuples as a Relation
relv.value
# the entire algebra, in postfix, OO syntax
relv.project([:sid]) # returns another relvar, no actual computation
rel.rename(...)
rel.join(...)
# update (VERY experimental, use with care, especially on virtual relvars)
relv.delete(predicate)
relv.insert(tuples)
relv.upsert(tuples)
relv.update(updating, predicate)
relv.affect(relation)
# locking (even more experimental)
relv.lock{
# executes the block with a lock on all underlying base relvars
}
end
Lower-level connection API
When building more complex software, the use of Alf
's facade API is somewhat
too limited. The following example shows more about hacking with lower-level
Alf::Database
and Alf::Database::Connection
objects:
require 'alf'
require 'sqlite3'
# Declare configuration once for all (as well as some options not
# covered here).
DB = Alf::Database.new("sap.sqlite3")
# Connect for querying and obtaining relvars. The connection will be
# automatically closed when the block execution ends. The `conn` object is
# NOT thread-safe.
DB.connect do |conn|
rel = conn.query do
restrict(suppliers, city: 'London')
end
puts rel
end
# Yet another way of obtaining a connection. You MUST ensure that
# you properly disconnect.
begin
conn = DB.connection
conn.query do
restrict(suppliers, city: 'London')
end
puts rel
ensure
conn.close if conn
end