weidagang2046's Column

When things are investigated, knowledge is extended

Osh

In a Unix shell, you can run a series of commands, connecting the output of one command to the input of the next using pipes. The information passed from one command to the next is normally organized into lines of text, and tools such as awk and sed are designed to extract and transform information from these lines. It would be much simpler if the data passed between commands were organized into objects. Osh is based on this idea. Osh is implemented in Python, so much of osh will be familiar if you already know Python.

Osh is not a shell. It is an executable implementing a language that supports the composition of commands through piping. The objects piped from one command to the next can be Python primitives (numbers, lists, maps, etc.), other useful types such as dates and times, database rows, or objects representing OS resources such as files, directories, and processes. Conversions between objects and strings simplify integration with the Unix environment. The commands included with osh manipulate objects, access databases, and execute commands remotely, including parallel execution on the nodes of a cluster. In a single osh invocation, all commands run in a single process. Multithreading is used only when executing commands on a cluster.
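To make the idea of piping objects rather than lines of text concrete, here is a minimal sketch in plain Python. It is not osh code and does not use the osh API: the helper names pipeline, select, and apply, and the sample request records, are assumptions made up for illustration. Each stage receives a stream of objects and yields a new stream, so no parsing of text is needed between stages.

```python
from functools import reduce

def pipeline(source, *stages):
    """Feed `source` through each stage in order.

    Every stage is a function from an iterable of objects to an iterable
    of objects, which plays the role of one command in a pipe.
    """
    return reduce(lambda stream, stage: stage(stream), stages, source)

def select(predicate):
    """Hypothetical stage: keep only the objects for which predicate is true."""
    return lambda stream: (obj for obj in stream if predicate(obj))

def apply(function):
    """Hypothetical stage: transform each object with function."""
    return lambda stream: (function(obj) for obj in stream)

# Made-up sample data standing in for rows of a request table.
requests = [
    {"id": 1, "state": "open"},
    {"id": 2, "state": "closed"},
    {"id": 3, "state": "open"},
]

open_ids = list(
    pipeline(
        requests,
        select(lambda r: r["state"] == "open"),
        apply(lambda r: r["id"]),
    )
)
print(open_ids)  # [1, 3]
```

Because the stages exchange dictionaries instead of strings, the filter can test r["state"] directly, where a text pipeline would need awk or sed to pick the field out of each line.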

Example: Suppose you have a cluster named fred, with nodes fred1, fred2, fred3. Each node has a database tracking work requests with a table named request. You can find the total number of open requests in each database as follows:

[jao@zack] osh @fred [ sql "select count(*) from request where state = 'open'" ] ^ out
('fred1', 1)
('fred2', 0)
('fred3', 5)
  • osh: Invokes the osh executable.
  • @fred [ ... ]: Specifies that the command delimited by [ ... ] should be run on each node of the cluster named fred. (The osh configuration file, .oshrc, specifies how to connect to the nodes of the cluster.)
  • sql "select count(*) from request where state = 'open'": sql is an osh command that submits a query to a relational database. The query output is returned as a set of tuples.
  • ^ out: ^ is the osh operator for piping objects from one command to the next. In this case, the input objects are tuples resulting from execution of a SQL query on each node of the cluster. The out command renders each object as a string and prints it to stdout.
  • Each output row identifies the node of origination (e.g. fred1, fred2), and contains a tuple from the database on that node. So ('fred3', 5) means that the database on node fred3 has 5 open requests.
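The behavior described above (run the same query on every node, then tag each result tuple with the node it came from) can be imitated in plain Python. The following is only a sketch under stated assumptions, not how osh is implemented: each node's database is replaced by an in-memory SQLite database, the contents of NODE_STATES are made up to reproduce the counts shown above, and the names run_on_node and run_on_cluster are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Assumption: made-up request states per node, chosen to match the
# counts in the example output (1, 0, and 5 open requests).
NODE_STATES = {
    "fred1": ["open", "closed"],
    "fred2": ["closed"],
    "fred3": ["open"] * 5,
}

QUERY = "select count(*) from request where state = 'open'"

def run_on_node(node):
    """Run QUERY against a stand-in database for `node`.

    Returns the result rows with the node name prepended to each tuple.
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the node's real database
    try:
        conn.execute("create table request (state text)")
        conn.executemany(
            "insert into request values (?)",
            [(state,) for state in NODE_STATES[node]],
        )
        return [(node,) + tuple(row) for row in conn.execute(QUERY)]
    finally:
        conn.close()

def run_on_cluster(nodes):
    """Run the query on all nodes in parallel, one thread per node."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        per_node = pool.map(run_on_node, nodes)  # results keep input order
        return [row for rows in per_node for row in rows]

results = run_on_cluster(sorted(NODE_STATES))
for row in results:
    print(row)
```

Each connection is created and closed inside the worker thread that uses it, which keeps the sketch within sqlite3's default threading rules.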

Example, continued: Now suppose you want to find the total number of open requests across the cluster. You can pipe the tuples into an aggregation command:

[jao@zack] osh @fred [ sql "select count(*) from request where state = 'open'" ] ^ agg 0 'total, node, count: total + count' $
6
  • agg: agg is the aggregation command. Tuples from across the cluster are piped into the agg command, which will accumulate results from all inputs.
  • 0: This use of agg will maintain a total, which is initialized to 0.
  • 'total, node, count: total + count': This specifies an aggregation function. total is the running total, which was initialized to 0. node and count come from the sql command executed on each node of the cluster. total + count accumulates the counts from each node.
  • $: An alternative to ^ out that can be used at the end of a command only.
  • 6: The total of the counts from across the cluster.
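The aggregation step is essentially a fold over the incoming tuples. As a rough Python analogue (a sketch, not the osh implementation; the agg function defined here is a hypothetical stand-in for the osh command of the same name), functools.reduce can carry the running total, with each (node, count) tuple unpacked into the arguments of the aggregation function:

```python
from functools import reduce

# The tuples produced by the cluster query in the example above.
rows = [("fred1", 1), ("fred2", 0), ("fred3", 5)]

def agg(initial, function, stream):
    """Fold `stream` into one value.

    `function` receives the running total followed by the fields of the
    current tuple, mirroring 'total, node, count: total + count'.
    """
    return reduce(lambda total, row: function(total, *row), stream, initial)

total = agg(0, lambda total, node, count: total + count, rows)
print(total)  # 6
```

The node argument is unused by this particular function, but it still has to be named because every field of the input tuple is passed along.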

More information:

License: GPL
Release history
Tutorial (TBD)
User guide
Download
Software with similar goals


jao@geophile.com

from: http://geophile.com/osh/

posted on 2005-11-24 19:06 by weidagang2046. Category: Python

