I am pleased to announce Morel release 0.5.0. Coming fourteen months after release 0.4.0, this release contains various extensions to syntax to make the language more powerful and easier to use.

We describe a few of the new features in this article. For more information, see the official release notes.

1. List-valued functions in pipelines (into and through)

[MOREL-171] lets you use list-based functions in a pipeline, adding two clauses to the from expression:

  • into occurs at the end of a pipeline, and converts a list to a scalar value;
  • through occurs in the middle of a pipeline, and converts a list into another list.

First, into.

from scans
  steps
  into expression

is equivalent to

(expression) (from scans steps)

expression must be a function of type 'a list -> 'b (for some types 'a and 'b), and if the output of the last step is 'a then the type of the whole from expression is 'b.

For example, sum has type int list -> int, and so into sum converts a stream of int values into an int result.

- (* sum adds a list of integers *)
= sum [1, 2, 4];
val it = 7 : int

- (* 'into' applies a function to its input, collected into a list. *)
= from i in [1, 2, 4]
=   into sum;
val it = 7 : int

- (* Rewrite 'into' to apply the function directly. *)
= sum (from e in [1, 2, 4]);
val it = 7 : int

- (* 'into' is equivalent to existing keyword 'compute'. *)
= from e in [1, 2, 4]
=   compute sum;
val it = 7 : int

- (* Actually, "a into b" is equivalent to "(b) a" for any types, not
=    just lists. *)
= explode "abc";
val it = [#"a",#"b",#"c"] : char list

- "abc" into explode;
val it = [#"a",#"b",#"c"] : char list
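For readers more at home in another language, the rewrite rule for into can be modeled in a few lines of Python. This is only a sketch of the semantics described above, not how Morel implements the clause; the helper name into is hypothetical, and Python's built-in list stands in for explode.

```python
def into(value, f):
    """Model of 'value into f': simply apply f to value."""
    return f(value)


# from i in [1, 2, 4] into sum
total = into([i for i in [1, 2, 4]], sum)

# "abc" into explode (list() splits a string into its characters)
chars = into("abc", list)

print(total)
print(chars)
```

The point of the clause is ordering: the function is written last, after the data it consumes, so the query reads top to bottom.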

Next, through is similar to into, but has a pattern so that following steps can refer to the data items:

expression1
  through pattern in expression2
  steps

is equivalent to

from pattern in (expression2) (expression1)
  steps
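The same rewrite can be sketched in Python. The function running_total is a hypothetical list-to-list function invented for this illustration; the comprehension plays the role of the from expression, and the trailing condition plays the role of the steps that follow through.

```python
def running_total(xs):
    """A list-to-list function: each output is the sum of the inputs so far."""
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out


# Model of:
#   from i in [1, 2, 4]
#     through t in running_total
#     where t > 1
# which rewrites to:
#   from t in (running_total) (from i in [1, 2, 4])
#     where t > 1
result = [t for t in running_total([i for i in [1, 2, 4]]) if t > 1]

print(result)
```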

For example, suppose we have a clean_address function that takes a list of orders and returns a list of orders with state and zipcode filled out. To stream orders through this function, we simply add the line through clean_order in clean_address to the pipeline.

- (* Function that converts a list of orders to a list of orders with
=    corrected state and zipcode. *)
= fun clean_address ...
val clean_address = fn : order list -> order list

- (* Define a function that takes a collection of orders, removes orders of
=    1,000 or more units, cleans addresses, and summarizes by state. *)
= fun pipeline orders =
=     from order in orders
=        where order.units < 1000
=        through clean_order in clean_address
=        group clean_order.state compute count;
val pipeline = fn : order list -> {count: int, state: string} list

Note that the pipeline function itself takes a list argument and returns a list. We could therefore include it in a higher-level query using the through keyword, and include that query in another query.
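To make the composition concrete, here is a Python model of the pipeline function and of a higher-level query that uses it. Everything here is an assumption made for illustration: the body of clean_address (which just normalizes the state code), the report function, and the sample orders are all hypothetical, and the sorting is added only to make the output deterministic.

```python
from collections import Counter


def clean_address(orders):
    # Hypothetical stand-in: trim and upper-case the state code of every order.
    return [{**o, "state": o["state"].strip().upper()} for o in orders]


def pipeline(orders):
    # from order in orders where order.units < 1000
    small = [o for o in orders if o["units"] < 1000]
    # through clean_order in clean_address
    cleaned = clean_address(small)
    # group clean_order.state compute count
    counts = Counter(o["state"] for o in cleaned)
    return [{"count": n, "state": s} for s, n in sorted(counts.items())]


def report(orders):
    # A higher-level query treats pipeline as one more list-to-list step:
    #   from o in orders through r in pipeline where r.count > 1
    return [r for r in pipeline(orders) if r["count"] > 1]


orders = [
    {"units": 10, "state": "ca "},
    {"units": 5000, "state": "CA"},
    {"units": 20, "state": "CA"},
    {"units": 7, "state": "ny"},
]
print(pipeline(orders))
print(report(orders))
```

Because pipeline has the same shape as clean_address (a list in, a list out), nothing special is needed to nest one inside the other.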

The into and through keywords, combined with Morel’s ability to include queries in function declarations, allow us to do something very powerful: to compose complex and efficient data flows from concise functions.

2. Comma-separated scans

If you speak SQL, you will know that there are two ways to write a join:

-- Comma syntax
SELECT e.ename, d.dname
FROM Emp AS e,
  Dept AS d
WHERE e.deptno = d.deptno;

-- ANSI (SQL-92) syntax using JOIN keyword
SELECT e.ename, d.dname
FROM Emp AS e
  JOIN Dept AS d ON e.deptno = d.deptno;

Morel has analogous syntax:

- from e in emps,
=     d in depts
=   where e.deptno = d.deptno
=   yield {e.ename, d.dname};
val it =
  [{dname="RESEARCH",ename="SMITH"},{dname="SALES",ename="ALLEN"},
   {dname="SALES",ename="WARD"},...] : {dname:string, ename:string} list

- from e in emps
=   join d in depts on e.deptno = d.deptno
=   yield {e.ename, d.dname};
val it =
  [{dname="RESEARCH",ename="SMITH"},{dname="SALES",ename="ALLEN"},
   {dname="SALES",ename="WARD"},...] : {dname:string, ename:string} list

Previously, however, Morel allowed the comma join syntax only immediately after the from keyword, before any clause such as where or join had occurred.

Following [MOREL-216], Morel allows comma-separated joins later in the pipeline, and also allows on in comma-joins. The following is now legal:

- from a in [1, 2],
=     b in [3, 4, 5] on a + b = 6
=   where b < 5
=   join c in [6, 7] on b + c = 10,
=       d in [7, 8];
val it = [{a=2,b=4,c=6,d=7},{a=2,b=4,c=6,d=8}]
  : {a:int, b:int, c:int, d:int} list
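One way to check the result by hand is to read each scan as a nested loop and each on or where as a filter. The Python comprehension below is a sketch of that reading, not a description of how Morel plans or executes the query.

```python
rows = [
    {"a": a, "b": b, "c": c, "d": d}
    for a in [1, 2]
    for b in [3, 4, 5]
    if a + b == 6       # on a + b = 6
    if b < 5            # where b < 5
    for c in [6, 7]
    if b + c == 10      # join c ... on b + c = 10
    for d in [7, 8]     # comma-separated scan following the join
]

for row in rows:
    print(row)
```

Only a = 2, b = 4 survives the first two conditions, only c = 6 satisfies b + c = 10, and d is unconstrained, which gives the two rows shown above.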

This will be particularly convenient (when we have solved some query-planning issues in [MOREL-229]) for writing queries that use unbounded variables to solve constraints:

- from a, b
=   where a < b
=   join c, d, e
=   where a > 0
=     andalso b > 0
=     andalso c > 0
=     andalso d > 0
=     andalso e > 0
=     andalso a + b + c + d + e < 8;
val it =
  [{a=1,b=2,c=1,d=1,e=1},{a=1,b=2,c=1,d=1,e=2},{a=1,b=2,c=1,d=2,e=1},
   {a=1,b=2,c=2,d=1,e=1},{a=1,b=3,c=1,d=1,e=1}]
  : {a:int, b:int, c:int, d:int, e:int} list
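The five rows can be confirmed by brute force. The Python sketch below assumes a search range of 1 to 7 for every variable, which is safe because each variable must be positive and the sum must be less than 8; deducing such a range automatically is exactly the kind of planning the query above relies on.

```python
from itertools import product

# Each variable is positive and the total is below 8, so 1..7 bounds them all.
candidates = product(range(1, 8), repeat=5)

solutions = [
    {"a": a, "b": b, "c": c, "d": d, "e": e}
    for a, b, c, d, e in candidates
    if a < b and a + b + c + d + e < 8
]

for s in solutions:
    print(s)
```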

3. Duplicate elimination (distinct)

[MOREL-231] adds a distinct clause to the from expression. It makes the rows unique.

Here is a query that finds the set of distinct job titles:

- from e in scott.emp
=   yield e.job
=   distinct;
val it = ["CLERK","SALESMAN","ANALYST","MANAGER","PRESIDENT"] : string list

distinct is short-hand for group with all fields and no aggregate functions (compute clause), and is similar to SQL’s SELECT DISTINCT.
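The equivalence between distinct and a group with no aggregates can be illustrated in Python. The four rows below are a small sample in the style of scott.emp, chosen for this sketch; the order of the output is an artifact of Python dictionaries and says nothing about the order Morel returns.

```python
emps = [
    {"ename": "SMITH", "job": "CLERK"},
    {"ename": "ALLEN", "job": "SALESMAN"},
    {"ename": "WARD", "job": "SALESMAN"},
    {"ename": "JONES", "job": "MANAGER"},
]

# from e in emps yield e.job
jobs = [e["job"] for e in emps]

# distinct: keep one copy of each row
distinct_jobs = list(dict.fromkeys(jobs))

# group on every field with no aggregates: the group keys are the result
groups = {}
for job in jobs:
    groups.setdefault(job, [])
grouped_jobs = list(groups)

print(distinct_jobs)
print(grouped_jobs)
```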

4. Multiple branches in fn

[MOREL-230] allows a lambda (fn expression) to have multiple branches, similar to case. Following this change, the following expressions are equivalent:

- fn [] => 0 | x :: _ => x + 1;
val it = fn : int list -> int
- fn list => case list of [] => 0 | x :: _ => x + 1;
val it = fn : int list -> int

Prior to this change, the first expression would give a syntax error.

5. Int structure

[MOREL-228] implements the Int structure, a collection of functions and values related to the int type. Per Moscow ML it has the following interface:

val precision : int option
val minInt    : int option
val maxInt    : int option

val ~         : int -> int              (* Overflow      *)
val *         : int * int -> int        (* Overflow      *)
val div       : int * int -> int        (* Div, Overflow *)
val mod       : int * int -> int        (* Div           *)
val quot      : int * int -> int        (* Div, Overflow *)
val rem       : int * int -> int        (* Div           *)
val +         : int * int -> int        (* Overflow      *)
val -         : int * int -> int        (* Overflow      *)
val >         : int * int -> bool
val >=        : int * int -> bool
val <         : int * int -> bool
val <=        : int * int -> bool
val abs       : int -> int              (* Overflow      *)
val min       : int * int -> int
val max       : int * int -> int

val sign      : int -> int
val sameSign  : int * int -> bool
val compare   : int * int -> order

val toInt     : int -> int
val fromInt   : int -> int
val toLarge   : int -> int
val fromLarge : int -> int

val scan      : StringCvt.radix 
                -> (char, 'a) StringCvt.reader -> (int, 'a) StringCvt.reader
val fmt       : StringCvt.radix -> int -> string

val toString  : int -> string
val fromString : string -> int option   (* Overflow      *)

Example use:

- Int.compare;
val it = fn : int * int -> order
- Int.compare (2, 3);
val it = LESS : order
- Int.maxInt;
val it = SOME 1073741823 : int option

The Int structure is an instance of the INTEGER signature in the Standard ML Basis Library, but Morel does not currently have signatures.
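The interface has two pairs of division functions, and the difference only shows up with negative operands. In the Standard ML Basis Library, div and mod round toward negative infinity, while quot and rem round toward zero. The Python sketch below models that specification, together with compare, sign and sameSign; it assumes Morel follows the Basis Library here, and it ignores the Div and Overflow exceptions.

```python
def div(a, b):
    return a // b            # rounds toward negative infinity


def mod(a, b):
    return a % b             # result has the sign of the divisor


def quot(a, b):
    q = abs(a) // abs(b)     # rounds toward zero
    return q if (a < 0) == (b < 0) else -q


def rem(a, b):
    return a - b * quot(a, b)   # result has the sign of the dividend


def sign(a):
    return (a > 0) - (a < 0)    # -1, 0 or 1


def same_sign(a, b):
    return sign(a) == sign(b)


def compare(a, b):
    return "LESS" if a < b else "GREATER" if a > b else "EQUAL"


print(div(-7, 2), mod(-7, 2))
print(quot(-7, 2), rem(-7, 2))
print(compare(2, 3))
```

For non-negative operands the two pairs agree, so the choice matters only when a sign is involved.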

Conclusion

Now that I have resigned from Google, I will have more time to work on Morel bugs and features, in two areas in particular.

I want to improve how Morel handles ordered and unordered multisets, because SQL tables are unordered, functional programming languages use ordered lists, and real applications need to mix both. We will add a bag type, and to keep the Hindley-Milner type system happy when converting between the list and bag types, we will need better support for operator overloading.

I believe that Morel can be an attractive solution for graph, deductive and constraint-programming problems. To that end, I will be working on constraints, universal and existential quantifiers, deduction of ranges, and tail-call optimization.

If you have comments, please reply on BlueSky @julianhyde.bsky.social or on Twitter.

This article has been updated.