A side of Guile: Functions in Guile and Guix

Today we're going to take another Guile aside to cover how to run Guile Scheme programs, how to use functions from Guix/Guile's standard library, and explore how to create our own functions. For a lot of the more advanced Guix capabilities a packager needs to understand the library of functions that Guix provides, and in some cases write their own functions. While those functions don't have to be that complicated it is necessary to understand the way that functions and modules work in Guile. As the focus is on Guix we'll use Guix tools and capabilities in addition to the ones that are part of standard Guile.

This is part 2 in a series called A Side (aside?) of Guile, introducing the Guile Scheme language in the context of using it for Guix.

Running Guile code

There are different ways to run Guile code, see the Guile manual's Programming in Scheme for all the options. The three ways to run Guile code that I use for Guix are:

In a REPL
As a file
As an interpreter for a script

In part 1 of this series we covered how to use the REPL to explore Guix functions and packages. As a reminder the easiest option from the command-line is:

$ guix shell guile guile-readline guile-colorized -- guix repl

As we always have the Guix command line tools on our $PATH it's easy to run the Guix REPL. This uses guix shell to load the guile package and some libraries into a clean environment and starts the guix repl.

The second option is to tell the REPL to execute the code from a file. To play with this first create a new directory and then a file in it called hello.scm which has the following contents:

(format #t "Hello World! \n")

Yes, just the one line! To execute it we'll use the Guix REPL rather than the plain Guile interpreter, like so:

$ guix shell guile -- guix repl hello.scm
Hello World!

The final way is that we can use Guile just like any other scripting language such as Bash or Python.

#!/usr/bin/env -S guix repl --
!#

(format #t "Hello World! \n")

We're taking advantage of the GNU coreutils env command, which has a -S argument that splits the rest of the line into separate arguments. Again, I do this by calling guix repl, but you can also call guile -s for a plain Guile execution of the script. The use of #! ... !# is very clever: this is the multi-line comment in Guile, so Guile ignores the entire section and only sees the format command after the REPL has been launched. This leads us onto a way that we can write portable shell/Guile scripts:

#!/usr/bin/env sh
exec guix shell --container --nesting -- guix repl "$0"
!#

(format #t "Hello World! \n")

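In this case we're using plain, portable shell to execute guix shell, which then calls the guix repl command on the script itself ("$0"). Because exec replaces the shell process, the shell never reads the lines after it, and Guile skips everything between #! and !# as a comment. For a more sophisticated version of this see the post about using Guix shell and running Linux games.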

Don't forget that if you are loading modules then the load-path may need to be altered. For example, if a project's modules were in the directory src/mymodules/ then to include that directory we add --load-path=src/mymodules to the guix repl invocation. To see the module paths that are being loaded check the %load-path from inside the REPL.
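To make the load-path idea concrete, here is a minimal sketch that can be pasted into the REPL. The directory src/mymodules is just the hypothetical path from the example above. add-to-load-path is Guile's built-in way of doing from code roughly what --load-path does from the command line.

```scheme
;; Print every directory that Guile searches when resolving a module name.
(for-each (lambda (dir) (format #t "~A~%" dir))
          %load-path)

;; Put a project directory at the front of the search path.
;; "src/mymodules" is a hypothetical path, substitute your own.
(add-to-load-path "src/mymodules")

;; The new directory is now the first place Guile looks.
(format #t "First entry: ~A~%" (car %load-path))
```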

Now it's time to turn our attention to the fundamental unit of code organisation in Lisp and Scheme, functions.

Scheme functions

Any introduction to functional programming or Lisp will refer to multiple benefits of using a functional approach. And, there can be some big concepts like first-class functions, HOFs and homo-icon-nuh-nuh-nuh-icity (or however you say it!). Oh boy!

We're going to ignore all of that word salad and start with the basic Lisp concept that we already know - a parenthesised expression is a function call:

=> (+ 1 2 3)
6

=> (max 1 2 3)
3

In both expressions the function is in the first position and the parameters are provided in the following positions. In both of these cases the functions max and + are provided by Guile's standard libraries (Arithmetic Functions).
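As a small sketch of what "function in the first position" means, the expressions below treat + and max as ordinary values. procedure? and apply are both standard Scheme, so nothing here is specific to Guix.

```scheme
;; + and max are ordinary values that happen to be procedures.
(procedure? max)            ; #t

;; apply calls a procedure with its arguments taken from a list,
;; so this is the same as writing (+ 1 2 3).
(apply + (list 1 2 3))      ; 6
(apply max (list 1 2 3))    ; 3
```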

We've also implicitly seen that Guile's structure is a tree [1], the only syntax that we have to be aware of is the brackets. The parentheses with no other syntax is the most commonly cited reason for discomfort with Lisp - particularly amongst developers that are versed in other languages. Here's an example:

=> (max (+ 1 2 3) 10 12)

;;; to make it easier look at it as a tree from the inside to the outside
(max
     (+ 1 2 3)
     10 12
)

There are a few things that help me understand Lisp code more easily:

Read the code from the inside to the outside

In this case I'd look at the expression (+ 1 2 3) and resolve that first, then look at the max expression.

Space code out and align brackets

It often helps to make the structure of the code more obvious as I've done above. Adding comments can also help. Some people don't need to do this as they can see the structure, but personally I can't do that. Code can't be contributed like that, but that's what the code formatter is for - more on this later.

Use colours for matching brackets

Having matching brackets coloured inside the REPL and editor really helps. Most editors and REPLs have support for this [2].

Short and simple functions

It's much easier to understand a simple function that does one thing. With much less boilerplate than other languages, code can be very dense. Break up functions so they are easily understood.
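One way to practise the inside-to-outside reading is to give each step a name. This is only a sketch, and the names inner-sum and largest are made up for the illustration:

```scheme
;; Step 1: the innermost expression is evaluated first.
(define inner-sum (+ 1 2 3))             ; 6

;; Step 2: its value takes the place of the inner expression.
(define largest (max inner-sum 10 12))   ; 12

;; The nested one-liner gives the same answer.
(= largest (max (+ 1 2 3) 10 12))        ; #t
```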

There are many functions that we can use in Guile, there are lots of modules that can be loaded from the standard library (see Guile's SRFI support modules), and Guix provides modules for working with packages, services and so forth. We'll cover how to load modules lower down. Finally, we can create our own functions.

Creating our own functions

To create our own function we use the lambda function: yep a function to create a function! This is called an anonymous function because we haven't assigned it to a variable - it doesn't have a name:

(lambda* (name)
  (format #t "Hello ~A \n" name)
)

Here we use lambda* rather than lambda. Yes, that's two different functions with the same name except for a star at the end! The functions with a star are ones with extra capabilities over those that are specified in the Scheme language R5RS standard (don't worry about what this means) - use the ones with * on the end. In this case lambda* allows keywords, optional arguments and rest arguments.

The anonymous function receives a single argument name. The function body is a single expression (format ... ). Format itself is a function, it lets us format text, in this case we tell it to put the variable into the format text.

We have a function, but if we cut-n-paste this into the REPL it doesn't do anything because we didn't actually call it. We call an anonymous function in the same way as other functions, by putting it first inside an S-expression along with any parameters. When the expression is evaluated the function is called and it's provided with the arguments:

;;     <---- anonymous function ------------------->  <- parameter ->
=> (  (lambda* (name) (format #t "Hello ~A \n" name))  "Bob"          )
Hello Bob

When we see it on a single line we can see the way the anonymous function is first, and the parameter comes next. Here we provide "Bob" to the function so it uses that as name when it executes the format function.

Anonymous functions are generally useful when they're short or they're going to be called within a specific context so there's no value to defining them somewhere else. In Guix package definitions we often find anonymous functions within the body.
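A typical "specific context" is handing a short function to another function. The sketch below uses the standard map procedure, which calls the anonymous function once for each item in the list. The name greetings is just for the example.

```scheme
;; map applies the anonymous function to every element and
;; collects the results into a new list.
(define greetings
  (map (lambda (name) (string-append "Hello " name))
       (list "Bob" "Sarah" "Sue")))

greetings   ; ("Hello Bob" "Hello Sarah" "Hello Sue")
```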

Defining functions

In Guile Scheme we name things by using the define (or define*) function.

;;; assign a name to a number
=> (define* test-num 3)
=> test-num
3

;;; use the test-num variable in an S-expression
=> (+ 2 test-num)
5

;;; assign a string to a name
=> (define* users-name "Bob")
=> users-name
"Bob"

;;; assign a list to a name
=> (define test-list (list "Bob" "Sarah" "Sue"))
=> test-list
("Bob" "Sarah" "Sue")

Functions (Guile calls these procedures) are the same as other types, they can be named using define*. This capability is called first-class functions as a function is just the same as any other variable. We can actually use define in conjunction with lambda* that we used above:

(define hello-world-fn
  (lambda* (name)
    (format #t "Hello ~A \n" name)))

;;; see what 'hello-world-fn' is
=> hello-world-fn
#<procedure hello-world-fn (name)>

;;; call the function
=> (hello-world-fn "Bob")

We created an anonymous function, and then assigned it to a variable (hello-world-fn). We can go one step further and use define* syntax (again an extension to the base define) to remove the lambda* step. It looks like this:

(define* (hello-world-fn name)
  (format #t "Hello ~A \n" name))

;; call the function
(hello-world-fn "Bob")

If you create a file called hello.scm with those contents we can then run the Guile code by calling the guix repl like so:

$ guix shell guile guile-readline guile-colorized -- guix repl hello.scm
Hello Bob

For testing it's quite useful to run the function in the REPL interactively:

$ guix shell guile guile-readline guile-colorized -- guix repl --interactive hello.scm
Hello Bob

As the code is then loaded into the REPL we can call it with different parameters easily:

scheme@(guix-user)> (hello-world-fn "Anne")
Hello Anne
$1 = #t

It's often useful to be able to call functions with different parameters which is where keywords come in.

Keywords in Guile

Keywords are a Guile data-type, which the manual says are "self-evaluating object which are used in functions and lists". They look like this:

#:blah
#:width

Where they are useful is in Guile and Guix functions as a way to provide an argument with an optional or default value.
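To get a feel for keywords as plain data, these expressions can be tried in the REPL. They only use procedures that are part of core Guile:

```scheme
;; A keyword evaluates to itself, no quoting needed.
(keyword? #:width)                        ; #t

;; A quoted symbol is a different type.
(keyword? 'width)                         ; #f

;; Keywords and symbols can be converted into each other.
(keyword->symbol #:width)                 ; width
(eq? #:width (symbol->keyword 'width))    ; #t
```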

Optional arguments in functions

The first bit of useful syntax is being able to provide keyword #:optional arguments in a function. As the name implies an argument after the #:optional keyword doesn't have to be provided when the function is called. In most cases that's because the function has a default argument so if the optional one is not provided it uses that. In the context of our example:

(define* (hello-world-fn #:optional (name "Bob"))
  (format #t "Hello ~A \n" name))

;; call the function - uses the optional
(hello-world-fn)
Hello Bob

;; call the function but provide an argument
(hello-world-fn "Sarah")
Hello Sarah

Notice that we provide the optional argument within its own S-expression (name "Bob"): if this optional argument is not provided then the default (of "Bob") will be used.

We can provide multiple optional arguments - each one in its own S-expression like this (name "Bob") (age 42).
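Here's a sketch of a function with two optional arguments. The name describe-person is made up for the example, and it uses (format #f ...) so that the text is returned as a string rather than printed.

```scheme
(define* (describe-person #:optional (name "Bob") (age 42))
  ;; With #f as the destination, format returns the string.
  (format #f "~A is ~A" name age))

(describe-person)              ; "Bob is 42"
(describe-person "Sarah")      ; "Sarah is 42"
(describe-person "Sarah" 28)   ; "Sarah is 28"
```

Optional arguments are filled in by position, so there's no way to give an age here without also giving a name. That limitation is one reason keyword arguments are so common.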

Keyword arguments in functions

The #:key argument in a function provides keyword arguments. As in other languages keyword arguments are often used because they 'self-document' the function making it easier to understand what's happening. For example:

(define* (grt-fn #:key name)
  (format #t "Hello ~A \n" name))

;; call the function
(grt-fn #:name "Susan")
Hello Susan

We're specifying that there's a keyword argument called #:name and then we provide it with the string "Susan" when the function is called. If a parameter is not provided then it will be set to #f.
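Because a missing keyword argument is simply #f, a function can test for it. A minimal sketch, with greet-or-default as a made-up name:

```scheme
(define* (greet-or-default #:key name)
  ;; name is #f when the caller did not pass #:name.
  (if name
      (string-append "Hello " name)
      "Hello stranger"))

(greet-or-default)                  ; "Hello stranger"
(greet-or-default #:name "Susan")   ; "Hello Susan"
```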

We can use a similar syntax to earlier to provide an optional default value to #:key arguments, again in the form of the argument name and default value within an S-expression. For example, we can define the default name that will be used:

(define* (grt-fn #:key (name "Bob") (age 42))
  (format #t "Hello ~A, your age is ~A \n" name age))

 ;; called without any parameters - using the defaults
 (grt-fn)
 Hello Bob, your age is 42

 ;; called with one keyword parameter changed
 (grt-fn #:name "Susan")
 Hello Susan, your age is 42

We're defining that the grt-fn function has a keyword argument of #:name with a default value of "Bob", and a keyword argument of #:age with a default value of 42.

In some cases we want a function to accept a varying number of keyword arguments, perhaps there are other keywords that are passed along a chain of functions. If we provide #:allow-other-keys after the keyword arguments then unknown keys will be ignored:

(define* (grt-fn #:key (name "Bob") (age 42) #:allow-other-keys)
 (format #t "Hello ~A, your age is ~A \n" name age))

;; call the function with an unknown keyword
(grt-fn #:location "Sunny Crescent")
Hello Bob, your age is 42

Any additional arguments can be put into a list using the #:rest option, where we specify a list to put them into. For example:

(define* (grt-fn #:key (name "Bob") (age 42) #:rest r)
  (format #t "Hello ~A, age ~A, rest arg is ~A \n" name age r))

;; call the function
(grt-fn "Sunny Crescent")
Hello Bob, age 42, rest arg is (Sunny Crescent)

In this function we've specified a #:rest variable called r, and any additional arguments are put into this list.

If we're using keyword arguments with #:key, and the #:rest option is provided then all keyword arguments are put into the rest list - so you get a full list of the keyword arguments that were provided:

(grt-fn #:name "Sarah" #:age 28 "Sunny Crescent")
Hello Sarah, age 28, rest arg is (#:name Sarah #:age 28 Sunny Crescent)

As we can see defining functions in Guile is very flexible.

Having looked at all the ways that we can define functions, let's look at how we organise them into modules and how we load modules.

Functions in Modules

We've seen previously that we can use modules from within the REPL. As a reminder we do:

=> ,import (guix packages)

To import a module within a file use the use-modules function:

(use-modules (guix packages))

This imports all the functions within the file guix/packages.scm that have been exported (more on this later) so that they can be used within our code.

Commonly in Guix, each file is its own module and imports are done as part of the module definition. To create a module and import modules the top of the file will look like this:

(define-module (some-path some-name)
  #:use-module (module definition))

The first line uses the define-module function to specify the name of the module, remembering the rules for translating paths and file names. The second and subsequent lines will be #:use-module keyword arguments. There are some additional capabilities for complex situations, see the module definition documentation for more.
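As a taste of those additional capabilities, here is a sketch using two of them, #:select and #:prefix, with modules that ship with Guile. The module name (example selective) is hypothetical.

```scheme
(define-module (example selective)          ; hypothetical module name
  ;; Import only two procedures from SRFI-1.
  #:use-module ((srfi srfi-1) #:select (first last))
  ;; Import everything from (ice-9 regex), with re: added to each name.
  #:use-module ((ice-9 regex) #:prefix re:))

(first (list "Bob" "Sarah" "Sue"))   ; "Bob"
(last (list "Bob" "Sarah" "Sue"))    ; "Sue"

;; string-match and match:substring come from (ice-9 regex).
(re:match:substring (re:string-match "[0-9]+" "version 42"))   ; "42"
```

Selecting or prefixing imports is mostly useful when two modules export the same name, or when you want it to be obvious where a function came from.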

As an example, go back to the hello.scm file that was created earlier and remove the old contents, change the file so it's like this:

(define-module (hello)
  #:use-module (guix packages)
  #:use-module (gnu packages toys))

(format #t "Name: ~A \n  Version: ~A \n  Synopsis: ~A \n" (package-name cbonsai)
                                                          (package-version cbonsai)
                                                          (package-synopsis cbonsai))

In this code we're using functions (e.g. package-name) from the (guix packages) module to query the cbonsai variable which is defined in the module (gnu packages toys). To run it do:

$ guix shell guile guile-readline guile-colorized -- guix repl hello.scm

Name: cbonsai
  Version: 1.3.1
  Synopsis: Grow bonsai trees in a terminal

The last thing we need to understand is how a module exports functions.

Defining public functions

Normally, when we define a function within a module it's only available to other functions within it - it's local to the module. We often want to define functions that are available externally. There are two ways to do this:

define* the function and #:export it in the define-module expression
Use define*-public

Guile has a define*-public in (ice-9 optargs), which is the enhanced version that lets you use keywords etc - this is what you'd use if you were writing normal Guile code. When writing Guix package definitions we use a specialised version of define-public which has the same capabilities and sets some package metadata. It comes from the (guix packages) module, so that's the import to use.

As an example, create a file called hello.scm which has:

(define-module (hello)
  #:use-module (ice-9 optargs)) ;; we're using the normal Guile one

(define*-public (grt-fn #:key (name "Bob"))
  (format #t "Hello ~A \n" name))

The first line defines that the module is called hello, and it imports the (ice-9 optargs) module so we can use its define*-public. Then we create our function using define*-public so it's made publicly available from this module.
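For comparison, the other route from the list above is to use a plain define* and name the function in an #:export clause. This is a sketch of how the same hello.scm could look. The greeting-text helper is hypothetical, added only to show a function that stays private.

```scheme
(define-module (hello)
  #:export (grt-fn))   ; only the names listed here are visible to importers

;; Not exported, so it stays private to the module.
;; greeting-text is a hypothetical helper added for illustration.
(define (greeting-text name)
  (string-append "Hello " name))

;; Made public by the #:export clause above.
(define* (grt-fn #:key (name "Bob"))
  (format #t "~A \n" (greeting-text name)))
```

Which style to use is mostly taste: #:export keeps the public interface in one place at the top of the file, while define*-public keeps the decision next to each function.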

In another file in the same directory (it can be called anything, but mine's called greet.scm), add the following contents:

(use-modules (hello))

;;; call the function grt-fn which is from the hello module
(grt-fn)

The first line uses the module hello. As the module name has no directory part, Guile looks for hello.scm directly inside each directory on its load-path.

To run it we have to make sure that Guile's load-path knows where the custom module hello is located. This is done by providing a command line option to the guix repl command:

$ guix shell guile guile-readline guile-colorized -- guix repl --load-path=./ greet.scm
Hello Bob

Understanding Guix functions

With all this information about functions, let's look at some different Guix functions and see if we can understand them. In guix/packages.scm there's a function called package-full-name, the function signature is:

(package-full-name package #:optional (delimiter "@"))

Even without the function's docstring we can probably work out what this does. To call it we provide a package, and then we have an #:optional argument, so we can call it without specifying that argument, in which case it will set the argument delimiter to "@". Of course, we can also call it and set our own delimiter. The function docstring says:

"Return the full name of PACKAGE--i.e., `NAME@VERSION'.  By specifying
DELIMITER (a string), you can customize what will appear between the name and
the version.  By default, DELIMITER is \"@\"."

Pretty straightforward. Perhaps the one thing to look out for is the return value of this function - it seems likely that it's a string.
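To see the shape of that signature in action without loading Guix, here's a stand-in with the same argument pattern. To be clear, full-name is a hypothetical function that takes plain strings. It is not Guix's package-full-name, which takes a package record.

```scheme
;; full-name is a hypothetical stand-in, not Guix's package-full-name.
;; It takes two strings instead of a package record.
(define* (full-name name version #:optional (delimiter "@"))
  (string-append name delimiter version))

(full-name "cbonsai" "1.3.1")        ; "cbonsai@1.3.1"
(full-name "cbonsai" "1.3.1" "-")    ; "cbonsai-1.3.1"
```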

Let's have a look at something a bit more complicated. In guix/build/gnu-build-system.scm there is a gnu-build function which has the following signature:

(define* (gnu-build #:key (source #f) (outputs #f) (inputs #f)
                   (phases %standard-phases)
                   #:allow-other-keys
                   #:rest args)
"Build from SOURCE to OUTPUTS, using INPUTS, and by running all of PHASES
in order.  Return #t if all the PHASES succeeded, #f otherwise."

From the signature we can see that there's #:key so we have the option to provide #:source, #:outputs, #:inputs and #:phases. From the function's docstring we can see that the intention is that we actually provide some of those (otherwise it won't do anything). There is an #:allow-other-keys so there are circumstances where other keyword arguments are provided, and as there's a #:rest all the arguments will be put into args. We can also see from the docstring that the return value of the function depends on what happens in PHASES.
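To show why a signature like this is convenient, here's a much simplified sketch of the same idea. Everything in it (run-phases, unpack-phase, install-phase, %example-phases) is hypothetical and it is not Guix's implementation. It assumes each phase is a function that takes keyword arguments and returns #t on success.

```scheme
(use-modules (srfi srfi-1))   ; for every

;; Each phase picks out the keyword it needs and ignores the rest.
(define* (unpack-phase #:key source #:allow-other-keys)
  (format #t "unpacking ~A~%" source)
  #t)

(define* (install-phase #:key outputs #:allow-other-keys)
  (format #t "installing to ~A~%" outputs)
  #t)

;; A list of (name . procedure) pairs, run in order.
(define %example-phases
  (list (cons 'unpack unpack-phase)
        (cons 'install install-phase)))

;; #:rest collects every argument so it can be handed to each phase.
;; every stops and returns #f as soon as one phase returns #f.
(define* (run-phases #:key (phases %example-phases)
                     #:allow-other-keys
                     #:rest args)
  (every (lambda (phase) (apply (cdr phase) args))
         phases))

(run-phases #:source "hello.tar.gz" #:outputs "/tmp/out")   ; #t
```

The caller passes one flat set of keywords, and each phase helps itself to the ones it understands.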

The last one to consider is in guix/packages.scm: where is the function origin-uri defined? We can see it in the export part of the file at the top, but there's no define (origin-uri ...) in the file - why?

This is a special case caused by <origin> being a Record. Here's an excerpt from the definition:

(define-record-type* <origin>
  %origin make-origin
  origin?
  this-origin
  (uri       origin-uri)        ; string
  (method    origin-method)     ; procedure
  (hash      origin-hash)       ; <content-hash>

A Guile Record automatically creates functions to create the record, test for whether something is of this record type and to access the different fields. In this case the make-origin function creates the <origin> record. The tester function (often called a predicate test) is called origin?. And, the various fields have been given accessor functions, including the one we're looking for origin-uri.
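The same effect can be seen with Guile's standard SRFI-9 records, which are simpler than the define-record-type* form that Guix provides in (guix records). In this sketch the record <source-ref>, its fields and the URL are all made up. Notice that there's no separate define for the accessor, it comes into existence from the record definition.

```scheme
(use-modules (srfi srfi-9))

;; <source-ref> and its field names are hypothetical.
(define-record-type <source-ref>
  (make-source-ref uri method)   ; constructor
  source-ref?                    ; predicate
  (uri    source-ref-uri)        ; field name, then accessor name
  (method source-ref-method))

(define example
  (make-source-ref "https://example.org/hello.tar.gz" 'url-fetch))

(source-ref? example)       ; #t
(source-ref? "a string")    ; #f
(source-ref-uri example)    ; "https://example.org/hello.tar.gz"
```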

Formatting and Linting code

As I said earlier it's often useful to space out S-expressions to make them easier to understand. However, as we know the formatting of code has caused more rage on the Internet than anything else. A theme in contemporary languages is to provide a standard formatter tool (e.g. gofmt for Go). As Lisps are from a slightly earlier phase there's commonly no standard tool and there may be different formatting in different code-bases.

For Guix, there's guix style which will format package definitions, it's really useful and it can be tuned to only reformat certain things. When changing a package or adding a package in the Guix source tree the command can be run like this:

$ ./pre-inst-env guix style --dry-run <package-definition-that-has-been-changed>

$ ./pre-inst-env guix style --styling=arguments --dry-run <package-definition-that-has-been-changed>

The formatting of Lisp code that Emacs performs is what most developers are expecting. If the code is formatted in a way that Emacs thinks is correct, then I think that's pretty reasonable. I'm not an Emacs user, but I do know how to install Spacemacs and reformat code like this:

$ git clone https://github.com/syl20bnr/spacemacs ~/.emacs.d
$ guix package --install emacs
$ emacs <file-with-code-in-it>

Then inside the file visually select (with the mouse) the block of code that you want reformatted and type in the command area :indent-region. Then write the file (:w) and get out of there as fast as you can (:quit) ... before you start thinking "I wonder what all the excitement is about Emacs?" and never leave! 😉

Finally, for contributing packages to Guix there is guix lint which checks common things like whether the Synopsis is correct, that code lines aren't too long and so forth. It's really useful for reminding yourself about some of the detail of Guix's Package Guidelines and quality requirements.

Useful resources

Ready to explore Guile Scheme further, here are some useful stepping off points:

Final Thoughts

That's enough about Guile Scheme functions and modules to be dangerous! With this done we can explore all sorts of Guix functions and those in the Guile standard library. We can also create our own functions and run code either in the REPL or from a file.

If you have comments on this series, or areas that you think weren't clear I'd love to hear from you - contact me in the usual places!

[1]	It may help you to know that each expression is called an S-expression which is often abbreviated to sexp: yeah that's s-exp not sex-p - making it a whole lot less interesting to CompSci undergrads everywhere! And, this is why Guix calls its own type of expressions G-expressions or gexps

[2]	We added colours to the REPL in the previous post in this series. For Vim there is Rainbow Parentheses, for VS Code there is Rainbow Brackets, for Emacs there is Rainbow Delimiters

Posted in Tech Wednesday 01 May 2024
Tagged with tech ubuntu guix guile scheme emacs

Futurile