## Term languages

3rd October 2002

These notes are from a Computation III lecture on 2nd October 2002.

### Term Languages

If ƒ is a function symbol, the number of arguments taken by ƒ is caled its arity. For example, +,—and * have an arity of 2. sin(x) and √ both have arity 1. We can regard a constant as a function with arity 0.

A term language consists of all terms which can be formed with a given set of variables, function symbols and constants. The list of function symbols and constants σ:

{ ƒ_{1}, ..., ƒ_{n}(function symbols) σ = { c_{1}, ..., c_{k}(constants)

Is called the signature of the language.

In the following, a variable is an upper case letter followed by a string of letters and digits. It is assumed that we know the arity and formatting conventions for each function symbol in a signature.

#### Example 1

{ b, g (Arity 2) σ = { h (Arity 1) { 0, 1 (constants)

A grammar for this term language:

S ~> b(S, S) | g(S, S) | h(S) | 0 | 1 | variable

A typical derivation:

S b(S, S) b(S, g(S, S)) b(S, g(h(S), S)) =>* b(X, g(h(Y), X))

#### Example 2

{ +, -, * (Arity 2) σ = { 0, 1 (constants) S ~> (S + S) | (S - S) | (S * S) | 0 | 1 | variable

Note that S is underlined to avoid confusion with the variable S. We do *not* assume that +, -, *, 0 and 1 have their usual meanings. (0 + 0) is not the same as 0, and X is not the same as (X + 0).

#### Interpretation (Semantics) of a Term Language

Suppose T_{σ} is a term language with signature:

{ ƒ_{1}...ƒ_{j}(with arities n_{1}...n_{j}) σ = { c_{1}... c_{k}

An interpretation of a term language gives a specific meaning in some function space to each function symbol in the signature. For T_{σ} the interpretation is a structure (D, ƒ_{1}...ƒ_{j}, c_{1}...c_{k}):

n_{i}ƒ_{i}: D -> D i = 1...j c_{1}.... c_{k}are in set D

An interpretation for example 2 could be defined:

I_{1} = (Z, usual +, usual =, usual *, usual 0, usual 1)

In which case ((X + Y) * Z) evaluated for X = 2, Y = 3, Z = 2 would equal (2 + 3) * 2) = 10. However, an alternative interpretation could use the following rules:

I = ( {R, B} + defined as R + R = R R + B = R B + R = R B + B = B - defined as R - R = R R - B = R B - R = R B - B = R * defined as R * R = B R * B = B B * R = B B * B = R 0 is R (Red) 1 is B (Blue) )

This interpretation is perfectly valid as the functions all have the required arity. The expression ((X + Y) * Z) evaluated with X = B, Y = R, Z = B is ((B + R) * B) = R * B = B so the answer would be Blue.

### Parse Trees

We may display a term as a tree with function names on the interior nodes and constants and variables on the frontier.

((X + Y) * Z) becomes * / \ + Z / \ X Y

Derivations may also be written as growing trees starting at the root:

S => * => * / \ / \ S S S + / \ S S etc

We have been talking about grammars generating languages, so grammars can be thought of as growing parse trees from the root down. Evaluations on the other hand seem to go the other way, starting from the frontier and working up. The idea is that *all* languages work like this—even spoken languages such as English.

#### Language Picture

Speaker—uses a grammar to grow a parsetree top down, then passes the resulting sentence to a listener.

Listener—evaluates the sentence starting at the frontier and working back up the parse tree.

This is how parsers and compilers in computer science work.

The outermost operator of a term is the function name that appears at the top of the tree. This is *first* in derivation, *last* in evaluation.

### Substitution

Suppose N is a term in a term language, X_{1}...X_{n} are variables and M_{1}...M_{n} are terms. If we wish to simultaneously replace X_{1} with M_{1}...X_{n} with M_{n} in N we can use the following notation:

N[X_{1}:= M_{1}, X_{2}:= M_{2}, ..., X_{n}:= M_{n}]

The above operator (the bit in the square braces) is called substitution.

#### Example

N = X + Y σ_{ }= [X := (X + Y), Y := (X + Z)] N_{σ}= (X + Y)_{σ}= ((X + Y) + (X + Z))

Note that if we apply the substitution sequentially rather than all at once we get this: ((X + (X + Z)) + (X + Z)) which is different from the above. In general, M[X_{1} := M_{1}, X_{n} := M_{n}] is *not* the same as M[X_{1} := M_{1}][X_{2} := M_{2}]...[X_{n} := M_{n}].

An example:

[CATS := DOGS][DOGS := CATS] DOGS[CATS := DOGS][DOGS := CATS] = CATS CATS[CATS := DOGS][DOGS := CATS] = CATS

The above example always ends in cats.

Final note: Substitutions do not commute.

## More recent articles

- Weeknotes: datasette-enrichments, datasette-comments, sqlite-chronicle - 8th December 2023
- Datasette Enrichments: a new plugin framework for augmenting your data - 1st December 2023
- llamafile is the new best way to run a LLM on your own computer - 29th November 2023
- Prompt injection explained, November 2023 edition - 27th November 2023
- I'm on the Newsroom Robots podcast, with thoughts on the OpenAI board - 25th November 2023
- Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 22nd November 2023
- Deciphering clues in a news article to understand how it was reported - 22nd November 2023
- Exploring GPTs: ChatGPT in a trench coat? - 15th November 2023
- Financial sustainability for open source projects at GitHub Universe - 10th November 2023
- ospeak: a CLI tool for speaking text in the terminal via OpenAI - 7th November 2023