In pondering the concept language, I realized that I need to be able to do two things with concepts:

1. I need to be able to say, of a phrase, whether or not it matches the concept. (Call this “analysis.”)

2. I need to be able, given a concept, to generate a random phrase which matches the concept. (Call this “synthesis.”)

Analysis is easy to do if concepts are written as logical predicates. On the other hand, then synthesis becomes very difficult to do; in general, generating an example of something matching a logical predicate is computationally intractable.

Synthesis is easy to do if concepts are written as functions (where the range is the set of phrases matching the concept, and each phrase is given an index). On the other hand, then analysis becomes very difficult to do; in general, saying whether or not something is part of the range of a function is computationally intractable.

So what I require is a different concept language, which is neither like predicates nor like functions, in which both analysis and synthesis are computationally practical. In order to penetrate this problem, I’ve decided to begin by focusing on concepts over natural numbers, before generalizing the task to concepts over phrases.

Let us clarify the task. Both functions and predicates denote sets. The set, for a predicate, is the set of things which satisfy the predicate. The set, for a function, is the set of things which the function can yield. We need another way of denoting sets, for which analysis and synthesis are computationally practical. In this case, the sets are subsets of the natural numbers.

The language doesn’t have to be totally general; in fact, this would be a disadvantage. A totally general language would have wacky computational properties (non-halting, etc.), and would probably make various things more difficult. So we need to have the language at the right level of power — neither too powerful nor not powerful enough.

Here is a first attempt at a language with the properties we need. I’m sure that it’s not too powerful; I’m not sure that it’s powerful enough.

The primitive concepts are “Y” and “N.” They mean “this number is in the set,” and “this number is not in the set,” respectively.

There are two operations for constructing new concepts out of existing concepts. The first is “sequence,” which takes a series of concepts and strings them together in sequence on the number line. We will write it [a, b, c, …].

So for instance, [Y, N, Y, Y] means, “0 is in the set; 1 is not in the set; 2 is in the set; 3 is in the set.” The concept is undefined for numbers higher than 3. (Each concept is over a limited range of numbers.)

The second operation is “repetition,” which takes a concept and a numerical constant. It takes a concept and repeats it on the number line. We will write it as (a * n).

So for instance, ([N, N, Y, Y] * 3) looks like this:

0-N 1-N 2-Y 3-Y 4-N 5-N 6-Y 7-Y 9-N 10-N 11-Y 12-Y

In summary, the concept language is:

* Y and N, the primitive concepts.

* The sequence operation, [a, b, c, …].

* The repetition operation, (a * n).

An “empty” concept is N, a sequence whose members are all empty, a repetition whose operand is empty, or a repetition where the number of repetitions is zero. Empty concepts can be used in building concepts, but an empty concept is not itself a concept. We cannot perform analysis and synthesis on it. (In principle we could perform analysis, but say that nothing satisfies the concept. But it doesn’t even make sense to ask to perform synthesis on it.)

With this language it is computationally easy both to determine membership (analysis) and to randomly generate an example (synthesis). Let us consider how to do both of these.

First we need to be able to determine the “length” of a concept: that is, the amount of number line that it covers. For Y and N, this is 1. For sequence, this is the sum of the lengths of the members. For repetition, this is the length of the operand times the number of repetitions.

Let us consider how to generate an example (synthesis). If the concept is a repetition, generate an example of the operand and add it to the length of the operand multiplied by a random number in [0, n). If the concept is a sequence, randomly choose one of the non-empty members, generate an example of it, and add to it the lengths of all of the members preceding this member. If the concept is Y, the example is 0. (We will never be asked to generate an example of N, and this is fortunate because there is no example of N.)

Now let us consider how to determine membership of a number n (analysis).

If the concept is a sequence, subtract the lengths of the members from n until the length of a member is greater than n. n is “inside” that member. Determine membership in that member. If we run out of members, then n is out of range.

If the concept is a repetition, assert that n integer-divided by the length of the operand is less than the number of repetitions; otherwise n is out of range. If n is in range, determine membership in the operand of n modulo the operand length.

If the concept is Y, then n is a member. If the concept is N, then n is not a member.

So this is an example of the sort of thing that we need. A great many subsets of the natural numbers can be expressed in this language; but things like the set of prime numbers, or the powers of two, can’t be expressed. Is this in the right neighborhood of power level? Hard for me to say; but I lean towards yes. The next task is to invent a language like this for phrases.